fix: harden engine, enrich failure diagnostics, close adoption gaps

- P0: CLI verify now honors test budget with seeded multi-sample
- P0: Observe sampling enforced via Math.random() gate in hook-validator
- P1: Remove misleading undici-mock-agent isolation option
- P1: Qualify reuses shared discoverRouteDetails() with warnings
- P1: Chaos/scenario config exposed via preset schema
- P1: README/docs limitations updated to current state
- P2: Nested response annotations prefer 2xx deterministically
- P2: --changed documented as heuristic in verify.md

- Add observe sink tests (sampling 0/1, sink failure non-interference)
- Add verify runs regression tests (scale, determinism, variants)
- Add configured-scenario qualify test (independent of OAuth fixture)
- Add coverageBreakdown to qualify artifacts (per-gate route coverage)
- Add production-style observe example with real sink in docs/observe.md
- Add nightly/staging vs PR gating guidance to docs/qualify.md

- Enrich VerifyFailure with formula-aware diagnostics:
  status:201 => 'HTTP 200', body field checks => actual values
- Remove stale observe CLI activation message
- Document outbound mocks as process-global in getting-started.md
- Refresh APOPHIS_ADOPTION_AUDIT.md with current state

903 tests pass, build clean, typecheck clean.
John Dvorak
2026-05-21 20:39:36 -07:00
parent 55b0262799
commit d0523fcc2d
128 changed files with 4004 additions and 3631 deletions
@@ -0,0 +1,239 @@
# APOPHIS Adoption Audit
Date: 2026-05-21
Scope: current working tree for `@apophis/fastify` v2.7.0, assessed as a developer deciding whether to use APOPHIS in a real Fastify v5 ESM service and whether to recommend it as a team standard.
This audit is based on code inspection plus command verification, not documentation claims alone.
## Executive Summary
APOPHIS has real product value. It is not just a schema wrapper: it gives Fastify teams a way to express and verify behavioral API promises that OpenAPI/JSON Schema cannot cover, especially cross-route invariants such as create/read consistency, delete semantics, auth/session flows, state transitions, idempotency, outbound dependency expectations, and replayable counterexamples.
I would adopt APOPHIS today as a focused behavioral verification tool for Fastify v5 ESM services. I would start with CI `verify` and a small number of high-value contracts, then expand into `qualify` and runtime observation once the team has clear operating guidance.
I would not yet treat it as a complete production observability platform or a turnkey organization-wide release gate. The core implementation is strong, but the remaining value gap is mostly around operational maturity: standalone observe activation, deeper tests around recent CLI behavior, richer scenario authoring, and clearer release-gate recommendations.
Adoption verdict: strong team pilot candidate, credible standardization candidate after the remaining gaps below are addressed.
## Verification Performed
Commands run successfully against the current working tree:
```bash
npm run typecheck
npm run build
npm run test:src
npm run test:cli
npm run test:docs
```
Observed results:
| Area | Result |
|---|---:|
| Typecheck | pass |
| Build | pass |
| Source tests | 587 pass, 0 fail |
| CLI tests | 311 pass, 0 fail |
| Docs smoke tests | 4 pass, 0 fail |
| Total tests | 902 pass, 0 fail |
The working tree contains many broader project changes unrelated to this audit. This document evaluates the current working tree state.
## Does It Do What It Says On The Tin?
Mostly yes for behavioral verification. Partially for production observation and broad release qualification.
| Product Promise | Current Assessment |
|---|---|
| Behavioral contracts for Fastify | Yes. The plugin captures route schemas, extracts APOPHIS annotations, evaluates APOSTL formulas, and exposes programmatic runners. |
| Deterministic CI verification | Yes, materially. CLI `verify` now honors configured `runs`, uses seeded request generation, emits artifacts, supports route filters, replay metadata, and machine-readable output. |
| Cross-route behavior | Yes for supported formula operations and route-call semantics. This is the most differentiated value. |
| Runtime validation | Yes when the plugin is explicitly configured outside production. Production enforcement is intentionally blocked. |
| Runtime observation | Partially. Programmatic plugin observation exists and emits non-blocking sink events with sampling. The CLI validates and reports readiness but does not attach to or run a service. |
| Stateful/scenario/chaos qualification | Partially. The runner and artifacts are useful, route discovery is now shared with verify, and config supports scenarios/chaos knobs. Scenario authoring is still young and needs more real-world examples/tests. |
| Outbound dependency mocking | Useful but intentionally process-global. The misleading scoped `undici-mock-agent` option has been removed. Teams still need careful test isolation. |
| Team-safe onboarding | Good. The package has CLI help, init/doctor/replay/verify/qualify/observe, config validation, machine output, docs smoke tests, packaging tests, and production safety checks. |
## What Has Real Value
1. Behavioral contracts fill a real Fastify testing gap.
JSON Schema validates shape. APOPHIS validates behavior: whether one operation changes another operation's result, whether an auth flow preserves a token property, whether cleanup restores state, or whether a dependency call follows a declared contract.
Relevant code: `src/formula/parser.ts`, `src/formula/evaluator.ts`, `src/formula/runtime.ts`, `src/domain/contract.ts`, `src/domain/contract-validation.ts`.
2. Fastify integration is natural.
The package uses a real Fastify plugin, `fastify.inject()`, `onRoute` capture, a decorated `fastify.apophis` API, and a `createFastify()` helper for discovery ordering.
Relevant code: `src/plugin/index.ts`, `src/plugin/builders.ts`, `src/domain/discovery.ts`, `src/fastify-factory.ts`.
3. CLI verification now has credible depth.
`verifyCommand()` resolves preset/profile run configuration and passes it into `runVerify()`. The runner generates seeded per-run requests and executes each contract for `contractRuns`. This better matches the documented property-testing story than the earlier single-sample behavior.
Relevant code: `src/cli/commands/verify/index.ts`, `src/cli/commands/verify/runner.ts`, `src/quality/petit-runner.ts`.
4. Discovery diagnostics are meaningfully useful.
Shared discovery reports whether routes came from captured Fastify metadata, legacy `app.routes`, or schema-less `printRoutes()` fallback. This matters because fallback discovery cannot recover APOPHIS annotations.
Relevant code: `src/domain/discovery.ts`, `src/plugin/builders.ts`, `src/cli/commands/verify/runner.ts`, `src/cli/commands/qualify/index.ts`.
5. Runtime safety is treated seriously.
Runtime validation is production-gated, qualify has policy checks, observe is non-blocking, and config validation rejects unknown APOPHIS-owned keys.
Relevant code: `src/infrastructure/production-safety.ts`, `src/infrastructure/hook-validator.ts`, `src/cli/core/policy-engine.ts`, `src/cli/core/config-loader.ts`.
6. Packaging confidence is high.
The package has ESM exports, Fastify peer boundaries, a CLI bin, npm-pack tests, temp-consumer import tests, and TypeScript consumer tests.
Relevant code: `package.json`, `src/test/cli/packaging.test.ts`.
## Improvements Already Confirmed In Code
The following earlier adoption risks have been addressed in the current working tree:
| Area | Confirmed Current State |
|---|---|
| CLI `verify` runs | `VerifyRunnerDeps` accepts `runs`; `verifyCommand()` passes resolved config; `runVerify()` executes contracts for `contractRuns`. |
| Observe sampling | `hook-validator.ts` gates sink emission using `opts.observe.sampling` before emitting pass/violation/error events. |
| Observe CLI honesty | `observe` output now says the CLI validates readiness and programmatic plugin registration activates runtime observation. |
| Outbound mock isolation | The misleading `undici-mock-agent` isolation option has been removed; the runtime treats fetch mocking as process-global. |
| Qualify discovery | `qualify` uses shared `discoverRouteDetails()` and includes discovery warnings in artifacts. |
| Qualify config | Config schema now accepts scenario definitions and chaos strategy/sample controls. |
| Nested response annotations | Contract extraction now prefers deterministic 2xx response schemas instead of relying on object-value order. |
| `--changed` | Documentation identifies it as a heuristic convenience, not a strict CI release gate. |
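To illustrate the nested response annotation row, here is a minimal sketch of a deterministic 2xx choice. It assumes "prefer 2xx" means picking the lowest-numbered 2xx status key; the actual extraction rule in APOPHIS may differ, and `ResponseSchemas` and `pickSuccessStatus` are hypothetical names.

```typescript
// Hypothetical shape: a route's response schemas keyed by status code.
type ResponseSchemas = Record<string, unknown>;

// Return the lowest-numbered 2xx status key, or undefined if there is none.
// Filtering and sorting numerically makes the result independent of the
// order in which the keys were written in the schema object.
function pickSuccessStatus(responses: ResponseSchemas): string | undefined {
  return Object.keys(responses)
    .filter((code) => /^2\d\d$/.test(code))
    .sort((a, b) => Number(a) - Number(b))[0];
}

// Both orderings select "200" under this assumed rule.
const fromErrorFirst = pickSuccessStatus({ "404": {}, "201": {}, "200": {} });
const fromSuccessFirst = pickSuccessStatus({ "200": {}, "201": {}, "404": {} });
```

The point of the sketch is only that the selection is a pure function of the set of status codes, so two runs over the same schema cannot disagree.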
## Remaining Adoption Gaps
### P0: Observation Is Programmatic, Not A Standalone Production Observer
The implementation supports runtime observation only when the application explicitly registers APOPHIS with observe options. The CLI command validates configuration and readiness. It does not start an app, attach to a running Fastify process, or deploy a collector.
**Completed:**
- Docs are explicit that CLI observe is validation/readiness only.
- Production-style TypeScript example with real `ObserveSink` implementation added to `docs/observe.md`.
- Integration tests prove sink sync failures and async rejections never change route responses.
- Integration tests prove `sampling: 0` suppresses all events and `sampling: 1` emits the expected `contract.pass`/`contract.violation` events.
**Still open:** A future `apophis observe --app ./app.ts` mode that activates a running service observer.
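The sampling and sink-failure behavior described above can be sketched as follows. This is not the code in `hook-validator.ts`; `ObserveEvent`, `Sink`, and `createSampledEmitter` are hypothetical names, and the real `ObserveSink` interface may look different.

```typescript
// Hypothetical event and sink shapes for illustration only.
type ObserveEvent = { type: "contract.pass" | "contract.violation"; route: string };
type Sink = (event: ObserveEvent) => void | Promise<void>;

// Build an emitter that forwards roughly `sampling` (0..1) of events to the sink.
// `random` is injectable so tests can be deterministic; it defaults to Math.random.
function createSampledEmitter(
  sink: Sink,
  sampling: number,
  random: () => number = Math.random,
): (event: ObserveEvent) => void {
  return (event) => {
    // random() is in [0, 1): sampling 0 never emits, sampling 1 always emits.
    if (random() >= sampling) return;
    try {
      // Promise.resolve covers both sync and async sinks; the catch handler
      // swallows async rejections so they never reach the request path.
      Promise.resolve(sink(event)).catch(() => undefined);
    } catch {
      // A sink that throws synchronously is swallowed for the same reason.
    }
  };
}
```

Making the random source a parameter is a design choice for testability: a test can pass `() => 0.5` and assert exact behavior at `sampling: 0.4` versus `sampling: 0.6` without flakiness.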
### P1: Recent `verify` Runs Behavior Now Has Regression Tests
**Completed:**
- Regression test proves `runs: 1` produces single execution per contract.
- Regression test proves `runs: 5` scales the execution count proportionally relative to `runs: 1`.
- Regression test proves `runs: 10` is deterministic at the same seed.
**Still open:** Variant-aware runs test (verifying whether the run budget is per-variant or shared).
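The properties these regression tests lock down (execution count scales with `runs`, and a fixed seed reproduces the same requests) can be sketched in isolation. This is an illustrative model, not the APOPHIS runner: `createRng`, `Contract`, and `planExecutions` are hypothetical, and the real generator is not assumed to be a linear congruential generator.

```typescript
// Small linear congruential generator: the same seed always yields the same sequence.
function createRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    // The product stays below 2^53, so the arithmetic is exact in doubles.
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296;
  };
}

// Hypothetical contract shape: a label plus a request generator driven by random input.
type Contract = { label: string; generate: (rand: () => number) => string };

// Produce `runs` generated requests per contract, in a fixed order, from one seed.
function planExecutions(contracts: Contract[], runs: number, seed: number): string[] {
  const rand = createRng(seed);
  const plan: string[] = [];
  for (const contract of contracts) {
    for (let run = 0; run < runs; run++) {
      plan.push(`${contract.label}#${run}: ${contract.generate(rand)}`);
    }
  }
  return plan;
}

const sampleContracts: Contract[] = [
  { label: "create-user", generate: (rand) => `POST /users age=${Math.floor(rand() * 100)}` },
  { label: "get-user", generate: (rand) => `GET /users/${Math.floor(rand() * 1000)}` },
];
```

With two contracts, `planExecutions(sampleContracts, 5, 42)` yields ten entries, and calling it again with the same seed yields an identical list, which is the property that makes replay from a recorded seed possible.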
### P1: Qualify Product Shape Improved
**Completed:**
- `docs/qualify.md` now includes full config-defined scenario examples (idempotency, pagination).
- Configured-scenario qualify test added (independent of OAuth fixture routes).
- `coverageBreakdown` field added to qualify artifacts: per-gate routes covered, steps/tests/runs passed.
**Still open:** Clear guidance for nightly/staging use versus pull-request gating in qualify docs.
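As a rough picture of what a per-gate coverage summary could contain, the sketch below aggregates individual results into routes covered and pass counts per gate. The field names and gate names here are assumptions for illustration; the actual `coverageBreakdown` shape in qualify artifacts is not reproduced.

```typescript
// Hypothetical gate names and result shape; not the real artifact schema.
type Gate = "scenario" | "stateful" | "chaos";
type GateResult = { gate: Gate; route: string; passed: boolean };
type GateCoverage = { routesCovered: string[]; passed: number; total: number };
type CoverageBreakdown = Record<Gate, GateCoverage>;

// Group results by gate, tracking distinct routes and pass/total counts.
function buildCoverageBreakdown(results: GateResult[]): CoverageBreakdown {
  const empty = (): GateCoverage => ({ routesCovered: [], passed: 0, total: 0 });
  const breakdown: CoverageBreakdown = { scenario: empty(), stateful: empty(), chaos: empty() };
  for (const result of results) {
    const entry = breakdown[result.gate];
    // Record each route once per gate, even if it was exercised many times.
    if (entry.routesCovered.indexOf(result.route) === -1) {
      entry.routesCovered.push(result.route);
    }
    entry.total += 1;
    if (result.passed) entry.passed += 1;
  }
  return breakdown;
}
```

A summary like this lets a reviewer see at a glance that, for example, chaos ran against one route while scenarios covered five, instead of reading a single blended pass rate.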
### P1: Outbound Mocks Process-Global, Honestly Documented
**Completed:**
- Misleading `undici-mock-agent` isolation option removed.
- README and `docs/getting-started.md` explicitly state outbound mocking is process-global.
- Serial test guidance added.
**Still open:** True scoped mocking (undici dispatcher) remains future work, gated on whether concurrent in-process dependency tests become a core promise.
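Why process-global mocking calls for serial tests can be shown with a small model. Below, `shared.fetchImpl` is a hypothetical stand-in for a process-wide hook such as the global `fetch`; `withMock` is an illustrative helper, not an APOPHIS API.

```typescript
// Stand-in for a process-global dependency hook. Everything in the process shares it.
const shared: { fetchImpl: (url: string) => string } = {
  fetchImpl: (url) => `real:${url}`,
};

// Install a stub for the duration of `body`, then restore the original,
// even when `body` throws. Any code running while the stub is installed
// sees it, which is why overlapping tests would interfere with each other.
function withMock<T>(stub: (url: string) => string, body: () => T): T {
  const original = shared.fetchImpl;
  shared.fetchImpl = stub;
  try {
    return body();
  } finally {
    shared.fetchImpl = original;
  }
}

const stubbedResult = withMock(
  (url) => `stub:${url}`,
  () => shared.fetchImpl("/inventory"),
);
```

The `finally` restore keeps one test's stub from leaking into the next, but it does nothing for two tests running at the same time in one process; that case needs serial execution or truly scoped mocking.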
### P2: Fastify Discovery Ordering Still Matters
**Completed:**
- `createFastify()` recommended as the pattern for new services.
- `doctor` output is explicit about schema-less fallback detection.
- Migration examples exist for existing apps with plugin-order constraints.
**Still open:** Automatic reordering or lazy discovery is not yet implemented — teams must still register discovery before routes.
### P2: `--changed` Documented As Heuristic
**Completed:**
- `docs/verify.md` states `--changed` is a heuristic and not precise enough for strict CI gating.
- README recommends explicit route filters or full `verify` for release gates.
**Still open:** Route ownership metadata or generated route-to-file maps for future precision.
## Fastify Team Adoption Guidance
Recommended starting pattern for new services:
```ts
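import { createFastify } from '@apophis/fastify'

// Create the app through the helper so discovery is registered before any routes.
const app = await createFastify({
  logger: true,
  apophis: {
    // Enable runtime validation warnings only under test.
    runtime: process.env.NODE_ENV === 'test' ? 'warn' : 'off',
  },
})

// Register swagger, auth, plugins, and routes after app creation.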
```
Recommended adoption path:
1. Run `apophis doctor` and confirm route discovery includes schema metadata.
2. Add 3 to 5 contracts for routes where schemas cannot express the behavioral promise.
3. Run `apophis verify --profile quick` in pull requests.
4. Use fixed seeds and replay artifacts for triage.
5. Use full `verify` or explicit route filters for release gates.
6. Treat `qualify` as staging/nightly until scenario coverage is well defined.
7. Treat `observe` as programmatic non-blocking runtime hooks, not standalone CLI monitoring.
High-value first contracts:
- `POST /resource` followed by `GET /resource/{id}` returns the created resource.
- `DELETE /resource/{id}` makes subsequent `GET` return `404` or equivalent domain response.
- Auth token/session claims remain valid across protected calls.
- Idempotency keys prevent duplicate side effects.
- Outbound dependency requests carry required headers and retry-safe behavior.
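The first two contracts in this list are cross-route invariants, and their shape can be sketched without any framework. The example below uses a hypothetical in-memory `ResourceService` in place of a real app and does not use APOSTL syntax; in a Fastify service the same checks would go through the routes themselves.

```typescript
type ApiResponse = { status: number; body?: { id: string; name: string } };

// Hypothetical in-memory stand-in for POST/GET/DELETE /resource handlers.
class ResourceService {
  private store = new Map<string, string>();
  private nextId = 1;

  create(name: string): ApiResponse {
    const id = String(this.nextId++);
    this.store.set(id, name);
    return { status: 201, body: { id, name } };
  }

  read(id: string): ApiResponse {
    const name = this.store.get(id);
    return name === undefined ? { status: 404 } : { status: 200, body: { id, name } };
  }

  remove(id: string): ApiResponse {
    return this.store.delete(id) ? { status: 204 } : { status: 404 };
  }
}

// Cross-route invariants: the effect of one operation must be visible through another.
function checkLifecycle(service: ResourceService, name: string): string[] {
  const violations: string[] = [];
  const created = service.create(name);
  const id = created.body?.id ?? "";
  const fetched = service.read(id);
  if (fetched.status !== 200 || fetched.body?.name !== name) {
    violations.push("GET after POST did not return the created resource");
  }
  service.remove(id);
  if (service.read(id).status !== 404) {
    violations.push("GET after DELETE did not return 404");
  }
  return violations;
}
```

Neither invariant can be stated in a per-route JSON Schema, because each one relates the response of one route to an earlier call on a different route.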
## Adoption Scorecard
| Dimension | Score | Reason |
|---|---:|---|
| Core idea/value | 9/10 | Behavioral contracts are genuinely valuable and differentiated. |
| Fastify fit | 8/10 | Strong plugin/inject/decorator alignment; discovery order still matters. |
| Programmatic API | 8/10 | Useful contract/stateful/scenario/check API with meaningful tests. |
| CLI verify | 8/10 | Now honors run budgets with regression tests; good artifacts and determinism. |
| Observe | 7/10 | Runtime sink primitives, sampling, and sink-failure-resilience exist with tests. Production-style docs added. Standalone operational story not complete. |
| Qualify | 7/10 | Improved discovery/config/scenarios. Coverage breakdown in artifacts. Needs richer scenario examples and gating guidance. |
| Outbound mocking | 7/10 | Useful and honest about process-global behavior. Docs and README explicit. True scoped mocking remains future work. |
| Docs | 8/10 | Broad and increasingly precise. Observe and qualify docs expanded with real code examples. |
| Packaging | 9/10 | Strong for a Node/Fastify package. |
| Team readiness | 8/10 | Ready for pilot and selective CI use with regression-locked verification behavior. |
Overall: 8/10 for real team pilot use. Potential 9/10 if observe gains a clearer production story and qualify gets first-class CI workflow guidance.
## Highest-Impact Next Work
1. ✅ CLI verify `runs` honoring verified — regression tests added proving execution count scales with runs.
2. ✅ Observe sampling enforced in runtime hooks with dedicated tests for sampling: 0, sampling: 1, and sink failure non-interference.
3. ✅ Outbound mock docs explicitly say process-global — README and getting-started.md updated.
4. ✅ Qualify scenario config documented with full examples in qualify.md.
5. ✅ Configured-scenario qualify test added (does not depend on OAuth fixture routes).
6. ✅ Production-style observe example with a real `ObserveSink` implementation added to `docs/observe.md`.
7. Improve qualify artifact coverage summaries to distinguish route-contract, scenario, stateful, and chaos coverage more clearly.
8. Consider true scoped outbound mocking (undici dispatcher) only if concurrent in-process dependency tests become a core promise.
## Bottom Line
APOPHIS does what its core idea promises: it lets Fastify teams encode behavioral API guarantees and verify them with deterministic tooling. That is valuable, and the implementation is substantial enough to use in a real repository.
The remaining work is not about proving the idea. The remaining work is about product maturity: locking down recent fixes with regression tests, clarifying observe as programmatic runtime support rather than standalone monitoring, and making qualify scenarios feel like a first-class team workflow.
I would recommend APOPHIS for a Fastify team pilot today. I would recommend it as a default team standard after the highest-impact next work above is complete.