feat: plugin contract e2e, qualify --changed, production observe, regressions

This commit is contained in:
John Dvorak
2026-05-22 11:05:52 -07:00
parent d0523fcc2d
commit 1de735ee08
34 changed files with 1392 additions and 122 deletions
+23 -14
View File
@@ -8,11 +8,11 @@ This audit is based on code inspection plus command verification, not documentat
## Executive Summary
APOPHIS has real product value. It is not just a schema wrapper: it gives Fastify teams a way to express and verify behavioral API promises that OpenAPI/JSON Schema cannot cover, especially cross-route invariants such as create/read consistency, delete semantics, auth/session flows, state transitions, idempotency, outbound dependency expectations, and replayable counterexamples.
APOPHIS has real product value. It is not just a schema wrapper: it gives Fastify teams a way to express, verify, and observe behavioral API promises that OpenAPI/JSON Schema cannot cover, especially cross-route invariants such as create/read consistency, delete semantics, auth/session flows, state transitions, idempotency, outbound dependency expectations, and replayable counterexamples.
I would adopt APOPHIS today as a focused behavioral verification tool for Fastify v5 ESM services. I would start with CI `verify` and a small number of high-value contracts, then expand into `qualify` and runtime observation once the team has clear operating guidance.
I would not yet treat it as a complete production observability platform or a turnkey organization-wide release gate. The core implementation is strong, but the remaining value gap is mostly around operational maturity: standalone observe activation, deeper tests around recent CLI behavior, richer scenario authoring, and clearer release-gate recommendations.
I would not yet treat it as a complete production observability platform or a turnkey organization-wide release gate. The core implementation is strong, but the remaining value gap is mostly around operational maturity: standalone observe process management, richer scenario authoring, and organization-specific release-gate policy.
Adoption verdict: strong team pilot candidate, credible standardization candidate after the remaining gaps below are addressed.
@@ -34,10 +34,10 @@ Observed results:
|---|---:|
| Typecheck | pass |
| Build | pass |
| Source tests | 587 pass, 0 fail |
| CLI tests | 311 pass, 0 fail |
| Source tests | 590 pass, 0 fail |
| CLI tests | 320 pass, 0 fail |
| Docs smoke tests | 4 pass, 0 fail |
| Total tests | 902 pass, 0 fail |
| Total tests | 921 pass, 0 fail |
The working tree contains many broader project changes unrelated to this audit. This document evaluates the current working tree state.
@@ -51,7 +51,7 @@ Mostly yes for behavioral verification. Partially for production observation and
| Deterministic CI verification | Yes, materially. CLI `verify` now honors configured `runs`, uses seeded request generation, emits artifacts, supports route filters, replay metadata, and machine-readable output. |
| Cross-route behavior | Yes for supported formula operations and route-call semantics. This is the most differentiated value. |
| Runtime validation | Yes when the plugin is explicitly configured outside production. Production enforcement is intentionally blocked. |
| Runtime observation | Partially. Programmatic plugin observation exists and emits non-blocking sink events with sampling. The CLI validates/report readiness but does not attach to or run a service. |
| Runtime observation | Yes for programmatic production-safe hooks. APOPHIS emits non-blocking sink events with sampling in production when `observe.enabled` and sinks are configured. The CLI validates/report readiness but does not attach to or run a service. |
| Stateful/scenario/chaos qualification | Partially. The runner and artifacts are useful, route discovery is now shared with verify, and config supports scenarios/chaos knobs. Scenario authoring is still young and needs more real-world examples/tests. |
| Outbound dependency mocking | Useful but intentionally process-global. The misleading scoped `undici-mock-agent` option has been removed. Teams still need careful test isolation. |
| Team-safe onboarding | Good. The package has CLI help, init/doctor/replay/verify/qualify/observe, config validation, machine output, docs smoke tests, packaging tests, and production safety checks. |
@@ -102,12 +102,17 @@ The following earlier adoption risks have been addressed in the current working
|---|---|
| CLI `verify` runs | `VerifyRunnerDeps` accepts `runs`; `verifyCommand()` passes resolved config; `runVerify()` executes contracts for `contractRuns`. |
| Observe sampling | `hook-validator.ts` gates sink emission using `opts.observe.sampling` before emitting pass/violation/error events. |
| Production observe activation | `apophisPlugin` now keeps blocking runtime validation disabled in production while allowing non-blocking observe sinks to emit pass/violation/error events. |
| Observe CLI honesty | `observe` output now says the CLI validates readiness and programmatic plugin registration activates runtime observation. |
| Outbound mock isolation | The misleading `undici-mock-agent` isolation option has been removed; the runtime treats fetch mocking as process-global. |
| Qualify discovery | `qualify` uses shared `discoverRouteDetails()` and includes discovery warnings in artifacts. |
| Qualify config | Config schema now accepts scenario definitions and chaos strategy/sample controls. |
| Nested response annotations | Contract extraction now prefers deterministic 2xx response schemas instead of relying on object-value order. |
| `--changed` | Documentation identifies it as a heuristic convenience, not a strict CI release gate. |
| Plugin contracts (end-to-end) | Full pipeline: config schema, plugin registration, compose+merge in all runners, precondition→skip, auto-inject headers, source attribution (`formulaSources`), failure counting, `drainWarnings()` collection, production safety. Wired through verify, qualify (scenario/stateful/chaos), and replay. |
| Artifact pipeline CI/CD | 6 CI-facing regression tests: json-summary parseable, ndjson-summary parseable, `--quiet` persistence, skipped field presence, exit code 0 on pass, qualify json-summary. Verify→replay round-trip test with plugin contracts. |
| CLI output hygiene | Console.warn bleeding fixed (`drainWarnings`); `json-summary``human` format normalization bug fixed; `--quiet` no longer suppresses machine format output. |
| Qualify --changed | Qualify now supports `--changed` flag with same git-diff heuristic as verify. Prints match count, exits 0 when no changed routes. |
## Remaining Adoption Gaps
@@ -121,7 +126,7 @@ The implementation supports runtime observation only when the application explic
- Integration tests prove sink sync failures and async rejections never change route responses.
- Integration tests prove sampling: 0 suppresses all events; sampling: 1 emits expected `contract.pass`/`contract.violation` events.
**Still open:** A future `apophis observe --app ./app.ts` mode that activates a running service observer.
**Still open:** A future `apophis observe --app ./app.ts` mode that imports and starts a service for local/staging smoke observation. Production observation itself is now programmatic and active through plugin registration.
### P1: Recent `verify` Runs Behavior Now Has Regression Tests
@@ -130,7 +135,7 @@ The implementation supports runtime observation only when the application explic
- Regression test proves `runs: 5` scales multiplicatively from `runs: 1`.
- Regression test proves `runs: 10` is deterministic at the same seed.
**Still open:** Variant-aware runs test (verifying run budget is per-variant or shared).
**Completed:** Variant-aware runs regression proves the run budget is applied per variant.
### P1: Qualify Product Shape Improved
@@ -139,7 +144,7 @@ The implementation supports runtime observation only when the application explic
- Configured-scenario qualify test added (independent of OAuth fixture routes).
- `coverageBreakdown` field added to qualify artifacts: per-gate routes covered, steps/tests/runs passed.
**Still open:** Clear guidance for nightly/staging use versus pull-request gating in qualify docs.
**Completed:** `docs/qualify.md` now documents pull-request versus nightly/staging gate guidance.
### P1: Outbound Mocks Process-Global, Honestly Documented
@@ -210,14 +215,14 @@ High-value first contracts:
| Fastify fit | 8/10 | Strong plugin/inject/decorator alignment; discovery order still matters. |
| Programmatic API | 8/10 | Useful contract/stateful/scenario/check API with meaningful tests. |
| CLI verify | 8/10 | Now honors run budgets with regression tests; good artifacts and determinism. |
| Observe | 7/10 | Runtime sink primitives, sampling, and sink-failure-resilience exist with tests. Production-style docs added. Standalone operational story not complete. |
| Observe | 8/10 | Production-safe non-blocking sink emission, sampling, and sink-failure-resilience exist with tests. Standalone process-management story is still future work. |
| Qualify | 7/10 | Improved discovery/config/scenarios. Coverage breakdown in artifacts. Needs richer scenario examples and gating guidance. |
| Outbound mocking | 7/10 | Useful and honest about process-global behavior. Docs and README explicit. True scoped mocking remains future work. |
| Docs | 8/10 | Broad and increasingly precise. Observe and qualify docs expanded with real code examples. |
| Packaging | 9/10 | Strong for a Node/Fastify package. |
| Team readiness | 8/10 | Ready for pilot and selective CI use with regression-locked verification behavior. |
Overall: 8/10 for real team pilot use. Potential 9/10 if observe gains a clearer production story and qualify gets first-class CI workflow guidance.
Overall: 8.5/10 for real team pilot use. Potential 9/10 if standalone observe process management and richer scenario libraries become first-class.
## Highest-Impact Next Work
@@ -226,9 +231,13 @@ Overall: 8/10 for real team pilot use. Potential 9/10 if observe gains a clearer
3. ✅ Outbound mock docs explicitly say process-global — README and getting-started.md updated.
4. ✅ Qualify scenario config documented with full examples in qualify.md.
5. ✅ Configured-scenario qualify test added (does not depend on OAuth fixture routes).
6. Add full production-style observe example with a real collector sink implementation.
7. Improve qualify artifact coverage summaries to distinguish route-contract, scenario, stateful, and chaos coverage more clearly.
8. Consider true scoped outbound mocking (undici dispatcher) only if concurrent in-process dependency tests become a core promise.
6. ✅ Full production-style observe example with real collector sink implementation added to docs/observe.md.
7. ✅ Plugin contract support end-to-end: docs, tests, all runners wired.
8. ✅ Artifact pipeline CI/CD regression tests: json-summary, ndjson-summary, --quiet, skipped field, exit codes.
9. ✅ Qualify --changed implemented.
10. Add standalone observe process management (`apophis observe --app ./app.ts`) for local/staging observation.
11. Add route ownership / file-to-route maps for precise `--changed` filtering.
12. Consider true scoped outbound mocking (undici dispatcher) only if concurrent in-process dependency tests become a core promise.
## Bottom Line