John Dvorak d0523fcc2d fix: harden engine, enrich failure diagnostics, close adoption gaps
2026-05-21 20:39:36 -07:00

Observe Mode

Runtime visibility and drift detection without blocking by default.

APOPHIS observe has two paths:

  1. CLI apophis observe: Validates observe configuration readiness (policy, sinks, sampling, safety boundaries). Introduces no service process or runtime hooks. Use this for CI config validation before deployment.

  2. Programmatic runtime observation: Register the APOPHIS plugin with observe.enabled: true and observe.sinks to emit contract pass/violation/error events from live traffic without blocking responses. Sampling controls the fraction of observed requests.

When to Use It

  • Staging: Validate observe config before promoting to production
  • Production: Monitor contract drift without affecting requests
  • Platform teams: Centralized visibility across services

Safety Boundaries

Observe mode is designed not to interfere with live traffic:

  • Non-blocking by default: Contract violations are reported to sinks instead of thrown, so they never fail a request
  • Explicit opt-in for blocking: Requires allowBlocking: true in environment policy
  • Production gating: Blocking is disallowed in production by default

Sink Configuration

Observe mode requires a reporting sink. Enforce this in your environment policy with requireSink: true; the sinks themselves are passed to the plugin through observe.sinks:

environments: {
  staging: {
    name: 'staging',
    allowVerify: true,
    allowObserve: true,
    allowQualify: false,
    allowChaos: false,
    allowBlocking: false,
    requireSink: true
  }
}

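Sinks typically report to one of these backends. APOPHIS ships no built-in sinks; you implement each one against the ObserveSink interface: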

  • Logs: Structured logging of contract violations
  • Metrics: Counter and histogram metrics for violation rates
  • Traces: Distributed tracing integration for violation context

Sampling

Control observation overhead with sampling:

profiles: {
  'staging-observe': {
    name: 'staging-observe',
    mode: 'observe',
    preset: 'platform-observe',
    routes: []
  }
}

The platform-observe preset enables sampling. Configure the rate explicitly:

profiles: {
  'staging-observe': {
    mode: 'observe',
    preset: 'platform-observe',
    routes: [],
    sampling: 1.0  // 100% of requests observed
  }
}

Staging vs Production

Environment   Blocking       Sampling   Sink Required
Staging       No (default)   100%       Yes
Production    No (default)   100%       Yes

The default sampling rate is 1.0 (100%) in every environment. For production, configure a lower rate explicitly:

profiles: {
  'prod-observe': {
    mode: 'observe',
    preset: 'platform-observe',
    routes: [],
    sampling: 0.1  // 10% of requests observed
  }
}
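The effect of a sampling value can be illustrated with a small gate function. This is a minimal sketch, assuming the Math.random() < sampling comparison described under Key constraints; the names shouldObserve and observedFraction are hypothetical helpers for illustration and are not part of the APOPHIS API.

```typescript
// Hypothetical helper: decides whether a single evaluation is observed.
// `rng` is injectable so the gate can be exercised deterministically.
function shouldObserve(sampling: number, rng: () => number = Math.random): boolean {
  if (!Number.isFinite(sampling) || sampling < 0 || sampling > 1) {
    throw new RangeError(`sampling must be between 0 and 1, got ${sampling}`)
  }
  // Math.random() returns a value in [0, 1), so sampling 1 always passes
  // and sampling 0 never does.
  return rng() < sampling
}

// Hypothetical helper: fraction of `draws` evaluations that pass the gate.
function observedFraction(sampling: number, draws: number, rng: () => number): number {
  let observed = 0
  for (let i = 0; i < draws; i++) {
    if (shouldObserve(sampling, rng)) observed++
  }
  return observed / draws
}
```

With a real random source, sampling: 0.1 observes roughly one evaluation in ten over a long run; any individual request may or may not be sampled.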

--check-config Flag

Validate config without activating observe mode:

apophis observe --profile staging-observe --check-config

This is useful in CI to ensure observe config is valid before deployment.

Exit Codes

Code   Meaning
0      Observe config is valid and safe
2      Safety violation or invalid config
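To make the mapping from config problems to exit codes concrete, here is a sketch of the kind of checks a readiness validator might run. It is an assumption-laden illustration, not the CLI's actual logic: the interfaces are local stand-ins for the config shapes shown in this document, and checkObserveConfig, exitCodeFor, and the specific rules (including the sampling range check) are hypothetical.

```typescript
// Local stand-ins for the config shapes shown in this document.
interface EnvironmentPolicy {
  name: string
  allowObserve: boolean
  requireSink: boolean
}

interface ObserveProfile {
  name: string
  mode: string
  sampling?: number
}

// Hypothetical check: returns human-readable problems; empty means valid and safe.
function checkObserveConfig(
  profile: ObserveProfile,
  env: EnvironmentPolicy,
  sinkCount: number,
): string[] {
  const problems: string[] = []
  if (profile.mode !== 'observe') {
    problems.push(`profile "${profile.name}" has mode "${profile.mode}", expected "observe"`)
  }
  if (!env.allowObserve) {
    problems.push(`environment "${env.name}" does not allow observe`)
  }
  if (env.requireSink && sinkCount === 0) {
    problems.push(`environment "${env.name}" requires a sink but none is configured`)
  }
  const sampling = profile.sampling ?? 1.0 // documented default is 1.0
  if (!(sampling >= 0 && sampling <= 1)) {
    problems.push(`sampling must be between 0 and 1, got ${sampling}`)
  }
  return problems
}

// Map the result onto the documented exit codes: 0 = valid and safe, 2 = otherwise.
function exitCodeFor(problems: string[]): 0 | 2 {
  return problems.length === 0 ? 0 : 2
}
```

In CI, a non-zero exit from apophis observe should fail the job, so an invalid or unsafe observe config never reaches deployment.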

Config Example

// apophis.config.js
export default {
  mode: 'observe',
  profile: 'staging-observe',
  profiles: {
    'staging-observe': {
      name: 'staging-observe',
      mode: 'observe',
      preset: 'platform-observe',
      routes: []
    }
  },
  presets: {
    'platform-observe': {
      name: 'platform-observe',
      timeout: 10000,
      parallel: true,
      chaos: false,
      observe: true
    }
  },
  environments: {
    staging: {
      name: 'staging',
      allowVerify: true,
      allowObserve: true,
      allowQualify: false,
      allowChaos: false,
      allowBlocking: false,
      requireSink: true
    },
    production: {
      name: 'production',
      allowVerify: true,
      allowObserve: true,
      allowQualify: false,
      allowChaos: false,
      allowBlocking: false,
      requireSink: true
    }
  }
};

Programmatic Runtime Activation

The CLI only validates configuration. To activate runtime observation, register APOPHIS with observe options in your application:

import Fastify from 'fastify'
import apophisPlugin from '@apophis/fastify'
import type { ObserveSink, ObserveEvent } from '@apophis/fastify'

const app = Fastify({ logger: true })

// Implement the ObserveSink interface.
// Capture events to your preferred observability backend.
// `myMetrics` stands in for your own metrics client; it is not part of APOPHIS.
// The sink is defined before registration so it exists when the plugin receives it.
const metricsSink: ObserveSink = {
  emit(event: ObserveEvent) {
    // Emit a counter for each contract evaluation, e.g. apophis.contract.violation
    myMetrics.increment(`apophis.${event.type}`, {
      route: event.route,
      formula: event.formula,
    })

    // Record duration as a histogram
    myMetrics.histogram('apophis.contract.duration_ms', event.durationMs, {
      route: event.route,
    })

    // Log high-signal violations for immediate triage
    if (event.type === 'contract.violation') {
      app.log.warn({ event }, 'APOPHIS contract violation')
    }
  },
}

// Register APOPHIS with observe enabled.
// This emits non-blocking contract pass/violation/error events
// for every covered request, gated by sampling.
await app.register(apophisPlugin, {
  runtime: 'warn',
  observe: {
    enabled: true,
    sampling: 0.1,               // observe 10% of requests
    sinks: [metricsSink],
  },
})

Key constraints:

  • Sink emit() can be sync or async (returns void | Promise<void>).
  • Sink rejections and thrown errors are caught and discarded; they never affect the route response or status code.
  • Sampling is applied per-formula evaluation via Math.random() < sampling. At sampling: 1 every formula is emitted. At sampling: 0 nothing is emitted.
  • Only routes with APOPHIS annotations (x-ensures, x-requires) produce events. Routes without annotations are not evaluated in observe mode.
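The failure-isolation constraint can be sketched as a dispatcher that wraps every sink call. This is an illustration under stated assumptions, not APOPHIS's internal code: the two interfaces are local stand-ins containing only the fields used in the example above, and emitSafely is a hypothetical name.

```typescript
// Local stand-ins; only the fields used in the example above are modeled.
interface ObserveEvent {
  type: string
  route: string
  formula: string
  durationMs: number
}

interface ObserveSink {
  emit(event: ObserveEvent): void | Promise<void>
}

// Hypothetical dispatcher: delivers one event to every sink and never rejects.
function emitSafely(sinks: ObserveSink[], event: ObserveEvent): Promise<void> {
  const pending = sinks.map(async (sink) => {
    try {
      // Awaiting inside try covers both synchronous throws and async rejections.
      await sink.emit(event)
    } catch {
      // Discard sink failures so they cannot reach the route handler.
    }
  })
  return Promise.all(pending).then(() => undefined)
}
```

Because each sink is wrapped separately, one failing sink does not prevent the remaining sinks from receiving the event.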

Sink Implementations

APOPHIS does not ship with built-in sinks. The ObserveSink interface lets you plug in any backend. Common patterns:

  • OpenTelemetry: emit counters and histograms via @opentelemetry/api.
  • pino logger: emit structured log records via pino.info() / pino.warn().
  • Internal metrics service: POST events to an internal collector endpoint.
  • In-memory ring buffer: capture recent events for diagnostics endpoints.
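As one concrete instance of these patterns, the in-memory ring buffer can be sketched in a few lines. The interfaces below are local stand-ins with only the fields used in the earlier example, and RingBufferSink with its recent() method is a hypothetical class for illustration, not something APOPHIS provides.

```typescript
// Local stand-ins; only the fields used in the earlier example are modeled.
interface ObserveEvent {
  type: string
  route: string
  formula: string
  durationMs: number
}

interface ObserveSink {
  emit(event: ObserveEvent): void | Promise<void>
}

// Hypothetical sink: keeps only the most recent `capacity` events in memory.
class RingBufferSink implements ObserveSink {
  private readonly capacity: number
  private readonly buffer: ObserveEvent[] = []

  constructor(capacity: number) {
    if (!Number.isInteger(capacity) || capacity < 1) {
      throw new RangeError(`capacity must be a positive integer, got ${capacity}`)
    }
    this.capacity = capacity
  }

  emit(event: ObserveEvent): void {
    this.buffer.push(event)
    // Drop the oldest event once the buffer exceeds its capacity.
    if (this.buffer.length > this.capacity) this.buffer.shift()
  }

  // Return a copy, oldest first, so callers cannot mutate internal state.
  recent(): ObserveEvent[] {
    return [...this.buffer]
  }
}
```

A diagnostics route could return sink.recent() to show the latest contract events without involving an external backend. The array shift() used here is linear in the buffer size, which is acceptable for small capacities.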

Monorepo Validation

For monorepos, use apophis doctor --workspace to validate observe configuration across all workspace packages. apophis observe itself does not support --workspace.

Mode Mismatch

Profiles configured for verify mode will be rejected by apophis observe. Only profiles with mode: 'observe' are valid.