# Observe Mode

Runtime visibility and drift detection without blocking by default.

APOPHIS observe has two paths:

1. **CLI `apophis observe`**: Validates observe configuration readiness (policy, sinks, sampling, safety boundaries). Introduces no service process or runtime hooks. Use this for CI config validation before deployment.

2. **Programmatic runtime observation**: Register the APOPHIS plugin with `observe.enabled: true` and `observe.sinks` to emit contract pass/violation/error events from live traffic without blocking responses. Sampling controls the fraction of observed requests.

## When to Use It

- **Staging**: Validate observe config before promoting to production
- **Production**: Monitor contract drift without affecting requests
- **Platform teams**: Centralized visibility across services

## Safety Boundaries

Observe mode enforces the following safety boundaries:

- **Non-blocking by default**: Contract violations are logged, not thrown
- **No request failures in non-blocking mode**: Violations are reported instead of thrown
- **Explicit opt-in for blocking**: Requires `allowBlocking: true` in environment policy
- **Production gating**: Blocking is disabled in production by default

## Sink Configuration

Observe mode requires a reporting sink. Enforce this with `requireSink: true` in your environment policy:

```javascript
environments: {
  staging: {
    name: 'staging',
    allowVerify: true,
    allowObserve: true,
    allowQualify: false,
    allowChaos: false,
    allowBlocking: false,
    requireSink: true
  }
}
```

APOPHIS supports these sink types:

- **Logs**: Structured logging of contract violations
- **Metrics**: Counter and histogram metrics for violation rates
- **Traces**: Distributed tracing integration for violation context

## Sampling

Control observation overhead with sampling:

```javascript
profiles: {
  'staging-observe': {
    name: 'staging-observe',
    mode: 'observe',
    preset: 'platform-observe',
    routes: []
  }
}
```

The `platform-observe` preset enables sampling. Configure the rate explicitly:

```javascript
profiles: {
  'staging-observe': {
    mode: 'observe',
    preset: 'platform-observe',
    routes: [],
    sampling: 1.0 // 100% of requests observed
  }
}
```

## Staging vs Production

| Environment | Blocking | Sampling | Sink Required |
|---|---|---|---|
| Staging | No (default) | 100% | Yes |
| Production | No (default) | 100% | Yes |

The default sampling rate is `1.0` (100%) in both environments. Configure a lower rate for production explicitly:

```javascript
profiles: {
  'prod-observe': {
    mode: 'observe',
    preset: 'platform-observe',
    routes: [],
    sampling: 0.1 // 10% of requests observed
  }
}
```

## `--check-config` Flag

Validate config without activating observe mode:

```bash
apophis observe --profile staging-observe --check-config
```

This is useful in CI to ensure observe config is valid before deployment.

## Exit Codes

| Code | Meaning |
|---|---|
| 0 | Observe config is valid and safe |
| 2 | Safety violation or invalid config |

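In a CI script, the two exit codes above can be mapped to a pass/fail decision. The following is a minimal sketch and not part of APOPHIS: the `interpretObserveExit` helper and its result shape are hypothetical, and any code other than `0` or `2` is treated as unexpected because the table does not define it.

```typescript
// Hypothetical helper: maps the documented `apophis observe` exit codes
// to a CI decision. Only 0 and 2 come from the table above.
type ObserveCheck = { ok: boolean; reason: string }

function interpretObserveExit(code: number | null): ObserveCheck {
  switch (code) {
    case 0:
      return { ok: true, reason: 'observe config is valid and safe' }
    case 2:
      return { ok: false, reason: 'safety violation or invalid config' }
    default:
      // Not documented above, so fail closed rather than guess.
      return { ok: false, reason: `unexpected exit code: ${String(code)}` }
  }
}

// Possible wiring (assumes Node.js and `apophis` on PATH):
//   import { spawnSync } from 'node:child_process'
//   const { status } = spawnSync(
//     'apophis',
//     ['observe', '--profile', 'staging-observe', '--check-config'],
//     { stdio: 'inherit' },
//   )
//   if (!interpretObserveExit(status).ok) process.exit(1)
```

Failing closed on undocumented codes keeps a crashed or missing CLI from being mistaken for a valid config.
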
## Config Example

```javascript
// apophis.config.js
export default {
  mode: 'observe',
  profile: 'staging-observe',
  profiles: {
    'staging-observe': {
      name: 'staging-observe',
      mode: 'observe',
      preset: 'platform-observe',
      routes: []
    }
  },
  presets: {
    'platform-observe': {
      name: 'platform-observe',
      timeout: 10000,
      parallel: true,
      chaos: false,
      observe: true
    }
  },
  environments: {
    staging: {
      name: 'staging',
      allowVerify: true,
      allowObserve: true,
      allowQualify: false,
      allowChaos: false,
      allowBlocking: false,
      requireSink: true
    },
    production: {
      name: 'production',
      allowVerify: true,
      allowObserve: true,
      allowQualify: false,
      allowChaos: false,
      allowBlocking: false,
      requireSink: true
    }
  }
};
```

## Programmatic Runtime Activation

The CLI only validates configuration. To activate runtime observation, register
APOPHIS with observe options in your application before routes are registered.
Observe remains active in production because it is non-blocking; blocking
runtime validation still stays disabled in production.

```typescript
import Fastify from 'fastify'
import apophisPlugin from '@apophis/fastify'
import type { ObserveSink, ObserveEvent } from '@apophis/fastify'

const app = Fastify({ logger: true })

// Implement the ObserveSink interface.
// Capture events to your preferred observability backend.
// `myMetrics` is a placeholder for your own metrics client.
const metricsSink: ObserveSink = {
  emit(event: ObserveEvent) {
    // Emit a counter for each contract evaluation
    myMetrics.increment(`apophis.contract.${event.type}`, {
      route: event.route,
      formula: event.formula,
    })

    // Record duration as a histogram
    myMetrics.histogram('apophis.contract.duration_ms', event.durationMs, {
      route: event.route,
    })

    // Log high-signal violations for immediate triage
    // (uses the Fastify instance logger enabled above)
    if (event.type === 'contract.violation') {
      app.log.warn({ event }, 'APOPHIS contract violation')
    }
  },
}

// Register APOPHIS with observe enabled.
// This emits non-blocking contract pass/violation/error events
// for every covered request, gated by sampling.
await app.register(apophisPlugin, {
  runtime: process.env.NODE_ENV === 'test' ? 'error' : 'off',
  observe: {
    enabled: true,
    sampling: 0.1, // observe 10% of requests
    sinks: [metricsSink],
  },
})
```

For new services, `createFastify()` wires discovery and APOPHIS before your
routes, which avoids the most common ordering mistake:

```typescript
import { createFastify } from '@apophis/fastify'

// `metricsSink` is the sink defined in the previous example.
const app = await createFastify({
  logger: true,
  apophis: {
    runtime: process.env.NODE_ENV === 'test' ? 'error' : 'off',
    observe: {
      enabled: true,
      sampling: 0.1,
      sinks: [metricsSink],
    },
  },
})
```

Key constraints:

- Sink `emit()` can be sync or async (returns `void | Promise<void>`).
- Sink rejections and thrown errors are silently caught, so they never affect the route response or status code.
- In production, observe hooks still run when `observe.enabled` and `observe.sinks` are configured; blocking runtime validation does not.
- Sampling is applied per-formula evaluation via `Math.random() < sampling`.
  At `sampling: 1` every formula is emitted. At `sampling: 0` nothing is emitted.
- Only routes with APOPHIS annotations (`x-ensures`, `x-requires`) produce events.
  Routes without annotations are not evaluated in observe mode.

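To make the sampling and error-isolation constraints concrete, here is a minimal sketch of a dispatcher that behaves as the list above describes. It is an illustration only, not the APOPHIS implementation: `shouldSample` and `dispatch` are hypothetical names, and the local `ObserveEvent` and `ObserveSink` types are stand-ins containing only the fields used in this document.

```typescript
// Local stand-ins (assumptions); the real types come from '@apophis/fastify'.
interface ObserveEvent {
  type: string
  route: string
  formula: string
  durationMs: number
}

interface ObserveSink {
  emit(event: ObserveEvent): void | Promise<void>
}

// Sampling gate. Math.random() returns a value in [0, 1), so a rate of 1
// always passes and a rate of 0 never does.
function shouldSample(sampling: number, random: () => number = Math.random): boolean {
  return random() < sampling
}

// Hypothetical dispatcher: hands one event to every sink without awaiting,
// and swallows both synchronous throws and asynchronous rejections.
// Returns the number of sinks that were invoked.
function dispatch(
  event: ObserveEvent,
  sinks: ObserveSink[],
  sampling: number,
  random: () => number = Math.random,
): number {
  if (!shouldSample(sampling, random)) return 0
  let invoked = 0
  for (const sink of sinks) {
    invoked++
    try {
      // Promise.resolve normalizes sync (void) and async sinks.
      Promise.resolve(sink.emit(event)).catch(() => {})
    } catch {
      // A sink that throws synchronously is ignored as well.
    }
  }
  return invoked
}
```

Because nothing is awaited and every failure is swallowed, a slow or broken sink in this sketch cannot delay or fail the caller, which is the property the constraints describe.
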
## Sink Implementations

APOPHIS does not ship with built-in sinks. The `ObserveSink` interface lets you
plug in any backend. Common patterns:

- **OpenTelemetry**: emit counters and histograms via `@opentelemetry/api`.
- **pino logger**: emit structured log records via a pino logger's `info()` / `warn()` methods.
- **Internal metrics service**: POST events to an internal collector endpoint.
- **In-memory ring buffer**: capture recent events for diagnostics endpoints.

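As an illustration of the in-memory ring buffer pattern, the sketch below keeps only the most recent events. `RingBufferSink` and its `recent()` method are hypothetical, and the local `ObserveEvent` and `ObserveSink` types are stand-ins limited to the fields used in this document; in a real service you would import the types from `@apophis/fastify` instead.

```typescript
// Local stand-ins (assumptions); the real types come from '@apophis/fastify'.
interface ObserveEvent {
  type: string
  route: string
  formula: string
  durationMs: number
}

interface ObserveSink {
  emit(event: ObserveEvent): void | Promise<void>
}

// Hypothetical sink that retains the last `capacity` events in memory.
class RingBufferSink implements ObserveSink {
  private readonly capacity: number
  private readonly buffer: ObserveEvent[] = []
  private next = 0 // index of the slot that will be written next

  constructor(capacity: number) {
    if (!Number.isInteger(capacity) || capacity <= 0) {
      throw new RangeError('capacity must be a positive integer')
    }
    this.capacity = capacity
  }

  emit(event: ObserveEvent): void {
    if (this.buffer.length < this.capacity) {
      this.buffer.push(event)
    } else {
      // Buffer is full: overwrite the oldest entry.
      this.buffer[this.next] = event
    }
    this.next = (this.next + 1) % this.capacity
  }

  // Returns the retained events, oldest first.
  recent(): ObserveEvent[] {
    if (this.buffer.length < this.capacity) return [...this.buffer]
    return [...this.buffer.slice(this.next), ...this.buffer.slice(0, this.next)]
  }
}
```

A diagnostics route could return `sink.recent()` as JSON. Memory use stays bounded by `capacity` regardless of traffic.
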
## Monorepo Validation

For monorepos, use `apophis doctor --workspace` to validate observe configuration across all workspace packages. `observe` itself does not support `--workspace`; use `doctor` to check config in each package.

## Mode Mismatch

Profiles configured for `verify` mode will be rejected by `apophis observe`. Only profiles with `mode: 'observe'` are valid.

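The rule can be mirrored in your own tooling, for example to fail fast before invoking the CLI. This is a hedged sketch: `checkObserveProfile` and the `Profile` shape are hypothetical and cover only the fields that appear in this document's profile examples.

```typescript
// Hypothetical profile shape, limited to fields shown in this document.
interface Profile {
  name?: string
  mode: string
  preset?: string
  routes: unknown[]
}

// Hypothetical pre-check: accepts only profiles with mode 'observe'.
function checkObserveProfile(profile: Profile): { ok: boolean; message: string } {
  if (profile.mode !== 'observe') {
    return {
      ok: false,
      message: `profile mode is '${profile.mode}', expected 'observe'`,
    }
  }
  return { ok: true, message: 'profile is valid for observe' }
}
```
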