v1.1.0: pooled runtime, 959 tests, production hardening (0 squash)
# IMHOTEP LLM Skill Guide

Purpose: teach an LLM/operator how to use Imhotep effectively, quickly, and safely in real repos.

Audience: coding agents and engineers adding relational GUI assertions to Playwright suites.

Reading mode: action-first. Prefer concrete patterns over theory.

## 1) Mental Model

Imhotep is for relational UI behavior, not snapshot aesthetics.

Use it to answer:

1. Is element A left/right/above/below B with meaningful bounds?
2. Does this relationship hold across states and environments?
3. Do semantic selectors and layout assertions converge on stable behavior?
4. Do failures explain "why" with machine-usable diagnostics?

Do not use Imhotep as a thin wrapper around raw pixel assertions.

## 2) Operator Workflow (Always Start Here)

When entering a codebase:

1. Add `imhotep-playwright` and Playwright dependencies.
2. Build one passing relation and one intentional failing relation.
3. Confirm `checkAll()` returns structured diagnostics.
4. Add semantic subject assertions (`getByRole`, `getByText`) where possible.
5. Add transform-space checks (`visual` vs `layout`) for transformed UIs.
6. Add state and viewport checks only after baseline relation checks are stable.

Ship small, truthful checks first; expand breadth iteratively.

## 3) Fast Start Template

```ts
import { test, expect } from '@playwright/test'
import { imhotep } from 'imhotep-playwright'

test('layout contract', async ({ page }) => {
  await page.goto('http://localhost:3000')
  // Pass cacheDir: null to avoid geometry cache serialization crash (known issue)
  const ui = await imhotep(page, { deterministic: true, seed: 42, cacheDir: null })

  ui.expect('[data-testid="primary"]').to.be.leftOf('[data-testid="secondary"]', {
    minGap: 8,
    space: 'visual',
  })

  const result = await ui.checkAll()
  expect(result.passed).toBe(true)
})
```

## 4) API Surface You Should Use

Primary public methods:

1. `imhotep(page, options?)`
2. `ui.expect(subject)`
3. `ui.spec(source)`
4. `ui.checkAll()`
5. `ui.extract(subject)`
6. `ui.materializeState(selector, state)`
7. `ui.applyEnvironment(env)`
8. `ui.getByRole/getByText/getByLabelText/getByTestId/locator`

Property-run entry points:

1. `imhotepComponent(component, options)`
2. `imhotepStory(storyId, options)`
3. `imhotepFixture(fixturePath, options)`

## 5) Authoring Quality Ladder

### Bronze (minimum acceptable)

1. At least one relation assertion per critical screen
2. One intentional failing test proving diagnostics are actionable

### Silver (production worthy)

1. Semantic selectors for user-visible elements
2. State-aware checks (`hover`, `focus`, `active`, `disabled`, `checked`, `expanded`, `selected`, `pressed`, `visited`) for critical controls
3. Responsive checks for mobile + desktop viewports

### Gold (high confidence)

1. Space-aware checks where transforms are present
2. Property runs over meaningful prop/input domains
3. Deterministic replay workflows documented in test harness
4. CI gate on both workspace tests and fixture E2E

If tests only assert status booleans and ignore diagnostics, quality is incomplete.

## 6) Relation Checklist by Use Case

### Control Alignment

1. `leftOf/rightOf` with `minGap`
2. `alignedWith` or `centeredWithin` with tolerance where needed

### Containment and Layering

1. `inside` / `contains` for container contracts
2. `overlaps` only when overlap is intentional
3. `inStackingContext` options for layering constraints
4. `separatedFrom` for non-overlap with gap constraints

### Size Contracts

1. `atLeast('44px').wide` for accessible tap-target size
2. `atMost` and `between` for constrained layouts

### Motion/Transform UI

1. assert in default `visual` space first
2. add explicit `space: 'layout'` where pre-transform semantics matter
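
To make the `visual` versus `layout` distinction concrete, here is a minimal geometric sketch. It is not Imhotep's implementation. It assumes `layout` refers to the pre-transform box and `visual` to the post-transform box, and that `leftOf` with a minimum gap holds when A's right edge plus the gap does not pass B's left edge. The `Box` type and the `isLeftOf` and `translateBox` helpers are hypothetical names introduced only for illustration.

```typescript
// Axis-aligned box in CSS pixels (hypothetical type, for illustration only).
interface Box {
  x: number
  y: number
  width: number
  height: number
}

// Assumed semantics: A is leftOf B when A's right edge plus minGap stays at or before B's left edge.
function isLeftOf(a: Box, b: Box, minGap = 0): boolean {
  return a.x + a.width + minGap <= b.x
}

// Visual box after a pure translation, e.g. `transform: translateX(40px)`.
function translateBox(box: Box, dx: number, dy = 0): Box {
  return { x: box.x + dx, y: box.y + dy, width: box.width, height: box.height }
}

const primaryLayout: Box = { x: 0, y: 0, width: 100, height: 40 }
const secondary: Box = { x: 120, y: 0, width: 100, height: 40 }

// The primary element is shifted 40px to the right by a transform.
const primaryVisual = translateBox(primaryLayout, 40)

// Layout space: 0 + 100 + 8 = 108 <= 120, so the relation holds.
const holdsInLayout = isLeftOf(primaryLayout, secondary, 8)
// Visual space: 40 + 100 + 8 = 148 > 120, so the relation fails.
const holdsInVisual = isLeftOf(primaryVisual, secondary, 8)
```

In this sketch the same pair passes in layout space and fails in visual space. That divergence is the reason to assert in `visual` space first and to add `space: 'layout'` only when the pre-transform position is the actual contract.
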
## 7) Semantic Subject Guidance

Prefer semantic sources when they are stable and user-facing:

1. `getByRole(role, { name })`
2. `getByLabelText(label)`
3. `getByText(text)`
4. `getByTestId(id)` as a pragmatic fallback
5. `locator(css)` or raw CSS only when semantics are unavailable

Use mixed semantic + CSS references when migrating legacy suites incrementally.

## 8) Dense String Contracts

Use `ui.spec(...)` when contract sets are easier to maintain as grouped text.

Rules:

1. keep dense specs short and scoped per scenario
2. keep fluent and dense checks semantically equivalent in critical paths
3. use diagnostics from `checkAll()` to tighten ambiguous clauses

### Basic Relation Syntax

Selectors must be single-quoted strings. Relations are keywords, not method calls.

```js
// Spatial relations with gap constraints
ui.spec(`
  '[data-testid="a"]' leftOf '[data-testid="b"]' gap 8px
  '[data-testid="card"]' inside 'viewport'
  '[data-testid="header"]' above '[data-testid="content"]' gap 16px
  '[data-testid="sidebar"]' leftOf '[data-testid="main"]' gap 8px..24px
`)
```

Supported relations: `leftOf`, `rightOf`, `above`, `below`, `alignedWith`, `centeredWithin`, `inside`, `overlaps`, `contains`, `separatedFrom`.

**Fluent API only:**
- Aliases: `beside`, `nextTo`, `adjacent`, `touching`, `near`, `under`, `within`
- `space: 'layout'` / `space: 'visual'` option on relations
- `.and` / `.or` chaining on fluent relations
- State materialization: `disabled`, `checked`, `expanded`, `selected`, `pressed`, `visited`

**Dense DSL only:**
- FOL quantifiers (`forall`, `exists`) with boolean connectives (`and`, `or`, `not`, `implies`)
- `width` / `height` / `size` predicate calls with comparison operators (`>=`, `<=`, `==`, `!=`)
- Frame attachments: `in viewport:`, `in containingBlock(...):`

**Both fluent and dense DSL:**
- `contains`, `separatedFrom`
- `between` size assertions

### Gap Options

```js
ui.spec(`
  // Exact minimum gap
  '.button' leftOf '.label' gap 8px

  // Gap range (between min and max)
  '.button' leftOf '.label' gap 8px..16px
`)
```

### Frame Attachments

Use `in frameName:` with indented assertions to scope relations to a specific frame.

```js
ui.spec(`
  in viewport:
    '[data-testid="a"]' leftOf '[data-testid="b"]'
    '[data-testid="modal"]' centeredWithin 'viewport'

  in containingBlock('[data-testid="parent"]'):
    '.child' inside '.parent'
`)
```

### Compound Assertions

Chain relations with `and` and `or` in dense DSL.

```js
ui.spec(`
  '.header' above '.content' and leftOf '.sidebar'
  '.modal' centeredWithin 'viewport' or inside '.container'
`)
```

### Size Assertions

```js
ui.spec(`
  // Minimum size
  '[data-testid="btn"]' atLeast 44px wide
  '[data-testid="btn"]' atLeast 44px tall

  // Maximum size
  '[data-testid="img"]' atMost 200px wide

  // Size range
  '[data-testid="img"]' between 100px and 200px wide

  // Predicate-style size checks with comparison operators
  forall $btn in buttons('.primary'):
    width($btn) >= 44
    height($btn) >= 44
`)
```

### Quantifiers

Apply `all`, `any`, or `none` to assert over multiple elements.

```js
ui.spec(`
  all '.item' above '.footer' gap 16px
  none '.error' overlaps '.success'
`)
```

### First-Order Logic (FOL)

Use `forall` and `exists` with boolean connectives for complex relational contracts.

```js
ui.spec(`
  // All buttons are at least 44px wide
  forall $btn in buttons('.primary'):
    width($btn) >= 44

  // Existence: at least one card contains a title
  exists $card in cards('.card'):
    descendants($card, '.title')

  // Boolean connectives: and, or, not, implies
  forall $a in elements('.a'):
    forall $b in elements('.b'):
      leftOf($a, $b) and above($a, $b)

  forall $modal in elements('.modal'):
    not overlaps($modal, '.backdrop')

  forall $x in elements('.x'):
    forall $y in elements('.y'):
      inside($x, '.container') implies leftOf($x, $y)
`)
```

Supported connectives: `and`, `or`, `not`, `implies`.

Supported domain constructors: `elements(selector)`, `buttons(selector)`, `cards(selector)`.

Multi-variable formulas: nest `forall` blocks, one variable per block, instead of listing comma-separated variables.

Supported predicates in FOL: `leftOf`, `rightOf`, `above`, `below`, `inside`, `overlaps`, `alignedWith`, `centeredWithin`, `contains`, `separatedFrom`, `width`, `height`, `size`.

### Common Mistakes and Corrections

- **Bare selectors without quotes**: Selectors must be single-quoted strings.
  ```js
  // ❌ Wrong: bare selector
  [data-testid="x"] leftOf [data-testid="y"]

  // ✅ Correct: quoted selector
  '[data-testid="x"]' leftOf '[data-testid="y"]'
  ```

- **Using `is` keyword**: The parser does not accept `is` or `have` as connecting words.
  ```js
  // ❌ Wrong: 'is' is not a valid keyword
  'a' is leftOf 'b'

  // ✅ Correct: direct relation keyword
  'a' leftOf 'b'
  ```

- **Missing gap unit**: Gap values require a unit.
  ```js
  // ❌ Wrong: missing unit
  'a' leftOf 'b' gap 8

  // ✅ Correct: gap with unit
  'a' leftOf 'b' gap 8px
  ```

- **Wrong quote style**: Use single quotes for selectors; double quotes inside are fine.
  ```js
  // ❌ Wrong: double-quoted selector
  "[data-testid='x']" leftOf "[data-testid='y']"

  // ✅ Correct: single-quoted selector with double quotes inside
  '[data-testid="x"]' leftOf '[data-testid="y"]'
  ```
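
The four mistakes above are mechanical enough to catch before a spec string reaches the parser. The sketch below is a hypothetical pre-flight lint, not part of Imhotep: `lintDenseSpec` and `SpecLintIssue` are names invented here, and the regular expressions only approximate the rules as stated in this section (for example, any alphabetic suffix or `%` is accepted as a gap unit, which is an assumption).

```typescript
// Hypothetical pre-flight lint for dense spec strings; not part of Imhotep.
interface SpecLintIssue {
  line: number
  message: string
}

function lintDenseSpec(source: string): SpecLintIssue[] {
  const issues: SpecLintIssue[] = []
  source.split('\n').forEach((raw, index) => {
    const text = raw.trim()
    // Skip blank lines and comment lines.
    if (text === '' || text.indexOf('//') === 0) return
    const line = index + 1
    // Blank out single-quoted selectors so the checks below only see keywords.
    const stripped = text.replace(/'[^']*'/g, "''")

    if (/^[\[.#]/.test(stripped)) {
      issues.push({ line, message: 'selector must be a single-quoted string' })
    }
    if (stripped.charAt(0) === '"') {
      issues.push({ line, message: 'use single quotes around selectors' })
    }
    if (/\b(is|have)\b/.test(stripped)) {
      issues.push({ line, message: '`is` and `have` are not valid connecting words' })
    }
    // Accept `gap 8px` and `gap 8px..16px`; every bound needs a unit.
    const gap = /\bgap\s+(\S+)/.exec(stripped)
    if (gap && !gap[1].split('..').every((part) => /^\d+(\.\d+)?[a-z%]+$/i.test(part))) {
      issues.push({ line, message: 'gap values require a unit, e.g. 8px' })
    }
  })
  return issues
}
```

Calling `lintDenseSpec(source)` on the text passed to `ui.spec(...)` returns an empty array when none of these patterns match. A clean lint result says nothing about whether the parser accepts the spec, so diagnostics from `checkAll()` remain the authority.
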
## 9) Diagnostics You Should Watch

Key codes and meanings:

1. `IMH_SELECTOR_ZERO_MATCHES`: selector resolved to no elements
2. `IMH_EXTRACT_PROTOCOL_ERROR`: extraction path failed
3. relation-specific failures (example: `IMH_RELATION_LEFT_OF_FAILED`)

Operator rule:

Do not silence diagnostics; treat them as contract feedback.
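
One way to treat diagnostics as feedback is to summarize them by code and surface the ones that mean a subject could not be resolved or extracted. The sketch below assumes each diagnostic is an object with a string `code` field. That shape, the `DiagnosticLike` type, and both helper names are assumptions made for illustration, so check them against the payload `checkAll()` actually returns.

```typescript
// Assumed shape: each diagnostic carries a string `code`; other fields are ignored here.
interface DiagnosticLike {
  code: string
  message?: string
}

// Codes named in this guide that indicate a subject could not be resolved or extracted.
const BLOCKING_CODES = ['IMH_SELECTOR_ZERO_MATCHES', 'IMH_EXTRACT_PROTOCOL_ERROR']

// Count how often each diagnostic code occurs.
function countByCode(diagnostics: DiagnosticLike[]): Record<string, number> {
  const counts: Record<string, number> = {}
  for (const d of diagnostics) {
    counts[d.code] = (counts[d.code] ?? 0) + 1
  }
  return counts
}

// Keep only diagnostics whose code is in the blocking list.
function blockingDiagnostics(diagnostics: DiagnosticLike[]): DiagnosticLike[] {
  return diagnostics.filter((d) => BLOCKING_CODES.indexOf(d.code) !== -1)
}

// Hand-written sample payload, not real Imhotep output.
const sampleDiagnostics: DiagnosticLike[] = [
  { code: 'IMH_RELATION_LEFT_OF_FAILED' },
  { code: 'IMH_SELECTOR_ZERO_MATCHES' },
  { code: 'IMH_RELATION_LEFT_OF_FAILED' },
]
const sampleCounts = countByCode(sampleDiagnostics)
const sampleBlocking = blockingDiagnostics(sampleDiagnostics)
```

In a test, asserting that `blockingDiagnostics(...)` is empty alongside `result.passed` keeps a zero-match selector from hiding inside an otherwise green run.
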
## 10) Determinism and Replay

When reproducibility matters:

1. initialize with deterministic options (`deterministic: true` and a fixed `seed`)
2. preserve failing diagnostics payloads in CI artifacts
3. rerun with same seed before changing assertions

If a failure is flaky, first classify whether it is:

1. extraction instability,
2. real layout nondeterminism,
3. threshold too strict for CI hardware.

## 11) CI Integration Pattern

Recommended gates:

1. `npm run build`
2. `npm test --workspaces`
3. `npx playwright test`

For local-path package evaluation in temp projects:

1. install all required local packages, not just `imhotep-playwright`
2. if symlink duplication appears, set `NODE_OPTIONS=--preserve-symlinks`

## 12) Anti-Patterns (Do Not Do This)

1. Writing only `expect(result.passed).toBe(true)` with no diagnostic assertions.
2. Converting every relation to hardcoded pixel math.
3. Ignoring transform-space semantics in transformed UIs.
4. Treating selector zero matches as acceptable in passing tests.
5. Suppressing fail-closed errors without root-cause triage.

## 13) Debugging Playbook

When a relation unexpectedly fails:

1. inspect `result.diagnostics` first
2. inspect `result.clauseResults[*].status/truth/metrics`
3. run `ui.extract(subject)` for both sides to inspect geometry/origin
4. verify state and viewport preconditions are applied
5. for transformed elements, compare `space: 'visual'` vs `'layout'`

When the status is `error` instead of `fail`:

1. suspect extraction or unsupported path
2. verify selector materialization and runtime context
3. fail closed and do not coerce to pass
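
The `fail` versus `error` split can be made explicit in a small triage helper. This is a sketch under assumptions: it treats `status` as one of `'pass'`, `'fail'`, or `'error'`, and it adds a hypothetical `clause` label so results can be named in reports. Neither the exact status values nor the `clause` field is confirmed by this guide.

```typescript
// Assumed status values; `clause` is a hypothetical label for reporting.
type ClauseStatus = 'pass' | 'fail' | 'error'

interface ClauseResultLike {
  clause: string
  status: ClauseStatus
}

interface Triage {
  verdict: ClauseStatus
  failed: string[]
  errored: string[]
}

function triageClauses(results: ClauseResultLike[]): Triage {
  const failed = results.filter((r) => r.status === 'fail').map((r) => r.clause)
  const errored = results.filter((r) => r.status === 'error').map((r) => r.clause)
  // Fail closed: any error outranks fail, and nothing is coerced to pass.
  const verdict: ClauseStatus =
    errored.length > 0 ? 'error' : failed.length > 0 ? 'fail' : 'pass'
  return { verdict, failed, errored }
}

// Hand-written sample, not real Imhotep output.
const sampleTriage = triageClauses([
  { clause: 'primary leftOf secondary', status: 'pass' },
  { clause: 'modal centeredWithin viewport', status: 'fail' },
  { clause: 'tooltip inside card', status: 'error' },
])
```

Ranking `error` above `fail` is a design choice in this sketch. It routes the investigation to extraction and selector problems first, since a relation verdict is not trustworthy while a subject cannot be measured.
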
## 14) Property Run Guidance

Use property runs for invariant classes, not one-off screenshots.

Examples:

1. minimum tap target sizes across prop combinations
2. spacing constraints across variant inputs
3. containment/alignment under generated data

For sampled runs:

1. store seed and failing case metadata
2. shrink only with oracle-preserving checks
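
The two rules for sampled runs can be illustrated without any Imhotep API. The sketch below is a generic seeded property loop, not how `imhotepComponent` or the other entry points work internally. The `ButtonCase` model, its 7px-per-character width formula, and all function names are hypothetical. The oracle is the 44px minimum width used elsewhere in this guide.

```typescript
// Hypothetical button model: width grows with label length and horizontal padding.
interface ButtonCase {
  labelLength: number
  paddingX: number
}

// Linear congruential generator (Numerical Recipes constants); returns floats in [0, 1).
function makeRng(seed: number): () => number {
  let state = seed >>> 0
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296
    return state / 4294967296
  }
}

// Oracle: the modeled width must reach the 44px minimum.
function meetsMinWidth(c: ButtonCase): boolean {
  return c.paddingX * 2 + c.labelLength * 7 >= 44
}

interface RunReport {
  seed: number
  failing: ButtonCase | null
}

// Sample cases from the seed and report the first one that violates the oracle.
function runProperty(seed: number, samples: number): RunReport {
  const rng = makeRng(seed)
  for (let i = 0; i < samples; i++) {
    const c: ButtonCase = {
      labelLength: Math.floor(rng() * 21),
      paddingX: Math.floor(rng() * 17),
    }
    if (!meetsMinWidth(c)) return { seed, failing: c }
  }
  return { seed, failing: null }
}

// Shrink one step at a time, keeping a candidate only if the same oracle still fails.
function shrinkCase(c: ButtonCase): ButtonCase {
  let current = c
  for (;;) {
    const candidates: ButtonCase[] = [
      { labelLength: current.labelLength - 1, paddingX: current.paddingX },
      { labelLength: current.labelLength, paddingX: current.paddingX - 1 },
    ]
    const next = candidates.filter(
      (n) => n.labelLength >= 0 && n.paddingX >= 0 && !meetsMinWidth(n),
    )[0]
    if (!next) return current
    current = next
  }
}
```

Because the report carries the seed, calling `runProperty` again with the same seed and sample count replays the same sequence of cases. `shrinkCase` only accepts a smaller case when `meetsMinWidth` still fails for it, which is what keeps the shrunk case a witness to the original violation.
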
## 15) Contract Evolution Strategy

When tightening contracts in existing suites:

1. start with smoke relation checks per page
2. add semantic subjects gradually
3. introduce state assertions where user behavior depends on state
4. introduce responsive and transform-space assertions next
5. move shared checks into helper modules only after semantics stabilize

## 16) Documentation Pointers

1. `README.md` for usage and quickstart
2. `SKILLS.md` for authoring patterns and DSL syntax
3. `BUILD.md` for build/test/e2e commands
4. `CHANGELOG.md` for release notes and known limitations
5. `SECURITY.md` for trust boundaries

## 17) Final Rule for LLM Operators

Imhotep is valuable only when it encodes user-visible layout truths.

Ask for every critical view:

1. Which spatial relationships must always hold?
2. Which relationships change with state or viewport?
3. Which semantic subjects best represent user intent?
4. What diagnostic evidence will prove regressions quickly?

Write those assertions first. Keep them deterministic. Fail closed.