# IMHOTEP LLM Skill Guide
Purpose: teach an LLM/operator how to use Imhotep effectively, quickly, and safely in real repos.
Audience: coding agents and engineers adding relational GUI assertions to Playwright suites.
Reading mode: action-first. Prefer concrete patterns over theory.
## 1) Mental Model
Imhotep is for relational UI behavior, not snapshot aesthetics.
Use it to answer:
1. Is element A left/right/above/below B with meaningful bounds?
2. Does this relationship hold across states and environments?
3. Do semantic selectors and layout assertions converge on stable behavior?
4. Do failures explain "why" with machine-usable diagnostics?
Do not use Imhotep as a thin wrapper around raw pixel assertions.
## 2) Operator Workflow (Always Start Here)
When entering a codebase:
1. Add `imhotep-playwright` and Playwright dependencies.
2. Build one passing relation and one intentional failing relation.
3. Confirm `checkAll()` returns structured diagnostics.
4. Add semantic subject assertions (`getByRole`, `getByText`) where possible.
5. Add transform-space checks (`visual` vs `layout`) for transformed UIs.
6. Add state and viewport checks only after baseline relation checks are stable.
Ship small, truthful checks first; expand breadth iteratively.
## 3) Fast Start Template
```ts
import { test, expect } from '@playwright/test'
import { imhotep } from 'imhotep-playwright'
test('layout contract', async ({ page }) => {
  await page.goto('http://localhost:3000')
  // Pass cacheDir: null to avoid geometry cache serialization crash (known issue)
  const ui = await imhotep(page, { deterministic: true, seed: 42, cacheDir: null })
  ui.expect('[data-testid="primary"]').to.be.leftOf('[data-testid="secondary"]', {
    minGap: 8,
    space: 'visual',
  })
  const result = await ui.checkAll()
  expect(result.passed).toBe(true)
})
```
## 4) API Surface You Should Use
Primary public methods:
1. `imhotep(page, options?)`
2. `ui.expect(subject)`
3. `ui.spec(source)`
4. `ui.checkAll()`
5. `ui.extract(subject)`
6. `ui.materializeState(selector, state)`
7. `ui.applyEnvironment(env)`
8. `ui.getByRole/getByText/getByLabelText/getByTestId/locator`
Property-run entry points:
1. `imhotepComponent(component, options)`
2. `imhotepStory(storyId, options)`
3. `imhotepFixture(fixturePath, options)`
## 5) Authoring Quality Ladder
### Bronze (minimum acceptable)
1. At least one relation assertion per critical screen
2. One intentional failing test proving diagnostics are actionable
### Silver (production worthy)
1. Semantic selectors for user-visible elements
2. State-aware checks (`hover`, `focus`, `active`, `disabled`, `checked`, `expanded`, `selected`, `pressed`, `visited`) for critical controls
3. Responsive checks for mobile + desktop viewports
### Gold (high confidence)
1. Space-aware checks where transforms are present
2. Property runs over meaningful prop/input domains
3. Deterministic replay workflows documented in test harness
4. CI gate on both workspace tests and fixture E2E
If tests only assert status booleans and ignore diagnostics, quality is incomplete.
## 6) Relation Checklist by Use Case
### Control Alignment
1. `leftOf/rightOf` with `minGap`
2. `alignedWith` or `centeredWithin` with tolerance where needed
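To make the two checks above concrete, here is a minimal sketch of how a gap-bounded `leftOf` and a tolerance-bounded horizontal centering check could be read on plain bounding boxes. The `Box` shape, the helper names, and the default `minGap` of 0 are assumptions for illustration, not Imhotep's implementation.

```ts
// Hypothetical box shape in CSS pixels; not an Imhotep type.
interface Box { x: number; y: number; width: number; height: number }

// Distance from a's right edge to b's left edge (negative means overlap).
function horizontalGap(a: Box, b: Box): number {
  return b.x - (a.x + a.width);
}

// Assumed reading of `a leftOf b`: a ends before b starts, with at least minGap between them.
function isLeftOf(a: Box, b: Box, opts: { minGap?: number } = {}): boolean {
  const minGap = opts.minGap ?? 0;
  return horizontalGap(a, b) >= minGap;
}

// Assumed reading of horizontal centering: centers differ by at most `tolerance` pixels.
function isCenteredHorizontally(a: Box, b: Box, tolerance = 0): boolean {
  const centerA = a.x + a.width / 2;
  const centerB = b.x + b.width / 2;
  return Math.abs(centerA - centerB) <= tolerance;
}

const primary: Box = { x: 16, y: 16, width: 120, height: 44 };
const secondary: Box = { x: 148, y: 16, width: 120, height: 44 };

console.log(horizontalGap(primary, secondary));            // 12
console.log(isLeftOf(primary, secondary, { minGap: 8 }));  // true
console.log(isLeftOf(primary, secondary, { minGap: 16 })); // false
```

The point of the sketch is the shape of the contract (a relation plus a bound), which is what the fluent options express without hand-written pixel math in the test.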
### Containment and Layering
1. `inside` / `contains` for container contracts
2. `overlaps` only when overlap is intentional
3. `inStackingContext` options for layering constraints
4. `separatedFrom` for non-overlap with gap constraints
### Size Contracts
1. `atLeast('44px').wide` for accessible tap-target sizes
2. `atMost` and `between` for constrained layouts
### Motion/Transform UI
1. assert in default `visual` space first
2. add explicit `space: 'layout'` where pre-transform semantics matter
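The difference between the two spaces can be sketched with plain geometry. The example below assumes a uniform scale about the box center plus a translation; the `Box` and `Transform` shapes and the helper names are hypothetical and say nothing about how Imhotep extracts geometry.

```ts
// Hypothetical shapes in CSS pixels; not Imhotep types.
interface Box { x: number; y: number; width: number; height: number }
interface Transform { scale: number; translateX: number; translateY: number }

// Maps a pre-transform (layout) box to its post-transform (visual) box,
// assuming the scale is applied about the box center and the translation
// is applied unscaled.
function toVisualBox(layout: Box, t: Transform): Box {
  const width = layout.width * t.scale;
  const height = layout.height * t.scale;
  const centerX = layout.x + layout.width / 2 + t.translateX;
  const centerY = layout.y + layout.height / 2 + t.translateY;
  return { x: centerX - width / 2, y: centerY - height / 2, width, height };
}

// Distance from a's right edge to b's left edge (negative means overlap).
function horizontalGap(a: Box, b: Box): number {
  return b.x - (a.x + a.width);
}

const cardLayout: Box = { x: 100, y: 0, width: 80, height: 40 };
const neighbor: Box = { x: 200, y: 0, width: 80, height: 40 };
const hoverScale: Transform = { scale: 1.5, translateX: 0, translateY: 0 };
const cardVisual = toVisualBox(cardLayout, hoverScale);

console.log(horizontalGap(cardLayout, neighbor)); // 20 (layout space)
console.log(horizontalGap(cardVisual, neighbor)); // 0  (visual space)
```

A `minGap: 8` contract would hold in layout space here and fail in visual space, which is why the space should be chosen deliberately for transformed UIs.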
## 7) Semantic Subject Guidance
Prefer semantic sources when they are stable and user-facing:
1. `getByRole(role, { name })`
2. `getByLabelText(label)`
3. `getByText(text)`
4. `getByTestId(id)` as a pragmatic fallback
5. `locator(css)` or raw CSS only when semantics are unavailable
Use mixed semantic + CSS references when migrating legacy suites incrementally.
## 8) Dense String Contracts
Use `ui.spec(...)` when contract sets are easier to maintain as grouped text.
Rules:
1. keep dense specs short and scoped per scenario
2. keep fluent and dense checks semantically equivalent in critical paths
3. use diagnostics from `checkAll()` to tighten ambiguous clauses
### Basic Relation Syntax
Selectors must be single-quoted strings. Relations are keywords, not method calls.
```js
// Spatial relations with gap constraints
ui.spec(`
'[data-testid="a"]' leftOf '[data-testid="b"]' gap 8px
'[data-testid="card"]' inside 'viewport'
'[data-testid="header"]' above '[data-testid="content"]' gap 16px
'[data-testid="sidebar"]' leftOf '[data-testid="main"]' gap 8px..24px
`)
```
Supported relations: `leftOf`, `rightOf`, `above`, `below`, `alignedWith`, `centeredWithin`, `inside`, `overlaps`, `contains`, `separatedFrom`.
**Fluent API only:**
- Aliases: `beside`, `nextTo`, `adjacent`, `touching`, `near`, `under`, `within`
- `space: 'layout'` / `space: 'visual'` option on relations
- `.and` / `.or` chaining on fluent relations
- State materialization: `disabled`, `checked`, `expanded`, `selected`, `pressed`, `visited`
**Dense DSL only:**
- FOL quantifiers (`forall`, `exists`) with boolean connectives (`and`, `or`, `not`, `implies`)
- `width` / `height` / `size` predicate calls with comparison operators (`>=`, `<=`, `==`, `!=`)
- Frame attachments: `in viewport:`, `in containingBlock(...):`
**Both fluent and dense DSL:**
- `contains`, `separatedFrom`
- `between` size assertions
### Gap Options
```js
ui.spec(`
// Minimum gap
'.button' leftOf '.label' gap 8px
// Gap range (between min and max)
'.button' leftOf '.label' gap 8px..16px
`)
```
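As a reading aid for the two gap forms, the sketch below turns a gap clause into a numeric range and checks a measured gap against it. It is a hypothetical helper, not Imhotep's parser, and it handles only `px` because that is the only unit shown in this guide.

```ts
interface GapRange { min: number; max: number }

// Parses "8px" (minimum only) or "8px..16px" (minimum and maximum).
function parseGap(text: string): GapRange {
  const match = /^(\d+(?:\.\d+)?)px(?:\.\.(\d+(?:\.\d+)?)px)?$/.exec(text.trim());
  if (match === null) {
    throw new Error(`Invalid gap "${text}": expected a form like 8px or 8px..16px`);
  }
  const min = Number(match[1]);
  const max = match[2] === undefined ? Infinity : Number(match[2]);
  if (max < min) {
    throw new Error(`Invalid gap "${text}": maximum is smaller than minimum`);
  }
  return { min, max };
}

// True when the measured gap (in px) falls inside the parsed range.
function gapSatisfied(measuredPx: number, range: GapRange): boolean {
  return measuredPx >= range.min && measuredPx <= range.max;
}

console.log(gapSatisfied(12, parseGap('8px')));       // true
console.log(gapSatisfied(20, parseGap('8px..16px'))); // false
```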
### Frame Attachments
Use `in frameName:` with indented assertions to scope relations to a specific frame.
```js
ui.spec(`
in viewport:
  '[data-testid="a"]' leftOf '[data-testid="b"]'
  '[data-testid="modal"]' centeredWithin 'viewport'
in containingBlock('[data-testid="parent"]'):
  '.child' inside '.parent'
`)
```
### Compound Assertions
Chain relations with `and` and `or` in dense DSL.
```js
ui.spec(`
'.header' above '.content' and leftOf '.sidebar'
'.modal' centeredWithin 'viewport' or inside '.container'
`)
```
### Size Assertions
```js
ui.spec(`
// Minimum size
'[data-testid="btn"]' atLeast 44px wide
'[data-testid="btn"]' atLeast 44px tall
// Maximum size
'[data-testid="img"]' atMost 200px wide
// Size range
'[data-testid="img"]' between 100px and 200px wide
// Predicate-style size checks with comparison operators
forall $btn in buttons('.primary'):
  width($btn) >= 44
  height($btn) >= 44
`)
```
### Quantifiers
Apply `all`, `any`, or `none` to assert over multiple elements.
```js
ui.spec(`
all '.item' above '.footer' gap 16px
none '.error' overlaps '.success'
`)
```
### First-Order Logic (FOL)
Use `forall` and `exists` with boolean connectives for complex relational contracts.
```js
ui.spec(`
// All buttons are at least 44px wide
forall $btn in buttons('.primary'):
  width($btn) >= 44
// Existence: at least one card contains a title
exists $card in cards('.card'):
  descendants($card, '.title')
// Boolean connectives: and, or, not, implies
forall $a in elements('.a'):
  forall $b in elements('.b'):
    leftOf($a, $b) and above($a, $b)
forall $modal in elements('.modal'):
  not overlaps($modal, '.backdrop')
forall $x in elements('.x'):
  forall $y in elements('.y'):
    inside($x, '.container') implies leftOf($x, $y)
`)
```
Supported connectives: `and`, `or`, `not`, `implies`.
Supported domain constructors: `elements(selector)`, `buttons(selector)`, `cards(selector)`.
For multi-variable formulas, nest `forall` blocks rather than listing comma-separated variables.
Supported predicates in FOL: `leftOf`, `rightOf`, `above`, `below`, `inside`, `overlaps`, `alignedWith`, `centeredWithin`, `contains`, `separatedFrom`, `width`, `height`, `size`.
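The logical reading of nested quantifiers and `implies` can be sketched with ordinary array operations. The `Box` shape and the predicate bodies below are assumptions for illustration; only the standard logic (nested `forall` as nested `every`, `exists` as `some`, `p implies q` as `!p || q`) is certain.

```ts
// Hypothetical box shape in CSS pixels; not an Imhotep type.
interface Box { x: number; y: number; width: number; height: number }

// Simplified predicates on boxes, assumed for this sketch.
function leftOf(a: Box, b: Box): boolean { return a.x + a.width <= b.x; }
function above(a: Box, b: Box): boolean { return a.y + a.height <= b.y; }

// Material implication: false only when p holds and q does not.
function implies(p: boolean, q: boolean): boolean { return !p || q; }

// forall $a in domainA: forall $b in domainB: pred($a, $b)
function forallPairs(domainA: Box[], domainB: Box[], pred: (a: Box, b: Box) => boolean): boolean {
  return domainA.every((a) => domainB.every((b) => pred(a, b)));
}

// exists $x in domain: pred($x)
function existsIn(domain: Box[], pred: (x: Box) => boolean): boolean {
  return domain.some(pred);
}

const groupA: Box[] = [{ x: 0, y: 0, width: 10, height: 10 }];
const groupB: Box[] = [{ x: 20, y: 20, width: 10, height: 10 }];

console.log(forallPairs(groupA, groupB, (a, b) => leftOf(a, b) && above(a, b))); // true
console.log(forallPairs([], groupB, (a, b) => leftOf(a, b)));                    // true (empty domain)
```

Note the second result: in standard logic a `forall` over an empty domain is vacuously true. Whether Imhotep reports it that way is not stated here, but it is one more reason to treat zero-match selectors as a problem rather than a pass.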
### Common Mistakes and Corrections
- **Bare selectors without quotes**: Selectors must be single-quoted strings.
```js
// ❌ Wrong — bare selector
[data-testid="x"] leftOf [data-testid="y"]
// ✅ Correct — quoted selector
'[data-testid="x"]' leftOf '[data-testid="y"]'
```
- **Using `is` keyword**: The parser does not accept `is` or `have` as connecting words.
```js
// ❌ Wrong — 'is' is not a valid keyword
'a' is leftOf 'b'
// ✅ Correct — direct relation keyword
'a' leftOf 'b'
```
- **Missing gap unit**: Gap values require a unit.
```js
// ❌ Wrong — missing unit
'a' leftOf 'b' gap 8
// ✅ Correct — gap with unit
'a' leftOf 'b' gap 8px
```
- **Wrong quote style**: Use single quotes for selectors; double quotes inside are fine.
```js
// ❌ Wrong — double-quoted selector
"[data-testid='x']" leftOf "[data-testid='y']"
// ✅ Correct — single-quoted selector with double quotes inside
'[data-testid="x"]' leftOf '[data-testid="y"]'
```
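The four mistakes above are mechanical enough to catch before a spec reaches the parser. The following is a heuristic sketch of such a pre-flight check for a single relation line; `lintRelationLine` is a hypothetical helper, not part of Imhotep, and it does not attempt to validate full DSL syntax.

```ts
// Returns human-readable problems for one dense relation line (empty when none are found).
function lintRelationLine(line: string): string[] {
  const problems: string[] = [];
  const text = line.trim();

  if (/^"/.test(text)) {
    // Line starts with a double-quoted selector.
    problems.push('selector uses double quotes; wrap it in single quotes');
  } else if (/^[\[.#]/.test(text)) {
    // Line starts with a raw attribute, class, or id selector.
    problems.push('bare selector; wrap it in single quotes');
  }

  if (/'\s+(?:is|have)\s/.test(text)) {
    // A connecting word follows a quoted selector.
    problems.push('remove the connecting word; write the relation keyword directly');
  }

  const gap = /\sgap\s+(\S+)/.exec(text);
  if (gap !== null) {
    // Flag any bound of the gap (single value or min..max) that is a bare number.
    const bounds = gap[1].split('..');
    if (bounds.some((bound) => /^\d+(?:\.\d+)?$/.test(bound))) {
      problems.push('gap value is missing a unit such as px');
    }
  }
  return problems;
}

console.log(lintRelationLine(`'a' is leftOf 'b'`).length);        // 1
console.log(lintRelationLine(`'a' leftOf 'b' gap 8px`).length);   // 0
```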
## 9) Diagnostics You Should Watch
Key codes and meanings:
1. `IMH_SELECTOR_ZERO_MATCHES`: selector resolved to no elements
2. `IMH_EXTRACT_PROTOCOL_ERROR`: extraction path failed
3. relation-specific failures (example: `IMH_RELATION_LEFT_OF_FAILED`)
Operator rule:
Do not silence diagnostics; treat them as contract feedback.
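One way to act on that rule is to route each diagnostic code to a next step rather than discarding it. The sketch below assumes a diagnostic object with a `code` field; that shape, the `Action` labels, and the `IMH_RELATION_` prefix match are illustrative assumptions, and only the three code names come from this guide.

```ts
// Assumed diagnostic shape; not taken from Imhotep's type definitions.
interface Diagnostic { code: string; message?: string }

type Action = 'fix-selector' | 'investigate-extraction' | 'review-contract' | 'unclassified';

// Maps a diagnostic code to the kind of follow-up it suggests.
function actionFor(code: string): Action {
  if (code === 'IMH_SELECTOR_ZERO_MATCHES') return 'fix-selector';
  if (code === 'IMH_EXTRACT_PROTOCOL_ERROR') return 'investigate-extraction';
  if (code.indexOf('IMH_RELATION_') === 0) return 'review-contract';
  return 'unclassified';
}

// Counts diagnostics per follow-up action so nothing is silently dropped.
function countByAction(diagnostics: Diagnostic[]): Record<Action, number> {
  const counts: Record<Action, number> = {
    'fix-selector': 0,
    'investigate-extraction': 0,
    'review-contract': 0,
    'unclassified': 0,
  };
  for (const diagnostic of diagnostics) {
    counts[actionFor(diagnostic.code)] += 1;
  }
  return counts;
}

const sample: Diagnostic[] = [
  { code: 'IMH_SELECTOR_ZERO_MATCHES' },
  { code: 'IMH_RELATION_LEFT_OF_FAILED' },
];
console.log(countByAction(sample)['fix-selector']); // 1
```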
## 10) Determinism and Replay
When reproducibility matters:
1. initialize with deterministic options (`deterministic: true` and a fixed `seed`)
2. preserve failing diagnostics payloads in CI artifacts
3. rerun with same seed before changing assertions
If a failure is flaky, first classify whether it is:
1. extraction instability,
2. real layout nondeterminism,
3. threshold too strict for CI hardware.
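Why a stored seed is enough for replay can be shown with any seeded generator. The sketch below uses a small mulberry32-style PRNG to sample viewport widths; it is a generic illustration, and how Imhotep itself consumes `seed` is not described in this guide. The width range of 320 to 1920 is an arbitrary assumption.

```ts
// Small seeded PRNG (mulberry32-style); returns values in [0, 1).
function seededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Samples `count` viewport widths between 320 and 1920 px (inclusive) from a seed.
function sampleViewportWidths(seed: number, count: number): number[] {
  const random = seededRandom(seed);
  const widths: number[] = [];
  for (let i = 0; i < count; i += 1) {
    widths.push(320 + Math.floor(random() * 1601));
  }
  return widths;
}

// The same seed always reproduces the same sampled cases.
const firstRun = sampleViewportWidths(42, 5);
const replay = sampleViewportWidths(42, 5);
console.log(firstRun.join(',') === replay.join(',')); // true
```

If a rerun with the same seed does not reproduce the failure, the nondeterminism is coming from somewhere other than sampling, which narrows the classification above.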
## 11) CI Integration Pattern
Recommended gates:
1. `npm run build`
2. `npm test --workspaces`
3. `npx playwright test`
For local-path package evaluation in temp projects:
1. install all required local packages, not just `imhotep-playwright`
2. if symlink duplication appears, set `NODE_OPTIONS=--preserve-symlinks`
## 12) Anti-Patterns (Do Not Do This)
1. Writing only `expect(result.passed).toBe(true)` with no diagnostic assertions.
2. Converting every relation to hardcoded pixel math.
3. Ignoring transform-space semantics in transformed UIs.
4. Treating selector zero matches as acceptable in passing tests.
5. Suppressing fail-closed errors without root-cause triage.
## 13) Debugging Playbook
When a relation unexpectedly fails:
1. inspect `result.diagnostics` first
2. inspect `result.clauseResults[*].status/truth/metrics`
3. run `ui.extract(subject)` for both sides to inspect geometry/origin
4. verify state and viewport preconditions are applied
5. for transformed elements, compare `space: 'visual'` vs `'layout'`
When failure is `error` instead of `fail`:
1. suspect extraction or unsupported path
2. verify selector materialization and runtime context
3. fail closed and do not coerce to pass
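A fail-closed verdict can be sketched as a small reduction over clause results. The `ClauseResult` shape below is assumed from the fields named above (`status`, `truth`, `metrics`), and the `'pass'` status value is a guess; only `fail` and `error` are named in this guide.

```ts
// Assumed shapes; not taken from Imhotep's type definitions.
type ClauseStatus = 'pass' | 'fail' | 'error';
interface ClauseResult {
  status: ClauseStatus;
  truth?: boolean;
  metrics?: Record<string, number>;
}
interface Verdict { ok: boolean; failed: number; errored: number }

// Fail closed: errors count against the verdict, and an empty result set
// is treated as "nothing was proven" rather than as a pass.
function failClosedVerdict(results: ClauseResult[]): Verdict {
  const failed = results.filter((r) => r.status === 'fail').length;
  const errored = results.filter((r) => r.status === 'error').length;
  const ok = results.length > 0 && failed === 0 && errored === 0;
  return { ok, failed, errored };
}

console.log(failClosedVerdict([{ status: 'pass' }, { status: 'error' }]).ok); // false
console.log(failClosedVerdict([]).ok);                                        // false
```

Keeping `failed` and `errored` as separate counts preserves the distinction the playbook draws: a `fail` points at the layout contract, an `error` points at extraction or an unsupported path.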
## 14) Property Run Guidance
Use property runs for invariant classes, not one-off screenshots.
Examples:
1. minimum tap target sizes across prop combinations
2. spacing constraints across variant inputs
3. containment/alignment under generated data
For sampled runs:
1. store seed and failing case metadata
2. shrink only with oracle-preserving checks
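"Oracle-preserving" shrinking means a smaller case is accepted only if it still fails for the same reason as the original. A minimal sketch for a single integer input follows; `shrinkInt`, the oracle signature, and the label-length scenario are hypothetical, and the threshold of 24 is invented for the example.

```ts
// The oracle returns the failing diagnostic code for an input, or null when the input passes.
type Oracle = (input: number) => string | null;

// Shrinks a failing integer toward `floor`, keeping only candidates that
// fail with the same code as the original input.
function shrinkInt(failing: number, floor: number, oracle: Oracle): number {
  const targetCode = oracle(failing);
  if (targetCode === null) {
    throw new Error('shrinkInt needs an input that fails');
  }
  let current = failing;
  let progressed = true;
  while (progressed) {
    progressed = false;
    // Try the most aggressive candidate first, then the midpoint, then one step down.
    const candidates = [floor, floor + Math.floor((current - floor) / 2), current - 1];
    for (const candidate of candidates) {
      if (candidate >= floor && candidate < current && oracle(candidate) === targetCode) {
        current = candidate;
        progressed = true;
        break;
      }
    }
  }
  return current;
}

// Hypothetical scenario: labels longer than 24 characters break a leftOf contract,
// while an empty label removes the element and fails for a different reason.
const labelOracle: Oracle = (length) => {
  if (length === 0) return 'IMH_SELECTOR_ZERO_MATCHES';
  return length > 24 ? 'IMH_RELATION_LEFT_OF_FAILED' : null;
};

console.log(shrinkInt(80, 0, labelOracle)); // 25
```

The candidate `0` is rejected even though it fails, because it fails with a different code; accepting it would replace the original bug with an unrelated one.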
## 15) Contract Evolution Strategy
When tightening contracts in existing suites:
1. start with smoke relation checks per page
2. add semantic subjects gradually
3. introduce state assertions where user behavior depends on state
4. introduce responsive and transform-space assertions next
5. move shared checks into helper modules only after semantics stabilize
## 16) Documentation Pointers
1. `README.md` for usage and quickstart
2. `SKILLS.md` for authoring patterns and DSL syntax
3. `BUILD.md` for build/test/e2e commands
4. `CHANGELOG.md` for release notes and known limitations
5. `SECURITY.md` for trust boundaries
## 17) Final Rule for LLM Operators
Imhotep is valuable only when it encodes user-visible layout truths.
Ask for every critical view:
1. Which spatial relationships must always hold?
2. Which relationships change with state or viewport?
3. Which semantic subjects best represent user intent?
4. What diagnostic evidence will prove regressions quickly?
Write those assertions first. Keep them deterministic. Fail closed.