IMHOTEP LLM Skill Guide
Purpose: teach an LLM/operator how to use Imhotep effectively, quickly, and safely in real repos.
Audience: coding agents and engineers adding relational GUI assertions to Playwright suites.
Reading mode: action-first. Prefer concrete patterns over theory.
1) Mental Model
Imhotep is for relational UI behavior, not snapshot aesthetics.
Use it to answer:
- Is element A left/right/above/below B with meaningful bounds?
- Does this relationship hold across states and environments?
- Do semantic selectors and layout assertions converge on stable behavior?
- Do failures explain "why" with machine-usable diagnostics?
Do not use Imhotep as a thin wrapper around raw pixel assertions.
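To make "relation with meaningful bounds" concrete, here is a minimal sketch of what a leftOf check with a minimum gap means geometrically. The Rect type and the leftOf function below are hypothetical stand-ins written for illustration; they are not Imhotep exports and do not claim to mirror its internals.

```typescript
// Axis-aligned box in CSS pixels (hypothetical type, not an Imhotep export).
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface RelationOutcome {
  holds: boolean;
  gap: number; // distance from a's right edge to b's left edge
}

// "a is leftOf b with at least minGap pixels between them".
function leftOf(a: Rect, b: Rect, minGap = 0): RelationOutcome {
  const gap = b.x - (a.x + a.width);
  return { holds: gap >= minGap, gap };
}

const primary: Rect = { x: 16, y: 20, width: 100, height: 40 };
const secondary: Rect = { x: 128, y: 20, width: 100, height: 40 };

// primary ends at x = 116 and secondary starts at x = 128, so the gap is 12.
console.log(leftOf(primary, secondary, 8)); // { holds: true, gap: 12 }
```

The assertion names a relationship and a bound instead of two absolute coordinates, so it survives a layout shift that moves both elements together.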
2) Operator Workflow (Always Start Here)
When entering a codebase:
- Add imhotep-playwright and Playwright dependencies.
- Build one passing relation and one intentional failing relation.
- Confirm checkAll() returns structured diagnostics.
- Add semantic subject assertions (getByRole, getByText) where possible.
- Add transform-space checks (visual vs layout) for transformed UIs.
- Add state and viewport checks only after baseline relation checks are stable.
Ship small, truthful checks first; expand breadth iteratively.
3) Fast Start Template
import { test, expect } from '@playwright/test'
import { imhotep } from 'imhotep-playwright'
test('layout contract', async ({ page }) => {
  await page.goto('http://localhost:3000')
  // Pass cacheDir: null to avoid geometry cache serialization crash (known issue)
  const ui = await imhotep(page, { deterministic: true, seed: 42, cacheDir: null })
  ui.expect('[data-testid="primary"]').to.be.leftOf('[data-testid="secondary"]', {
    minGap: 8,
    space: 'visual',
  })
  const result = await ui.checkAll()
  expect(result.passed).toBe(true)
})
4) API Surface You Should Use
Primary public methods:
- imhotep(page, options?)
- ui.expect(subject)
- ui.spec(source)
- ui.checkAll()
- ui.extract(subject)
- ui.materializeState(selector, state)
- ui.applyEnvironment(env)
- ui.getByRole/getByText/getByLabelText/getByTestId/locator
Property-run entry points:
- imhotepComponent(component, options)
- imhotepStory(storyId, options)
- imhotepFixture(fixturePath, options)
5) Authoring Quality Ladder
Bronze (minimum acceptable)
- At least one relation assertion per critical screen
- One intentional failing test proving diagnostics are actionable
Silver (production worthy)
- Semantic selectors for user-visible elements
- State-aware checks (hover, focus, active, disabled, checked, expanded, selected, pressed, visited) for critical controls
- Responsive checks for mobile + desktop viewports
Gold (high confidence)
- Space-aware checks where transforms are present
- Property runs over meaningful prop/input domains
- Deterministic replay workflows documented in test harness
- CI gate on both workspace tests and fixture E2E
If tests only assert status booleans and ignore diagnostics, quality is incomplete.
6) Relation Checklist by Use Case
Control Alignment
- leftOf/rightOf with minGap
- alignedWith or centeredWithin with tolerance where needed
Containment and Layering
- inside/contains for container contracts
- overlaps only when overlap is intentional
- inStackingContext options for layering constraints
- separatedFrom for non-overlap with gap constraints
Size Contracts
- atLeast('44px').wide for target accessibility
- atMost and between for constrained layouts
Motion/Transform UI
- assert in default visual space first
- add explicit space: 'layout' where pre-transform semantics matter
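Why the space option matters can be seen with plain geometry. The sketch below assumes a simple translate transform and uses hypothetical helpers (toVisual, horizontalGap); it illustrates the difference between pre-transform and post-transform boxes and is not how Imhotep computes either space.

```typescript
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface Translate {
  dx: number;
  dy: number;
}

// Layout box moved by a CSS translate: this is the box the user actually sees.
function toVisual(layout: Rect, t: Translate): Rect {
  return { ...layout, x: layout.x + t.dx, y: layout.y + t.dy };
}

// Distance from a's right edge to b's left edge; negative means overlap.
function horizontalGap(a: Rect, b: Rect): number {
  return b.x - (a.x + a.width);
}

const badgeLayout: Rect = { x: 0, y: 0, width: 40, height: 40 };
const labelLayout: Rect = { x: 48, y: 0, width: 120, height: 40 };

// Suppose the badge carries transform: translateX(24px).
const badgeVisual = toVisual(badgeLayout, { dx: 24, dy: 0 });

const layoutGap = horizontalGap(badgeLayout, labelLayout); // 8
const visualGap = horizontalGap(badgeVisual, labelLayout); // -16

console.log({ layoutGap, visualGap });
```

In this example a leftOf contract with an 8px minimum gap holds in layout space and fails in visual space, which is why the guide asks for visual checks first and layout checks only where pre-transform semantics are the real requirement.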
7) Semantic Subject Guidance
Prefer semantic sources when they are stable and user-facing:
- getByRole(role, { name })
- getByLabelText(label)
- getByText(text)
- getByTestId(id) as a pragmatic fallback
- locator(css) or raw CSS only when semantics are unavailable
Use mixed semantic + CSS references when migrating legacy suites incrementally.
8) Dense String Contracts
Use ui.spec(...) when contract sets are easier to maintain as grouped text.
Rules:
- keep dense specs short and scoped per scenario
- keep fluent and dense checks semantically equivalent in critical paths
- use diagnostics from checkAll() to tighten ambiguous clauses
Basic Relation Syntax
Selectors must be single-quoted strings. Relations are keywords, not method calls.
// Spatial relations with gap constraints
ui.spec(`
'[data-testid="a"]' leftOf '[data-testid="b"]' gap 8px
'[data-testid="card"]' inside 'viewport'
'[data-testid="header"]' above '[data-testid="content"]' gap 16px
'[data-testid="sidebar"]' leftOf '[data-testid="main"]' gap 8px..24px
`)
Supported relations: leftOf, rightOf, above, below, alignedWith, centeredWithin, inside, overlaps, contains, separatedFrom.
Fluent API only:
- Aliases: beside, nextTo, adjacent, touching, near, under, within
- space: 'layout' / space: 'visual' option on relations
- .and/.or chaining on fluent relations
- State materialization: disabled, checked, expanded, selected, pressed, visited
Dense DSL only:
- FOL quantifiers (forall, exists) with boolean connectives (and, or, not, implies)
- width/height/size predicate calls with comparison operators (>=, <=, ==, !=)
- Frame attachments: in viewport:, in containingBlock(...):
Both fluent and dense DSL:
- contains, separatedFrom
- between size assertions
Gap Options
ui.spec(`
// Exact minimum gap
'.button' leftOf '.label' gap 8px
// Gap range (between min and max)
'.button' leftOf '.label' gap 8px..16px
`)
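The two gap forms can be read as a minimum-only constraint and a closed range. The following is a small sketch of that reading, with hypothetical parseGap and satisfiesGap helpers; it is not Imhotep's parser, and the assumption that a range is inclusive at both ends is mine.

```typescript
interface GapConstraint {
  min: number;
  max: number | null; // null means "no upper bound"
}

// Accepts "8px" (minimum only) or "8px..16px" (range). A bare number is rejected.
function parseGap(text: string): GapConstraint {
  const match = /^(\d+(?:\.\d+)?)px(?:\.\.(\d+(?:\.\d+)?)px)?$/.exec(text.trim());
  if (!match) {
    throw new Error(`Invalid gap "${text}": expected forms like 8px or 8px..16px`);
  }
  const min = Number(match[1]);
  const max = match[2] === undefined ? null : Number(match[2]);
  if (max !== null && max < min) {
    throw new Error(`Invalid gap "${text}": maximum is below minimum`);
  }
  return { min, max };
}

// Assumption: both ends of a range are inclusive.
function satisfiesGap(measured: number, c: GapConstraint): boolean {
  return measured >= c.min && (c.max === null || measured <= c.max);
}

console.log(parseGap('8px'));       // { min: 8, max: null }
console.log(parseGap('8px..16px')); // { min: 8, max: 16 }
```

A measured gap of 20px would satisfy gap 8px but not gap 8px..16px, so prefer the range form when an element drifting too far away is also a regression.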
Frame Attachments
Use in frameName: with indented assertions to scope relations to a specific frame.
ui.spec(`
in viewport:
  '[data-testid="a"]' leftOf '[data-testid="b"]'
  '[data-testid="modal"]' centeredWithin 'viewport'
in containingBlock('[data-testid="parent"]'):
  '.child' inside '.parent'
`)
Compound Assertions
Chain relations with and and or in dense DSL.
ui.spec(`
'.header' above '.content' and leftOf '.sidebar'
'.modal' centeredWithin 'viewport' or inside '.container'
`)
Size Assertions
ui.spec(`
// Minimum size
'[data-testid="btn"]' atLeast 44px wide
'[data-testid="btn"]' atLeast 44px tall
// Maximum size
'[data-testid="img"]' atMost 200px wide
// Size range
'[data-testid="img"]' between 100px and 200px wide
// Predicate-style size checks with comparison operators
forall $btn in buttons('.primary'):
  width($btn) >= 44
  height($btn) >= 44
`)
Quantifiers
Apply all, any, or none to assert over multiple elements.
ui.spec(`
all '.item' above '.footer' gap 16px
none '.error' overlaps '.success'
`)
First-Order Logic (FOL)
Use forall and exists with boolean connectives for complex relational contracts.
ui.spec(`
// All buttons are at least 44px wide
forall $btn in buttons('.primary'):
  width($btn) >= 44
// Existence: at least one card contains a title
exists $card in cards('.card'):
  descendants($card, '.title')
// Boolean connectives: and, or, not, implies
forall $a in elements('.a'):
  forall $b in elements('.b'):
    leftOf($a, $b) and above($a, $b)
forall $modal in elements('.modal'):
  not overlaps($modal, '.backdrop')
forall $x in elements('.x'):
  forall $y in elements('.y'):
    inside($x, '.container') implies leftOf($x, $y)
`)
Supported connectives: and, or, not, implies.
Supported domain constructors: elements(selector), buttons(selector), cards(selector).
Nested quantifiers for multi-variable formulas: use nested forall blocks instead of comma-separated variables.
Supported predicates in FOL: leftOf, rightOf, above, below, inside, overlaps, alignedWith, centeredWithin, contains, separatedFrom, width, height, size.
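For readers less used to first-order logic, the nested forall and implies forms can be read as ordinary loops over element boxes. The sketch below uses standard logical semantics with hypothetical helpers (forall, exists, implies, leftOf, inside); whether Imhotep treats an empty domain as a pass is not stated in this guide, so the vacuous-truth behaviour shown is an assumption about the logic, not about the tool.

```typescript
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

const leftOf = (a: Rect, b: Rect): boolean => a.x + a.width <= b.x;

const inside = (inner: Rect, outer: Rect): boolean =>
  inner.x >= outer.x &&
  inner.y >= outer.y &&
  inner.x + inner.width <= outer.x + outer.width &&
  inner.y + inner.height <= outer.y + outer.height;

// p implies q is false only when p is true and q is false.
const implies = (p: boolean, q: boolean): boolean => !p || q;

function forall<T>(domain: T[], body: (item: T) => boolean): boolean {
  return domain.every(body);
}

function exists<T>(domain: T[], body: (item: T) => boolean): boolean {
  return domain.some(body);
}

const container: Rect = { x: 0, y: 0, width: 400, height: 100 };
const xs: Rect[] = [
  { x: 10, y: 10, width: 50, height: 20 },  // inside the container
  { x: 500, y: 10, width: 50, height: 20 }, // outside, so implies is vacuous
];
const ys: Rect[] = [
  { x: 100, y: 10, width: 50, height: 20 },
  { x: 200, y: 10, width: 50, height: 20 },
];

// forall $x: forall $y: inside($x, container) implies leftOf($x, $y)
const contractHolds = forall(xs, (x) =>
  forall(ys, (y) => implies(inside(x, container), leftOf(x, y))),
);

// A forall over zero elements is vacuously true in standard logic.
const emptyDomainHolds = forall<Rect>([], () => false);

console.log({ contractHolds, emptyDomainHolds }); // both true
```

The empty-domain case is the reason to pair quantified contracts with a check that the selector matched something: a forall that iterates over nothing proves nothing.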
Common Mistakes and Corrections
- Bare selectors without quotes: Selectors must be single-quoted strings.
  // ❌ Wrong: bare selector
  [data-testid="x"] leftOf [data-testid="y"]
  // ✅ Correct: quoted selector
  '[data-testid="x"]' leftOf '[data-testid="y"]'
- Using the 'is' keyword: The parser does not accept 'is' or 'have' as connecting words.
  // ❌ Wrong: 'is' is not a valid keyword
  'a' is leftOf 'b'
  // ✅ Correct: direct relation keyword
  'a' leftOf 'b'
- Missing gap unit: Gap values require a unit.
  // ❌ Wrong: missing unit
  'a' leftOf 'b' gap 8
  // ✅ Correct: gap with unit
  'a' leftOf 'b' gap 8px
- Wrong quote style: Use single quotes for selectors; double quotes inside are fine.
  // ❌ Wrong: double-quoted selector
  "[data-testid='x']" leftOf "[data-testid='y']"
  // ✅ Correct: single-quoted selector with double quotes inside
  '[data-testid="x"]' leftOf '[data-testid="y"]'
9) Diagnostics You Should Watch
Key codes and meanings:
- IMH_SELECTOR_ZERO_MATCHES: selector resolved to no elements
- IMH_EXTRACT_PROTOCOL_ERROR: extraction path failed
- relation-specific failures (example: IMH_RELATION_LEFT_OF_FAILED)
Operator rule:
Do not silence diagnostics; treat them as contract feedback.
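One way to treat diagnostics as contract feedback is to assert on their codes rather than only on the pass flag. The sketch below assumes each diagnostic carries a code string; the guide names result.passed and result.diagnostics, but the code and message field names and both helper functions are hypothetical.

```typescript
// Assumed shapes: `code` and `message` are guesses made for illustration.
interface Diagnostic {
  code: string;
  message?: string;
}

interface CheckResult {
  passed: boolean;
  diagnostics: Diagnostic[];
}

// Tally diagnostics by code so a test can assert on exact counts.
function countByCode(result: CheckResult): Map<string, number> {
  const counts = new Map<string, number>();
  for (const d of result.diagnostics) {
    counts.set(d.code, (counts.get(d.code) ?? 0) + 1);
  }
  return counts;
}

// Throws when any selector matched nothing, even if the run reported passed.
function assertNoZeroMatches(result: CheckResult): void {
  const misses = result.diagnostics.filter(
    (d) => d.code === 'IMH_SELECTOR_ZERO_MATCHES',
  );
  if (misses.length > 0) {
    throw new Error(`${misses.length} selector(s) matched no elements`);
  }
}

// Hand-written sample payload, not real Imhotep output.
const sample: CheckResult = {
  passed: false,
  diagnostics: [
    { code: 'IMH_RELATION_LEFT_OF_FAILED', message: 'gap below minimum' },
    { code: 'IMH_SELECTOR_ZERO_MATCHES', message: 'no match for selector' },
  ],
};

console.log(Object.fromEntries(countByCode(sample)));
```

In a real suite the sample object would be replaced by the value returned from checkAll(), after confirming its actual field names.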
10) Determinism and Replay
When reproducibility matters:
- initialize with deterministic options (seed)
- preserve failing diagnostics payloads in CI artifacts
- rerun with the same seed before changing assertions
If a failure is flaky, first classify whether it is:
- extraction instability,
- real layout nondeterminism,
- threshold too strict for CI hardware.
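The three-way classification above can be sketched as a heuristic over same-seed reruns. Everything here is hypothetical: the RunRecord shape, the status values, and the jitterPx and nearMissPx thresholds are illustrative choices, not Imhotep behaviour, and should be tuned against real CI data.

```typescript
type RunStatus = 'pass' | 'fail' | 'error';

// One rerun with the same seed; measuredGap is null when extraction gave nothing.
interface RunRecord {
  status: RunStatus;
  measuredGap: number | null;
}

type Classification =
  | 'extraction-instability'
  | 'layout-nondeterminism'
  | 'threshold-too-strict'
  | 'stable';

function classifyReruns(
  runs: RunRecord[],
  minGap: number,
  jitterPx = 0.5,  // spread tolerated as measurement noise
  nearMissPx = 1,  // distance from the threshold that counts as a near miss
): Classification {
  if (runs.length === 0) throw new Error('need at least one rerun');
  const gaps: number[] = [];
  for (const run of runs) {
    if (run.status === 'error' || run.measuredGap === null) {
      return 'extraction-instability';
    }
    gaps.push(run.measuredGap);
  }
  // Large variation under a fixed seed points at the layout itself.
  const spread = Math.max(...gaps) - Math.min(...gaps);
  if (spread > jitterPx) return 'layout-nondeterminism';
  // Small variation that straddles the bound points at the bound.
  const anyFail = runs.some((r) => r.status === 'fail');
  const nearThreshold = gaps.every((g) => Math.abs(g - minGap) <= nearMissPx);
  if (anyFail && nearThreshold) return 'threshold-too-strict';
  // Consistent results: either a clean pass or a real, repeatable regression.
  return 'stable';
}

console.log(
  classifyReruns(
    [
      { status: 'pass', measuredGap: 8.2 },
      { status: 'fail', measuredGap: 7.9 },
    ],
    8,
  ),
); // threshold-too-strict
```

A 'stable' result with failing runs means the failure reproduces, which is the case where changing the assertion is least justified.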
11) CI Integration Pattern
Recommended gates:
- npm run build
- npm test --workspaces
- npx playwright test
For local-path package evaluation in temp projects:
- install all required local packages, not just imhotep-playwright
- if symlink duplication appears, set NODE_OPTIONS=--preserve-symlinks
12) Anti-Patterns (Do Not Do This)
- Writing only expect(result.passed).toBe(true) with no diagnostic assertions.
- Converting every relation to hardcoded pixel math.
- Ignoring transform-space semantics in transformed UIs.
- Treating selector zero matches as acceptable in passing tests.
- Suppressing fail-closed errors without root-cause triage.
13) Debugging Playbook
When a relation unexpectedly fails:
- inspect result.diagnostics first
- inspect result.clauseResults[*].status/truth/metrics
- run ui.extract(subject) for both sides to inspect geometry/origin
- verify state and viewport preconditions are applied
- for transformed elements, compare space: 'visual' vs 'layout'
When the result status is error rather than fail:
- suspect extraction or unsupported path
- verify selector materialization and runtime context
- fail closed and do not coerce to pass
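The fail-versus-error distinction can be encoded in a small triage helper that never lets an error count as a pass. The guide mentions status, truth, and metrics on clause results; the concrete status values, the field types, and the triage function below are assumptions for illustration.

```typescript
// Assumed status values; confirm against the real clauseResults payload.
type ClauseStatus = 'pass' | 'fail' | 'error';

interface ClauseResult {
  status: ClauseStatus;
  truth?: boolean;
  metrics?: Record<string, number>;
}

interface CheckResult {
  clauseResults: ClauseResult[];
}

interface Triage {
  failed: ClauseResult[];
  errored: ClauseResult[];
  verdict: ClauseStatus;
}

function triage(result: CheckResult): Triage {
  const failed = result.clauseResults.filter((c) => c.status === 'fail');
  const errored = result.clauseResults.filter((c) => c.status === 'error');
  // Fail closed: an error outranks everything, and a run with no clauses
  // is treated as an error rather than a silent pass.
  const verdict: ClauseStatus =
    errored.length > 0 || result.clauseResults.length === 0
      ? 'error'
      : failed.length > 0
        ? 'fail'
        : 'pass';
  return { failed, errored, verdict };
}

// Hand-written sample, not real Imhotep output.
const mixed: CheckResult = {
  clauseResults: [
    { status: 'pass', truth: true },
    { status: 'fail', truth: false, metrics: { gap: 4 } },
    { status: 'error' },
  ],
};

console.log(triage(mixed).verdict); // error
```

Reporting 'error' separately from 'fail' keeps extraction problems from being "fixed" by loosening a layout threshold.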
14) Property Run Guidance
Use property runs for invariant classes, not one-off screenshots.
Examples:
- minimum tap target sizes across prop combinations
- spacing constraints across variant inputs
- containment/alignment under generated data
For sampled runs:
- store seed and failing case metadata
- shrink only with oracle-preserving checks
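A seeded property run over prop combinations can be sketched without any framework. The example below checks a 44px minimum tap-target width against a toy layout model; modelWidth, the padding table, and the prop domain are hypothetical stand-ins for a real component render plus an Imhotep size assertion, and the generator is the well-known mulberry32 PRNG, not anything Imhotep-specific.

```typescript
// Small deterministic PRNG (mulberry32) so a failing case replays from its seed.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

type Size = 'sm' | 'md' | 'lg';

interface ButtonProps {
  size: Size;
  labelLength: number;
}

// Metadata worth storing in CI artifacts for replay.
interface Failure {
  seed: number;
  iteration: number;
  props: ButtonProps;
  width: number;
}

const PADDING: Record<Size, number> = { sm: 4, md: 8, lg: 12 };

// Toy layout model: two paddings plus 8px per character, clamped to a floor.
function modelWidth(props: ButtonProps, floorPx: number): number {
  return Math.max(floorPx, 2 * PADDING[props.size] + 8 * props.labelLength);
}

function pickSize(r: number): Size {
  return r < 1 / 3 ? 'sm' : r < 2 / 3 ? 'md' : 'lg';
}

// Returns the first counterexample to "width >= 44", or null if none is found.
function runProperty(
  seed: number,
  iterations: number,
  widthOf: (props: ButtonProps) => number,
): Failure | null {
  const rand = mulberry32(seed);
  for (let i = 0; i < iterations; i++) {
    const props: ButtonProps = {
      size: pickSize(rand()),
      labelLength: 1 + Math.floor(rand() * 12),
    };
    const width = widthOf(props);
    if (width < 44) return { seed, iteration: i, props, width };
  }
  return null;
}

const unclamped = runProperty(42, 200, (p) => modelWidth(p, 0));
const clamped = runProperty(42, 200, (p) => modelWidth(p, 44));
console.log({ unclamped, clamped });
```

Because the failure record carries the seed and iteration, rerunning with the same arguments reproduces the same props, which is the replay property the bullets above ask you to preserve.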
15) Contract Evolution Strategy
When tightening contracts in existing suites:
- start with smoke relation checks per page
- add semantic subjects gradually
- introduce state assertions where user behavior depends on state
- introduce responsive and transform-space assertions next
- move shared checks into helper modules only after semantics stabilize
16) Documentation Pointers
- README.md for usage and quickstart
- SKILLS.md for authoring patterns and DSL syntax
- BUILD.md for build/test/e2e commands
- CHANGELOG.md for release notes and known limitations
- SECURITY.md for trust boundaries
17) Final Rule for LLM Operators
Imhotep is valuable only when it encodes user-visible layout truths.
Ask for every critical view:
- Which spatial relationships must always hold?
- Which relationships change with state or viewport?
- Which semantic subjects best represent user intent?
- What diagnostic evidence will prove regressions quickly?
Write those assertions first. Keep them deterministic. Fail closed.