IMHOTEP LLM Skill Guide

Purpose: teach an LLM/operator how to use Imhotep effectively, quickly, and safely in real repos.

Audience: coding agents and engineers adding relational GUI assertions to Playwright suites.

Reading mode: action-first. Prefer concrete patterns over theory.

1) Mental Model

Imhotep is for relational UI behavior, not snapshot aesthetics.

Use it to answer:

  1. Is element A left/right/above/below B with meaningful bounds?
  2. Does this relationship hold across states and environments?
  3. Do semantic selectors and layout assertions converge on stable behavior?
  4. Do failures explain "why" with machine-usable diagnostics?

Do not use Imhotep as a thin wrapper around raw pixel assertions.
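
To make the distinction concrete, here is a minimal sketch of a relational check on two plain rectangles. It does not call Imhotep; the Rect type and the leftOf and shift helpers are hypothetical illustrations of the idea, not the library's implementation.

```typescript
// Hypothetical geometry type for illustration; not Imhotep's internal model.
interface Rect {
  x: number; // left edge in CSS pixels
  y: number; // top edge in CSS pixels
  width: number;
  height: number;
}

// Relational check: A is left of B when A's right edge ends at least
// `minGap` pixels before B's left edge. Absolute positions do not matter.
function leftOf(a: Rect, b: Rect, minGap = 0): boolean {
  const gap = b.x - (a.x + a.width);
  return gap >= minGap;
}

// Move a rectangle horizontally, as a layout shift would.
function shift(r: Rect, dx: number): Rect {
  return { ...r, x: r.x + dx };
}

const primary: Rect = { x: 16, y: 20, width: 100, height: 40 };
const secondary: Rect = { x: 128, y: 20, width: 100, height: 40 };

// The relation survives a uniform 40px shift of both elements, whereas a raw
// pixel assertion such as `primary.x === 16` would start failing.
const holdsBefore = leftOf(primary, secondary, 8);
const holdsAfter = leftOf(shift(primary, 40), shift(secondary, 40), 8);
```

The relation only breaks when the measured gap (12px here) drops below the stated bound, which is the behavior a layout contract should encode.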

2) Operator Workflow (Always Start Here)

When entering a codebase:

  1. Add imhotep-playwright and Playwright dependencies.
  2. Build one passing relation and one intentional failing relation.
  3. Confirm checkAll() returns structured diagnostics.
  4. Add semantic subject assertions (getByRole, getByText) where possible.
  5. Add transform-space checks (visual vs layout) for transformed UIs.
  6. Add state and viewport checks only after baseline relation checks are stable.

Ship small, truthful checks first; expand breadth iteratively.

3) Fast Start Template

import { test, expect } from '@playwright/test'
import { imhotep } from 'imhotep-playwright'

test('layout contract', async ({ page }) => {
  await page.goto('http://localhost:3000')
  // Pass cacheDir: null to avoid geometry cache serialization crash (known issue)
  const ui = await imhotep(page, { deterministic: true, seed: 42, cacheDir: null })

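  // Declare the relation: primary sits left of secondary with a gap of at least 8,
  // measured in visual (post-transform) space.
  ui.expect('[data-testid="primary"]').to.be.leftOf('[data-testid="secondary"]', {
    minGap: 8,
    space: 'visual',
  })

  // Evaluate the declared relations and fail the test if any of them does not hold.
  const result = await ui.checkAll()
  expect(result.passed).toBe(true)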
})

4) API Surface You Should Use

Primary public methods:

  1. imhotep(page, options?)
  2. ui.expect(subject)
  3. ui.spec(source)
  4. ui.checkAll()
  5. ui.extract(subject)
  6. ui.materializeState(selector, state)
  7. ui.applyEnvironment(env)
  8. ui.getByRole/getByText/getByLabelText/getByTestId/locator

Property-run entry points:

  1. imhotepComponent(component, options)
  2. imhotepStory(storyId, options)
  3. imhotepFixture(fixturePath, options)

5) Authoring Quality Ladder

Bronze (minimum acceptable)

  1. At least one relation assertion per critical screen
  2. One intentional failing test proving diagnostics are actionable

Silver (production worthy)

  1. Semantic selectors for user-visible elements
  2. State-aware checks (hover, focus, active, disabled, checked, expanded, selected, pressed, visited) for critical controls
  3. Responsive checks for mobile + desktop viewports

Gold (high confidence)

  1. Space-aware checks where transforms are present
  2. Property runs over meaningful prop/input domains
  3. Deterministic replay workflows documented in test harness
  4. CI gate on both workspace tests and fixture E2E

If tests only assert status booleans and ignore diagnostics, quality is incomplete.

6) Relation Checklist by Use Case

Control Alignment

  1. leftOf/rightOf with minGap
  2. alignedWith or centeredWithin with tolerance where needed

Containment and Layering

  1. inside / contains for container contracts
  2. overlaps only when overlap is intentional
  3. inStackingContext options for layering constraints
  4. separatedFrom for non-overlap with gap constraints

Size Contracts

  1. atLeast('44px').wide for accessible tap-target sizes
  2. atMost and between for constrained layouts

Motion/Transform UI

  1. assert in default visual space first
  2. add explicit space: 'layout' where pre-transform semantics matter
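
Why the space matters can be sketched with plain rectangles. The example below assumes that 'layout' means the box before CSS transforms and 'visual' means the box after them, and it models only translation. Rect, Translate, and rectIn are hypothetical names, not part of Imhotep.

```typescript
interface Rect { x: number; y: number; width: number; height: number }

// Only translation is modelled; scale and rotation would need a full matrix.
interface Translate { dx: number; dy: number }

type Space = 'layout' | 'visual';

// Return the rectangle as seen in the requested space.
function rectIn(space: Space, layoutRect: Rect, t: Translate): Rect {
  if (space === 'layout') return layoutRect;
  return { ...layoutRect, x: layoutRect.x + t.dx, y: layoutRect.y + t.dy };
}

function leftOf(a: Rect, b: Rect): boolean {
  return a.x + a.width <= b.x;
}

const noTransform: Translate = { dx: 0, dy: 0 };
const anchor: Rect = { x: 100, y: 0, width: 50, height: 20 };

// A tooltip laid out to the right of the anchor, then pulled left
// with something like `transform: translateX(-150px)`.
const tooltip: Rect = { x: 200, y: 0, width: 40, height: 20 };
const tooltipTransform: Translate = { dx: -150, dy: 0 };

const leftInVisual = leftOf(
  rectIn('visual', tooltip, tooltipTransform),
  rectIn('visual', anchor, noTransform),
);
const leftInLayout = leftOf(
  rectIn('layout', tooltip, tooltipTransform),
  rectIn('layout', anchor, noTransform),
);
```

The same pair of elements satisfies leftOf in visual space and violates it in layout space, so a contract on a transformed UI should state which one it means.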

7) Semantic Subject Guidance

Prefer semantic sources when they are stable and user-facing:

  1. getByRole(role, { name })
  2. getByLabelText(label)
  3. getByText(text)
  4. getByTestId(id) as a pragmatic fallback
  5. locator(css) or raw CSS only when semantics are unavailable

Use mixed semantic + CSS references when migrating legacy suites incrementally.
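
The preference order above can be sketched as a small chooser. ElementHints, SubjectChoice, and chooseSubject are hypothetical helpers for illustration; only the method names come from the list above, and the sketch returns a description of the call rather than invoking Imhotep.

```typescript
// Hypothetical description of what is known about a target element.
interface ElementHints {
  role?: string;
  name?: string; // accessible name, used together with role
  label?: string;
  text?: string;
  testId?: string;
  css?: string;
}

// Which query to use and with which arguments.
interface SubjectChoice {
  method: 'getByRole' | 'getByLabelText' | 'getByText' | 'getByTestId' | 'locator';
  args: string[];
}

// Walk the preference list from most to least semantic and stop at the first hit.
function chooseSubject(hints: ElementHints): SubjectChoice {
  if (hints.role) {
    const args = hints.name ? [hints.role, hints.name] : [hints.role];
    return { method: 'getByRole', args };
  }
  if (hints.label) return { method: 'getByLabelText', args: [hints.label] };
  if (hints.text) return { method: 'getByText', args: [hints.text] };
  if (hints.testId) return { method: 'getByTestId', args: [hints.testId] };
  if (hints.css) return { method: 'locator', args: [hints.css] };
  throw new Error('No usable selector hint was provided');
}

// A button with a role wins over its test id; a legacy node falls back to CSS.
const saveButton = chooseSubject({ role: 'button', name: 'Save', testId: 'save-btn' });
const legacyNode = chooseSubject({ css: '.legacy-toolbar > span' });
```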

8) Dense String Contracts

Use ui.spec(...) when contract sets are easier to maintain as grouped text.

Rules:

  1. keep dense specs short and scoped per scenario
  2. keep fluent and dense checks semantically equivalent in critical paths
  3. use diagnostics from checkAll() to tighten ambiguous clauses

Basic Relation Syntax

Selectors must be single-quoted strings. In basic relation lines, relations are infix keywords, not method calls.

// Spatial relations with gap constraints
ui.spec(`
  '[data-testid="a"]' leftOf '[data-testid="b"]' gap 8px
  '[data-testid="card"]' inside 'viewport'
  '[data-testid="header"]' above '[data-testid="content"]' gap 16px
  '[data-testid="sidebar"]' leftOf '[data-testid="main"]' gap 8px..24px
`)

Supported relations: leftOf, rightOf, above, below, alignedWith, centeredWithin, inside, overlaps, contains, separatedFrom.

Fluent API only:

  • Aliases: beside, nextTo, adjacent, touching, near, under, within
  • space: 'layout' / space: 'visual' option on relations
  • .and / .or chaining on fluent relations
  • State materialization: disabled, checked, expanded, selected, pressed, visited

Dense DSL only:

  • FOL quantifiers (forall, exists) with boolean connectives (and, or, not, implies)
  • width / height / size predicate calls with comparison operators (>=, <=, ==, !=)
  • Frame attachments: in viewport:, in containingBlock(...):

Both fluent and dense DSL:

  • contains, separatedFrom
  • between size assertions

Gap Options

ui.spec(`
  // Minimum gap (at least 8px)
  '.button' leftOf '.label' gap 8px

  // Gap range (between min and max)
  '.button' leftOf '.label' gap 8px..16px
`)
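
The two gap forms can be read as bounds on the measured distance. The sketch below is a hypothetical reading aid, not Imhotep's parser: it assumes a single value is a minimum (as the comment in the example suggests) and that a range is inclusive at both ends. parseGap and gapSatisfied are made-up names.

```typescript
interface GapBound {
  min: number;
  max: number; // Infinity when only a minimum is given
}

// Accepts "8px" or "8px..16px"; anything else (including a missing unit) is rejected.
function parseGap(text: string): GapBound {
  const match = /^(\d+(?:\.\d+)?)px(?:\.\.(\d+(?:\.\d+)?)px)?$/.exec(text.trim());
  if (!match) {
    throw new Error(`Invalid gap "${text}": expected a form like 8px or 8px..16px`);
  }
  const min = Number(match[1]);
  const max = match[2] === undefined ? Infinity : Number(match[2]);
  if (max < min) {
    throw new Error(`Invalid gap "${text}": upper bound is below lower bound`);
  }
  return { min, max };
}

// Assumed semantics: the measured gap must lie within [min, max].
function gapSatisfied(measuredPx: number, bound: GapBound): boolean {
  return measuredPx >= bound.min && measuredPx <= bound.max;
}

const atLeastEight = parseGap('8px');
const eightToSixteen = parseGap('8px..16px');
```

Under these assumptions a measured gap of 20px satisfies gap 8px but not gap 8px..16px, which is the practical difference between the two forms.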

Frame Attachments

Open a block with in <frame>: and indent the assertions beneath it to scope them to that frame.

ui.spec(`
  in viewport:
    '[data-testid="a"]' leftOf '[data-testid="b"]'
    '[data-testid="modal"]' centeredWithin 'viewport'

  in containingBlock('[data-testid="parent"]'):
    '.child' inside '.parent'
`)

Compound Assertions

Chain relations with the and / or keywords in the dense DSL.

ui.spec(`
  '.header' above '.content' and leftOf '.sidebar'
  '.modal' centeredWithin 'viewport' or inside '.container'
`)

Size Assertions

ui.spec(`
  // Minimum size
  '[data-testid="btn"]' atLeast 44px wide
  '[data-testid="btn"]' atLeast 44px tall

  // Maximum size
  '[data-testid="img"]' atMost 200px wide

  // Size range
  '[data-testid="img"]' between 100px and 200px wide

  // Predicate-style size checks with comparison operators
  forall $btn in buttons('.primary'):
    width($btn) >= 44
    height($btn) >= 44
`)

Quantifiers

Prefix a relation with all, any, or none to assert it over multiple matched elements.

ui.spec(`
  all '.item' above '.footer' gap 16px
  none '.error' overlaps '.success'
`)

First-Order Logic (FOL)

Use forall and exists with boolean connectives for complex relational contracts.

ui.spec(`
  // All buttons are at least 44px wide
  forall $btn in buttons('.primary'):
    width($btn) >= 44

  // Existence: at least one card contains a title
  exists $card in cards('.card'):
    descendants($card, '.title')

  // Boolean connectives: and, or, not, implies
  forall $a in elements('.a'):
    forall $b in elements('.b'):
      leftOf($a, $b) and above($a, $b)

  forall $modal in elements('.modal'):
    not overlaps($modal, '.backdrop')

  forall $x in elements('.x'):
    forall $y in elements('.y'):
      inside($x, '.container') implies leftOf($x, $y)
`)

Supported connectives: and, or, not, implies.

Supported domain constructors: elements(selector), buttons(selector), cards(selector).

Nested quantifiers for multi-variable formulas: use nested forall blocks instead of comma-separated variables.

Supported predicates in FOL: leftOf, rightOf, above, below, inside, overlaps, alignedWith, centeredWithin, contains, separatedFrom, width, height, size.
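
The logical reading of these formulas can be sketched on plain rectangles. This models standard first-order semantics only; how Imhotep evaluates the DSL internally is not shown in this guide, and every name below is a hypothetical stand-in.

```typescript
interface Rect { x: number; y: number; width: number; height: number }

function leftOf(a: Rect, b: Rect): boolean {
  return a.x + a.width <= b.x;
}

function inside(inner: Rect, outer: Rect): boolean {
  return (
    inner.x >= outer.x &&
    inner.y >= outer.y &&
    inner.x + inner.width <= outer.x + outer.width &&
    inner.y + inner.height <= outer.y + outer.height
  );
}

// forall holds when the body is true for every item; exists when it is true for one.
function forall<T>(domain: T[], body: (item: T) => boolean): boolean {
  return domain.every(body);
}
function exists<T>(domain: T[], body: (item: T) => boolean): boolean {
  return domain.some(body);
}
// p implies q is false only when p is true and q is false.
function implies(p: boolean, q: boolean): boolean {
  return !p || q;
}

const container: Rect = { x: 0, y: 0, width: 300, height: 100 };
const xs: Rect[] = [
  { x: 10, y: 10, width: 50, height: 20 },  // inside the container
  { x: 400, y: 10, width: 50, height: 20 }, // outside the container
];
const ys: Rect[] = [{ x: 100, y: 10, width: 50, height: 20 }];

// forall $x: forall $y: inside($x, container) implies leftOf($x, $y)
const guarded = forall(xs, (x) => forall(ys, (y) => implies(inside(x, container), leftOf(x, y))));
// Same formula without the guard: the outside element now breaks it.
const unguarded = forall(xs, (x) => forall(ys, (y) => leftOf(x, y)));
const someLeft = exists(xs, (x) => forall(ys, (y) => leftOf(x, y)));
// A forall over an empty domain is vacuously true.
const vacuous = forall([] as Rect[], () => false);
```

The vacuous case is one reason a selector that matches nothing deserves attention: in plain logic, a forall over zero elements cannot fail.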

Common Mistakes and Corrections

  • Bare selectors without quotes: Selectors must be single-quoted strings.

    // ❌ Wrong — bare selector
    [data-testid="x"] leftOf [data-testid="y"]
    
    // ✅ Correct — quoted selector
    '[data-testid="x"]' leftOf '[data-testid="y"]'
    
  • Using the is keyword: The parser does not accept is or have as connecting words.

    // ❌ Wrong — 'is' is not a valid keyword
    'a' is leftOf 'b'
    
    // ✅ Correct — direct relation keyword
    'a' leftOf 'b'
    
  • Missing gap unit: Gap values require a unit.

    // ❌ Wrong — missing unit
    'a' leftOf 'b' gap 8
    
    // ✅ Correct — gap with unit
    'a' leftOf 'b' gap 8px
    
  • Wrong quote style: Use single quotes for selectors; double quotes inside are fine.

    // ❌ Wrong — double-quoted selector
    "[data-testid='x']" leftOf "[data-testid='y']"
    
    // ✅ Correct — single-quoted selector with double quotes inside
    '[data-testid="x"]' leftOf '[data-testid="y"]'
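
The four mistakes above can be caught before a spec reaches the parser. The following is a hypothetical pre-check, not part of Imhotep, and it only understands simple selector-relation-selector lines; quantifier, FOL, and frame lines would need their own handling.

```typescript
type DenseProblem =
  | 'bare-selector'
  | 'double-quoted-selector'
  | 'connecting-word'
  | 'gap-missing-unit';

// Inspect one simple relation line and report which known mistakes it contains.
function lintDenseLine(line: string): DenseProblem[] {
  const problems: DenseProblem[] = [];
  const text = line.trim();

  // The subject selector must open with a single quote.
  const first = text.charAt(0);
  if (first === '"') {
    problems.push('double-quoted-selector');
  } else if (first !== "'") {
    problems.push('bare-selector');
  }

  // "is" / "have" directly after a closing quote is not accepted.
  if (/'\s+(?:is|have)\s/.test(text)) {
    problems.push('connecting-word');
  }

  // A gap value must look like 8px or 8px..16px.
  const gap = /\bgap\s+(\S+)/.exec(text);
  const gapValue = /^\d+(?:\.\d+)?px(?:\.\.\d+(?:\.\d+)?px)?$/;
  if (gap && !gapValue.test(gap[1] ?? '')) {
    problems.push('gap-missing-unit');
  }

  return problems;
}

const clean = lintDenseLine(`'[data-testid="x"]' leftOf '[data-testid="y"]' gap 8px..16px`);
const noisy = lintDenseLine(`'a' is leftOf 'b' gap 8`);
```

Running a check like this over each line of a generated spec gives an early, readable error instead of a parser failure at test time.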
    

9) Diagnostics You Should Watch

Key codes and meanings:

  1. IMH_SELECTOR_ZERO_MATCHES: selector resolved to no elements
  2. IMH_EXTRACT_PROTOCOL_ERROR: extraction path failed
  3. relation-specific failures (example: IMH_RELATION_LEFT_OF_FAILED)

Operator rule:

Do not silence diagnostics; treat them as contract feedback.
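
One way to keep diagnostics visible is to summarize them by code in the test failure message. The sketch assumes each diagnostic carries a code string, as the list above implies; the message field and both helper names are hypothetical.

```typescript
// Assumed shape: only `code` is implied by this guide; `message` is hypothetical.
interface Diagnostic {
  code: string;
  message?: string;
}

// Count how many diagnostics were emitted per code.
function countByCode(diagnostics: Diagnostic[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const d of diagnostics) {
    counts[d.code] = (counts[d.code] ?? 0) + 1;
  }
  return counts;
}

// Produce a stable, sorted one-line summary suitable for an assertion message.
function summarize(diagnostics: Diagnostic[]): string {
  const counts = countByCode(diagnostics);
  const codes = Object.keys(counts).sort();
  if (codes.length === 0) return 'no diagnostics';
  return codes.map((code) => `${code} x${counts[code]}`).join(', ');
}

const summary = summarize([
  { code: 'IMH_SELECTOR_ZERO_MATCHES' },
  { code: 'IMH_RELATION_LEFT_OF_FAILED' },
  { code: 'IMH_SELECTOR_ZERO_MATCHES' },
]);
```

A test could then assert that the summary equals 'no diagnostics', so a zero-match selector surfaces by name instead of hiding behind a boolean.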

10) Determinism and Replay

When reproducibility matters:

  1. initialize with deterministic options (deterministic: true and a fixed seed)
  2. preserve failing diagnostics payloads in CI artifacts
  3. rerun with same seed before changing assertions

If a failure is flaky, first classify whether it is:

  1. extraction instability,
  2. real layout nondeterminism,
  3. threshold too strict for CI hardware.
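
The classification can be made mechanical once several reruns with the same seed are recorded. The heuristic below is an illustrative assumption, not an Imhotep feature: RunObservation, the measuredGap metric, and the 0.5px jitter default are all hypothetical.

```typescript
type RunStatus = 'pass' | 'fail' | 'error';

interface RunObservation {
  status: RunStatus;
  measuredGap: number | null; // null when no measurement could be extracted
}

type FlakeClass =
  | 'extraction-instability'
  | 'layout-nondeterminism'
  | 'threshold-too-strict'
  | 'consistent';

// Classify a set of reruns of one clause against its minimum-gap threshold.
function classifyFlake(runs: RunObservation[], minGap: number, jitter = 0.5): FlakeClass {
  const gaps: number[] = [];
  for (const run of runs) {
    // Any error or missing measurement points at the extraction path.
    if (run.status === 'error' || run.measuredGap === null) {
      return 'extraction-instability';
    }
    gaps.push(run.measuredGap);
  }
  if (gaps.length === 0) return 'consistent';

  const smallest = Math.min(...gaps);
  const largest = Math.max(...gaps);

  // Measurements that move by more than the jitter budget: the layout itself varies.
  if (largest - smallest > jitter) return 'layout-nondeterminism';
  // Stable measurements that straddle the threshold: the bound is too tight.
  if (smallest < minGap && largest >= minGap) return 'threshold-too-strict';
  // Stable pass or stable failure: not a flake.
  return 'consistent';
}

const verdict = classifyFlake(
  [
    { status: 'fail', measuredGap: 7.8 },
    { status: 'pass', measuredGap: 8.1 },
    { status: 'pass', measuredGap: 8.0 },
  ],
  8,
);
```

Only the last two classes justify touching the assertion; an extraction problem should be fixed at its source first.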

11) CI Integration Pattern

Recommended gates:

  1. npm run build
  2. npm test --workspaces
  3. npx playwright test

For local-path package evaluation in temp projects:

  1. install all required local packages, not just imhotep-playwright
  2. if symlink duplication appears, set NODE_OPTIONS=--preserve-symlinks

12) Anti-Patterns (Do Not Do This)

  1. Writing only expect(result.passed).toBe(true) with no diagnostic assertions.
  2. Converting every relation to hardcoded pixel math.
  3. Ignoring transform-space semantics in transformed UIs.
  4. Treating selector zero matches as acceptable in passing tests.
  5. Suppressing fail-closed errors without root-cause triage.

13) Debugging Playbook

When a relation unexpectedly fails:

  1. inspect result.diagnostics first
  2. inspect result.clauseResults[*].status/truth/metrics
  3. run ui.extract(subject) for both sides to inspect geometry/origin
  4. verify state and viewport preconditions are applied
  5. for transformed elements, compare space: 'visual' vs 'layout'

When the status is error rather than fail:

  1. suspect extraction or unsupported path
  2. verify selector materialization and runtime context
  3. fail closed and do not coerce to pass
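
The fail versus error split can be encoded in a small triage helper. The sketch assumes clause results expose status, truth, and metrics as referenced above; the 'pass' status value, the field types, and the empty-list rule are assumptions made for illustration.

```typescript
// 'fail' and 'error' come from this playbook; 'pass' is an assumed third value.
type ClauseStatus = 'pass' | 'fail' | 'error';

interface ClauseResult {
  status: ClauseStatus;
  truth?: boolean;                  // assumed type
  metrics?: Record<string, number>; // assumed type
}

type NextStep = 'none' | 'inspect-geometry' | 'inspect-extraction';

// Map each status to the branch of the playbook that applies.
function nextStep(clause: ClauseResult): NextStep {
  switch (clause.status) {
    case 'pass':
      return 'none';
    case 'fail':
      return 'inspect-geometry'; // the relation was measured and did not hold
    case 'error':
      return 'inspect-extraction'; // the relation could not be evaluated
  }
}

// Fail closed: errors never count as passes, and an empty result is not a pass
// (the empty-list rule is a design choice of this sketch).
function suitePasses(clauses: ClauseResult[]): boolean {
  return clauses.length > 0 && clauses.every((c) => c.status === 'pass');
}

const mixed: ClauseResult[] = [
  { status: 'pass', truth: true },
  { status: 'error' },
];
const mixedPasses = suitePasses(mixed);
```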

14) Property Run Guidance

Use property runs for invariant classes, not one-off screenshots.

Examples:

  1. minimum tap target sizes across prop combinations
  2. spacing constraints across variant inputs
  3. containment/alignment under generated data

For sampled runs:

  1. store seed and failing case metadata
  2. shrink only with oracle-preserving checks
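
A seeded sampled run can be sketched without any framework. Everything here is a stand-in: makeRng is a plain Lehmer generator, measuredWidth replaces rendering and measuring a real component, and the 44px bound is the tap-target minimum used elsewhere in this guide. None of it reflects how imhotepComponent samples inputs.

```typescript
// Lehmer ("minimal standard") generator: small, seedable, and repeatable.
function makeRng(seed: number): () => number {
  let state = Math.floor(Math.abs(seed)) % 2147483647;
  if (state === 0) state = 1;
  return () => {
    state = (state * 48271) % 2147483647;
    return state / 2147483647; // strictly between 0 and 1
  };
}

type Size = 'sm' | 'md' | 'lg';
interface ButtonProps { size: Size; labelLength: number }

// Hypothetical model standing in for "render the component and measure it".
function measuredWidth(props: ButtonProps): number {
  const padding = { sm: 8, md: 12, lg: 16 }[props.size];
  return 2 * padding + 7 * props.labelLength;
}

// Everything needed to replay one failing case.
interface Failure { seed: number; caseIndex: number; props: ButtonProps; width: number }

// Sample `cases` prop combinations and collect every violation of the invariant.
function runProperty(seed: number, cases: number, minWidth = 44): Failure[] {
  const rng = makeRng(seed);
  const sizes: Size[] = ['sm', 'md', 'lg'];
  const failures: Failure[] = [];
  for (let i = 0; i < cases; i++) {
    const size = sizes[Math.floor(rng() * sizes.length)] ?? 'sm';
    const props: ButtonProps = { size, labelLength: Math.floor(rng() * 21) };
    const width = measuredWidth(props);
    if (width < minWidth) failures.push({ seed, caseIndex: i, props, width });
  }
  return failures;
}

const firstRun = runProperty(42, 100);
const replay = runProperty(42, 100);
```

Because each failure records the seed and case index, a rerun with the same seed reproduces the same list, which is what makes the stored metadata useful.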

15) Contract Evolution Strategy

When tightening contracts in existing suites:

  1. start with smoke relation checks per page
  2. add semantic subjects gradually
  3. introduce state assertions where user behavior depends on state
  4. introduce responsive and transform-space assertions next
  5. move shared checks into helper modules only after semantics stabilize

16) Documentation Pointers

  1. README.md for usage and quickstart
  2. SKILLS.md for authoring patterns and DSL syntax
  3. BUILD.md for build/test/e2e commands
  4. CHANGELOG.md for release notes and known limitations
  5. SECURITY.md for trust boundaries

17) Final Rule for LLM Operators

Imhotep is valuable only when it encodes user-visible layout truths.

Ask for every critical view:

  1. Which spatial relationships must always hold?
  2. Which relationships change with state or viewport?
  3. Which semantic subjects best represent user intent?
  4. What diagnostic evidence will prove regressions quickly?

Write those assertions first. Keep them deterministic. Fail closed.