Testing Strategy

This is stage-cli's canonical testing policy. All testing decisions — by humans and AI agents — follow this document.

Philosophy

Confidence over coverage. Optimize for confidence in shipping, not coverage numbers.
Cheapest test that catches the bug. Prefer the lowest-cost test layer that would detect the defect.
TypeScript + Biome are the first test layer. Strict TypeScript catches wrong arg types, missing fields, and null access (noUncheckedIndexedAccess is on). Biome catches unused code and import issues. Don't duplicate either with tests.
"Write tests. Not too many. Mostly integration." End-to-end tests that exercise the CLI's HTTP routes against a real SQLite database are the highest-ROI automated tests in this codebase.

Test Layers (Priority Order)

1. Static Analysis

TypeScript strict mode (strict: true, noUncheckedIndexedAccess: true, verbatimModuleSyntax: true) + Biome. Free. Run pnpm typecheck and pnpm lint before pushing.
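As a rough illustration of what noUncheckedIndexedAccess covers, consider the sketch below. The helper name firstArg is hypothetical and not taken from the codebase. With the flag on, args[0] has type string | undefined, so the compiler rejects returning it directly and forces the explicit check. A test for the "missing element" type error would duplicate the compiler.

```typescript
// Hypothetical helper for illustration; not part of stage-cli.
function firstArg(args: readonly string[]): string {
  // Under noUncheckedIndexedAccess, `args[0]` is `string | undefined`,
  // so returning it without a check would be a compile error.
  const first = args[0];
  if (first === undefined) {
    throw new Error("expected at least one argument");
  }
  return first;
}
```

The thrown error is runtime behavior and could still merit a test if callers depend on it; the type narrowing itself does not.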

2. Route / Server Integration Tests

Test API route handlers through the real startServer() HTTP boundary, hitting a real SQLite database created in a temp directory. Mock only what isn't part of the CLI itself (none today — the CLI has no external services).

This is the highest-ROI test layer. Most logic worth testing is request handling, schema validation, and database state transitions.

Examples:

packages/cli/src/__tests__/runs.routes.test.ts — exercises run/chapter routes against a real server + SQLite
packages/cli/src/__tests__/view-state.routes.test.ts — exercises view-state routes end-to-end
packages/cli/src/__tests__/server.test.ts — covers the static-file fallback, route compilation, and path-traversal guard

Use the helpers in packages/cli/src/__tests__/fixtures.ts to spin up a temp DB and the server.
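The shape of a route-level test can be sketched as follows. This is a stand-in under stated assumptions: startStandInServer, getJson, and the /api/health route are hypothetical, and the server is built directly on node:http because the signatures of startServer() and the fixtures helpers are not shown in this document. The temp SQLite setup is omitted for the same reason. What the sketch does show is the round trip: bind to an OS-assigned port, send a real HTTP request, assert on status and body, and tear down.

```typescript
import { createServer } from "node:http";
import type { AddressInfo } from "node:net";

interface RunningServer {
  url: string;
  close: () => Promise<void>;
}

// Hypothetical stand-in for startServer(); serves a single JSON route.
function startStandInServer(): Promise<RunningServer> {
  const server = createServer((req, res) => {
    if (req.method === "GET" && req.url === "/api/health") {
      res.writeHead(200, { "content-type": "application/json" });
      res.end(JSON.stringify({ ok: true }));
      return;
    }
    res.writeHead(404, { "content-type": "application/json" });
    res.end(JSON.stringify({ error: "not found" }));
  });

  return new Promise((resolve) => {
    // Port 0 asks the OS for a free port, so parallel test files do not collide.
    server.listen(0, "127.0.0.1", () => {
      const { port } = server.address() as AddressInfo;
      resolve({
        url: `http://127.0.0.1:${port}`,
        close: () =>
          new Promise<void>((done) => {
            server.close(() => done());
            // Drop keep-alive sockets so close() resolves promptly.
            server.closeAllConnections();
          }),
      });
    });
  });
}

// Sends a real HTTP request and returns the parts a test asserts on.
async function getJson(url: string): Promise<{ status: number; body: unknown }> {
  const res = await fetch(url);
  return { status: res.status, body: await res.json() };
}
```

In a real test the start and close calls would typically sit in beforeEach/afterEach hooks, with each test making one request and asserting on the response and the resulting database state.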

3. Pure Logic Unit Tests

Schemas, parsers, and pure helpers. No mocks needed — these are pure functions.

Examples:

packages/cli/src/__tests__/schema.test.ts — Zod chapter-import schemas
packages/cli/src/__tests__/path.test.ts — DB path resolution
packages/cli/src/__tests__/import-chapters.test.ts — chapter import transformation
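A pure transformation of this kind might look like the sketch below. The RawChapter and Chapter shapes and the normalizeChapters function are hypothetical; the real chapter-import code may differ. The point is that the function takes plain data in and returns plain data out, so a test needs no mocks, no server, and no database.

```typescript
// Hypothetical shapes; the real chapter schema may differ.
interface RawChapter {
  title: string;
  order?: number;
}

interface Chapter {
  title: string;
  order: number;
}

function normalizeChapters(raw: readonly RawChapter[]): Chapter[] {
  return raw.map((chapter, index) => ({
    title: chapter.title.trim(),
    // Fall back to the 1-based position when no explicit order is given.
    order: chapter.order ?? index + 1,
  }));
}
```

A test for it would assert on the two behaviors with branching: whitespace is trimmed, and a missing order falls back to position while an explicit order is preserved.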

4. Web UI Component Tests

Narrow exception: tests for keyboard navigation, focus management, and form behavior in the React UI that can't be exercised via the server.

Constraints:

Must mock zero or one external boundary (the CLI's /api/* fetch calls)
Must mock at most one internal module
If a test needs more, lift the logic out of the component into packages/web/src/lib/ and test it there

Web tests live alongside their modules under packages/web/src/lib/__tests__/ and use happy-dom (set per-file via the // @vitest-environment happy-dom directive). Vitest runs from the workspace root and picks up tests in any package.
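One way to stay inside these constraints is to lift keyboard-navigation logic out of the component into a pure function. The sketch below is hypothetical (nextFocusIndex and NavKey are illustrative names, not existing modules) and assumes current is already a valid index in the range 0 to itemCount - 1. The component then only maps key events to this function and applies focus, which leaves little that needs a DOM or a mock.

```typescript
// Hypothetical helper that could live under packages/web/src/lib/.
type NavKey = "ArrowUp" | "ArrowDown" | "Home" | "End";

function nextFocusIndex(current: number, key: NavKey, itemCount: number): number {
  // Nothing to focus in an empty list.
  if (itemCount <= 0) return -1;
  switch (key) {
    case "ArrowDown":
      // Wrap from the last item back to the first.
      return (current + 1) % itemCount;
    case "ArrowUp":
      // Wrap from the first item back to the last.
      return (current - 1 + itemCount) % itemCount;
    case "Home":
      return 0;
    case "End":
      return itemCount - 1;
  }
}
```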

What to Test

Business logic with branching, data transformation, or edge cases
Path-resolution / sandboxing logic (the static-file path-traversal guard is security-sensitive)
Parsing and normalization (chapter JSON ingestion, schema validation)
API route handlers through their HTTP boundary
Bug fixes (regression test required)
Non-obvious Zod schema defaults or transforms
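For the path-resolution item above, a containment check might be sketched like this. resolveInsideRoot is a hypothetical name and this is not the guard used in the server; it assumes the request path has already been URL-decoded. It illustrates the cases a regression test should pin down: parent-directory escapes, sibling directories that share a prefix with the root, and absolute request paths.

```typescript
import { isAbsolute, join, relative, resolve, sep } from "node:path";

// Hypothetical guard: returns the resolved path if it stays inside `root`,
// or null if the request would escape it.
function resolveInsideRoot(root: string, requestPath: string): string | null {
  const absoluteRoot = resolve(root);
  // join() normalizes ".." segments and, unlike resolve(), does not let an
  // absolute requestPath replace the root.
  const candidate = join(absoluteRoot, requestPath);
  const rel = relative(absoluteRoot, candidate);
  const escapes =
    rel === ".." || rel.startsWith(".." + sep) || isAbsolute(rel);
  return escapes ? null : candidate;
}
```

Comparing via relative() rather than a plain string prefix avoids treating a sibling such as static-evil as being inside static.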

What NOT to Test

Static rendering — components that only render fixed markup with no conditional logic or user interaction
Type-guaranteed behavior — outcomes the type system already enforces
Trivial schema validation — a Zod schema parsing a valid object is a tautology
Component rendering under happy-dom that requires mocking 2+ internal modules. If you need router + fetch + global store, the test is testing the mock, not the app
UI library component behavior — that a tooltip, popover, or dialog renders correctly is the library's job
Drizzle / better-sqlite3 itself — assume the libraries work; only test your queries and migrations

The Mock Budget Rule

A test may mock at most:

One external-service boundary. The CLI itself has none today (it is fully local); in web UI tests the boundary is the CLI's /api/* fetch calls. Future work might add one HTTP API or one process boundary.
One internal module via vi.mock(): a route handler, a hook, or a similar in-repo module

If a test needs 2+ internal mocks, the test is testing the mock setup, not the app. Either:

Extract the logic into a pure module and test it directly
Lift the test up to the route/server layer and use a real DB
Don't write the test

Infrastructure fakes don't count toward the budget:

Test clocks (vi.useFakeTimers())
Environment variable overrides
Temp directories / temp DBs
Simple stubs for browser APIs (matchMedia, ResizeObserver, IntersectionObserver)
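An environment variable override, one of the infrastructure fakes listed above, could be wrapped in a small helper like the following. withEnv and the variable name STAGE_CLI_EXAMPLE_VAR are hypothetical and not part of the existing fixtures. The sketch assumes a synchronous callback; an async callback would see the environment restored before it finishes.

```typescript
// Hypothetical helper: applies env overrides for the duration of `run`,
// then restores the previous values even if `run` throws.
function withEnv<T>(
  overrides: Record<string, string | undefined>,
  run: () => T,
): T {
  const saved = new Map<string, string | undefined>();
  for (const [key, value] of Object.entries(overrides)) {
    saved.set(key, process.env[key]);
    if (value === undefined) {
      delete process.env[key];
    } else {
      process.env[key] = value;
    }
  }
  try {
    return run();
  } finally {
    for (const [key, value] of saved) {
      if (value === undefined) {
        delete process.env[key];
      } else {
        process.env[key] = value;
      }
    }
  }
}
```

Because the helper only swaps process state and touches no application module, it fits the "infrastructure fake" category rather than the mock budget.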

Escape hatch: If a test file exceeds 200 lines due to legitimate test complexity (not mock setup), split by behavior group into multiple files. The mock budget still applies per file.

AI Agent Testing Rules

Never modify production source code to make a test writable. If the code is hard to test, either test at a different level or skip the test.
Max 200 lines per test file. If a module has 15+ behaviors worth testing, split into multiple focused test files by behavior group.
Never mock more than one external-service boundary per test file.
Never mock more than one internal module per test file.
Never test that a component "renders without crashing". Type checking already covers most of what such a test asserts; anything beyond that belongs in a behavior-focused test.
Factory functions over inline object literals. Use make* or create* helpers in packages/cli/src/__tests__/fixtures.ts (or alongside the test file) with overrides.
One clear behavior per test. Name by behavior, not method name.
Arrange-Act-Assert. One clear action per test.
Use a real DB, not a mock. Spin up a temp SQLite via the existing fixtures. Drizzle/better-sqlite3 are fast enough that mocking them is never the right call.
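The factory-function rule above can be sketched as follows. The Run shape and makeRun are hypothetical; the actual helpers in fixtures.ts and the real run type may look different. The pattern is a function that returns a complete valid object and accepts overrides, so each test spells out only the field its behavior depends on.

```typescript
// Hypothetical shape; the real run type in stage-cli may differ.
interface Run {
  id: string;
  title: string;
  status: "pending" | "complete" | "failed";
  chapterCount: number;
}

// Returns a fresh, fully populated Run; overrides replace individual defaults.
function makeRun(overrides: Partial<Run> = {}): Run {
  return {
    id: "run-1",
    title: "Example run",
    status: "pending",
    chapterCount: 0,
    ...overrides,
  };
}

// A test states only the field it cares about.
const failedRun = makeRun({ status: "failed" });
```

When the type gains a field, only the factory changes, not every test that builds one of these objects.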

When TDD Is Required

Bug fixes (write the failing test first, then fix)
Path-resolution / static-file sandbox changes
Parsers and data transformers
Schema changes that affect ingestion or query shape

When TDD Is Optional

Exploratory UI work
Layout and styling
Straightforward CRUD wiring (still test once it works)
Prototyping (but add tests before merging)

PR Requirements

New business logic, bug fixes, and security-sensitive changes (anything touching path resolution or static serving) must include tests
Visual-only UI changes do not require tests
New API routes must have at least one route-level integration test before merging

Slop Cleanup Rule

Never disable or delete a failing test just to get the suite passing; fix the underlying issue. The only exception is slop tests (defined below), which should be deleted or rewritten when you're already modifying the code they cover.

When modifying code that is covered by a slop test (see the definition below), delete or rewrite the test as part of the change. Don't leave broken-window tests in place.

Definition of a slop test: A test that mocks 2+ internal modules, OR has more mock setup lines than assertion lines, OR tests only that a component renders static markup, OR mocks the database instead of using a real temp SQLite.

Decision Guide

Scenario	Test layer	TDD?
Bug fix	Regression test at cheapest layer	Required
New API route	Route/server integration with real DB	Required
New business rule / logic	Pure unit or route integration	Required
Path-resolution / static-file change	Route/server integration	Required
Visual-only UI change	None	N/A
New parser / transformer	Pure unit	Required
New React component (logic-heavy)	Extract logic to `packages/web/src/lib/`, test there	Optional
New React component (display-only)	None	N/A
New schema migration	Route integration that exercises new columns	Required
