v0.37.4.0 feat: pgGraph-inspired CI scaffolding wave (heavy tests + fuzz + RSS gate + frontier cap) by garrytan · Pull Request #1228 · garrytan/gbrain

garrytan · 2026-05-20T13:51:34Z

Summary

Adopts CI/test scaffolding patterns from pgGraph, a Postgres-extension project we evaluated and rejected as a runtime dep, but whose tests/heavy/ directory closes bug classes that have bitten gbrain in production. Also includes one opt-in production change: a BFS frontier cap on traverseGraph.

The wave landed in one commit after three plan-review passes (CEO scope + Eng dual-voice + Codex 2nd-pass verification of the revised plan).

Performance / robustness:

tests/heavy/measure_rss.sh — peak-RSS measurement against a 200-page synthetic workload; informational-only on macOS, baseline refresh gated to Linux CI runners
tests/heavy/read_latency_under_sync.sh — search p50/p95/p99 with baseline (no writes) vs under-load (parallel writers); delta_pct reported per percentile
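
A minimal sketch of how the per-percentile delta_pct could be computed from two sets of latency samples. It assumes nearest-rank percentiles over millisecond samples; the names percentile, summarize, and deltaPct are illustrative and not taken from the script.

```typescript
interface LatencySummary {
  p50: number;
  p95: number;
  p99: number;
}

// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1]!;
}

function summarize(samples: number[]): LatencySummary {
  return {
    p50: percentile(samples, 50),
    p95: percentile(samples, 95),
    p99: percentile(samples, 99),
  };
}

// Percent change from the no-writes baseline to the under-load phase, per percentile.
function deltaPct(baseline: LatencySummary, underLoad: LatencySummary): LatencySummary {
  const pct = (b: number, u: number): number => ((u - b) / b) * 100;
  return {
    p50: pct(baseline.p50, underLoad.p50),
    p95: pct(baseline.p95, underLoad.p95),
    p99: pct(baseline.p99, underLoad.p99),
  };
}
```

A positive delta means reads got slower while writers were running; reporting it per percentile keeps a tail-latency regression from hiding behind a stable median.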

Bug-class prevention (catches issues that historically surfaced only in prod):

tests/heavy/pg_upgrade_matrix.sh — walks pre-v0.13 and pre-v0.18 simulated brain shapes forward to head via the engine's bootstrap → SCHEMA_SQL → migrations → verifySchema chain. Catches whole-system upgrade wedges. Honest contract: multi-layer healing means single-probe regressions aren't isolated here — test/schema-bootstrap-coverage.test.ts covers that.
test/fuzz/ — fast-check property tests over 8 trust-boundary validators. 2 are bundle-pure (escapeLikePattern, parseFactsFence) and guarded by scripts/check-fuzz-purity.sh (wired into verify). 5 are property-tested without the purity guarantee. 1 (validateUploadPath) gets its own fs-backed file with temp-dir confinement.
tests/heavy/sync_lock_regression.sh — N concurrent gbrain sync against one DB; asserts 1 winner + N-1 fast-fail lock-busy + zero leaked gbrain_cycle_locks rows. Correct semantics — eng review caught that the original plan asserted "wait + queue" but performSync actually fails fast.
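
The fail-fast lock contract asserted by the sync lock regression can be illustrated with an in-memory model. This is a sketch under stated assumptions: CycleLockTable, syncOnce, and runConcurrentSyncs are hypothetical names, and a Set stands in for the gbrain_cycle_locks table; the real test runs concurrent gbrain sync processes against a database.

```typescript
type SyncResult = { ok: true } | { ok: false; reason: "lock-busy" };

// In-memory stand-in for the gbrain_cycle_locks table (hypothetical shape).
class CycleLockTable {
  private readonly rows = new Set<string>();

  tryAcquire(name: string): boolean {
    if (this.rows.has(name)) return false;
    this.rows.add(name);
    return true;
  }

  release(name: string): void {
    this.rows.delete(name);
  }

  get rowCount(): number {
    return this.rows.size;
  }
}

// Fail-fast sync: a busy lock returns immediately instead of waiting in a queue.
async function syncOnce(locks: CycleLockTable, work: () => Promise<void>): Promise<SyncResult> {
  if (!locks.tryAcquire("sync")) return { ok: false, reason: "lock-busy" };
  try {
    await work();
    return { ok: true };
  } finally {
    locks.release("sync"); // always release so no lock row leaks
  }
}

// Start n syncs at once and tally the outcome the regression asserts on.
async function runConcurrentSyncs(
  n: number,
): Promise<{ winners: number; busy: number; leakedRows: number }> {
  const locks = new CycleLockTable();
  const work = () => new Promise<void>((resolve) => setTimeout(resolve, 10));
  const results = await Promise.all(Array.from({ length: n }, () => syncOnce(locks, work)));
  const winners = results.filter((r) => r.ok).length;
  return { winners, busy: results.length - winners, leakedRows: locks.rowCount };
}
```

In this model the first caller takes the lock before any other caller runs, so the remaining callers see it held and return lock-busy without waiting.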

Engine API addition (back-compat):

BrainEngine.traverseGraph(slug, depth, opts?): opts gains frontierCap?: number and onTruncation?: (info: TruncationInfo) => void. Both Postgres and PGLite implement the cap as a parenthesized ORDER BY (slug, id) LIMIT N inside the recursive CTE. Return shape unchanged: Promise<GraphNode[]> is preserved for MCP wire stability. The callback is a per-call closure (NOT engine-instance state), so concurrent traversals don't cross-talk.
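
To make the per-call callback design concrete, here is a sketch of a capped breadth-first traversal over an in-memory graph. It is not the engine's SQL implementation: InMemoryEngine is hypothetical, and the fields of GraphNode and TruncationInfo are assumptions, since the PR names these types without listing their members.

```typescript
interface GraphNode { id: number; slug: string; depth: number } // assumed shape
interface TruncationInfo { depth: number; kept: number; dropped: number } // assumed fields
interface TraverseGraphOpts {
  frontierCap?: number;
  onTruncation?: (info: TruncationInfo) => void;
}

class InMemoryEngine {
  private readonly ids: Map<string, number>;
  private readonly links: Map<string, string[]>;

  constructor(ids: Map<string, number>, links: Map<string, string[]>) {
    this.ids = ids;
    this.links = links;
  }

  async traverseGraph(slug: string, depth: number, opts: TraverseGraphOpts = {}): Promise<GraphNode[]> {
    const rootId = this.ids.get(slug);
    if (rootId === undefined) return [];
    const visited = new Set<string>([slug]);
    const out: GraphNode[] = [{ id: rootId, slug, depth: 0 }];
    let frontier: string[] = [slug];

    for (let d = 1; d <= depth && frontier.length > 0; d++) {
      const candidates = new Map<string, GraphNode>();
      for (const from of frontier) {
        for (const to of this.links.get(from) ?? []) {
          const id = this.ids.get(to);
          if (id === undefined || visited.has(to) || candidates.has(to)) continue;
          candidates.set(to, { id, slug: to, depth: d });
        }
      }
      // Deterministic order by (slug, id) so the same nodes survive the cap on every run.
      const ordered = Array.from(candidates.values()).sort((a, b) =>
        a.slug < b.slug ? -1 : a.slug > b.slug ? 1 : a.id - b.id,
      );
      const cap = opts.frontierCap;
      const kept = cap !== undefined && cap >= 0 && ordered.length > cap ? ordered.slice(0, cap) : ordered;
      if (kept.length < ordered.length) {
        // Reported through this call's closure; nothing is stored on the engine instance.
        opts.onTruncation?.({ depth: d, kept: kept.length, dropped: ordered.length - kept.length });
      }
      for (const node of kept) visited.add(node.slug);
      out.push(...kept);
      frontier = kept.map((n) => n.slug);
    }
    return out;
  }
}
```

Because truncation is delivered through the options object of each call, two traversals running on the same engine can each collect their own truncation events, and callers that pass no options get the uncapped behavior and the same return shape as before.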

Infra / dev experience:

tests/heavy/ directory convention + scripts/run-heavy.sh + bun run test:heavy. Helper files prefixed with _ are skipped by the runner.
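
The underscore-prefix convention amounts to a filter over the directory listing. The actual runner is a shell script (scripts/run-heavy.sh); the TypeScript below only illustrates the selection rule, and selectHeavyScripts, the .sh filter, and the sorted order are assumptions.

```typescript
// Pick runnable heavy-test scripts from a directory listing.
// Files whose names start with "_" are shared helpers and are never run directly.
function selectHeavyScripts(fileNames: string[]): string[] {
  return fileNames
    .filter((name) => name.endsWith(".sh") && !name.startsWith("_"))
    .sort();
}
```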
.github/workflows/heavy-tests.yml — cron '17 8 * * *' + pull_request labeled trigger (heavy-tests) + Postgres service + artifact upload on failure. Pinned action SHAs per CLAUDE.md convention.

Test Coverage

Layer	What	Status
Unit tests	8052 pass / 0 fail across 8 shards (345s wallclock)	✅
T8 regression	5 contracts pinned (cap-unset, cap-hit, cap-not-hit, MCP wire-shape, concurrency)	✅ all pass
Fuzz suite	12 properties × 1000 runs across 3 test files	✅ ~3s, runs in default `bun test`
`bun run verify`	All 14 pre-checks + typecheck	✅ green
`bun run test:heavy` (PGLite-only paths)	RSS gate + read-latency under sync	✅ green
`bun run test:heavy` (Postgres paths)	pg_upgrade_matrix + sync_lock_regression — smoke-tested locally	✅ green
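
For readers unfamiliar with property tests, the fuzz suite row above refers to checks of roughly this shape. The sketch deliberately swaps fast-check for a hand-rolled seeded loop so it has no dependencies, and its escapeLikePattern is a hypothetical stand-in (backslash-escaping %, _, and \), not the validator in the repository.

```typescript
// Hypothetical stand-in for the validator under test: backslash-escapes LIKE metacharacters.
function escapeLikePattern(input: string): string {
  return input.replace(/[\\%_]/g, (ch) => "\\" + ch);
}

// Reverse the escaping: a backslash makes the next character literal.
function unescapeLikePattern(pattern: string): string {
  let out = "";
  for (let i = 0; i < pattern.length; i++) {
    if (pattern[i] === "\\" && i + 1 < pattern.length) i++;
    out += pattern[i];
  }
  return out;
}

// True if the pattern still contains a % or _ that LIKE would treat as a wildcard.
function hasActiveWildcard(pattern: string): boolean {
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern[i];
    if (ch === "\\") { i++; continue; } // skip the escaped character
    if (ch === "%" || ch === "_") return true;
  }
  return false;
}

// Small seeded PRNG (mulberry32) so a failing run can be reproduced from its seed.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

const ALPHABET = ["a", "Z", "0", " ", "'", "%", "_", "\\"];

function randomInput(rand: () => number): string {
  const length = Math.floor(rand() * 12);
  let s = "";
  for (let i = 0; i < length; i++) s += ALPHABET[Math.floor(rand() * ALPHABET.length)];
  return s;
}

// Returns the first counterexample, or null if every run satisfied the property.
function findCounterexample(runs: number, seed: number): string | null {
  const rand = mulberry32(seed);
  for (let i = 0; i < runs; i++) {
    const input = randomInput(rand);
    const escaped = escapeLikePattern(input);
    if (hasActiveWildcard(escaped) || unescapeLikePattern(escaped) !== input) return input;
  }
  return null;
}
```

The property has two halves: the escaped output must contain no active wildcard, and un-escaping must recover the input exactly. A library such as fast-check adds input shrinking and richer generators on top of this basic loop.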

The new check:fuzz-purity gate ran against the pure-target list; both targets verified bundle-pure (zero transitive node:fs / node:child_process / engine imports).
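
The grep half of the purity gate can be sketched as a scan of the bundled output for forbidden module specifiers. The real guard is a shell script that bundles with bun build and greps the result; findForbiddenImports and the default list below are illustrative, and the marker used to detect engine imports is not stated in this PR, so it is left to the caller.

```typescript
// Module specifiers a "pure" validator bundle must not reference.
const FORBIDDEN_SPECIFIERS: string[] = ["node:fs", "node:child_process"];

// Plain substring scan, like grep: conservative, so a mention inside a string
// or comment also counts as a hit.
function findForbiddenImports(
  bundleText: string,
  forbidden: string[] = FORBIDDEN_SPECIFIERS,
): string[] {
  return forbidden.filter((spec) => bundleText.includes(spec));
}
```

Scanning the bundle rather than the source file is what catches transitive imports: anything a pure target pulls in indirectly ends up in the bundled text.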

Pre-Landing Review

Three review passes before any code:

CEO scope review — selected Approach C (full sweep, 9 tasks) over Approach B
Eng dual-voice (Claude subagent + Codex) — 8 convergent CRITICAL/HIGH findings, all corrected in the plan before implementation
Codex 2nd-pass verification on the revised plan — caught 3 NEW issues the first pass missed:
- lastTraverseTruncation as engine-instance state would have been concurrency-unsafe. Switched to per-call onTruncation callback.
- require.cache snapshot for the fuzz purity guard is theatrical under Bun's ESM loader. Switched to bun build --target=bun + grep (bundle-true verification).
- Committed 50K-page PGLite fixture would have been a repo-size risk and contradicted T1's no-blobs principle. Switched to in-process synthesis.

All 3 issues were addressed before this PR.

Plan Completion

All 9 tasks complete (T1-T9). Plan file: ~/.claude/plans/system-instruction-you-are-working-sorted-gizmo.md.

Task	Description	Verified
T1	Schema-migration matrix (deterministic builder)	✅
T2	Fuzz harness + purity guard	✅
T3	RSS budget gate (Linux-only baseline)	✅
T4	tests/heavy convention + runner	✅
T5	heavy-tests.yml CI workflow	✅
T6	read-latency-under-sync	✅
T7	sync lock regression	✅
T8	BFS frontier cap on `traverseGraph` (prototype-first design)	✅ 5 contracts pinned
T9	docs + CHANGELOG	✅

TODOS

No items closed by this wave. The "6GB RES on query" observation in TODOS.md gets observability from T3's RSS gate but isn't fixed by it, so the entry stays open.
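
A sketch of the measurement behind that observability, following the commit message's description of /proc/self/status on Linux with a process.memoryUsage().rss fallback. Reading the VmHWM field (the kernel's peak resident set size) is an assumption about which field the script uses, and parsePeakRssKb and measurePeakRss are illustrative names.

```typescript
import { readFileSync } from "node:fs";

// Parse peak resident set size (VmHWM, reported in kB) from /proc/self/status text.
function parsePeakRssKb(statusText: string): number | null {
  const match = /^VmHWM:\s+(\d+)\s+kB$/m.exec(statusText);
  return match ? Number(match[1]) : null;
}

interface RssMeasurement {
  bytes: number;
  source: "proc" | "memoryUsage";
  baselineEligible: boolean; // only Linux /proc readings may refresh the baseline
}

function measurePeakRss(): RssMeasurement {
  try {
    const kb = parsePeakRssKb(readFileSync("/proc/self/status", "utf8"));
    if (kb !== null) return { bytes: kb * 1024, source: "proc", baselineEligible: true };
  } catch {
    // /proc is unavailable (for example on macOS); fall through to the fallback.
  }
  // Fallback reports current RSS, not the peak, so it is informational only.
  return { bytes: process.memoryUsage().rss, source: "memoryUsage", baselineEligible: false };
}
```

The fallback explains why macOS runs are informational: process.memoryUsage().rss is a point-in-time reading, so it can miss a transient peak that the kernel's high-water mark would record.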

Documentation

CLAUDE.md updated:

File taxonomy section gains tests/heavy/*.sh entry (with underscore-prefix helper convention)
File taxonomy section gains test/fuzz/*.test.ts entry (with the purity-guard mechanism explanation)
traverseGraph entry notes the new TraverseGraphOpts + TruncationInfo exports and the per-call callback design

llms-full.txt regenerated.

Test plan

All 8052 unit tests pass
bun run verify green (including new check:fuzz-purity)
bun run test:heavy green locally (PGLite paths and Postgres paths via gbrain-test-pg container)
T8 regression test pins 5 contracts including concurrency independence
gh workflow view heavy-tests will validate the cron once the workflow file lands on master

🤖 Generated with Claude Code

Schema-migration matrix + fuzz harness + RSS budget gate + read-latency under sync + sync lock regression + tests/heavy convention + nightly CI workflow + BFS frontier cap on traverseGraph.

CI infra (T1-T7):
- tests/heavy/ directory convention + scripts/run-heavy.sh + bun run test:heavy
- tests/heavy/pg_upgrade_matrix.sh: walk pre-v0.13 + pre-v0.18 brain shapes forward to head via bootstrap → SCHEMA_SQL → migrations → verifySchema
- test/fuzz/{pure,mixed,filesystem}-validators.test.ts: 1000-run fast-check property tests across 8 trust-boundary validators
- scripts/check-fuzz-purity.sh: bun-bundle + grep guard, wired into verify
- tests/heavy/measure_rss.sh: in-memory PGLite workload + peak RSS measurement via /proc/self/status (Linux) or process.memoryUsage().rss fallback (macOS, refuses to write baseline)
- tests/heavy/read_latency_under_sync.sh: phase A baseline + phase B under parallel writer load, reports p50/p95/p99 + delta_pct
- tests/heavy/sync_lock_regression.sh: N concurrent gbrain sync against one DB, asserts 1 winner + N-1 lock-busy + zero leaked gbrain_cycle_locks rows
- .github/workflows/heavy-tests.yml: cron '17 8 * * *' + heavy-tests label trigger + Postgres service + artifact upload on failure

Engine (T8):
- BrainEngine.traverseGraph opts gain frontierCap?: number + onTruncation?: (info: TruncationInfo) => void callback. Return shape preserved (Promise<GraphNode[]>) for MCP wire stability.
- Postgres CTE: parenthesized ORDER BY (slug, id) LIMIT N inside recursive term.
- PGLite: same SQL with positional params.
- Per-call callback closure (not engine-instance state), so concurrent traversals on the same engine don't cross-talk.

5 contracts pinned in test/regressions/v0_36_frontier_cap.test.ts.

Three plan-review passes ran before any code: CEO scope review (Approach C), Eng dual-voice review (Claude subagent + Codex), and Codex 2nd-pass against the revised plan. The 2nd pass caught issues the first two missed (Bun ESM vs require.cache; engine-instance metadata stomping under concurrency; fixture-size inconsistency). All addressed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Resolves:
- VERSION: keep branch's 0.37.4.0 (master at 0.37.3.0; my slot is next)
- package.json: keep 0.37.4.0; merge `verify` to include BOTH new gates: master's check:skill-brain-first AND branch's check:fuzz-purity
- CHANGELOG.md: strip markers; both sides' entries kept (0.37.4.0 above master's 0.37.3.0 + 0.37.2.0)
- TODOS.md: strip markers; both sides' new follow-up sections kept (branch's pgGraph follow-ups + master's skill_brain_first follow-ups)

Trio agrees: VERSION=package.json=CHANGELOG=0.37.4.0. Verify + typecheck clean. T8 + fuzz tests still pass on merged state.

Same content, different slot in the version queue. v0.40.0.1 was the queue allocator's default safe slot (bumped past PR #1128's claimed 0.40.0.0). v0.37.5.0 is a PATCH above #1228's claimed 0.37.4.0 and sits closer to current master (0.37.1.0) in CHANGELOG ordering. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

@garrytan-agents

…agging valid YAML) (#1229)

* fix(markdown): YAML-aware NESTED_QUOTES validator

The validator at src/core/markdown.ts:219-238 was a syntactic count-of-quotes heuristic that flagged any frontmatter line with 3+ unescaped " characters. That heuristic is too dumb: valid YAML flow sequences like `tags: ["yc", "w2025"]` and single-quoted scalars like `title: 'a: "b" "c"'` both have 3+ unescaped " by design.

Fix: keep the count fast path, then disambiguate with js-yaml.safeLoad on the value. Only flag lines that genuinely fail to parse. The full-frontmatter YAML_PARSE check (check 6) still catches structural failures.

Closes the 6,981-error class on Garry's 105K-page brain in one ~10 LOC change: existing data on disk was already valid YAML; the validator was wrong about it. No `gbrain frontmatter generate --fix` sweep needed.

js-yaml@3.14.2 promoted from transitive (via gray-matter) to direct dependency. @types/js-yaml@3.12.10 added to devDependencies.

5 new YAML-aware test cases in test/markdown-validation.test.ts:
- flow sequence with quoted tags does NOT trigger (6,981 regression guard)
- single-quoted scalar with literal inner double quotes does NOT trigger
- escaped-as-'' quotes inside flow seq do NOT trigger
- genuinely broken nested quotes STILL trigger
- unclosed bracket STILL surfaces NESTED_QUOTES or YAML_PARSE

Closes PR #1217: outside-voice (codex) review caught that the bug was the validator, not the emitter. Original 6,981-error signal from @garrytan-agents.

* chore: bump version and changelog (v0.40.0.1)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: retarget version slot v0.40.0.1 -> v0.37.5.0

Same content, different slot in the version queue. v0.40.0.1 was the queue allocator's default safe slot (bumped past PR #1128's claimed 0.40.0.0). v0.37.5.0 is a PATCH above #1228's claimed 0.37.4.0 and sits closer to current master (0.37.1.0) in CHANGELOG ordering.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
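
The two-step check described in the fix commit (count fast path, then parse to disambiguate) can be sketched as follows. The parser is passed in so the block has no dependencies; the commit uses js-yaml's safeLoad in that role. hasNestedQuotes, countUnescapedDoubleQuotes, and the way the value is split off after the first colon are assumptions, not the code in src/core/markdown.ts.

```typescript
// Any function that throws on invalid YAML; the commit uses js-yaml's safeLoad here.
type YamlParser = (text: string) => unknown;

function countUnescapedDoubleQuotes(line: string): number {
  let count = 0;
  for (let i = 0; i < line.length; i++) {
    if (line[i] === "\\") { i++; continue; } // skip the escaped character
    if (line[i] === '"') count++;
  }
  return count;
}

function hasNestedQuotes(line: string, parseYaml: YamlParser): boolean {
  // Fast path: fewer than 3 unescaped quotes cannot be a nested-quote problem.
  if (countUnescapedDoubleQuotes(line) < 3) return false;
  // Slow path: only flag the line if its value genuinely fails to parse.
  const colon = line.indexOf(":");
  const value = colon >= 0 ? line.slice(colon + 1).trim() : line.trim();
  try {
    parseYaml(value);
    return false;
  } catch {
    return true;
  }
}
```

Keeping the count as a pre-filter means most lines never reach the parser, while the parse step removes the false positives that the count alone produces on flow sequences and single-quoted scalars.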

garrytan force-pushed the garrytan/austin branch from 26fd984 to 00efa0e Compare May 20, 2026 13:54

garrytan changed the title ~~v0.40.1.0 feat: pgGraph-inspired CI scaffolding wave (heavy tests + fuzz + RSS gate + frontier cap)~~ v0.37.4.0 feat: pgGraph-inspired CI scaffolding wave (heavy tests + fuzz + RSS gate + frontier cap) May 20, 2026

garrytan force-pushed the garrytan/austin branch from 00efa0e to 5c1c86b Compare May 20, 2026 14:29

garrytan merged commit 9a3ef3c into master May 21, 2026
8 checks passed

garrytan merged 2 commits into master from garrytan/austin
