fix(brainstorm): cost guardrails + judge overflow + far set cap by garrytan-agents · Pull Request #1234 · garrytan/gbrain

garrytan-agents · 2026-05-20T16:27:56Z

Incident: LSD Brainstorm 53× Cost Overrun

Estimated: $0.96 → Actual: $50.71 on a 13,690-page brain. Zero ideas delivered.

Root Causes

Far set explosion — listPrefixSampledPages returned one page per prefix (~2K prefixes → 1,985 pages instead of configured 12)
No cost circuit breaker — no mechanism to abort when actual spend diverges from estimate
Judge context overflow — 15,868 ideas at ~350 tokens each ≈ 5.5M tokens in a single call, exceeding Sonnet's 1M-token context limit
Unpaired UTF-16 surrogates in OCR/import pages crashed JSON serialization
No per-cross timeout — individual crosses could hang indefinitely

Implemented Fixes

P1: Far set cap (domain-bank.ts)

Shuffle candidate prefixes, slice to maxFarSet (default max(m*4, 50)) before SQL
Final trim to m by distance score
Bill now scales with m, not |prefixes|
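The two-stage cap described above can be sketched as follows. This is a minimal illustration, not the code in domain-bank.ts: the FarPage shape, the function names, and the convention that a larger distance score means "farther" are all assumptions made for the example. Only the default max(m*4, 50) and the shuffle, slice, trim order come from the PR description.

```typescript
// Hypothetical page shape; the real type in domain-bank.ts may differ.
interface FarPage {
  slug: string;
  distance: number; // assumed convention: larger means farther from the close set
}

// Default cap quoted in the PR description: max(m * 4, 50).
function defaultMaxFarSet(m: number): number {
  return Math.max(m * 4, 50);
}

// Fisher-Yates shuffle on a copy; `rand` is injectable so callers can seed it.
function shuffle<T>(items: readonly T[], rand: () => number = Math.random): T[] {
  const out = items.slice();
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    const tmp = out[i] as T;
    out[i] = out[j] as T;
    out[j] = tmp;
  }
  return out;
}

// Stage 1: cap the prefix list BEFORE it is handed to the SQL query,
// so the number of rows fetched is bounded by maxFarSet, not by |prefixes|.
function capPrefixes(
  prefixes: readonly string[],
  m: number,
  maxFarSet: number = defaultMaxFarSet(m),
  rand: () => number = Math.random,
): string[] {
  return shuffle(prefixes, rand).slice(0, maxFarSet);
}

// Stage 2: after fetching, keep only the m pages with the highest distance score.
function trimToM(pages: readonly FarPage[], m: number): FarPage[] {
  return [...pages].sort((a, b) => b.distance - a.distance).slice(0, m);
}
```

Under these assumptions, a brain with 1,985 prefixes and m = 6 sends at most 50 prefixes to SQL and ends with at most 6 far pages, regardless of how many prefixes exist.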

P2: Cost guardrails (brainstorm.ts + orchestrator.ts)

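--max-cost <usd> (default $5): hard-abort pre-run if the estimate exceeds the cap, and mid-run if running cost overshoots it
--strict-budget: abort mid-run if spend exceeds 5× the initial estimate (warn-only by default)
--max-far-set <n> (default max(m*4, 50)): explicit cap on the far set
--judge-model <id>: route the judge phase to a larger-context model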
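How the --max-cost and --strict-budget checks interact can be sketched like this. The type and function names are hypothetical and do not come from brainstorm.ts or orchestrator.ts; only the flag semantics (a $5 default cap, a 5× overrun threshold, warn-only unless --strict-budget is set) are taken from the PR description and commit message.

```typescript
// Hypothetical option bag mirroring the CLI flags.
interface BudgetOptions {
  maxCostUsd: number; // --max-cost (default 5)
  strictBudget: boolean; // --strict-budget
  overrunFactor: number; // 5, per the PR description
}

type BudgetVerdict = "ok" | "warn" | "abort";

// Pre-run gate: refuse to start if the estimate alone exceeds the cap.
function checkPreRun(estimateUsd: number, opts: BudgetOptions): BudgetVerdict {
  return estimateUsd > opts.maxCostUsd ? "abort" : "ok";
}

// Mid-run gate, intended to be called after each billed LLM call.
function checkMidRun(
  spentUsd: number,
  estimateUsd: number,
  opts: BudgetOptions,
): BudgetVerdict {
  // The hard cap always aborts.
  if (spentUsd > opts.maxCostUsd) return "abort";
  // Divergence from the estimate warns by default, aborts in strict mode.
  if (spentUsd > estimateUsd * opts.overrunFactor) {
    return opts.strictBudget ? "abort" : "warn";
  }
  return "ok";
}
```

With the incident's $0.96 estimate and the default $5 cap, this sketch passes the pre-run gate, warns (or aborts under --strict-budget) once spend passes roughly $4.80, and aborts unconditionally once spend passes $5.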

P3: Judge chunking (judges.ts)

Split ideas into batches of 100 (configurable via --max-ideas-per-judge-call)
Each batch is a separate LLM call; results are concatenated
15,868 ideas → 159 calls of ~100 ideas each instead of one multi-million-token call
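The batching step can be sketched as below. The names chunk and judgeInBatches and the judge callback signature are illustrative assumptions, not the API in judges.ts. Only the batch size of 100 and the concatenation of results come from the PR description.

```typescript
// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Run a judge callback once per batch and concatenate the results in order.
// `judge` stands in for whatever function makes the actual LLM call.
async function judgeInBatches<I, R>(
  ideas: readonly I[],
  judge: (batch: I[]) => Promise<R[]>,
  maxIdeasPerCall = 100, // --max-ideas-per-judge-call
): Promise<R[]> {
  const results: R[] = [];
  for (const batch of chunk(ideas, maxIdeasPerCall)) {
    results.push(...(await judge(batch)));
  }
  return results;
}
```

For 15,868 ideas this yields 158 full batches plus one batch of 68, which is the 159 calls quoted above. At the ~350 tokens per idea cited in the root causes, a 100-idea batch carries roughly 35,000 tokens of ideas, well inside a 1M-token context.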

P4: Unicode sanitization (orchestrator.ts)

Strip unpaired UTF-16 surrogates before building cross prompts
Prevents JSON-encoding crashes on OCR/import-derived pages
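One way to strip unpaired surrogates is a regular expression over UTF-16 code units, sketched below. The function name is hypothetical and this is not necessarily how orchestrator.ts does it. The sketch removes lone surrogates outright, as the PR describes, rather than replacing them with U+FFFD.

```typescript
// Matches a high surrogate not followed by a low surrogate, or a low
// surrogate not preceded by a high surrogate. No `u` flag is used, so the
// pattern operates on individual UTF-16 code units.
const LONE_SURROGATE =
  /[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/g;

// Remove unpaired surrogates; valid surrogate pairs (e.g. emoji) are kept.
function stripLoneSurrogates(text: string): string {
  return text.replace(LONE_SURROGATE, "");
}
```

Applied to every string that enters a cross prompt (page bodies, titles, and the question), this leaves well-formed text unchanged and drops only the code units that cannot be encoded as valid UTF-8.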

Postmortem

The full incident report, with token flow forensics and architectural proposals (global budgets for all analysis functions, diarization, checkpointing), is in docs/incidents/2026-05-20-lsd-cost-explosion.md.

Proposed Future Work

P5: Global token/time/cost budgets for ALL gbrain analysis functions (brainstorm, dream, extract, enrich, eval, integrity, doctor)
P6: Diarization — summarize oversized payloads to fit context instead of failing
P7: Structured error recovery with checkpointing for interrupted runs

Tests

12 new tests in test/brainstorm/cost-guardrails.test.ts
Full brainstorm suite: 82 pass, 0 fail
tsc --noEmit: clean

- Add --max-cost flag (default $5) to brainstorm/lsd commands; hard-aborts pre-run if estimate exceeds, and mid-run if running cost overshoots.
- Add --max-far-set flag (default max(m*4, 50)) to cap the domain bank's prefix-stratified sampling. listPrefixSampledPages returns one page per prefix; on a 13K-page brain with ~2K distinct prefixes this was pulling ~1985 far pages instead of the configured m=6. fetchFar now shuffles + caps the prefix list, and trims final pages to m by distance score.
- Add --strict-budget flag: abort mid-run if running cost exceeds 5x the initial estimate (warn-only by default).
- Chunk the judge phase (default 100 ideas per LLM call, --max-ideas-per-judge-call to override). Large brain runs produced 15K+ ideas, blowing past the model's 1M-token context in a single call. Now batched and concatenated.
- Add --judge-model flag for routing the judge phase to a larger-context model when needed.
- Sanitize unpaired UTF-16 surrogates in cross-prompt content (close+far page bodies, titles, question) to prevent JSON-encoding crashes on OCR/import-derived pages with lone surrogates.

Fixes: 53x cost overrun on 13K-page brain ($0.96 estimate vs $50.71 actual)
Fixes: judge phase 3M-token overflow > 1M model context
Fixes: 1985-page far set when m_far was configured at 6

Incident report covering:
- Root cause analysis (5 contributing factors)
- Observed token flow and cost breakdown
- Implemented fixes (P1-P4) in dc080ac
- Proposed architectural changes:
  - P5: Global token/time/cost budgets for ALL analysis functions
  - P6: Diarization/summarization for oversized payloads
  - P7: Structured error recovery with checkpointing

Key insight: every gbrain analysis function that makes LLM calls needs configurable budgets (tokens, cost, wall-clock time) with graceful degradation on exhaustion.

root added 2 commits May 20, 2026 16:08

garrytan-agents wants to merge 2 commits into garrytan:master from garrytan-agents:fix/brainstorm-cost-guardrails
