feat(0.5.0): auto-load task-relevant KG entities in intake (BRO-1295) #6
Closes the "contextualized on who, not on what" gap. Until now intake surfaced only persona constraints on every prompt; task-specific entities loaded only when a domain lens scored >=2. For off-lens prompts (most), no topic knowledge was auto-loaded. Now every turn carries the top-5 relevant entities from the graph.

Mechanism (self-contained: no cross-skill import of the kg loader):
- After lens selection, scan docs/knowledge-index.md (dense-catalog-v2), score each entity's slug/tags/claim vs prompt tokens, surface top-5 in a new "Task-relevant knowledge (auto-loaded by relevance ...)" block.

Guards (found + fixed via P11 validation against the live catalog):
- Curated-match gate: surface only on slug/tag overlap, never body-text alone.
- Body-excerpt suppression: weak-core_claim entities render path-only, not markdown noise (ellipsis + structural-marker heuristic).
- Robust header parse: tolerates trailing " · score N/9" that drifted the parser.
- Weighted slug(3)/tag(2)/claim(1), min 3, dedup vs persona/lens, persona skipped.
- Never blocks the turn: any absence/parse error -> empty list, exit 0.

4 new tests; full suite 51/51 green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
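The scoring rule and guards in the commit message above can be sketched roughly as follows. This is an illustration under stated assumptions, not the hook's code: the `Entity` record, the `tokens` tokenizer, and the `type/slug` dedup key are hypothetical; only the weights (3/2/1), the minimum score of 3, the top-5 cut, the curated-match gate, and the persona skip come from the message.

```python
import re
from dataclasses import dataclass, field

# Weights and thresholds as stated in the commit message.
W_SLUG, W_TAG, W_CLAIM = 3, 2, 1
MIN_SCORE = 3
TOP_N = 5


@dataclass
class Entity:  # hypothetical record; the real catalog row shape is not shown
    type: str
    slug: str
    tags: list = field(default_factory=list)
    claim: str = ""


def tokens(text):
    """Lowercase alphanumeric tokens of length >= 3 (an assumed tokenizer)."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if len(t) >= 3}


def score(entity, prompt_tokens):
    slug_hits = len(tokens(entity.slug) & prompt_tokens)
    tag_hits = len(tokens(" ".join(entity.tags)) & prompt_tokens)
    claim_hits = len(tokens(entity.claim) & prompt_tokens)
    # Curated-match gate: claim text alone never surfaces an entity.
    if slug_hits == 0 and tag_hits == 0:
        return 0
    return W_SLUG * slug_hits + W_TAG * tag_hits + W_CLAIM * claim_hits


def top_relevant(entities, prompt, already_shown=frozenset()):
    """Return up to TOP_N 'type/slug' keys, best score first."""
    prompt_tokens = tokens(prompt)
    scored = []
    for e in entities:
        key = f"{e.type}/{e.slug}"
        # Persona entities are skipped; entities already surfaced by the
        # persona or lens blocks are deduplicated (key format is assumed).
        if e.type == "persona" or key in already_shown:
            continue
        s = score(e, prompt_tokens)
        if s >= MIN_SCORE:
            scored.append((s, key))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [key for _, key in scored[:TOP_N]]
```

With these weights a single tag match (score 2) is not enough on its own; an entity needs at least one slug hit, or a tag hit backed by a claim hit, to clear the minimum of 3.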
…uppression (BRO-1295)

P20 cross-review (FAIL 6/10) found two real, verified, untested defects:

1. BLOCKING: `except OSError` does not catch `UnicodeDecodeError` (a ValueError), so a single non-UTF-8 byte in any scraped catalog claim crashed the every-prompt hook (exit 1). Siblings already use `except Exception`; my narrow catch was a regression.
   Fix: broaden to `except Exception` + read with errors="replace" + an 8 MiB size cap (keeps the hook inside its time budget at any graph size).
2. BLOCKING: `_clean_claim_or_none` matched interior " > " as a markdown blockquote, false-suppressing legit math claims. On the live catalog this stripped the claim off `stability-budget` itself (the flagship RCS entity: "lambda_i must stay > 0 …").
   Fix: detect only a *leading* "> "; the ellipsis check still catches the real body excerpts.

Non-blocking (also applied): type whitelist (drop discovery/question noise), dedup by type/slug (not bare stem), catalog size cap.

Tests: +2 regression tests (math-inequality claim survives; non-UTF-8 catalog exits 0). Suite 53/53.

Verified on the live 318-entity catalog: stability-budget now renders its full claim inline; 0 discovery/question noise on generic prompts.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
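A minimal sketch of the first fix, assuming a helper named `read_catalog_text` (hypothetical; the hook's real function is not shown). It also assumes the 8 MiB cap is applied by reading at most that many bytes; the actual hook may instead skip an oversized catalog.

```python
MAX_CATALOG_BYTES = 8 * 1024 * 1024  # 8 MiB cap, per the commit message


def read_catalog_text(path, max_bytes=MAX_CATALOG_BYTES):
    """Return the catalog text, or "" on any problem. Never raises."""
    try:
        # Bounded binary read: the cost stays fixed however large the file is.
        with open(path, "rb") as fh:
            raw = fh.read(max_bytes)
        # errors="replace" turns invalid bytes into U+FFFD instead of raising
        # UnicodeDecodeError, which is a ValueError and not an OSError.
        return raw.decode("utf-8", errors="replace")
    except Exception:
        # Missing file, permission error, or anything unexpected: the caller
        # sees an empty catalog and the turn proceeds.
        return ""
```

Either guard alone would have prevented the crash described above; using both means a stray byte degrades one claim rather than the whole block, and any remaining failure still ends in an empty result.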
P20 Cross-Review: verdict PASS (after one fix round)

Round 1 (FAIL 6/10): fresh-context adversarial review found two real, verified, untested blocking defects:

Fixes (c48515e): broaden to

Round 2 (PASS): re-verification confirmed: both resolved; no new regressions; suite 53/53; and the 2 new tests genuinely gate the fixes (reverting B1 →
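The blockquote fix from this review round can be illustrated with a small sketch. The function below is a hypothetical stand-in for `_clean_claim_or_none`, not its actual body: it assumes a truncated body excerpt ends in an ellipsis, and the set of structural markers shown is a guess.

```python
# Assumed structural markers; the hook's real list is not shown in the PR.
LEADING_MARKERS = ("> ", "#", "|", "```")


def clean_claim_or_none(claim):
    """Return a claim safe to render inline, or None to render path-only."""
    text = claim.strip()
    if not text:
        return None
    # A real claim is stored whole, so a trailing ellipsis signals a
    # truncated body excerpt (assumption: the check is on the ending).
    if text.endswith(("…", "...")):
        return None
    # Only a *leading* marker counts. An interior " > " is ordinary text,
    # for example a math inequality.
    if text.startswith(LEADING_MARKERS):
        return None
    return text
```

Under these assumptions a claim such as "lambda_i must stay > 0 for all i" passes through unchanged, while "> quoted body text" and "Some excerpt of the body…" both fall back to path-only rendering.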
Problem
Per-task knowledge contextualization was manual. The `role-x` intake hook auto-surfaced only persona constraints on every prompt; task-specific entities loaded only when a domain lens scored ≥2. For off-lens prompts (most), no topic knowledge auto-loaded: the agent was contextualized on who the user is, not on what we already know about the task. The `kg` skill was the manual fallback.

This was surfaced by an audit (BRO-1288 arc): capture + catalog-index are automated (Stop hooks), substrate is healthy (355 entities, 0 lint errors), retrieval is decent (R@5 0.86 via `bookkeeping bench`), but the read path wasn't wired into every turn.

Change
After lens selection, intake scans the dense knowledge catalog (`docs/knowledge-index.md`, schema `dense-catalog-v2`), scores each entity's slug/tags/claim against the prompt tokens, and surfaces the top-5 under a new "Task-relevant knowledge (auto-loaded by relevance ...)" block.

Self-contained: no cross-skill import of the `kg` loader, so the hook degrades gracefully (empty list, exit 0) when the catalog or kg is absent. Reuses the existing `_tokenize_prompt` + `_confined_entity_path` + `_safe_inline` helpers.

Guards (each found + fixed by P11 validation against the live 318-entity catalog)
- Body-excerpt suppression: weak-`core_claim` entities render path-only. An excerpt is detected by `…` (a real ≤140-char claim is stored whole) or structural markers.
- Robust header parse: a trailing ` · score N/9` broke the `\]$` anchor and made the parser drift, mis-attributing an adjacent block's text (caught a real entity surfacing the wrong claim).
- `persona`-type skipped.

Validation (P11)
Dep-chain (P14)
- `_format_intake_context`, `_tokenize_prompt`, `_confined_entity_path`, `_safe_inline`; the `dense-catalog-v2` format emitted by `bookkeeping index`.
- `UserPromptSubmit` intake hook (every prompt, all workspaces). Merge to `main` = published; `.agents/` + global snapshots refreshed via `npx skills update role-x`.

Closes BRO-1295 · follows BRO-1288/BRO-1292.
🤖 Generated with Claude Code