feat(verify): add HTML embed path and extract parallel-generation rules #26
bensonwong wants to merge 1 commit into
…es file

- New §3 "HTML annotation path": guides agents to annotate a source HTML file with data-cite="N" attributes and append <<<CITATION_DATA>>> after </html>, then run verify --html to preserve original HTML structure
- New §2 triage row for "embed citations into static HTML" case; narrows the existing "Existing verified HTML" row to CLI-prior-run only
- §4 --html note updated to cover both the embed-into and re-verify cases
- Parallel generation guidance extracted to rules/parallel-generation.md; SKILL.md now defers to it for 100+ page / 3+ file scenarios
- AGENTS.md guidance router updated with parallel-generation.md entry
- Removed cloud-sandbox probe block, proxy invariants, and tool alternatives list (moved to their respective rules files)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
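The embed step described in the commit message can be sketched as follows. This is a minimal illustration, not the CLI's implementation: the helper names append_citation_data and split_citation_data are hypothetical, the payload is treated as opaque text because its format is not stated here, and the only details taken from the PR are the <<<CITATION_DATA>>> marker and its position after </html>.

```python
MARKER = "<<<CITATION_DATA>>>"  # marker name taken from the PR text


def append_citation_data(html: str, payload: str) -> str:
    """Append the marker and an opaque payload after the closing </html> tag.

    Hypothetical helper: the payload format is an assumption, so it is
    passed through untouched.
    """
    if not html.rstrip().lower().endswith("</html>"):
        raise ValueError("expected a complete HTML document ending in </html>")
    return html.rstrip() + "\n" + MARKER + "\n" + payload


def split_citation_data(document: str) -> tuple[str, str]:
    """Split a document into (html, payload) at the first marker.

    Everything from the marker onward is removed from the HTML part,
    so the raw payload would not be rendered by a browser.
    """
    html, found, payload = document.partition(MARKER)
    if not found:
        # No marker present: the whole input is HTML and there is no payload.
        return document, ""
    return html.rstrip(), payload.strip()
```

Splitting at the first marker keeps the round trip simple: whatever was appended comes back unchanged, and the HTML part is exactly the original document without trailing whitespace.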
PR Review

Overall this is a well-structured refactor. Extracting the parallel pipeline into its own rules file and adding the HTML embed path are both good moves. A few issues worth addressing before merge.

Blocking / Correctness

Threshold mismatch: three inconsistent definitions of when to use parallel generation
The original SKILL.md was internally consistent ("2+ distinct sections / 2+ top-level headings"). Pick one formulation and apply it in all three places.
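One way to keep the three places in agreement is to state the trigger once and refer to it everywhere. The sketch below is only an illustration: the function name is hypothetical, the 100-page and 3-file thresholds are borrowed from the PR summary, and combining them with "or" is an assumption, since the PR does not say how the two conditions interact.

```python
PAGE_THRESHOLD = 100  # "100+ page" from the PR summary
FILE_THRESHOLD = 3    # "3+ file" from the PR summary


def should_use_parallel_generation(pages: int, files: int) -> bool:
    """Return True when the parallel pipeline would apply.

    Assumption: either condition alone is enough to trigger it.
    """
    return pages >= PAGE_THRESHOLD or files >= FILE_THRESHOLD
```

Whichever formulation is chosen, writing it as a single predicate makes any disagreement between documents easy to spot.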
Old text named the forbidden tools (… New text collapses to: "If … This is the exact moment agents are most tempted to reach for an alternative tool. The explanation needs to live here: either restore it inline or add an explicit pointer to the file that carries the full rule.

Significant

Cloud-sandbox probe removed with no guaranteed load trigger

The original section warned: "false negatives are catastrophic because the agent loses awareness of the 45 s bash timeout, the …

The proactive probe (…

Proxy invariant dropped from Invariants

"Never modify proxy environment variables on individual command runs" was listed in Invariants (the section that applies everywhere, sandbox or not) precisely because this failure mode appears outside sandboxes too. The rules file is on-demand only. Restoring this line to Invariants is one sentence and closes a real gap.

Minor / Polish

Triage row example text is awkward

Old: … A file doesn't make claims. Consider: …

Results summary format dropped with no pointer

The verified/partial/not-found summary format was removed from the closing step. None of the linked rules files appear to own it. If it's intentionally retired, say so in the section; otherwise agents will invent inconsistent formats.
Fragile if the function is renamed. Rewrite as observable behavior: "The CLI strips everything from …

CLI version note (…

Fine if the minimum supported CLI now always emits …
Summary
- … data-cite="N" attributes on any HTML element and appends <<<CITATION_DATA>>> as raw text after </html>; parseCitationData() strips it before output so it never renders in the browser. Runs verify --html to inject the CDN runtime while preserving original structure and styling.
- … data-citation-key attributes).
- … --html note updated to cover both entry points: embed-into case (.deepcitation/{draft}-body.html) and re-verify of prior CLI output.
- … rules/parallel-generation.md. SKILL.md defers to it for 100+ page / 3+ file scenarios; single-topic and sub-100-page cases stay inline.
- … rules/parallel-generation.md.

Test plan
- SKILL.md §1–§4 end-to-end: confirm read-only, verify/markdown, and verify/html-embed are each clearly differentiated in Orient and Triage
- … data-cite wrapping → <<<CITATION_DATA>>> after </html> → §4 verify --html .deepcitation/{draft}-body.html
- … rules/parallel-generation.md exists and contains the full pipeline (evidence tagging, split math, merge failure recovery)
- … parallel-generation.md under the correct trigger keywords
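
The data-cite step of the test plan could be spot-checked with a small script like the one below. It is a sketch under assumptions: the class and function names are hypothetical, it uses only Python's standard html.parser module, and it simply lists the data-cite values found before the <<<CITATION_DATA>>> marker rather than reproducing anything the CLI does.

```python
from html.parser import HTMLParser

MARKER = "<<<CITATION_DATA>>>"  # marker name taken from the PR text


class DataCiteCollector(HTMLParser):
    """Collect the value of every data-cite attribute, in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.values: list[str] = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        for name, value in attrs:
            if name == "data-cite" and value is not None:
                self.values.append(value)


def collect_data_cite(document: str) -> list[str]:
    """Return data-cite values from the HTML part, ignoring the trailing data block."""
    html_part = document.partition(MARKER)[0]
    collector = DataCiteCollector()
    collector.feed(html_part)
    collector.close()
    return collector.values
```

Comparing the returned list against the keys in the appended data block would show quickly whether any annotation was missed or left dangling.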