[codex] Refine verify skill instructions #27
…es file
- New §3 "HTML annotation path": guides agents to annotate a source HTML file with data-cite="N" attributes and append <<<CITATION_DATA>>> after </html>, then run verify --html to preserve original HTML structure
- New §2 triage row for "embed citations into static HTML" case; narrows the existing "Existing verified HTML" row to CLI-prior-run only
- §4 --html note updated to cover both the embed-into and re-verify cases
- Parallel generation guidance extracted to rules/parallel-generation.md; SKILL.md now defers to it for 100+ page / 3+ file scenarios
- AGENTS.md guidance router updated with parallel-generation.md entry
- Removed cloud-sandbox probe block, proxy invariants, and tool alternatives list (moved to their respective rules files)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
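The HTML annotation path described in the §3 bullet can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the PR text only names the data-cite="N" attribute and the <<<CITATION_DATA>>> marker, so the helper name append_citation_data, the use of a span element, and the JSON-shaped payload are all assumptions made for the example. Whether the real format needs a closing marker is not stated in the PR either.

```python
CITATION_MARKER = "<<<CITATION_DATA>>>"  # marker string quoted in the PR description


def append_citation_data(html: str, payload: str) -> str:
    """Return html with the marker and payload appended after </html>.

    Hypothetical helper: the payload format is NOT specified in the PR,
    so it is passed through as an opaque string.
    """
    body = html.rstrip()
    if not body.lower().endswith("</html>"):
        raise ValueError("expected the document to end with </html>")
    if CITATION_MARKER in body:
        raise ValueError("document already carries a citation block")
    return f"{body}\n{CITATION_MARKER}\n{payload}\n"


# The claim is tagged in place with data-cite="1"; the rest of the markup
# is left untouched so the original structure survives.
source = (
    "<html><body>"
    '<p>Revenue grew 12% <span data-cite="1">in 2023</span>.</p>'
    "</body></html>\n"
)
placeholder_payload = '{"1": "hypothetical citation record"}'  # assumed shape
annotated = append_citation_data(source, placeholder_payload)
print(annotated)
```

The point of appending after </html> rather than injecting into the body is that the source markup is never rewritten, which is what lets verify --html preserve the original structure.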
PR Review

The restructuring intent is good - pulling parallel-generation detail into its own rules file reduces cognitive load for single-document tasks. A few issues need attention before merging.

Bugs / Inconsistencies

1. Flag mismatch: short vs long markdown output flag
SKILL.md section 4 now uses the short form of the markdown output flag, but parallel-generation.md still uses the long form. One will break at runtime. Reconcile them across both files.

2. Model flag removed in SKILL.md but kept in parallel-generation.md
The main verify command in SKILL.md dropped the model flag, matching the PR description. But the merge+verify command in parallel-generation.md still includes it. Either remove it from the rules file too, or restore it to SKILL.md. The current state is contradictory.

3. Parallel-generation trigger condition diverges between files
SKILL.md states the trigger as "100+ pages AND 3+ distinct files". parallel-generation.md states it as "100+ pages AND 2+ distinct topics". Files vs. topics, and 3 vs. 2 - agents reading only SKILL.md will apply the wrong threshold. Reconcile these or have SKILL.md defer entirely to the rules file.

Content Regressions

4. Explicit alternative-tool list removed
The old block named pdfplumber, PyPDF2, Tesseract, libreoffice, curl/wget, etc. by name. The replacement ("if you are tempted to reach for a generic read tool...") is weaker - agents pattern-match on concrete names. Consider keeping the list collapsed or moving it to a rules/tool-precedence.md file the router points to.

5. Proxy and timeout hard rules silently dropped
Two HARD RULES removed from the Invariants section with no replacement anywhere:
These are not in cloud-sandbox-constraints.md either. If moving them there, do so explicitly.

6. Cloud sandbox probing logic removed without a redirect
SKILL.md used to tell agents exactly when/how to probe for sandbox markers and explicitly warned not to gate solely on CLAUDE_CODE_REMOTE. That probing logic is gone, and the AGENTS.md router only routes to cloud-sandbox-constraints.md when an agent is already working on sandbox behavior. A first-time agent will not know to load the file. Add at minimum a note to probe for sandbox markers before the first deepcitation command.

7. prepare text flag change unexplained
The prepare example dropped the text flag. If it was removed from the CLI, a brief note confirming this is intentional would help reviewers.

Minor Notes

8. verify html example now includes claim flag
The new html example adds a claim flag which was absent before. If this is now required, clarify when to omit it (e.g. re-running existing verified HTML).

9. Results summary format removed without replacement
The checkmark/warning/X summary line gave users a scannable result at a glance. The new Step 4 closure is less structured. If leaving format to the agent, say so explicitly.

What works well
Summary: The structural refactor is sound, but items 1-3 are bugs producing broken CLI commands or wrong agent behavior. Items 4-6 remove safety guardrails. Fix 1-3 before merging; decide whether 4-6 need explicit re-anchoring in the rules tree.
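Items 1 and 2 are both cases of two files documenting the same command with different flags. A small check along the following lines could catch that kind of drift before review. It is only a sketch: the function names are invented for illustration, and the two command strings are stand-ins rather than the real contents of SKILL.md or parallel-generation.md (NAME is a placeholder value).

```python
import re

# Matches -x or --long-name when not glued to a preceding word or hyphen,
# so attribute names such as data-cite are not mistaken for flags.
FLAG_PATTERN = re.compile(r"(?<![\w-])--?[A-Za-z][\w-]*")


def flags_in(text: str) -> set[str]:
    """Return every CLI-style flag found in the text."""
    return set(FLAG_PATTERN.findall(text))


def flag_drift(first: str, second: str) -> tuple[set[str], set[str]]:
    """Return (flags only in first, flags only in second)."""
    a, b = flags_in(first), flags_in(second)
    return a - b, b - a


# Stand-in snippets, NOT the actual file contents.
skill_snippet = "verify --html page.html"
rules_snippet = "verify --html page.html --model NAME"

only_skill, only_rules = flag_drift(skill_snippet, rules_snippet)
print("only in SKILL.md snippet:", sorted(only_skill))
print("only in rules snippet:", sorted(only_rules))
```

A check like this only reports differences; deciding which file is correct (item 2) still needs a human or the CLI's own help output.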
What changed
skills/verify/SKILL.md guidance around claim/evidence triage for HTML and other claim-bearing inputs.
prepare example command and removed the stale --model flag from the verify example.
Why
Validation
skills/verify/SKILL.md diff locally.
origin.