refactor(verify): narrow trigger to claim+evidence; drop read-only mode by bensonwong · Pull Request #28 · DeepCitation/skills

bensonwong · 2026-04-16T23:48:51Z

Summary

Skill was firing too broadly — description and §1 Orient included OCR, extract, summarize, read, parse as triggers, so /verify ran even when the user only wanted file contents
Dropped read-only mode from §1 — skill now requires both a claim and an evidence source; plain document reading uses prepare directly and answers normally
Removed Tool precedence section ("ALWAYS use this skill for PDFs…") — overclaim replaced with a single clear note: use prepare for reading, /verify for citing
Dropped §2 Read-only fast path (~13 lines) — no longer a concern of this skill
Triage table 8→6 rows — removed read-only row and the self-correction loop ("you prepared the claims file as evidence")
Collapsed Format 2 k≠claimText subsection (8 lines) into a 2-line gotcha note — concept still enforced by field table and STOP AND CHECK
Updated frontmatter description to verify-only scope
Net: 309 → 271 lines (−38)
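
As a rough illustration of the narrowed trigger described above, the rule can be written as a small predicate. This is only a sketch: the skill is a prompt document rather than code, and `Request`, `should_run_verify`, and `route` are hypothetical names introduced here, not part of the repository.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical summary of a user prompt (illustration only)."""
    has_claim: bool                # there is a statement or answer to check
    has_evidence: bool             # a document or URL was supplied as a source
    explicit_verify: bool = False  # the prompt contains "/verify"


def should_run_verify(req: Request) -> bool:
    """True when the verify skill should fire under the narrowed trigger."""
    if req.explicit_verify:
        # An explicit /verify fires the skill regardless of other signals.
        return True
    # Otherwise both a claim and an evidence source are required.
    return req.has_claim and req.has_evidence


def route(req: Request) -> str:
    """Pick the path: full verify pipeline, or plain prepare-and-answer."""
    return "verify" if should_run_verify(req) else "prepare-and-answer"
```

Under these assumptions, a PDF with "summarize this" routes to `prepare-and-answer`, while a claim plus an evidence document routes to `verify`.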

Test plan

Prompt with only a PDF and "summarize this" — skill should NOT fire; model uses prepare and answers directly
Prompt with a claim + evidence doc — skill fires, full §1→§4 pipeline runs
/verify in prompt — skill fires regardless
HTML embed case — §2 triage routes to HTML annotation path correctly
Existing verified HTML re-run — triage routes to verify --html correctly

Skill was firing on any document mention (read, OCR, summarize, extract) and running the full pipeline when the user only wanted file contents.

- Remove read-only mode from §1 Orient; the skill now requires both a claim and an evidence source to run
- Drop Tool precedence section ("ALWAYS use this skill for PDFs...")
- Drop §2 Read-only fast path (13 lines)
- Remove read-only and self-correction-loop rows from §2 triage table (8 rows → 6)
- Collapse redundant Format 2 k≠claimText subsection into a 2-line gotcha
- Update frontmatter description to reflect verify-only scope
- Add explicit note: if only a document is provided, use prepare directly and answer normally; /verify only kicks in when there is something to cite

309 → 271 lines

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

claude · 2026-04-16T23:53:27Z

Review: refactor(verify): narrow trigger; drop read-only mode

Overall the refactor is sound. Narrowing the trigger is correct — the old description fired the skill on plain reads, burning steps unnecessarily. The changes are internally consistent.

What works well

Frontmatter description is now precise. The old version was a catch-all; the new one correctly scopes to verify/cite/fact-check.

citation-anchors.md reference block (lines 96-103) is now a structured when-to-consult list rather than an afterthought at the end of the Format 2 subsection. Clearer decision criteria for when to open that rules file — a genuine improvement.

Field table k note (line 166) captures both formats in one place. Cleaner than the old 8-line subsection, and STOP AND CHECK (line 208) reinforces it identically.

Triage table simplification is justified. The "you prepared claims as evidence" row was a self-correction path that the narrower trigger now prevents upstream.

Minor gaps worth addressing

1. Follow-up verification: no guidance on re-using an existing .txt

Line 15 introduces: "If the user first asks a question about a document, answer it using prepare. If they then ask you to verify that answer, run this skill..."

But there is no hint about whether to re-run prepare or reuse .deepcitation/name.txt if it still exists. The removed read-only fast path had this explicitly: "If the same document was used and .deepcitation/name.txt still exists on disk, no need to re-run prepare."

Worth preserving, either appended to line 15 or as a note in §2 under the file/URL row. Without it, an agent will re-run prepare unnecessarily on a doc it already processed.
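
A minimal sketch of the reuse check, assuming the prepared text lives at `.deepcitation/<name>.txt` relative to the working directory, as the removed fast-path note suggests. The helper names `prepared_text_path` and `needs_prepare` are hypothetical and only illustrate the decision; they are not part of the skill or its CLI.

```python
from pathlib import Path


def prepared_text_path(doc_name: str, workdir: Path = Path(".")) -> Path:
    """Assumed location of the text produced by an earlier prepare run."""
    return workdir / ".deepcitation" / f"{doc_name}.txt"


def needs_prepare(doc_name: str, workdir: Path = Path(".")) -> bool:
    """Return True when no prepared .txt exists, so prepare must run again."""
    return not prepared_text_path(doc_name, workdir).is_file()
```

An existence check alone does not prove that the file came from the same document, so the note in the skill would still need the "same document" condition from the original wording.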

2. Format 2 gotcha example (low priority)

"converts automatically" reads like a SAFE agreement. AGENTS.md favors generic, domain-neutral examples (invoices, leases, contracts, memos). A SAFE is a contract so it is technically fine, but something more self-evidently neutral — e.g. "expires automatically" from a lease — would better match the spirit of the invariant.

3. Test plan gap

No coverage for the follow-up scenario introduced on line 15 (user reads first, then verifies). Worth adding: prepare a doc, get an answer, then ask to verify — skill fires, prepare is skipped if .txt exists.

No issues with

Removal of the Tool precedence section ("ALWAYS use this skill") — overclaim, rightly gone
Preamble + prepare in same turn (unchanged, still correct)
HTML annotation path (untouched)
STOP AND CHECK (untouched)
Invariants section (untouched)

Net: approve. The re-prepare optimization is the one substantive thing worth adding before merge; the rest are suggestions.

- Add re-prepare optimization note to follow-up scenario (line 15): if .deepcitation/<name>.txt exists from a prior prepare run, skip §2 and go straight to §3, which avoids a redundant prepare on the same doc [must-fix]
- Swap SAFE-flavored Format 2 gotcha example ("converts automatically") for a neutral lease-based one ("renewed automatically" / "automatically renew") per AGENTS.md domain-neutral example invariant [nice-to-have]

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

bensonwong · 2026-04-16T23:55:30Z

@claude review the latest changes

bensonwong merged commit 271b8bb into main Apr 17, 2026

bensonwong deleted the refactor/verify-narrow-to-claim-evidence branch April 17, 2026 00:01

bensonwong merged 2 commits into main from refactor/verify-narrow-to-claim-evidence
