
refactor(verify): proxy hard rule, retry guard, multi-signal sandbox detection, Format 1 citations, page/line tagging #24

Merged
bensonwong merged 4 commits into main from refactor/verify-proxy-hard-rules-tagging
Apr 14, 2026


@bensonwong bensonwong commented Apr 14, 2026

Contributor

Summary

  • Proxy rule (breaking reversal): Remove the old "clear stale localhost proxy" snippet — it actively destroyed working Cowork sessions. Add an explicit hard rule: never unset/override HTTP_PROXY or HTTPS_PROXY. The CLI auto-detects the proxy correctly; touching these variables breaks it.
  • Retry / fallback hard rules: If prepare/verify exits non-zero or is killed, stop and report stderr: no backgrounding with nohup, no sleep/timeout wrappers, no --out↔--text swaps. No direct-read fallback (Read/urllib/web-fetch cannot substitute for a failed prepare). One exception: if prepare emits only the 2-line banner (truncated output), retry once.
  • Sandbox detection (multi-signal): Probe on ANY of $CLAUDE_CODE_REMOTE=="true", HTTP_PROXY containing localhost:3128, whoami returning an adjective-color-name, or a prior bash call killed at ~45 s. Do NOT gate on $CLAUDE_CODE_REMOTE alone — it is not reliably set in every Cowork session. Updated cloud-sandbox-constraints.md to match.
  • Format 1 only for verifiable citations: Remove Format 2 guidance from sub-agent prompts — Format 2 was empirically broken (alignment iter4: 55 citations, 0 verified). Add [N] adjacency hard rule: [N] must appear immediately after **k**, never at clause end.
  • Page/line tagging for split-agent evidence files: Replace bare page-range slices with render_chunk() that wraps each page in <page_number_N_index_I> tags and tags lines at index 0, last, and every 5th (matching the CLI renderer). Without tags, subagents confabulate page_id/line_ids from global file offsets, producing citations that 404 in the viewer.
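
A minimal sketch of what a render_chunk() along these lines might look like. Only the page tag name, the "index 0, last, and every 5th" rule, and the 1-based line ids come from this PR. The `pages` mapping, the `<line id="K">text</line>` wrapper shape, taking the page index as page number minus 1, and reading "every 5th" as `idx % 5 == 0` are assumptions for illustration, not the merged code.

```python
def render_chunk(pages: dict[int, str]) -> str:
    """Render {original 1-based page number: page text} as a tagged chunk."""
    parts = []
    for page_number, page_text in sorted(pages.items()):
        # Assumption: the index in the tag is the 0-based page position.
        tag = f"page_number_{page_number}_index_{page_number - 1}"
        # Keep blank lines so idx + 1 stays a true 1-based line id.
        raw_lines = page_text.split("\n")
        last = len(raw_lines) - 1
        body = []
        for idx, line in enumerate(raw_lines):
            if idx == 0 or idx == last or idx % 5 == 0:
                body.append(f'<line id="{idx + 1}">{line}</line>')
            else:
                body.append(line)
        # No trailing newline here: the join below supplies the separator.
        parts.append(f"<{tag}>\n" + "\n".join(body) + f"\n</{tag}>")
    return "\n".join(parts)


# Hypothetical page texts for one agent's slice (pages 3 and 4).
agent_a_pages = {3: "alpha\n\nbeta", 4: "gamma"}
print(render_chunk(agent_a_pages))
```

Because page numbers are passed through unchanged, a slice that starts at page 3 still emits page 3's tag rather than renumbering from 1, which is what lets a subagent copy tags verbatim.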

Test plan

  • Read skills/verify/SKILL.md Step 2 — confirm the proxy-clearing bash block is gone and the "Never modify proxy environment variables" rule is prominent
  • Confirm the Hard rules section (no retry spirals, no direct-read fallback) appears before the environment notes
  • Confirm sandbox detection lists all 4 signals and includes the env | grep + whoami probe snippet
  • Read the split-agent section — confirm render_chunk() code is present and the Format 2 guidance is fully removed
  • Read skills/verify/rules/cloud-sandbox-constraints.md — confirm the detection header matches the multi-signal description in SKILL.md

…detection, Format 1 citations, and page/line tagging

Proxy rule (BREAKING reversal):
- Remove the old "clear stale localhost proxy" snippet — that code was
  actively harmful; the Cowork proxy vars are the session's network
  lifeline, not noise to filter.
- Add explicit hard rule: NEVER unset/override HTTP_PROXY or HTTPS_PROXY.

Retry / fallback hard rules:
- No retry spirals: if prepare/verify exits non-zero or is killed, stop
  and report stderr — don't background with nohup, wrap in sleep/timeout,
  swap --out↔--text, or shrink input.
- No direct-read fallback: cannot substitute Read/urllib/web-fetch for a
  failed prepare and present the result as verified.
- One exception: prepare emitting only the 2-line banner → retry once.
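
The stop-and-report rule can be pictured as a small wrapper. This is a hedged sketch: `run_once_or_stop` is a hypothetical helper, the real prepare/verify command line is not shown in this PR so the caller passes it in, and treating "stdout has at most two lines" as the banner-only case is an assumption.

```python
import subprocess
import sys


def run_once_or_stop(cmd: list[str], banner_lines: int = 2) -> str:
    """Run cmd with no wrappers; retry once only if output is banner-only."""
    for _attempt in range(2):  # first run plus the single permitted retry
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            # Non-zero exit, or a negative code when killed by a signal (POSIX):
            # surface stderr and stop instead of trying variations.
            raise RuntimeError(
                f"command failed ({result.returncode}): {result.stderr.strip()}"
            )
        if len(result.stdout.splitlines()) > banner_lines:
            return result.stdout
    raise RuntimeError("output was banner-only twice; stopping")


# Stand-in command so the sketch runs anywhere; not the real CLI.
demo = [sys.executable, "-c", "print('banner 1'); print('banner 2'); print('payload')"]
print(run_once_or_stop(demo))
```

The wrapper deliberately has no timeout, backgrounding, or argument rewriting: the only branch that loops is the banner-only case, and it loops at most once.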

Sandbox detection (multi-signal):
- Probe on ANY of: $CLAUDE_CODE_REMOTE=="true", HTTP_PROXY containing
  localhost:3128, whoami returning adjective-color-name, or a prior bash
  call killed at ~45s. Do NOT gate on $CLAUDE_CODE_REMOTE alone.
- cloud-sandbox-constraints.md: update detection header to match.
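
As an illustration of "probe on ANY signal", the four checks could be combined as below. This is a sketch under assumptions: `sandbox_signals` is a hypothetical name, the three-lowercase-words pattern is only a guess at what "adjective-color-name" looks like, and the ~45 s kill is passed in as a flag because it can only be observed by the caller. The function only reads the environment mapping and never modifies the proxy variables.

```python
import re
from collections.abc import Mapping

# Assumption: sandbox usernames look like three hyphen-separated lowercase words.
USERNAME_PATTERN = re.compile(r"^[a-z]+-[a-z]+-[a-z]+$")


def sandbox_signals(
    env: Mapping[str, str],
    username: str,
    prior_call_killed_at_45s: bool = False,
) -> list[str]:
    """Return the names of every sandbox signal that fired (read-only)."""
    signals = []
    if env.get("CLAUDE_CODE_REMOTE") == "true":
        signals.append("CLAUDE_CODE_REMOTE")
    if "localhost:3128" in env.get("HTTP_PROXY", ""):
        signals.append("HTTP_PROXY")
    if USERNAME_PATTERN.match(username):
        signals.append("whoami")
    if prior_call_killed_at_45s:
        signals.append("killed_at_45s")
    return signals


# Any single signal is enough to treat the session as sandboxed.
fired = sandbox_signals({"HTTP_PROXY": "http://localhost:3128"}, "root")
print(bool(fired), fired)
```

In a live session the inputs would come from `os.environ` and `getpass.getuser()`; returning the list of fired signals rather than a bare boolean makes it easy to report why a session was classified as a sandbox.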

Format 1 only for verifiable citations:
- Remove Format 2 guidance from sub-agent prompts — Format 2 was proven
  broken in alignment iter4 (55 citations, 0 verified).
- Add [N] adjacency hard rule: [N] must appear immediately after **k**,
  never at clause end.

Page/line tagging for split-agent evidence files:
- Replace bare page-range slice with render_chunk() that wraps each page
  in <page_number_N_index_I> tags and tags lines at idx 0, last, and
  every 5th — matching the CLI renderer. Subagents copy tags verbatim
  into p/l fields; without tags they confabulate page_ids from global
  line offsets, producing citations that 404 in the viewer.
- Add validation: grep -c '<page_number_' before dispatching agents.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

claude Bot commented Apr 14, 2026


PR Review

Overall this is a solid, well-motivated set of changes. The proxy reversal and multi-signal sandbox detection are clearly backed by operational evidence, and the Format 1–only pivot is justified by the empirical data cited (iter4: 55 citations, 0 verified). A few issues worth addressing before merge:


Bugs / correctness

1. Missing Path import in render_chunk() pseudocode

Path(".deepcitation/evidence-a.txt").write_text(render_chunk(agent_a_pages))

Path is not imported anywhere in the snippet. Agents that copy this verbatim will get a NameError. Add from pathlib import Path at the top of the block, or replace with open(..., "w").write(...).

2. Empty-line stripping may break l (line_id) accuracy

raw_lines = [l for l in page_text.split("\n") if l.strip()]

Stripping blank lines before assigning line id values means the tagged indices only count non-empty lines. If the CLI renderer counts all lines (including blanks) when it emits <line id="K"> tags, the IDs generated here will diverge from what the verify CLI expects. The PR says "matches the CLI renderer" — please confirm the CLI also strips empty lines before assigning ids, or remove the if l.strip() filter. A one-line comment would help future readers either way.
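
The divergence is easy to see on a toy page. The example below is hypothetical and assumes the CLI numbers every line, blanks included, which is exactly the point the comment above asks the author to confirm.

```python
# Hypothetical page with blank lines between paragraphs.
page_text = "Title\n\nFirst paragraph\n\nSecond paragraph"

all_lines = page_text.split("\n")
non_blank = [line for line in all_lines if line.strip()]


def line_id(lines: list[str], text: str) -> int:
    """1-based id of the first line equal to text."""
    return lines.index(text) + 1


print(line_id(all_lines, "Second paragraph"))  # 5 when every line is counted
print(line_id(non_blank, "Second paragraph"))  # 3 when blank lines are dropped
```

The same source line gets two different ids depending on the filter, so a citation built from the filtered list would point two lines too early under the all-lines convention.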

3. Validation only checks evidence-b.txt

grep -c '<page_number_' .deepcitation/evidence-b.txt

Agent A's file is never validated. Check both files, or note explicitly that the same check applies to evidence-a.txt.


Guidance gap introduced by Format 2 removal

The old "Format 2 when bold ≠ k — HARD RULE" explicitly covered terms that aren't verbatim in the source (acronym expansions like RLHF, concept labels like forward alignment). That rule is removed without a direct replacement.

The new CoT gate handles it indirectly: "if your planned key phrase doesn't appear word-for-word in f, it's a paraphrase — fix f first, then re-derive k." But for genuinely non-verbatim terms, there's no clear instruction — skip the citation, use the acronym as k, or bold something else?

Consider adding a one-liner: e.g., "If no short verbatim phrase in f can serve as k, bold the closest literal term that does appear — never invent a paraphrase."


Minor nits

Step 1 preamble: parallelism urgency removed

The deleted latency table and "fire prepare in the same turn as the preamble — never wait for a second message" were the only explicit directives preventing agents from delaying prepare on slow-tier evidence (URLs, Office files). The replacement text doesn't preserve that urgency. Even a single sentence — "Emit this preamble and call prepare in the same assistant turn" — would prevent the regression.

[N] adjacency example

The BAD/GOOD example is clear. Worth adding that prose may continue after [N] — just not between **k** and [N] — since the examples don't show a continuation case.
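
A rough way to check the adjacency rule mechanically, including the continuation case. This is a sketch with invented sample sentences and a hypothetical `misplaced_markers` helper; it assumes citations are written literally as `**k**[N]` in markdown.

```python
import re

MARKER = re.compile(r"\[\d+\]")
# A bold span with no line break, immediately followed by its [N] marker.
ADJACENT = re.compile(r"\*\*[^*\n]+\*\*(\[\d+\])")


def misplaced_markers(text: str) -> list[str]:
    """Return every [N] that does not directly follow a closing **."""
    good_starts = {m.start(1) for m in ADJACENT.finditer(text)}
    return [m.group() for m in MARKER.finditer(text) if m.start() not in good_starts]


good = "Revenue grew **12%**[1] year over year, beating guidance."
bad = "Revenue grew **12%** year over year [1]."

print(misplaced_markers(good))  # prose continues after [1], which is allowed
print(misplaced_markers(bad))   # [1] drifted to the clause end
```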


What's working well

  • The proxy hard rule is unambiguous; the rationale ("single fastest way to break a working Cowork session") will prevent future regressions.
  • render_chunk() solves a real confabulation failure mode cleanly; the page-tag preservation requirement (original 1-based indices, not renumbered) is correctly specified.
  • The l field counting instructions (per-page not per-file; count from nearest tag) are precise and actionable.
  • cloud-sandbox-constraints.md update is minimal and consistent with SKILL.md changes.

🤖 Generated with Claude Code

…al validation, non-verbatim k, prepare parallelism urgency, [N] continuation note

- Fix: add `from pathlib import Path` to render_chunk() snippet (NameError on copy-paste)
- Fix: remove `if l.strip()` blank-line filter from raw_lines — count all lines so idx+1 matches the CLI's 1-based line ids
- Fix: extend pre-dispatch validation to both evidence files (was only checking evidence-b.txt)
- Fix: add guidance for non-verbatim k — "bold the closest literal term that does appear word-for-word" when no short phrase in f can serve as k
- Improve: restore the "emit preamble and call prepare in the same assistant turn" directive dropped from Step 1 when the latency table was removed
- Improve: note in [N] adjacency example that prose may continue after [N]

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@bensonwong

Contributor Author

@claude review the latest changes


claude Bot commented Apr 14, 2026


The PR diff is unchanged since my previous review above. Let me re-examine with fresh eyes to confirm what still applies.

Still open:

  1. Validation grep counts 2× pages: grep -c '<page_number_' evidence-b.txt matches both opening (<page_number_N_index_I>) and closing (</page_number_N_index_I>) tags, so it returns 2 × chunk_size, not chunk_size. Use grep -cP '^<page_number_' (anchored, non-closing) to get an accurate page count.

  2. Path import missing from snippet — The render_chunk block calls Path(...).write_text(...) but from pathlib import Path isn't in the displayed snippet. Agents copying this verbatim will get a NameError. Either add the import or switch to open(..., "w").write(render_chunk(...)).

Previously raised, now resolved on re-read:

  • The non-verbatim-term gap (Format 2 removal) is addressed in the PR — the updated CoT gate paragraph already says "If no short verbatim phrase in f can serve as k, bold the closest literal term that does appear word-for-word — never invent or paraphrase."
  • Empty-line handling in raw_lines is correct as written (no blank-line stripping, all lines counted), so l IDs should be consistent.

No new issues found. Fix the two items above and this is good to merge.

🤖 Generated with Claude Code

`grep -c '<page_number_'` matched both opening and closing page tags,
returning 2×chunk_size instead of chunk_size. Use `grep -cP '^<page_number_'`
(anchored to line start) to count only opening tags — closing tags start
with `</` and are excluded by the anchor.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

claude Bot commented Apr 14, 2026


Review update — latest changes

Checking against the two open items from the previous review:

Both are resolved in the current diff:

  1. Path import: from pathlib import Path is now present at the top of the render_chunk code block.
  2. Validation grep accuracy: grep -cP '^<page_number_' (anchored, Perl regex) is used for both evidence-a.txt and evidence-b.txt. Anchoring with ^ means closing tags (</page_number_...>) are not counted, so the count correctly equals chunk size.
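
Since -P is a GNU grep option, the same pre-dispatch check can also be expressed in Python. The sketch below is illustrative only: the sample text and `count_opening_page_tags` are made up, and it assumes each page tag sits on its own line.

```python
def count_opening_page_tags(text: str) -> int:
    """Count lines that begin with an opening page tag."""
    return sum(1 for line in text.splitlines() if line.startswith("<page_number_"))


# Hypothetical two-page evidence chunk.
sample = "\n".join([
    "<page_number_3_index_2>", "alpha", "</page_number_3_index_2>",
    "<page_number_4_index_3>", "beta", "</page_number_4_index_3>",
])
expected_pages = 2

anchored = count_opening_page_tags(sample)
# A bare substring test on "page_number_" sees opening and closing tags alike.
loose = sum(1 for line in sample.splitlines() if "page_number_" in line)

print(anchored == expected_pages, anchored, loose)
```

Running the same function over both evidence files before dispatch gives one number per file to compare against the intended chunk size.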

One minor note

render_chunk() uses "\n".join(parts) where each part already ends with a trailing \n. This produces a doubled newline (an empty line) between page blocks in the output file. It's harmless for parsing (the grep validation and subagent tag-copying aren't whitespace-sensitive), but worth a brief comment if you want the output to be clean for human inspection.
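
The effect of the trailing newline can be reproduced in a few lines. The block strings below are hypothetical stand-ins for rendered pages, not output from the real render_chunk().

```python
# Two hypothetical rendered page blocks, without trailing newlines.
blocks = [
    "<page_number_1_index_0>\nalpha\n</page_number_1_index_0>",
    "<page_number_2_index_1>\nbeta\n</page_number_2_index_1>",
]

# Each part ends with "\n" and join adds another: an empty line appears.
with_trailing = "\n".join(block + "\n" for block in blocks)
# Parts have no trailing "\n": join supplies the only separator.
without_trailing = "\n".join(blocks)

print("\n\n" in with_trailing)
print("\n\n" in without_trailing)
```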


No blocking issues. Good to merge.

🤖 Generated with Claude Code

…hunk

Trailing \n on each part + \n from join produced an empty line between
page blocks. Drop the trailing \n from parts.append so join supplies the
single separator.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@bensonwong bensonwong merged commit ae77e1c into main Apr 14, 2026
@bensonwong bensonwong deleted the refactor/verify-proxy-hard-rules-tagging branch April 14, 2026 17:05