refactor: v0.14.0 verified-safe cleanups (close the inflated compression milestone) by JFK · Pull Request #96 · JFK/gh-issue-driven

JFK · 2026-06-13T05:54:17Z

Closes #89

TL;DR

The v0.14.0 "token + precision optimization" milestone was re-audited against the actual files (using the token-baseline tool shipped in #94). Its premise turned out to be largely invalid. This PR ships the only verified-safe, genuinely-beneficial residue — two small cleanups netting −69 tokens (−0.09%). That negligible number is the finding.

What the re-audit found

Compression is mostly infeasible, not "~28% / ~6,400 tokens":

Slash commands load whole: there is no runtime include / conditional load. So "move the size-heuristic to an appendix" saves nothing (the appendix is still in the file), and cross-file dedup (#90, #93) can't reduce runtime tokens without adding a Read call on every invocation.
The bulk of start.md / ship.md is load-bearing executable spec (gate contracts, state schema, the ## Verdict: parser regex, the "Invoke /plugin:command via the Skill tool" phrasing). Compressing it is not a safe token play.
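A minimal sketch of the "appendix saves nothing" point, assuming a crude whitespace word count as a stand-in for the real tokenizer. The sample text and the `approx_tokens` helper are hypothetical and are not taken from the command files:

```python
def approx_tokens(text: str) -> int:
    # Hypothetical proxy for a tokenizer: count whitespace-separated words.
    return len(text.split())

body = "Step 1: run the gate.\nStep 2: parse the verdict.\n"
heuristic = "Size heuristic: small, medium, large.\n"
tail = "Step 3: ship.\n"

# Layout A keeps the heuristic inline; layout B moves it to an appendix
# at the end of the same file.
inline_layout = body + heuristic + tail
appendix_layout = body + tail + "Appendix\n" + heuristic

inline_count = approx_tokens(inline_layout)
appendix_count = approx_tokens(appendix_layout)

# The whole file is loaded either way, so relocation cannot shrink the count.
print(inline_count, appendix_count)
```

Under this proxy the appendix layout is never smaller, because the relocated text is still part of the loaded file and the appendix heading adds a word of its own.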

The precision "bugs" were phantom:

#89: step-18b precedence is already an If / Else if / Else chain (start.md:1034-1037).
verdict last-wins is already explicit (start.md:556,575,611), and is now test-guarded by #95.
#91: propose.md "invoke both skills in parallel" is correct (batched Skill calls are supported and were used in this milestone's own gate2 runs).
#91: propose.md "regex mismatch" is a harmless subset (extraction [a-z0-9 ] ⊆ validation [a-zA-Z0-9 _.:-]), not a contradiction.
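The subset claim for the two character classes can be checked mechanically. The sketch below uses only the two classes quoted above; how propose.md anchors or repeats them is not shown in this PR, so the check is done one character at a time over printable ASCII:

```python
import re
import string

# Character classes as quoted in the finding above.
extraction = re.compile(r"[a-z0-9 ]")
validation = re.compile(r"[a-zA-Z0-9 _.:-]")

# Collect every printable ASCII character each class accepts.
accepted_by_extraction = {c for c in string.printable if extraction.fullmatch(c)}
accepted_by_validation = {c for c in string.printable if validation.fullmatch(c)}

# Subset means: anything extraction can produce, validation will accept.
extraction_is_subset = accepted_by_extraction <= accepted_by_validation
only_in_validation = accepted_by_validation - accepted_by_extraction

print(extraction_is_subset)
print(sorted(only_in_validation))
```

The first line printed is `True`. The second lists what validation accepts beyond extraction: the uppercase letters plus `_`, `.`, `:` and `-`.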

What this PR actually does (the safe residue)

start.md: delete a verbatim-redundant lang != "en" localization line (was line 649) that duplicated line 647.
goal.md: convert the red-verdict force-continue prose into a compact decision table, preserving every load-bearing detail (the gate2.binary_gate fail exception, phase routing, continue-to steps).
Refreshed tests/fixtures/token-baseline.txt (per the #87 token-baseline workflow): TOTAL ~78,424 → ~78,355 tokens.
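For reference, the headline delta follows directly from the two totals above. A quick arithmetic check, treating the approximate totals as exact:

```python
before = 78_424  # TOTAL before this PR (approximate, per the baseline file)
after = 78_355   # TOTAL after this PR

delta = after - before        # absolute change in tokens
pct = delta / before * 100    # relative change in percent

summary = f"{delta:+d} tokens ({pct:+.2f}%)"
print(summary)  # -69 tokens (-0.09%)
```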

Milestone disposition

refactor: optimize start.md — compress ~28% + fix dry-run/verdict/step-18b precision bugs #89 — closed by this PR (its only real item was the line-649 dedup).
refactor: optimize ship.md + review.md — compress + remove decouple-leftover dup #90 / refactor: optimize goal/propose/tag — prose->table, dedup DRY_RUN, fix propose parallel-Skill claim #91 / refactor: optimize config/doctor/status — PMRP->appendix, schema->bullets #92 — being closed separately with the per-issue verified findings (claims phantom / infeasible / marginal).
build: extract shared boilerplate via generator (trust boundary/lang/pre-flight/reviewer hint) #93 — converted to a design-discussion issue: it conflicts with the explicit "no scripts/ directory" principle (CONTRIBUTING.md:24), doesn't reduce runtime tokens, and its "retire jq-sync-check" claim is incorrect.

The real value of this whole effort already shipped: #94 (the token-baseline measurement tool) and #95 (decline-token parser contract coverage). Those are worth keeping; the compression headline was not real.

Verification

tests/token-baseline.sh --check → reductions shown; snapshot refreshed.
28/28 verdict-parser fixtures, enum-sync, token-baseline self-test, frontmatter — all green.

🤖 Generated via /gh-issue-driven (milestone v0.14.0 wind-down)

The v0.14.0 "token + precision optimization" milestone was re-audited against the actual files (using the token-baseline tool from #94). The premise turned out to be largely invalid:

- The "~28% / ~6,400 token" compression is not achievable: slash commands load whole (no runtime include / conditional load), so relocating sections to an appendix saves nothing, and the bulk of start.md/ship.md is load-bearing executable spec that must not be compressed.
- The claimed precision bugs were phantom: step-18b precedence is already an If/Else-if chain; verdict last-wins is already explicit (and now test-guarded by #95); the propose.md "parallel Skill" instruction is correct (batched Skill calls are supported); the propose.md "regex mismatch" is a harmless subset, not a contradiction.

This commit ships the ONLY verified-safe, genuinely-beneficial residue:

- start.md: delete a verbatim-redundant `lang != "en"` localization line (649) that duplicated line 647.
- goal.md: convert the red-verdict force-continue prose (phase-aware bullets) into a compact decision table, preserving every load-bearing detail (the gate2.binary_gate `fail` exception, phase routing, continue-to steps).

Net effect (per tests/token-baseline.sh): TOTAL ~78,424 -> ~78,355 tokens (-69 tokens, -0.09%). The negligible number is itself the finding: it demonstrates the milestone's compression premise was unfounded, and the token-baseline tool (#87/#94) measuring it is working as intended.

Closes #89

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

JFK marked this pull request as ready for review June 13, 2026 05:57

JFK merged commit 6210090 into main Jun 13, 2026
1 check passed

JFK deleted the 89-refactor/v0140-precision-fixes-safe-dedup branch June 13, 2026 05:57
