feat(goal): cheap Haiku inner-review tier + selectable model + optional Copilot by JFK · Pull Request #78 · JFK/gh-issue-driven

JFK · 2026-06-04T05:17:02Z

Summary

/goal already implemented the "short invocation, policy in the file, DoD over micro-steps" design well, so this is additive, not a rewrite. Three changes, all scoped to the review pipeline and its cost:

Cheap inner-review tier (step 5b). Before /ship's heavy gate2 cascade, a reviewer subagent reviews the working-tree diff (git diff — correct for the pre-PR stage, no PR exists yet) on the goal.inner_review.model tier (default haiku). Fix findings → re-check, capped at max_rounds (default 2). Implementation stays on the normal session model; only the reviewer downshifts. gate2 remains the authoritative gate — the Haiku pass is a fast filter, never a replacement.
Tighter TDD wording (step 5b). Red→Green→Refactor framed as an invariant, not a forced march: drive test-first where there's a test surface, keep cycles tight, and explicitly skip the ritual (no token tests, no mechanical "refactor") when there's nothing to assert.
Cost/selectability controls (reusing /ship's existing contract — no new Copilot keys):
- PR review (the expensive Copilot loop) is optional via the inherited review.provider, or the per-run no-copilot flag (forwarded to /ship).
- Inner-review model is selectable via goal.inner_review.model or the per-run review-model=<tier> flag.

Files

commands/goal.md — step 5b rewrite, flag parsing (no-copilot, review-model=), config resolution, 5c forwarding, plan block, deps table, frontmatter.
commands/config.md — goal.inner_review.* defaults + goal.* table rows + cost-control note.
README.md / README.ja.md — /goal flag-table parity.

Config (new keys, backward-compatible defaults)

"goal": {
  "autonomy": "red-only",
  "max_issues_per_run": 20,
  "inner_review": { "enabled": true, "model": "haiku", "max_rounds": 2 }
}

Defaults keep Copilot on and inner-review on — the cost levers exist but aren't forced. Flip review.provider / inner_review.enabled for a cheaper default posture.

Test results

All existing suites green, no new drift:

enum-sync-check — 33/33 in sync
jq-sync-check — 5/5 in sync
test_state_schema — 60/60 pass
copilot-detection, test_verdict_parser — pass
Built-in-defaults JSON re-validated with the new goal.inner_review block

Notes / follow-ups

No CHANGELOG entry or version bump — those are tag-time, owned by /tag (the CHANGELOG is a release index).
No automated test added for the new flags/config (the command files are prose specs; existing tests cover schema/enum sync, which still pass). A fixture exercising no-copilot forwarding + review-model override could be added if you want coverage there.
Three review layers now exist (Haiku inner pass → optional /code-review fallback → gate2 cascade); the inner tier is deliberately the cheap filter so this doesn't become review bloat.

🤖 Generated with Claude Code

…al Copilot Step 5b now runs a cheap inner-review pass over the working-tree diff (default Haiku tier) before /ship's heavy gate2 cascade, keeping implementation on the normal model and only downshifting the reviewer. gate2 remains the authoritative gate; the Haiku pass is a fast filter. Also tightens the 5b TDD wording (Red-Green-Refactor as an invariant, not a forced march) and adds cost/selectability controls for /goal, reusing /ship's existing contract rather than inventing new keys: - goal.inner_review.{enabled,model,max_rounds} config (default haiku, 2 rounds) - review-model=<tier> per-run flag to re-tier the inner review - no-copilot per-run flag, forwarded to /ship, to skip the expensive Copilot loop (review.provider stays the persistent lever) Docs: config.md goal.* table + cost-control note, README/README.ja flag parity. No CHANGELOG/version bump (tag-time, owned by /tag). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

This PR updates the /gh-issue-driven:goal milestone driver to add a low-cost “inner review” pass before delegating to /ship, along with new configuration defaults and README flag documentation to control review cost and model tier selection.

Changes:

Add an inner-review loop in /goal step 5b (default Haiku tier) with max-rounds and per-run model override.
Make PR review cost controllable by forwarding no-copilot to /ship and documenting inherited review.provider.
Document new goal.inner_review.* defaults in /config and keep README flag tables in sync (EN/JA).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File	Description
`README.md`	Updates `/goal` command synopsis to document inner-review and new flags.
`README.ja.md`	Japanese parity update for `/goal` synopsis and flags.
`commands/goal.md`	Adds inner-review tier, tighter TDD guidance, new flags (`no-copilot`, `review-model=`), and forwarding behavior to `/ship`.
`commands/config.md`	Introduces `goal.inner_review.*` defaults and adds cost-control documentation for `/goal` runs.

@@ -1,8 +1,8 @@
 ---
-description: Phase G of gh-issue-driven — drive a whole milestone to PR. For each open issue it runs start → implement → /code-review → ship (gate2 + PR + Copilot review loop + session-summary), checkpointing to resumable state between issues. Gates on red verdicts (HITL); green/yellow auto-continue. (Full green/yellow unattended — suppressing the delegated start/ship prompts — is wired in #74.)
+description: Phase G of gh-issue-driven — drive a whole milestone to PR. For each open issue it runs start → implement (TDD + a cheap Haiku inner-review pass over the diff) → ship (gate2 + PR + Copilot review loop + session-summary), checkpointing to resumable state between issues. Gates on red verdicts (HITL); green/yellow auto-continue. (Full green/yellow unattended — suppressing the delegated start/ship prompts — is wired in #74.)


+
+**Inner review — cheap tier first, then fix-and-recheck.** Before handing off to `/ship`'s heavy gate2 cascade (5c), run one fast pass over the **working-tree diff** (`git diff` — no PR exists yet at this stage, so this reviews the diff directly):
+- When `INNER_REVIEW_ENABLED` (default `true`): **dispatch a reviewer subagent on the `INNER_REVIEW_MODEL` tier** (default `"haiku"`; the `review-model=<tier>` flag overrides for this run) to review the diff against the issue's acceptance criteria and the repo's conventions, returning concrete, actionable findings only (no praise, no restating the diff). Fix the valid findings, re-run the relevant checks, and repeat this cheap pass until it returns no new findings — capped at `INNER_REVIEW_MAX_ROUNDS` (default `2`). Leftover findings beyond the cap are not lost: gate2 (5c) is the authoritative gate. Log `goal: inner-review (<model>) — <n> finding(s) applied, <rounds> round(s)` so the recap shows it ran.
+- When `INNER_REVIEW_ENABLED` is `false`, or the requested model tier is unavailable: fall back to **`/code-review`** at an effort level matched to the change (`low` for docs/small, `medium` for features, `high`/`max` for risky or wide changes); apply its findings.


+**Controlling review cost.** Because `/goal` fans the review machinery across *every* open issue in a milestone, the Copilot loop is the dominant cost. Two independent levers, neither adding a new `goal.*` key:
+- **PR review (the expensive async Copilot loop)** — optional via the inherited `review.provider`: set it to `"none"` (no automated PR review) or `"code-review"` (single-shot, no polling loop) for a persistent, cheaper default; or pass the per-run `no-copilot` flag to `/gh-issue-driven:goal`, which forwards to `/ship` as `no-copilot` for that run only. With Copilot off, each issue's PR is opened and left for manual review (recorded `needs_human`).
+- **Inner review (the cheap pre-PR pass)** — `goal.inner_review.model` (default `"haiku"`) keeps it on a low-cost tier; the `review-model=<tier>` flag re-tiers it per run, and `goal.inner_review.enabled=false` removes it entirely (falling back to `/code-review`). Implementation always runs on the normal session model — only the review subagent is downshifted.


 arguments:
  - name: target
-    description: "The milestone to finish, e.g. 'finish milestone 0.5.0', a bare milestone title ('v0.5.0'), or a milestone number. Optional trailing flags: 'dry-run' (plan the order and print what would run, touch nothing), 'force' (treat red verdicts as yellow — fully unattended, no HITL at all), 'resume' (continue the most recent unfinished run for this milestone)."
+    description: "The milestone to finish, e.g. 'finish milestone 0.5.0', a bare milestone title ('v0.5.0'), or a milestone number. Optional trailing flags: 'dry-run' (plan the order and print what would run, touch nothing), 'force' (treat red verdicts as yellow — fully unattended, no HITL at all), 'resume' (continue the most recent unfinished run for this milestone), 'no-copilot' (skip the expensive Copilot PR-review loop for this run — forwarded to /ship as the per-run reviewer override), 'review-model=<tier>' (model for the cheap inner-review pass — a model alias like haiku|sonnet|opus or a full id; overrides goal.inner_review.model for this run)."


Copilot AI review requested due to automatic review settings June 4, 2026 05:17

Copilot started reviewing on behalf of JFK June 4, 2026 05:17 View session

JFK merged commit 70cec0b into main Jun 4, 2026
2 checks passed

JFK deleted the feat/goal-haiku-inner-review-optional-copilot branch June 4, 2026 05:20

Copilot AI reviewed Jun 4, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(goal): cheap Haiku inner-review tier + selectable model + optional Copilot#78

feat(goal): cheap Haiku inner-review tier + selectable model + optional Copilot#78
JFK merged 1 commit into
mainfrom
feat/goal-haiku-inner-review-optional-copilot

JFK commented Jun 4, 2026

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

JFK commented Jun 4, 2026

Summary

Files

Config (new keys, backward-compatible defaults)

Test results

Notes / follow-ups

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants