JFK · JFK · Jun 4, 2026 · Jun 4, 2026
diff --git a/commands/goal.md b/commands/goal.md
@@ -152,7 +152,7 @@ Implementation has **two orthogonal layers**, exactly as `/start` step 17b/17c d
 Run the change to green on the issue's acceptance criteria, then run the repo's relevant checks (tests, plus lint/typecheck/build scaled to what the change touches).
 
 **Inner review — cheap tier first, then fix-and-recheck.** Before handing off to `/ship`'s heavy gate2 cascade (5c), run one fast pass over the **working-tree diff** (`git diff` — no PR exists yet at this stage, so this reviews the diff directly):
-- When `INNER_REVIEW_ENABLED` (default `true`): **dispatch a reviewer subagent on the `INNER_REVIEW_MODEL` tier** (default `"haiku"`; the `review-model=<tier>` flag overrides for this run) to review the diff against the issue's acceptance criteria and the repo's conventions, returning concrete, actionable findings only (no praise, no restating the diff). Fix the valid findings, re-run the relevant checks, and repeat this cheap pass until it returns no new findings — capped at `INNER_REVIEW_MAX_ROUNDS` (default `2`). Leftover findings beyond the cap are not lost: gate2 (5c) is the authoritative gate. Log `goal: inner-review (<model>) — <n> finding(s) applied, <rounds> round(s)` so the recap shows it ran.
+- When `INNER_REVIEW_ENABLED` (default `true`): **dispatch a reviewer subagent on the `INNER_REVIEW_MODEL` tier** (default `"haiku"`; the `review-model=<tier>` flag overrides for this run) to review the diff against the issue's acceptance criteria and the repo's conventions. **Direct the reviewer with three operationalized rules** — not the vague "fresh eyes" idiom, which is too weak to steer a small model: (1) **judge the diff on its own merits — do not assume it is correct just because it was written in this same session** (anti-anchoring, the guard against rubber-stamping); (2) **actively hunt for acceptance-criteria gaps and convention violations** — gate2 (5c) is authoritative, so a false positive here is cheap while a miss is costly, so bias toward searching; (3) **ground every finding in evidence** — cite `file:line` and the concrete failure mode. Return concrete, actionable findings only (no praise, no restating the diff). Fix the valid findings, re-run the relevant checks, and repeat this cheap pass until it returns no new findings — capped at `INNER_REVIEW_MAX_ROUNDS` (default `2`). Leftover findings beyond the cap are not lost: gate2 (5c) is the authoritative gate. Log `goal: inner-review (<model>) — <n> finding(s) applied, <rounds> round(s)` so the recap shows it ran.
 - When `INNER_REVIEW_ENABLED` is `false`, or the requested model tier is unavailable: fall back to **`/code-review`** at an effort level matched to the change (`low` for docs/small, `medium` for features, `high`/`max` for risky or wide changes); apply its findings.
 
 This keeps **implementation on the normal session model and only the *inner* review cheap** — the authoritative, expensive review remains the gate2 cascade `/ship` runs in 5c, so the Haiku pass is a fast filter for obvious issues, never a replacement for the real gate.