fix: Apply per-stage model tier uniformly + inject effort via subagent prompt#413
Conversation
There was a problem hiding this comment.
Pull request overview
Updates the Fab pipeline “per-stage model resolution” contract so every dispatch consistently surfaces the resolved {model, effort} (for visibility) and makes the resolved effort consumable at runtime by injecting it into sub-agent prompts (since the Agent tool lacks an effort parameter).
Changes:
- Update skill-stage dispatch prose to (a) surface resolved
model=/effort=at each dispatch site and (b) apply tiers via two seams: Agentmodelparam + prompt effort instruction. - Reconcile specs/findings/memory docs to match the uniform post-intake dispatch model and the new effort-via-prompt seam.
- Add a Go regression test pinning that
.history.jsonlstage-transition shape is driver-blind (modulo the optionaldriverfield).
Reviewed changes
Copilot reviewed 25 out of 25 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| src/kit/skills/fab-fff.md | Updates fff-only ship/review-pr dispatch instructions to surface tier + inject effort via prompt. |
| src/kit/skills/fab-ff.md | Aligns wrapper’s per-stage model note with the two-seam (model+prompt-effort) contract. |
| src/kit/skills/fab-continue.md | Reframes /fab-continue as a one-stage sequencer and documents tier application + nested reviewer tiering. |
| src/kit/skills/_preamble.md | Canonicalizes the two-seam dispatch contract and the “compliance visibility” requirement. |
| src/kit/skills/_pipeline.md | Updates per-stage dispatch instructions (incl. rework loop) to surface tier + inject effort via prompt. |
| src/go/fab/internal/status/mutators_test.go | Adds a regression test asserting history transition shape is identical regardless of driver. |
| docs/specs/stage-models.md | Reconciles spec with uniform post-intake dispatch + effort-via-prompt seam + visibility expectation. |
| docs/specs/skills/SPEC-fab-fff.md | Mirrors fab-fff.md per-stage model dispatch changes. |
| docs/specs/skills/SPEC-fab-ff.md | Mirrors fab-ff.md per-stage model note changes. |
| docs/specs/skills/SPEC-fab-continue.md | Mirrors fab-continue.md sequencer model + tier/effort behavior. |
| docs/specs/skills/SPEC-_preamble.md | Mirrors _preamble.md per-stage model resolution contract updates. |
| docs/specs/skills/SPEC-_pipeline.md | Mirrors _pipeline.md dispatch + visibility + effort-via-prompt updates. |
| docs/findings/per-stage-model-tier-application.md | Marks gaps as closed/ հասցed and documents the residual harness ask. |
| docs/memory/pipeline/schemas.md | Records the driver-blind history-shape invariant and links the new test. |
| docs/memory/pipeline/index.md | Updates “Last Updated” dates for pipeline memory indexes. |
| docs/memory/pipeline/execution-skills.md | Updates execution-skills memory to reflect visibility + effort-via-prompt + uniform dispatch. |
| docs/memory/distribution/index.md | Updates “Last Updated” date. |
| docs/memory/_shared/index.md | Updates shared memory index entry for context-loading. |
| docs/memory/_shared/context-loading.md | Updates shared context-loading memory with the two-seam + visibility contract. |
| fab/changes/260613-m3d4-uniform-stage-model-tier/plan.md | Adds/updates the change plan artifact for this workstream. |
| fab/changes/260613-m3d4-uniform-stage-model-tier/.status.yaml | Updates change status/progress/metrics for the change artifact. |
| fab/changes/260613-m3d4-uniform-stage-model-tier/.history.jsonl | Updates change history log for the change artifact. |
| fab/changes/260613-fgxx-collapse-post-intake-dual-mode/plan.md | Adds/updates the referenced prior change plan artifact. |
| fab/changes/260613-fgxx-collapse-post-intake-dual-mode/.status.yaml | Updates change status/progress/metrics for the referenced artifact. |
| fab/changes/260613-fgxx-collapse-post-intake-dual-mode/.history.jsonl | Updates change history log for the referenced artifact. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| 2. **Triage + rework action**: triage the prioritized findings, select exactly one path per the decision heuristics below, and apply its edits (uncheck tasks / edit `plan.md` / edit `## Requirements`). | ||
| 3. **Re-dispatch apply**: re-run `fab resolve-agent apply` and apply the resolved model/effort, then dispatch `/fab-continue` as a subagent — Apply Behavior, same prompt contract as Step 1 (do NOT run `fab status`; return results only). On success, run `fab status finish <change> apply {driver}` (auto-activates review). | ||
| 4. **Fresh re-review**: re-run `fab resolve-agent review` (once, governing both reviewers + merge) and apply the resolved model/effort, then dispatch a **fresh** `/fab-continue` Review Behavior subagent, same prompt contract as Step 2. Never reuse a prior review subagent's context. | ||
| 3. **Re-dispatch apply**: re-run `fab resolve-agent apply`, surface the resolved `model=/effort=`, and apply both halves (model via the Agent `model` param; effort via the imperative prompt instruction ``Operate at `<effort>` reasoning effort for this task.``, omitted when empty), then dispatch `/fab-continue` as a subagent — Apply Behavior, same prompt contract as Step 1 (do NOT run `fab status`; return results only). On success, run `fab status finish <change> apply {driver}` (auto-activates review). |
There was a problem hiding this comment.
Skipped — not a defect. The double-backtick code span … intentionally wraps content that itself contains single backticks (around <effort>), which is the standard CommonMark idiom for embedding backticks in a code span: the opening run of 2 backticks closes on the next run of exactly 2, so the inner single backticks render literally. Verified by CommonMark tokenization — the span is well-formed and renders correctly.
|
|
||
| | Change | Date | Summary | | ||
| |--------|------|---------| | ||
| | 260613-m3d4-uniform-stage-model-tier | 2026-06-13 | **Per-stage dispatch sites surface the resolved tier + inject effort via the subagent prompt (two-seam dispatch contract)** (Change C — skills/docs prose only, no Go; gated on `260613-fgxx`). Every per-stage dispatch site in the `fab-ff`/`fab-fff` orchestrator family — the shared `_pipeline.md` bracket (Behavior "Per-stage model resolution" note + Steps 1 apply / 2 review / 3 hydrate + Auto-Rework re-apply/re-review items 3/4) and `fab-fff.md` Steps 4–5 (ship/review-pr) — now (1) **surfaces** the resolved `model=/effort=` after `fab resolve-agent <stage>` (carried into the dispatch prompt and/or echoed in step output, so a skipped/mis-resolved tier is **visible rather than silent** — Gap 1b compliance visibility), and (2) applies the tier through **two seams**: the **model** half via the Agent tool's `model` param (unchanged) and the **effort** half via an imperative instruction in the subagent prompt (``Operate at `<effort>` reasoning effort for this task.``, omitted when empty — Gap 2's effort half, previously left as fgxx's deferred follow-up; the Agent tool has no `effort` param). `review` resolves **once** for both reviewers + merge (same model + same effort instruction). `fab-continue.md`'s one-stage-sequencer header note + post-intake dispatch contract + nested-reviewers note carry the same two-seam phrasing; `fab-ff.md`'s sibling note (structural twin of fab-fff's) reconciled to match. The harness-adapter boundary records the **id→alias** detail (the `model` param takes `opus`/`sonnet`/`haiku`/`fable`, not the full `claude-*` id the resolver emits). The lone residual is the **harness ask** — a first-class per-sub-agent `effort` param on the Agent tool, out of fab's control, not built. SPEC mirrors updated same-PR: `SPEC-_pipeline.md`, `SPEC-_preamble.md`, `SPEC-fab-fff.md`, `SPEC-fab-ff.md`, `SPEC-fab-continue.md`. `docs/findings/per-stage-model-tier-application.md` (Gap 1a closed by fgxx, Gap 1b + Gap 2-effort addressed here, item 4 = residual) and `docs/specs/stage-models.md` reconciled. Canonical contract: [_shared/context-loading.md](../_shared/context-loading.md) § Per-Stage Model Resolution. | |
There was a problem hiding this comment.
Skipped — not a defect. The double-backtick code span … intentionally wraps content that itself contains single backticks (around <effort>), which is the standard CommonMark idiom for embedding backticks in a code span: the opening run of 2 backticks closes on the next run of exactly 2, so the inner single backticks render literally. Verified by CommonMark tokenization — the span is well-formed and renders correctly.
|
|
||
| | Change | Date | Summary | | ||
| |--------|------|---------| | ||
| | 260613-m3d4-uniform-stage-model-tier | 2026-06-13 | **Per-Stage Model Resolution gains the two-seam dispatch contract + compliance visibility** (Change C — skills/docs prose only, no Go; gated on `260613-fgxx`). The resolved tier's **two halves now dispatch through two seams**: the **model** rides the Agent tool's `model` param (unchanged), the **effort** is injected as an imperative instruction in the subagent prompt (``Operate at `<effort>` reasoning effort for this task.``, omitted when empty) — closing Gap 2's effort half, which fgxx had left as a deferred follow-up (the Agent tool has no `effort` param). Added a **compliance-visibility** expectation (Gap 1b): every dispatch site surfaces the resolved `model=/effort=` (into the dispatch prompt and/or step output) so a skipped/mis-resolved `fab resolve-agent` call is visible rather than silent, plus a prose all-empty-resolution surface/assert note (no Go guard — dispatch is harness-internal). Harness-adapter boundary now records the **id→alias** detail (the `model` param takes `opus`/`sonnet`/`haiku`/`fable`, not the full `claude-*` id the resolver emits — the orchestrator maps at the seam) and names the prompt as the effort adapter. The lone residual is the **harness ask** — a first-class per-sub-agent `effort` param on the Agent tool, out of fab's control, not built. Mirrors `_preamble.md` § Per-Stage Model Resolution; the `_pipeline.md`/`fab-ff`/`fab-fff`/`fab-continue` dispatch sites and their SPEC mirrors carry the same two-seam phrasing (see [pipeline/execution-skills.md](../pipeline/execution-skills.md)); `docs/findings/per-stage-model-tier-application.md` and `docs/specs/stage-models.md` reconciled to this reality. | |
There was a problem hiding this comment.
Skipped — not a defect. The double-backtick code span … intentionally wraps content that itself contains single backticks (around <effort>), which is the standard CommonMark idiom for embedding backticks in a code span: the opening run of 2 backticks closes on the next run of exactly 2, so the inner single backticks render literally. Verified by CommonMark tokenization — the span is well-formed and renders correctly.
| func loadFixtureInFabRoot(t *testing.T) (statusFile *sf.StatusFile, statusPath, fabRoot, changeDir string) { | ||
| t.Helper() | ||
| fabRoot = t.TempDir() | ||
| // statusFile.Name in the fixture is "260310-abcd-my-change". | ||
| changeDir = filepath.Join(fabRoot, "changes", "260310-abcd-my-change") | ||
| if err := os.MkdirAll(changeDir, 0o755); err != nil { | ||
| t.Fatal(err) | ||
| } | ||
| statusPath = filepath.Join(changeDir, ".status.yaml") | ||
| if err := os.WriteFile(statusPath, []byte(setAcceptanceFixture), 0o644); err != nil { | ||
| t.Fatal(err) | ||
| } | ||
| var err error | ||
| statusFile, err = sf.Load(statusPath) | ||
| if err != nil { | ||
| t.Fatalf("Load: %v", err) | ||
| } | ||
| return statusFile, statusPath, fabRoot, changeDir | ||
| } |
There was a problem hiding this comment.
Deferred — valid latent concern, but out of scope for this PR. This change (m3d4) is scoped skills/prose + docs only with an explicit NO-Go boundary (plan acceptance A-010/R2: "No Go file under src/go/** is modified"). The loadFixtureInFabRoot helper and its hard-coded fixture name were introduced by the cherry-picked 260613-fgxx dependency commit (this PR is gated on fgxx), not by m3d4. Deriving the folder from statusFile.Name via sf.Load is a reasonable improvement, but it belongs to a Go-touching change rather than this prose-only one — flagging it here would expand scope past the documented boundary.
5fb91eb to
cc502df
Compare
Meta
Pipeline: intake ✓ → apply ✓ → review ✓ → hydrate ✓ → ship → review-pr
Impact:
impl: +51/−37 (net +14)
tests: +164/−0 (net +164)
total: +215/−37 (net +178) ← excludes
fab/,docs/Summary
fab assigns each pipeline stage a model tier (
thinking/doing/fast) with a resolved{model, effort}profile. Whilefab resolve-agent <stage>correctly resolves these, two gaps remained: (1b) orchestrators could silently skip the mandatedfab resolve-agentcall with no visibility, and (2) the Claude Code Agent tool has no per-subagenteffortparameter, so the resolved effort was unconsumable at dispatch time. This change closes Gap 1b by emitting resolvedmodel=/effort=lines at every per-stage dispatch site (making skipped tiers visible), and closes Gap 2's effort half by injecting the resolved effort as an explicit instruction in the subagent prompt. It also reconciles docs to reflect that Gap 1a (foreground advisory path) is closed by the already-landed Change A (260613-fgxx), and documents the residual harness ask (a per-subagent effort param on the Agent tool).Changes
src/kit/skills/_pipeline.md— emit/log resolved tier at Steps 1–3 + Auto-Rework items 3/4; inject effort into all dispatch promptssrc/kit/skills/fab-fff.md— Steps 4–5 (ship, review-pr): emit resolved tier + inject effort into dispatch promptssrc/kit/skills/_preamble.md— Per-Stage Model Resolution: document effort-via-prompt seam; remove foreground-advisory paragraph (Gap 1a closed by Change A)src/kit/skills/fab-continue.md— reconcile foreground-advisory caveat in header notedocs/findings/per-stage-model-tier-application.md— Gap 1a "Closed by Change A" → closed; Gap 1b + Gap 2 addressed; harness-ask residual documenteddocs/specs/stage-models.md— reconcile Foreground limitation + Skill wiring sections with uniform behavior + effort-via-promptdocs/specs/skills/SPEC-_pipeline.md,SPEC-_preamble.md,SPEC-fab-fff.md,SPEC-fab-continue.md— mirror skill-file edits (constitution requirement)docs/memory/_shared/context-loading.md,docs/memory/pipeline/execution-skills.md— updated hydrated memory