From be0a0a5aa4784a19d19f653010de6f883e7ee9a8 Mon Sep 17 00:00:00 2001 From: Sahil Ahuja Date: Sat, 13 Jun 2026 19:29:42 +0530 Subject: [PATCH 1/3] fix: apply per-stage model tier uniformly + inject effort into subagent prompt --- .../per-stage-model-tier-application.md | 38 +-- docs/memory/_shared/context-loading.md | 15 +- docs/memory/_shared/index.md | 2 +- docs/memory/pipeline/execution-skills.md | 3 +- docs/memory/pipeline/index.md | 4 +- docs/specs/skills/SPEC-_pipeline.md | 11 +- docs/specs/skills/SPEC-_preamble.md | 20 +- docs/specs/skills/SPEC-fab-continue.md | 4 +- docs/specs/skills/SPEC-fab-ff.md | 2 +- docs/specs/skills/SPEC-fab-fff.md | 2 +- docs/specs/stage-models.md | 39 +-- .../.history.jsonl | 9 + .../.status.yaml | 40 ++- .../plan.md | 247 ++++++++++++++++++ src/kit/skills/_pipeline.md | 12 +- src/kit/skills/_preamble.md | 15 +- src/kit/skills/fab-continue.md | 6 +- src/kit/skills/fab-ff.md | 2 +- src/kit/skills/fab-fff.md | 6 +- 19 files changed, 391 insertions(+), 86 deletions(-) create mode 100644 fab/changes/260613-m3d4-uniform-stage-model-tier/plan.md diff --git a/docs/findings/per-stage-model-tier-application.md b/docs/findings/per-stage-model-tier-application.md index 1347b1d5..500c7d19 100644 --- a/docs/findings/per-stage-model-tier-application.md +++ b/docs/findings/per-stage-model-tier-application.md @@ -2,7 +2,7 @@ **Date**: 2026-06-13 **Area**: subagent dispatch, per-stage model resolution (`fab resolve-agent`) -**Status**: open +**Status**: largely addressed — Gap 1a closed by `260613-fgxx` (intake-is-the-context-boundary); Gap 1b (compliance visibility) and Gap 2's effort half (effort injected via the subagent prompt) closed by `260613-m3d4` (uniform-stage-model-tier). **Residual**: a first-class per-sub-agent `effort` parameter on the Claude Code Agent tool (§ Suggested directions item 4) — a harness ask outside fab's control. **Severity**: medium — correctness-of-execution gap (stages may silently run on the wrong model/effort), not a crash --- @@ -25,19 +25,19 @@ $ fab resolve-agent review → model=claude-opus-4-8 effort=xhigh $ fab resolve-agent ship → model=claude-sonnet-4-6 effort=low ``` -But the resolved profile is **only ever applied at the Agent-tool dispatch seam**. This produces two distinct gaps — one a *compliance* gap (orchestrator can skip the resolution step), one an *architectural* gap (the harness adapter has no effort knob). +The resolved profile is applied at the Agent-tool dispatch seam. This finding originally identified two distinct gaps — one a *compliance* gap (orchestrator can skip the resolution step), one an *architectural* gap (the harness adapter has no effort knob). **All three sub-gaps below are now closed or addressed** (`260613-fgxx` closed Gap 1a; `260613-m3d4` addressed Gap 1b + Gap 2's effort half via visibility + prompt-injection); the lone residual is the harness ask in § Suggested directions item 4. Sections below are retained with their resolution annotations for the historical record. --- ## Gap 1 — Compliance: foreground stages and skipped resolution inherit the session model -### 1a. Foreground stages can't be tiered at all (by design, but easy to forget) +### 1a. Foreground stages can't be tiered at all — CLOSED by `260613-fgxx` -`_preamble.md` § Per-Stage Model Resolution and `fab-continue.md`'s header note both state that per-stage selection is honored **only on orchestrated/sub-agent dispatch**. When a stage runs **directly in the foreground** — i.e. plain `/fab-continue` for apply or hydrate — `fab` cannot switch the session model mid-run, so the configured tier is **advisory only**. The skill "MAY note 'this stage is configured for X; you're on Y' but MUST NOT attempt to switch." +**Historical (pre-`260613-fgxx`)**: `_preamble.md` § Per-Stage Model Resolution and `fab-continue.md`'s header note both stated that per-stage selection was honored **only on orchestrated/sub-agent dispatch**. When a stage ran **directly in the foreground** — i.e. plain `/fab-continue` for apply or hydrate — `fab` could not switch the session model mid-run, so the configured tier was **advisory only**. -Net effect, by path: +Net effect *at the time*, by path: -| Stage | resolve-agent tier | Path A: `/fab-continue` (manual) | Paths B/C/D: `/fab-ff`, `/fab-fff`, `/fab-proceed` | +| Stage | resolve-agent tier | Path A: `/fab-continue` (then-foreground) | Paths B/C/D: `/fab-ff`, `/fab-fff`, `/fab-proceed` | |-------|--------------------|----------------------------------|-----------------------------------------------------| | apply | opus + high | foreground → **session model** (advisory) | subagent → tier applied | | review | opus + xhigh | subagents → tier applied | subagents → tier applied | @@ -45,20 +45,18 @@ Net effect, by path: | ship | sonnet + low | foreground → session model | subagent → tier applied | | review-pr | opus + high | foreground → session model | subagent → tier applied | -(In Path A, `review` is the lone exception: `_review.md` mandates inward+outward **subagents**, so the tier *is* applied there even in the manual path.) +> **CLOSED by [intake-is-the-context-boundary](intake-is-the-context-boundary.md), landed as `260613-fgxx`.** That change collapsed the post-intake dual execution mode: **every** post-intake stage now dispatches a sub-agent regardless of A/B/C/D — plain `/fab-continue` is a one-stage sequencer that resolves `fab resolve-agent ` and dispatches its stage's block just like an orchestrator. With **no foreground execution path left**, `fab resolve-agent` applies uniformly across apply/review/hydrate and Gap 1a is gone — the per-path split above no longer exists (every cell is "subagent → tier applied"). The only residual "advisory" case is a stage skill genuinely run with no dispatch at all, which `fab` cannot switch mid-run by design. This finding's Gap 1b and Gap 2-effort were then closed by `260613-m3d4` (below). -This is intended behavior — but it means "did apply run on opus+high?" has a different answer depending on whether the user ran `/fab-continue` or an orchestrator. Worth stating plainly in user-facing docs. +### 1b. Orchestrators can silently skip the mandated `resolve-agent` call — ADDRESSED by `260613-m3d4` -> **Closed by [intake-is-the-context-boundary](intake-is-the-context-boundary.md)**: if every post-intake stage dispatches a subagent regardless of A/B/C/D (the principle in that finding), there is no foreground execution path left to be the exception — `fab resolve-agent` applies uniformly and Gap 1a disappears. Only Gap 2 (below) survives. The two findings should be planned together. - -### 1b. Orchestrators can silently skip the mandated `resolve-agent` call - -`_pipeline.md` Steps 1/2/3 (and `fab-fff.md` Steps 4/5) each mandate: +`_pipeline.md` Steps 1/2/3 (and `fab-fff.md` Steps 4/5) each mandated (the original Gap-1b prose, since refined by `260613-m3d4` — see below): > immediately **before** dispatching each stage's sub-agent, run `fab resolve-agent ` and pass the resolved model AND effort into the Agent dispatch. If an orchestrator agent **omits** that call and dispatches with no `model` param, the subagent inherits the **session** model/effort instead of being pinned to the stage tier. Nothing enforces the call — it's a prose instruction, not a code-level guard. Observed in the wild: an orchestrated run dispatched apply/review subagents with no model override and the subagents ran on the inherited session profile rather than `opus+high` / `opus+xhigh`. The agent's self-diagnosis correctly identified the symptom ("subagents ran on inherited session settings"), but mis-attributed cause to "in-process Agent dispatch doesn't route through resolve-agent" — the routing *is* defined; the run simply didn't execute the step. +> **ADDRESSED by `260613-m3d4` (compliance visibility).** A true code-level guard is impossible (dispatch is harness-internal — `fab` cannot observe Agent-tool calls). The available seam is **visibility**: every per-stage dispatch site now **surfaces** the resolved `model=/effort=` lines (carried into the dispatch prompt and/or echoed in the orchestrator's step output), so a *skipped* `fab resolve-agent` call — where the sub-agent silently inherits the session profile — is **visible in output rather than silent**. The canonical contract (`_preamble.md` § Per-Stage Model Resolution) also notes that an all-empty resolution is worth surfacing/asserting rather than dispatching blind. This does not *prevent* a skip (no enforcement seam exists) but makes one *detectable* — the cheap, prose-level fix the finding called for. + --- ## Gap 2 — Architectural: the Claude Code Agent tool exposes `model` but no `effort` @@ -70,7 +68,9 @@ Even a fully compliant orchestrator that calls `fab resolve-agent apply` and get - The **model** half of the tier can be pinned per-subagent. ✅ - The **effort** half (`xhigh` vs `high` vs `low`) **cannot** be injected per-subagent through the current Agent-tool adapter. ❌ The subagent runs at whatever effort the session/harness governs. -This is a real harness-adapter limitation, not a compliance miss. It means the `effort=` line `fab resolve-agent` emits is currently **unconsumable** on the per-subagent dispatch path in Claude Code, regardless of orchestrator correctness. +This is a real harness-adapter limitation, not a compliance miss. It means the `effort=` line `fab resolve-agent` emits is **unconsumable through a dispatch *parameter*** in Claude Code, regardless of orchestrator correctness. + +> **Effort half ADDRESSED by `260613-m3d4` (effort-via-prompt); the clean fix remains the residual.** Since the Agent tool has no `effort` param, `260613-m3d4` injects the resolved effort into the **subagent prompt** as an explicit imperative instruction (e.g., ``Operate at `xhigh` reasoning effort for this task.``; omitted when the resolved effort is empty) at every per-stage dispatch site, so the sub-agent self-selects its reasoning effort. The model half stays on the Agent tool's `model` param (unchanged). This is **imperfect** — it relies on the sub-agent honoring the instruction rather than the harness enforcing it — but it is the only per-sub-agent effort seam available today. The clean fix — a first-class per-sub-agent `effort` parameter on the Agent tool (§ Suggested directions item 4) — is **out of fab's control** and remains the **residual** after this change. --- @@ -80,12 +80,12 @@ The whole point of `doing` vs `thinking` tiers is cost/quality calibration — a --- -## Suggested directions (not yet decided) +## Suggested directions -1. **Close Gap 1b with a self-check.** Have orchestrators emit the resolved `model=/effort=` lines into the dispatch prompt and/or log them, so a skipped resolution is visible in output rather than silent. Cheap, prose-level. -2. **Close Gap 2's effort half via the prompt.** Since the Agent tool can't take an effort param, inject the resolved effort into the **subagent prompt** as an explicit instruction ("operate at `xhigh` reasoning effort"), so the subagent self-selects. Imperfect (relies on the subagent honoring it) but it's the only seam available today. -3. **Document the foreground-advisory reality (Gap 1a)** in user-facing docs so users know manual `/fab-continue` does not tier apply/hydrate/ship. -4. **Harness ask**: a per-subagent `effort` parameter on the Agent tool would make Gap 2 closable cleanly. Out of fab's control — flag upstream. +1. **Close Gap 1b with a self-check.** ✅ **Done (`260613-m3d4`).** Orchestrators emit/surface the resolved `model=/effort=` lines (into the dispatch prompt and/or step output), so a skipped resolution is visible in output rather than silent. Cheap, prose-level — as proposed. +2. **Close Gap 2's effort half via the prompt.** ✅ **Done (`260613-m3d4`).** Since the Agent tool can't take an effort param, the resolved effort is injected into the **subagent prompt** as an explicit instruction (``Operate at `xhigh` reasoning effort for this task.``; omitted when empty), so the subagent self-selects. Imperfect (relies on the subagent honoring it) but it's the only seam available today. +3. **Document the foreground-advisory reality (Gap 1a).** ✅ **Superseded by `260613-fgxx`.** There is no longer a foreground-advisory path to document for apply/hydrate/ship — every post-intake stage dispatches a sub-agent and is tiered uniformly. The narrow residual ("a stage skill run with no dispatch at all") is captured in `docs/specs/stage-models.md` § Foreground limitation. +4. **Harness ask** *(residual — not built)*: a per-subagent `effort` parameter on the Claude Code Agent tool would make Gap 2 closable cleanly — injecting effort directly at the dispatch seam instead of via prose in the prompt. This is **out of fab's control** — flag upstream. It is the **only residual** after `260613-fgxx` + `260613-m3d4`; fab builds nothing for it. --- diff --git a/docs/memory/_shared/context-loading.md b/docs/memory/_shared/context-loading.md index d2d69b95..c8e34e02 100644 --- a/docs/memory/_shared/context-loading.md +++ b/docs/memory/_shared/context-loading.md @@ -1,5 +1,5 @@ --- -description: "Smart context loading convention — descriptive 7-file always-load layer (skill file wins; exception set rule-derived from skill files, never enumerated in the preamble — d9rs), opt-in skill helpers (6-value allowlist incl. `_srad`/`_pipeline`) + stage-conditional loading, standard subagent context (orchestrators incl. `/fab-proceed`), per-stage model resolution at the dispatch seam (`fab resolve-agent `, provider-neutral, Claude-Code adapter named, review-resolves-once — l3ja; applies on every post-intake stage since the single-dispatch collapse, advisory only for a genuinely no-dispatch run — fgxx), selective domain loading, SRAD protocol pointer, scoped Next Steps Convention, generic fab-command failure rule (unconditional non-zero exit → STOP; `fab log command` exits 0 by contract)" +description: "Smart context loading convention — descriptive 7-file always-load layer (skill file wins; exception set rule-derived from skill files, never enumerated in the preamble — d9rs), opt-in skill helpers (6-value allowlist incl. `_srad`/`_pipeline`) + stage-conditional loading, standard subagent context (orchestrators incl. `/fab-proceed`), per-stage model resolution at the dispatch seam (`fab resolve-agent `, provider-neutral, Claude-Code adapter named, review-resolves-once — l3ja; applies on every post-intake stage since the single-dispatch collapse, advisory only for a genuinely no-dispatch run — fgxx; the two halves dispatch through two seams — model via the Agent `model` param (short alias opus/sonnet/haiku/fable, id→alias mapped at the seam), effort via an imperative subagent-prompt instruction (no Agent effort param; omit when empty), plus a compliance-visibility expectation that each site surface the resolved `model=/effort=` — m3d4; residual = a per-sub-agent effort param on the Agent tool, a harness ask), selective domain loading, SRAD protocol pointer, scoped Next Steps Convention, generic fab-command failure rule (unconditional non-zero exit → STOP; `fab log command` exits 0 by contract)" --- # Context Loading @@ -107,14 +107,18 @@ This is a subset of the Always Load layer — it includes the 5 `fab/project/**` Per-stage model selection is wired into the **sub-agent dispatch seam** (`_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution is the canonical contract). **Immediately before dispatching each pipeline stage's sub-agent**, the dispatching skill — the orchestrators `/fab-ff`, `/fab-fff`, `/fab-proceed`, and `/fab-continue`'s own sub-agent dispatch — runs `fab resolve-agent ` and passes the resolved profile into the Agent dispatch: - **Output** is two byte-stable stdout lines, `model=` and `effort=` (the `effort=` line is omitted when the resolved tier has no effort). `_cli-fab.md` § fab resolve-agent is the CLI reference. -- **Empty model** ⇒ omit the dispatch `model` param entirely (inherit the orchestrator/session model — today's behavior). **Empty effort** ⇒ omit the effort. +- **Empty model** ⇒ omit the dispatch `model` param entirely (inherit the orchestrator/session model — today's behavior). **Empty effort** ⇒ omit the effort instruction (see § The two halves dispatch through two seams below). - The resolver maps `` → its fixed fab-owned tier → a `{model, effort}` profile (the `agent.tiers` project override per-field-merged over fab-kit's default — see [configuration.md](configuration.md) § `agent`). The stage→tier mapping is NOT user-overridable; `agent.tiers` (tier redefinition) is the sole override surface. Full design: [`docs/specs/stage-models.md`](../../specs/stage-models.md). -**Harness-adapter boundary (Claude Code).** The resolution (stage→tier→`{model, effort}`) is **provider-neutral**; injecting the resolved model into the actual dispatch is **harness-specific**. For Claude Code that adapter is the **Agent tool's `model` parameter** — named explicitly as the Claude-Code adapter, not as universal truth. The coupling is not new: fab's entire subagent-dispatch design is already Claude-Code-shaped, so per-stage selection is exactly as portable as fab's existing dispatch (no more, no less). +**The two halves dispatch through two seams (model param + prompt instruction — m3d4).** The resolved profile has a model half and an effort half, and Claude Code consumes them through *different* seams: the **model** rides the Agent tool's `model` parameter (empty ⇒ omit/inherit); the **effort** is injected as an explicit imperative line in the dispatched subagent prompt — ``Operate at `` reasoning effort for this task.`` — because the Agent tool has **no `effort` parameter**. Empty effort ⇒ omit the instruction (mirroring the empty-model omit rule). The effort-via-prompt seam is imperfect (it relies on the sub-agent honoring the instruction rather than the harness enforcing it) but is the only per-sub-agent effort seam available today; a first-class per-sub-agent `effort` param on the Agent tool would close it cleanly and is the **residual harness ask** ([`docs/specs/stage-models.md`](../../specs/stage-models.md) § Skill wiring + `docs/findings/per-stage-model-tier-application.md` § Suggested directions item 4 — out of fab's control, not built). -**Review resolves once.** The `review` stage spawns two reviewer sub-agents (inward + outward) plus a merge. The dispatcher resolves `fab resolve-agent review` **once** and applies the same profile to all three — the mechanical merge runs at the reviewer's tier, an accepted stage-granularity tradeoff. +**Compliance visibility (m3d4).** Each dispatch site MUST **surface the resolved `model=/effort=` lines** — carry them into the dispatch prompt (the effort line is already there per the seam above; co-locate the model line) and/or echo them in the orchestrator's own step output — so a *skipped* `fab resolve-agent` call (the sub-agent silently inherits the session profile) or a mis-resolved tier is **visible rather than silent**. There is no code-level guard fab can install (dispatch is harness-internal), so visibility is the available seam; the canonical contract also notes an all-empty resolution (both `model=` and `effort=` empty) is itself worth surfacing/asserting rather than dispatching blind. -**Per-stage selection applies on every post-intake stage (fgxx).** Per-stage selection is a property of dispatched sub-agent runs — and since fgxx collapsed the post-intake dual execution mode, **every post-intake stage now dispatches a sub-agent**, including plain `/fab-continue` (a one-stage sequencer that resolves `fab resolve-agent ` and dispatches its stage's block). So there is no post-intake foreground execution path left to be the exception; `fab resolve-agent` applies uniformly across apply/review/hydrate regardless of caller. Intake is pre-boundary — it runs in the main session and is not tiered. The only residual advisory case is a stage skill genuinely run with **no dispatch at all**: fab cannot switch the session model mid-run, so such a skill MAY note "this stage is configured for X; you're on Y" but MUST NOT attempt to switch. (Full uniform-tier closure — incl. the per-sub-agent effort knob — is a deferred follow-up; this paragraph only states the post-intake single-mode reality.) +**Harness-adapter boundary (Claude Code).** The resolution (stage→tier→`{model, effort}`) is **provider-neutral**; injecting it into the actual dispatch is **harness-specific**. For Claude Code the model adapter is the **Agent tool's `model` parameter** and the effort adapter is the subagent-prompt instruction above. One concrete harness detail: the Agent tool's `model` param takes a **short alias** (`opus`/`sonnet`/`haiku`/`fable`), **not** the full versioned id (`claude-opus-4-8`) that `fab resolve-agent` emits — the orchestrator maps the resolved id → alias at the dispatch seam. Named explicitly as the Claude-Code adapter, not as universal truth; the coupling is not new (fab's entire subagent-dispatch design is already Claude-Code-shaped), so per-stage selection is exactly as portable as fab's existing dispatch. + +**Review resolves once.** The `review` stage spawns two reviewer sub-agents (inward + outward) plus a merge. The dispatcher resolves `fab resolve-agent review` **once** and applies the same model AND the same effort-prompt instruction to all three — the mechanical merge runs at the reviewer's tier, an accepted stage-granularity tradeoff. + +**Per-stage selection applies on every post-intake stage (fgxx).** Per-stage selection is a property of dispatched sub-agent runs — and since fgxx collapsed the post-intake dual execution mode, **every post-intake stage now dispatches a sub-agent**, including plain `/fab-continue` (a one-stage sequencer that resolves `fab resolve-agent ` and dispatches its stage's block). So there is no post-intake foreground execution path left to be the exception; `fab resolve-agent` applies uniformly across apply/review/hydrate regardless of caller. Intake is pre-boundary — it runs in the main session and is not tiered. The only residual advisory case is a stage skill genuinely run with **no dispatch at all**: fab cannot switch the session model mid-run, so such a skill MAY note "this stage is configured for X; you're on Y" but MUST NOT attempt to switch. (The effort half of the tier is now injected via the subagent-prompt instruction — see § The two halves dispatch through two seams above (m3d4) — so it is no longer a deferred follow-up; the lone remaining residual is a first-class per-sub-agent `effort` param on the Agent tool, a harness ask outside fab's control.) This subsection documents *where the resolution call sits* and *how the profile is consumed* (a dispatch-seam concern, parallel to Standard Subagent Context above). The config schema (`agent.tiers`, the tiers, the fixed mapping) and the design rationale (no-validation, fixed-mapping-vs-budget) live in [configuration.md](configuration.md). @@ -214,6 +218,7 @@ The exception set is **declared by the skill files themselves** (the preamble ne | Change | Date | Summary | |--------|------|---------| +| 260613-m3d4-uniform-stage-model-tier | 2026-06-13 | **Per-Stage Model Resolution gains the two-seam dispatch contract + compliance visibility** (Change C — skills/docs prose only, no Go; gated on `260613-fgxx`). The resolved tier's **two halves now dispatch through two seams**: the **model** rides the Agent tool's `model` param (unchanged), the **effort** is injected as an imperative instruction in the subagent prompt (``Operate at `` reasoning effort for this task.``, omitted when empty) — closing Gap 2's effort half, which fgxx had left as a deferred follow-up (the Agent tool has no `effort` param). Added a **compliance-visibility** expectation (Gap 1b): every dispatch site surfaces the resolved `model=/effort=` (into the dispatch prompt and/or step output) so a skipped/mis-resolved `fab resolve-agent` call is visible rather than silent, plus a prose all-empty-resolution surface/assert note (no Go guard — dispatch is harness-internal). Harness-adapter boundary now records the **id→alias** detail (the `model` param takes `opus`/`sonnet`/`haiku`/`fable`, not the full `claude-*` id the resolver emits — the orchestrator maps at the seam) and names the prompt as the effort adapter. The lone residual is the **harness ask** — a first-class per-sub-agent `effort` param on the Agent tool, out of fab's control, not built. Mirrors `_preamble.md` § Per-Stage Model Resolution; the `_pipeline.md`/`fab-ff`/`fab-fff`/`fab-continue` dispatch sites and their SPEC mirrors carry the same two-seam phrasing (see [pipeline/execution-skills.md](../pipeline/execution-skills.md)); `docs/findings/per-stage-model-tier-application.md` and `docs/specs/stage-models.md` reconciled to this reality. | | 260613-fgxx-collapse-post-intake-dual-mode | 2026-06-13 | **Per-Stage Model Resolution reconciled to the single post-intake dispatch mode** (minimal reconcile — full uniform-tier closure deferred to a follow-up). The "Foreground is advisory-only" paragraph (and its `description:` frontmatter mention) is rewritten: since fgxx collapsed the post-intake dual execution mode, **every** post-intake stage dispatches a sub-agent — including plain `/fab-continue`, now a one-stage sequencer that resolves `fab resolve-agent ` and dispatches its stage's block — so per-stage selection applies uniformly across apply/review/hydrate regardless of caller, and the residual advisory case narrows to a stage skill genuinely run with no dispatch at all. Intake stays pre-boundary (not tiered). Does NOT write the full uniform-tiering / per-sub-agent-effort-knob story. Mirrors `_preamble.md` § Per-Stage Model Resolution (see [pipeline/execution-skills.md](../pipeline/execution-skills.md) for the fab-continue rewrite). | | 260613-l3ja-per-stage-model-tiers | 2026-06-13 | New **Per-Stage Model Resolution** requirements section documenting the sub-agent dispatch-seam wiring (`_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution): the orchestrators (`/fab-ff`, `/fab-fff`, `/fab-proceed`) and `/fab-continue`'s dispatch run `fab resolve-agent ` immediately before each stage's sub-agent and pass the resolved `model=`/`effort=` (empty model ⇒ omit/inherit; empty effort ⇒ omit). The resolution is **provider-neutral**; the Claude Code adapter (Agent tool `model` param) is named as harness-specific, not universal — and the coupling is not new (fab's dispatch is already Claude-Code-shaped). `review` resolves **once** for both reviewers + merge; foreground runs are advisory-only (MAY note mismatch, MUST NOT switch). The config schema + rationale live in [configuration.md](configuration.md). | | 260612-d9rs-docs-reality-sweep | 2026-06-12 | **Always-load exception set is now rule-derived** (skills-audit batch 5/5, root-cause guard): `_preamble.md` §1 no longer enumerates the exception skills — the skill file itself (its `## Context Loading` section or a header context note) is the authoritative override source; the preamble keeps examples only (fab-setup/docs-hydrate-memory skip, fab-operator 3-file). The zc9m-era four-skip list had drifted (missed `/fab-help`, `/fab-archive`, `/docs-hydrate-specs`, `/docs-reorg-*`, `/fab-proceed`); `/docs-hydrate-memory` gained the explicit Context Loading section the rule keys on. Exception Skills section rewritten to the skill-file-declared shipped set. **Orchestrator list completed**: `_preamble.md` § Subagent Dispatch now names `/fab-proceed` as the prefix-step orchestrator alongside `/fab-ff`/`/fab-fff`. **Scoring invokers completed**: § Confidence Scoring Invocation now lists `/fab-draft` alongside `/fab-new`, and scopes `/fab-clarify`'s re-persist to both modes (Suggest Step 7 + Auto Mode step 4). | diff --git a/docs/memory/_shared/index.md b/docs/memory/_shared/index.md index 4c87f9de..ab306cf2 100644 --- a/docs/memory/_shared/index.md +++ b/docs/memory/_shared/index.md @@ -8,4 +8,4 @@ description: "Cross-cutting concerns spanning all domains — config.yaml/consti | File | Description | Last Updated | |------|-------------|-------------| | [configuration](configuration.md) | `config.yaml` schema (incl. `fab_version`, `review_tools`, `true_impact_exclude`, `test_paths`, `stage_hooks`, `agent.tiers` per-stage-model override — sole surface, per-field merge over fab-kit defaults, NO validation/provider-neutral, fixed non-overridable stage→tier mapping, l3ja; `stage_directives` + `model_tiers` removed in c5tr via migration 2.1.6-to-2.2.0), single fab-module parser `internal/config` + nil-safe accessors incl. the coupled-Unmarshal fallback caveat (ye8r), companion files (`context.md`, `code-quality.md`, `code-review.md` incl. `## Parsimony Pass` toggle and the wired `## Rework Budget` Max-cycles knob), `constitution.md` governance, 5 Cs of Quality, lifecycle management | 2026-06-13 | -| [context-loading](context-loading.md) | Smart context loading convention — descriptive 7-file always-load layer (skill file wins; exception set rule-derived from skill files, never enumerated in the preamble — d9rs), opt-in skill helpers (6-value allowlist incl. `_srad`/`_pipeline`) + stage-conditional loading, standard subagent context (orchestrators incl. `/fab-proceed`), per-stage model resolution at the dispatch seam (`fab resolve-agent `, provider-neutral, Claude-Code adapter named, review-resolves-once — l3ja; applies on every post-intake stage since the single-dispatch collapse, advisory only for a genuinely no-dispatch run — fgxx), selective domain loading, SRAD protocol pointer, scoped Next Steps Convention, generic fab-command failure rule (unconditional non-zero exit → STOP; `fab log command` exits 0 by contract) | 2026-06-13 | +| [context-loading](context-loading.md) | Smart context loading convention — descriptive 7-file always-load layer (skill file wins; exception set rule-derived from skill files, never enumerated in the preamble — d9rs), opt-in skill helpers (6-value allowlist incl. `_srad`/`_pipeline`) + stage-conditional loading, standard subagent context (orchestrators incl. `/fab-proceed`), per-stage model resolution at the dispatch seam (`fab resolve-agent `, provider-neutral, Claude-Code adapter named, review-resolves-once — l3ja; applies on every post-intake stage since the single-dispatch collapse, advisory only for a genuinely no-dispatch run — fgxx; the two halves dispatch through two seams — model via the Agent `model` param (short alias opus/sonnet/haiku/fable, id→alias mapped at the seam), effort via an imperative subagent-prompt instruction (no Agent effort param; omit when empty), plus a compliance-visibility expectation that each site surface the resolved `model=/effort=` — m3d4; residual = a per-sub-agent effort param on the Agent tool, a harness ask), selective domain loading, SRAD protocol pointer, scoped Next Steps Convention, generic fab-command failure rule (unconditional non-zero exit → STOP; `fab log command` exits 0 by contract) | 2026-06-13 | diff --git a/docs/memory/pipeline/execution-skills.md b/docs/memory/pipeline/execution-skills.md index 5f558cf0..7cc868fc 100644 --- a/docs/memory/pipeline/execution-skills.md +++ b/docs/memory/pipeline/execution-skills.md @@ -17,7 +17,7 @@ Execution behavior (apply, review, hydrate) is accessed via `/fab-continue`, whi **Status-transition ownership (orchestrator/sequencer owns — single post-intake dispatch mode, fgxx)**: Intake is the **sole context boundary** — it is the only stage `/fab-continue` runs in the main session. **Every post-intake stage (apply / review / hydrate) is ALWAYS dispatched as a sub-agent**, the SAME dispatch the orchestrators (`_pipeline.md`) perform; there is no foreground execution path for those stages. The dual execution mode was collapsed in fgxx: the three former caller-aware "When invoked as a subagent: do NOT run `fab status`" conditionals (one each in Apply/Review/Hydrate Behavior) are removed, because with one execution mode the conditional has no second branch to express. The dispatch is identical whether the caller is **manual `/fab-continue`** (itself a **one-stage sequencer**) or an **orchestrator** (`/fab-ff`/`/fab-fff`). -The owning sequencer is a **pure sequencer**: *dispatch block → read returned status/findings → decide proceed / loop / stop*. It runs `fab resolve-agent ` immediately before each post-intake dispatch and owns the `fab status` transitions (`finish`/`fail`/`reset`) itself; it never reaches into block internals. Every `/fab-continue`-behavior subagent prompt includes "do NOT run `fab status` commands; return results only" — this is now the **universal block contract** carried in the dispatch prompt (stated once in `_pipeline.md`'s Behavior note), not a per-caller skip rule baked into the block. The block returns results/findings only; it never branches on caller and takes no §Verdict-style decision itself. (For hydrate the orchestrator runs the finish, e.g. `fab status finish hydrate fab-ff` for the auto path or `... fab-continue` for the manual sequencer.) The **failure-policy fork is invocation-level**, in the orchestrator, not the block: Path A (manual `/fab-continue`) runs the interactive § Verdict rework menu on the dispatched review block's returned findings; Paths B/C/D (`/fab-ff`/`/fab-fff`) run `_pipeline.md`'s autonomous Auto-Rework Loop. The block always returns findings; **who** acts on a fail verdict is the orchestrator's concern — no "skip §Verdict when subagent" flag re-encodes the removed conditional. +The owning sequencer is a **pure sequencer**: *dispatch block → read returned status/findings → decide proceed / loop / stop*. It runs `fab resolve-agent ` immediately before each post-intake dispatch, **surfaces the resolved `model=/effort=`** (carried into the dispatch prompt and/or echoed in step output — Gap 1b compliance visibility, so a skipped/mis-resolved tier is visible rather than silent), and applies the tier through **two seams** (m3d4): the **model** half via the Agent tool's `model` param (empty ⇒ omit/inherit; the param takes a short alias `opus`/`sonnet`/`haiku`/`fable`, not the full `claude-*` id the resolver emits — the orchestrator maps id→alias at the seam) and the **effort** half via an imperative instruction in the subagent prompt (``Operate at `` reasoning effort for this task.``; omitted when empty — the Agent tool has no `effort` param). This same two-seam + visibility contract governs the whole `fab-ff`/`fab-fff` orchestrator family — the shared `_pipeline.md` bracket's per-stage dispatch note + Steps 1–3 + Auto-Rework re-apply/re-review, and `fab-fff.md` Steps 4–5 (ship/review-pr) — with `review` resolving **once** for both reviewers + merge (same model + same effort instruction). The canonical contract is `_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution (mirrored in [_shared/context-loading.md](../_shared/context-loading.md) § Per-Stage Model Resolution); the lone residual is a first-class per-sub-agent `effort` param on the Agent tool, a harness ask outside fab's control. It owns the `fab status` transitions (`finish`/`fail`/`reset`) itself; it never reaches into block internals. Every `/fab-continue`-behavior subagent prompt includes "do NOT run `fab status` commands; return results only" — this is now the **universal block contract** carried in the dispatch prompt (stated once in `_pipeline.md`'s Behavior note), not a per-caller skip rule baked into the block. The block returns results/findings only; it never branches on caller and takes no §Verdict-style decision itself. (For hydrate the orchestrator runs the finish, e.g. `fab status finish hydrate fab-ff` for the auto path or `... fab-continue` for the manual sequencer.) The **failure-policy fork is invocation-level**, in the orchestrator, not the block: Path A (manual `/fab-continue`) runs the interactive § Verdict rework menu on the dispatched review block's returned findings; Paths B/C/D (`/fab-ff`/`/fab-fff`) run `_pipeline.md`'s autonomous Auto-Rework Loop. The block always returns findings; **who** acts on a fail verdict is the orchestrator's concern — no "skip §Verdict when subagent" flag re-encodes the removed conditional. Ship and review-pr are the exception: `/git-pr` and `/git-pr-review` manage their own stage transitions internally, so `/fab-continue`'s ship and review-pr dispatch rows — which since w7dp pass the resolved change explicitly (`/git-pr {name}` / `/git-pr-review {name}`, the transient-override + branch-guard contract) and cite `active` only (`ready` is unreachable: ship's AllowedStates are `{pending, active, done, skipped}`, review-pr's add `failed`) — run `finish` only if the stage is still `active` after the behavior returns. The review-pr **Fail** branch carries the same only-if-still-active guard (uliv) — git-pr-review's Step 6 normally runs `fail` itself, so fab-continue runs `fail review-pr` only when the stage is still `active` after the behavior returns, never as a second (CLI-rejected) `fail`. And the review-pr **timeout** outcome deliberately leaves the stage `active` — report and stop, no re-finish. This matches `_review.md`'s "verdict transitions remain in each orchestrator's own file" and eliminates textually-mandated double-finish CLI errors. @@ -439,6 +439,7 @@ Steps execute 1→3 for safety. If interrupted, re-run detects folder already in | Change | Date | Summary | |--------|------|---------| +| 260613-m3d4-uniform-stage-model-tier | 2026-06-13 | **Per-stage dispatch sites surface the resolved tier + inject effort via the subagent prompt (two-seam dispatch contract)** (Change C — skills/docs prose only, no Go; gated on `260613-fgxx`). Every per-stage dispatch site in the `fab-ff`/`fab-fff` orchestrator family — the shared `_pipeline.md` bracket (Behavior "Per-stage model resolution" note + Steps 1 apply / 2 review / 3 hydrate + Auto-Rework re-apply/re-review items 3/4) and `fab-fff.md` Steps 4–5 (ship/review-pr) — now (1) **surfaces** the resolved `model=/effort=` after `fab resolve-agent ` (carried into the dispatch prompt and/or echoed in step output, so a skipped/mis-resolved tier is **visible rather than silent** — Gap 1b compliance visibility), and (2) applies the tier through **two seams**: the **model** half via the Agent tool's `model` param (unchanged) and the **effort** half via an imperative instruction in the subagent prompt (``Operate at `` reasoning effort for this task.``, omitted when empty — Gap 2's effort half, previously left as fgxx's deferred follow-up; the Agent tool has no `effort` param). `review` resolves **once** for both reviewers + merge (same model + same effort instruction). `fab-continue.md`'s one-stage-sequencer header note + post-intake dispatch contract + nested-reviewers note carry the same two-seam phrasing; `fab-ff.md`'s sibling note (structural twin of fab-fff's) reconciled to match. The harness-adapter boundary records the **id→alias** detail (the `model` param takes `opus`/`sonnet`/`haiku`/`fable`, not the full `claude-*` id the resolver emits). The lone residual is the **harness ask** — a first-class per-sub-agent `effort` param on the Agent tool, out of fab's control, not built. SPEC mirrors updated same-PR: `SPEC-_pipeline.md`, `SPEC-_preamble.md`, `SPEC-fab-fff.md`, `SPEC-fab-ff.md`, `SPEC-fab-continue.md`. `docs/findings/per-stage-model-tier-application.md` (Gap 1a closed by fgxx, Gap 1b + Gap 2-effort addressed here, item 4 = residual) and `docs/specs/stage-models.md` reconciled. Canonical contract: [_shared/context-loading.md](../_shared/context-loading.md) § Per-Stage Model Resolution. | | 260613-fgxx-collapse-post-intake-dual-mode | 2026-06-13 | **Collapsed the post-intake dual execution mode — one execution mode, orchestrators as pure sequencers** (skills+specs prose refactor + one Go test; no production Go change, no new CLI command). Guiding principle (newly named): **intake is the sole context boundary** — the main session runs up to and including intake; after intake the intake artifact IS the context and every post-intake stage (apply/review/hydrate) runs as a dispatched sub-agent fed by `intake.md` + `plan.md` + `docs/memory/`. `fab-continue.md`: the three caller-aware "When invoked as a subagent: do NOT run `fab status`" blockquote conditionals (Apply/Review/Hydrate Behavior) are **removed** — with one execution mode there is no second branch; plain `/fab-continue` is now a **one-stage sequencer** that runs `fab resolve-agent ` then dispatches the stage's block exactly as the orchestrators do, and owns the `finish`/`fail`/`reset` transition after it returns. The "do NOT run `fab status`; return results only" prompt instruction is now the **universal block contract** (carried in the dispatch prompt), not a per-caller skip rule in the block; it is NOT re-encoded as any "skip §Verdict when subagent" flag. The interactive § Verdict rework menu (Path A) and `_pipeline.md`'s autonomous Auto-Rework Loop (Paths B/C/D) **remain** as the invocation-level failure-policy fork. Header note reconciled from "foreground = advisory-only" to the one-stage-sequencer framing. `_pipeline.md` Behavior note affirmed as **pure sequencer** (dispatch → read → proceed/loop/stop) with the universal-contract framing; per-cycle rework choreography (fail+reset pair) and the history-shape invariant **unchanged in semantics**. `_preamble.md` § Per-Stage Model Resolution "Foreground is advisory-only" minimally reconciled (residual advisory only for a genuinely no-dispatch run; closes Gap 1a — uniform tiering across apply/review/hydrate; full uniform-tier closure deferred to a follow-up). New Go test `TestHistoryShape_IdenticalRegardlessOfDriver` in `internal/status/mutators_test.go` pins that the manual (`driver=""`) and dispatched (`driver="fab-fff"`) paths leave `.history.jsonl` entries of identical shape — agreeing on every caller-blind field (stage/action/from/reason), equal modulo the per-run `ts` timestamp and the optional `driver` field (see [schemas.md](schemas.md)). SPEC mirrors updated same-PR: `SPEC-fab-continue.md` (f006 → universal block contract), `SPEC-_pipeline.md`, `SPEC-_preamble.md`, `docs/specs/stage-models.md` § Foreground limitation. `runtime/operator.md` verified — no foreground-post-intake assumption (no-op). | | 260612-w7dp-orchestrator-dispatch-review-pr-recovery | 2026-06-12 | **Orchestrator dispatch targets + review-pr failure recovery** (skills-audit batch 2/5, Themes 2+4 — markdown only, no Go changes). `/git-pr` + `/git-pr-review`: optional explicit `` argument (Step 0 transient resolution; **hard STOP** on explicit-arg resolution failure; value-based classification vs git-pr's 7 type tokens) and a **branch-matches-change guard** before any status mutation/commit/push (two-form match — exact folder equality or folder-as-substring; STOP with `/git-branch`//`/fab-switch`/pass-the-change guidance, no autonomous checkout); git-pr's detached-HEAD STOP moved before Step 0a's start (verify-before-mutate parity); the Step 1b mismatch nudge **removed** as superseded. `fab-fff.md` Steps 4–5 dispatch `/git-pr {name}` / `/git-pr-review {name}` (folder name — collision-proof vs type tokens). git-pr-review's Step 6/6.5/phase gates key on "a change resolved in Step 0 (active or explicit)"; push-failure + Phase-2 timeout re-run guidance conditionally name the explicit change. `fab-continue.md`: new **`review-pr`/`failed` dispatch row** (re-execute `/git-pr-review {name}` — `start` owns failed→active; never `reset`, whose From-set excludes `failed`; a failed PR review no longer reads "complete"); ship/review-pr rows pass `{name}` explicitly and cite `active` only (`ready` unreachable per AllowedStates); **idempotent Reset Flow** (target already `active` → skip the CLI call; `failed` → the failed dispatch rows; `pending` → error with guidance); reset event line `done/ready/skipped → active`; `intake.md`-missing error points at `/fab-continue intake`. `_pipeline.md`: the five unexecutable `/fab-clarify intake` recovery pointers replaced by the executable, **override-aware** `/fab-continue intake` → `/fab-clarify ` route (+ delete-`plan.md` note); stop template, gate-fail, and task-fail guidance name the change; `{driver}` claim scoped (fail/recovery commands deliberately driver-less); decision heuristics made **disjoint** (code-fails-correct-requirement → Fix code; requirement-itself-wrong → Revise requirements). fab-proceed: `/git-branch` dropped from the `/fab-new`-prefixed dispatch rows (Step 11 creates the branch inline) and the promptless fab-new dispatch defined as defer-and-surface (see [planning-skills.md](planning-skills.md)). New design decisions "Ship-Time Skills Take an Explicit Change…" and "Recovery Guidance Is Override-Aware". 13 SPEC mirrors + `docs/specs/skills.md`/`glossary.md`/`architecture.md` § Git Integration updated same-PR. | | 260612-g8st-git-state-hardening | 2026-06-12 | **Git-state hardening for the autonomous skills** (skills-audit batch 3/5, Theme 3 — every item a flagged behavior change). `/git-pr`: detached-HEAD STOP ahead of the Step 2 guard (empty `git branch --show-current`; previously committed then emitted a refspec-less `git push -u origin`), Step 1b nudge skipped on empty branch; Step 1 resolves `{default_branch}` (symbolic-ref → `gh repo view` → probed literal `main`/`master`, always non-empty) for the Step 2 guard + nudge; Step 3a replaces `git add -A` with the **expected-area guard** — evaluated before anything is staged (STOP listing untracked files outside `source_paths`/`docs/`/`fab/`; absent config → `docs/`+`fab/`), then `git add -u` + in-area untracked; Step 3 branches on the previously-unread PR `state` — OPEN short-circuit / CLOSED fresh PR / MERGED STOP at Step 3 entry (Key Properties idempotency row now conditions "already shipped" on an OPEN PR). `/git-pr-review`: Step 5 failure handling **split** — commit-fail = reset+STOP (no partial state, true there); push-fail = **keep the commit** + recovery guidance + STOP without replies (the false unsplit "(no partial state)" claim stranded fixes permanently); new **unpushed-commit re-run gate** before "No changes needed" (`@{u}..HEAD`, no-upstream = unpushed → push, then replies cite the now-pushed SHA); Step 6 exception text matches the split. `/fab-archive`: "safe to re-run" qualified with the **dirty-tree disclosure** (uncommitted moves + backlog/index edits for the caller to commit; no autonomous commit — `/git-pr` owns commits); Go `ArchiveWithBacklog` still attempts `backlog.MarkDone` on `ErrAlreadyArchived` (best-effort, error propagation unchanged — exit-code semantics stay k4ge's seam) + unit test + `_cli-fab.md` row. New design decision "Autonomous Git Paths Verify State Before Mutating". Twin-table/branch edits documented in [planning-skills.md](planning-skills.md) + [change-lifecycle.md](change-lifecycle.md); operator default-branch resolution in [runtime/operator.md](../runtime/operator.md). Six SPEC mirrors updated same-PR. | diff --git a/docs/memory/pipeline/index.md b/docs/memory/pipeline/index.md index 4eb5d0cf..e59ef295 100644 --- a/docs/memory/pipeline/index.md +++ b/docs/memory/pipeline/index.md @@ -9,7 +9,7 @@ description: "The change pipeline — stage lifecycle & state machine, planning/ |------|-------------|-------------| | [change-lifecycle](change-lifecycle.md) | Change naming, folder structure, `.status.yaml` (6-stage pipeline + `plan:` block, flock-serialized writes + fsync durability + sparse-key insertion), `.fab-status.yaml` symlink (atomic temp+rename swap, dangling-target validation), `stage_hooks` pre/post commands around start/finish (documented in c5tr), git integration (bookkeeping decoupled; ship-time branch↔change hard guard since w7dp), `/fab-status` (six-glyph progress legend incl. `⏭` skipped), `/fab-switch` (six-state `{state}` qualifier), backlog scanning | 2026-06-12 | | [clarify](clarify.md) | `/fab-clarify` skill — intake-only (j6cs), dual modes (suggest/auto), intake taxonomy scan, structured questions, coverage reports, audit trail, grade reclassification, always-recompute intake score; hosts the [AUTO-MODE] Skill Invocation Protocol and is sole bulk-confirm authority (zc9m); bulk confirm evaluated before the zero-gaps exit, confirmations graded by recomputed composite (no fiat Certain), unified audit-trail placement rules (c5tr) | 2026-06-12 | -| [execution-skills](execution-skills.md) | Apply (unified `plan.md` `## Requirements`+`## Tasks`+`## Acceptance` generated at entry), review, hydrate, archive, and orchestrator behavior — `/fab-continue` for pipeline stages, `/fab-archive` for housekeeping, `/fab-proceed` for context-aware pipeline orchestration; `/git-pr` ship behavior with the mechanically-rendered `## Meta` block via `fab pr-meta`, a unified Step 0 resolution (szxd), and git-state hardening — detached-HEAD STOP, expected-area staging guard, PR-state branching, resolved default-branch guard (g8st); `/git-pr-review` split commit/push failure semantics + unpushed-commit re-run gate (g8st); fab-continue loads `_generation`/`_review` stage-conditionally at point of use (zc9m); the shared ff/fff bracket lives in `_pipeline.md` with once-stated rework choreography (cycle cap = the `{max_cycles}` knob from code-review.md § Rework Budget, default 3 — c5tr), exhaustion terminal state review:failed, and fab-continue's review-failed rework-menu dispatch row (szxd); re-archiving a genuinely archived change soft-skips (exit 0) and restore `--switch` surfaces `pointer: failed` (k4ge); archive/restore index writes are atomic with honest `index: failed` reporting on both paths (hv7t); `/git-pr`/`/git-pr-review` accept an optional explicit `` argument and guard branch↔change correspondence before any mutation, fab-continue gained the review-pr/failed dispatch row + idempotent reset, and recovery guidance is override-aware (w7dp). Operator coordination moved to runtime/operator.md. | 2026-06-12 | +| [execution-skills](execution-skills.md) | Apply (unified `plan.md` `## Requirements`+`## Tasks`+`## Acceptance` generated at entry), review, hydrate, archive, and orchestrator behavior — `/fab-continue` for pipeline stages, `/fab-archive` for housekeeping, `/fab-proceed` for context-aware pipeline orchestration; `/git-pr` ship behavior with the mechanically-rendered `## Meta` block via `fab pr-meta`, a unified Step 0 resolution (szxd), and git-state hardening — detached-HEAD STOP, expected-area staging guard, PR-state branching, resolved default-branch guard (g8st); `/git-pr-review` split commit/push failure semantics + unpushed-commit re-run gate (g8st); fab-continue loads `_generation`/`_review` stage-conditionally at point of use (zc9m); the shared ff/fff bracket lives in `_pipeline.md` with once-stated rework choreography (cycle cap = the `{max_cycles}` knob from code-review.md § Rework Budget, default 3 — c5tr), exhaustion terminal state review:failed, and fab-continue's review-failed rework-menu dispatch row (szxd); re-archiving a genuinely archived change soft-skips (exit 0) and restore `--switch` surfaces `pointer: failed` (k4ge); archive/restore index writes are atomic with honest `index: failed` reporting on both paths (hv7t); `/git-pr`/`/git-pr-review` accept an optional explicit `` argument and guard branch↔change correspondence before any mutation, fab-continue gained the review-pr/failed dispatch row + idempotent reset, and recovery guidance is override-aware (w7dp). Operator coordination moved to runtime/operator.md. | 2026-06-13 | | [planning-skills](planning-skills.md) | `/fab-new`, `/fab-continue`, `/fab-ff`, `/fab-clarify` — the planning stage (intake only) and the shared `_generation.md` partial (Intake + unified Plan procedures); requirement capture + plan generation live at apply entry (spec stage removed in j6cs); fab-new/fab-draft re-run/resume semantics (ID collisions route to resume, 9u91); change_type is hook-owned — skills verify via .status.yaml and override only if wrong (uliv); SRAD framework lives in the `_srad` helper, declared by the 6 planning skills (zc9m); fab-draft is a thin delta over fab-new Steps 0–9, fab-ff/fab-fff are thin wrappers over the shared `_pipeline` bracket, fab-new Step 11 is a single first-match-wins branch table (szxd; 6 rows since g8st — remote-only `--track` checkout, dirty-tree carried-over note excluding the change's own artifacts, same-change rename); SRAD grades by half-open thresholds with one `R<25 AND A<25` Critical-Rule definition, the plan walk emits `## Assumptions` explicitly, artifacts always carry the section (omit-when-zero is display-only), fab-new output puts the Assumptions block last (c5tr); fab-new's backlog-ID collision pre-check is exact-ID anchored (`fab resolve --id` equality) and fab-proceed's promptless fab-new dispatch carries the defer-and-surface contract with the matching `_srad` Critical-Rule carve-out (w7dp); fab-new/fab-draft `Next:` lines derive at runtime per the _preamble Lookup Procedure and `_generation` names fab-continue in both consumer groups (d9rs) | 2026-06-12 | | [preflight](preflight.md) | `lib/preflight.sh` script — validation, accessor-based architecture, structured YAML output, skill integration | 2026-06-12 | -| [schemas](schemas.md) | Workflow schema authority — the Go state machine (`internal/status` transitions + `internal/statusfile` stage order/progress; declarative `workflow.yaml` retired in c5tr): 6-stage pipeline, states, transitions, validation rules; `.status.yaml` `plan:` (`## Requirements`-aware), `confidence:` (indicative retired), and lazy `true_impact:` block schemas (incl. the `tests` sub-block + render-time `impl` residual, 7t5a); `fab impact` and `fab pr-meta` helper subcommands (rj31); allowed-states-enforced transition targets, `fab score --check-gate` non-zero gate-fail exit, iterations-preserving reset cascade (k4ge); `fab score` normal-mode hard-fail on load/persist/read errors (hv7t); per-stage allowed-states + transition-event tables enumerated, pinned by the exhaustive 216-cell matrix test (tb6f) | 2026-06-12 | +| [schemas](schemas.md) | Workflow schema authority — the Go state machine (`internal/status` transitions + `internal/statusfile` stage order/progress; declarative `workflow.yaml` retired in c5tr): 6-stage pipeline, states, transitions, validation rules; `.status.yaml` `plan:` (`## Requirements`-aware), `confidence:` (indicative retired), and lazy `true_impact:` block schemas (incl. the `tests` sub-block + render-time `impl` residual, 7t5a); `fab impact` and `fab pr-meta` helper subcommands (rj31); allowed-states-enforced transition targets, `fab score --check-gate` non-zero gate-fail exit, iterations-preserving reset cascade (k4ge); `fab score` normal-mode hard-fail on load/persist/read errors (hv7t); per-stage allowed-states + transition-event tables enumerated, pinned by the exhaustive 216-cell matrix test (tb6f) | 2026-06-13 | diff --git a/docs/specs/skills/SPEC-_pipeline.md b/docs/specs/skills/SPEC-_pipeline.md index 860959bc..cb6c0ea1 100644 --- a/docs/specs/skills/SPEC-_pipeline.md +++ b/docs/specs/skills/SPEC-_pipeline.md @@ -2,7 +2,7 @@ ## Summary -Shared pipeline bracket executed by `/fab-ff` and `/fab-fff` (added in 260611-szxd — f007/f071). The bracket is a **pure sequencer** (affirmed 260613-fgxx): dispatch block → read returned status/findings → decide proceed / loop / stop; it owns the `fab status` transitions and `fab resolve-agent ` resolution and never reaches into block internals. The subagent dispatch note's "do NOT run `fab status`; return results only" instruction is the **universal block contract** for post-intake `/fab-continue` blocks (Apply/Review/Hydrate) — not an override this orchestrator imposes — because plain `/fab-continue` is itself a one-stage sequencer that dispatches those blocks identically and runs their transitions after they return (see `SPEC-fab-continue.md`). Single authoritative source for everything the two drivers share: pre-flight (intake prerequisite + the single intake confidence gate, flat 3.0), context loading, the subagent dispatch note ("do NOT run `fab status`; return results only"), resumability (skip `done` stages; review-`failed` recovery via `fab status start review`), Steps 1–3 (apply with internal plan generation → review → hydrate — each preceded by `fab resolve-agent ` per-stage model resolution since 260613-l3ja; review resolves once for both reviewers + merge; rework items 3/4 re-resolve before re-dispatch), the auto-rework loop with its **explicit per-cycle choreography**, the exhaustion stop, and the shared error rows. All user-facing stop/re-run guidance in the bracket (gate-fail, task-fail, exhaustion stop) names the change in the suggested commands — the run may be driving a non-active override, and argless re-runs would resolve the active change (260612-w7dp). +Shared pipeline bracket executed by `/fab-ff` and `/fab-fff` (added in 260611-szxd — f007/f071). The bracket is a **pure sequencer** (affirmed 260613-fgxx): dispatch block → read returned status/findings → decide proceed / loop / stop; it owns the `fab status` transitions and `fab resolve-agent ` resolution and never reaches into block internals. The subagent dispatch note's "do NOT run `fab status`; return results only" instruction is the **universal block contract** for post-intake `/fab-continue` blocks (Apply/Review/Hydrate) — not an override this orchestrator imposes — because plain `/fab-continue` is itself a one-stage sequencer that dispatches those blocks identically and runs their transitions after they return (see `SPEC-fab-continue.md`). Single authoritative source for everything the two drivers share: pre-flight (intake prerequisite + the single intake confidence gate, flat 3.0), context loading, the subagent dispatch note ("do NOT run `fab status`; return results only"), resumability (skip `done` stages; review-`failed` recovery via `fab status start review`), Steps 1–3 (apply with internal plan generation → review → hydrate — each preceded by `fab resolve-agent ` per-stage model resolution since 260613-l3ja, and since 260613-m3d4 each site **surfaces** the resolved `model=/effort=` (visibility — a skipped resolution is then detectable, not silent) and dispatches via two seams: model on the Agent `model` param, effort as an imperative instruction in the dispatch prompt (no Agent effort param; omitted when empty); review resolves once and applies the same model + same effort-prompt instruction to both reviewers + merge; rework items 3/4 re-resolve, re-surface, and re-inject before re-dispatch), the auto-rework loop with its **explicit per-cycle choreography**, the exhaustion stop, and the shared error rows. All user-facing stop/re-run guidance in the bracket (gate-fail, task-fail, exhaustion stop) names the change in the suggested commands — the run may be driving a non-active override, and argless re-runs would resolve the active change (260612-w7dp). **Parameters** (declared by each driver's own file): @@ -43,9 +43,12 @@ Driver (fab-ff / fab-fff) reads _pipeline.md with {driver}/{terminal} bound ├─ Resumability: skip done stages; {terminal}: done → already complete; │ review failed → fab status start review, resume from Step 2 │ -│ (each stage dispatch first runs `fab resolve-agent ` and passes the -│ resolved model+effort to the Agent dispatch — 260613-l3ja; empty ⇒ -│ omit/inherit; see _preamble.md § Subagent Dispatch → Per-Stage Model Resolution) +│ (each stage dispatch first runs `fab resolve-agent `, surfaces the +│ resolved model=/effort= (visibility — 260613-m3d4), then dispatches via two +│ seams: model → Agent `model` param (empty ⇒ omit/inherit), effort → imperative +│ instruction in the dispatch prompt (no Agent effort param; empty ⇒ omit; +│ 260613-m3d4) — 260613-l3ja established the resolve call; see _preamble.md +│ § Subagent Dispatch → Per-Stage Model Resolution) │ ├─ Step 1: Apply (fab resolve-agent apply → subagent: /fab-continue Apply Behavior — plan co-gen + tasks) │ ├─ fab status finish intake {driver} (if intake not done) diff --git a/docs/specs/skills/SPEC-_preamble.md b/docs/specs/skills/SPEC-_preamble.md index 4196c1bb..d1cf551e 100644 --- a/docs/specs/skills/SPEC-_preamble.md +++ b/docs/specs/skills/SPEC-_preamble.md @@ -2,7 +2,7 @@ ## Summary -Shared context preamble loaded by every Fab skill. Defines path conventions, context loading layers (always-load — descriptive, with a skill-file-wins override and a derived, never-enumerated exception set; change context; memory lookup; source code), the **Skill Helper Declaration** frontmatter convention (including stage-conditional in-body loading), inlined **Naming Conventions**, inlined **Run-Kit (rk) Reference**, the **Common fab Commands** headline table, the next-steps convention (with a skill-file-declared ending opt-out) with state table, a pointer to the skill invocation protocol (defined in `fab-clarify.md` since 260611-zc9m), subagent dispatch pattern with standard subagent context and **Per-Stage Model Resolution** (260613-l3ja — `fab resolve-agent ` before each pipeline-stage dispatch; resolved model+effort passed to the Agent dispatch with empty ⇒ omit/inherit; Claude Code adapter = Agent tool `model` param while resolution stays provider-neutral; review resolves once for both reviewers + merge; per-stage selection applies on every post-intake stage — every post-intake stage now dispatches a sub-agent (including plain `/fab-continue` as a one-stage sequencer), so `fab resolve-agent` applies uniformly across apply/review/hydrate, with the residual advisory case narrowed to a stage skill genuinely run with no dispatch at all (260613-fgxx)), a pointer to the SRAD autonomy framework (extracted to `_srad.md` in 260611-zc9m), and slimmed confidence scoring (gate threshold + invocation; schema/formula/template moved to `_cli-fab.md` § fab score). +Shared context preamble loaded by every Fab skill. Defines path conventions, context loading layers (always-load — descriptive, with a skill-file-wins override and a derived, never-enumerated exception set; change context; memory lookup; source code), the **Skill Helper Declaration** frontmatter convention (including stage-conditional in-body loading), inlined **Naming Conventions**, inlined **Run-Kit (rk) Reference**, the **Common fab Commands** headline table, the next-steps convention (with a skill-file-declared ending opt-out) with state table, a pointer to the skill invocation protocol (defined in `fab-clarify.md` since 260611-zc9m), subagent dispatch pattern with standard subagent context and **Per-Stage Model Resolution** (260613-l3ja — `fab resolve-agent ` before each pipeline-stage dispatch; resolved model+effort passed to the Agent dispatch with empty ⇒ omit/inherit; review resolves once for both reviewers + merge; per-stage selection applies on every post-intake stage — every post-intake stage now dispatches a sub-agent (including plain `/fab-continue` as a one-stage sequencer), so `fab resolve-agent` applies uniformly across apply/review/hydrate, with the residual advisory case narrowed to a stage skill genuinely run with no dispatch at all (260613-fgxx); **the two halves dispatch through two seams (260613-m3d4)** — model via the Agent tool `model` param (which takes a short alias `opus`/`sonnet`/`haiku`/`fable`, so the orchestrator maps the resolved full id → alias at the seam) and effort via an explicit imperative instruction in the subagent prompt (the Agent tool has no effort param; omitted when empty), plus a **compliance-visibility** expectation that each site surface the resolved `model=/effort=` so a skipped/mis-resolved tier is visible rather than silent; resolution itself stays provider-neutral; the lone residual is a first-class per-sub-agent effort param on the Agent tool — a harness ask outside fab's control), a pointer to the SRAD autonomy framework (extracted to `_srad.md` in 260611-zc9m), and slimmed confidence scoring (gate threshold + invocation; schema/formula/template moved to `_cli-fab.md` § fab score). This is an internal partial (`user-invocable: false`) — it is never invoked directly. Skills reference it via the opening instruction: "Read the `_preamble` skill first (deployed to `.claude/skills/` via `fab sync`). Then follow its instructions before proceeding." @@ -87,18 +87,24 @@ Skill reads _preamble.md │ │ context.md*, code-quality.md*, │ │ code-review.md* │ │ (applied at every nesting level) -│ └─ Per-Stage Model Resolution (260613-l3ja) +│ └─ Per-Stage Model Resolution (260613-l3ja, m3d4) │ Bash: fab resolve-agent before each -│ pipeline-stage sub-agent dispatch; pass -│ model+effort to the Agent dispatch -│ (empty model ⇒ omit/inherit; empty effort ⇒ omit). -│ Claude Code adapter = Agent tool `model` param -│ (resolution itself is provider-neutral); +│ pipeline-stage sub-agent dispatch; SURFACE the +│ resolved model=/effort= (visibility — a skip is +│ then detectable; 260613-m3d4), then dispatch via +│ TWO SEAMS: model → Agent tool `model` param +│ (empty ⇒ omit/inherit; param takes a short alias +│ opus/sonnet/haiku/fable, orchestrator maps id→alias) +│ and effort → imperative instruction in the subagent +│ prompt (no Agent effort param; empty ⇒ omit; m3d4). +│ Resolution itself is provider-neutral; │ review resolves once for both reviewers + merge; │ per-stage selection applies on every post-intake │ stage (each now dispatches a sub-agent, incl. plain │ /fab-continue as a one-stage sequencer) — advisory │ only for a genuinely no-dispatch run (260613-fgxx). +│ Residual: a per-sub-agent effort param on the Agent +│ tool (harness ask, not built). │ ├─ SRAD Autonomy Framework (pointer) │ (framework extracted to _srad.md — loaded via diff --git a/docs/specs/skills/SPEC-fab-continue.md b/docs/specs/skills/SPEC-fab-continue.md index f8b7d6a8..adeb8798 100644 --- a/docs/specs/skills/SPEC-fab-continue.md +++ b/docs/specs/skills/SPEC-fab-continue.md @@ -4,11 +4,11 @@ Advances through the 6-stage pipeline one step at a time. Each invocation handles the current stage's work and transitions to the next. Supports reset to a given stage (legacy `tasks`/`spec` targets error with a pointer to the `apply` and `intake` reset routes). Handles all six stages: intake (the only planning stage), apply (co-generates `plan.md` `## Requirements` + `## Tasks` + `## Acceptance` at entry then runs tasks), review (sub-agent), hydrate, ship (delegates to `/git-pr` behavior), and review-pr (delegates to `/git-pr-review` behavior). -**Single post-intake execution mode — one-stage sequencer** (260613-fgxx): intake is the sole context boundary. Intake is the only stage `/fab-continue` runs in the main session; **every post-intake stage (apply / review / hydrate) is always dispatched as a sub-agent**, the same dispatch the orchestrators (`_pipeline.md`) perform. There is no foreground execution path for apply/review/hydrate. In this mode `/fab-continue` is a **one-stage sequencer**: it runs `fab resolve-agent ` immediately before the dispatch, passes the resolved model/effort to the Agent dispatch, includes the universal **"do NOT run `fab status`; return results only"** prompt contract, reads the returned status/findings, and owns the `finish`/`fail`/`reset` transition itself — identical sequencer/block split whether the caller is manual `/fab-continue` or an orchestrator. The three former dual-mode "When invoked as a subagent: do NOT run `fab status`" conditionals (Apply/Review/Hydrate Behavior) are removed; the instruction is now the universal block contract, not a per-caller override, and is NOT re-encoded as any "skip §Verdict when subagent" flag. Ship and review-pr are excluded from this rewrite — they self-manage their transitions (the dispatch exception). +**Single post-intake execution mode — one-stage sequencer** (260613-fgxx): intake is the sole context boundary. Intake is the only stage `/fab-continue` runs in the main session; **every post-intake stage (apply / review / hydrate) is always dispatched as a sub-agent**, the same dispatch the orchestrators (`_pipeline.md`) perform. There is no foreground execution path for apply/review/hydrate. In this mode `/fab-continue` is a **one-stage sequencer**: it runs `fab resolve-agent ` immediately before the dispatch and applies the resolved tier (the two-seam model-param + effort-via-prompt mechanics are described in the **Per-stage model** paragraph below), includes the universal **"do NOT run `fab status`; return results only"** prompt contract, reads the returned status/findings, and owns the `finish`/`fail`/`reset` transition itself — identical sequencer/block split whether the caller is manual `/fab-continue` or an orchestrator. The three former dual-mode "When invoked as a subagent: do NOT run `fab status`" conditionals (Apply/Review/Hydrate Behavior) are removed; the instruction is now the universal block contract, not a per-caller override, and is NOT re-encoded as any "skip §Verdict when subagent" flag. Ship and review-pr are excluded from this rewrite — they self-manage their transitions (the dispatch exception). **Failure recovery + idempotent reset** (260612-w7dp): a `review-pr`/`failed` dispatch row — keyed off `progress.review-pr == failed`, the same progress-map guard mechanism as the review row — re-executes `/git-pr-review` behavior (its Step 0 `start` accepts `failed → active` for review-pr; never `reset`, whose From-set `{done, ready, skipped}` excludes `failed`), so a failed PR review no longer falls through to "Change is complete." The ship and review-pr rows (incl. the failed row) pass the resolved change **explicitly** to `/git-pr`/`/git-pr-review` (`{name}` as the `` argument — the explicit-arg contract); the ship and review-pr **`active`** rows key on `active` only — `ready` is not in either stage's AllowedStates — while the review-pr failed row keys on the progress map's `failed`. The Reset Flow handles all non-resettable target states (reset From-set `{done, ready, skipped}`): already-`active` → skip the call and proceed (re-running a reset is a state-wise no-op — Constitution III); `failed` → route via the matching failed dispatch row (`start` owns failed→active, review/review-pr only); `pending` → error with advance guidance. All recovery pointers are executable: the unexecutable `/fab-clarify intake` form is replaced by `/fab-continue intake` then argless `/fab-clarify` (argless is correct in fab-continue's own messages — the change reference of the current invocation is implied, active or `[change-name]` override, and an Error Handling note tells override users to re-run with the same ``; cross-context sites like `_pipeline.md`'s stop guidance instead name the change in every command), with an explicit delete-`plan.md` note where plan regeneration is the intent; the `intake.md`-missing error points at `/fab-continue intake` instead of looping through plain `/fab-continue`. The Review Behavior call site reads `change_type` from `.status.yaml` and passes it in the inward sub-agent prompt per `_review.md`'s context contract. -**Per-stage model** (260613-l3ja, 260613-fgxx): the one-stage sequencer resolves `fab resolve-agent ` immediately before dispatching each post-intake stage's block and applies the resolved model/effort to the dispatch (empty model ⇒ omit/inherit; empty effort ⇒ omit). Since every post-intake stage now dispatches, per-stage selection applies uniformly across apply/review/hydrate regardless of caller — there is no foreground post-intake path left to be the advisory-only exception (this closes Gap 1a of the model-tier finding; full uniform-tier closure incl. the per-sub-agent effort knob is a deferred follow-up). The Review block additionally resolves `fab resolve-agent review` **once** for its own nested reviewer sub-agents (inward + outward) and the merge — independent of the sequencer's resolution of the `review` stage (the Claude Code adapter is the Agent tool `model` param; resolution is provider-neutral — see `_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution). Intake is pre-boundary and is not tiered by `/fab-continue`. +**Per-stage model** (260613-l3ja, 260613-fgxx, 260613-m3d4): the one-stage sequencer resolves `fab resolve-agent ` immediately before dispatching each post-intake stage's block, **surfaces** the resolved `model=/effort=` (visibility — a skip is detectable, not silent; m3d4), and applies both halves via two seams: model on the Agent `model` param (empty ⇒ omit/inherit) and effort as an imperative instruction in the sub-agent's prompt (no Agent effort param; omitted when empty; m3d4). Since every post-intake stage now dispatches, per-stage selection applies uniformly across apply/review/hydrate regardless of caller — there is no foreground post-intake path left to be the advisory-only exception (this closes Gap 1a of the model-tier finding; Gap 1b visibility + Gap 2's effort half are closed by m3d4 via the surface + prompt-injection seams; the lone residual is a first-class per-sub-agent effort param on the Agent tool — a harness ask, not built). The Review block additionally resolves `fab resolve-agent review` **once** for its own nested reviewer sub-agents (inward + outward) and the merge — applying the same model + same effort-prompt instruction to all three, independent of the sequencer's resolution of the `review` stage (the Claude Code adapter is the Agent tool `model` param, effort rides the prompt; resolution is provider-neutral — see `_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution). Intake is pre-boundary and is not tiered by `/fab-continue`. **Helpers**: Declares `helpers: [_srad]` in frontmatter; `_generation` and `_review` are loaded **stage-conditionally** at point of use (apply entry / intake regeneration → `_generation`; Review Behavior entry → `_review`) per `_preamble.md` § Skill Helper Declaration stage-conditional loading. Hydrate/ship/review-pr invocations and apply-resumes load neither. diff --git a/docs/specs/skills/SPEC-fab-ff.md b/docs/specs/skills/SPEC-fab-ff.md index 3fcd3e9c..e7fe7ac3 100644 --- a/docs/specs/skills/SPEC-fab-ff.md +++ b/docs/specs/skills/SPEC-fab-ff.md @@ -2,7 +2,7 @@ ## Summary -Fast-forward apply → review → hydrate (everything after intake) in one invocation. Since 260611-szxd the skill file is a **thin wrapper over the shared pipeline bracket in `_pipeline.md`** (see `SPEC-_pipeline.md`): it declares Purpose, Arguments, and the two bracket parameters — `{driver}` = `fab-ff`, `{terminal}` = `hydrate` — plus its own Output block. The bracket owns the single intake confidence gate (flat 3.0, all types), context loading, resumability (incl. the review-`failed` recovery via `fab status start review`), Steps 1–3, the auto-rework loop (`{max_cycles}`-cycle cap — the code-review.md Rework Budget knob, default 3 (c5tr); explicit per-cycle choreography, escalation after 2 consecutive fix-code), the exhaustion stop (terminal state: review `failed` — `/fab-continue`'s review-failed row presents the rework menu from there), and the shared error rows. No `/fab-clarify` runs inside the bracket — clarification is intake-only. All sub-skill invocations are dispatched as sub-agents with "do NOT run `fab status` commands; return results only" — the orchestrator owns those stages' transitions. Each stage dispatch in the bracket first resolves `fab resolve-agent ` and passes the model+effort to the Agent dispatch (260613-l3ja; empty ⇒ omit/inherit — see `_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution). Accepts `--force` to bypass the intake gate. The wrapper's `{driver}` parameter row scopes the driver claim to the commands the bracket shows it on — the fail/recovery commands are deliberately driver-less (260612-w7dp). +Fast-forward apply → review → hydrate (everything after intake) in one invocation. Since 260611-szxd the skill file is a **thin wrapper over the shared pipeline bracket in `_pipeline.md`** (see `SPEC-_pipeline.md`): it declares Purpose, Arguments, and the two bracket parameters — `{driver}` = `fab-ff`, `{terminal}` = `hydrate` — plus its own Output block. The bracket owns the single intake confidence gate (flat 3.0, all types), context loading, resumability (incl. the review-`failed` recovery via `fab status start review`), Steps 1–3, the auto-rework loop (`{max_cycles}`-cycle cap — the code-review.md Rework Budget knob, default 3 (c5tr); explicit per-cycle choreography, escalation after 2 consecutive fix-code), the exhaustion stop (terminal state: review `failed` — `/fab-continue`'s review-failed row presents the rework menu from there), and the shared error rows. No `/fab-clarify` runs inside the bracket — clarification is intake-only. All sub-skill invocations are dispatched as sub-agents with "do NOT run `fab status` commands; return results only" — the orchestrator owns those stages' transitions. Each stage dispatch in the bracket first resolves `fab resolve-agent ` (260613-l3ja), **surfaces** the resolved `model=/effort=` (visibility — a skip is detectable, not silent; 260613-m3d4), then dispatches via two seams: model on the Agent `model` param (empty ⇒ omit/inherit) and effort as an imperative instruction in the dispatch prompt (no Agent effort param; omitted when empty; 260613-m3d4) — see `_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution. Accepts `--force` to bypass the intake gate. The wrapper's `{driver}` parameter row scopes the driver claim to the commands the bracket shows it on — the fail/recovery commands are deliberately driver-less (260612-w7dp). **Helpers**: Declares `helpers: [_generation, _review, _srad, _pipeline]` in frontmatter per `docs/specs/skills.md § Skill Helpers`. diff --git a/docs/specs/skills/SPEC-fab-fff.md b/docs/specs/skills/SPEC-fab-fff.md index 356f1efe..a509a95b 100644 --- a/docs/specs/skills/SPEC-fab-fff.md +++ b/docs/specs/skills/SPEC-fab-fff.md @@ -2,7 +2,7 @@ ## Summary -Full pipeline gated on the single intake gate (identical to fab-ff). Since 260611-szxd the skill file is a **thin wrapper over the shared pipeline bracket in `_pipeline.md`** (see `SPEC-_pipeline.md`) with the two bracket parameters — `{driver}` = `fab-fff`, `{terminal}` = `review-pr` — plus the fff-only Steps 4–5 (ship, review-pr), its own Output block, and the fff-only error rows. The bracket owns the gate, resumability, Steps 1–3, the auto-rework loop (`{max_cycles}`-cycle cap — the code-review.md Rework Budget knob, default 3 (c5tr); explicit per-cycle choreography, exhaustion terminal state review `failed`), and the shared error rows. Ship and review-pr are self-managed: `/git-pr` and `/git-pr-review` run their own stage transitions internally (the dispatch exception is noted in `fab-fff.md`); `/git-pr-review`'s timeout outcome deliberately leaves `review-pr` `active` and replaces `Pipeline complete.` with a re-run pointer. Since 260612-w7dp, Steps 4–5 dispatch with the **explicit change argument** (`/git-pr {name}`, `/git-pr-review {name}` — `{name}` is the change's folder name from preflight, chosen over the 4-char `{id}` because git-pr classifies any argument spelling a PR type word as a ``, and a 4-char id can collide with `feat`/`docs`/`test`; a folder name never matches a type token) so the subagents resolve the pipeline's target as a transient override instead of self-resolving the active change — and their branch-matches-change guards verify the checked-out branch before mutating anything. Accepts `--force` to bypass the intake gate. **Per-stage model** (260613-l3ja): the fff-only Steps 4–5 run `fab resolve-agent ship` / `fab resolve-agent review-pr` before dispatching `/git-pr` / `/git-pr-review` (the bracket handles Steps 1–3) — resolved model+effort passed to the Agent dispatch, empty ⇒ omit/inherit; see `_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution. +Full pipeline gated on the single intake gate (identical to fab-ff). Since 260611-szxd the skill file is a **thin wrapper over the shared pipeline bracket in `_pipeline.md`** (see `SPEC-_pipeline.md`) with the two bracket parameters — `{driver}` = `fab-fff`, `{terminal}` = `review-pr` — plus the fff-only Steps 4–5 (ship, review-pr), its own Output block, and the fff-only error rows. The bracket owns the gate, resumability, Steps 1–3, the auto-rework loop (`{max_cycles}`-cycle cap — the code-review.md Rework Budget knob, default 3 (c5tr); explicit per-cycle choreography, exhaustion terminal state review `failed`), and the shared error rows. Ship and review-pr are self-managed: `/git-pr` and `/git-pr-review` run their own stage transitions internally (the dispatch exception is noted in `fab-fff.md`); `/git-pr-review`'s timeout outcome deliberately leaves `review-pr` `active` and replaces `Pipeline complete.` with a re-run pointer. Since 260612-w7dp, Steps 4–5 dispatch with the **explicit change argument** (`/git-pr {name}`, `/git-pr-review {name}` — `{name}` is the change's folder name from preflight, chosen over the 4-char `{id}` because git-pr classifies any argument spelling a PR type word as a ``, and a 4-char id can collide with `feat`/`docs`/`test`; a folder name never matches a type token) so the subagents resolve the pipeline's target as a transient override instead of self-resolving the active change — and their branch-matches-change guards verify the checked-out branch before mutating anything. Accepts `--force` to bypass the intake gate. **Per-stage model** (260613-l3ja, m3d4): the fff-only Steps 4–5 run `fab resolve-agent ship` / `fab resolve-agent review-pr` before dispatching `/git-pr` / `/git-pr-review` (the bracket handles Steps 1–3) — each site **surfaces** the resolved `model=/effort=` (visibility — a skip is detectable) and dispatches via two seams: model on the Agent `model` param (empty ⇒ omit/inherit) and effort as an imperative instruction in the dispatch prompt (no Agent effort param; omitted when empty; 260613-m3d4); see `_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution. **Helpers**: Declares `helpers: [_generation, _review, _srad, _pipeline]` in frontmatter per `docs/specs/skills.md § Skill Helpers`. diff --git a/docs/specs/stage-models.md b/docs/specs/stage-models.md index 126543d4..d91b9736 100644 --- a/docs/specs/stage-models.md +++ b/docs/specs/stage-models.md @@ -203,15 +203,18 @@ override a tier to Haiku (pass-through doesn't forbid it); fab just doesn't ship ## Skill wiring — orchestrator/dispatch consume `fab resolve-agent` The orchestrators (`/fab-ff`, `/fab-fff`, `/fab-proceed`) and `/fab-continue`'s sub-agent dispatch call -`fab resolve-agent ` immediately before dispatching each stage's sub-agent, and pass the -resolved **model AND effort** to the Agent dispatch: +`fab resolve-agent ` immediately before dispatching each stage's sub-agent, **surface** the +resolved `model=/effort=` lines (so a skipped or mis-resolved tier is visible in output rather than +silent — the available stand-in for an enforcement guard, since dispatch is harness-internal), and +apply the resolved **model AND effort** through their two seams: -- Empty model → omit the model param (inherit session/orchestrator model — today's behavior). -- Empty effort → omit the effort flag. +- **Model → the Agent tool's `model` param.** Empty model → omit it (inherit session/orchestrator model — today's behavior). +- **Effort → an explicit instruction in the subagent prompt.** The Agent tool has no `effort` param, so the resolved effort is injected as an imperative line in the dispatched prompt (e.g., ``Operate at `xhigh` reasoning effort for this task.``) and the sub-agent self-selects. Empty effort → omit the instruction. (The effort half is therefore **no longer dropped** — earlier wiring had no seam for it; it now rides the prompt. The clean fix, a first-class per-sub-agent effort parameter on the Agent tool, is a harness ask outside fab's control — see § Foreground limitation's scope note.) The **`review` stage resolves once** and applies the same `{model, effort}` to BOTH reviewer sub-agents -(inward + outward) and the merge — a consequence of stage(/tier)-granularity, documented as a known -tradeoff (the mechanical merge runs at the reviewer's tier; acceptable for config simplicity). +(inward + outward) and the merge — the same model param and the same effort-prompt instruction for all +three; a consequence of stage(/tier)-granularity, documented as a known tradeoff (the mechanical merge +runs at the reviewer's tier; acceptable for config simplicity). `_cli-fab.md` documents the `fab resolve-agent` command signature (Constitution constraint: CLI changes MUST update `_cli-fab.md`). `architecture.md` documents the `agent.tiers` config block alongside the @@ -227,11 +230,15 @@ Per-stage selection is **provider-neutral by construction**, not Claude-locked: provider's model IDs and effort vocabulary (`gpt-5 / reasoning_effort:high`, `gemini-* / `) and nothing in fab rejects it. - *Harness-specific layer (the adapter):* injecting the resolved model+effort into the actual - sub-agent dispatch is harness behavior. **For Claude Code that is the Agent tool's `model` - parameter** — the skill wiring names this explicitly as the Claude-Code adapter, not as universal - truth. This coupling is **not introduced by this feature** — fab's entire existing subagent-dispatch - design (`_preamble.md` § Subagent Dispatch) is already Claude-Code-shaped. Per-stage selection is - exactly as portable as fab's existing dispatch: no more, no less. + sub-agent dispatch is harness behavior, and the two halves use **two different seams** in Claude + Code. **The model rides the Agent tool's `model` parameter** (which takes a short alias — + `opus`/`sonnet`/`haiku`/`fable` — not the full versioned id `fab resolve-agent` emits, so the + orchestrator maps id → alias at the seam); **the effort rides an instruction in the subagent prompt** + (the Agent tool exposes no effort parameter). The skill wiring names both explicitly as the + Claude-Code adapter, not as universal truth. This coupling is **not introduced by this feature** — + fab's entire existing subagent-dispatch design (`_preamble.md` § Subagent Dispatch) is already + Claude-Code-shaped. Per-stage selection is exactly as portable as fab's existing dispatch: no more, + no less. - *Claude-flavored data (overridable):* fab-kit's shipped default table uses Claude model IDs/effort. These are documented as "fab-kit's Claude defaults," fully replaceable via `agent.tiers`. - *v1 scope is architecture-neutral + documented — NOT shipped/tested against a non-Claude harness.* No @@ -282,10 +289,12 @@ There fab cannot switch the session model mid-run, so the configured tier is **a skill MAY note "this stage is configured for X; you're on Y" but MUST NOT attempt to switch models. > **Scope note**: this section reconciles the foreground limitation with the single post-intake -> execution mode (260613-fgxx, Change A). **Full uniform-tier closure** — the per-sub-agent effort -> knob (the model-tier finding's Gap 2, a Claude Code Agent-tool limitation) and the complete -> uniform-tiering narrative — is a **separate follow-up change (Change C)** and is intentionally NOT -> written here. +> execution mode (260613-fgxx, Change A). The **effort half** of per-stage tiering — injected into the +> subagent prompt as an explicit instruction (since the Claude Code Agent tool has no effort parameter) +> — and the **compliance-visibility** behavior (surfacing the resolved `model=/effort=` at each +> dispatch site) are written in by 260613-m3d4 (Change C); see § Skill wiring above. The **lone +> residual** is a first-class per-sub-agent `effort` parameter on the Agent tool (the model-tier +> finding's Gap 2 clean fix) — a harness ask outside fab's control, deliberately not built here. --- diff --git a/fab/changes/260613-m3d4-uniform-stage-model-tier/.history.jsonl b/fab/changes/260613-m3d4-uniform-stage-model-tier/.history.jsonl index b85a8188..6e45044c 100644 --- a/fab/changes/260613-m3d4-uniform-stage-model-tier/.history.jsonl +++ b/fab/changes/260613-m3d4-uniform-stage-model-tier/.history.jsonl @@ -14,3 +14,12 @@ {"delta":"+0.0","event":"confidence","score":3.5,"trigger":"calc-score","ts":"2026-06-13T12:31:17Z"} {"delta":"+0.0","event":"confidence","score":3.5,"trigger":"calc-score","ts":"2026-06-13T12:31:25Z"} {"delta":"+0.0","event":"confidence","score":3.5,"trigger":"calc-score","ts":"2026-06-13T12:31:31Z"} +{"cmd":"fab-fff","event":"command","ts":"2026-06-13T13:23:14Z"} +{"action":"enter","driver":"fab-fff","event":"stage-transition","stage":"apply","ts":"2026-06-13T13:24:02Z"} +{"action":"enter","driver":"fab-fff","event":"stage-transition","stage":"review","ts":"2026-06-13T13:34:38Z"} +{"event":"review","result":"failed","ts":"2026-06-13T13:41:37Z"} +{"action":"re-entry","driver":"fab-fff","event":"stage-transition","stage":"apply","ts":"2026-06-13T13:41:37Z"} +{"action":"re-entry","driver":"fab-fff","event":"stage-transition","stage":"review","ts":"2026-06-13T13:45:15Z"} +{"action":"enter","driver":"fab-fff","event":"stage-transition","stage":"hydrate","ts":"2026-06-13T13:51:37Z"} +{"event":"review","result":"passed","ts":"2026-06-13T13:51:37Z"} +{"action":"enter","driver":"fab-fff","event":"stage-transition","stage":"ship","ts":"2026-06-13T13:58:47Z"} diff --git a/fab/changes/260613-m3d4-uniform-stage-model-tier/.status.yaml b/fab/changes/260613-m3d4-uniform-stage-model-tier/.status.yaml index af0febbd..20c4b030 100644 --- a/fab/changes/260613-m3d4-uniform-stage-model-tier/.status.yaml +++ b/fab/changes/260613-m3d4-uniform-stage-model-tier/.status.yaml @@ -5,17 +5,17 @@ created_by: sahil-noon change_type: fix issues: [] progress: - intake: ready - apply: pending - review: pending - hydrate: pending - ship: pending + intake: done + apply: done + review: done + hydrate: done + ship: active review-pr: pending plan: - generated: false - task_count: 0 - acceptance_count: 0 - acceptance_completed: 0 + generated: true + task_count: 15 + acceptance_count: 15 + acceptance_completed: 15 confidence: certain: 4 confident: 5 @@ -29,7 +29,25 @@ confidence: competence: 88.3 disambiguation: 86.7 stage_metrics: - intake: {started_at: "2026-06-13T11:25:53Z", driver: fab-new, iterations: 1} + intake: {started_at: "2026-06-13T11:25:53Z", driver: fab-new, iterations: 1, completed_at: "2026-06-13T13:24:02Z"} + apply: {started_at: "2026-06-13T13:41:37Z", driver: fab-fff, iterations: 2, completed_at: "2026-06-13T13:45:15Z"} + review: {started_at: "2026-06-13T13:45:15Z", driver: fab-fff, iterations: 2, completed_at: "2026-06-13T13:51:37Z"} + hydrate: {started_at: "2026-06-13T13:51:37Z", driver: fab-fff, iterations: 1, completed_at: "2026-06-13T13:58:47Z"} + ship: {started_at: "2026-06-13T13:58:47Z", driver: fab-fff, iterations: 1} prs: [] +true_impact: + added: 649 + deleted: 58 + net: 591 + excluding: + added: 195 + deleted: 24 + net: 171 + tests: + added: 164 + deleted: 0 + net: 164 + computed_at: "2026-06-13T13:58:47Z" + computed_at_stage: hydrate # true_impact: lazily created on first apply-finish (no placeholder here). -last_updated: 2026-06-13T12:31:31Z +last_updated: 2026-06-13T13:58:47Z diff --git a/fab/changes/260613-m3d4-uniform-stage-model-tier/plan.md b/fab/changes/260613-m3d4-uniform-stage-model-tier/plan.md new file mode 100644 index 00000000..b036ebce --- /dev/null +++ b/fab/changes/260613-m3d4-uniform-stage-model-tier/plan.md @@ -0,0 +1,247 @@ +# Plan: Apply per-stage model tier uniformly + inject effort via subagent prompt + +**Change**: 260613-m3d4-uniform-stage-model-tier +**Intake**: `intake.md` + +## Requirements + +> This change is **Change C** of a three-part refactor, gated on **Change A** +> (`intake-is-the-context-boundary`, landed as `260613-fgxx`). Change A removed the +> post-intake foreground execution path, closing **Gap 1a** of the model-tier finding. +> This change is therefore scoped to **Gap 1b** (compliance visibility), **Gap 2's effort +> half** (inject effort via the subagent prompt), and the **doc reconciliation** that +> follows from the now-uniform behavior. Scope is **skills/prose + docs only — NO Go**. + +### Dispatch: Per-stage tier compliance visibility (Gap 1b) + +#### R1: Resolved tier surfaced at every per-stage dispatch site +The orchestrator/sequencer MUST surface (echo into the dispatch prompt and/or log) the +resolved `model=/effort=` lines immediately after running `fab resolve-agent ` and +before dispatching the stage's sub-agent, so a skipped or mis-resolved tier is **visible in +output** rather than silently consumed. This MUST apply at every per-stage dispatch site: +`_pipeline.md` Behavior "Per-stage model resolution" note + Steps 1 (apply), 2 (review), 3 +(hydrate), and Auto-Rework Loop items 3 (re-apply) and 4 (re-review); and `fab-fff.md` Steps +4 (ship) and 5 (review-pr). + +- **GIVEN** an orchestrator about to dispatch the apply sub-agent +- **WHEN** it runs `fab resolve-agent apply` (yielding `model=claude-opus-4-8 effort=high`) +- **THEN** the resolved `model=/effort=` lines are surfaced (echoed into the dispatch prompt and/or logged) +- **AND** a run that skipped resolution shows no such lines, making the miss visible + +#### R2: Lightweight non-empty guard where cleanly expressible (Gap 1b) +The dispatch contract SHOULD add a prose-level assertion that the resolved `model=/effort=` +lines are non-empty before dispatch **when it can be expressed cleanly**; where a guard would +add awkward control flow, visibility-in-output alone (R1) satisfies the requirement. Prose +only — NO Go. + +- **GIVEN** the canonical Per-Stage Model Resolution contract in `_preamble.md` +- **WHEN** the contract is stated +- **THEN** it notes that an all-empty resolution (both `model=` and `effort=` empty) is a signal worth surfacing/asserting, expressed as prose without awkward control flow +- **AND** no Go code or new skill mechanism is introduced for it + +### Dispatch: Effort injection via subagent prompt (Gap 2 effort half) + +#### R3: Resolved effort injected as an imperative subagent-prompt instruction +Because the Claude Code Agent tool has **no `effort` parameter**, the dispatching skill MUST +inject the resolved effort into the **subagent prompt** as an explicit imperative instruction +(the model half stays on the Agent tool's `model` param, unchanged). This MUST apply at every +per-stage dispatch site named in R1. Where the resolved effort is **empty**, the instruction +MUST be omitted (mirroring the existing "empty effort ⇒ omit" contract). + +- **GIVEN** an orchestrator dispatching the review sub-agent with resolved `effort=xhigh` +- **WHEN** it builds the Agent-tool dispatch prompt +- **THEN** the prompt carries an imperative instruction such as ``Operate at `xhigh` reasoning effort for this task.`` +- **AND** when the resolved effort is empty, no effort instruction is added + +#### R4: Review resolves once → same effort instruction to both reviewers + merge +The `review` stage MUST resolve `fab resolve-agent review` **once** and apply the **same** +effort instruction to BOTH reviewer sub-agents (inward + outward) and the merge — preserving +the existing "review resolves once" contract. + +- **GIVEN** the review block resolving its tier once (`effort=xhigh`) +- **WHEN** it dispatches the inward and outward reviewer sub-agents +- **THEN** both prompts carry the identical `xhigh` effort instruction, and the merge runs at the same tier + +### Docs: Reconcile findings + spec + preamble + header note with uniform behavior + +#### R5: Finding doc reflects Gap 1a closed, Gap 1b + Gap 2-effort addressed, harness ask residual +`docs/findings/per-stage-model-tier-application.md` MUST be updated so: Gap 1a's "Closed by..." +note reads as **closed** by the landed Change A (not a forward projection); `**Status:**` and +the per-path table reflect that there is no longer a foreground-vs-subagent split; Gap 1b and +Gap 2's effort-half are recorded as **addressed by this change** (visibility + prompt-injection); +and § Suggested directions item 4 (per-subagent `effort` param on the Agent tool) **remains** and +is marked as the **residual** after this change. + +- **GIVEN** the finding doc post-Change-A +- **WHEN** this change lands +- **THEN** Gap 1a is marked closed, Gap 1b + Gap 2-effort marked addressed (with the seam used), and item 4 remains the residual harness ask +- **AND** no Go code or skill mechanism is built for the harness ask + +#### R6: Spec reconciled with uniform behavior + effort-via-prompt +`docs/specs/stage-models.md` § Foreground limitation and § Skill wiring MUST be reconciled +with the now-uniform behavior: no foreground advisory path for post-intake stages (already +reflected post-A — reconcile only the tier-mechanism residue), and effort is now **injected via +the subagent prompt** rather than dropped. As human-curated pre-implementation design intent +(Constitution VI), this section is **edited**, not auto-generated. + +- **GIVEN** § Skill wiring stating "Empty effort → omit the effort flag" (which read as "effort is dropped") +- **WHEN** this change lands +- **THEN** the spec states effort is injected via the subagent prompt (omitted only when empty), and the § Foreground limitation scope note no longer defers the effort half as unwritten + +#### R7: Preamble documents the effort-via-prompt seam + harness-adapter alias detail +`src/kit/skills/_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution MUST document the +**effort-via-prompt** seam: model is injected via the Agent tool's `model` param, effort is +injected via an explicit subagent-prompt instruction (no Agent-tool effort param). It MUST +reconcile the residual-advisory paragraph for the tier mechanism (Gap 1a already removed by A — +do not re-remove what A removed) and add the compliance-visibility (R1) expectation. It SHOULD +note the harness-adapter detail that the Agent tool's `model` param takes a short alias +(`opus`/`sonnet`/`haiku`/`fable`), NOT the full `claude-opus-4-8` id that `fab resolve-agent` +emits — the orchestrator maps the resolved id to the alias. + +- **GIVEN** the canonical Per-Stage Model Resolution contract +- **WHEN** this change lands +- **THEN** it documents effort-via-prompt as the effort seam, surfaces/asserts the resolved lines (visibility), and notes the id→alias mapping at the harness-adapter boundary +- **AND** it does not re-remove the foreground-advisory language Change A already removed + +#### R8: fab-continue header note reconciled (Gap 1a residue) +`src/kit/skills/fab-continue.md` header per-stage-model note MUST be reconciled with the uniform +behavior. If Change A already removed the foreground-advisory caveat (it did — the note already +reframes to the one-stage sequencer with "no foreground execution path... to leave the tier +merely advisory"), leave it; this change only adds the effort-via-prompt mention if it improves +local consistency, without duplicating A's reconciliation. + +- **GIVEN** the `fab-continue.md` header note already reframed by Change A +- **WHEN** this change lands +- **THEN** the note is consistent with effort-via-prompt + the canonical contract, with no re-removal of A's already-deleted advisory language + +### Docs: SPEC mirrors (Constitution requirement) + +#### R9: Every edited skill file's SPEC mirror updated +Per the Constitution Additional Constraint (skill changes MUST update the corresponding +`docs/specs/skills/SPEC-*.md`), the SPEC mirror of every edited `src/kit/skills/*.md` MUST be +updated to reflect the effort-via-prompt + visibility behavior. Affected mirrors: +`SPEC-_pipeline.md`, `SPEC-_preamble.md`, `SPEC-fab-fff.md`, `SPEC-fab-continue.md`. + +- **GIVEN** edits to `_pipeline.md`, `_preamble.md`, `fab-fff.md`, `fab-continue.md` +- **WHEN** this change lands +- **THEN** each corresponding `SPEC-*.md` mirror documents the new effort-via-prompt + visibility behavior and the Gap 1a removal attribution + +### Non-Goals + +- **No Go change** — `src/go/**` untouched; `fab resolve-agent`'s signature is unchanged, so `_cli-fab.md` is untouched. +- **Not building the harness ask** — the per-subagent `effort` param on the Agent tool is documented as residual only; no skill mechanism or Go is added. +- **Not re-doing Change A's work** — the every-post-intake-stage-dispatches-a-subagent mechanic and the broad foreground-advisory deletions are A's deliverable, already landed; this change reconciles only the tier-mechanism-specific residue. +- **No migration** — no user-data restructure (config / `.status.yaml` / archive layout). +- **Memory not touched at apply** — `pipeline/execution-skills`, `pipeline/planning-skills` are handled at HYDRATE, not apply. + +### Design Decisions + +1. **Effort instruction wording**: ``Operate at `xhigh` reasoning effort for this task.`` (imperative, backtick-quoted level, single sentence). — *Why*: imperative + unambiguous per the intake's stated criterion; reads naturally appended to a dispatch prompt; backticks make the level scannable. — *Rejected*: "Use xhigh effort" (less imperative), a multi-sentence block (verbose for a per-dispatch line). +2. **Visibility seam = echo into the dispatch prompt AND log line**: surface the resolved `model=/effort=` in the dispatch prompt (it travels with the effort instruction anyway) and instruct the orchestrator to emit it in its step output. — *Why*: the prompt already must carry the effort line for R3, so co-locating the model line there is zero marginal cost and makes the per-dispatch tier self-documenting; the output-log half is what makes a *skipped* resolution visible. — *Rejected*: log-only (doesn't help the subagent self-document), prompt-only (a skipped dispatch with no prompt line is invisible in the orchestrator's own output). +3. **Guard expressed as prose assertion in the canonical contract only** (`_preamble.md`), not repeated at every site. — *Why*: the canonical Per-Stage Model Resolution contract is the single source the dispatch sites already defer to; asserting non-emptiness there avoids awkward per-site control flow (Assumption #9's "fall back to visibility if a guard would add awkward control flow"). — *Rejected*: a per-site `if empty then…` branch (awkward control flow, six duplicated copies). + +## Tasks + +### Phase 1: Canonical contract (the single source the sites defer to) + +- [x] T001 Update `src/kit/skills/_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution: (a) document the **effort-via-prompt** seam — model via Agent tool `model` param, effort via an explicit imperative subagent-prompt instruction (omit when empty); (b) add the **compliance-visibility** expectation (surface the resolved `model=/effort=` into the dispatch prompt and/or orchestrator output) plus the prose non-empty-assertion note; (c) reconcile the harness-adapter paragraph to note the Agent `model` param takes a short alias (`opus`/`sonnet`/`haiku`/`fable`), not the full `claude-*` id `fab resolve-agent` emits (orchestrator maps id→alias); (d) update the trailing parenthetical in the "Per-stage selection applies on every post-intake stage" paragraph so the per-sub-agent effort knob is described as injected-via-prompt-now with the Agent-tool effort param as the residual harness ask (do NOT re-remove A's already-deleted foreground-advisory language). + +### Phase 2: Dispatch sites (defer to the contract; add visibility + effort line) + +- [x] T002 [P] Update `src/kit/skills/_pipeline.md`: (a) the Behavior "Per-stage model resolution" note — extend so the resolved `model=/effort=` is surfaced and the effort half is injected into the dispatch prompt; (b) Step 1 (apply), Step 2 (review — once, both reviewers + merge), Step 3 (hydrate) dispatch instructions; (c) Auto-Rework Loop item 3 (re-apply) and item 4 (re-review) — each: surface the resolved tier and add the effort instruction to the dispatch prompt (omit effort when empty). +- [x] T003 [P] Update `src/kit/skills/fab-fff.md`: Step 4 (ship — `fab resolve-agent ship`) and Step 5 (review-pr — `fab resolve-agent review-pr`), plus the "Per-stage model" blockquote note: surface the resolved tier and inject the resolved effort instruction into the `/git-pr` / `/git-pr-review` dispatch prompts (omit effort when empty). +- [x] T004 [P] Reconcile `src/kit/skills/fab-continue.md` header per-stage-model note (~line 19) and the Review Behavior "Per-stage model resolution (nested reviewers)" note: ensure consistency with effort-via-prompt + the canonical contract; the nested-reviewers note applies the same once-resolved effort instruction to both reviewers + merge. Do NOT re-remove A's already-deleted foreground-advisory language. + +### Phase 3: Findings + spec doc reconciliation + +- [x] T005 [P] Update `docs/findings/per-stage-model-tier-application.md`: (a) `**Status:**` line; (b) Gap 1a's "Closed by..." note → marked CLOSED by landed Change A (`260613-fgxx`); (c) the per-path table (~lines 40–46) → reflect no foreground-vs-subagent split; (d) Gap 1b + Gap 2-effort marked addressed by this change (visibility + prompt-injection) with the residual being the harness ask; (e) ensure § Suggested directions item 4 (per-subagent effort param) remains and is marked the residual. +- [x] T006 [P] Update `docs/specs/stage-models.md` § Skill wiring and § Foreground limitation: § Skill wiring — effort is injected via the subagent prompt (omitted only when empty), not silently dropped; § Foreground limitation scope note — remove the deferral of the effort half as "intentionally NOT written here" now that this change writes it; keep the residual harness-ask framing for the Agent-tool effort param. Human-curated design intent — edit, do not auto-generate. + +### Phase 4: SPEC mirrors (Constitution requirement) + +- [x] T007 [P] Update `docs/specs/skills/SPEC-_preamble.md` Summary + the Per-Stage Model Resolution Flow node: document the effort-via-prompt seam, compliance visibility, and the id→alias harness-adapter detail (attribute to this change). +- [x] T008 [P] Update `docs/specs/skills/SPEC-_pipeline.md`: document that each per-stage dispatch site surfaces the resolved tier and injects the effort instruction into the dispatch prompt (Steps 1–3 + rework items 3/4). +- [x] T009 [P] Update `docs/specs/skills/SPEC-fab-fff.md`: document that Steps 4–5 surface the resolved tier and inject effort into the `/git-pr` / `/git-pr-review` dispatch prompts. +- [x] T010 [P] Update `docs/specs/skills/SPEC-fab-continue.md`: document the effort-via-prompt seam in the one-stage sequencer + nested reviewers (consistent with the canonical contract); note Gap 1a closed by A and the effort half now addressed by this change. + +### Phase 5: Meta-consistency pass + +- [x] T011 Cross-check that `_pipeline.md` / `_preamble.md` / `fab-fff.md` / `fab-continue.md` and their SPEC mirrors and the finding/stage-models docs are mutually consistent: the effort-via-prompt seam, the once-resolved-review rule, the visibility expectation, and the Gap 1a-closed / harness-ask-residual framing all agree across files. No new claim contradicts another. + +### Phase 6: Sibling-class completeness (rework — review cycle 1) + + + +- [x] T012 Reconcile `src/kit/skills/fab-ff.md` per-stage-model blockquote note (~line 37) — the structural twin of `fab-fff.md`'s note (T003): bring it to the same two-seam phrasing (surface the resolved `model=/effort=`; model via the Agent `model` param; effort via the imperative subagent-prompt instruction, omitted when empty), OR shorten it to defer to `_preamble.md` § Per-Stage Model Resolution without restating the now-stale single-seam mechanics. `fab-ff.md` runs the same `_pipeline.md` bracket as `fab-fff.md`, so its dispatch behavior (Steps 1–3) is already covered by T002; this fixes ONLY the fab-ff.md note that pre-states the superseded contract. Do NOT re-remove Change A's already-deleted foreground-advisory language. +- [x] T013 Update `docs/specs/skills/SPEC-fab-ff.md` (~line 5) to mirror the T012 edit and match the `SPEC-fab-fff.md` (T009) reconciliation — the two-seam phrasing with the `(…, 260613-m3d4)` change attribution, so the SPEC mirror does not contradict its skill file or its `SPEC-fab-fff.md` sibling. Constitution skill→SPEC requirement. +- [x] T014 Soften `docs/specs/skills/SPEC-fab-continue.md` line ~7 (the unedited `260613-fgxx` paragraph) so it no longer pre-states the superseded single-seam framing ("passes the resolved model/effort to the Agent dispatch") that the next paragraph (line ~11, edited in T010) overrides — defer to the "Per-stage model" description below it. Should-fix internal-consistency wrinkle of the same class as T012/T013. +- [x] T015 Re-run the T011 meta-consistency pass with scope widened to the **whole `fab-ff`/`fab-fff` orchestrator sibling class** (both skill files + both SPEC mirrors + `_pipeline.md`/`_preamble.md`/`fab-continue.md` + finding/stage-models docs): grep the repo for the stale single-seam phrasing ("passes the resolved model+effort to the Agent dispatch" / "model/effort to the Agent dispatch") and confirm every in-scope occurrence is reconciled. Memory files (`docs/memory/_shared/context-loading.md` etc.) carrying the stale phrasing are OUT of scope — deferred to hydrate — and must be left untouched. + +## Execution Order + +- T001 first (the canonical contract the dispatch sites defer to — establishes the wording/seam the rest mirror). +- T002–T006 may run in parallel after T001 (independent files; each mirrors the contract). +- T007–T010 (SPEC mirrors) run after their source skill edits (T007 after T001; T008 after T002; T009 after T003; T010 after T004), but are mutually independent `[P]`. +- T011 last in the original pass (consistency pass over the finished set). +- T012–T014 (rework cycle 1) run after T011; T013 after T012 (SPEC mirror follows its skill edit); T014 independent. T015 last (re-run the consistency pass with class-wide scope). + +## Acceptance + +### Functional Completeness + +- [x] A-001 R1: Every per-stage dispatch site in `_pipeline.md` (Behavior note + Steps 1–3 + Auto-Rework items 3/4) and `fab-fff.md` (Steps 4–5) surfaces the resolved `model=/effort=` (echoed into the dispatch prompt and/or output). +- [x] A-002 R3: Every per-stage dispatch site injects the resolved effort as an imperative subagent-prompt instruction, with explicit "omit when empty" handling. +- [x] A-003 R5: `docs/findings/per-stage-model-tier-application.md` marks Gap 1a closed (by landed Change A), Gap 1b + Gap 2-effort addressed by this change, and item 4 the residual harness ask. +- [x] A-004 R6: `docs/specs/stage-models.md` § Skill wiring + § Foreground limitation state effort-via-prompt (not dropped) and carry no unreconciled foreground-advisory tier residue. +- [x] A-005 R7: `src/kit/skills/_preamble.md` documents the effort-via-prompt seam, the compliance-visibility expectation, and the id→alias harness-adapter detail. +- [x] A-006 R9: `SPEC-_preamble.md`, `SPEC-_pipeline.md`, `SPEC-fab-fff.md`, `SPEC-fab-continue.md` each document the effort-via-prompt + visibility behavior and the Gap 1a-closed attribution. + +### Behavioral Correctness + +- [x] A-007 R3: Where resolved effort is empty, the dispatch prompt carries NO effort instruction (mirroring the existing empty-effort-omit contract), stated explicitly at the canonical contract. +- [x] A-008 R4: The review stage resolves `fab resolve-agent review` once and the SAME effort instruction governs both reviewer sub-agents (inward + outward) and the merge — the "review resolves once" contract is preserved, not duplicated per reviewer. +- [x] A-009 R8: The `fab-continue.md` header note is consistent with the canonical contract and effort-via-prompt; Change A's already-removed foreground-advisory language is NOT re-removed or re-introduced. + +### Removal Verification + +- [x] A-010 R2: No Go file under `src/go/**` is modified; `_cli-fab.md` is unchanged (resolver signature unchanged); the harness ask is documented but not built. + +### Edge Cases & Error Handling + +- [x] A-011 R2: The non-empty guard is expressed as prose (a surface/assert note in the canonical contract), not as awkward per-site control flow; if a clean guard was infeasible, visibility-in-output alone is present and the assumption row records the fallback. + +### Code Quality + +- [x] A-012 Pattern consistency: New prose matches the surrounding skill-file prose density, idiom, and blockquote-note conventions; SPEC-mirror edits match the existing `(260613-xxxx)` change-attribution style. +- [x] A-013 No unnecessary duplication: The effort/visibility behavior is stated canonically in `_preamble.md` and referenced (not re-derived verbatim) at each dispatch site; edits do not duplicate Change A's reconciliation. + +### Documentation Accuracy (checklist.extra_categories) + +- [x] A-014 Cross-references between `_pipeline.md` / `_preamble.md` / `fab-fff.md` / `fab-ff.md` / `fab-continue.md` and their SPEC mirrors and the finding/stage-models docs are mutually consistent (the meta-consistency pass, T011 + the class-wide re-run T015) — no file claims a behavior another contradicts. **In particular, the `fab-ff`/`fab-fff` orchestrator sibling pair (and their SPEC mirrors) agree on the two-seam dispatch contract** — neither retains the superseded single-seam "passes the resolved model+effort to the Agent dispatch" phrasing. +- [x] A-015 R9: No in-scope skill or SPEC file retains the stale single-seam phrasing ("passes the resolved model+effort to the Agent dispatch" / "model/effort to the Agent dispatch") — verified by a class-wide grep (T015). Memory files carrying the phrasing are out of scope (deferred to hydrate) and remain untouched. + +## Notes + +- Check items as you review: `- [x]` +- All acceptance items must pass before `/fab-continue` (hydrate) +- Markdown-only / CommonMark throughout; NO Go. +- This change is itself about the per-stage-model dispatch contract — the dispatch-site edits, their SPEC mirrors, and the `_preamble.md` contract must be mutually consistent (meta-consistency is load-bearing here). + +## Assumptions + +| # | Grade | Decision | Rationale | Scores | +|---|-------|----------|-----------|--------| +| 1 | Certain | Change A landed as `260613-fgxx`; scope is Gap 1b + Gap 2-effort + doc reconciliation only (Gap 1a already closed, not duplicated) | Verified in source: `_preamble.md` line 350, `fab-continue.md` line 19, and `stage-models.md` § Foreground limitation all reference `260613-fgxx` and state no foreground post-intake path remains. Intake "IMPORTANT DEPENDENCY" + both finding docs corroborate. | S:95 R:75 A:95 D:95 | +| 2 | Certain | Edit ONLY `src/kit/skills/*` (canonical) + `docs/*`; never `.claude/skills/*` (gitignored deployed copies); no Go | Constitution Principle V + Additional Constraints; intake constraints; `fab resolve-agent` signature unchanged | S:100 R:75 A:100 D:100 | +| 3 | Certain | Skill-file edits update their SPEC mirrors: `SPEC-_pipeline.md`, `SPEC-_preamble.md`, `SPEC-fab-fff.md`, `SPEC-fab-continue.md` | Constitution Additional Constraint (skill change → SPEC update); all four mirrors confirmed to exist | S:95 R:75 A:95 D:90 | +| 4 | Confident | Effort-instruction wording: ``Operate at `xhigh` reasoning effort for this task.`` (imperative, backtick level, single sentence) | Intake gives this as the example and explicitly delegates final wording to apply ("imperative, unambiguous, omitted when empty"); chosen for naturalness appended to a dispatch prompt | S:90 R:85 A:80 D:75 | +| 5 | Confident | Visibility seam = surface resolved `model=/effort=` into the dispatch prompt AND instruct the orchestrator to emit it in step output | Intake item 1 says "echoed into the dispatch prompt and/or logged"; both halves are cheap (the prompt already carries the effort line) and together cover both a mis-resolved tier (prompt) and a skipped resolution (output) | S:90 R:80 A:80 D:75 | +| 6 | Confident | Lightweight guard expressed as a prose non-empty assertion in the canonical `_preamble.md` contract only (not per-site control flow) | Assumption #9 (confirmed): add guard "when cleanly feasible; fall back to visibility-in-output alone if a guard would add awkward control flow." A canonical-only prose assertion is the clean form; per-site branching is the awkward form correctly avoided | S:90 R:80 A:75 D:75 | +| 7 | Confident | Capture the harness-adapter id→alias detail (Agent `model` param takes `opus`/`sonnet`/`haiku`/`fable`, not the full `claude-opus-4-8` id) in `_preamble.md` § Per-Stage Model Resolution | Orchestrator instruction names this as a real harness detail "worth capturing... mention it if it fits naturally, keep it concise"; it is exactly the harness-adapter boundary the section already describes | S:85 R:85 A:80 D:80 | +| 8 | Confident | `fab-continue.md` header note needs only light touch (A already reframed it); no re-removal of A's deleted advisory language | Intake "Note on overlap with Change A" + verified source: the header note already reads "no foreground execution path... to leave the tier merely advisory" | S:90 R:80 A:80 D:80 | + +8 assumptions (3 certain, 5 confident, 0 tentative). diff --git a/src/kit/skills/_pipeline.md b/src/kit/skills/_pipeline.md index b8adbee8..4c4a0349 100644 --- a/src/kit/skills/_pipeline.md +++ b/src/kit/skills/_pipeline.md @@ -47,7 +47,7 @@ Load per `_preamble.md` Sections 1-3 (config, constitution, intake, memory index > > **Dispatch**: All sub-skill invocations use the Agent tool (`general-purpose` subagent) per `_preamble.md` § Subagent Dispatch. Each subagent reads the target skill file, follows the specified behavior, and returns a structured result to the pipeline. Every `/fab-continue`-behavior subagent prompt MUST include: **"do NOT run `fab status` commands; return results only"** — the orchestrator runs those stages' transitions (finish/fail/reset) itself. This is the **universal block contract**, not an override this orchestrator imposes: post-intake `/fab-continue` blocks (Apply / Review / Hydrate Behavior) never own their transitions for **any** caller — plain `/fab-continue` is itself a one-stage sequencer that dispatches the block identically and runs the transition after it returns (see `fab-continue.md` Normal Flow Step 1). The orchestrator here is therefore a **pure sequencer**: dispatch block → read returned status/findings → decide proceed / loop / stop; it never reaches into block internals. > -> **Per-stage model resolution** (see `_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution for the canonical contract): immediately **before** dispatching each stage's sub-agent, run `fab resolve-agent ` and pass the resolved model AND effort into the Agent dispatch. Empty model ⇒ omit the dispatch `model` param (inherit the orchestrator/session model — today's behavior); empty effort ⇒ omit the effort. The Claude Code adapter is the Agent tool's `model` parameter; the resolution itself is provider-neutral. The `review` stage (Step 2) resolves **once** and applies the same profile to both reviewer sub-agents AND the merge. +> **Per-stage model resolution** (see `_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution for the canonical contract): immediately **before** dispatching each stage's sub-agent, run `fab resolve-agent ` and **surface** the resolved `model=/effort=` lines (carry them into the dispatch prompt and/or echo them in this orchestrator's step output, so a skipped or mis-resolved tier is visible rather than silent), then dispatch through the two seams: the **model** half via the Agent tool's `model` parameter (empty model ⇒ omit/inherit), and the **effort** half via an explicit imperative instruction in the dispatch prompt — ``Operate at `` reasoning effort for this task.`` (empty effort ⇒ omit the instruction; the Agent tool has no effort param). The Claude Code adapter is the Agent tool's `model` parameter; the resolution itself is provider-neutral. The `review` stage (Step 2) resolves **once** and applies the same model AND the same effort-prompt instruction to both reviewer sub-agents AND the merge. ### Resumability @@ -57,7 +57,7 @@ Check `progress` from preflight. Skip stages already `done`. If `{terminal}: don *(Skip if `progress.apply` is `done`.)* Since the intake gate already passed in pre-flight, if `progress.intake` is not `done`, finish intake: `fab status finish intake {driver}` (auto-activates apply). -Resolve the apply model: run `fab resolve-agent apply` and apply the resolved model/effort to the dispatch below (empty model ⇒ omit/inherit; empty effort ⇒ omit). Dispatch `/fab-continue` as subagent — Apply Behavior, change: `{id}` (prompt includes: do NOT run `fab status`; return results only). The subagent runs both apply sub-steps in a single invocation: (1) Plan Generation — co-generate `plan.md` (`## Requirements` + `## Tasks` + `## Acceptance`) from `intake.md` per **Plan Generation Procedure** (`_generation.md`), unless `plan.md` already exists; (2) Task Execution — parse unchecked tasks under `## Tasks`, execute in dependency order, run tests, mark `[x]` on completion. Returns completion status or failure with task ID and reason. +Resolve the apply model: run `fab resolve-agent apply`, surface the resolved `model=/effort=` (echo them and/or carry them into the dispatch prompt — a skipped resolution is then visible), and apply both halves to the dispatch below — model via the Agent `model` param (empty ⇒ omit/inherit), effort via an imperative prompt instruction ``Operate at `` reasoning effort for this task.`` (empty effort ⇒ omit). Dispatch `/fab-continue` as subagent — Apply Behavior, change: `{id}` (prompt includes: do NOT run `fab status`; return results only). The subagent runs both apply sub-steps in a single invocation: (1) Plan Generation — co-generate `plan.md` (`## Requirements` + `## Tasks` + `## Acceptance`) from `intake.md` per **Plan Generation Procedure** (`_generation.md`), unless `plan.md` already exists; (2) Task Execution — parse unchecked tasks under `## Tasks`, execute in dependency order, run tests, mark `[x]` on completion. Returns completion status or failure with task ID and reason. No `/fab-clarify` runs here. Under-specified requirements are resolved inline by the apply agent as graded SRAD assumptions in `plan.md` `## Assumptions` — not via any clarify ceremony. @@ -69,7 +69,7 @@ On success: run `fab status finish apply {driver}`. *(Skip if `progress.review` is `done`.)* -Resolve the review model **once**: run `fab resolve-agent review` and apply the resolved model/effort to the review subagent dispatch — the same profile governs both reviewer sub-agents (inward + outward) and the merge (empty model ⇒ omit/inherit; empty effort ⇒ omit). Dispatch `/fab-continue` as subagent — Review Behavior, change: `{id}` (prompt includes: do NOT run `fab status`; return results only — verdict transitions belong to this orchestrator). The subagent reads `_review.md` for review dispatch instructions — both inward and outward sub-agents are defined there. It dispatches both sub-agents in parallel, merges their findings, and returns structured findings (must-fix / should-fix / nice-to-have) with pass/fail status. +Resolve the review model **once**: run `fab resolve-agent review`, surface the resolved `model=/effort=`, and apply both halves to the review subagent dispatch — the same model AND the same effort-prompt instruction (``Operate at `` reasoning effort for this task.``) govern both reviewer sub-agents (inward + outward) and the merge (empty model ⇒ omit/inherit; empty effort ⇒ omit the instruction). Dispatch `/fab-continue` as subagent — Review Behavior, change: `{id}` (prompt includes: do NOT run `fab status`; return results only — verdict transitions belong to this orchestrator). The subagent reads `_review.md` for review dispatch instructions — both inward and outward sub-agents are defined there. It dispatches both sub-agents in parallel, merges their findings, and returns structured findings (must-fix / should-fix / nice-to-have) with pass/fail status. **Pass**: run `fab status finish review {driver}`. Proceed to Step 3 (Hydrate). @@ -85,8 +85,8 @@ The agent triages the sub-agent's prioritized findings and autonomously selects 1. **Status pair**: run `fab status fail review` then `fab status reset apply {driver}`. This fail+reset pair repeats on **every** failed review verdict that starts a new cycle — not just the first failure — so every conforming run leaves the same `.status.yaml` history shape. 2. **Triage + rework action**: triage the prioritized findings, select exactly one path per the decision heuristics below, and apply its edits (uncheck tasks / edit `plan.md` / edit `## Requirements`). -3. **Re-dispatch apply**: re-run `fab resolve-agent apply` and apply the resolved model/effort, then dispatch `/fab-continue` as a subagent — Apply Behavior, same prompt contract as Step 1 (do NOT run `fab status`; return results only). On success, run `fab status finish apply {driver}` (auto-activates review). -4. **Fresh re-review**: re-run `fab resolve-agent review` (once, governing both reviewers + merge) and apply the resolved model/effort, then dispatch a **fresh** `/fab-continue` Review Behavior subagent, same prompt contract as Step 2. Never reuse a prior review subagent's context. +3. **Re-dispatch apply**: re-run `fab resolve-agent apply`, surface the resolved `model=/effort=`, and apply both halves (model via the Agent `model` param; effort via the imperative prompt instruction ``Operate at `` reasoning effort for this task.``, omitted when empty), then dispatch `/fab-continue` as a subagent — Apply Behavior, same prompt contract as Step 1 (do NOT run `fab status`; return results only). On success, run `fab status finish apply {driver}` (auto-activates review). +4. **Fresh re-review**: re-run `fab resolve-agent review` (once, governing both reviewers + merge), surface the resolved `model=/effort=`, and apply both halves (same model + same effort-prompt instruction to both reviewers and the merge, omitted when empty), then dispatch a **fresh** `/fab-continue` Review Behavior subagent, same prompt contract as Step 2. Never reuse a prior review subagent's context. 5. **Verdict**: pass → run `fab status finish review {driver}` and proceed to Step 3. Fail → if fewer than `{max_cycles}` cycles have run, start the next cycle at item 1 (the fail+reset pair fires again); after the `{max_cycles}`-th failed cycle, stop per **Stop** below. **Decision heuristics** (applied at item 2 of each cycle — disjoint: each failure description routes to exactly one path): @@ -114,7 +114,7 @@ Run /fab-continue for manual rework options. *(Skip if `progress.hydrate` is `done`.)* -Resolve the hydrate model: run `fab resolve-agent hydrate` and apply the resolved model/effort to the dispatch below (empty model ⇒ omit/inherit; empty effort ⇒ omit). Dispatch `/fab-continue` as subagent — Hydrate Behavior, change: `{id}` (prompt includes: do NOT run `fab status`; return results only). The subagent validates review passed, hydrates into `docs/memory/`, and returns completion status. +Resolve the hydrate model: run `fab resolve-agent hydrate`, surface the resolved `model=/effort=`, and apply both halves to the dispatch below — model via the Agent `model` param (empty ⇒ omit/inherit), effort via the imperative prompt instruction ``Operate at `` reasoning effort for this task.`` (empty effort ⇒ omit). Dispatch `/fab-continue` as subagent — Hydrate Behavior, change: `{id}` (prompt includes: do NOT run `fab status`; return results only). The subagent validates review passed, hydrates into `docs/memory/`, and returns completion status. On success: run `fab status finish hydrate {driver}`. diff --git a/src/kit/skills/_preamble.md b/src/kit/skills/_preamble.md index 5e94de32..44867c18 100644 --- a/src/kit/skills/_preamble.md +++ b/src/kit/skills/_preamble.md @@ -339,15 +339,22 @@ fab resolve-agent and passes the resolved profile into the Agent dispatch: - Output is two byte-stable stdout lines, `model=` and `effort=` (the `effort=` line is omitted when the tier has no effort). See `_cli-fab.md` § fab resolve-agent. -- **Empty model** ⇒ omit the dispatch `model` param entirely (inherit the orchestrator/session model — today's behavior). **Empty effort** ⇒ omit the effort. +- **Empty model** ⇒ omit the dispatch `model` param entirely (inherit the orchestrator/session model — today's behavior). **Empty effort** ⇒ omit the effort instruction (below). - The resolver maps `` → its fixed fab-owned tier → a `{model, effort}` profile (`agent.tiers` project override per-field-merged over fab-kit's default). The stage→tier mapping is NOT user-overridable; `agent.tiers` (tier redefinition) is the sole override surface. Full design: `docs/specs/stage-models.md`. - **No validation**: the resolved strings are passed through verbatim — fab enforces no effort enum and corrects no incompatible pair. Compatibility is the harness's concern. -**Harness-adapter boundary (Claude Code).** The resolution (stage→tier→`{model, effort}`) is **provider-neutral**. Injecting the resolved model into the actual dispatch is **harness-specific**: for Claude Code that adapter is the **Agent tool's `model` parameter**. This is named explicitly as the Claude-Code adapter, not as universal truth — and the coupling is not new (fab's entire subagent-dispatch design is already Claude-Code-shaped). +**The two halves dispatch through different seams (model param + prompt instruction).** The resolved profile has a model half and an effort half, and Claude Code consumes them through *different* seams: -**Review resolves once.** The `review` stage spawns two reviewer sub-agents (inward + outward) plus a merge. Resolve `fab resolve-agent review` **once** and apply the same profile to all three — the mechanical merge runs at the reviewer's tier (an accepted stage-granularity tradeoff). +- **Model → the Agent tool's `model` parameter.** Empty model ⇒ omit it (inherit the session/orchestrator model). +- **Effort → an explicit instruction in the subagent prompt.** The Agent tool has **no `effort` parameter**, so the only available seam for the effort half is the dispatched prompt itself: append an imperative line such as ``Operate at `xhigh` reasoning effort for this task.`` so the sub-agent self-selects its reasoning effort. Empty effort ⇒ omit the instruction entirely (mirroring the empty-model omit rule). This is imperfect — it relies on the sub-agent honoring the instruction — but it is the only per-sub-agent effort seam available today; a per-sub-agent `effort` parameter on the Agent tool would close it cleanly and is tracked as the residual harness ask (`docs/findings/per-stage-model-tier-application.md` § Suggested directions item 4). -**Per-stage selection applies on every post-intake stage.** Per-stage selection is a property of dispatched sub-agent runs — and **every post-intake stage now dispatches a sub-agent**, including plain `/fab-continue` (which is a one-stage sequencer that resolves `fab resolve-agent ` and dispatches its stage's block — see `fab-continue.md` Normal Flow Step 1). So there is no post-intake foreground execution path left to be the exception; `fab resolve-agent` applies uniformly across apply/review/hydrate regardless of caller. Intake is pre-boundary (it runs in the main session and is not tiered by `/fab-continue`). The only residual "advisory" case is a stage skill genuinely run with no dispatch at all — fab cannot switch the session model mid-run, so such a skill MAY note "this stage is configured for X; you're on Y" but MUST NOT attempt to switch. *(Full closure of the model-tier finding's uniform-tier story — including the per-sub-agent effort knob — is tracked separately as a follow-up change; this paragraph only states the post-intake single-mode reality.)* +**Compliance visibility.** Each dispatch site MUST **surface the resolved `model=/effort=` lines** — carry them into the dispatch prompt (the effort line is already there per the seam above; co-locate the model line) and/or emit them in the orchestrator's own step output — so a *skipped* `fab resolve-agent` call (the sub-agent silently inherits the session profile) or a mis-resolved tier is **visible in output rather than silent**. There is no code-level guard fab can install (dispatch is harness-internal), so visibility is the available seam. An all-empty resolution (both `model=` and `effort=` empty) is itself worth surfacing — treat a non-empty resolved line as the expected case and a fully-empty result as a signal to flag rather than dispatch blind. + +**Harness-adapter boundary (Claude Code).** The resolution (stage→tier→`{model, effort}`) is **provider-neutral**. Injecting the resolved model into the actual dispatch is **harness-specific**: for Claude Code that adapter is the **Agent tool's `model` parameter**, and the effort half is injected via the subagent-prompt instruction described above. One concrete harness detail: the Agent tool's `model` param takes a **short alias** (`opus` / `sonnet` / `haiku` / `fable`), **not** the full versioned id (`claude-opus-4-8`) that `fab resolve-agent` emits — the orchestrator maps the resolved id to the alias at the dispatch seam. This is named explicitly as the Claude-Code adapter, not as universal truth — and the coupling is not new (fab's entire subagent-dispatch design is already Claude-Code-shaped). + +**Review resolves once.** The `review` stage spawns two reviewer sub-agents (inward + outward) plus a merge. Resolve `fab resolve-agent review` **once** and apply the same profile to all three — the same model and the same effort-prompt instruction govern both reviewers and the merge; the mechanical merge runs at the reviewer's tier (an accepted stage-granularity tradeoff). + +**Per-stage selection applies on every post-intake stage.** Per-stage selection is a property of dispatched sub-agent runs — and **every post-intake stage now dispatches a sub-agent**, including plain `/fab-continue` (which is a one-stage sequencer that resolves `fab resolve-agent ` and dispatches its stage's block — see `fab-continue.md` Normal Flow Step 1). So there is no post-intake foreground execution path left to be the exception; `fab resolve-agent` applies uniformly across apply/review/hydrate regardless of caller. Intake is pre-boundary (it runs in the main session and is not tiered by `/fab-continue`). The only residual "advisory" case is a stage skill genuinely run with no dispatch at all — fab cannot switch the session model mid-run, so such a skill MAY note "this stage is configured for X; you're on Y" but MUST NOT attempt to switch. *(The effort half of the tier is now injected via the subagent-prompt instruction described above — see "The two halves dispatch through different seams"; the lone remaining residual is a first-class per-sub-agent `effort` parameter on the Agent tool, which is a harness ask outside fab's control, not a fab gap.)* --- diff --git a/src/kit/skills/fab-continue.md b/src/kit/skills/fab-continue.md index 794d9883..3db8d78e 100644 --- a/src/kit/skills/fab-continue.md +++ b/src/kit/skills/fab-continue.md @@ -16,7 +16,7 @@ helpers: [_srad] Advance through the 6-stage Fab pipeline one step at a time. Each invocation handles the current stage's work and transitions to the next. When called with a stage argument, resets to that stage and re-runs from there. -> **Per-stage model (one-stage sequencer)**: post-intake `/fab-continue` is a **one-stage sequencer** — it dispatches its stage as a sub-agent (see Normal Flow Step 1), and per-stage model selection (`fab resolve-agent ` → `agent.tiers`, see `_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution) is resolved **once, immediately before that dispatch**, and applied to the sub-agent. There is no foreground execution path for apply/review/hydrate to leave the tier merely advisory: every post-intake stage runs dispatched, so `fab resolve-agent ` applies uniformly. (Intake is pre-boundary — it runs in the main session and is not tiered by `/fab-continue`.) +> **Per-stage model (one-stage sequencer)**: post-intake `/fab-continue` is a **one-stage sequencer** — it dispatches its stage as a sub-agent (see Normal Flow Step 1), and per-stage model selection (`fab resolve-agent ` → `agent.tiers`, see `_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution) is resolved **once, immediately before that dispatch**, surfaced (so a skipped/mis-resolved tier is visible), and applied through both seams — model via the Agent tool's `model` param, effort via an imperative instruction in the sub-agent's prompt (empty effort ⇒ omit). There is no foreground execution path for apply/review/hydrate to leave the tier merely advisory: every post-intake stage runs dispatched, so `fab resolve-agent ` applies uniformly. (Intake is pre-boundary — it runs in the main session and is not tiered by `/fab-continue`.) --- @@ -49,7 +49,7 @@ Dispatch on preflight's derived `stage` and `display_state`. If progress is `pen - **`active`** (intake) → Generate intake if missing and advance to `ready` (backward compat for interrupted generations) — main session, no dispatch - **`active`/`ready`** (post-intake execution) → resolve the stage's model, dispatch the stage's block as a sub-agent, then finish/fail/reset per the returned result -**Sub-agent dispatch contract (post-intake stages).** For apply / review / hydrate, before dispatching run `fab resolve-agent ` and pass the resolved model/effort into the Agent dispatch (empty model ⇒ omit/inherit; empty effort ⇒ omit) per `_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution. Dispatch a sub-agent (Agent tool, `general-purpose`) to execute the named Behavior section *of this file* — this is a fresh-context sub-agent run of that block, NOT a recursive `/fab-continue` invocation — and the prompt MUST include **"do NOT run `fab status` commands; return results only"** plus the standard subagent context files. The sequencer (this `/fab-continue` invocation) runs the `finish`/`fail`/`reset` transition after the sub-agent returns. This is the universal block contract — the block never owns its transitions, regardless of caller. +**Sub-agent dispatch contract (post-intake stages).** For apply / review / hydrate, before dispatching run `fab resolve-agent `, surface the resolved `model=/effort=` (so a skipped/mis-resolved tier is visible), and apply both halves to the Agent dispatch — model via the Agent `model` param (empty ⇒ omit/inherit), effort via an imperative instruction in the sub-agent's prompt (empty effort ⇒ omit; the Agent tool has no effort param) — per `_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution. Dispatch a sub-agent (Agent tool, `general-purpose`) to execute the named Behavior section *of this file* — this is a fresh-context sub-agent run of that block, NOT a recursive `/fab-continue` invocation — and the prompt MUST include **"do NOT run `fab status` commands; return results only"** plus the standard subagent context files. The sequencer (this `/fab-continue` invocation) runs the `finish`/`fail`/`reset` transition after the sub-agent returns. This is the universal block contract — the block never owns its transitions, regardless of caller. | Derived stage | State | Action | |---------------|-------|--------| @@ -158,7 +158,7 @@ Plan Generation sub-step is skipped when `plan.md` already exists (idempotent on Read `.claude/skills/_review/SKILL.md` (if not already loaded), then execute its **Shared Review Dispatch** end-to-end (Preconditions → Inward + Outward Sub-Agent Dispatch → Parallel Dispatch → Findings Merge). The `_review.md` skill defines both sub-agent dispatches (inward + outward) run in parallel, their preconditions, validation steps, structured output format, and the findings merge procedure. When dispatching the inward sub-agent, read `change_type` from the change's `.status.yaml` and pass it in the prompt per `_review.md`'s context contract (its Steps 7–8 skip condition keys on it). -> **Per-stage model resolution (nested reviewers)**: before dispatching the inward + outward reviewer sub-agents, run `fab resolve-agent review` **once** and apply the resolved model/effort to BOTH reviewers and the merge (empty model ⇒ omit/inherit; empty effort ⇒ omit) — per `_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution. (This is the review block resolving the tier for its own nested sub-agents; it is independent of the sequencer's resolution of the `review` stage when it dispatched this block.) The Claude Code adapter is the Agent tool `model` parameter; the resolution itself is provider-neutral. +> **Per-stage model resolution (nested reviewers)**: before dispatching the inward + outward reviewer sub-agents, run `fab resolve-agent review` **once**, surface the resolved `model=/effort=`, and apply both halves to BOTH reviewers and the merge — the same model via the Agent `model` param (empty ⇒ omit/inherit) and the same imperative effort instruction (``Operate at `` reasoning effort for this task.``) in each reviewer's prompt (empty effort ⇒ omit) — per `_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution. (This is the review block resolving the tier for its own nested sub-agents; it is independent of the sequencer's resolution of the `review` stage when it dispatched this block.) The Claude Code adapter is the Agent tool `model` parameter (effort rides the prompt); the resolution itself is provider-neutral. ### Verdict diff --git a/src/kit/skills/fab-ff.md b/src/kit/skills/fab-ff.md index 015878f5..bfe6d957 100644 --- a/src/kit/skills/fab-ff.md +++ b/src/kit/skills/fab-ff.md @@ -34,7 +34,7 @@ Execute the **shared pipeline bracket** (`_pipeline.md`, loaded via `helpers:`) The bracket defines everything else: pre-flight (intake prerequisite + intake gate), context loading, resumability, Steps 1–3 (apply → review → hydrate), the auto-rework loop with its per-cycle choreography, and the exhaustion stop. -> **Per-stage model**: each stage dispatch in the bracket resolves `fab resolve-agent ` first and passes the resolved model+effort to the Agent dispatch (empty ⇒ omit/inherit) — see `_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution. +> **Per-stage model**: each stage dispatch in the bracket resolves `fab resolve-agent ` first, surfaces the resolved `model=/effort=` (so a skipped or mis-resolved tier is visible, not silent), then dispatches through the two seams — model via the Agent tool's `model` param (empty ⇒ omit/inherit) and effort via an imperative instruction in the dispatch prompt (``Operate at `` reasoning effort for this task.``; empty effort ⇒ omit, since the Agent tool has no effort param) — see `_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution. --- diff --git a/src/kit/skills/fab-fff.md b/src/kit/skills/fab-fff.md index 96c7a5c5..e618f236 100644 --- a/src/kit/skills/fab-fff.md +++ b/src/kit/skills/fab-fff.md @@ -34,7 +34,7 @@ Execute the **shared pipeline bracket** (`_pipeline.md`, loaded via `helpers:`) The bracket defines pre-flight (intake prerequisite + intake gate), context loading, resumability, Steps 1–3 (apply → review → hydrate), the auto-rework loop with its per-cycle choreography, and the exhaustion stop. The two steps below are fff-only. -> **Per-stage model**: every stage dispatch (the bracket's Steps 1–3 and the fff-only Steps 4–5 below) resolves `fab resolve-agent ` first and passes the resolved model+effort to the Agent dispatch (empty ⇒ omit/inherit) — see `_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution. +> **Per-stage model**: every stage dispatch (the bracket's Steps 1–3 and the fff-only Steps 4–5 below) resolves `fab resolve-agent ` first, surfaces the resolved `model=/effort=` (so a skipped or mis-resolved tier is visible, not silent), then dispatches through the two seams — model via the Agent tool's `model` param (empty ⇒ omit/inherit) and effort via an imperative instruction in the dispatch prompt (``Operate at `` reasoning effort for this task.``; empty effort ⇒ omit, since the Agent tool has no effort param) — see `_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution. > **Dispatch exception (ship and review-pr)**: unlike the bracket's `/fab-continue`-behavior subagents, `/git-pr` and `/git-pr-review` manage their own stage transitions internally — their subagent prompts do NOT carry the "do not run `fab status`" instruction. @@ -44,7 +44,7 @@ The bracket defines pre-flight (intake prerequisite + intake gate), context load *(Skip if `progress.ship` is `done`.)* -Resolve the ship model: run `fab resolve-agent ship` and apply the resolved model/effort to the dispatch below (empty model ⇒ omit/inherit; empty effort ⇒ omit) — per `_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution. Dispatch `/git-pr` as subagent — the prompt instructs it to invoke `/git-pr {name}` (the **explicit change argument**, using the folder name per the `{name}` note above: git-pr resolves it as a transient override, so the subagent targets this pipeline's change rather than self-resolving the active one, and its branch-matches-change guard verifies the checked-out branch before mutating anything). The subagent commits, pushes, and creates a GitHub PR. Handles `fab status` integration internally (start/finish ship stage). Returns PR URL or error. +Resolve the ship model: run `fab resolve-agent ship`, surface the resolved `model=/effort=`, and apply both halves to the dispatch below — model via the Agent `model` param (empty ⇒ omit/inherit), effort via the imperative prompt instruction ``Operate at `` reasoning effort for this task.`` (empty effort ⇒ omit) — per `_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution. Dispatch `/git-pr` as subagent — the prompt instructs it to invoke `/git-pr {name}` (the **explicit change argument**, using the folder name per the `{name}` note above: git-pr resolves it as a transient override, so the subagent targets this pipeline's change rather than self-resolving the active one, and its branch-matches-change guard verifies the checked-out branch before mutating anything). The subagent commits, pushes, and creates a GitHub PR. Handles `fab status` integration internally (start/finish ship stage). Returns PR URL or error. **If git-pr fails**: STOP with the error from git-pr. The ship stage remains `active` for user retry. @@ -54,7 +54,7 @@ On success: `progress.ship` becomes `done`, `progress.review-pr` auto-activates. *(Skip if `progress.review-pr` is `done`.)* -Resolve the review-pr model: run `fab resolve-agent review-pr` and apply the resolved model/effort to the dispatch below (empty model ⇒ omit/inherit; empty effort ⇒ omit) — per `_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution. Dispatch `/git-pr-review` as subagent — the prompt instructs it to invoke `/git-pr-review {name}` (the **explicit change argument**, same transient-override + branch-guard contract as Step 4). The subagent detects existing reviews, triages comments, applies fixes, and pushes. If no reviews exist, it requests a Copilot review and polls up to 10 minutes — see the timeout outcome below. Handles `fab status` integration internally (start/finish/fail review-pr stage). Returns completion status. +Resolve the review-pr model: run `fab resolve-agent review-pr`, surface the resolved `model=/effort=`, and apply both halves to the dispatch below — model via the Agent `model` param (empty ⇒ omit/inherit), effort via the imperative prompt instruction ``Operate at `` reasoning effort for this task.`` (empty effort ⇒ omit) — per `_preamble.md` § Subagent Dispatch → Per-Stage Model Resolution. Dispatch `/git-pr-review` as subagent — the prompt instructs it to invoke `/git-pr-review {name}` (the **explicit change argument**, same transient-override + branch-guard contract as Step 4). The subagent detects existing reviews, triages comments, applies fixes, and pushes. If no reviews exist, it requests a Copilot review and polls up to 10 minutes — see the timeout outcome below. Handles `fab status` integration internally (start/finish/fail review-pr stage). Returns completion status. **If review-pr fails** (no PR found, processing error): STOP with the error. From f179674c04489e4b855a4bbfe54901c88a890856 Mon Sep 17 00:00:00 2001 From: Sahil Ahuja Date: Sat, 13 Jun 2026 19:30:25 +0530 Subject: [PATCH 2/3] Update ship status and record PR URL --- .../.history.jsonl | 1 + .../.status.yaml | 12 +++++++----- 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/fab/changes/260613-m3d4-uniform-stage-model-tier/.history.jsonl b/fab/changes/260613-m3d4-uniform-stage-model-tier/.history.jsonl index 6e45044c..06b23e30 100644 --- a/fab/changes/260613-m3d4-uniform-stage-model-tier/.history.jsonl +++ b/fab/changes/260613-m3d4-uniform-stage-model-tier/.history.jsonl @@ -23,3 +23,4 @@ {"action":"enter","driver":"fab-fff","event":"stage-transition","stage":"hydrate","ts":"2026-06-13T13:51:37Z"} {"event":"review","result":"passed","ts":"2026-06-13T13:51:37Z"} {"action":"enter","driver":"fab-fff","event":"stage-transition","stage":"ship","ts":"2026-06-13T13:58:47Z"} +{"action":"enter","driver":"git-pr","event":"stage-transition","stage":"review-pr","ts":"2026-06-13T14:00:21Z"} diff --git a/fab/changes/260613-m3d4-uniform-stage-model-tier/.status.yaml b/fab/changes/260613-m3d4-uniform-stage-model-tier/.status.yaml index 20c4b030..1e9f624d 100644 --- a/fab/changes/260613-m3d4-uniform-stage-model-tier/.status.yaml +++ b/fab/changes/260613-m3d4-uniform-stage-model-tier/.status.yaml @@ -9,8 +9,8 @@ progress: apply: done review: done hydrate: done - ship: active - review-pr: pending + ship: done + review-pr: active plan: generated: true task_count: 15 @@ -33,8 +33,10 @@ stage_metrics: apply: {started_at: "2026-06-13T13:41:37Z", driver: fab-fff, iterations: 2, completed_at: "2026-06-13T13:45:15Z"} review: {started_at: "2026-06-13T13:45:15Z", driver: fab-fff, iterations: 2, completed_at: "2026-06-13T13:51:37Z"} hydrate: {started_at: "2026-06-13T13:51:37Z", driver: fab-fff, iterations: 1, completed_at: "2026-06-13T13:58:47Z"} - ship: {started_at: "2026-06-13T13:58:47Z", driver: fab-fff, iterations: 1} -prs: [] + ship: {started_at: "2026-06-13T13:58:47Z", driver: fab-fff, iterations: 1, completed_at: "2026-06-13T14:00:21Z"} + review-pr: {started_at: "2026-06-13T14:00:21Z", driver: git-pr, iterations: 1} +prs: + - https://github.com/sahil87/fab-kit/pull/413 true_impact: added: 649 deleted: 58 @@ -50,4 +52,4 @@ true_impact: computed_at: "2026-06-13T13:58:47Z" computed_at_stage: hydrate # true_impact: lazily created on first apply-finish (no placeholder here). -last_updated: 2026-06-13T13:58:47Z +last_updated: 2026-06-13T14:00:21Z From cc502dfa076303ab917bb9789a6d5487bec91f56 Mon Sep 17 00:00:00 2001 From: Sahil Ahuja Date: Sat, 13 Jun 2026 19:37:09 +0530 Subject: [PATCH 3/3] Update review-pr status --- .../260613-m3d4-uniform-stage-model-tier/.history.jsonl | 1 + .../260613-m3d4-uniform-stage-model-tier/.status.yaml | 6 +++--- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/fab/changes/260613-m3d4-uniform-stage-model-tier/.history.jsonl b/fab/changes/260613-m3d4-uniform-stage-model-tier/.history.jsonl index 06b23e30..94c5b737 100644 --- a/fab/changes/260613-m3d4-uniform-stage-model-tier/.history.jsonl +++ b/fab/changes/260613-m3d4-uniform-stage-model-tier/.history.jsonl @@ -24,3 +24,4 @@ {"event":"review","result":"passed","ts":"2026-06-13T13:51:37Z"} {"action":"enter","driver":"fab-fff","event":"stage-transition","stage":"ship","ts":"2026-06-13T13:58:47Z"} {"action":"enter","driver":"git-pr","event":"stage-transition","stage":"review-pr","ts":"2026-06-13T14:00:21Z"} +{"event":"review","result":"passed","ts":"2026-06-13T14:07:01Z"} diff --git a/fab/changes/260613-m3d4-uniform-stage-model-tier/.status.yaml b/fab/changes/260613-m3d4-uniform-stage-model-tier/.status.yaml index 1e9f624d..66df2fe1 100644 --- a/fab/changes/260613-m3d4-uniform-stage-model-tier/.status.yaml +++ b/fab/changes/260613-m3d4-uniform-stage-model-tier/.status.yaml @@ -10,7 +10,7 @@ progress: review: done hydrate: done ship: done - review-pr: active + review-pr: done plan: generated: true task_count: 15 @@ -34,7 +34,7 @@ stage_metrics: review: {started_at: "2026-06-13T13:45:15Z", driver: fab-fff, iterations: 2, completed_at: "2026-06-13T13:51:37Z"} hydrate: {started_at: "2026-06-13T13:51:37Z", driver: fab-fff, iterations: 1, completed_at: "2026-06-13T13:58:47Z"} ship: {started_at: "2026-06-13T13:58:47Z", driver: fab-fff, iterations: 1, completed_at: "2026-06-13T14:00:21Z"} - review-pr: {started_at: "2026-06-13T14:00:21Z", driver: git-pr, iterations: 1} + review-pr: {started_at: "2026-06-13T14:00:21Z", driver: git-pr, iterations: 1, completed_at: "2026-06-13T14:07:01Z"} prs: - https://github.com/sahil87/fab-kit/pull/413 true_impact: @@ -52,4 +52,4 @@ true_impact: computed_at: "2026-06-13T13:58:47Z" computed_at_stage: hydrate # true_impact: lazily created on first apply-finish (no placeholder here). -last_updated: 2026-06-13T14:00:21Z +last_updated: 2026-06-13T14:07:01Z