diff --git a/AGENTS.md b/AGENTS.md index f41139b..11f3013 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -371,10 +371,18 @@ For `no-mistakes`-mode ship tasks, when a crewmate's status says `done`, trigger Load `harness-adapters` for the target harness's skill invocation form; natural language also works if uncertain. The crewmate drives the no-mistakes pipeline (review, test, document, lint, push, PR, CI) itself. -The no-mistakes pipeline fixes auto-fix findings on its own (inside its own worktree); the crewmate advances each gate with `no-mistakes axi respond`, and must never edit or commit code while a run is active. +The no-mistakes pipeline owns every validation fix on the crewmate's branch in its own worktree; the crewmate advances each gate with `no-mistakes axi respond`, and must never hand-edit, commit, reset, checkout, abort, or re-run while a run is active. When it reports `needs-decision` (ask-user findings), relay the findings to the captain unless `yolo=on` permits routine approval on your judgment, then send the decision back as a short instruction (the crewmate responds via `no-mistakes axi respond`). Use chat for yes/no decisions; use lavish-axi when there are multiple findings or options to triage. +Judge a validating crewmate by the run's step status, never by whether its shell is still running; read it cheaply with `no-mistakes axi status`. + +- `running`/`fixing`/`ci` - the pipeline is working (a fix round, a test, or CI monitoring); these run for many minutes and quiet is normal, so leave it alone. +- `awaiting_approval`/`fix_review` - the run is parked waiting on the agent, surfaced as a top-level `awaiting_agent: parked ` line right after `status:` in `axi status`. + The crewmate owes a `respond`; if it is idle-waiting for the run to advance on its own, steer it to drive the gate, because a parked gate never self-resolves. +- Red flag - self-fix duplication: a validating crewmate making fresh hand-commits, aborting the run, or re-running it mid-validation is re-doing work the pipeline already owns. + Steer it back to respond-only: the pipeline applies every fix on the branch from `axi respond`, including the fix for a real bug the review found in the crewmate's own code, and hand-fixing forces a full re-validation. + ### PR ready For PR-based ship tasks, the ready signal depends on mode: `no-mistakes` reports `done: PR checks green` after CI is green, while `direct-PR` reports `done: PR ` after opening the PR. @@ -491,7 +499,7 @@ Watcher liveness is not enough if you are foreground-blocked. Whenever one or more tasks are in flight, do not run long foreground-blocking operations in your own session. This is about firstmate's own session: it includes a no-mistakes pipeline firstmate runs for this repo, long builds, and any other multi-minute command. Background that work so watcher wakes can interleave with it and the supervision loop stays responsive. -A crewmate driving its own `no-mistakes` validation does the opposite: it runs that gate drive in the foreground and drives it synchronously, never backgrounding or idle-waiting on its own validation run. +A crewmate driving its own `no-mistakes` validation does the opposite: it drives that gate loop synchronously and processes every return, never idle-waiting for its own validation run to advance on its own. Token discipline: status files before panes; default peeks to 40 lines; never stream a pane repeatedly through yourself; batch what you tell the captain. The context-% shown in a peek is not actionable as crew health; ignore it and intervene only on real signals (`signal`, `stale`, `needs-decision`, `blocked`), looping or confusion in the pane, or a question the brief already answers. diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 0d0a72b..8dabf20 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -49,8 +49,9 @@ See the [no-mistakes quick start](https://kunchenguid.github.io/no-mistakes/star Tracked changes to firstmate itself - `AGENTS.md`, `README.md`, `CONTRIBUTING.md`, `.tasks.toml`, `.github/workflows/`, `bin/`, and agent skill files - ship through the `no-mistakes` pipeline on a feature branch and require an explicit merge approval. When supervising live crewmates, keep firstmate's own long validation or build commands in the background so watcher wakes can still be handled. -A crewmate driving its own `no-mistakes` validation does the opposite: it runs the gate in the foreground and lets each synchronous `no-mistakes axi run` or `no-mistakes axi respond` call return. -The pipeline owns auto-fix changes; the crewmate authorizes them with `no-mistakes axi respond --action fix --findings ` instead of editing or committing while the run is active. +Unlike firstmate, a crewmate owns its own validation gate loop: it processes every `no-mistakes axi run` or `no-mistakes axi respond` return, responds to gates, and never waits for a parked gate to self-resolve. +The pipeline owns every validation fix, including auto-fix findings and fixes for real bugs found in the crewmate's own code; the crewmate authorizes or answers with `no-mistakes axi respond` instead of editing, committing, aborting, or re-running while the run is active. +Do not use `--yes` for crewmate validation because it silently resolves `ask-user` findings without escalation. Local `.no-mistakes/` state and test evidence stay out of this repo; `.no-mistakes.yaml` keeps evidence in a temp directory instead. Check and test the toolbelt before pushing: diff --git a/bin/fm-brief.sh b/bin/fm-brief.sh index 1193d7e..b10b7f7 100755 --- a/bin/fm-brief.sh +++ b/bin/fm-brief.sh @@ -207,11 +207,24 @@ The task is complete only when committed on your branch. When you believe it is complete, append \`done: {summary}\` to the status file and stop. Firstmate will then instruct you to run /no-mistakes to validate and ship a PR. -During validation you drive the gates while the pipeline owns the fixes. Run it in the foreground and follow this contract: -- Never edit or \`git commit\` code yourself while a run is active; the pipeline applies every fix in its own worktree. -- When a gate shows auto-fix findings, advance it with \`no-mistakes axi respond --action fix --findings \` (the pipeline applies the fix and re-reviews). Escalate ask-user findings per rule 6. -- \`no-mistakes axi run\` and \`axi respond\` block synchronously for many minutes (test and CI especially); the pipeline often fixes findings itself with no gate, so when a call returns no \`gate:\` object that is normal - just let it return. -- Never cancel, abort, re-run, or background the run, and never idle-wait for a background notification: the call is in the foreground and returns on its own. +During validation the pipeline owns every fix; you only drive the gates. +Once a run is active, every fix - both auto-fix findings and the fix for a real bug the review finds in your own code - is applied by the pipeline on your branch, in its own worktree. +Never hand-edit, \`git commit\`, \`git reset\`/\`git checkout\`, abort, or re-run while a run is active: doing so duplicates the pipeline's work and forces a full re-validation. +You advance the work only by responding to gates: +- Process every return; never idle-wait. + \`no-mistakes axi run\` / \`axi respond\` return either at a gate (a \`gate:\` object) or at a terminal or CI-ready outcome (no \`gate:\`). + A run legitimately runs long - test, CI, and each fix round take many minutes - so a quiet call is working, not stalled. + Backgrounding the call is fine; idle-waiting for the run to advance on its own is not, because it never advances past a gate by itself. + Read every return: on a \`gate:\`, respond, and loop until you reach an outcome. +- Auto-fix findings: advance the gate with \`no-mistakes axi respond --action fix --findings \`; the pipeline applies the fix on your branch and re-reviews. + You never apply it yourself. +- Review findings always gate. + Review auto-fix is disabled, so every actionable review finding parks for your response instead of being self-fixed - drive it like any other gate. +- ask-user findings: escalate to firstmate (rule 6) and stop. + When the decision comes back, feed it to the gate with \`no-mistakes axi respond\` (the \`--action\`/answer the decision implies) and let the pipeline apply it. + Even when it is a real bug in your own code, do NOT implement the decided fix yourself and do NOT abort to go fix it - the pipeline applies it from your \`respond\`. +- Avoid \`--yes\`: it silently auto-resolves every finding, including \`ask-user\`, with zero escalation, so a decision the captain should make gets resolved without them. + Drive gates manually and escalate \`ask-user\` findings. After /no-mistakes reports CI green, append \`done: PR {url} checks green\` and stop. You are finished. EOF