14 changes: 9 additions & 5 deletions AGENTS.md
@@ -32,7 +32,8 @@ Hard rules, in priority order:
The one standing, captain-authorized relaxation is a project's `yolo` flag (section 7): with `yolo` on, firstmate makes routine approval decisions itself, but anything destructive, irreversible, or security-sensitive still escalates to the captain.
3. **Never tear down a worktree that holds unlanded work.**
`bin/fm-teardown.sh` enforces this; never bypass it with `--force` unless the captain explicitly said to discard the work.
The work is "landed" once `HEAD` is reachable from any remote-tracking branch (a fork counts as a remote - upstream-contribution PRs pushed to a fork satisfy this in any mode); for `local-only` ship tasks with no remote at all, the work may instead be merged into the local default branch.
The work is "landed" once `HEAD` is reachable from any remote-tracking branch (a fork counts as a remote - upstream-contribution PRs pushed to a fork satisfy this in any mode); for a normal ship task whose commits are not so reachable, it is also landed when its PR is merged and GitHub reports the current worktree HEAD as that PR's head (which covers the common squash-merge-then-delete-branch flow, where the branch's commits live nowhere on a remote yet the recorded work merged) or when its content is already present in the up-to-date default branch; for `local-only` ship tasks with no remote at all, the work may instead be merged into the local default branch.
Uncommitted changes are never landed.
The scout carve-out: a scout task's worktree is declared scratch from the start - its deliverable is the report, and teardown lets the worktree go once that report exists (section 7).
4. **Crewmates never address the captain.**
All crewmate communication flows through you.
@@ -81,7 +82,7 @@ projects/ cloned repos; gitignored; READ-ONLY for you
state/ volatile runtime signals; gitignored
<id>.status appended by crewmates: "<state>: <note>" lines
<id>.turn-ended touched by turn-end hooks
<id>.meta written by fm-spawn: window=, worktree=, project=, harness=, kind=, mode=, yolo=; kind=secondmate also records home= and projects= (fm-pr-check appends pr=)
<id>.meta written by fm-spawn: window=, worktree=, project=, harness=, kind=, mode=, yolo=; kind=secondmate also records home= and projects= (fm-pr-check appends pr= and verified pr_head= when available)
<id>.check.sh optional slow poll you write per task (e.g. merged-PR check)
.wake-queue durable queued wakes: epoch<TAB>seq<TAB>kind<TAB>key<TAB>payload
.afk durable away-mode flag; present = sub-supervisor may inject escalations (set by /afk, cleared on user return)
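As a rough illustration of the `key=value` meta format listed above: fields such as `pr=` are appended rather than rewritten, so a reader generally wants the most recent entry for a key. The sketch below assumes that convention; `meta_last` and the sample path are hypothetical names, not part of the scripts.

```shell
# Hypothetical helper: print the last value recorded for KEY in a meta file.
# Prints nothing when the key (or the file) is missing.
meta_last() {
  local file key
  file=$1
  key=$2
  # "^key=" does not match longer keys such as "pr_head=" when key is "pr".
  # cut -f2- keeps any '=' that appears inside the value (a URL query, say).
  grep "^$key=" "$file" 2>/dev/null | tail -n 1 | cut -d= -f2-
}
```

Usage might look like `meta_last state/<id>.meta pr_head`, with the path filled in for the task at hand.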
@@ -356,7 +357,7 @@ Because `fm-send` to a `kind=secondmate` target marks the request as from-firstm
A ship task's path from `done` to landed on `main` is set by the project's `mode` (recorded in meta; section 6); `yolo` decides who approves. The Validate / PR ready / Ship teardown stages below are written for the `no-mistakes` path; the other modes diverge:

- **no-mistakes** - the stages below as written: no-mistakes validation pipeline -> PR -> captain merge.
- **direct-PR** - no pipeline. The crewmate pushes and opens the PR itself (its brief says so) and reports `done: PR <url>`. Skip the Validate step and go straight to PR ready (run `fm-pr-check`, relay the PR). Teardown uses the normal pushed-branch check.
- **direct-PR** - no pipeline. The crewmate pushes and opens the PR itself (its brief says so) and reports `done: PR <url>`. Skip the Validate step and go straight to PR ready (run `fm-pr-check`, relay the PR). Teardown uses the normal landed-work check.
- **local-only** - no remote, no PR. The crewmate stops at `done: ready in branch fm/<id>`. Review the diff with `bin/fm-review-diff.sh <id>`, relay a one-paragraph summary to the captain, and on approval run `bin/fm-merge-local.sh <id>` to fast-forward local `main` (it refuses anything but a clean fast-forward - if it does, have the crewmate rebase). No `fm-pr-check`. Then teardown, whose safety check requires the branch already merged into local `main`, OR the work pushed to any remote (a fork counts - relevant for upstream-contribution PRs on a local-only-registered project).

When reviewing any crewmate branch diff, use `bin/fm-review-diff.sh <id>` rather than `git diff <default>...branch` directly.
@@ -377,7 +378,7 @@ Use chat for yes/no decisions; use lavish-axi when there are multiple findings o
### PR ready

For PR-based ship tasks, the ready signal depends on mode: `no-mistakes` reports `done: PR <url> checks green` after CI is green, while `direct-PR` reports `done: PR <url>` after opening the PR.
Run `bin/fm-pr-check.sh <id> <PR url>` - it records `pr=` in the task's meta and arms the watcher's merge poll.
Run `bin/fm-pr-check.sh <id> <PR url>` - it records `pr=` and, when available, a verified `pr_head=` in the task's meta, then arms the watcher's merge poll.
Tell the captain: the PR's full URL (always the complete `https://...` link, never a bare `#number` - the captain's terminal makes a full URL clickable), a one-paragraph summary, and, for `no-mistakes`, the risk level it emitted.
(The check contract, for any custom `state/<id>.check.sh` you write yourself: print one line only when firstmate should wake, print nothing otherwise, and finish before `FM_CHECK_TIMEOUT`.)
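
The check contract above can be sketched in a few lines. This is an illustration only: `check_merged` is a hypothetical function that takes the PR state as an argument instead of querying GitHub, and the `merged: <url>` wording of the wake line is an assumption, not the format the real check emits.

```shell
# Hypothetical check body: print exactly one line when firstmate should wake,
# print nothing otherwise. In a real check the state would come from a PR
# lookup that has to finish before FM_CHECK_TIMEOUT.
check_merged() {
  local state url
  state=$1
  url=$2
  case "$state" in
    MERGED) printf 'merged: %s\n' "$url" ;;
    *) : ;;  # silence means "keep sleeping"
  esac
}
```

Keeping the decision in a pure function like this makes the output rule easy to test separately from the network call.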

@@ -389,7 +390,10 @@ If the captain says "merge it", run `gh-axi pr merge` yourself; that instruction
bin/fm-teardown.sh <id>
```

The script refuses if the worktree holds unpushed work; treat a refusal as a stop-and-investigate, not an obstacle.
The script refuses if the worktree holds uncommitted changes or committed work that has not landed; treat a refusal as a stop-and-investigate, not an obstacle.
"Landed" is broader than remote-reachable: for a normal ship task whose commits are not reachable from any remote-tracking branch, the script also accepts the work when its PR is merged and GitHub reports the current worktree HEAD as that PR's head, or when its content is already present in the up-to-date default branch.
This recognizes the common squash-merge-then-delete-branch flow, where the branch's own commits live nowhere on a remote yet the change is fully in `main`; a merged-and-deleted branch now tears down cleanly instead of false-refusing.
Genuinely unlanded work (no matching merged PR head and content not in the default branch) and dirty worktrees still refuse, and a gh lookup error falls back to the content check rather than silently allowing.
Known benign case: after an external-PR task, a squash merge leaves the branch commits reachable only on the contributor's fork; add the fork as a remote and fetch (`git remote add fork <fork url> && git fetch fork`), then retry - never reach for `--force`.
After a successful PR-based teardown, it also runs `bin/fm-fleet-sync.sh` for that project, best-effort, so the clone's local default catches up to the merge and the just-merged branch, now gone on the remote and free of its worktree, is pruned immediately.
Then update the backlog using the teardown reminder: run `tasks-axi done` when the compatible tool is available, otherwise move the task to Done in `data/backlog.md` manually with the full `https://...` PR URL or local merge note and date and keep Done to the 10 most recent.
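
The order of the teardown checks described in this section can be summarized as a small decision function. This is a sketch, not the script's code: `teardown_verdict` and its four yes/no arguments are hypothetical stand-ins for the dirty check, the remote-reachability test, the merged-PR-head check, and the content check. It covers the normal ship-task path only; the `local-only` merge fallback, the scout carve-out, and `--force` are left out.

```shell
# Hypothetical summary of the decision order for a normal ship task.
# Each argument is "yes" or "no".
teardown_verdict() {
  local dirty reachable pr_head_merged content_present
  dirty=$1
  reachable=$2
  pr_head_merged=$3
  content_present=$4
  # Uncommitted changes are never landed: refuse before anything else.
  if [ "$dirty" = yes ]; then echo refuse; return; fi
  # Fast accept: HEAD is reachable from some remote-tracking branch.
  if [ "$reachable" = yes ]; then echo allow; return; fi
  # Landed anyway: merged PR whose head is the current HEAD, or content
  # already present in the up-to-date default branch.
  if [ "$pr_head_merged" = yes ] || [ "$content_present" = yes ]; then
    echo allow
    return
  fi
  echo refuse
}
```

For instance, `teardown_verdict no no yes no` models the squash-merge-then-delete-branch case and yields `allow`.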
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -74,7 +74,7 @@ tests/fm-update.test.sh # fast-forward-only self-update, rerea
tests/fm-secondmate-sync.test.sh # local-HEAD secondmate sync, no-fetch, bootstrap nudge gating, and spawn hook tests
tests/fm-secondmate-lifecycle-e2e.test.sh # persistent secondmate routing, seeding, backlog handoff, spawn, recovery, teardown, and FM_HOME flow tests
tests/fm-secondmate-safety.test.sh # secondmate home safety, idle charter, handoff validation, and teardown boundary tests
tests/fm-teardown.test.sh # fm-teardown.sh safety and reminder checks: local-only fork-remote allow, truly-unpushed refuse, merged-to-main allow, no-mistakes regression, tasks-axi reminder, --force override
tests/fm-teardown.test.sh # fm-teardown.sh landed-work safety and reminder checks: fork-remote allow, squash/content landings, dirty and unlanded refusals, PR-head metadata, tasks-axi reminder, --force override
[ "$(readlink CLAUDE.md)" = "AGENTS.md" ]
[ "$(readlink .claude/skills)" = "../.agents/skills" ]
FM_HEARTBEAT=2 FM_POLL=1 bin/fm-watch-arm.sh # watcher re-arm smoke test (prints arm status, then "heartbeat")
30 changes: 24 additions & 6 deletions bin/fm-pr-check.sh
@@ -1,8 +1,8 @@
#!/usr/bin/env bash
# Record a PR-ready task: appends pr=<url> to state/<id>.meta and arms the
# watcher's merge poll by writing state/<id>.check.sh, which prints one line iff
# the PR is merged (the watcher's check contract: output = wake firstmate,
# silence = keep sleeping).
# Record a PR-ready task: appends pr=<url> and a verified pr_head=<sha> to
# state/<id>.meta when available, then arms the watcher's merge poll by writing
# state/<id>.check.sh, which prints one line iff the PR is merged (the watcher's
# check contract: output = wake firstmate, silence = keep sleeping).
# Usage: fm-pr-check.sh <task-id> <pr-url>
set -eu

@@ -15,8 +15,26 @@ ID=$1
URL=$2

META="$STATE/$ID.meta"
if [ -f "$META" ] && ! grep -qxF "pr=$URL" "$META"; then
echo "pr=$URL" >> "$META"
if [ -f "$META" ]; then
  WT=$(grep '^worktree=' "$META" | tail -1 | cut -d= -f2- || true)
  LOCAL_HEAD=
  PR_HEAD=
  if [ -n "$WT" ] && [ -d "$WT" ]; then
    LOCAL_HEAD=$(git -C "$WT" rev-parse --verify HEAD 2>/dev/null || true)
    if [ -n "$LOCAL_HEAD" ] && command -v gh >/dev/null 2>&1; then
      if REMOTE_HEAD=$(cd "$WT" && gh pr view "$URL" --json headRefOid -q .headRefOid 2>/dev/null); then
        if [ "$LOCAL_HEAD" = "$REMOTE_HEAD" ]; then
          PR_HEAD=$LOCAL_HEAD
        fi
      fi
    fi
  fi
  if ! grep -qxF "pr=$URL" "$META"; then
    echo "pr=$URL" >> "$META"
  fi
  if [ -n "$PR_HEAD" ] && ! grep -qxF "pr_head=$PR_HEAD" "$META"; then
    echo "pr_head=$PR_HEAD" >> "$META"
  fi
fi

cat > "$STATE/$ID.check.sh" <<EOF
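The recording logic in this file follows an append-once pattern and stores `pr_head=` only when the local HEAD matches the head GitHub reports. A minimal sketch of that idea, with the GitHub lookup replaced by a plain argument; `append_once` and `record_pr` are hypothetical names. Unlike the script, the sketch creates the meta file if it is missing.

```shell
# Hypothetical helper: append LINE to FILE unless an identical line exists.
append_once() {
  local file line
  file=$1
  line=$2
  # -x matches whole lines, -F treats the text literally (URLs are not regexes).
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Hypothetical wrapper: always record pr=, record pr_head= only when verified.
record_pr() {
  local meta url local_head remote_head
  meta=$1
  url=$2
  local_head=$3
  remote_head=$4   # stand-in for the head SHA a PR lookup would return
  append_once "$meta" "pr=$url"
  if [ -n "$local_head" ] && [ "$local_head" = "$remote_head" ]; then
    append_once "$meta" "pr_head=$local_head"
  fi
}
```

Because both appends are idempotent, running the recording step twice for the same PR leaves the meta file unchanged.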
2 changes: 1 addition & 1 deletion bin/fm-promote.sh
@@ -1,7 +1,7 @@
#!/usr/bin/env bash
# Promote a scout task to a ship task in place: the crewmate keeps its window,
# worktree, and loaded context; only the contract changes. Flips kind= to ship in
# state/<task-id>.meta so fm-teardown.sh applies the full unpushed-work protection
# state/<task-id>.meta so fm-teardown.sh applies the full ship-task teardown protection
# again. After promoting, send the crewmate its ship instructions via fm-send.sh
# (inventory scratch state, reset to a clean default-branch base, carry over only
# intended fix changes, create branch fm/<task-id>, implement, then report done
129 changes: 115 additions & 14 deletions bin/fm-teardown.sh
@@ -3,9 +3,18 @@
# secondmate home, kill the tmux window, clear volatile state, refresh/prune
# the project's clone for PR-based ship tasks, then print a backlog-refresh
# reminder.
# REFUSES if the worktree holds work not on any remote, because treehouse return
# hard-resets the worktree and kills its processes. A fork counts as a remote,
# so upstream-contribution PRs pushed to a fork satisfy this in any mode.
# REFUSES if the worktree holds work that has not LANDED, because treehouse return
# hard-resets the worktree and kills its processes. Work has landed when it is
# reachable from any remote-tracking branch (a fork counts as a remote, so
# upstream-contribution PRs pushed to a fork satisfy this in any mode), OR - for a
# normal ship task whose commits are not so reachable - when its PR is merged and
# GitHub reports the current HEAD as that PR's head, or its content is already
# present in the up-to-date default branch. This recognizes the common
# squash-merge-then-delete-branch flow, where the branch's own commits live nowhere
# on a remote yet the change is fully in main.
# A gh lookup error falls back to the content check; if that is also inconclusive,
# teardown refuses rather than risk discarding unlanded work.
# Uncommitted changes are never landed.
# local-only projects additionally accept work merged into the local default
# branch (firstmate performs that merge on the captain's approval) as a fallback
# for the common case where there is no remote at all.
@@ -20,9 +29,9 @@
# never left leased forever. If the treehouse return fails, teardown leaves the
# leased home and state in place instead of hiding a still-held lease.
# Usage: fm-teardown.sh <task-id> [--force]
# --force skips the unpushed-work check for ordinary tasks and discards
# secondmate child work for kind=secondmate. Only use it when the captain has
# explicitly said to discard the work.
# --force skips ordinary-task dirty and landed-work checks, skips scout report
# checks, and discards secondmate child work for kind=secondmate. Only use it
# when the captain has explicitly said to discard the work.
set -eu

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
@@ -72,6 +81,79 @@ meta_value() {
  grep "^$key=" "$meta" | cut -d= -f2- || true
}

# Resolve the PR number for a worktree branch via gh-axi. Echoes the number on a
# single match and returns 0; returns non-zero on no match or any lookup failure,
# so the caller treats it as "no PR found" (fail-safe).
pr_number_from_branch() {
  local branch=$1 out n
  [ -n "$branch" ] && [ "$branch" != HEAD ] || return 1
  out=$( cd "$WT" && gh-axi pr list --state all --head "$branch" --limit 1 2>/dev/null ) || return 1
  n=$(printf '%s\n' "$out" | sed -n 's/^[[:space:]]*\([0-9][0-9]*\),.*/\1/p' | head -1)
  [ -n "$n" ] || return 1
  printf '%s' "$n"
}

# Is the worktree's PR merged for this exact HEAD? Resolves the PR from the
# recorded pr= URL first, then from the branch name, and asks GitHub for both the
# PR state and head. Returns non-zero when the PR is not merged, the current HEAD
# is not the PR head, no PR is found, or any gh error occurs - the caller then
# falls back to the content check.
pr_is_merged() {
  local branch=$1 target view state head current
  if [ -n "$PR_URL" ]; then
    target=$PR_URL
  else
    target=$(pr_number_from_branch "$branch") || return 1
  fi
  [ -n "$target" ] || return 1
  view=$(cd "$WT" && gh pr view "$target" --json state,headRefOid -q '.state + "\t" + .headRefOid' 2>/dev/null) || return 1
  state=${view%%$'\t'*}
  head=${view#*$'\t'}
  [ "$state" != "$view" ] || return 1
  case "$state" in
    MERGED|merged) ;;
    *) return 1 ;;
  esac
  [ -n "$head" ] || return 1
  current=$(git -C "$WT" rev-parse --verify HEAD 2>/dev/null) || return 1
  [ "$current" = "$head" ]
}

# Is the branch's content already present in the up-to-date default branch? Fetches
# first, then 3-way merges the default branch with HEAD: when HEAD introduces nothing
# the default branch does not already contain (e.g. its change landed via squash) the
# merged tree equals the default branch's tree. This isolates branch-only changes, so
# unrelated commits the default branch gained past the merge-base do not count as
# "added". Returns non-zero when inconclusive (no default ref, or a merge conflict),
# so the caller refuses rather than guesses.
content_in_default() {
  local name ref default_tree merged_tree
  name=$(default_branch) || return 1
  if git -C "$WT" remote get-url origin >/dev/null 2>&1; then
    git -C "$WT" fetch --quiet origin "+refs/heads/$name:refs/remotes/origin/$name" >/dev/null 2>&1 || return 1
    ref="refs/remotes/origin/$name"
  elif git -C "$WT" rev-parse --quiet --verify "refs/heads/$name" >/dev/null 2>&1; then
    ref="refs/heads/$name"
  else
    return 1
  fi
  default_tree=$(git -C "$WT" rev-parse --quiet --verify "$ref^{tree}" 2>/dev/null) || return 1
  [ -n "$default_tree" ] || return 1
  merged_tree=$(git -C "$WT" merge-tree --write-tree "$ref" HEAD 2>/dev/null) || return 1
  merged_tree=$(printf '%s\n' "$merged_tree" | head -1)
  [ "$merged_tree" = "$default_tree" ]
}

# Has the worktree's committed work actually LANDED, though its commits are not
# reachable from any remote-tracking branch? True when a merged PR proves the
# current HEAD, OR the content is already in the default branch (fallback, which
# also covers the no-PR and gh-error paths). False for genuinely unlanded work and
# whenever neither check can confirm a landing (e.g. an inconclusive content check).
work_is_landed() {
  local branch=$1
  pr_is_merged "$branch" && return 0
  content_in_default
}

backlog_refresh_reminder() {
local pr done_cmd report_path
if fm_tasks_axi_compatible; then
@@ -429,9 +511,14 @@ if [ -d "$WT" ] && [ "$FORCE" != "--force" ]; then
else
# The fm-spawn hook file is ours, never work product; ignore it in the dirty check.
dirty=$(git -C "$WT" status --porcelain 2>/dev/null | grep -vE '^\?\? \.claude/' | head -1 || true)
# A worktree's work is "safely on a remote" once HEAD is reachable from ANY
# remote-tracking branch (empty result here). A fork is a remote too, so
# upstream-contribution PRs pushed to a fork satisfy this regardless of mode.
# Reachability test: is HEAD reachable from ANY remote-tracking branch? Empty
# means the work is already pushed (a fork is a remote too, so upstream-
# contribution PRs pushed to a fork pass here). Non-empty does NOT prove the work
# is unlanded: a squash or rebase merge rewrites the branch into a new commit on
# the default branch, and a repo that auto-deletes the head branch on merge also
# drops its remote-tracking ref - so a merged-and-deleted branch trips this test
# while being fully landed. We therefore treat reachability as a fast accept, not
# the sole verdict, and fall through to a landed-work check before refusing.
unpushed=$(git -C "$WT" log --oneline HEAD --not --remotes -- 2>/dev/null | head -5 || true)
if [ -n "$unpushed" ] && [ "$MODE" = local-only ]; then
# local-only ships have no remote in the common case, so the "on a remote"
@@ -447,12 +534,26 @@ if [ -d "$WT" ] && [ "$FORCE" != "--force" ]; then
echo "Merge the branch into local $DEFAULT first (bin/fm-merge-local.sh after the captain approves), or push to a fork/remote, or get the captain's explicit OK to discard, then --force." >&2
exit 1
fi
elif [ -n "$dirty" ] || [ -n "$unpushed" ]; then
echo "REFUSED: worktree $WT has work not on any remote." >&2
[ -n "$dirty" ] && echo "uncommitted changes present" >&2
[ -n "$unpushed" ] && printf 'unpushed commits:\n%s\n' "$unpushed" >&2
echo "Push the branch (or get the captain's explicit OK to discard, then --force)." >&2
elif [ -n "$dirty" ]; then
# Uncommitted changes are never landed and the reset would discard them; always
# refuse, regardless of whether the committed work itself has landed.
echo "REFUSED: worktree $WT has uncommitted changes." >&2
echo "uncommitted changes present" >&2
echo "Commit them (or get the captain's explicit OK to discard, then --force)." >&2
exit 1
elif [ -n "$unpushed" ]; then
# Commits not reachable from any remote. Before refusing, recognize LANDED work:
# a merged PR for the current HEAD or content already in the up-to-date default
# branch. On a gh lookup error work_is_landed falls back to the content check,
# and if that is also inconclusive it returns false - so teardown refuses whenever
# a landing cannot be confirmed and never silently discards possibly-unlanded work.
branch=$(git -C "$WT" rev-parse --abbrev-ref HEAD 2>/dev/null || echo HEAD)
if ! work_is_landed "$branch"; then
echo "REFUSED: worktree $WT has work not on any remote and not landed." >&2
printf 'unpushed commits:\n%s\n' "$unpushed" >&2
echo "Push the branch, land its PR, or get the captain's explicit OK to discard, then --force." >&2
exit 1
fi
fi
fi
fi
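
The content check relies on a property of three-way merges: if merging HEAD into the default branch yields exactly the default branch's tree, HEAD contributes nothing the default branch lacks. The sketch below isolates that comparison under stated assumptions: it needs Git 2.38 or newer for `git merge-tree --write-tree`, it skips the fetch step, and `content_landed` with its explicit repository and ref arguments is a hypothetical name rather than the script's function.

```shell
# Hypothetical standalone version of the tree comparison.
# Returns 0 when HEAD_REF adds nothing to DEFAULT_REF, non-zero when it adds
# something or when the result is inconclusive (bad ref, merge conflict).
content_landed() {
  local repo default_ref head_ref default_tree merged_tree
  repo=$1
  default_ref=$2
  head_ref=$3
  default_tree=$(git -C "$repo" rev-parse --quiet --verify "$default_ref^{tree}" 2>/dev/null) || return 1
  # merge-tree exits non-zero on conflicts; treat that as "cannot confirm".
  merged_tree=$(git -C "$repo" merge-tree --write-tree "$default_ref" "$head_ref" 2>/dev/null) || return 1
  # The first output line is the OID of the merged tree.
  merged_tree=$(printf '%s\n' "$merged_tree" | head -n 1)
  [ "$merged_tree" = "$default_tree" ]
}
```

Comparing trees rather than commits is what lets a squash-merged branch pass: the squash commit has a different hash, but merging the branch back in changes no files. Unrelated commits that the default branch gained afterwards do not affect the result, because they are already part of the default tree on both sides of the comparison.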