From aab39287dc4127077354e9f764b002d868fbbe44 Mon Sep 17 00:00:00 2001 From: e-jung <8334081+e-jung@users.noreply.github.com> Date: Sun, 21 Jun 2026 21:54:50 +0000 Subject: [PATCH 1/2] fix(watcher): lossless check wakes via watcher-side suppression Checks were the only wake source whose suppression lived inside an opaque script: an edge-triggered check advanced its own .babysit-*.seen marker before the print could become a wake, so a lost stdout (timeout / concurrent run / crash) permanently swallowed the transition. This is the root cause of the missed PR #3095 merge (see data/fm-git-events-s8/report.md): the .cli-printing-press-3095.seen sidecar advanced to MERGED but no check wake was ever emitted. Port the #29 enqueue-before-suppress invariant to checks: 1. Watcher-side suppression (bin/fm-watch.sh). The check always prints its current state (idempotent); the watcher dedups against .seen-check- and calls fm_wake_append (durable queue) BEFORE advancing the marker. A crash between detect and suppress leaves the wake in the queue (recovered next turn) and the marker un-advanced (re-detected next cycle). A lost check wake is now impossible - exactly the guarantee signals enjoy. 2. Backward-compatible with old edge-triggered checks: empty stdout never produces a wake, so they keep their quiet behavior. 3. Catch-all backstop: force-escalate any .babysit-*.seen sidecar showing a terminal state (MERGED/CLOSED) the watcher never delivered a wake for. Catches a swallowed transition within one sweep; deduped via .escalated- so each terminal state fires at most once. Tests (tests/fm-wake-queue.test.sh): - test_check_wake_survives_lost_delivery: the #3095 regression - a check wake survives a simulated crash between enqueue and suppress (queue recovery + re-detection). - test_check_dedup_suppresses_repeats: identical repeated output wakes once. - test_catch_all_escalates_swallowed_transition: a self-suppressed check whose sidecar shows MERGED is force-escalated within one sweep. AGENTS.md: the check contract now documents the stateless "always print current state" form as preferred, and .seen-check-*/.escalated-* join the enqueue-before-suppress marker list. shellcheck clean; all 14 wake-queue tests + 5 spawn-batch tests pass. --- AGENTS.md | 4 +- bin/fm-watch.sh | 58 ++++++++++++++++- tests/fm-wake-queue.test.sh | 123 ++++++++++++++++++++++++++++++++++++ 3 files changed, 181 insertions(+), 4 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index 5d4da3d..9277bd9 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -355,7 +355,7 @@ Use chat for yes/no decisions; use lavish-axi when there are multiple findings o For PR-based ship tasks, the ready signal depends on mode: `no-mistakes` reports `done: PR checks green` after CI is green, while `direct-PR` reports `done: PR ` after opening the PR. Run `bin/fm-pr-check.sh ` - it records `pr=` in the task's meta and arms the watcher's merge poll. Tell the captain: the PR's full URL (always the complete `https://...` link, never a bare `#number` - the captain's terminal makes a full URL clickable), a one-paragraph summary, and, for `no-mistakes`, the risk level it emitted. -(The check contract, for any custom `state/.check.sh` you write yourself: print one line only when firstmate should wake, print nothing otherwise, and finish before `FM_CHECK_TIMEOUT`.) +(The check contract, for any custom `state/.check.sh` you write yourself: **print the current state every run** (idempotent), e.g. `echo "merged"` while merged. The watcher dedups against `.seen-check-` and enqueues to the durable queue *before* advancing that marker, so a lost stdout or a crashed watcher can never swallow a wake - the same lossless guarantee signals enjoy. Edge-triggered checks that self-suppress via their own `.babysit-*.seen` are tolerated (empty stdout = no wake), but a swallowed transition is only recovered by the watcher's catch-all backstop, so prefer the stateless "always print current state" form. Finish before `FM_CHECK_TIMEOUT`.) If the captain says "merge it", run `gh-axi pr merge` yourself; that instruction is the explicit approval. If `yolo=on`, merge a green/approved PR yourself and post the required FYI. @@ -389,7 +389,7 @@ From there the task is an ordinary ship task through its mode-specific validatio The watcher is the backbone. Whenever at least one task is in flight, `bin/fm-watch.sh` must be running as a background task. It costs zero tokens while running and exits with one reason line when something needs you. -It also writes each detected wake to the durable queue at `state/.wake-queue` before advancing suppression markers such as `.seen-*`, `.stale-*`, `.last-check`, or `.last-heartbeat`. +It also writes each detected wake to the durable queue at `state/.wake-queue` before advancing suppression markers such as `.seen-*`, `.stale-*`, `.seen-check-*`, `.escalated-*`, `.last-check`, or `.last-heartbeat`. At the start of every wake-handling turn and every recovery turn, run `bin/fm-wake-drain.sh` before peeking panes, reading status files beyond the reason line, or starting new work. The printed one-shot reason line is still useful, but the drained queue is the lossless backlog. After handling drained wakes, re-arm `bin/fm-watch.sh` before you end the turn. diff --git a/bin/fm-watch.sh b/bin/fm-watch.sh index 0028483..e877ec3 100755 --- a/bin/fm-watch.sh +++ b/bin/fm-watch.sh @@ -4,7 +4,9 @@ # signal: ... a crewmate wrote a status line or a turn-end hook fired; signals # landing within FM_SIGNAL_GRACE of each other coalesce into one wake # stale: a crewmate pane stopped changing and shows no busy signature -# check: