You can run one coding agent easily. But the moment you want three project tasks done in parallel - fixes, investigations, plans, audits - you become a tab-juggler: babysitting sessions, copy-pasting context between repos, forgetting which terminal had the failing test.
firstmate flips the model.
You talk to a single agent - the first mate - and it runs the crew for you: spawning autonomous agents in tmux windows, giving each a clean git worktree, supervising them to completion, and handing you finished PRs, approved local merges, or standalone investigation reports.
For larger fleets, you can opt in to persistent secondmates: domain supervisors that are still ordinary direct reports, but run from their own isolated firstmate homes.
There is no app to install; the whole orchestrator is an AGENTS.md file that any terminal coding agent can follow.
- One liaison - you never talk to a worker agent. The first mate dispatches, supervises, escalates only real decisions, and reports plain outcomes about work that is ready, blocked, or needs your call.
- A visible crew - every crewmate lives in a tmux window. Watch any of them work, or type into their window to intervene; the first mate reconciles.
- Persistent domain supervisors - route natural-language scopes through
data/secondmates.mdwhen a domain deserves its own long-lived supervisor. Each secondmate has a separateFM_HOME, local state, local projects, and its own session lock, while the main first mate still supervises it like any other direct report. - Guarded by construction - the first mate is read-only over your projects except for clean local default-branch refreshes, safe pruning of local branches whose remote is gone, and approved
local-onlyfast-forward merges; crewmates work in disposable treehouse worktrees. Ship tasks follow each project's delivery mode, and scout tasks produce local reports without pushing anything.
This is not an agent harness. This is not a skill. This is not a CLI.
This is.. a directory that turns any agent into your firstmate, and you the captain.
$ git clone https://github.com/kunchenguid/firstmate && cd firstmate
$ claude # launch your agent harness here; AGENTS.md takes over
> ahoy! look at my github project xyz, then fix the flaky login test and add dark mode
# firstmate checks its toolchain (asking your consent before installing anything),
# clones the project under projects/, and spawns two crewmates in tmux windows
# fm-fix-login-k3 and fm-dark-mode-p7.
# Minutes later:
PR ready for review, captain: https://github.com/you/xyz/pull/42
(fix flaky login test - risk: low - CI green)
> alright merge itPrerequisites (the first mate detects everything else and offers to install it):
# 1. a verified agent harness - claude, codex, opencode, or pi
# 2. git + GitHub auth
# 3. tmux - the crew lives in tmux windows (firstmate offers to install it if missing)
gh auth loginGet firstmate:
git clone https://github.com/kunchenguid/firstmate
cd firstmate && claudeThat is the whole install.
On first launch the first mate detects what its required toolchain is missing or too old (tmux, node, gh, treehouse with durable lease support, no-mistakes, gh-axi, chrome-devtools-axi, lavish-axi), lists it with the exact install commands, and installs only after you say go.
If compatible tasks-axi is already on PATH, bootstrap records it as an optional capability fact and firstmate uses its verbs for routine backlog mutations; when it is absent or incompatible, firstmate keeps hand-editing data/backlog.md exactly as before.
Run it inside tmux for the best experience.
firstmate works from any terminal - outside tmux, crewmates land in a detached firstmate session you can attach to - but launching your harness from inside tmux puts every crewmate window in your own session, one per task, where you can watch the crew work in real time or type into any window to intervene.
you (the captain)
│ chat: requests, decisions, "merge it"
▼
┌─────────────────────────────────────┐
│ firstmate (this repo) │
│ reads projects/ + firstmate routes │
│ writes guarded backlog/briefs/state │
└──┬──────────────┬───────────────┬───┘
│ tmux send-keys / status files │
▼ ▼ ▼
┌────────┐ ┌────────┐ ┌────────┐
│fm-task1│ │fm-task2│ ... │fm-taskN│ tmux windows you can watch
│crewmate│ │crewmate│ │crewmate│ one autonomous agent each
└───┬────┘ └───┬────┘ └───┬────┘
▼ ▼ ▼
treehouse worktree or isolated secondmate home
│
├─ ship: project mode ► PR/local merge ► teardown
│
└─ scout: report at data/<id>/report.md ► relay findings ► teardown
- Event-driven supervision - a zero-token bash watcher (
bin/fm-watch.sh) sleeps on the fleet and wakes the first mate only when a crewmate reports, stalls, a PR merges, or an internal heartbeat review is due. Detected wakes are also written to a durable local queue (state/.wake-queue) before detector state advances, so a missed one-shot process exit can be recovered by draining the queue. Routine watcher polling, re-arm no-ops, elapsed waiting time, and unchanged heartbeat reviews stay silent; an idle crew costs you nothing. Routine re-arms go throughbin/fm-watch-arm.sh, which forks the watcher as a tracked child, verifies it is genuinely alive with a fresh liveness beacon, and prints exactly one honest status line (started/healthy/FAILED, the last exiting non-zero) - never a falsealready runningoff a dying process. Its--restartmode signals only the watcher recorded in the current home'sstate/.watch.lock, so restarting one home cannot kill sibling secondmate watchers. A pull-based guard (bin/fm-guard.sh) warns through supervision tool output if tasks are in flight and that watcher stops running or queued wakes are waiting to be drained, leading with a prominent bordered banner for the no-watcher case so it cannot be skimmed past. A presence-gated sub-supervisor (bin/fm-supervise-daemon.sh) extends this for walk-away supervision: the/afkskill activates it, after which it self-handles routine wakes in bash and escalates only captain-relevant events as one batched, single-line digest (prefixed with an in-band sentinel marker so firstmate can tell daemon injections apart from real messages). Its injection path sharesbin/fm-tmux-lib.shwithfm-send.sh, so dim-ghost-aware and border-aware composer detection plus verified submit retry stay consistent; stalled escalation delivery raisesstate/.subsuper-inject-wedgedafterFM_MAX_DEFER_SECSinstead of silently deferring forever. - Worktrees, not branches in your checkout - crewmates never touch your clone; treehouse pools clean worktrees so parallel tasks on one repo cannot collide.
- Two task shapes - ship tasks change projects and ship by project mode (
no-mistakes,direct-PR, orlocal-only); scout tasks investigate, plan, reproduce bugs, or audit, then leave a report atdata/<id>/report.mdand never push. - Optional secondmates -
data/secondmates.mdrecords persistent domain supervisors with natural-language scopes, project clone lists, and home paths.fm-home-seed.shprovisions the isolated home, clones the listed PR-based projects into it, initializes newly clonedno-mistakesprojects, copies the charter todata/charter.md, andfm-spawn.sh --secondmatelaunches it through the same tmux and status-file path as any direct report. When seeded with-, the home is a durable treehouse lease under the secondmate id, so it survives with no live process and is not recycled by latertreehouse getor pruning. Retirement or seed rollback returns the leased home; normal restart/recovery keeps it leased. If returning the lease fails during teardown, firstmate leaves the route and home intact instead of hiding a still-held lease. Seeding is transactional: if validation, cloning, initialization, or registry update fails, generated briefs, new homes, new project clones, and registry edits are rolled back.local-onlyprojects stay with the main first mate because they merge into the main local checkout instead of a remote-backed PR path. The same project may appear in multiple secondmate homes when their scopes differ, such as issue triage versus feature development. Secondmates are idle by default: after startup recovery reconciles only work already in their own home, an empty queue waits silently for routed tasks, and they never self-initiate surveys or audits. After seeding a secondmate,fm-backlog-handoff.shmoves already-judged in-scope queued items from the main backlog into that secondmate home so the domain queue starts in the right place. Idle secondmate panes are healthy; teardown is explicit and refuses while the secondmate home has in-flight work unless the captain has approved discard with--force. - Project modes are explicit -
data/projects.mdrecords each project's delivery mode and optional+yoloautonomy flag.no-mistakesprojects run the full validation pipeline,direct-PRprojects open PRs without that pipeline, andlocal-onlyprojects stay local until firstmate performs an approved fast-forward merge. - Project memory belongs to projects - durable project-intrinsic agent knowledge lives in each project's committed
AGENTS.md, withCLAUDE.mdas a symlink. Ship briefs prompt crewmates to create or update those files through the normal delivery path;data/projects.mdstays a thin private registry. - Local clones stay fresh - bootstrap and PR-based teardown refresh remote-backed project clones with clean default-branch fast-forwards when the clone is on the default branch and has no local work, and prune local branches whose remote is gone and that no worktree still needs.
- Self-updates stay safe -
/updatefirstmatefast-forwards the running firstmate repo and registered secondmate homes fromorigin, then re-reads updated instructions and nudges updated secondmates without touching project clones. The update is fast-forward only: dirty, diverged, offline, and off-default targets are reported and left untouched. - Restart-proof - all state lives in tmux, status files, local markdown under
data/,data/secondmates.md, and persistent secondmate homes. Kill the first mate session anytime; the next one reconciles and carries on.
The first mate drives these; you rarely need to, but they work by hand too.
| Script | Description |
|---|---|
fm-bootstrap.sh |
Detect required toolchain problems and optional capability facts; refresh clones best-effort; install tools only after consent |
fm-fleet-sync.sh |
Fetch clones, clean-fast-forward their checked-out default branches, and safely prune branches whose remote is gone |
fm-update.sh |
Self-update the running firstmate repo and registered secondmate homes with fast-forward-only pulls from origin |
fm-backlog-handoff.sh |
Move already-judged in-scope queued backlog items from the main home into a seeded secondmate home |
fm-brief.sh |
Scaffold a ship brief, a report-only scout brief with --scout, or a secondmate charter with --secondmate |
fm-ensure-agents-md.sh |
Ensure project AGENTS.md is the real memory file and CLAUDE.md symlinks to it |
fm-guard.sh |
Warn when tasks are in flight but queued wakes are pending; lead stale or missing watcher cases with a prominent banner |
fm-home-seed.sh |
Lease/provision a secondmate home transactionally, clone projects, initialize gates, and maintain data/secondmates.md |
fm-spawn.sh |
Spawn one task, several id=repo pairs, or a persistent secondmate with --secondmate |
fm-project-mode.sh |
Resolve a project's delivery mode and +yolo flag from data/projects.md |
fm-merge-local.sh |
Fast-forward a local-only project's local default branch after approval |
fm-review-diff.sh |
Review a crewmate branch against the authoritative base, with optional --stat output |
fm-watch-arm.sh |
Verified per-home watcher re-arm; reports started, healthy, or FAILED; --restart relaunches only this home's watcher |
fm-watch.sh |
Singleton-safe one-shot watcher; blocks until supervision work is due, queues it durably, then exits with one reason line |
fm-supervise-daemon.sh |
Presence-gated sub-supervisor for walk-away (/afk) supervision: wraps fm-watch.sh, self-handles routine wakes in bash, and escalates only captain-relevant events as one verified, batched, single-line digest prefixed with a sentinel marker |
fm-wake-drain.sh |
Atomically drain queued watcher wakes before handling supervision work |
fm-send.sh |
Send one verified literal line (or --key Escape) to a crewmate window; exits non-zero when Enter is positively swallowed |
fm-tmux-lib.sh |
Shared tmux pane primitives for busy detection, dim-ghost-aware and border-aware composer detection, and verified submit retry |
fm-peek.sh |
Print a bounded tail of a crewmate pane |
fm-pr-check.sh |
Record a PR-ready task and arm the watcher's merge poll |
fm-promote.sh |
Promote a scout task in place so it becomes a protected ship task |
fm-teardown.sh |
Return the worktree or retire/release a secondmate home; protects ship work, requires scout reports, checks child work, and prints the backlog reminder |
fm-harness.sh |
Detect the running harness; resolve the effective crewmate harness |
fm-lock.sh |
Per-home firstmate session lock |
Firstmate ships these built-in skills you invoke by name.
Claude uses the slash form shown here; codex uses the same names with $, such as $afk.
| Skill | What it does |
|---|---|
/afk |
Enter away-mode supervision: the sub-supervisor self-handles routine wakes in bash and escalates only captain-relevant events as one batched digest, cutting supervision cost while you step away |
/updatefirstmate |
Self-update the running firstmate and its secondmates to the latest from origin with fast-forward-only pulls, then re-read instructions and nudge secondmates |
The shared orchestrator behavior lives in AGENTS.md - edit it like any prompt when the fleet is empty, or dispatch shared-repo edits to a crewmate while tasks are in flight.
The tracked .tasks.toml pins the optional tasks-axi markdown backend to data/backlog.md, with done_keep = 10 and an archive at data/done-archive.md.
When compatible tasks-axi is on PATH, firstmate uses its verbs for routine backlog mutations and keeps secondmate transfers behind fm-backlog-handoff.sh validation; without it, backlog bookkeeping remains manual.
Compatible means the shared bootstrap probe accepts tasks-axi --version as 0.1.1 or newer.
Personal preferences for one captain's fleet live locally in data/captain.md; it is gitignored and read after data/projects.md and optional data/secondmates.md during bootstrap.
Persistent secondmate routes live locally in data/secondmates.md.
Each line records the secondmate id, charter summary, absolute home path, natural-language scope, project clone list, and added date; fm-home-seed.sh validate refuses duplicate ids, duplicate homes, and nested or overlapping homes.
The main first mate routes by reading those scopes with judgment; the project list is provisioning data, not exclusive ownership.
Use fm-home-seed.sh <id> - <project>... to lease a fresh firstmate worktree for the secondmate home.
The lease is held under the secondmate id until explicit retirement or seed rollback returns it, so normal restarts do not free or recycle the home.
Teardown of a leased home fails closed if treehouse return cannot release the lease; plain-clone homes with no treehouse pool slot are removed directly.
Secondmate routes cover no-mistakes and direct-PR projects; local-only projects remain main-firstmate work.
For no-mistakes projects, seeding initializes only projects newly cloned into a secondmate home and refuses to mutate a preexisting clone that is not already initialized.
After creating a secondmate, move existing main-backlog items that you have judged in-scope with fm-backlog-handoff.sh <secondmate-id> <item-key>...; it is idempotent and refuses in-flight items or non-secondmate homes.
Set FM_SECONDMATE_CHARTER to seed from inline charter text when no filled charter brief exists; set FM_SECONDMATE_SCOPE when the routing scope should differ from the charter text.
FM_HOME selects the operational home for one firstmate instance.
When it is unset, the repo root is the home; when it is set, scripts still run from this repo's bin/, but state/, data/, config/, and projects/ come from $FM_HOME.
Harness support is a table in section 4: claude, codex, opencode, and pi are all empirically verified; new harnesses get verified through a supervised trial task before joining the table.
Runtime tuning via environment variables (defaults shown):
FM_HOME= # optional operational home; unset means this repo root
FM_POLL=15 # seconds between watcher cycles
FM_HEARTBEAT=600 # base seconds between fleet reviews; backs off exponentially while idle
FM_HEARTBEAT_MAX=7200 # heartbeat backoff cap
FM_CHECK_INTERVAL=300 # seconds between slow checks (merged-PR polls)
FM_CHECK_TIMEOUT=30 # seconds allowed per slow check script
FM_LOCK_STALE_AFTER=2 # seconds before dead-pid lock records can be reclaimed; mid-acquire locks keep at least 2s grace
FM_GUARD_GRACE=300 # seconds before guard warnings and arm health checks treat a watcher beacon as stale
FM_ARM_CONFIRM_TIMEOUT=10 # seconds fm-watch-arm waits to confirm a fresh watcher before reporting FAILED
FM_WATCHER_STALE_GRACE=300 # defaults to FM_GUARD_GRACE; seconds a live watcher lock may have a stale beacon before re-arm errors
FM_SIGNAL_GRACE=30 # seconds to coalesce nearby status and turn-end signals into one wake
FM_FLEET_SYNC_BOOTSTRAP_TIMEOUT=20 # seconds allowed for bootstrap's best-effort clone refresh
FM_FLEET_PRUNE=1 # set to 0 to skip pruning local branches whose upstream is gone
FM_BUSY_REGEX='esc (to )?interrupt|Working\.\.\.' # busy-pane signatures, shared by watcher and tmux helper
FM_COMPOSER_IDLE_RE= # optional empty-composer regex, applied after dim-ghost and border stripping
FM_SEND_RETRIES=3 # fm-send Enter-retry attempts after typing the line once
FM_SEND_SLEEP=0.4 # seconds between fm-send submit checks
# sub-supervisor (bin/fm-supervise-daemon.sh); presence-gated via /afk
FM_SUPERVISOR_TARGET=firstmate:0 # supervisor tmux target (override; auto-discovers from $TMUX_PANE)
FM_INJECT_SKIP=heartbeat # |-prefixes force-self-handled bypassing classification; empty disables
FM_STALE_ESCALATE_SECS=240 # idle seconds before a stale pane escalates as a possible wedge
FM_ESCALATE_BATCH_SECS=90 # buffer window for batched escalation digests; 0 = flush immediately
FM_MAX_DEFER_SECS=300 # max buffered escalation age before retry plus wedge alarm; 0 disables
FM_INJECT_CONFIRM_RETRIES=3 # daemon Enter-retry attempts after typing a digest once
FM_INJECT_CONFIRM_SLEEP=0.5 # seconds between daemon submit checks
FM_HEARTBEAT_SCAN_SECS=300 # cadence of the catch-all status scan for missed captain verbs
FM_HOUSEKEEPING_TICK=15 # seconds between batch-flush, stale-recheck, and scan passesTracked changes to firstmate itself, including AGENTS.md, README.md, CONTRIBUTING.md, .tasks.toml, .github/workflows/, bin/, and agent skill files, ship through the no-mistakes pipeline on a feature branch and require the captain's explicit merge approval.
When supervising live crewmates, keep firstmate's own long validation or build commands in the background so watcher wakes can still be handled.
A crewmate driving its own no-mistakes validation does the opposite: it runs the gate in the foreground and lets each synchronous no-mistakes axi run or no-mistakes axi respond call return.
The pipeline owns auto-fix changes; the crewmate authorizes them with no-mistakes axi respond --action fix --findings <ids> instead of editing or committing while the run is active.
Human-authored pull requests targeting main must be raised through git push no-mistakes; see CONTRIBUTING.md for the enforced contributor workflow.
Local .no-mistakes/ state and test evidence stay out of this repo; .no-mistakes.yaml keeps evidence in a temp directory instead.
The current watcher reliability work keeps the one-shot process model and adds a durable queue, race-proof singleton lock, duplicate self-eviction, and a self-verifying tracked-child arm wrapper.
The presence-gated sub-supervisor (bin/fm-supervise-daemon.sh) provides proactive wake routing for walk-away supervision via the /afk skill; a blocking-waiter split remains a deferred follow-up phase.
bash -n bin/*.sh # syntax-check the toolbelt
shellcheck bin/*.sh tests/*.sh # lint the toolbelt and behavior tests; CI enforces this
for test_script in tests/*.test.sh; do "$test_script"; done # behavior tests, matching CI
tests/fm-wake-queue.test.sh # durable wake queue, race-proof locks, watcher singleton/re-arm behavior, sub-supervisor classifier, /afk presence-gating, border-aware composer, max-defer, and fm-send submit tests
tests/fm-composer-ghost.test.sh # dim-ghost stripping, ghost-only composer detection, and escape-free peek tests
tests/fm-afk-inject-e2e.test.sh # private-socket end-to-end test of the afk injection path (partial-input deferral, swallowed-Enter retry)
tests/fm-bootstrap.test.sh # bootstrap dependency and feature-probe tests
tests/fm-update.test.sh # fast-forward-only self-update, reread, nudge, dedup, and skip-safety tests
tests/fm-secondmate.test.sh # persistent secondmate routing, seeding, idle charter, backlog handoff, spawn, recovery, teardown, and FM_HOME tests
tests/fm-teardown.test.sh # fm-teardown.sh safety and reminder checks: local-only fork-remote allow, truly-unpushed refuse, merged-to-main allow, no-mistakes regression, tasks-axi reminder, --force override
[ "$(readlink CLAUDE.md)" = "AGENTS.md" ]
[ "$(readlink .claude/skills)" = "../.agents/skills" ]
FM_HEARTBEAT=2 FM_POLL=1 bin/fm-watch-arm.sh # watcher re-arm smoke test (prints arm status, then "heartbeat")