firstmate

Talk to one agent. Ship with a crew.

You can run one coding agent easily. But the moment you want three project tasks done in parallel - fixes, investigations, plans, audits - you become a tab-juggler: babysitting sessions, copy-pasting context between repos, forgetting which terminal had the failing test.

firstmate flips the model. You talk to a single agent - the first mate - and it runs the crew for you: spawning autonomous agents in tmux windows, giving each a clean git worktree, supervising them to completion, and handing you finished PRs, approved local merges, or standalone investigation reports. For larger fleets, you can opt in to persistent secondmates: domain supervisors that are still ordinary direct reports, but run from their own isolated firstmate homes. There is no app to install; the whole orchestrator is an AGENTS.md file that any terminal coding agent can follow.

  • One liaison - you never talk to a worker agent. The first mate dispatches, supervises, escalates only real decisions, and reports plain outcomes about work that is ready, blocked, or needs your call.
  • A visible crew - every crewmate lives in a tmux window. Watch any of them work, or type into their window to intervene; the first mate reconciles.
  • Persistent domain supervisors - route natural-language scopes through data/secondmates.md when a domain deserves its own long-lived supervisor. Each secondmate has a separate FM_HOME, local state, local projects, and its own session lock, while the main first mate still supervises it like any other direct report.
  • Guarded by construction - the first mate is read-only over your projects except for clean local default-branch refreshes, safe pruning of local branches whose remote is gone, and approved local-only fast-forward merges; crewmates work in disposable treehouse worktrees. Ship tasks follow each project's delivery mode, and scout tasks produce local reports without pushing anything.

This is not an agent harness. This is not a skill. This is not a CLI.

This is... a directory that turns any agent into your firstmate, and you into the captain.

Quick Start

$ git clone https://github.com/kunchenguid/firstmate && cd firstmate
$ claude   # launch your agent harness here; AGENTS.md takes over

> ahoy! look at my github project xyz, then fix the flaky login test and add dark mode

# firstmate checks its toolchain (asking your consent before installing anything),
# clones the project under projects/, and spawns two crewmates in tmux windows
# fm-fix-login-k3 and fm-dark-mode-p7.
# Minutes later:

  PR ready for review, captain: https://github.com/you/xyz/pull/42
  (fix flaky login test - risk: low - CI green)

> alright merge it

Install

Prerequisites (the first mate detects everything else and offers to install it):

# 1. a verified agent harness - claude, codex, opencode, or pi
# 2. git + GitHub auth
# 3. tmux - the crew lives in tmux windows (firstmate offers to install it if missing)
gh auth login

Get firstmate:

git clone https://github.com/kunchenguid/firstmate
cd firstmate && claude

That is the whole install. On first launch the first mate detects what its required toolchain is missing or too old (tmux, node, gh, treehouse with durable lease support, no-mistakes, gh-axi, chrome-devtools-axi, lavish-axi), lists it with the exact install commands, and installs only after you say go. If compatible tasks-axi is already on PATH, bootstrap records it as an optional capability fact and firstmate uses its verbs for routine backlog mutations; when it is absent or incompatible, firstmate keeps hand-editing data/backlog.md exactly as before.
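
The compatibility gate for the optional tasks-axi capability comes down to a version comparison (the Configuration section defines compatible as 0.1.1 or newer). A minimal sketch of such a check, assuming the version is a plain dotted numeric string: `version_at_least` is a hypothetical helper, not part of the firstmate toolbelt, and the real bootstrap probe may parse `tasks-axi --version` output differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when dotted version $1 is >= dotted version $2.
# Assumes up to three numeric components (major.minor.patch).
version_at_least() {
  local -a have want
  IFS=. read -r -a have <<<"$1"
  IFS=. read -r -a want <<<"$2"
  local i a b
  for i in 0 1 2; do
    # Strip any non-numeric suffix and treat missing components as 0.
    a="${have[i]:-0}"; a="${a%%[!0-9]*}"; a="${a:-0}"
    b="${want[i]:-0}"; b="${b%%[!0-9]*}"; b="${b:-0}"
    # 10# forces base 10 so components like "08" are not read as octal.
    if (( 10#$a > 10#$b )); then return 0; fi
    if (( 10#$a < 10#$b )); then return 1; fi
  done
  return 0  # all compared components are equal
}
```

A caller could then gate the optional capability with something like `version_at_least "$reported_version" 0.1.1`, where `reported_version` is whatever version string the probe extracted.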

Run it inside tmux for the best experience. firstmate works from any terminal - outside tmux, crewmates land in a detached firstmate session you can attach to - but launching your harness from inside tmux puts every crewmate window in your own session, one per task, where you can watch the crew work in real time or type into any window to intervene.

How It Works

            you (the captain)
                  │  chat: requests, decisions, "merge it"
                  ▼
 ┌─────────────────────────────────────┐
 │ firstmate            (this repo)    │
 │ reads projects/ + firstmate routes  │
 │ writes guarded backlog/briefs/state │
 └──┬──────────────┬───────────────┬───┘
    │ tmux send-keys / status files │
    ▼              ▼               ▼
 ┌────────┐   ┌────────┐      ┌────────┐
 │fm-task1│   │fm-task2│  ... │fm-taskN│   tmux windows you can watch
 │crewmate│   │crewmate│      │crewmate│   one autonomous agent each
 └───┬────┘   └───┬────┘      └───┬────┘
     ▼            ▼               ▼
  treehouse worktree or isolated secondmate home
     │
     ├─ ship: project mode ► PR/local merge ► teardown
     │
     └─ scout: report at data/<id>/report.md ► relay findings ► teardown
  • Event-driven supervision - a zero-token bash watcher (bin/fm-watch.sh) sleeps on the fleet and wakes the first mate only when a crewmate reports, stalls, a PR merges, or an internal heartbeat review is due. Detected wakes are also written to a durable local queue (state/.wake-queue) before detector state advances, so a missed one-shot process exit can be recovered by draining the queue. Routine watcher polling, restarts, elapsed waiting time, and unchanged heartbeat reviews stay silent; an idle crew costs you nothing. A pull-based guard (bin/fm-guard.sh) warns through supervision tool output if tasks are in flight and that watcher stops running or queued wakes are waiting to be drained. A presence-gated sub-supervisor (bin/fm-supervise-daemon.sh) extends this for walk-away supervision: the /afk skill activates it, after which it self-handles routine wakes in bash and escalates only captain-relevant events as one batched, single-line digest (prefixed with an in-band sentinel marker so firstmate can tell daemon injections apart from real messages). Its injection path shares bin/fm-tmux-lib.sh with fm-send.sh, so dim-ghost-aware and border-aware composer detection plus verified submit retry stay consistent; stalled escalation delivery raises state/.subsuper-inject-wedged after FM_MAX_DEFER_SECS instead of silently deferring forever.
  • Worktrees, not branches in your checkout - crewmates never touch your clone; treehouse pools clean worktrees so parallel tasks on one repo cannot collide.
  • Two task shapes - ship tasks change projects and ship by project mode (no-mistakes, direct-PR, or local-only); scout tasks investigate, plan, reproduce bugs, or audit, then leave a report at data/<id>/report.md and never push.
  • Optional secondmates - data/secondmates.md records persistent domain supervisors with natural-language scopes, project clone lists, and home paths. fm-home-seed.sh provisions the isolated home, clones the listed PR-based projects into it, initializes newly cloned no-mistakes projects, copies the charter to data/charter.md, and fm-spawn.sh --secondmate launches it through the same tmux and status-file path as any direct report. When seeded with -, the home is a durable treehouse lease under the secondmate id, so it survives with no live process and is not recycled by later treehouse get or pruning. Retirement or seed rollback returns the leased home; normal restart/recovery keeps it leased. If returning the lease fails during teardown, firstmate leaves the route and home intact instead of hiding a still-held lease. Seeding is transactional: if validation, cloning, initialization, or registry update fails, generated briefs, new homes, new project clones, and registry edits are rolled back. local-only projects stay with the main first mate because they merge into the main local checkout instead of a remote-backed PR path. The same project may appear in multiple secondmate homes when their scopes differ, such as issue triage versus feature development. Secondmates are idle by default: after startup recovery reconciles only work already in their own home, an empty queue waits silently for routed tasks, and they never self-initiate surveys or audits. After seeding a secondmate, fm-backlog-handoff.sh moves already-judged in-scope queued items from the main backlog into that secondmate home so the domain queue starts in the right place. Idle secondmate panes are healthy; teardown is explicit and refuses while the secondmate home has in-flight work unless the captain has approved discard with --force.
  • Project modes are explicit - data/projects.md records each project's delivery mode and optional +yolo autonomy flag. no-mistakes projects run the full validation pipeline, direct-PR projects open PRs without that pipeline, and local-only projects stay local until firstmate performs an approved fast-forward merge.
  • Project memory belongs to projects - durable project-intrinsic agent knowledge lives in each project's committed AGENTS.md, with CLAUDE.md as a symlink. Ship briefs prompt crewmates to create or update those files through the normal delivery path; data/projects.md stays a thin private registry.
  • Local clones stay fresh - bootstrap and PR-based teardown refresh remote-backed project clones with clean default-branch fast-forwards when the clone is on the default branch and has no local work, and prune local branches whose remote is gone and that no worktree still needs.
  • Self-updates stay safe - /updatefirstmate fast-forwards the running firstmate repo and registered secondmate homes from origin, then re-reads updated instructions and nudges updated secondmates without touching project clones. The update is fast-forward only: dirty, diverged, offline, and off-default targets are reported and left untouched.
  • Restart-proof - all state lives in tmux, status files, local markdown under data/, data/secondmates.md, and persistent secondmate homes. Kill the first mate session anytime; the next one reconciles and carries on.
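
The durable wake queue described above follows a common pattern: record the event first, advance detector state second, and drain by renaming the queue file so the snapshot is taken atomically. The sketch below illustrates that pattern only. The function names, the one-line-per-wake format, and the demo `QUEUE` path are assumptions for illustration, not the actual contents of bin/fm-watch.sh or bin/fm-wake-drain.sh.

```shell
#!/usr/bin/env bash
# Hypothetical demo location; firstmate itself keeps its queue at state/.wake-queue.
QUEUE="${QUEUE:-${TMPDIR:-/tmp}/wake-queue-demo.$$}"

# Append one wake reason to the queue.
# Call this BEFORE advancing any detector state, so a crash cannot lose the wake.
queue_wake() {
  printf '%s\n' "$1" >>"$QUEUE"
}

# Drain the queue: rename it (atomic within one filesystem), then print and
# delete the snapshot. Wakes appended after the rename open a fresh queue file.
drain_wakes() {
  local snapshot="$QUEUE.draining.$$"
  mv "$QUEUE" "$snapshot" 2>/dev/null || return 0   # nothing queued
  cat "$snapshot"
  rm -f "$snapshot"
}
```

Draining an empty queue is a no-op here, which keeps the drain step safe to run before every round of supervision work.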

The bin/ toolbelt

The first mate drives these; you rarely need to, but they work by hand too.

Script                  Description
fm-bootstrap.sh         Detect required toolchain problems and optional capability facts; refresh clones best-effort; install tools only after consent
fm-fleet-sync.sh        Fetch clones, clean-fast-forward their checked-out default branches, and safely prune branches whose remote is gone
fm-update.sh            Self-update the running firstmate repo and registered secondmate homes with fast-forward-only pulls from origin
fm-backlog-handoff.sh   Move already-judged in-scope queued backlog items from the main home into a seeded secondmate home
fm-brief.sh             Scaffold a ship brief, a report-only scout brief with --scout, or a secondmate charter with --secondmate
fm-ensure-agents-md.sh  Ensure project AGENTS.md is the real memory file and CLAUDE.md symlinks to it
fm-guard.sh             Warn when tasks are in flight but queued wakes are pending or the watcher liveness beacon is stale or missing
fm-home-seed.sh         Lease/provision a secondmate home transactionally, clone projects, initialize gates, and maintain data/secondmates.md
fm-spawn.sh             Spawn one task, several id=repo pairs, or a persistent secondmate with --secondmate
fm-project-mode.sh      Resolve a project's delivery mode and +yolo flag from data/projects.md
fm-merge-local.sh       Fast-forward a local-only project's local default branch after approval
fm-review-diff.sh       Review a crewmate branch against the authoritative base, with optional --stat output
fm-watch.sh             Singleton-safe one-shot watcher; blocks until supervision work is due, queues it durably, then exits with one reason line
fm-supervise-daemon.sh  Presence-gated sub-supervisor for walk-away (/afk) supervision: wraps fm-watch.sh, self-handles routine wakes in bash, and escalates only captain-relevant events as one verified, batched, single-line digest prefixed with a sentinel marker
fm-wake-drain.sh        Atomically drain queued watcher wakes before handling supervision work
fm-send.sh              Send one verified literal line (or --key Escape) to a crewmate window; exits non-zero when Enter is positively swallowed
fm-tmux-lib.sh          Shared tmux pane primitives for busy detection, dim-ghost-aware and border-aware composer detection, and verified submit retry
fm-peek.sh              Print a bounded tail of a crewmate pane
fm-pr-check.sh          Record a PR-ready task and arm the watcher's merge poll
fm-promote.sh           Promote a scout task in place so it becomes a protected ship task
fm-teardown.sh          Return the worktree or retire/release a secondmate home; protects ship work, requires scout reports, checks child work, and prints the backlog reminder
fm-harness.sh           Detect the running harness; resolve the effective crewmate harness
fm-lock.sh              Per-home firstmate session lock
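
fm-watch.sh is described as singleton-safe. One portable way to get that property in bash is a lock directory, because mkdir either creates the directory or fails, with no window in between. The sketch below shows that technique under stated assumptions: `LOCK_DIR`, `acquire_lock`, and `release_lock` are hypothetical names, and the real script may use a different mechanism.

```shell
#!/usr/bin/env bash
# Hypothetical lock location for the demo.
LOCK_DIR="${LOCK_DIR:-${TMPDIR:-/tmp}/fm-watch-demo.lock}"

# Try to become the single running instance; returns non-zero if one exists.
acquire_lock() {
  # mkdir fails when the directory already exists, so only one caller wins.
  mkdir "$LOCK_DIR" 2>/dev/null || return 1
  printf '%s\n' "$$" >"$LOCK_DIR/pid"   # record the owner for debugging
}

# Release the lock so the next one-shot run can start.
release_lock() {
  rm -rf "$LOCK_DIR"
}

# Typical use in a one-shot script (left commented so sourcing has no side effects):
#   acquire_lock || { echo "already running" >&2; exit 0; }
#   trap release_lock EXIT
```

A lock like this can go stale if the owner is killed before its EXIT trap runs, so a production version would also need a liveness check on the recorded pid.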

Configuration

The shared orchestrator behavior lives in AGENTS.md - edit it like any prompt when the fleet is empty, or dispatch shared-repo edits to a crewmate while tasks are in flight.

The tracked .tasks.toml pins the optional tasks-axi markdown backend to data/backlog.md, with done_keep = 10 and an archive at data/done-archive.md. When compatible tasks-axi is on PATH, firstmate uses its verbs for routine backlog mutations and keeps secondmate transfers behind fm-backlog-handoff.sh validation; without it, backlog bookkeeping remains manual. Compatible means the shared bootstrap probe accepts tasks-axi --version as 0.1.1 or newer.

Personal preferences for one captain's fleet live locally in data/captain.md; it is gitignored and read after data/projects.md and optional data/secondmates.md during bootstrap.

Persistent secondmate routes live locally in data/secondmates.md. Each line records the secondmate id, charter summary, absolute home path, natural-language scope, project clone list, and added date; fm-home-seed.sh validate refuses duplicate ids, duplicate homes, and nested or overlapping homes. The main first mate routes by reading those scopes with judgment; the project list is provisioning data, not exclusive ownership.

Use fm-home-seed.sh <id> - <project>... to lease a fresh firstmate worktree for the secondmate home. The lease is held under the secondmate id until explicit retirement or seed rollback returns it, so normal restarts do not free or recycle the home. Teardown of a leased home fails closed if treehouse return cannot release the lease; plain-clone homes with no treehouse pool slot are removed directly. Secondmate routes cover no-mistakes and direct-PR projects; local-only projects remain main-firstmate work. For no-mistakes projects, seeding initializes only projects newly cloned into a secondmate home and refuses to mutate a preexisting clone that is not already initialized.

After creating a secondmate, move existing main-backlog items that you have judged in-scope with fm-backlog-handoff.sh <secondmate-id> <item-key>...; it is idempotent and refuses in-flight items or non-secondmate homes. Set FM_SECONDMATE_CHARTER to seed from inline charter text when no filled charter brief exists; set FM_SECONDMATE_SCOPE when the routing scope should differ from the charter text.

FM_HOME selects the operational home for one firstmate instance. When it is unset, the repo root is the home; when it is set, scripts still run from this repo's bin/, but state/, data/, config/, and projects/ come from $FM_HOME.

Harness support is a table in section 4: claude, codex, opencode, and pi are all empirically verified; new harnesses get verified through a supervised trial task before joining the table.
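
The FM_HOME rule above can be sketched in a few lines of bash. This is an illustration under assumptions, not code from the repo: `resolve_fm_home` and `fm_paths` are hypothetical helpers, the repo root is passed in as an argument (the real scripts presumably derive it from their own location), and an empty FM_HOME is treated the same as an unset one.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the operational home for this firstmate instance.
# $1 is the repo root, i.e. the directory that contains bin/.
resolve_fm_home() {
  local repo_root="$1"
  # Unset (or, by assumption, empty) FM_HOME means the repo root is the home.
  printf '%s\n' "${FM_HOME:-$repo_root}"
}

# state/, data/, config/, and projects/ hang off the resolved home,
# while the scripts themselves keep running from "$repo_root/bin".
fm_paths() {
  local home
  home="$(resolve_fm_home "$1")"
  printf '%s\n' "$home/state" "$home/data" "$home/config" "$home/projects"
}
```

Under this sketch, two instances sharing one checkout only need different FM_HOME values to keep their state, data, config, and projects apart.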

Runtime tuning via environment variables (defaults shown):

FM_HOME=                 # optional operational home; unset means this repo root
FM_POLL=15              # seconds between watcher cycles
FM_HEARTBEAT=600        # base seconds between fleet reviews; backs off exponentially while idle
FM_HEARTBEAT_MAX=7200   # heartbeat backoff cap
FM_CHECK_INTERVAL=300   # seconds between slow checks (merged-PR polls)
FM_CHECK_TIMEOUT=30     # seconds allowed per slow check script
FM_GUARD_GRACE=300      # seconds a stale watcher beacon may age before guard warnings
FM_SIGNAL_GRACE=30      # seconds to coalesce nearby status and turn-end signals into one wake
FM_FLEET_SYNC_BOOTSTRAP_TIMEOUT=20   # seconds allowed for bootstrap's best-effort clone refresh
FM_FLEET_PRUNE=1        # set to 0 to skip pruning local branches whose upstream is gone
FM_BUSY_REGEX='esc (to )?interrupt|Working\.\.\.'   # busy-pane signatures, shared by watcher and tmux helper
FM_COMPOSER_IDLE_RE=    # optional empty-composer regex, applied after dim-ghost and border stripping
FM_SEND_RETRIES=3       # fm-send Enter-retry attempts after typing the line once
FM_SEND_SLEEP=0.4       # seconds between fm-send submit checks
# sub-supervisor (bin/fm-supervise-daemon.sh); presence-gated via /afk
FM_SUPERVISOR_TARGET=firstmate:0   # supervisor tmux target (override; auto-discovers from $TMUX_PANE)
FM_INJECT_SKIP=heartbeat           # |-separated prefixes that are always self-handled, bypassing classification; empty disables
FM_STALE_ESCALATE_SECS=240         # idle seconds before a stale pane escalates as a possible wedge
FM_ESCALATE_BATCH_SECS=90          # buffer window for batched escalation digests; 0 = flush immediately
FM_MAX_DEFER_SECS=300              # max buffered escalation age before retry plus wedge alarm; 0 disables
FM_INJECT_CONFIRM_RETRIES=3        # daemon Enter-retry attempts after typing a digest once
FM_INJECT_CONFIRM_SLEEP=0.5        # seconds between daemon submit checks
FM_HEARTBEAT_SCAN_SECS=300         # cadence of the catch-all status scan for missed captain verbs
FM_HOUSEKEEPING_TICK=15            # seconds between batch-flush, stale-recheck, and scan passes
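
FM_HEARTBEAT is documented as a base interval that backs off exponentially while idle, capped by FM_HEARTBEAT_MAX. A minimal sketch of that schedule, assuming the interval doubles after each consecutive idle review: the doubling factor and the `next_heartbeat` helper are assumptions, since the text above states only that the backoff is exponential and capped.

```shell
#!/usr/bin/env bash
: "${FM_HEARTBEAT:=600}"       # base seconds between fleet reviews (documented default)
: "${FM_HEARTBEAT_MAX:=7200}"  # backoff cap (documented default)

# Hypothetical helper: print the wait in seconds before the next review,
# given $1 consecutive idle reviews so far.
next_heartbeat() {
  local idle_reviews="$1" interval="$FM_HEARTBEAT"
  # Double once per idle review, stopping early once the cap is reached.
  while (( idle_reviews > 0 && interval < FM_HEARTBEAT_MAX )); do
    interval=$(( interval * 2 ))
    idle_reviews=$(( idle_reviews - 1 ))
  done
  if (( interval > FM_HEARTBEAT_MAX )); then
    interval="$FM_HEARTBEAT_MAX"
  fi
  printf '%s\n' "$interval"
}
```

With the defaults, this sketch yields 600, 1200, 2400, 4800, and then 7200 for every further idle review.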

Development

Tracked changes to firstmate itself, including AGENTS.md, README.md, CONTRIBUTING.md, .tasks.toml, .github/workflows/, bin/, and agent skill files, ship through the no-mistakes pipeline on a feature branch and require the captain's explicit merge approval. When supervising live crewmates, keep long validation or build work in the background so watcher wakes can still be handled. Human-authored pull requests targeting main must be raised through git push no-mistakes; see CONTRIBUTING.md for the enforced contributor workflow. Local .no-mistakes/ state and test evidence stay out of this repo; .no-mistakes.yaml keeps evidence in a temp directory instead.

The current watcher reliability work keeps the one-shot process model and adds a durable queue plus singleton lock. The presence-gated sub-supervisor (bin/fm-supervise-daemon.sh) provides proactive wake routing for walk-away supervision via the /afk skill; a blocking-waiter split remains a deferred follow-up phase.

bash -n bin/*.sh                          # syntax-check the toolbelt
shellcheck bin/*.sh tests/*.sh            # lint the toolbelt and behavior tests; CI enforces this
for test_script in tests/*.test.sh; do "$test_script"; done   # behavior tests, matching CI
tests/fm-wake-queue.test.sh               # durable wake queue, singleton behavior, sub-supervisor classifier, /afk presence-gating, border-aware composer, max-defer, and fm-send submit tests
tests/fm-composer-ghost.test.sh           # dim-ghost stripping, ghost-only composer detection, and escape-free peek tests
tests/fm-afk-inject-e2e.test.sh           # private-socket end-to-end test of the afk injection path (partial-input deferral, swallowed-Enter retry)
tests/fm-bootstrap.test.sh                # bootstrap dependency and feature-probe tests
tests/fm-update.test.sh                   # fast-forward-only self-update, reread, nudge, dedup, and skip-safety tests
tests/fm-secondmate.test.sh               # persistent secondmate routing, seeding, idle charter, backlog handoff, spawn, recovery, teardown, and FM_HOME tests
tests/fm-teardown.test.sh                 # fm-teardown.sh safety and reminder checks: local-only fork-remote allow, truly-unpushed refuse, merged-to-main allow, no-mistakes regression, tasks-axi reminder, --force override
[ "$(readlink CLAUDE.md)" = "AGENTS.md" ]
[ "$(readlink .claude/skills)" = "../.agents/skills" ]
FM_HEARTBEAT=2 FM_POLL=1 bin/fm-watch.sh  # watcher smoke test (prints "heartbeat")
