Summary
gbrain autopilot uses a lock file at $HOME/.gbrain/autopilot.lock regardless of GBRAIN_HOME. When two brains run on the same host (each with its own GBRAIN_HOME and database), they compete for the same lock — the second autopilot silently exits with code 0 every time, producing a silent respawn loop under launchd/systemd.
Affected
- v0.35.6 (confirmed; introduced when multi-brain support landed)
- Any setup with multiple brains sharing a host
Repro
- Two brains: ~/.gbrain/ and ~/.gbrain-side/, each with its own config.json and Postgres DB.
- Two launchd KeepAlive agents, each setting its own GBRAIN_HOME env:
  com.gbrain.autopilot → GBRAIN_HOME=~/.gbrain
  com.gbrain-side.autopilot → GBRAIN_HOME=~/.gbrain-side
- Start both.
- Observe: only the first autopilot to take the lock does any work. The second exits with "Another autopilot instance is running (lock file is fresh). Exiting." and launchd respawns it every ThrottleInterval seconds.
In our setup this produced 46,388 silent failures in ~3 days before we traced it.
Root cause
src/commands/autopilot.ts:
const lockPath = join(process.env.HOME || '', '.gbrain', 'autopilot.lock');
The path is hardcoded to $HOME/.gbrain/, ignoring GBRAIN_HOME. Every other piece of state (config, sync_repo_path, DB) is correctly scoped by GBRAIN_HOME; the lock is the one place it isn't.
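The collision can be reproduced without running autopilot at all. The following is a minimal sketch, not the project's code: legacyLockPath is a hypothetical helper that mirrors the expression quoted above, and the /Users/me paths are example values.

```typescript
import { join } from 'node:path';

// Hypothetical helper mirroring the hardcoded expression from autopilot.ts.
// `env` is passed in explicitly so the sketch does not read the real environment.
function legacyLockPath(env: Record<string, string | undefined>): string {
  // GBRAIN_HOME is never consulted here, which is the bug.
  return join(env.HOME || '', '.gbrain', 'autopilot.lock');
}

// Two agents for the same user, differing only in GBRAIN_HOME (example paths).
const mainAgentEnv = { HOME: '/Users/me', GBRAIN_HOME: '/Users/me/.gbrain' };
const sideAgentEnv = { HOME: '/Users/me', GBRAIN_HOME: '/Users/me/.gbrain-side' };

const mainLegacyLock = legacyLockPath(mainAgentEnv);
const sideLegacyLock = legacyLockPath(sideAgentEnv);

// Both agents resolve the same file, so whichever starts second sees a fresh lock.
console.log(mainLegacyLock === sideLegacyLock);
```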
Why it stays invisible
- Exit code is 0 (not a crash) → launchd considers it a normal restart.
- Log line is informational, not an error.
- launchctl list shows the agent as healthy because it's always "currently running" between respawns.
- Process never gets to gbrain dream / extract / embed / orphans → the brain stops evolving but nothing alerts.
Fix
Scope the lock path to GBRAIN_HOME when set. PR: #
const gbrainHome = process.env.GBRAIN_HOME || join(process.env.HOME || '', '.gbrain');
const lockPath = join(gbrainHome, 'autopilot.lock');
Each brain gets its own lock; they coexist cleanly. Single-brain installs are unaffected (fallback to $HOME/.gbrain).
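To make the behavior easy to check in a unit test, the two lines above could be wrapped in a small function. This is only a sketch under assumptions: resolveLockPath and its injectable env parameter are hypothetical names, not part of the existing code, and the /Users/me paths are example values.

```typescript
import { join } from 'node:path';

// Hypothetical wrapper around the proposed two-line fix.
// `env` defaults to process.env but can be injected for testing.
function resolveLockPath(
  env: Record<string, string | undefined> = process.env,
): string {
  // Prefer GBRAIN_HOME when set; otherwise fall back to $HOME/.gbrain.
  const gbrainHome = env.GBRAIN_HOME || join(env.HOME || '', '.gbrain');
  return join(gbrainHome, 'autopilot.lock');
}

// Multi-brain: each GBRAIN_HOME yields its own lock file.
const mainLock = resolveLockPath({ HOME: '/Users/me', GBRAIN_HOME: '/Users/me/.gbrain' });
const sideLock = resolveLockPath({ HOME: '/Users/me', GBRAIN_HOME: '/Users/me/.gbrain-side' });

// Single-brain: with GBRAIN_HOME unset, the path matches the old default.
const defaultLock = resolveLockPath({ HOME: '/Users/me' });

console.log(mainLock !== sideLock);    // the two brains no longer share a lock
console.log(defaultLock === mainLock); // the fallback is unchanged
```

Using || rather than ?? means an empty GBRAIN_HOME also falls back to the default, which matches the fix as written.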