Autopilot installed via gbrain autopilot --install → launchd plist at ~/Library/LaunchAgents/com.gbrain.autopilot.plist with KeepAlive=true and RunAtLoad=true
Autopilot repo: a git working tree that another tool may force-push or hard-reset (in my case, ~/.claude/skills/gstack which gets git reset --hard origin/main during gstack auto-upgrades)
Trigger
1. gbrain autopilot is running normally against --repo <X>. It has cached a sync anchor commit somewhere in its sync state (this case: 17d8df4d).
2. Something outside gbrain does git reset --hard origin/main (or any other history-rewriting operation) inside <X>. The cached anchor commit is no longer reachable.
3. Next autopilot cycle: git cat-file <anchor> fails. Autopilot logs:
fatal: git cat-file: could not get object info
Sync anchor commit 17d8df4d missing (force push?). Running full reimport.
4. Full reimport starts, never converges. Cycle counter eventually hits "cycle-failure-cap" and logs Autopilot stopping (cycle-failure-cap).
5. Process exits.
6. launchd restarts the process within seconds (KeepAlive=true).
7. New process inherits the same stale anchor, repeats from step 3.
The loop is CPU-bound (~99%) because each reimport is a hot loop reading + chunking + extracting 100+ markdown files.
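The anchor check in the trigger sequence above could tell an anchor whose object is gone apart from one that merely fell off the current branch. A minimal sketch in Python, not gbrain's actual code: `classify_anchor`, `run_git`, and the injectable `run` parameter are hypothetical names. It assumes only two real git commands: `git cat-file -e`, which exits 0 when the object exists, and `git merge-base --is-ancestor`, which exits 0 when the first commit is an ancestor of the second.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes an argv list and returns the process exit code.
Runner = Callable[[Sequence[str]], int]


def run_git(argv: Sequence[str]) -> int:
    """Run a command, discard its output, and return its exit code."""
    return subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    ).returncode


def classify_anchor(repo: str, anchor: str, run: Runner = run_git) -> str:
    """Return 'missing', 'diverged', or 'ok' for a cached sync anchor.

    Hypothetical helper: gbrain's real sync code may be structured differently.
    """
    # The object itself is gone (for example after a re-clone or a gc).
    if run(["git", "-C", repo, "cat-file", "-e", f"{anchor}^{{commit}}"]) != 0:
        return "missing"
    # The object exists but is no longer part of HEAD's history
    # (for example after git reset --hard origin/main).
    if run(["git", "-C", repo, "merge-base", "--is-ancestor", anchor, "HEAD"]) != 0:
        return "diverged"
    return "ok"
```

Both "missing" and "diverged" would call for a full reimport; separating them mainly makes the log message more precise than "force push?".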
What actually broke for me
Because PGLite is single-process and autopilot held the database file open continuously, gbrain serve (the MCP entry point) could not acquire the database and blocked at startup. Claude Code's MCP client timed out the connection, surfacing as mcp__gbrain__* disconnected in every session for hours. The autopilot loop was invisible to gbrain doctor (--fast --json showed status: warnings, health_score: 90) because the doctor checks did not try to acquire the DB.
This is the lock-contention case that #677 anticipates, with autopilot as the antagonist holding the lock.
Evidence
~/.gbrain/autopilot.err (excerpt across multiple respawns):
fatal: git cat-file: could not get object info
Sync anchor commit 17d8df4d missing (force push?). Running full reimport.
[gbrain import] Skipping symlink: …
[import.files] start
[import.files] 4/123 (3%) imported=4 skipped=0 errors=0
…
[import.files] 78/123 (63%) imported=78 skipped=0 errors=0
[cycle.lint] start
[cycle.lint] done
[cycle.backlinks] start
…
[cycle.sync] start
fatal: git cat-file: could not get object info
Sync anchor commit 17d8df4d missing (force push?). Running full reimport.
…
~/.gbrain/autopilot.log (showing cycle-failure-cap exit and immediate respawn):
[cycle-inline partial] lint=0 backlinks=0 synced=0 extracted=0 embedded=0 orphans=124
[cycle] score=10 elapsed=1s next=150s
Autopilot stopping (cycle-failure-cap).
Autopilot starting. Repo: /Users/…/.claude/skills/gstack, interval: 300s
[autopilot] running steps inline (engine=pglite)
Running full import of /Users/…/.claude/skills/gstack...
Found 123 markdown files
Stale lock file found (>10 min). Taking over.
Expected behavior
A missing sync anchor commit should trigger at most one full reimport, after which a new anchor is established at current HEAD.
If full reimport cannot converge, autopilot should exit with a non-zero status and a clear log message; under KeepAlive=true the OS will restart it, but at the very least the loop should make forward progress.
Alternatively: cycle-failure-cap should write a sentinel file or set state such that autopilot self-disables on next launch until manually re-enabled, instead of being respawned indefinitely by launchd.
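The "at most one full reimport, then re-anchor at HEAD" expectation can be sketched as follows. This is an illustration under assumptions, not gbrain's implementation: the JSON state file, its `anchor` key, and the `sync_once` function with its callback parameters are all hypothetical.

```python
import json
from pathlib import Path
from typing import Callable


def sync_once(
    state_path: Path,
    head: str,
    anchor_ok: Callable[[str], bool],
    full_reimport: Callable[[], bool],
    incremental: Callable[[str, str], bool],
) -> str:
    """Run one sync and persist HEAD as the new anchor when it succeeds."""
    state = json.loads(state_path.read_text()) if state_path.exists() else {}
    anchor = state.get("anchor")

    if anchor is not None and anchor_ok(anchor):
        action = "incremental"
        ok = incremental(anchor, head)
    else:
        # Anchor missing or unusable: fall back to a full reimport.
        action = "full-reimport"
        ok = full_reimport()

    if not ok:
        # Leave the state untouched so the caller can report the failure.
        return action + "-failed"

    # Write the new anchor to a temp file, then rename it over the old state,
    # so a restart never sees a half-written file or the stale anchor.
    tmp = state_path.with_suffix(".tmp")
    tmp.write_text(json.dumps({"anchor": head}))
    tmp.replace(state_path)
    return action
```

The point of the sketch is the last step: the anchor written after a successful full reimport is the current HEAD, and it is persisted to disk, so a process restarted by launchd would start from the new anchor rather than the stale one.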
Actual behavior
Full reimport runs every cycle but never updates the sync anchor (or updates it to a value that doesn't survive a process restart).
cycle-failure-cap exits the process, but launchd respawns it. The "stopping" message is misleading: autopilot is not actually stopped, because the OS supervisor brings it straight back up.
MCP server is silently unavailable for as long as the loop runs.
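One way to make "stopping" stick across respawns is the sentinel-file idea raised in this issue. A minimal sketch, assuming a hypothetical entrypoint `autopilot_main` and a hypothetical `run_cycles` callback that returns a stop reason such as "cycle-failure-cap" (or None on a normal shutdown); none of these names come from gbrain.

```python
import time
from pathlib import Path
from typing import Callable, Optional


def write_sentinel(sentinel: Path, reason: str) -> None:
    """Record why autopilot disabled itself, for the user to inspect."""
    sentinel.parent.mkdir(parents=True, exist_ok=True)
    sentinel.write_text(f"disabled_at={int(time.time())} reason={reason}\n")


def autopilot_main(sentinel: Path, run_cycles: Callable[[], Optional[str]]) -> int:
    """Hypothetical entrypoint: refuse to do any work while the sentinel exists."""
    if sentinel.exists():
        # Return before opening the database, so other processes can acquire it.
        print(f"autopilot disabled by {sentinel}; remove the file to re-enable")
        return 0

    stop_reason = run_cycles()
    if stop_reason is not None:
        write_sentinel(sentinel, stop_reason)
    return 0
```

Under KeepAlive=true the supervisor may still relaunch the process, but each launch would return before touching the database, so the hot reimport loop and the lock contention would not recur until the file is removed by hand.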
Workaround
Once autopilot is unloaded, gbrain serve acquires the DB cleanly and MCP works. Re-enabling autopilot would re-trigger the loop, so I'm leaving it off pending this fix.
Suggested fixes (non-prescriptive)
When git cat-file <anchor> fails, write the repo's current HEAD as the new anchor once the full reimport completes, instead of keeping the previously cached value. The current code seems to keep the old anchor across reimports (or never writes a new one until an incremental sync would have).
Make cycle-failure-cap write a persistent sentinel (~/.gbrain/autopilot-disabled or similar) that the autopilot entrypoint checks before doing any work. Refuse to run until cleared manually. This breaks the launchd respawn loop without requiring the user to unload the plist.
Consider whether gbrain autopilot --install should generate a plist with KeepAlive set to a dictionary (KeepAlive = { SuccessfulExit = false }) instead of true, so that a clean cycle-failure-cap exit (which is intentional, not a crash) doesn't get respawned.
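The KeepAlive suggestion can be illustrated by generating both plist variants with Python's standard plistlib. This is a sketch only: the `autopilot_plist` function is hypothetical, the Label value is assumed from the plist file name, and the ProgramArguments are placeholders rather than what gbrain autopilot --install actually writes.

```python
import plistlib
from typing import Sequence


def autopilot_plist(program_args: Sequence[str], respawn_on_clean_exit: bool) -> bytes:
    """Build a launchd job definition as XML plist bytes."""
    # KeepAlive=true restarts the job after every exit. The dictionary form
    # with SuccessfulExit=false restarts it only after a non-zero exit status.
    keep_alive = True if respawn_on_clean_exit else {"SuccessfulExit": False}
    return plistlib.dumps(
        {
            "Label": "com.gbrain.autopilot",  # assumed from the file name
            "ProgramArguments": list(program_args),
            "RunAtLoad": True,
            "KeepAlive": keep_alive,
        }
    )


if __name__ == "__main__":
    args = ["gbrain", "autopilot", "--repo", "<X>"]  # placeholder arguments
    print(autopilot_plist(args, respawn_on_clean_exit=False).decode())
```

The dictionary form only helps if the cycle-failure-cap path exits with status 0; a non-zero exit would still be respawned.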
… gbrain serve can coexist with autopilot on PGLite.
Environment
gbrain 0.26.0 (bun script install via bun install -g gbrain)
Engine: pglite (~/.gbrain/config.json engine: pglite)
Related
#677 (PGLite MCP server and maintenance commands need a cooperative single-owner mode): gbrain doctor should surface "autopilot loop" or "DB held by long-running process" as a check, in line with that issue's cooperative-single-owner direction.