fix(autopilot): scope lock file to GBRAIN_HOME by rafaelreis-r · Pull Request #1227 · garrytan/gbrain

rafaelreis-r · 2026-05-20T13:18:11Z

Problem

gbrain autopilot hardcodes the lock file to $HOME/.gbrain/autopilot.lock, ignoring GBRAIN_HOME. When multiple brains share a host, only one autopilot ever runs — the others silently exit on every launchd/systemd respawn, producing tens of thousands of invisible failures.

See #1226 for the full story (46,388 silent failures observed in ~3 days on our dual-brain setup).

Fix

Two-line change: derive the lock path from GBRAIN_HOME when set, fall back to $HOME/.gbrain for single-brain installs.

const gbrainHome = process.env.GBRAIN_HOME || join(process.env.HOME || '', '.gbrain');
const lockPath = join(gbrainHome, 'autopilot.lock');

The same fallback is applied to the mkdirSync call, so directory creation is scoped to the same location as the lock file.
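For illustration, the path derivation and directory creation described above could be wrapped as follows. This is a minimal sketch, not the code in the PR: `resolveLockPath` and `ensureLockDir` are hypothetical helper names, and passing the environment in as an argument is an assumption made so the logic can be exercised without mutating `process.env`.

```typescript
import { mkdirSync } from 'node:fs';
import { dirname, join } from 'node:path';

// Hypothetical helper (not from the PR): resolve the autopilot lock path.
// GBRAIN_HOME wins when set; otherwise fall back to $HOME/.gbrain.
export function resolveLockPath(
  env: Record<string, string | undefined> = process.env,
): string {
  const gbrainHome = env.GBRAIN_HOME || join(env.HOME || '', '.gbrain');
  return join(gbrainHome, 'autopilot.lock');
}

// Hypothetical helper: create the directory that will hold the lock file.
// `recursive: true` makes this a no-op when the directory already exists.
export function ensureLockDir(lockPath: string): void {
  mkdirSync(dirname(lockPath), { recursive: true });
}

// Two brains on one host resolve to two distinct lock files.
const mainLock = resolveLockPath({ HOME: '/Users/me' });
const sideLock = resolveLockPath({ HOME: '/Users/me', GBRAIN_HOME: '/Users/me/side-brain' });
console.log(mainLock !== sideLock);
```

With `GBRAIN_HOME` unset the result is the same `$HOME/.gbrain/autopilot.lock` path as before the change, which is what keeps single-brain installs unaffected.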

Verification

bun run typecheck clean.
Tested locally on a dual-brain mac-mini setup: both autopilots now coexist with separate autopilot.lock files under their respective GBRAIN_HOME dirs.
Single-brain installs unaffected (env var unset → same fallback path as before).

Compatibility

Backwards compatible: existing single-brain setups behave identically.
No schema or config changes.
Stale lock takeover logic (>10min mtime) unchanged.

Adds src/core/fts-language.ts with getFtsLanguage(), a centralized helper that reads the GBRAIN_FTS_LANGUAGE env var (default 'english'). Refactors postgres-engine.ts and pglite-engine.ts to use the helper in their search queries, replacing four hardcoded 'english' literals across searchKeyword and searchKeywordChunks.

Why: non-English brains lose stemming and stop-word removal because the tokenizer is wired to English regardless of content language. A user storing Portuguese pages currently gets dramatically worse keyword search than an equivalent English brain.

This PR fixes the *query side* of the problem with zero behavior change for the default case. The trigger functions in schema.sql/schema-embedded.ts/pglite-schema.ts still hardcode 'english' for the write side; that's covered in a follow-up PR (recreate triggers idempotently from a migration).

Validation:
- VALID_CONFIG_NAME regex (/^[a-z][a-z0-9_]*$/) blocks SQL injection since Postgres tsvector functions don't accept parameterized config names; the value must be interpolated into the query string.
- Invalid values fall back to 'english' with a one-time warning.
- Cached after first read; tests reset via resetFtsLanguageCache().

Tests: 14 unit tests covering defaults, cache, validation rules, and SQL-injection guard.

Backward-compatible: 100%. Default behavior is identical when the env var is unset.

(cherry picked from commit 43ffe13)
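The validation rules above can be sketched roughly as below. This is an illustrative reconstruction from the commit message, not the contents of src/core/fts-language.ts: the function names and the regex come from the message, while the optional `env` parameter, the warning text, and the module-level variables are assumptions.

```typescript
// Regex quoted in the commit message: lowercase identifier-like names only.
const VALID_CONFIG_NAME = /^[a-z][a-z0-9_]*$/;
const DEFAULT_LANGUAGE = 'english';

let cached: string | undefined; // assumption: simple module-level cache
let warned = false;             // assumption: guards the one-time warning

// `env` is an assumed parameter so the sketch can be tested without
// touching process.env; the described helper reads GBRAIN_FTS_LANGUAGE.
export function getFtsLanguage(
  env: Record<string, string | undefined> = process.env,
): string {
  if (cached !== undefined) return cached;

  const raw = env.GBRAIN_FTS_LANGUAGE;
  if (raw === undefined || raw === '') {
    cached = DEFAULT_LANGUAGE;
  } else if (VALID_CONFIG_NAME.test(raw)) {
    cached = raw;
  } else {
    if (!warned) {
      console.warn(`Invalid GBRAIN_FTS_LANGUAGE, falling back to '${DEFAULT_LANGUAGE}'`);
      warned = true;
    }
    cached = DEFAULT_LANGUAGE;
  }
  return cached;
}

// Lets tests re-read the environment between cases.
export function resetFtsLanguageCache(): void {
  cached = undefined;
}
```

Because the returned value can only match the regex or be the default, a caller that interpolates it into a query string never sees quotes, semicolons, or whitespace.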

…language

Builds on PR garrytan#1 (GBRAIN_FTS_LANGUAGE env var) by extending configurability to the *write side*: the trigger functions that populate pages.search_vector and content_chunks.search_vector now use the language from getFtsLanguage() instead of hardcoded 'english'.

Implementation: schema migration v33 (handler-based, not static SQL). The handler reads getFtsLanguage() at apply time and issues CREATE OR REPLACE FUNCTION for the two trigger functions, atomically swapping their bodies. The triggers themselves don't need recreation because they reference the function by name.

Backfill: when the configured language differs from 'english', v33 also re-tokenizes existing rows under the new tokenizer (UPDATE-to-self on pages, direct UPDATE on content_chunks). Skipped for 'english' to avoid wasted I/O when defaults are kept.

Validation strategy: the language string flows through getFtsLanguage(), which enforces /^[a-z][a-z0-9_]*$/ before interpolation, so SQL injection is structurally impossible. Tests include a deliberate injection attempt ('english\'; DROP TABLE pages; --') that verifies the fallback to 'english' kicks in and no DROP TABLE appears in any emitted SQL.

Validated against a real Postgres brain (2782 pages, 4372 chunks):
- apply-migrations succeeds with GBRAIN_FTS_LANGUAGE=pt_br
- search 'operações' (with diacritics) returns hits using pt_br stemmer
- re-running migrate is idempotent (CREATE OR REPLACE)
- re-running with same env is a no-op (version stays 33)

Tests: 7 unit tests covering registration, handler shape, default-vs-non-default backfill behavior, and SQL injection guard. Combined with PR garrytan#1's helper tests (14): 21/21 pass.

Limitation: changing GBRAIN_FTS_LANGUAGE *after* v33 has been applied requires resetting config.version to 32 to re-apply (documented in README). PR garrytan#3 in this series introduces 'gbrain reindex --search-vector' to recreate-and-backfill on demand without the version-stamp dance.

Backward-compatible: 100%. Default GBRAIN_FTS_LANGUAGE='english' produces identical trigger output to the pre-v33 schema.

(cherry picked from commit d73b7e1)
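A rough sketch of how a handler-based migration might assemble its SQL from the configured language follows. Everything schema-specific here is an assumption for illustration: the trigger function names, the `title` and `chunk_text` columns, and `buildV33Statements` itself are hypothetical and not taken from the actual v33 migration. Only the overall shape (two CREATE OR REPLACE FUNCTION statements, plus a backfill that is skipped for 'english') follows the description above.

```typescript
const VALID_CONFIG_NAME = /^[a-z][a-z0-9_]*$/;

// Hypothetical builder: returns the SQL a v33-style handler would run.
export function buildV33Statements(requested: string): string[] {
  // Same guard as the helper: anything not matching falls back to 'english'.
  const language = VALID_CONFIG_NAME.test(requested) ? requested : 'english';

  const statements = [
    // Hypothetical function and column names.
    `CREATE OR REPLACE FUNCTION update_page_search_vector() RETURNS trigger AS $$
BEGIN
  NEW.search_vector := to_tsvector('${language}', coalesce(NEW.title, ''));
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;`,
    `CREATE OR REPLACE FUNCTION update_chunk_search_vector() RETURNS trigger AS $$
BEGIN
  NEW.search_vector := to_tsvector('${language}', coalesce(NEW.chunk_text, ''));
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;`,
  ];

  // Backfill only when the tokenizer actually changed.
  if (language !== 'english') {
    // UPDATE-to-self re-fires the pages trigger for every row.
    statements.push(`UPDATE pages SET title = title;`);
    // Direct recompute for chunks.
    statements.push(
      `UPDATE content_chunks SET search_vector = to_tsvector('${language}', coalesce(chunk_text, ''));`,
    );
  }
  return statements;
}
```

Returning the statements as data rather than executing them directly is a design choice of this sketch; it makes the "no DROP TABLE appears in any emitted SQL" style of test straightforward to write.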

Completes the GBRAIN_FTS_LANGUAGE story (PRs garrytan#1, garrytan#2 in this series) by giving users an explicit way to recreate FTS trigger functions and backfill existing rows after changing the language env var.

Why: schema migration v33 (PR garrytan#2) stamps the trigger functions with GBRAIN_FTS_LANGUAGE on first apply and then the migrations runner considers v33 'done'. Users who later change the env var would need to manually reset config.version to re-trigger v33, which is fragile and undocumented. This CLI command is the documented escape hatch: explicit, gated, idempotent.

Behavior:
- Reads GBRAIN_FTS_LANGUAGE via the same getFtsLanguage() helper as the engines and v33 migration, so all three sources of truth stay in lockstep.
- --dry-run shows row counts (pages + chunks affected) without touching the DB.
- --yes / -y skips interactive prompt; required in non-TTY contexts.
- --json emits a structured result envelope (status, language, counts, durationMs) for scripting.
- Trigger recreate is atomic via CREATE OR REPLACE FUNCTION, so the two writes are individually atomic; backfill is two UPDATEs (pages UPDATE-to-self re-fires the trigger; content_chunks gets a direct vector compute).

Validated against a real Postgres brain (2782 pages, 4372 chunks):
- --dry-run reports correct counts, exits 0 without writes
- --yes completes in ~7-8s, search 'operações' continues to work afterward
- --json output parses cleanly

Tests: 6 unit tests covering --dry-run shortcuts, default vs non-default language behavior, SQL injection guard (same as PRs garrytan#1/garrytan#2), and edge cases (empty inventory, durationMs presence). With PRs garrytan#1+garrytan#2: 27/27 unit tests pass.

Trade-offs considered:
- Could persist language in the config table instead of relying on env var. Decided against: env var is the established pattern in GBrain (GBRAIN_EMBED_MODEL, GBRAIN_BRAIN_ID, GBRAIN_DATABASE_URL etc.) and adding a config table row creates ambiguity about which wins (env vs DB). Single source of truth via env is simpler.
- Could auto-detect language drift (compare configured vs trigger body in pg_proc) and warn at startup. Out of scope for this PR; file as a follow-up if there's demand.

Backward-compatible: command is additive. Default behavior of the brain (with no language env var set) is unchanged.

(cherry picked from commit adf11ec)
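The flag gating described in the Behavior list could look something like the sketch below. The flag names and the envelope fields (status, language, counts, durationMs) come from the commit message; the `Decision` values, the shape of `counts`, and both function names are assumptions, not the command's real internals.

```typescript
interface ReindexFlags {
  dryRun: boolean;
  yes: boolean;
  json: boolean;
}

// Assumed shape of the --json result envelope; only the top-level
// field names are taken from the commit message.
export interface ReindexResult {
  status: string;
  language: string;
  counts: { pages: number; chunks: number };
  durationMs: number;
}

type Decision = 'dry-run' | 'proceed' | 'prompt' | 'refuse';

// Hypothetical flag parser for the reindex command.
export function parseFlags(argv: string[]): ReindexFlags {
  return {
    dryRun: argv.includes('--dry-run'),
    yes: argv.includes('--yes') || argv.includes('-y'),
    json: argv.includes('--json'),
  };
}

// Hypothetical gate: decides what the command should do before any write.
export function decide(flags: ReindexFlags, isTTY: boolean): Decision {
  if (flags.dryRun) return 'dry-run'; // report counts, never write
  if (flags.yes) return 'proceed';    // explicit consent, no prompt
  // Without --yes, an interactive terminal can be asked; a script cannot.
  return isTTY ? 'prompt' : 'refuse';
}
```

Checking `--dry-run` first means a dry run never reaches the confirmation path, which matches the stated guarantee that it exits without touching the DB.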

… detection, evals, E2E, filing rules, USAGE docs (cherry picked from commit 396aba4)

The lock file path was hardcoded to $HOME/.gbrain/autopilot.lock, ignoring GBRAIN_HOME. When two brains share a host (e.g. main brain and a side brain), only the first autopilot to acquire the lock runs; the second sees a fresh lock (<10min) and silently exits with code 0. Under launchd KeepAlive=true + ThrottleInterval=5s, the second autopilot enters a respawn loop that produces no work but is invisible because the exit is clean. We observed 46,388 silent failures in 3 days on a dual-brain setup before tracing it to this lock. Fix: derive the lock path from GBRAIN_HOME when set, falling back to $HOME/.gbrain. Each brain now gets its own lock and they coexist.
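To make the failure mode concrete, here is a simplified sketch of an mtime-based lock check with the 10-minute stale threshold mentioned above. It is an assumption-laden illustration rather than the logic in autopilot.ts: `tryAcquireLock` is a hypothetical name, the lock content (a PID) is assumed, and the check-then-write sequence is deliberately naive and not race-free.

```typescript
import { mkdtempSync, statSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

const STALE_AFTER_MS = 10 * 60 * 1000; // locks older than 10 minutes may be taken over

// Hypothetical helper: returns true if this process now holds the lock.
export function tryAcquireLock(lockPath: string, now: number = Date.now()): boolean {
  try {
    const { mtimeMs } = statSync(lockPath);
    // A fresh lock means another autopilot is presumed alive: back off.
    if (now - mtimeMs < STALE_AFTER_MS) return false;
  } catch (err) {
    // A missing lock file is fine; anything else is a real error.
    if ((err as NodeJS.ErrnoException).code !== 'ENOENT') throw err;
  }
  writeFileSync(lockPath, String(process.pid)); // create the lock or take over a stale one
  return true;
}

// Two brains with separate home directories (temp dirs stand in for GBRAIN_HOME).
const brainA = mkdtempSync(join(tmpdir(), 'brain-a-'));
const brainB = mkdtempSync(join(tmpdir(), 'brain-b-'));

console.log(tryAcquireLock(join(brainA, 'autopilot.lock'))); // true
console.log(tryAcquireLock(join(brainB, 'autopilot.lock'))); // true: separate lock file
console.log(tryAcquireLock(join(brainA, 'autopilot.lock'))); // false: fresh lock already held
```

With a single shared path, the second brain always lands in the `false` branch; if the caller then exits with code 0, a supervisor that restarts on exit simply repeats the cycle, which is the silent respawn loop described in the commit message.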

garrytan · 2026-05-21T04:42:08Z

Closing in favor of #1253 (v0.37.7.0 fix wave) which re-implements the same fix against current master via gbrainPath('autopilot.lock') — the canonical GBRAIN_HOME-aware helper from src/core/config.ts. Same fix shape, same outcome.

The implementation lives at src/commands/autopilot.ts:122-130 with a regression test at test/autopilot-lock-path.test.ts. Your contribution is acknowledged via Co-Authored-By: rafaelreis-r trailer. Thank you for the quick patch — closing #1226 honestly.

rafaelreisR and others added 7 commits May 17, 2026 17:11

fix(migrate): close preceding migration object before v67 entry

26a702b

test: quarantine FTS/reindex tests as *.serial (env mutation isolation)

7a28e29

Sprint 6+7: gdoc-ingest v0.7.0 production-ready — Iron Law, successor…

684032a

… detection, evals, E2E, filing rules, USAGE docs (cherry picked from commit 396aba4)

garrytan mentioned this pull request May 21, 2026

v0.37.7.0 fix wave: federated brains + autopilot safety + OAuth confidential clients #1253

Open

6 tasks

garrytan closed this May 21, 2026

rafaelreis-r wants to merge 7 commits into garrytan:master from rafaelreis-r:fix/autopilot-lock-per-gbrain-home

4 participants
