fix(autopilot): scope lock file to GBRAIN_HOME#1227
Closed
rafaelreis-r wants to merge 7 commits into
Closed
Conversation
Adds src/core/fts-language.ts with getFtsLanguage(), a centralized helper that reads the GBRAIN_FTS_LANGUAGE env var (default 'english'). Refactors postgres-engine.ts and pglite-engine.ts to use the helper in their search queries, replacing four hardcoded 'english' literals across searchKeyword and searchKeywordChunks. Why: non-English brains lose stemming and stop-word removal because the tokenizer is wired to English regardless of content language. A user storing Portuguese pages currently gets dramatically worse keyword search than an equivalent English brain. This PR fixes the *query side* of the problem with zero behavior change for the default case. The trigger functions in schema.sql/schema-embedded.ts/pglite-schema.ts still hardcode 'english' for the write side — that's covered in a follow-up PR (recreate triggers idempotently from a migration). Validation: - VALID_CONFIG_NAME regex (/^[a-z][a-z0-9_]*$/) blocks SQL injection since Postgres tsvector functions don't accept parameterized config names — the value must be interpolated into the query string. - Invalid values fall back to 'english' with a one-time warning. - Cached after first read; tests reset via resetFtsLanguageCache(). Tests: 14 unit tests covering defaults, cache, validation rules, and SQL-injection guard. Backward-compatible: 100% — default behavior identical when env unset. (cherry picked from commit 43ffe13)
…language Builds on PR garrytan#1 (GBRAIN_FTS_LANGUAGE env var) by extending configurability to the *write side*: the trigger functions that populate pages.search_vector and content_chunks.search_vector now use the language from getFtsLanguage() instead of hardcoded 'english'. Implementation: schema migration v33 (handler-based, not static SQL). The handler reads getFtsLanguage() at apply time and issues CREATE OR REPLACE FUNCTION for the two trigger functions, atomically swapping their bodies. The triggers themselves don't need recreation because they reference the function by name. Backfill: when the configured language differs from 'english', v33 also re-tokenizes existing rows under the new tokenizer (UPDATE-to-self on pages, direct UPDATE on content_chunks). Skipped for 'english' to avoid wasted I/O when defaults are kept. Validation strategy: the language string flows through getFtsLanguage(), which enforces /^[a-z][a-z0-9_]*$/ before interpolation \u2014 SQL injection is structurally impossible. Tests include a deliberate injection attempt ('english\'; DROP TABLE pages; --') that verifies the fallback to 'english' kicks in and no DROP TABLE appears in any emitted SQL. Validated against a real Postgres brain (2782 pages, 4372 chunks): - apply-migrations succeeds with GBRAIN_FTS_LANGUAGE=pt_br - search 'opera\u00e7\u00f5es' (with diacritics) returns hits using pt_br stemmer - re-running migrate is idempotent (CREATE OR REPLACE) - re-running with same env is a no-op (version stays 33) Tests: 7 unit tests covering registration, handler shape, default-vs-non-default backfill behavior, and SQL injection guard. Combined with PR garrytan#1's helper tests (14): 21/21 pass. Limitation: changing GBRAIN_FTS_LANGUAGE *after* v33 has been applied requires resetting config.version to 32 to re-apply (documented in README). PR garrytan#3 in this series introduces 'gbrain reindex --search-vector' to recreate-and-backfill on demand without the version-stamp dance. Backward-compatible: 100% \u2014 default GBRAIN_FTS_LANGUAGE='english' produces identical trigger output to the pre-v33 schema. (cherry picked from commit d73b7e1)
Completes the GBRAIN_FTS_LANGUAGE story (PRs garrytan#1, garrytan#2 in this series) by giving users an explicit way to recreate FTS trigger functions and backfill existing rows after changing the language env var. Why: schema migration v33 (PR garrytan#2) stamps the trigger functions with GBRAIN_FTS_LANGUAGE on first apply and then the migrations runner considers v33 'done'. Users who later change the env var would need to manually reset config.version to re-trigger v33 \u2014 fragile and undocumented. This CLI command is the documented escape hatch: explicit, gated, idempotent. Behavior: - Reads GBRAIN_FTS_LANGUAGE via the same getFtsLanguage() helper as the engines and v33 migration, so all three sources of truth stay in lockstep. - --dry-run shows row counts (pages + chunks affected) without touching the DB. - --yes / -y skips interactive prompt; required in non-TTY contexts. - --json emits a structured result envelope (status, language, counts, durationMs) for scripting. - Trigger recreate is atomic via CREATE OR REPLACE FUNCTION, so the two writes are individually atomic; backfill is two UPDATEs (pages UPDATE-to-self re-fires the trigger; content_chunks gets a direct vector compute). Validated against a real Postgres brain (2782 pages, 4372 chunks): - --dry-run reports correct counts, exits 0 without writes - --yes completes in ~7-8s, search 'opera\u00e7\u00f5es' continues to work afterward - --json output parses cleanly Tests: 6 unit tests covering --dry-run shortcuts, default vs non-default language behavior, SQL injection guard (same as PRs garrytan#1/garrytan#2), and edge cases (empty inventory, durationMs presence). With PRs garrytan#1+garrytan#2: 27/27 unit tests pass. Trade-offs considered: - Could persist language in the config table instead of relying on env var. Decided against: env var is the established pattern in GBrain (GBRAIN_EMBED_MODEL, GBRAIN_BRAIN_ID, GBRAIN_DATABASE_URL etc.) and adding a config table row creates ambiguity about which wins (env vs DB). Single source of truth via env is simpler. - Could auto-detect language drift (compare configured vs trigger body in pg_proc) and warn at startup. Out of scope for this PR; file as a follow-up if there's demand. Backward-compatible: command is additive. Default behavior of the brain (with no language env var set) is unchanged. (cherry picked from commit adf11ec)
… detection, evals, E2E, filing rules, USAGE docs (cherry picked from commit 396aba4)
The lock file path was hardcoded to $HOME/.gbrain/autopilot.lock, ignoring GBRAIN_HOME. When two brains share a host (e.g. main brain and a side brain), only the first autopilot to acquire the lock runs; the second sees a fresh lock (<10min) and silently exits with code 0. Under launchd KeepAlive=true + ThrottleInterval=5s, the second autopilot enters a respawn loop that produces no work but is invisible because the exit is clean. We observed 46,388 silent failures in 3 days on a dual-brain setup before tracing it to this lock. Fix: derive the lock path from GBRAIN_HOME when set, falling back to $HOME/.gbrain. Each brain now gets its own lock and they coexist.
6 tasks
Owner
|
Closing in favor of #1253 (v0.37.7.0 fix wave) which re-implements the same fix against current master via The implementation lives at |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #1226.
Problem
gbrain autopilothardcodes the lock file to$HOME/.gbrain/autopilot.lock, ignoringGBRAIN_HOME. When multiple brains share a host, only one autopilot ever runs — the others silently exit on every launchd/systemd respawn, producing tens of thousands of invisible failures.See #1226 for the full story (46,388 silent failures observed in ~3 days on our dual-brain setup).
Fix
Two-line change: derive the lock path from
GBRAIN_HOMEwhen set, fall back to$HOME/.gbrainfor single-brain installs.Same fallback updates applied to the
mkdirSynccall so the directory creation is also scoped correctly.Verification
bun run typecheckclean.autopilot.lockfiles under their respectiveGBRAIN_HOMEdirs.Compatibility