fix(sync): scope auto-embed to source by hnshah · Pull Request #1120 · garrytan/gbrain

hnshah · 2026-05-17T19:32:08Z

Summary

Dogfooding incremental code sync against a registered local source exposed a bug: sync imported the changed page under the correct source, but the auto-embed step then tried to embed the slug without passing --source.

That falls back to source=default, which can print:

Error embedding hello-js: Page not found: hello-js (source=default)

This threads sourceId into the auto-embed call so incremental source sync embeds the same source row it just imported.

Dogfood evidence

Before:

[sync.imports] 1/1 (100%) hello.js
  Error embedding hello-js: Page not found: hello-js (source=default)
Synced bb58d793..f0f41e36:
  +0 added, ~1 modified, -0 deleted, R0 renamed
  2 chunks created, 1 pages embedded

After:

[sync.imports] 1/1 (100%) hello.js
hello-js: all 2 chunks already embedded
Synced f0f41e36..86977128:
  +0 added, ~1 modified, -0 deleted, R0 renamed
  2 chunks created, 1 pages embedded

Tests

bun test test/sync.test.ts --timeout 120000

47 passing.

`gbrain sync --source-id X` triggered auto-embed for the affected slugs but `runEmbed` ran with no `--source` flag, so it fell back to the default source. For non-default-source syncs the page row lives at (sourceId, slug): the embed code saw "Page not found" for the right slug under the wrong source, swallowed the error as best-effort, and the sync result reported `embedded: 0` for the wrong reason.

`buildAutoEmbedArgs(slugs, sourceId)` is the new helper: when sourceId is set, prepends `--source X`. Exported for the regression test.

Pairs with the upcoming source-id write-path audit (P1 #8).

Cherry-picked from PR #1120.

Co-Authored-By: hnshah <hnshah@users.noreply.github.com>
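For readers who want the shape of the helper, here is a minimal sketch of what `buildAutoEmbedArgs` might look like. Only the name, the two parameters, and the "prepend `--source X` when sourceId is set" behavior come from the commit message. The `--slugs` flag is a placeholder assumption, and the real argument layout in gbrain may differ.

```typescript
// Hypothetical sketch, not the actual gbrain implementation.
// Assumption: slugs are passed after a `--slugs` flag (placeholder name).
function buildAutoEmbedArgs(slugs: string[], sourceId?: string): string[] {
  const args = ["--slugs", ...slugs];
  // Prepend `--source <id>` only when the sync ran against a specific source,
  // so the embed step looks up the same (sourceId, slug) row that sync wrote.
  return sourceId ? ["--source", sourceId, ...args] : args;
}

// With a source id the flag leads the argument list; without one,
// the arguments are unchanged and the embed step uses its default source.
const scoped = buildAutoEmbedArgs(["hello-js"], "local");
const unscoped = buildAutoEmbedArgs(["hello-js"]);
console.log(scoped, unscoped);
```

Keeping the argument construction in a pure function is what makes a regression test cheap: the test can assert on the returned array without running an embed.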

@sharziki

* fix(sync): accept .tf / .tfvars / .hcl in CODE_EXTENSIONS Terraform repos were invisible to `gbrain sync --strategy code` because the three HCL-family extensions never reached the file walker. Silent data loss — the user thinks the sync covered the repo but the IaC layer was dropped on the floor. detectCodeLanguage() returns null for these extensions, so the chunker falls back to recursive (no tree-sitter grammar for HCL) — the same path toml/yaml take. Closes #878. Co-Authored-By: johnybradshaw <johnybradshaw@users.noreply.github.com> * fix(upgrade): run `bun update gbrain` from Bun's global install root `gbrain upgrade --strategy bun` was failing on canonical `bun install -g github:garrytan/gbrain` installs because `execSync('bun update gbrain')` ran in the user's shell cwd. Bun's update operates on whatever package.json it finds via cwd-walk, so a user not standing in the global root got "No package.json, so nothing to update". resolveBunGlobalRoot() returns the right directory: 1. `$BUN_INSTALL/install/global` when set (operator override). 2. `~/.bun/install/global` (Bun's documented default). 3. Walk up from realpath(argv[1]) looking for `node_modules/gbrain` — handles non-standard installs without trusting argv naming. execFileSync replaces execSync (no shell), with cwd pinned. Error path prints the exact `cd && bun update` recovery command instead of a vague hint. Closes #1029. Cherry-picked from PR #1032. Co-Authored-By: mvanhorn <mvanhorn@users.noreply.github.com> * fix(config): redact sensitive values in `config set` output (closes #892) `gbrain config set openai_api_key sk-...` was echoing the full key to stderr via `console.log('Set %s = %s', key, value)`. Shell scrollback and tmux scroll buffers commonly retain stderr for hours; a screen-share or shoulder-glance during set leaked the secret. The `show` path already redacted but used a naive `.includes('key')` substring check that would mask 'monkey' or 'parsekey' (no false-negative but ugly). 
Single source of truth: `isSensitiveConfigKey()` uses a word-boundary regex (`(^|[._-])(key|secret|token|password|pwd|passwd|auth)([._-]|$)/i`) so 'openai_api_key' matches but 'monkey' doesn't. `redactConfigValue()` composes the postgresql:// URL redactor + sensitive-key check, used by both `show` and `set`. Helpers exported for unit tests. Closes #892. Cherry-pick of @sharziki's PR #918 (config.ts hunk only — the extract.ts walker change in that PR is unrelated and tracked in #202). Co-Authored-By: sharziki <sharziki@users.noreply.github.com> * fix(oauth): throw InvalidTokenError so bearerAuth returns 401, not 500 `verifyAccessToken` was throwing bare `Error` on expired or invalid tokens. The MCP SDK's `requireBearerAuth` middleware catches `InvalidTokenError` and returns 401 with WWW-Authenticate; bare Error falls through to 500. Result: legitimate clients with stale tokens hit 500-not-401, so token-refresh logic (which keys off 401) never fires. Two call sites in verifyAccessToken: token-expired path and invalid-token path. Both now throw InvalidTokenError. Existing tests continue to pass because they assert on the throw, not the message class. Closes #935. Cherry-picked from PR #1012. Co-Authored-By: Aashiqe10 <Aashiqe10@users.noreply.github.com> * fix(serve): return 405 on GET /mcp instead of 404 MCP Streamable HTTP spec says GET /mcp opens an optional SSE backchannel for server-initiated messages. gbrain's transport is stateless and doesn't push server-initiated messages, so per spec we MUST return 405 with Allow: POST, DELETE — not 404. Probing clients (claude.ai, etc.) distinguish "endpoint exists, no SSE channel" from "endpoint missing" on this status code; 404 makes them give up. Cherry-picked from PR #1076. 
Co-Authored-By: lukejduncan <lukejduncan@users.noreply.github.com> * fix(doctor): resolve whoknows fixture from module location, not cwd `gbrain doctor` warned about a missing whoknows fixture for every install that wasn't standing in the gbrain source repo at run time — which is everyone. The check used `process.cwd()` to locate the fixture, so any real user (running doctor against `~/.gbrain`) saw a spurious warning. `resolveWhoknowsFixturePath()` walks up from `import.meta.url` looking for the source-repo signature (`src/cli.ts` + `skills/RESOLVER.md`), respects `GBRAIN_WHOKNOWS_FIXTURE_PATH` env override (absolute or cwd-relative), and returns null with an actionable warning when the fixture can't be located. Closes #969. Cherry-picked from PR #1034. Co-Authored-By: mvanhorn <mvanhorn@users.noreply.github.com> * fix(frontmatter): centralize --fix backups under ~/.gbrain/backups/ `gbrain frontmatter validate --fix` and `gbrain frontmatter generate --fix` wrote `<file>.bak` siblings into the source tree. Users running gbrain over a brain repo found .bak files scattered through people/, companies/, etc. that broke gitignore expectations and showed up in `git status` after every fix pass. Backups now land under `~/.gbrain/backups/frontmatter/<run-id>/<rel>.bak` with an iso-week-sorted run-id so a multi-fix session keeps the same parent directory. Backup directory + per-file structure mirrored from the original file's relative path. The .bak safety contract is intact for both git and non-git brain repos. Also adds `--include-catch-all` opt-in to `frontmatter generate` so the default catch-all rule (`type: note`) is no longer applied to arbitrary workspace documents that happen to live under a brain root. Closes #902. Cherry-picked from PR #903. 
Co-Authored-By: 100yenadmin <100yenadmin@users.noreply.github.com> * fix(config): use path.isAbsolute() for GBRAIN_HOME on Windows The GBRAIN_HOME validator rejected every valid Windows path (`C:\\Users\\...`, `D:\\gbrain`, etc.) because it used `trimmed.startsWith('/')` to check for absoluteness — only POSIX absolute paths pass that. `path.isAbsolute()` is the cross-platform check. Same fix for the `..` traversal check: split on both `/` and `\` so Windows path separators don't sneak `..` through. Closes #1019. Cherry-picked from PR #1083. Co-Authored-By: sharziki <sharziki@users.noreply.github.com> * fix(ai): warn only for the configured embedding provider, not all recipes Gateway construction was warning on stderr for every recipe with an embedding touchpoint missing max_batch_tokens — including providers the brain isn't using. Users on Voyage saw noise about OpenAI / Google / DashScope / etc. recipes that never get loaded. Filter the warning to recipes whose provider id is referenced by `embedding_model` or `embedding_multimodal_model` in the active config. The structural protection against forgetting max_batch_tokens stays in place for the recipes that actually run; the noise for unrelated recipes goes away. Cherry-picked from PR #1117. Co-Authored-By: hnshah <hnshah@users.noreply.github.com> * fix(sync): skip git pull when repo has no origin remote `gbrain sync` ran `git pull` unconditionally and printed scary stderr on every cycle for brains that have no `origin` remote (local-only workflows, single-machine setups, brains initialized via `gbrain init --pglite` against an arbitrary directory). The pull failed harmlessly but the noise was confusing and made operators think sync was broken. `hasOriginRemote()` probes `git remote get-url origin` with stdio ignored; on failure (`no such remote`), skip the pull, print a single informational line, and proceed with the local working tree. Cherry-picked from PR #1119. 
Co-Authored-By: hnshah <hnshah@users.noreply.github.com> * fix(query): drain cache writes before CLI exit The query cache write was fired with `void promise.catch(...)` — true fire-and-forget. On a fast CLI invocation (`gbrain query <q>` exits in ~50ms), the process terminates before the cache write commits. Result: the cache effectively never warms from CLI use; every query is a miss. `awaitPendingSearchCacheWrites()` tracks each in-flight cache write in a module-level Set. The CLI dispatcher awaits the set after `query` finishes formatting output but before the process exits. MCP server path unchanged (long-lived process, fire-and-forget remains correct). Cherry-picked from PR #1125. Co-Authored-By: hnshah <hnshah@users.noreply.github.com> * fix(backlinks): dedupe (source, target) pairs within a single source page A source page that mentions the same entity N times produced N duplicate "Referenced in" lines on the target. `extractEntityRefs` returns one EntityRef per occurrence, and the per-ref `hasBacklink` check reads a snapshot of `target.content` that's frozen at outer scope — so every iteration sees "no backlink yet" and appends another gap. The cumulative effect on a long meeting note with multiple mentions of the same person was visible in PRs landing 3-5 identical Timeline entries. Track seen target slugs per source page; cap gaps at one pair. Cherry-picked from PR #967 with a current-master regression test covering both markdown-link and Obsidian-wikilink formats in the same source page. Co-Authored-By: p3ob7o <p3ob7o@users.noreply.github.com> * fix(dream): audit backlinks without mutating pages during cycle The dream/autopilot maintenance cycle ran the backlinks phase in 'fix' mode, which writes "Referenced in" timeline bullets into entity pages every sync. 
The graph extractor + auto-link path is the canonical link store during sync/dream/autopilot; the legacy filesystem fixer wrote markdown that fought with both the user's manual edits and the graph layer's own timeline. Cycle now runs backlinks in 'check' mode (audit-only); the materializer remains available via `gbrain check-backlinks fix` for users who really want markdown backlinks committed to disk. Cherry-picked from PR #1027. Co-Authored-By: sliday <sliday@users.noreply.github.com> * fix(autopilot --install): source ~/.zshenv before zshrc/bashrc zshenv is the canonical place for env vars in zsh on macOS; zshrc is sourced only for interactive shells, so vars exported in zshrc don't reach a non-interactive subprocess like the autopilot wrapper. Users who exported GBRAIN_DATABASE_URL, OPENAI_API_KEY, or ANTHROPIC_API_KEY in zshrc and assumed autopilot would inherit them hit silent missing-secret failures on the LaunchAgent. Source ~/.zshenv first (always reaches non-interactive shells per zsh docs), then fall back to ~/.zshrc / ~/.bashrc for users on other profile conventions. Cherry-picked from PR #966. Co-Authored-By: p3ob7o <p3ob7o@users.noreply.github.com> * fix(apply-migrations): return exit 0 on list/dry-run/up-to-date `gbrain apply-migrations list`, `gbrain apply-migrations --dry-run`, and the "All migrations up to date" path were returning from the async function but never calling `process.exit(0)`. The CLI dispatcher in cli.ts treated the implicit fall-through as exit 1 when the parent process inspected status via shell scripts, breaking automation that gates on `apply-migrations list && do-something`. Three call sites: list, dry-run, and the no-op path. All three now exit(0) explicitly. Cherry-picked from PR #1062.
Co-Authored-By: nezovskii <nezovskii@users.noreply.github.com> * fix(sync): scope auto-embed to source on incremental syncs `gbrain sync --source-id X` triggered auto-embed for the affected slugs but `runEmbed` ran with no `--source` flag, so it fell back to the default source. For non-default-source syncs the page row lives at (sourceId, slug) — the embed code saw "Page not found" for the right slug under the wrong source, swallowed the error as best-effort, and the sync result reported `embedded: 0` for the wrong reason. `buildAutoEmbedArgs(slugs, sourceId)` is the new helper: when sourceId is set, prepends `--source X`. Exported for the regression test. Pairs with the upcoming source-id write-path audit (P1 #8). Cherry-picked from PR #1120. Co-Authored-By: hnshah <hnshah@users.noreply.github.com> * fix(query): honor source_id with no-expand for cross-source search Two related corrections: 1. `gbrain query --no-expand` parsed `--no-expand` as the literal key `no_expand` instead of negating the boolean `expand` param. Result: the flag was silently ignored and expansion always ran. Now any `--no-<key>` where `<key>` is a boolean param flips it false. 2. The `query` op's source-id resolution treated `ctx.sourceId` as authoritative, so an explicit per-call `source_id` was overridden by the federated read scope. Now per-call `source_id` wins; `source_id=__all__` is an explicit opt-out for local cross-source search. Cherry-picked from PR #1124. Co-Authored-By: hnshah <hnshah@users.noreply.github.com> * fix(doctor): child-table orphan detection (closes #1063) The autopilot orphans phase detects orphan PAGES (no inbound links via page-graph) but never scans FK-child tables. After a bulk delete or a pre-FK-migration code path, orphan rows can persist indefinitely in content_chunks, page_versions, tags, takes, raw_data, timeline_entries, or links — all declared ON DELETE CASCADE, so any orphan row is unexpected. 
`childTableOrphansCheck` enumerates 10 FK columns across 8 tables: - 8 NOT NULL columns (cascade): any value not in pages.id is an orphan. - 2 nullable SET NULL columns (links.origin_page_id, files.page_id): NULL is valid; only NOT-NULL-but-missing-in-pages counts. Surfaces paste-ready cleanup SQL when orphans are found. Cherry-picked from PR #1064. Co-Authored-By: vincedk-alt <vincedk-alt@users.noreply.github.com> * fix(autopilot,cycle): stop respawn-storm from steady-state 'partial' cycles Two compounding bugs under KeepAlive=true: 1. Autopilot tripped its circuit breaker on cycle.status === 'partial', not just 'failed'. 'partial' means at least one phase warned/failed while others ran — a soft signal, not fatal. On every cycle that warned, autopilot logged a failure and the supervisor respawned the worker. 2. The orphans phase emitted 'warn' when `count > 20` orphan pages. That threshold was tuned for small dev brains; on any corpus past a few hundred pages it fires every cycle in steady state. Together with bug 1, this produced visible respawn storms. Fix: - Autopilot trips only on cycle.status === 'failed'. - Orphans phase warns by ratio: orphans / total_pages > 0.5 (the real "your graph fell apart" signal), not by absolute count. Cherry-picked from PR #1113. Co-Authored-By: sergeclaesen <sergeclaesen@users.noreply.github.com> * fix(ai): reject partial embedding responses before indexing `embedSubBatch` only validated the FIRST embedding's dimension and never asserted the response length matched the input length. If a provider returned fewer embeddings than requested (rate-limit truncation, malformed response, etc.), the gateway silently indexed an offset-shifted result — every page after the missing index got the embedding of a different page's chunk. Two new guards: 1. `result.embeddings.length === texts.length` — fail loud if any count mismatch, with a paste-ready retry hint. 2. Validate dim on EVERY embedding, not just the first. 
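The two guards listed above could look roughly like the following. This is a sketch under assumptions: the function name `assertCompleteEmbeddings`, the `EmbedResult` shape, and the error wording are hypothetical, and only the two checks themselves (count equals input length, dimension validated on every embedding) come from the commit message.

```typescript
// Hypothetical shape of a provider response; the real gateway type may differ.
interface EmbedResult {
  embeddings: number[][];
}

// Hypothetical helper illustrating the two guards from the commit message.
function assertCompleteEmbeddings(
  texts: string[],
  result: EmbedResult,
  expectedDim: number,
): void {
  // Guard 1: a short response would shift every later embedding onto the wrong chunk.
  if (result.embeddings.length !== texts.length) {
    throw new Error(
      `Embedding count mismatch: sent ${texts.length} texts, ` +
        `received ${result.embeddings.length} embeddings; retry this batch`,
    );
  }
  // Guard 2: check the dimension of every vector, not only the first.
  result.embeddings.forEach((vec, i) => {
    if (vec.length !== expectedDim) {
      throw new Error(
        `Embedding ${i} has dimension ${vec.length}, expected ${expectedDim}`,
      );
    }
  });
}

// A complete, correctly sized response passes silently.
assertCompleteEmbeddings(["a", "b"], { embeddings: [[0.1, 0.2], [0.3, 0.4]] }, 2);
```

Throwing before any write means a truncated provider response fails the whole sub-batch instead of indexing misaligned vectors.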
Cherry-picked from PR #926. Co-Authored-By: 100yenadmin <100yenadmin@users.noreply.github.com> * fix(serve): admin register-client supports auth_code + PKCE public clients The admin dashboard's /admin/api/register-client endpoint hardcoded client_credentials and ignored grantTypes, redirectUris, and tokenEndpointAuthMethod. Result: you couldn't register a browser-based PKCE client (claude.ai Custom Connector, Cursor, etc.) through the dashboard — only confidential machine-to-machine clients worked. Pass grantTypes / redirectUris through to registerClientManual. When tokenEndpointAuthMethod === 'none', NULL out client_secret_hash so the SDK's clientAuth middleware skips the hash-vs-plaintext compare that would otherwise reject the no-secret PKCE flow. Cherry-picked from PR #1077. Co-Authored-By: lukejduncan <lukejduncan@users.noreply.github.com> * fix(extract-facts): treat slugs:[] as no-op, not unscoped full-walk `runExtractFacts` checked `opts.slugs && opts.slugs.length > 0` to decide between scoped and full-brain walk. Both `undefined` (caller omits → full walk intended) AND `[]` (sync no-op → zero work intended) fall through to the same `else` branch and triggered `engine.getAllSlugs()`. On a multi-thousand-page brain, the unintended full walk exceeded the autopilot-cycle ~600s timeout and dead-lettered the job — visible in production as `[cycle.extract_facts] start` followed by silence until `Autopilot stopping (cycle-failure-cap)`. Use presence (`opts.slugs !== undefined`), not truthiness, to distinguish the two modes. Empty array is a real incremental no-op. Closes #1096. Three regression cases in test/extract-facts-phase.test.ts: slugs=[] no-op, slugs=undefined still walks, slugs=['a'] walks just one. 
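The presence-versus-truthiness distinction in the extract-facts fix can be illustrated with a small sketch. The `planExtractFactsWalk` function and `WalkPlan` type are hypothetical names introduced for illustration; the only part taken from the commit message is the `opts.slugs !== undefined` test and the three cases it separates.

```typescript
// Hypothetical types and names; only the `!== undefined` check is from the commit.
type WalkPlan =
  | { mode: "scoped"; slugs: string[] } // walk exactly these slugs (possibly none)
  | { mode: "full" }; // caller omitted slugs: walk the whole brain

function planExtractFactsWalk(opts: { slugs?: string[] }): WalkPlan {
  // Presence, not truthiness: `[]` means "incremental sync touched nothing",
  // which must stay a no-op rather than fall through to a full walk.
  if (opts.slugs !== undefined) {
    return { mode: "scoped", slugs: opts.slugs };
  }
  return { mode: "full" };
}

// The three regression cases named in the commit message.
console.log(planExtractFactsWalk({ slugs: [] }).mode);
console.log(planExtractFactsWalk({}).mode);
console.log(planExtractFactsWalk({ slugs: ["a"] }).mode);
```

The earlier `opts.slugs && opts.slugs.length > 0` check collapsed the first two cases into one branch, which is how an empty incremental sync ended up triggering `engine.getAllSlugs()`.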
Co-Authored-By: navin-moorthy <navin-moorthy@users.noreply.github.com> * fix(serve): embed admin/dist into binary; serve from manifest (closes #1090) Pre-fix, /admin returned 404 on every globally-installed binary because serve-http.ts:780 resolved admin/dist via process.cwd(). The admin SPA files are checked into git but `bun build --compile` does NOT embed arbitrary directories — only assets imported via `with { type: 'file' }` ESM imports land in the compiled binary. Wire: - scripts/build-admin-embedded.ts walks admin/dist/, emits src/admin-embedded.ts with one `with { type: 'file' }` import per file + a manifest map (request path → resolved path + mime). Auto-invoked by `bun run build:admin`. - src/admin-embedded.ts is the auto-generated module. Bun resolves every file: import to a path that works at runtime inside the compiled binary (same pattern as src/core/chunkers/code.ts WASM imports). - serve-http.ts switches to two-tier resolution: cwd-relative admin/dist for dev (Vite hot-rebuild), embedded manifest otherwise. Embedded path reads bytes lazily and caches per-asset for the lifetime of the process. - scripts/check-admin-embedded.sh CI gate re-runs the generator and fails on drift (mirrors check-wasm-embedded.sh). PRs that rebuild admin/dist but forget to regenerate the embedded module fail loud. - package.json wires build:admin-embedded + check:admin-embedded. Closes #1090. * test(source-id): lock in routing regression coverage (closes #891 #978 #1078) Audit of every page write path (sync, embed, extract, dream, autopilot, wikilinks, tags, chunks) confirmed that sourceId already threads correctly through importFromContent → engine.putPage → SQL INSERT since v0.18.0. The original bug reports from #891, #978, #1078 were real at the time and got swept by the multi-source refactor; today's master is correct. This commit locks in that correctness with six PGLite regression cases (no Postgres fixture needed; runs in CI everywhere): 1. 
importFromContent({sourceId:"work"}) lands at source_id=work, not the silent 'default' fallback. 2. Two sources hold the same slug independently. 3. Omitting sourceId falls through to 'default' (legacy contract). 4. Chunks land under the requested source. 5. Tags land under the requested source. 6. FK integrity smoke (originally #1078). The earlier issue reports stay closed by the existing threading; this suite ensures any future refactor of the write path can't silently re-introduce the wrong-source-default bug. The 90-minute write-path audit budget from the plan resolves here. * fix(apply-migrations): unblock PGLite chain (closes #1100) `gbrain apply-migrations --yes` was wedging on the v0.11.0 (Minions) schema phase for PGLite installs. Two compounding bugs: 1. `apply-migrations` pre-flight schema-version warning connects to PGLite to read config.version, then disconnects. The brief lock hold races with downstream subprocess spawns that try to re-acquire it; the 30s lock timeout fires before the parent fully releases. Pre-flight is a *warning*; on PGLite it adds no information the orchestrators don't already handle. Skip the probe for PGLite. 2. v0.11.0 phase A spawned `gbrain init --migrate-only` as an execSync subprocess to apply schema migrations. PGLite is single-writer; the subprocess inherits HOME and tries to lock the same DB. On Postgres this works (concurrent connections OK); on PGLite it deadlocks. Route in-process for PGLite — create + connect + initSchema + disconnect directly, skipping the subprocess hop. Postgres keeps the legacy execSync path. Verified: fresh PGLite install now walks the full migration chain through v0.32.2 (Facts SoR) and lands "All migrations up to date" on re-run. Closes #1100. * fix(serve): bootstrap token env override + suppress flag (closes #1024) `gbrain serve --http` regenerated the admin bootstrap token on every restart and printed it to stderr. 
In supervisor-managed production deployments (LaunchAgent, systemd, k8s) every restart leaks the value into log aggregators and rotates the access for any agent that paste-copied it. Two new knobs: - **GBRAIN_ADMIN_BOOTSTRAP_TOKEN** env var: when set, used as the bootstrap secret instead of a fresh per-process token. Validated: must match `^[A-Za-z0-9_-]{32,}$` (32-char minimum), else refuse to start with a paste-ready generator hint. Failing closed beats silently accepting a weak token. - **--suppress-bootstrap-token** CLI flag: suppresses the printed token line entirely. Operator takes responsibility for tracking the value out-of-band. Startup banner now reflects the chosen source: - `Admin Token: suppressed` when the flag is set. - `Admin Token: from $GBRAIN_ADMIN_BOOTSTRAP_TOKEN` when env-sourced. - Full token print only when both are absent (default behavior, dev installs). Closes #1024. Co-Authored-By: billy-armstrong <billy-armstrong@users.noreply.github.com> * fix(config): migrate legacy 'provider' + 'model' to 'embedding_model' Pre-v0.32 docs and some community templates used a config shape: { "provider": "voyage", "model": "voyage-4-large" } The canonical shape (since the v0.31.12 gateway seam) is: { "embedding_model": "voyage:voyage-4-large" } Users on the legacy shape hit silent fallthrough to the hardcoded OpenAI default; sync + embed errored out with "OpenAI embedding requires OPENAI_API_KEY" regardless of their actual provider config. loadConfig() now translates the legacy keys at parse time: - emits a one-line stderr nudge with the paste-ready canonical key - preserves the rest of the config unchanged - skipped when `embedding_model` is already set (forward-compat) Closes #1086. Co-Authored-By: jeunessima <jeunessima@users.noreply.github.com> * chore(test): quarantine upgrade tests (process.env mutation) PR #1032's cherry-picked tests use the static-snapshot + try/finally pattern for env vars instead of the project's withEnv() helper.
The test-isolation lint catches process.env mutations outside withEnv to prevent cross-test leakage in parallel runs. Renaming to *.serial.test.ts (the quarantine convention) is the documented out: runs sequentially, no cross-file race. A future cleanup PR can migrate the tests to withEnv() and drop the quarantine. * fix(test): update brain-writer .bak assertion for centralized backup path The v0.36.x frontmatter backup change (bd60cdf — closes #902) moved .bak files from sibling-of-source to ~/.gbrain/backups/frontmatter/... The old test still asserted on the sibling path, so CI failed even though the production behavior was correct. Updated assertion contract: backup lands under the injected backupRoot (test-isolated), the returned backupPath ends in .bak and exists, and no sibling .bak is created next to the source file. The pre-fix sibling-path is now a negative assertion. * chore: bump version and changelog (v0.36.1.0) v0.36.1.0 — community fix wave (28 atomic fixes + 22 PRs closed as already-shipped + 14 issues triaged). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test(fix-wave): close test gaps surfaced by post-ship audit After the fix-wave shipped, an audit found 11 commits with no new test file. Some were inherently structural (build pipelines, shell content) or had existing test coverage that worked either way; others had real regression risk with no guard. This commit closes the gaps that matter. New regression tests for: - OAuth `verifyAccessToken` throws `InvalidTokenError` (not bare Error) on both expired and unknown token paths. Pre-fix, the SDK's `requireBearerAuth` middleware fell through to 500 instead of 401 → client token-refresh logic never fired (#935). - `loadConfig` translates legacy `{provider, model}` config shape to the canonical `embedding_model: <provider>:<model>`. 3 cases: pure legacy → migrated; canonical wins over legacy when both present; canonical-only is untouched. 
Pre-fix, Voyage/Cohere/Mistral users silently fell through to OpenAI (#1086). - `configDir` rejects relative paths; rejects `..` segments via both separators (regression guard for the Windows path acceptance fix #1019 / cherry-pick #1083). - `resolveBootstrapToken` (new exported helper extracted from `runServeHttp`). 9 cases: unset env generates fresh, valid env accepted, hyphens/underscores accepted, < 32 chars rejected, special chars rejected, whitespace trimmed, empty string rejected, 32-char boundary accepted, 31-char one-short rejected. Security-critical validation surface (#1024). - GET /mcp returns 405 with `Allow: POST, DELETE` (E2E case in `serve-http-oauth.test.ts`). Pre-fix, claude.ai and other probing MCP clients saw 404 and gave up (#1076). - apply-migrations `process.exit(0)` on list / dry-run / up-to-date paths. Source-shape assertion locks the rule in; shell scripts gating on `$?` work (#1062). - Autopilot wrapper sources `~/.zshenv` BEFORE `~/.zshrc`. zshenv is the canonical place for env vars in non-interactive zsh; without this ordering, LaunchAgent subprocesses never inherit secrets exported in zshrc (#966). - `test/fix-wave-structural.test.ts` consolidates source-shape regression guards for fixes whose behavior is hard to runtime-test without heavy mocking: query cache drain (#1125), admin embed manifest + handler (#1090), admin register-client PKCE branch (#1077), PGLite v0.11.0 phase A in-process routing (#1100), query `--no-expand` negation (#1124). 9 source-grep assertions. Refactored `runServeHttp` to extract `resolveBootstrapToken` as a pure helper. The boot path now consumes the helper's tagged-union result ({kind:'ok'|'error'}); side effects (`process.exit`, `console.error`) moved to the caller. Unit-testable without spinning up Express. Test counts: oauth 71 (was 69), config 20 (was 14), apply-migrations 19 (was 18), autopilot-install 5 (was 4), serve-http-bootstrap-token 9 (new file), fix-wave-structural 9 (new file). 
Net: +28 cases across 6 files; +1 new exported function with full coverage. Remaining audit gaps (deferred): - e82dda0 admin embed E2E (post-deploy curl smoke covers this) - d93fa81 apply-migrations PGLite chain E2E (already smoke-tested manually in the original commit; subprocess test would be flaky in CI without DATABASE_URL gating) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test: close the two deferred E2E gaps from the post-ship audit Both gaps now have real behavior coverage. No DATABASE_URL needed (PGLite engine), so they run in standard unit CI alongside the rest of the suite. Serial quarantine because both spawn subprocesses + bind ports / write tmpdirs. test/admin-embed-spawn.serial.test.ts (4 cases, ~6s wall-clock): - Spawns `gbrain serve --http` from a fresh tmpdir so `process.cwd()/admin/dist` does not exist; this forces the embedded-manifest branch (the one under test). Pre-fix, this exact setup hit 404. - GET /admin/ → 200 + SPA shell HTML (title + #root div), content-type text/html. - GET /admin/index.html → same body via explicit path. - GET /admin/agents → SPA fallback returns index.html for deep links. - GET /admin/api/stats → NOT 200 (regression guard: SPA fallback must not swallow /admin/api/* routes and silently return HTML to a JSON client). Closes #1090. test/apply-migrations-pglite-spawn.serial.test.ts (3 cases, ~25s): - Seeds a fresh PGLite config in a tmpdir, runs `gbrain init --migrate-only` + `gbrain apply-migrations --yes --non-interactive`. Pre-fix this hit "GBrain: Timed out waiting for PGLite lock" because apply-migrations' pre-flight probe + v0.11.0's phase A subprocess both wanted the single-writer lock. - Asserts exit 0, no "Timed out" string, no "Phase A failed" string, brain.pglite file written. - Re-run case: idempotent; "All migrations up to date" exits 0 (also locks in the #1062 exit-code fix end-to-end). - --list path exits 0 (third leg of the #1062 contract). Closes #1100.
Pinned bootstrap token via GBRAIN_ADMIN_BOOTSTRAP_TOKEN env so the admin test doesn't have to scrape stderr; the startup banner format is allowed to drift, the /health probe is the readiness contract. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(test): consolidate PGLite spawn test to one end-to-end pass CI failed on test/apply-migrations-pglite-spawn.serial.test.ts (Ubuntu, bun 1.3.14). The previous shape ran 3 tests × ~3 spawns each. Each `bun run /abs/src/cli.ts` from a tmpdir cwd pays a full parse/transpile cost (no near-cwd .bun cache); on Ubuntu CI that compounds past the runner's per-test budget. Consolidated to ONE test that exercises the full lifecycle in one brain: init --migrate-only → apply-migrations --yes → re-run → --list. Four spawns instead of eight. Local wall-clock: 32s → 11.5s. All four assertion buckets preserved: no PGLite lock timeout, no Phase A failure, brain.pglite written, idempotent re-run "All migrations up to date" exits 0 (#1062 end-to-end), --list exits 0. Per-test timeout 480_000ms as insurance against the runner's --timeout=60000 default (bun's API spec: per-test wins). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test(diag): dump apply-migrations output when CI exit != 0 The PGLite spawn test passes locally on macOS/bun 1.3.13 in ~11s end-to-end but fails on Ubuntu/bun 1.3.14 in 4.92s with apply.exitCode = 1 — fast enough that something is failing early, not timing out. The runCli helper captured stdout+stderr but never printed them, so the CI log only showed the bare assertion failure. This commit prints the captured streams from BOTH init and apply when the exit code mismatches expectation. After the next CI run we can read the actual error message and diagnose the Ubuntu-specific failure mode (likely BUN_INSTALL / HOME / PGLite WASM env quirk). No behavior change; pure diagnostic output gate on failure. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(test): shim `gbrain` on PATH for PGLite spawn test Root cause of the Ubuntu CI failure: the v0.11.0 orchestrator's phase B runs `execSync('gbrain jobs smoke')`. PGLite phase A now routes in-process (the #1100 fix), but phase B and several follow-up phases still shell out to the `gbrain` binary on PATH. Locally the binary resolves via `bun link`; on CI Ubuntu it does not exist on PATH, so execSync exits 127 → orchestrator returns 'failed' → apply-migrations exits 1. Test failed at 4.92s with exitCode=1, well before any timeout. Verified locally by removing ~/.bun/bin/gbrain to simulate CI: pre-shim: apply.exitCode=1 (same as CI) post-shim: apply.exitCode=0 in 8.4s The shim writes a tiny `gbrain` executable to a tmpdir that just `exec`s `bun run <repo>/src/cli.ts "$@"`. Prepended to PATH for the spawned subprocesses. Mirrors the production contract (gbrain on PATH) without depending on `bun link` having run in the CI image. Diagnostic dump from the previous commit stays — useful insurance for the next time something silently fails inside a spawned binary. 
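The shim described above can be sketched roughly as follows. This is a minimal illustration, not the test's actual helper: `makeGbrainShim`, its parameters, and the `gbrain-shim-` tmpdir prefix are hypothetical names, and it assumes a POSIX shell is available on the runner.

```typescript
import { chmodSync, mkdtempSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { delimiter, join } from "node:path";

// Hypothetical helper (name and shape assumed): writes a `gbrain` launcher
// into a fresh tmpdir and returns a PATH value with that directory first.
export function makeGbrainShim(
  repoRoot: string,
  basePath: string,
): { shimDir: string; path: string } {
  const shimDir = mkdtempSync(join(tmpdir(), "gbrain-shim-"));
  const shimFile = join(shimDir, "gbrain");
  // The launcher only re-executes the repo's CLI entry point with the same args.
  const script = `#!/bin/sh\nexec bun run "${join(repoRoot, "src", "cli.ts")}" "$@"\n`;
  writeFileSync(shimFile, script);
  chmodSync(shimFile, 0o755); // must be executable to resolve via PATH
  return { shimDir, path: `${shimDir}${delimiter}${basePath}` };
}
```

A spawned subprocess would then receive `env: { ...process.env, PATH: shim.path }` so that `execSync('gbrain ...')` inside it resolves to the launcher.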
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: johnybradshaw <johnybradshaw@users.noreply.github.com>
Co-authored-by: mvanhorn <mvanhorn@users.noreply.github.com>
Co-authored-by: sharziki <sharziki@users.noreply.github.com>
Co-authored-by: Aashiqe10 <Aashiqe10@users.noreply.github.com>
Co-authored-by: lukejduncan <lukejduncan@users.noreply.github.com>
Co-authored-by: 100yenadmin <100yenadmin@users.noreply.github.com>
Co-authored-by: hnshah <hnshah@users.noreply.github.com>
Co-authored-by: p3ob7o <p3ob7o@users.noreply.github.com>
Co-authored-by: sliday <sliday@users.noreply.github.com>
Co-authored-by: nezovskii <nezovskii@users.noreply.github.com>
Co-authored-by: vincedk-alt <vincedk-alt@users.noreply.github.com>
Co-authored-by: sergeclaesen <sergeclaesen@users.noreply.github.com>
Co-authored-by: navin-moorthy <navin-moorthy@users.noreply.github.com>
Co-authored-by: billy-armstrong <billy-armstrong@users.noreply.github.com>
Co-authored-by: jeunessima <jeunessima@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

garrytan · 2026-05-21T04:42:06Z

Closing in favor of #1253 (v0.37.7.0 fix wave). The sync --source flag already existed; this PR's contribution focused on threading source-id into the auto-embed args, which the wave preserves via the v0.31.x buildAutoEmbedArgs helper at src/commands/sync.ts:215. Co-Authored-By: hnshah trailer included. Thank you for the dogfood time.
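For readers following the thread, the helper's described behavior (prepend `--source X` when a source id is set) might look like the sketch below. The real function lives in src/commands/sync.ts and its exact argument layout is not shown here; in particular the `--slugs` flag is an assumption made only for illustration.

```typescript
// Sketch only: mirrors the described contract of buildAutoEmbedArgs(slugs, sourceId).
// ASSUMPTION: `--slugs` is a placeholder for however the embed command takes slugs.
export function buildAutoEmbedArgs(slugs: string[], sourceId?: string): string[] {
  const args = ["--slugs", ...slugs];
  // With a source id, embed looks up (sourceId, slug) instead of falling
  // back to source=default.
  return sourceId ? ["--source", sourceId, ...args] : args;
}
```

Without a source id the argument list is unchanged, which keeps default-source syncs on their previous path.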

@zscgeek

…dential clients (#1253)

* fix(reindex-frontmatter): connect engine before query (#1225)

`createEngine()` from src/core/engine-factory.ts only constructs the engine; callers MUST call connect() before any executeRaw. The reindex-frontmatter CLI was constructing the engine and going straight to countAffected, which crashed on PGLite with "PGLite not connected. Call connect() first." even on --dry-run.

Fix follows the existing-command pattern (src/commands/auth.ts, src/commands/backfill.ts, src/commands/integrity.ts all do the same): pass toEngineConfig(cfg) into both createEngine() AND engine.connect(), then engine.initSchema() (idempotent on a current schema, ~1ms cost).

Pre-fix verification: codex outside-voice CF5 flagged the related "can't import connectEngine from cli.ts" misdirection in the original fix plan. This implementation uses the canonical sibling pattern instead. Regression test pinned at test/reindex-frontmatter-connect.test.ts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: bump VERSION to 0.37.7.0 + stub CHANGELOG

v0.37.5.0 claimed by #1229 (warsaw-v4); v0.37.6.0 by #1246 (OpenRouter recipe). v0.37.7.0 is the next free slot for this fix wave.

CHANGELOG entry stubbed in user-facing voice per CLAUDE.md "CHANGELOG voice + release-summary format": ELI10 lead-first, real fix details below. The "## To take advantage of v0.37.7.0" block follows the v0.13+ self-repair pattern from CLAUDE.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(subagent): short-circuit terminal-on-resume (#1151)

Bug: when the worker resumed a subagent job whose persisted last message was an assistant turn with text-only content (no tool_use blocks), the replay reconciler at subagent.ts:241-247 had no branch for that case. The main loop then called messages.create against a conversation ending in assistant role, which Sonnet 4.6+ rejects with HTTP 400 "This model does not support assistant message prefill." 3 retries later → dead-letter, despite all the job's work having committed in earlier turns.

@zscgeek's bug report pinned this exactly: dream-cycle Otter corpus runs hit ~7% dead-letter rate, every dead job's last subagent_messages row was a text-only synthesis summary listing slugs that already existed in `pages`. Their proposed fix mirrors this implementation.

Fix: add an else branch to the assistant-tail check that mirrors the live-loop terminal logic at subagent.ts:440-447: reconstruct finalText from the persisted text blocks, return stop_reason='end_turn' immediately. No LLM call, no schema change.

Two new regression cases:
- text-only terminal on resume returns immediately with zero messages.create calls
- tool-use replay path unchanged (existing behavior preserved)

Codex outside-voice (CF13) initially flagged this fix as mis-targeted, claiming subagent.ts already handled the case. /investigate run revealed the live-loop terminal at :440-447 was covered but the REPLAY-path terminal at :241-247 was missing; both branches need symmetric handling.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(autopilot): scope lockfile to GBRAIN_HOME (#1226)

The autopilot lockfile was hardcoded at `~/.gbrain/autopilot.lock` (via `process.env.HOME`), bypassing GBRAIN_HOME. Two brains pointed at different GBRAIN_HOME directories still wrote to the same global lockfile; one would silently take over the other on each restart.

Fix: route through `gbrainPath('autopilot.lock')` from src/core/config.ts (imported aliased as gbrainHomePath since the local `gbrainPath` var in installAutopilot references the CLI binary path). The mkdirSync(`~/.gbrain`) call also routes through the helper so the directory is created in the right place too.

Co-authored with @rafaelreis-r (same fix shape as PR #1227), re-implemented against current master per the wave's "re-implement, credit, close" workflow.

Tests cover: one GBRAIN_HOME → one canonical lock; two GBRAIN_HOME values → two distinct locks; default fall-through still works.

Co-Authored-By: rafaelreis-r <noreply@github.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(graph-query): foreign-edge footer + --include-foreign (#1153)

The graph-query CLI silently dropped edges to pages in other sources on federated brains. Users had no signal those edges existed unless they read the source code.

Fix:
- New --include-foreign flag (off by default, preserves the existing scoping contract; on = explicit cross-source traversal).
- After every traversal, count edges from rootSlug whose target page lives in a different source. When count > 0 AND user didn't opt in, emit a stderr footer: `(N edge(s) to foreign-source pages hidden; pass --include-foreign to include them)`
- The "no edges found" path also runs the count + footer so users discover foreign edges even when scoped traversal returned nothing.
- Thin-client path skips the count (engine query not available); future T1 work threads source resolution through MCP for that path.
- Single quotation correctness in count SQL: page_links table is `links` (not `page_links`); JOIN both endpoints to pages and compare source_id, NULL-safe via `IS NOT NULL` guards on both sides.
- Fail-open on missing source_id column for pre-v0.18 brains: return 0 (no foreign edges to report) instead of throwing.

4 new test cases: footer fires on scoped query with foreign edge, --include-foreign suppresses footer, zero-foreign no-footer case, pluralization regression guard.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(sources): `gbrain sources current` + tier attribution (#1222)

Federated-brain users running destructive ops (extract, import, purge) need a way to verify which source they're targeting BEFORE the op runs. Pre-fix, the only way was to grep config files or run the op with --dry-run and inspect output.

New command:

    gbrain sources current              # human output
    gbrain sources current --json       # machine-readable
    gbrain sources current --source X   # show what an explicit --source X would resolve to
                                        # (validates X exists in the sources table)

Output names BOTH the resolved source id AND which tier of the 6-tier resolution chain won (flag / env / dotfile / local_path / brain_default / seed_default), plus a `detail` line naming the winning signal (e.g. "GBRAIN_SOURCE=dept-x" or ".gbrain-source" or "/work/gstack/src").

Implementation:
- New `resolveSourceWithTier()` in source-resolver.ts as an additive variant of `resolveSourceId()`. Walks the same 6 steps in the same order; just returns `{ source_id, tier, detail? }` instead of bare string. Existing `resolveSourceId()` unchanged; all callers continue working.
- New `SOURCE_TIER_NAMES` const + `SourceTier` type export so the CLI, doctor (Tier 5 follow-up), and future MCP consumers share one vocabulary instead of inlining strings.
- Help text updated; `current` subcommand registered in dispatcher.

11 new tests pin the 6-tier ladder + priority semantics. Existing 19 source-resolver tests still pass (regression preserved).

Per codex CF3 (the existing src/core/source-resolver.ts was missed in the original plan). Re-uses the existing helper instead of inventing a duplicate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(extract): --source-id scopes extraction to one brain source (#1204)

Federated brain users running `gbrain extract` had no way to scope extraction to one source. The DB path walks all sources together via listAllPageRefs(), which is correct for cross-source resolution but sometimes the user wants to extract per-source explicitly (e.g. re-running extract on a specific source after a manual import).

The pre-existing `--source` flag is the data-source axis (fs|db) and can't be repurposed. New flag `--source-id <id>` joins it on the brain-source-id axis:

    gbrain extract all --source db --source-id alpha
      -> walks only alpha-source pages; extracts links + timeline from those, into the alpha source

Important: the resolver maps (allSlugs + slugToSources) stay built from the FULL listAllPageRefs result, not the scoped subset. This ensures qualified cross-source wikilinks like `[[other-src:slug]]` still resolve correctly even when the extract walk is scoped: the filter is on which pages we extract FROM, not what we can resolve TO.

Threaded through both `extractLinksFromDB` and `extractTimelineFromDB` with backward-compat: callers passing no opts get the old behavior.

4 new test cases pin: walks-all-without-flag baseline, alpha-only-when-scoped-to-alpha, beta-only-when-scoped-to-beta, empty-set-on-unknown-source.

Note: #1204's wider "silent 0 links" report on federated brains has additional facets beyond this flag (resolver path edge cases on overlapping slugs). The scoped-walk fix gives users an explicit workaround AND closes the per-source extraction gap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(todos): file v0.37.7.0 follow-ups (#1173, #1204, T5N)

Three items deferred from v0.37.7.0:

1. #1173 .sql indexing: verify-first gate found tree-sitter-sql.wasm missing from src/assets/wasm/grammars/. Dedicated wave needed: vendor the wasm, add .sql to walker filter, address slug-shape collision with #1172.
2. #1204 deeper investigation: wave added --source-id flag as workaround. Underlying silent-zero-links bug on unscoped federated extracts needs its own /investigate pass against a cross-source-duplicate-slug fixture.
3. Tier 5N doctor sweep for dead-lettered subagent jobs matching the #1151 fingerprint. Deferred to v0.37.8+ behind the islamabad doctor.ts conflict resolution.
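The "extract FROM a scoped subset, resolve TO the full set" rule from the `--source-id` extract commit above can be illustrated with a small sketch. Everything here is hypothetical: `PageRef`, `planExtractWalk`, and the field names are stand-ins, not the shapes used by `listAllPageRefs()` or the extract code.

```typescript
// Hypothetical page reference shape; the real listAllPageRefs() result may differ.
interface PageRef {
  slug: string;
  sourceId: string;
}

// Sketch: resolver maps always come from the FULL ref list, while the walk
// (the pages we extract from) is filtered by the optional source id.
export function planExtractWalk(allRefs: PageRef[], sourceId?: string) {
  const allSlugs = new Set<string>();
  const slugToSources = new Map<string, Set<string>>();
  for (const ref of allRefs) {
    allSlugs.add(ref.slug);
    const sources = slugToSources.get(ref.slug) ?? new Set<string>();
    sources.add(ref.sourceId);
    slugToSources.set(ref.slug, sources);
  }
  // No flag means the old behavior: walk every page.
  const walk = sourceId ? allRefs.filter((r) => r.sourceId === sourceId) : allRefs;
  return { allSlugs, slugToSources, walk };
}
```

With this split, a qualified link into another source still finds its target in `slugToSources` even though that target is never walked.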
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sync): walker skips git submodule directories (#1169)

Sync walker descended into git submodules and indexed their markdown content as if it belonged to the parent brain. Users with submodules in their brain repo saw foreign content in their pages table.

Fix: pruneDir gains an optional `parentDir` arg. When set, the helper stats `<parentDir>/<name>/.git` and skips the directory if `.git` exists as a FILE (gitfile pointer, the canonical submodule shape). Directories containing `.git` as a DIRECTORY (a real nested repo, not a submodule) are descended into; the inner `.git` dir itself is then dot-prefix-excluded.

Callers updated to pass parentDir:
- src/commands/extract.ts walkMarkdownFiles
- src/core/cycle/transcript-discovery.ts walker

Back-compat preserved: existing pruneDir(name) callers without parentDir get the pre-v0.37.7.0 behavior unchanged.

Companion `.gitignore`-respect feature from PR #1159 (@jetsetterfl) NOT in this wave; it would require adding the `ignore` npm package as a dep, which the plan's "no new deps in this PR" gate excludes. Filed as follow-up TODO for a dedicated wave.

5 new test cases pin the submodule shape + back-compat + nested-repo ambiguity. Existing extract-fs / extract-db tests unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(brain-routing): document 6-tier source resolution chain (#1222)

The convention skill didn't have a tier-by-tier reference for how gbrain resolves the active source. Users running federated brains had to read the source code to know which signal wins.

Added:
- Canonical 6-tier table (flag → env → dotfile → local_path → brain_default → seed_default) matching src/core/source-resolver.ts.
- Pointer to `gbrain sources current` (new in v0.37.7.0) as the verification command.
- The CLI-layer trust boundary note: operations.ts handlers don't read env/dotfile (preserves v0.34.1.0 source-isolation work for MCP callers).
- Per-command flag map: --source, --source-id (extract), and --include-foreign (graph-query).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(import): --source-id flag routes pages to a brain source (#1167)

`gbrain import --source dept-x ./pages` silently fell back to the default source because the CLI parser never consumed --source. PR #707's design intent excluded the flag explicitly; users had no signal their pages were going to the wrong place. #1167 + #1222 filed the regression.

Fix: parse `--source-id <id>` (matching v0.37.7.0 extract.ts T2's naming convention; --source-id stays out of conflict with future axes that may want --source). When set, the flag value wins over any programmatic opts.sourceId; back-compat preserved for callers that pass sourceId via opts only.

Also threaded into the positional-dir arg parser's flagValues set so `--source-id <value> <dir>` doesn't treat <value> as the dir.

Note on related surfaces:
- `gbrain query "X" --source_id dept-x` already routed correctly via the operations.ts query op (added in v0.34), so no fix needed.
- `gbrain extract --source-id <id>` shipped in T2.
- `gbrain sync --source <id>` already worked (pre-existing).
- `gbrain sources current` (shipped in T4) is the verification tool; run it before destructive ops to confirm routing.

Closes the silent-fallback for the import path. Co-authored with @tyad67-netizen (#1168), @hnshah (#1124, #1120), whose patches informed the shape; re-implemented against current master per the wave's "re-implement, credit, close" workflow.

3 new test cases pin: default-without-flag, --source-id-routes-correctly, flag-value-not-treated-as-dirArg.

Co-Authored-By: tyad67-netizen <noreply@github.com>
Co-Authored-By: hnshah <noreply@github.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(autopilot): reconnect classifier + launchd ThrottleInterval (#1162)

Pre-fix: when database_url was unset/malformed, the DB-health-check reconnect loop logged `config.database_url undefined` forever because the catch swallowed every error type uniformly. launchd's KeepAlive=true respawned immediately on any exit, so even when the process did exit, it came right back into the same bad state. @colin477 reported the daemon-thrash pattern.

Two-part fix:

1. In-process error classifier, `classifyReconnectError(err)`:
   - `unrecoverable` (database_url missing/empty/malformed, auth failure, no-brain-configured): exit immediately with a clear stderr line. Pattern-matched against postgres / config-loader error shapes. Tests pin the matcher against the #1162 fingerprint exactly.
   - `recoverable` (network blip, pool saturated, connection refused on a port coming up, Supabase 503): retry. Up to GBRAIN_AUTOPILOT_MAX_RECONNECT_FAILS (default 30 = ~5min) before finally giving up with `max_reconnect_fails_exceeded`.
   - Counter resets on every successful health probe or reconnect.
2. launchd plist gains `ThrottleInterval=60`. Combined with the in-process exit, launchd waits 60s before relaunching instead of immediate respawn. Pure-function `generateLaunchdPlist()` exported for tests.

16 new test cases:
- 11 classifier cases (database_url shapes, malformed URL, auth, role-does-not-exist with quoted name, network blip, pool saturated, 503, non-Error inputs, case-insensitivity)
- 5 plist generator cases (ThrottleInterval=60, KeepAlive preserved, wrapper path, XML escaping, StandardErrorPath).

Pre-existing autopilot-lock-path tests unchanged; both fixes land cleanly side-by-side.
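A rough sketch of the classifier's two buckets, under stated assumptions: the regular expressions below are illustrative guesses at the error shapes named in the commit message, not the matcher the tests pin, and `shouldGiveUp` is a hypothetical helper for the retry ceiling.

```typescript
export type ReconnectClass = "unrecoverable" | "recoverable";

// ASSUMPTION: these patterns only approximate the config-loader / postgres
// messages the commit describes; the shipped matcher may differ.
const UNRECOVERABLE_PATTERNS: RegExp[] = [
  /database_url/i, // missing, empty, or malformed connection string
  /invalid url/i,
  /password authentication failed/i,
  /role ".*" does not exist/i,
  /no brain configured/i,
];

export function classifyReconnectError(err: unknown): ReconnectClass {
  // Non-Error inputs (strings, objects) are stringified rather than rejected.
  const message = err instanceof Error ? err.message : String(err);
  return UNRECOVERABLE_PATTERNS.some((re) => re.test(message))
    ? "unrecoverable"
    : "recoverable";
}

// Hypothetical ceiling check; 30 matches the documented default for
// GBRAIN_AUTOPILOT_MAX_RECONNECT_FAILS.
export function shouldGiveUp(consecutiveFails: number, maxFails = 30): boolean {
  return consecutiveFails >= maxFails;
}
```

Defaulting unknown errors to `recoverable` keeps transient failures on the retry path, while the counter ceiling still bounds how long the loop can spin.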
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(oauth): confidential clients via custom /token middleware (#1166)

v0.34.1.0 (#909) fixed PUBLIC PKCE clients (client_secret=undefined) by normalizing NULL → undefined in getClient. Confidential clients regressed: the MCP SDK's clientAuth middleware does plaintext `client.client_secret !== presented_secret` compare, but gbrain stores SHA-256 hashes, so the SDK's compare always failed for authorization_code and refresh_token grants on confidential clients. Result: /token returned `invalid_client` for every confidential exchange.

Fix shape per locked-decision-5: custom /token middleware BEFORE the SDK's authRouter, similar to the pre-existing client_credentials handler. The middleware:

1. Detects confidential auth via `client_secret` in body (client_secret_post) OR `Authorization: Basic` header (client_secret_basic per RFC 6749 §2.3.1).
2. Falls through to the SDK when neither is present (public PKCE path stays canonical, preserves v0.34.1.0 behavior).
3. Calls new `verifyConfidentialClientSecret(clientId, presented)` on the provider which does SHA-256 hash compare ourselves (same shape as exchangeClientCredentials' existing hash check).
4. On verification success, calls existing `exchangeAuthorizationCode` / `exchangeRefreshToken` directly with the validated client.
5. RFC 6749 §5.2 error semantics: 401 invalid_client for auth failures, 400 invalid_grant for code/token problems.

Per CLAUDE.md "GBRAIN:RLS_EXEMPT" annotation contract: this surface sits in front of the SDK's clientAuth and doesn't depend on the SDK's plaintext compare working; the SDK's middleware never fires for confidential paths the new middleware claims.

7 new test cases pin: correct-secret-returns-client, wrong-secret opaque rejection, non-existent client, public-client refuses the confidential path, case-sensitivity, soft-deleted revocation, verify-then-exchange-refresh round-trip with second-use rejection.
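The hash-side compare in step 3 could be sketched as below. This is not the provider's `verifyConfidentialClientSecret`: `sha256Hex` and `secretMatches` are hypothetical names, hex encoding of the stored hash is an assumption, and the constant-time comparison via `timingSafeEqual` is a choice made for this sketch rather than something the commit message states.

```typescript
import { Buffer } from "node:buffer";
import { createHash, timingSafeEqual } from "node:crypto";

// ASSUMPTION: the stored hash is a hex-encoded SHA-256 digest of the secret.
export function sha256Hex(secret: string): string {
  return createHash("sha256").update(secret, "utf8").digest("hex");
}

// Hash the presented secret and compare digests, instead of comparing the
// presented plaintext against the stored hash (the failure mode described above).
export function secretMatches(storedHash: string | null, presented: string): boolean {
  if (storedHash === null) return false; // public client: no confidential path
  const expected = Buffer.from(storedHash, "hex");
  const actual = Buffer.from(sha256Hex(presented), "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return expected.length === actual.length && timingSafeEqual(expected, actual);
}
```

A caller would map a `false` result to the 401 `invalid_client` response from step 5.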
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(doctor): 3 new checks: source routing + oauth + autopilot lock (T12/T13/T14)

Three v0.37.7.0 doctor checks landing in one atomic commit (single file, shared merge-conflict surface with garrytan/islamabad-v3 per locked decision 1):

1. source_routing_health (T12 / #1167): Sample non-default sources for pages; warn when a registered source has zero pages (silent-collapse-to-default fingerprint). D5 lock: total-sample cap of 200 pages across all sources, with per-source cap = min(50, ceil(200/N)) so a 20-source CEO brain pays 200 selects, not 1000. Fix hint paste-ready to `gbrain sources current --json` for verification.
2. oauth_confidential_client_health (T13 / #1166): Probe every oauth_clients row. Confidential clients (auth_method != 'none') must have a non-NULL client_secret_hash; if any row claims confidential auth but stores NULL hash, that's the pre-v0.37.7.0 regression. Public clients (auth_method='none') correctly keep NULL hash per v0.34.1.0 #909. Fix hint: `gbrain auth revoke-client + register-client` OR `gbrain upgrade`. Pre-OAuth schemas (missing oauth_clients table) skip gracefully.
3. autopilot_lock_scope (T14 / #1226): Detect stale ~/.gbrain/autopilot.lock outside the current GBRAIN_HOME. Codex CF11: dangerous to paste-ready `rm` without verifying the owning PID isn't a live process. Hint reads the PID file and gives the user a `ps -p <pid>` check before any delete, matching sshd-style stale-lock recovery hints.

9 new test cases pin the canonical paths. Pre-existing 80+ doctor checks unchanged.

Expected to conflict with garrytan/islamabad-v3 at merge time. The 3 new check functions live in their own block far from the islamabad skill_brain_first check; the conflict surface should be limited to the `checks.push(...)` call site near the end of runDoctor's DB-checks phase (~10 lines).
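The per-source sampling cap from check 1 is simple enough to work through directly. The function name and parameter names below are hypothetical; only the formula min(50, ceil(200/N)) comes from the commit message.

```typescript
// Sketch of the D5 cap: perSourceCap = min(50, ceil(200 / N)) for N sources.
export function perSourceCap(sourceCount: number, totalCap = 200, hardCap = 50): number {
  if (sourceCount <= 0) return 0; // nothing to sample
  return Math.min(hardCap, Math.ceil(totalCap / sourceCount));
}
```

For N = 20 this gives ceil(200/20) = 10 pages per source, so 20 × 10 = 200 selects, versus 20 × 50 = 1000 with a flat cap of 50. For small N the hard cap dominates: one source is sampled at 50, and five sources at 40 each.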
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(test): withEnv wrapper in source-resolver-with-tier (test-isolation lint)

The new source-resolver-with-tier.test.ts from T4 mutated process.env.GBRAIN_SOURCE directly in two cases, which violates scripts/check-test-isolation.sh R1 (env mutations leak across parallel-loaded test files in the same shard process).

Fix: wrap both mutation sites in withEnv() from test/helpers/with-env.ts, which saves+restores via try/finally per the canonical pattern in CLAUDE.md.

Pure refactor; all 11 cases still green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: update project documentation for v0.37.7.0

CHANGELOG.md: populated the "What landed" stub with the 18-commit brisbane wave (source-id flag threading, sources current subcommand, graph-query foreign-edge footer, autopilot lockfile scope + reconnect classifier + launchd ThrottleInterval, OAuth confidential client middleware, reindex-frontmatter connect fix, subagent terminal-on-resume fix, sync walker submodule skip, 3 new doctor checks, brain-routing.md convention skill). Voice: ELI10 lead, capability table, paste-ready verification, "what's safe to know" + "what we caught" sections.

CLAUDE.md: extended Key Files annotations for the v0.37.7.0 changes: import/extract --source-id flags, sources current subcommand, graph-query --include-foreign, resolveSourceWithTier() additive helper, autopilot classifyReconnectError + generateLaunchdPlist exports, OAuth confidential client middleware, pruneDir submodule detection, subagent terminal short-circuit, 3 new doctor checks. Pinned by their test files.

llms-full.txt: regenerated via `bun run build:llms` (CI guard at test/build-llms.test.ts will fail otherwise).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: rafaelreis-r <noreply@github.com>

fix(sync): scope auto-embed to source

1c9ceaf

garrytan mentioned this pull request May 18, 2026

v0.36.1.1 fix-wave: community PR triage + 28 atomic fixes #1182

Merged

7 tasks

garrytan mentioned this pull request May 21, 2026

v0.37.7.0 fix wave: federated brains + autopilot safety + OAuth confidential clients #1253

Merged

6 tasks

garrytan closed this May 21, 2026

hnshah wants to merge 1 commit into garrytan:master from hnshah:ren/dogfood-sync-embed-source
