Security Providers: vault-backed secret management for the desktop (AIR-017→023) #673
0xr00tf3rr3t wants to merge 32 commits
Greptile Summary
This PR adds a vault-backed secrets management layer to the desktop app: a two-stage setup wizard (security provider first, then model provider), a Settings panel for testing/editing vault keys, async vault create/TPM-seal, and a KeePassXC onboarding guide. It then fixes six live-data-found credential-routing bugs (AIR-017 through AIR-023) found by exercising the feature against a real vault.
Confidence Score: 4/5
Safe to merge with the installer vault fallback fixed; the broad token regex can strand users at a 401 chat error after incorrectly bypassing onboarding.
The vault install gate in installer.ts falls through to a regex that accepts any _API_KEY/_TOKEN credential from the vault, including unrelated service tokens, without first checking the alias-constrained set (aliasesForEnvKey) that every other gate in this PR uses. A user with Anthropic configured who happens to store GITHUB_TOKEN in their vault for unrelated use passes the gate without an LLM credential, then hits a chat-blocking 401 with no actionable guidance. The remaining changes are well-constructed and the AIR series fixes are logically sound.
src/main/installer.ts: the vault-path hasApiKey fallback should check aliasesForEnvKey(expectedKey) before the broad regex, mirroring how resolvedHasKey() and envHasUsableValue() handle aliases in this same PR.
…reptile P1) The credential NAME-ALIAS map (ANTHROPIC_API_KEY → ANTHROPIC_TOKEN, CLAUDE_CODE_OAUTH_TOKEN) was defined THREE times independently — config-health.ts, validation.ts, and Setup.tsx (as MODEL_KEY_ALIASES) — kept in sync only by comments. Adding an alias to one but not the others would silently split the five security gates (Greptile P1 on PR fathah#673). Centralize it in src/shared/url-key-map.ts (already the single source of truth for credential-key names, already imported by both main and renderer) as KEY_ALIASES + aliasesForEnvKey(). All three sites now import it; no local copies remain. Test: src/shared/url-key-map.test.ts guards the shared table. RED-proven — removing CLAUDE_CODE_OAUTH_TOKEN from the ONE shared map now reds 5 tests across the shared guard + validation (AIR-022) + Setup (AIR-018/018b), confirming every gate consumes the single source. Full suite 1549 passed.
…ary probes (Greptile P1+P2) Two vaultBootstrap fixes from the PR fathah#673 review: P1 — detectExistingVault hardcoded 'keepassxc-cli' in its suggestedCommand for an on-disk vault, while createVault (same module) correctly uses the snap-aware resolveKeepassxcCli() name. On a snap-only system the binary is 'keepassxc.cli', so a user who already has a vault was shown a read command that fails immediately. Now resolves the CLI name (falling back to the apt name for display when none is installed yet), matching createVault. P2/security — hasBinary() and keepassxcIsSnap() interpolated their name/cli argument into a /bin/sh -c string without quoting. All current callers pass hardcoded literals so there's no live exploit, but the unquoted surface is a trap for any future dynamic caller. Both now shellQuote the interpolated value (reusing the module's existing shellQuote helper). vaultBootstrap tests 18/18; full suite 1549 passed.
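The P2 quoting fix can be pictured with a minimal sketch of POSIX single-quote escaping. The body of `shellQuote` and the `hasBinaryCommand` wrapper below are assumptions for illustration only; the module's actual helper is not shown in this conversation.

```typescript
// Hypothetical sketch: quote a value so /bin/sh sees exactly one literal word.
// Wrap in single quotes; for an embedded single quote, close the quote,
// emit an escaped quote, and reopen.
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Hypothetical builder for the string handed to `/bin/sh -c`.
// The name is always quoted, never interpolated raw.
function hasBinaryCommand(name: string): string {
  return "command -v " + shellQuote(name);
}

console.log(hasBinaryCommand("keepassxc.cli"));
console.log(hasBinaryCommand("x; rm -rf ~"));
```

With quoting in place, a value containing `;` or spaces stays a single argument to `command -v` instead of becoming a second shell command.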
Thanks for the review — all three findings addressed: P1 — Triplicate P1 — P2/security — Full suite: 1549 passed, 0 failed. The branch was also merged up to current |
…check TS1117) The upstream merge (5185d0d) brought together two identical invalidateSecretsCache property definitions in the contextBridge object literal in preload/index.ts — one from upstream, one from secrets/04. tsc -p tsconfig.node.json (the CI typecheck config) rejects this with TS1117 'An object literal cannot have multiple properties with the same name'; the duplicate was the CI 'check' job failure on PR fathah#673. (Local bare 'npx tsc --noEmit' uses the default config and did not flag it — must run 'npm run typecheck' to match CI's per-project tsconfig.node/web invocation.) Removed the duplicate (kept upstream's, identical body). npm run typecheck and npm test both pass.
AIR-016 consistency gap — fixed ( Good catch. One implementation note for reviewers: the obvious fix — Every security invariant of the original is preserved (and now regression-tested):
The two IPC handlers now Re: the |
    // The gateway token name may differ from the .env key name (the
    // masking layer's Bearer variant). Accept any resolved provider-shaped
    // credential (*_API_KEY / *_TOKEN) so a vault user isn't blocked.
    hasApiKey = Object.entries(resolved).some(
      ([k, v]) => /(_API_KEY|_TOKEN)$/.test(k) && usable(v),
    );
  }
}
Vault install gate bypassed by any *_TOKEN/*_API_KEY in vault
When expectedKey is known (e.g. "ANTHROPIC_API_KEY") but not found directly in the vault, the code falls through to a broad regex that accepts ANY *_API_KEY or *_TOKEN credential from the vault. A user who has GITHUB_TOKEN, SLACK_BOT_TOKEN, or any other service token in their vault — but no LLM credential — incorrectly passes this gate and is shown the chat screen instead of being guided back through setup. The alias-constrained check (aliasesForEnvKey) already used by resolvedHasKey() in config-health.ts and by envHasUsableValue() directly above in this same file should be consulted before falling through to the broad regex. The broad !expectedKey fallback (provider not catalogued) is still appropriate; the issue is that it also fires when expectedKey is known but its aliases were not checked.
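One way to read this suggestion is sketched below. It is not the installer's actual code: `vaultHasApiKey`, the `usable` predicate, and the local alias table are hypothetical stand-ins (the PR keeps the real table in src/shared/url-key-map.ts). The point is only the ordering: alias-constrained check when `expectedKey` is known, broad regex only when it is not.

```typescript
// Hypothetical alias table for illustration.
const KEY_ALIASES: Record<string, string[]> = {
  ANTHROPIC_API_KEY: ["ANTHROPIC_TOKEN", "CLAUDE_CODE_OAUTH_TOKEN"],
};

function aliasesForEnvKey(key: string): string[] {
  return KEY_ALIASES[key] ?? [];
}

// Assumed meaning of "usable": a non-blank string.
const usable = (v: string | undefined): boolean =>
  typeof v === "string" && v.trim().length > 0;

function vaultHasApiKey(
  resolved: Record<string, string | undefined>,
  expectedKey?: string,
): boolean {
  if (expectedKey) {
    // Known provider: only the expected key or one of its aliases counts.
    return [expectedKey, ...aliasesForEnvKey(expectedKey)].some((k) =>
      usable(resolved[k]),
    );
  }
  // Provider not catalogued: keep the broad provider-shaped fallback.
  return Object.keys(resolved).some(
    (k) => /(_API_KEY|_TOKEN)$/.test(k) && usable(resolved[k]),
  );
}

console.log(vaultHasApiKey({ GITHUB_TOKEN: "placeholder" }, "ANTHROPIC_API_KEY"));
console.log(vaultHasApiKey({ CLAUDE_CODE_OAUTH_TOKEN: "placeholder" }, "ANTHROPIC_API_KEY"));
```

Under this shape, a vault holding only GITHUB_TOKEN no longer passes the gate for an Anthropic setup, while an OAuth-token vault still does.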
Surfaces the secrets-provider state in the renderer and lets a user pick up a
vault rotation without restarting. Builds on the provider (PR 1) and its IPC
wiring (PR 2).
- Gateway.tsx / Settings.tsx: roll getApiServerKeyStatus into loadConfig and add
a 10s poll, so the "API key not configured" banner self-clears within 10s of a
vault rotation. Add a "Refresh from vault" button that invalidates the secrets
cache then re-fetches, with a disabled "Refreshing…" state while in flight.
- preload: expose hermesAPI.invalidateSecretsCache() over the new IPC channel;
the getApiServerKeyStatus result carries the additive { hasKey, providerId?,
checkedAt? } shape so the UI can distinguish vault vs .env vs missing.
- i18n: new English keys (settings/gateway refreshFromVault + refreshingFromVault;
diagnose apiKeyModal note that the warning is ignorable for vault users). Other
locales fall back to English until translated.
Also fixes a pre-existing react-hooks/exhaustive-deps issue this work surfaced in
Gateway.tsx: `platforms` was a fresh array each render, defeating the
filteredPlatforms useMemo — now wrapped in its own useMemo.
Tests: Gateway.test.tsx asserts the 10s poll re-fetches the key status and mocks
the new invalidateSecretsCache; new Settings.test.tsx covers the polling. 7
renderer tests pass; typecheck:node + :web clean.
…a provider)
Renderer counterpart to the unified secrets.provider selector and the
`hermes secrets` CLI verbs: a Settings section where the user can see the
active secret provider, switch between env / command / bitwarden, configure
the command helper, and TEST it — all without any secret value crossing the
IPC boundary.
- New src/main/config.ts secretsProviderStatus(profile): resolves the active
provider's keys via resolvedSecrets() (which routes through the
spawn-rate-floored providerListSafe, so repeated Test clicks can't flood the
main process) and returns { provider, keys, count } — KEY NAMES ONLY, never
values. Honors the bitwarden back-compat (bare enabled ⇒ provider=bitwarden).
- IPC "secrets-provider-status" wired in main/index.ts + exposed in preload
(+ .d.ts type).
- New SecretsProviders.tsx (mirrors MemoryProviders): three provider cards with
active badge, "Use this" to switch (setConfig secrets.provider +
invalidateSecretsCache), a helper-command input for the command provider, and
a Test button that lists resolved key names + count with an explicit
"values are never displayed" note. Embedded in the Settings screen next to the
existing config-health / refresh-from-vault block.
- i18n: secrets_* strings added to the settings namespace (English; other
locales fall back to English until translated — no new namespace, smaller diff).
Tests: SecretsProviders.test.tsx — 5 cases: renders the three cards, reflects
the active provider from config, activate writes secrets.provider + invalidates
cache, Test renders resolved key NAMES + count + the values-hidden note AND
asserts the IPC contract carries no values field, empty-resolve surfaces a
warning. typecheck (node + web) clean; Settings + secrets suite 66 passed;
lint clean for the changed files.
First-run setup is now two-stage: pick a model provider (unchanged), then choose where secrets live: env (.env, recommended to start) / command (vault helper) / bitwarden. The choice is made actively at the moment the API key is saved, so the vault option is discoverable instead of buried in Settings.
- The command card includes WHAT YOU NEED FIRST guidance: how to create a KeePassXC vault (install, db-create, one entry per key with title = env var name), keep it unlocked at startup, and where the full guide lives, so picking the vault path isn't a dead end for a first-timer with no vault yet.
- env is the default and a one-tap pass-through (no friction for "just let me chat"). The entered API key is always saved to .env regardless; the provider choice only governs resolution going forward, stated plainly in the UI.
- command → setConfig(secrets.provider, command) + secrets.command + cache invalidate; bitwarden → setConfig(secrets.provider, bitwarden) and points at the CLI wizard to finish.
Also fixes an arg-order bug introduced while refactoring: setModelConfig is (provider, model, baseUrl); a regression test now locks that order so the baseUrl can never again land in the model slot.
i18n strings added to the setup namespace (English; other locales fall back).
Tests: Setup.test.tsx, 4 cases: Continue advances without completing, Finish saves model config with correct arg order + no secrets write for env, command choice writes the selector + helper + invalidates cache, Back returns to stage 1. typecheck:web clean; lint clean.
User walkthrough (docs/keepassxc-vault-guide.md + screenshots) for keeping API keys in an encrypted KeePassXC vault instead of plaintext .env: create the vault with keepassxc-cli, add an entry per key (title = env var name), verify the helper resolves it, then pick "Vault command" in first-run setup. Pairs with the Security Providers setup onboarding shipped on this branch. Every step captured from a real run; screenshots use placeholder values (no secrets).
…rStatus
Greptile gate, Family 1 (contract-invariant). The renderer suite asserts the
no-values invariant against a MOCKED IPC bridge; nothing tested the REAL
main-process secretsProviderStatus(). This adds a direct test that calls the
production function and asserts it returns {provider,keys,count} with key NAMES
only — serializing the whole return object and asserting a sentinel secret
value appears nowhere. Covers env-empty and throw-degrade paths too.
RED-proven: injecting a 'values' field into secretsProviderStatus reds the
shape assertion. Reverted; function unchanged.
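The sentinel technique described here can be sketched as follows. The function body is a hypothetical stand-in that takes the secrets map as a parameter (the production function resolves them itself); only the `{provider, keys, count}` shape and the serialize-and-search idea come from the commit message.

```typescript
interface ProviderStatus {
  provider: string;
  keys: string[];
  count: number;
}

// Hypothetical stand-in: returns key NAMES only, never values.
function secretsProviderStatus(
  provider: string,
  secrets: Record<string, string>,
): ProviderStatus {
  const keys = Object.keys(secrets).sort();
  return { provider, keys, count: keys.length };
}

// Serialize the whole return object and look for the sentinel anywhere in it.
// This also catches a value smuggled in under an unexpected field name.
function leaksValue(status: ProviderStatus, sentinel: string): boolean {
  return JSON.stringify(status).indexOf(sentinel) !== -1;
}

const SENTINEL = "sentinel-secret-value-do-not-leak";
const status = secretsProviderStatus("command", { ANTHROPIC_API_KEY: SENTINEL });
console.log(status.keys, status.count, leaksValue(status, SENTINEL));
```

Searching the serialized object, rather than asserting on known fields, is what makes the check robust to a future field being added.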
Part of the SDLC pre-launch gate of secrets/03 rebased onto secrets/02.
Closes the two backlog adversarial-test families the Greptile-findings catalog
tracked as "NOT YET WRITTEN" on the secrets-provider rate-limiting code. Both
are mutation-proven load-bearing (RED-verified by flipping the operator /
resetting the timestamp in index.ts, then restoring).
AIR-005 — exact comparison-operator boundaries (family: boundary):
- get() floor: strict `<` MIN_SPAWN_INTERVAL_MS — degrades at 999ms,
re-spawns at EXACTLY 1000ms (boundary exclusive).
- list() TTL: `<=` LIST_CACHE_TTL_MS — fresh through EXACTLY 5000ms inclusive.
- list() spawnAllowed: `>=` MIN_SPAWN_INTERVAL_MS — inclusive at 1000ms.
A future `<`<->`<=` slip on any of the three reds a test (the get() floor and
list() spawnAllowed agree at t==1000 today; nothing pinned that before).
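The three comparisons pinned by AIR-005 can be written out as a small sketch. The constant names and values (1000 ms, 5000 ms) come from the commit message; the three function names are hypothetical and only isolate each operator.

```typescript
const MIN_SPAWN_INTERVAL_MS = 1000;
const LIST_CACHE_TTL_MS = 5000;

// get() floor: degrade (no re-spawn) while strictly inside the floor.
function getDegrades(elapsedMs: number): boolean {
  return elapsedMs < MIN_SPAWN_INTERVAL_MS;
}

// list() TTL: cached data counts as fresh through the TTL, inclusive.
function listIsFresh(ageMs: number): boolean {
  return ageMs <= LIST_CACHE_TTL_MS;
}

// list() spawnAllowed: a new spawn is permitted once the floor has elapsed, inclusive.
function listSpawnAllowed(elapsedMs: number): boolean {
  return elapsedMs >= MIN_SPAWN_INTERVAL_MS;
}

for (const t of [999, 1000]) {
  console.log(t, getDegrades(t), listSpawnAllowed(t));
}
console.log(listIsFresh(5000), listIsFresh(5001));
```

Note that `getDegrades` and `listSpawnAllowed` are exact complements, which is why the get() floor and list() spawnAllowed agree at t == 1000.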
AIR-006 — deletion-visibility window (family: state-ordering):
invalidateProviderListCache() marks data stale but does NOT reset `ts`, so a
HARD-DELETED vault key stays visible to a freshly-spawned gateway for up to
MIN_SPAWN_INTERVAL_MS after "Refresh from vault" — a deliberate
"stale beats wedged" tradeoff. Tests pin the window: deleted key visible
inside the floor, gone at 1000ms, and ROTATION (vs deletion) has no data-loss
window (key never disappears, value just refreshes). Mock gained a
`vaultHasKey` flag to simulate a hard deletion.
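A simplified model of the AIR-006 window is sketched below, assuming a cache that tracks only key names and ignores the list TTL. `ListCache`, `readVault`, and the field names are hypothetical; the behavior modeled (invalidate marks data stale without resetting `ts`) is the one the commit message describes.

```typescript
const MIN_SPAWN_INTERVAL_MS = 1000;

class ListCache {
  private keys: string[] = [];
  private ts = 0; // time of the last helper spawn
  private stale = true;

  constructor(private readVault: () => string[]) {}

  // Marks data stale but deliberately leaves `ts` untouched.
  invalidate(): void {
    this.stale = true;
  }

  list(now: number): string[] {
    const spawnAllowed = now - this.ts >= MIN_SPAWN_INTERVAL_MS;
    if (this.stale && spawnAllowed) {
      this.keys = this.readVault();
      this.ts = now;
      this.stale = false;
    }
    // Inside the floor the old list is returned: stale beats wedged.
    return this.keys;
  }
}

let vault = ["ANTHROPIC_API_KEY", "OLD_KEY"];
const cache = new ListCache(() => vault.slice());
cache.list(10000);            // initial read
vault = ["ANTHROPIC_API_KEY"]; // hard deletion in the vault
cache.invalidate();            // "Refresh from vault"
console.log(cache.list(10500)); // inside the floor
console.log(cache.list(11000)); // floor elapsed
```

Resetting `ts` in `invalidate()` would close the window, at the cost of letting repeated refreshes bypass the spawn floor.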
No production code changed; only the two test files. Full tracked secrets suite:
10 files, 102 tests green (was 93).
…ndows
Two confirmed lock-up / lock-out classes a community user's setup hits that the
happy-path suite missed. Both reproduced against the real Node runtime and
RED-proven load-bearing (revert the fix -> the matching test reds).
T1.1 ORPHANED GRANDCHILD ON TIMEOUT (process leak / slow lock-up).
The provider ran the helper via `execFileSync("/bin/sh", ...)` whose timeout
SIGTERMs only the direct shell. A helper that backgrounds a child — a locked-
vault unlock agent, `keepassxc-cli` forking gpg-agent, a `( … ) & wait`
pipeline — leaves that grandchild ORPHANED on every 3s timeout. Resolved
per-key at gateway spawn under a locked/slow vault, that's a steady process
leak. Fix: run the helper in its OWN process group (`detached`) and, after
spawnSync returns, `process.kill(-pid, "SIGKILL")` the whole group so
grandchildren are reaped. Switched execFileSync -> spawnSync (group kill needs
the returned pid) behind a single `runHelper()`; killSignal is now SIGKILL
(SIGTERM let a blocked helper linger). All existing behavior preserved
(timeout, output cap, key-as-env-data, stderr-discard F6, structured-only
logging).
T1.2 WINDOWS SILENT DEAD-END (total key lock-out).
`/bin/sh` does not exist on win32, so a configured command provider threw
ENOENT -> caught -> EVERY key resolved to null silently (only a console.warn
the user never sees). A Windows community user who picks "command" gets a
silent lock-out of their keys with no explanation. Fix: `runHelper` short-
circuits on win32 (code EUNSUPPORTED_PLATFORM, no doomed spawn), and a new
exported `commandProviderUnsupportedReason()` gives the onboarding/Settings UI
an actionable message + steers the user to the env provider — no dead-end
picker.
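The win32 short-circuit in T1.2 might look roughly like the sketch below. Only the names `runHelper`, `commandProviderUnsupportedReason`, and the code `EUNSUPPORTED_PLATFORM` appear in the commit message; the signatures, the injected `platform` and `spawn` parameters, and the message text are assumptions made so the logic can be exercised without spawning anything.

```typescript
type HelperResult =
  | { ok: true; stdout: string }
  | { ok: false; code: string };

// Hypothetical message text; returns null when the platform is supported.
function commandProviderUnsupportedReason(platform: string): string | null {
  if (platform === "win32") {
    return "The command provider needs /bin/sh, which is not available on Windows. Use the env provider instead.";
  }
  return null;
}

// Short-circuit before any spawn on an unsupported platform.
function runHelper(platform: string, spawn: () => string): HelperResult {
  if (commandProviderUnsupportedReason(platform) !== null) {
    return { ok: false, code: "EUNSUPPORTED_PLATFORM" };
  }
  return { ok: true, stdout: spawn() };
}

console.log(runHelper("win32", () => "never called"));
console.log(runHelper("linux", () => "value-from-helper"));
```

Returning a structured code instead of catching ENOENT gives the UI something it can turn into an explanation rather than a silent null.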
T1.3 (install-gate / config-health fail direction) was investigated and is NOT a
bug: checkInstallStatus defaults hasApiKey=false and runConfigHealthCheck wraps
every check in try/catch (swallow + degrade), so a throwing secrets probe fails
SAFE (to setup / empty audit), never wedges the app. No change needed; verified,
not assumed.
Tests: new commandProvider.robustness.test.ts (orphan-reap live spawn; win32
gate: unsupported-reason, runHelper short-circuit, get()/list() degrade no-throw;
POSIX sanity). stdio F6 test reworked onto runHelper. Tracked secrets suite:
11 files, 107 green. tsc + eslint clean on changed files. No new failures in the
full suite (the 3 pre-existing reconcileStreamedWithDb reds are unrelated, see
PR findings).
…ardened base Reconciles fix/readiness-remote-mode-guard (vault bootstrap: first-run create, auto-detect, opt-in TPM seal, UID-safe paths, snap-KeePassXC, write path) onto the secrets/03 hardened tree, preserving AIR-001..015 (esp. AIR-014 orphan reap + AIR-015 Windows gate). Shared-core files (commandProvider.ts, index.ts, spawnRateFloor/property tests) taken from 03 — 03 is a strict superset of bootstrap's edits to them. AIR-010 sibling-divergence handled by additive layer, not raw merge.
sealKeyFileToTpm() ran `systemd-creds encrypt --with-key=tpm2` via synchronous execFileSync on the main thread. Measured worst case on a real TPM + snap box: a non-deterministic 7-15s block (bounded by TOOL_TIMEOUT_MS), freezing the whole UI on every onboarding seal attempt -- the lock-up mumbo hit live while the seal 'timed out while adding the password'. The outcome was already honest (never a false sealed:true; 0600 fallback on timeout); the defect was purely the block.
Fix: add an async tryExecAsync (execFile wrapped in a never-rejecting Promise) used only for the SLOW calls; make sealKeyFileToTpm async/Promise<SealResult> and await it in the already-async vault-seal-tpm IPC handler. The fast sub-100ms probes (command -v / readlink, ~7ms total) stay on the sync helper.
Proven: PROBE-5 (live tier) starts a 100ms event-loop ticker and asserts it keeps ticking during the seal -- 76 ticks over 7.6s with the fix; 0 ticks over a 15s freeze when reverted to sync (RED-proven).
Tracked suite 166/166 green; full suite 1328 passed (the only 7 reds are the pre-existing live-smoke + reconcile-streamed set, identical on 03's tip). Catalog: AIR-017.
…bling) createVault() called keepassxc-cli db-create via synchronous execFileSync on the Electron main thread — the same wedge class as the TPM seal (AIR-016), just the sibling slow call. A snap-confined CLI or slow disk can make db-create take seconds, freezing the UI during first-run vault creation. Fix: createVault is now async/Promise<CreateVaultResult>, using the same tryExecAsync helper for the slow db-create; the already-async vault-create IPC handler awaits it. Renderer is unaffected (it calls over IPC, already a promise). All callers updated to await (typecheck enforces it — a sync caller won't compile against the Promise return). Tracked suite 166/166; full suite 1328 passed (only the known live-smoke + reconcile-streamed reds remain, identical on 03).
…providers screen
On Settings -> Security Providers, after a successful Test the resolved keys
rendered as bare name badges with no indication the VALUE comes from the vault.
Add a per-key 'Vault Provided' label next to each resolved key name so the user
can see the value is supplied by the vault (not typed / .env). Values are still
never shown — only names + this provenance label.
- New i18n string settings.secrets_vaultProvided ('Vault Provided').
- Per-key label span in the resolved-keys list (success-colored, uppercase, small).
- Test: assert exactly one 'Vault Provided' label per resolved key (2 keys -> 2),
RED-proven by removing the label span (test reds 'Unable to find element').
Tracked Settings/i18n suites green; full suite 1328 passed (only the known
live-smoke + reconcile-streamed reds remain, identical on 03).
…ocess.env overlay (AIR-017) Found by LIVE-DATA testing the IPC against the real vault: secretsProviderStatus() computed Object.keys(resolvedSecrets()), which overlays the ENTIRE process.env onto the vault keys — so in the Electron main process it returned ~130 keys (PATH, HOME, npm_config_*, DISPLAY, …), not the user's 6 real vault keys. The Security Providers screen renders each as a 'Vault Provided' badge, so it would falsely label every environment variable as vault-provided. This is the DISPLAY-path sibling of the appsec 'fail-open gate trap': the canWrite gate was already fixed to count providerListSafe() (vault-only) for exactly this reason; the display path never got the same treatment. Both the renderer test and the main-process contract test HID it by mocking the resolver to a small clean set (AIR-011 trap) — only real-vault + real-process.env testing surfaced the 130. Fix: Object.keys(providerListSafe(profile)) (vault-only, spawn-floor cached). The no-VALUE invariant was always intact; this corrects the key SET. Live-confirmed: count 130 -> 6 (the real vault keys only). Contract test re-pointed to mock providerListSafe (the fn the producer now calls) + new AIR-017 case asserting the process.env overlay does NOT bleed into the badge list; RED-proven by reverting (3 tests red, count 123). Full suite 1333 passed (only the pre-existing reconcile-streamed reds remain). Catalog: AIR-017.
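The difference between the two key sets in AIR-017 can be shown with a tiny sketch. Both functions are hypothetical stand-ins that take the vault and environment as parameters; the merge precedence is an assumption and does not matter here, since only key names are compared.

```typescript
type KV = Record<string, string>;

// Hypothetical stand-in for the overlaying resolver: vault keys plus the whole environment.
function resolvedSecrets(vault: KV, env: KV): KV {
  return { ...vault, ...env }; // precedence is an assumption
}

// Display path before the fix: every environment variable becomes a badge.
function badgeKeysBefore(vault: KV, env: KV): string[] {
  return Object.keys(resolvedSecrets(vault, env));
}

// Display path after the fix: vault-only key names.
function badgeKeysAfter(vault: KV): string[] {
  return Object.keys(vault);
}

const env: KV = { PATH: "/usr/bin", HOME: "/home/user", DISPLAY: ":0" };
const vault: KV = { ANTHROPIC_API_KEY: "placeholder" };
console.log(badgeKeysBefore(vault, env).length, badgeKeysAfter(vault).length);
```

A test that mocks the resolver to a small clean set cannot see this difference, which is why the bug only appeared against a real process environment.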
…ual key toggle (AIR-018) Found by live GUI testing against the real vault: on the Setup model step, a vault-only user whose Anthropic credential is the OAuth token (CLAUDE_CODE_OAUTH_TOKEN, the Claude Code auth-patch token) was falsely forced to enter an API key. The onboarding's vaultHasModelKey() did a bare vaultKeys.includes(ANTHROPIC_API_KEY) with no alias awareness — the THIRD credential-name gate with this class of bug (install gate + warning banner already had partial alias bridges; Setup had none). Part 1 — detection: add CLAUDE_CODE_OAUTH_TOKEN as an ANTHROPIC_API_KEY alias in both config-health.ts KEY_ALIASES and a new MODEL_KEY_ALIASES in Setup.tsx (kept in lock-step). All three authenticate to Anthropic, so a vault holding any one already provides the model key. Part 2 — both options (mumbo's ask): when the vault covers the key, show an explicit toggle 'Use vault credential' vs 'Enter an API key' (radiogroup) on both the named-provider and custom-URL branches. Default vault; manual reveals the key field and re-requires a typed key. handleFinish validates the effective mode. Tests: Setup.test.tsx AIR-018 case (alias->vault-covered by default, toggle offers both, manual reveals field, switch-back hides, Finish works in vault mode with no empty setEnv). GREEN 11/11; RED-proven by reverting the alias (1 test red). Full suite 1334 passed (only pre-existing reconcile-streamed reds). Catalog: AIR-018.
…-guard + auto-load), AIR-018 follow-up
LIVE bug mumbo hit: on the Setup model step with an EXISTING vault (config already has secrets.provider=command), the Anthropic credential was NOT detected: no vault/manual toggle appeared, and the screen demanded an API key. Two root causes, both in the renderer:
1. vaultHasModelKey() bailed on its FIRST line: if (secretsChoice === 'env') return false. secretsChoice is local React state that defaults to 'env' and only flips when the user CLICKS the command tile during onboarding. An existing-vault user reaches the model step with secretsChoice still 'env' while the CONFIG has a command provider, so detection died before the alias check ran. Fix: drop the env-guard; the resolved vaultKeys list is the authoritative signal (empty keys already yields false, and env resolves no provider keys).
2. vaultKeys was only populated by an explicit 'Test vault' click on the secrets step. A user who reached the model step without it had vaultKeys=[]. Fix: a useEffect auto-loads secretsProviderStatus() key names on entering the provider stage (self-guards: env returns no keys -> no toggle).
Also aligned the Finish button's disabled condition to showingVaultCredential() (was bare vaultHasModelKey()) so it matches handleFinish validation.
Diagnosed by instrumenting vaultHasModelKey: debug showed vaultKeys correctly held CLAUDE_CODE_OAUTH_TOKEN and anthropic was selected, but secretsChoice='env' early-returned.
Test AIR-018b reproduces the existing-vault path (reach model step with NO Test-vault click); GREEN 12/12, RED-proven by restoring the env-guard (1 red). Full suite 1335 passed (only pre-existing reconcile-streamed reds).
…d in envkey-mismatch auto-fix (AIR-019, AIR-020)
Two distinct bugs surfaced by mumbo live-testing chat after the AIR-018 toggle fix:
AIR-019: empty model.default bricks chat (400). The Setup model-name field is optional; a blank submission persisted model.default: "", and setModelConfig wrote it unconditionally. The gateway then POSTs model:"" → Anthropic 400 'model: String should have at least 1 character'. Fix: setModelConfig only writes model.default when the model string is non-empty; a blank call leaves any existing valid model untouched (never clobbers a good selection, never writes an empty one). Defense in depth in the main process so ANY caller is protected. Tests: empty-model guard + no-clobber, RED-proven (revert → 2 red).
AIR-020: credential-bleed in UI_RUNTIME_ENVKEY_MISMATCH auto-fix. When the expected provider key was empty, the detector picked ANY populated *_API_KEY / *_TOKEN in .env and offered to copy it across, so with ANTHROPIC_API_KEY empty it suggested copying the UNRELATED MATRIX_ACCESS_TOKEN into the Anthropic slot: a wrong, non-working value AND a mislabel of another service's secret. Fix: restrict candidates to KNOWN ALIASES of the expected key (KEY_ALIASES: ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN for ANTHROPIC_API_KEY). If no alias is populated there is no safe rename; that's MODEL_KEY_MISSING territory, not an auto-copy. Tests: alias-IS-flagged + unrelated-key-NOT-flagged, RED-proven (greedy revert → 1 red).
Full suite 1338 passed (only pre-existing reconcile-streamed reds).
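The AIR-019 guard reduces to a small pure rule, sketched here. `writeModelDefault` and `ModelSection` are hypothetical names (the real setModelConfig persists to the config file), and treating whitespace-only input as blank is an assumption beyond the "non-empty" wording of the commit.

```typescript
interface ModelSection {
  default?: string;
}

// Hypothetical pure version of the guard: write only a non-empty model.
function writeModelDefault(section: ModelSection, model: string): ModelSection {
  const trimmed = model.trim(); // assumption: whitespace-only counts as blank
  if (trimmed === "") {
    return section; // blank call: leave any existing selection untouched
  }
  return { ...section, default: trimmed };
}

console.log(writeModelDefault({ default: "existing-model" }, ""));
console.log(writeModelDefault({}, ""));
console.log(writeModelDefault({ default: "existing-model" }, "new-model"));
```

Placing the rule at the write site rather than in the form means every caller is covered, which is the defense-in-depth point the commit makes.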
…v1/models (AIR-019) Completes the empty-model fix: the guard (prev commit) stops the empty-string brick, but a fresh user who leaves the optional model-name field blank still ended up with NO model. Per mumbo: query the provider's /v1/models endpoint and use a returned id as the fallback — a LIVE default that can't go stale like a hardcoded constant. handleFinish, when the field is blank, calls the existing discoverProviderModels IPC (reuses model-discovery.ts, which already auth-resolves via the vault and has Anthropic /v1/models support) and picks a model via pickDefaultModel(): prefer a clean stable id, de-prioritising dated snapshots (…-20250219) and -preview/-beta/ deprecated, keeping the provider's own ordering otherwise. Best-effort: if discovery is unreachable (no network / unresolved key) it persists no model and the main-process guard preserves any existing valid selection — never writes empty. Tests: pickDefaultModel unit suite (empty, snapshot-vs-stable, ordering, all-noisy fallback, trim/blank) + two Setup integration tests (blank field → discovered model persisted; discovery-unreachable → empty persisted, guard handles it). Full suite 1345 passed (only pre-existing reconcile-streamed reds).
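A rough sketch of the selection rule described for pickDefaultModel follows. The function name comes from the commit message; the regular expressions are guesses at what "dated snapshot" and "-preview/-beta/deprecated" mean in practice, and the model ids in the usage lines are made up.

```typescript
// Hypothetical noise patterns; the PR's exact heuristics are not shown.
const NOISY_PATTERNS: RegExp[] = [
  /-\d{8}$/, // dated snapshot suffix such as -20250219
  /-preview\b/i,
  /-beta\b/i,
  /deprecated/i,
];

function pickDefaultModel(ids: string[]): string | undefined {
  // Drop blanks and surrounding whitespace first.
  const cleaned = ids.map((id) => id.trim()).filter((id) => id.length > 0);
  // Keep the provider's ordering; just filter out the noisy ids.
  const stable = cleaned.filter(
    (id) => !NOISY_PATTERNS.some((re) => re.test(id)),
  );
  // Prefer the first stable id; if everything is noisy, fall back to the first id.
  return stable.length > 0 ? stable[0] : cleaned[0];
}

console.log(pickDefaultModel(["model-a-20250219", "model-a", "model-b-preview"]));
console.log(pickDefaultModel(["model-b-preview"]));
console.log(pickDefaultModel([]));
```

An empty result means no model is persisted, and the main-process guard then preserves whatever selection already exists.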
…kill the false banner + harmful copy (AIR-021)
mumbo's running app showed: 'ANTHROPIC_API_KEY is empty but CLAUDE_CODE_OAUTH_TOKEN has a value, likely saved under the wrong name' with a one-click 'copy across' auto-fix. Both the banner AND the fix are wrong:
- FALSE POSITIVE: the gateway's anthropic provider plugin reads ANTHROPIC_API_KEY, ANTHROPIC_TOKEN AND CLAUDE_CODE_OAUTH_TOKEN directly (env_vars). A populated CLAUDE_CODE_OAUTH_TOKEN ALREADY satisfies the credential; ANTHROPIC_API_KEY being empty is correct/intended for an OAuth setup, not a misname.
- HARMFUL FIX: an OAuth token (sk-ant-oat…) is only valid on the Authorization: Bearer path. Copied into ANTHROPIC_API_KEY it is sent as x-api-key → Anthropic 401 'invalid x-api-key' (the documented OAuth-in-api-key-slot self-inflicted-401 trap). The auto-fix would BREAK a working setup.
Fix: when a populated accepted-alias exists, the credential is satisfied, so emit NO issue (return []), exactly like the customEndpointKeyResolvable early-return. The copy-suggestion path is removed entirely (the AIR-020 alias-restriction already killed the unrelated-key bleed; this removes the remaining alias-copy footgun). When neither expected key nor any accepted alias is populated, that's MODEL_KEY_MISSING territory, not a rename.
Tests: 'OAuth-token alias = SATISFIED, no false mismatch/copy' + the unrelated-key guard retained. RED-proven by restoring flag-on-alias (1 red). Full suite 1345 passed (only pre-existing reconcile-streamed reds).
NOTE: the live trigger was ALSO a stale-token desync (tmpfs/.env held a dead OAuth token while ~/.claude/.credentials.json was fresh), fixed operationally by mirroring the fresh token into the vault+tmpfs; this commit fixes the bogus banner that the desync surfaced.
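Why the copy-across fix is harmful can be illustrated with a simplified model of header routing by credential name. This is a sketch, not the gateway plugin's code: `authHeadersFor` is a hypothetical function, and the mapping reflects only what this commit message and the AIR-023 message state (API key on x-api-key, OAuth/Bearer names on Authorization: Bearer).

```typescript
type AuthHeaders = Record<string, string>;

// Hypothetical routing: the env var NAME decides which header carries the value.
function authHeadersFor(envKey: string, value: string): AuthHeaders {
  switch (envKey) {
    case "ANTHROPIC_API_KEY":
      return { "x-api-key": value };
    case "ANTHROPIC_TOKEN":
    case "CLAUDE_CODE_OAUTH_TOKEN":
      return { Authorization: "Bearer " + value };
    default:
      throw new Error("not an Anthropic credential name: " + envKey);
  }
}

const oauthToken = "oauth-token-placeholder";

// Working setup: the token stays under its own name and goes out as Bearer.
console.log(authHeadersFor("CLAUDE_CODE_OAUTH_TOKEN", oauthToken));

// After the old auto-fix: same value, wrong slot, so it goes out as x-api-key.
console.log(authHeadersFor("ANTHROPIC_API_KEY", oauthToken));
```

The value is identical in both calls; only the slot differs, which is why renaming a working OAuth credential turns it into a 401.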
…ock Send for vault OAuth users (AIR-022) mumbo's live error after the token-desync fix: 'Missing ANTHROPIC_API_KEY — required by the active provider.' This is the FOURTH credential-name gate with the OAuth-alias gap (install gate, config-health banner, config-health mismatch were the first three). validateChatReadiness (the pre-send gate that enables/disables Send) checked only the canonical ANTHROPIC_API_KEY and an auth.json OAuth path — it did NOT recognize CLAUDE_CODE_OAUTH_TOKEN / ANTHROPIC_TOKEN in .env, so a vault user whose Anthropic credential is the OAuth token got a false MISSING_API_KEY block and a disabled Send button even though the gateway authenticates fine via the Bearer path. Fix: before returning MISSING_API_KEY, check KEY_ALIASES (kept in lock-step with config-health.ts and Setup.tsx) for a populated accepted alias; if present, fail open (return OK). Mirrors the install gate and the other three surfaces. Tests: OAuth-token alias allows Send, ANTHROPIC_TOKEN alias allows Send, and an UNRELATED token (MATRIX_ACCESS_TOKEN) still BLOCKS (no credential-bleed false-pass). RED-proven by removing the alias loop (2 red). Full suite 1348 passed (only pre-existing reconcile-streamed reds). This was the last gate — all four credential-name surfaces now accept the OAuth token alias.
… provider to the gateway (AIR-023)
Audit of provider <-> gateway <-> security-provider compatibility (mumbo's
request, generalized for the community client): the desktop's KNOWN_API_KEYS
list — the credential env-var names forwarded from the secrets provider into the
agent env on the CLI/non-gateway fallback path — had DRIFTED out of sync with the
gateway provider plugins' env_vars. It was missing every OAuth/Bearer-token name
and per-vendor alias that isn't the canonical <VENDOR>_API_KEY:
ANTHROPIC_TOKEN, CLAUDE_CODE_OAUTH_TOKEN (anthropic OAuth/Bearer),
GOOGLE_API_KEY/GEMINI_API_KEY, ZAI_API_KEY/Z_AI_API_KEY,
COPILOT_GITHUB_TOKEN/GH_TOKEN/GITHUB_TOKEN, KIMI_CODING_API_KEY/KIMI_CN_API_KEY,
DASHSCOPE_API_KEY/ALIBABA_CODING_PLAN_API_KEY, XAI_API_KEY, NVIDIA/NOVITA/
STEPFUN/GMI/ARCEEAI/KILOCODE/OPENCODE_ZEN/OPENCODE_GO/QWEN/NOUS/AZURE_FOUNDRY.
So a vault-only user whose provider credential is stored under any of these names
got NO key forwarded on that path — a whole class of community users, not just the
Anthropic-OAuth case that surfaced it. (The primary buildGatewayEnv path already
overlays the full providerListSafe() set unfiltered and was complete; this brings
the CLI-fallback path to parity.)
Fix: extend KNOWN_API_KEYS to mirror the plugins' env_vars, extract it to an
exported module-level const, and add a drift-guard test (knownApiKeys.test.ts)
that asserts the set CONTAINS the OAuth/Bearer names + per-vendor aliases — a
behavior contract, not a snapshot, so it survives reordering but reds if a
credential name is dropped. RED-proven by removing CLAUDE_CODE_OAUTH_TOKEN
(2 tests red). Full suite 1354 passed (only pre-existing reconcile-streamed reds).
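The "behavior contract, not a snapshot" idea behind the drift guard can be sketched as a containment check. The list below is a short hypothetical excerpt, not the real KNOWN_API_KEYS, and `missingNames` is an illustrative helper rather than anything in knownApiKeys.test.ts.

```typescript
// Hypothetical excerpt; the real exported list is longer.
const KNOWN_API_KEYS: readonly string[] = [
  "ANTHROPIC_API_KEY",
  "ANTHROPIC_TOKEN",
  "CLAUDE_CODE_OAUTH_TOKEN",
  "GOOGLE_API_KEY",
  "GEMINI_API_KEY",
  "GH_TOKEN",
];

// Names the guard insists on. Containment rather than equality,
// so reordering or adding names passes but dropping one fails.
const REQUIRED_NAMES: readonly string[] = [
  "ANTHROPIC_TOKEN",
  "CLAUDE_CODE_OAUTH_TOKEN",
  "GEMINI_API_KEY",
];

function missingNames(
  known: readonly string[],
  required: readonly string[],
): string[] {
  return required.filter((name) => known.indexOf(name) === -1);
}

console.log(missingNames(KNOWN_API_KEYS, REQUIRED_NAMES));
```

A snapshot test would fail on any edit to the list; this shape fails only when a required credential name disappears.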
Compatibility audit summary (all GREEN after this):
- model provider (anthropic) <-> secrets provider (command/vault): credential
resolved, no OAuth-in-api-key-slot duplication.
- gateway-spawn env: buildGatewayEnv forwards full provider set; CLI fallback
now at parity.
- 5 credential-name gates (install, config-health warn, config-health mismatch,
Setup detect, chat-readiness) + this env-forward layer all accept the alias.
…reptile P1) The credential NAME-ALIAS map (ANTHROPIC_API_KEY → ANTHROPIC_TOKEN, CLAUDE_CODE_OAUTH_TOKEN) was defined THREE times independently — config-health.ts, validation.ts, and Setup.tsx (as MODEL_KEY_ALIASES) — kept in sync only by comments. Adding an alias to one but not the others would silently split the five security gates (Greptile P1 on PR fathah#673). Centralize it in src/shared/url-key-map.ts (already the single source of truth for credential-key names, already imported by both main and renderer) as KEY_ALIASES + aliasesForEnvKey(). All three sites now import it; no local copies remain. Test: src/shared/url-key-map.test.ts guards the shared table. RED-proven — removing CLAUDE_CODE_OAUTH_TOKEN from the ONE shared map now reds 5 tests across the shared guard + validation (AIR-022) + Setup (AIR-018/018b), confirming every gate consumes the single source. Full suite 1549 passed.
…ary probes (Greptile P1+P2) Two vaultBootstrap fixes from the PR fathah#673 review: P1 — detectExistingVault hardcoded 'keepassxc-cli' in its suggestedCommand for an on-disk vault, while createVault (same module) correctly uses the snap-aware resolveKeepassxcCli() name. On a snap-only system the binary is 'keepassxc.cli', so a user who already has a vault was shown a read command that fails immediately. Now resolves the CLI name (falling back to the apt name for display when none is installed yet), matching createVault. P2/security — hasBinary() and keepassxcIsSnap() interpolated their name/cli argument into a /bin/sh -c string without quoting. All current callers pass hardcoded literals so there's no live exploit, but the unquoted surface is a trap for any future dynamic caller. Both now shellQuote the interpolated value (reusing the module's existing shellQuote helper). vaultBootstrap tests 18/18; full suite 1549 passed.
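The quoting fix can be illustrated with a small sketch. It assumes POSIX single-quote quoting; the module's actual shellQuote helper may differ, and `buildHasBinaryProbe` (including the `command -v` probe it builds) is a hypothetical stand-in for however hasBinary() constructs its shell string.

```typescript
// Assumption: POSIX sh single-quote quoting. Everything inside '...' is literal,
// and an embedded single quote is written as '\'' (close, escaped quote, reopen).
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Hypothetical probe builder: the name is quoted before it is interpolated
// into the string handed to /bin/sh -c.
function buildHasBinaryProbe(name: string): string {
  return "command -v " + shellQuote(name) + " >/dev/null 2>&1";
}
```

With quoting in place, a name such as `x; rm -rf /` stays a single inert argument instead of becoming two commands.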
…check TS1117) The upstream merge (5185d0d) brought together two identical invalidateSecretsCache property definitions in the contextBridge object literal in preload/index.ts — one from upstream, one from secrets/04. tsc -p tsconfig.node.json (the CI typecheck config) rejects this with TS1117 'An object literal cannot have multiple properties with the same name'; the duplicate was the CI 'check' job failure on PR fathah#673. (Local bare 'npx tsc --noEmit' uses the default config and did not flag it — must run 'npm run typecheck' to match CI's per-project tsconfig.node/web invocation.) Removed the duplicate (kept upstream's, identical body). npm run typecheck and npm test both pass.
… vi.hoisted harness The merge resolution (5185d0d) took secrets/04's test harness for config-health.test.ts, which mocked profilePaths on ./config — but the source imports it from ./utils. That worked locally (~/.hermes/config.yaml exists) but failed in CI's clean checkout where real paths resolve to non-existent files → configExists=false → EMPTY_API_SERVER_KEY doesn't fire → 5 test failures. Root cause: mock on wrong module (./config instead of ./utils). Upstream had already fixed this with a vi.hoisted + vi.mock("./utils") + lazy await-import harness designed for CI determinism. Fix: take upstream's robust harness as the base, port secrets/04's additional tests (alias-awareness + connection-mode-audit) onto it. Added getConnectionConfig to the mocks object, mock factory, and alias section. All 19 tests pass with profilePaths properly mocked on ./utils. Full suite 1549/0, npm typecheck clean.
## Summary
Adds a config-gated opt-out for the auto-updater. Auto-update remains ENABLED BY DEFAULT: only an explicit `desktop.auto_update: false` (or `0`) in config.yaml disables it, so behavior is unchanged for everyone who never sets the key.
The opt-out exists for users who run a locally-built or patched `/opt` artifact: electron-updater's `autoDownload` + `autoInstallOnAppQuit` will otherwise re-download the public release and silently overwrite their build on quit.
- `isAutoUpdateDisabled()`: pure, exported decision in config.ts (mirrors the existing `decideCanWrite` extraction pattern) so the gate is unit-testable without the Electron/IPC coupling in `setupUpdater()`.
- `setupUpdater()` short-circuits via the same no-op-IPC path already used for dev/portable builds when the opt-out is set; no autoDownload wiring runs.
- Settings UI: an "Automatic updates" toggle (config-backed, optimistic with rollback on failure, shows a restart notice since the gate is read once at launch). English i18n strings added to the existing settings namespace (other locales fall back to English).
- Docs: a "Disabling auto-update" section + troubleshooting row in the KeePassXC vault guide (where a patched-build user is most likely to look).
## Testing
- `npm run typecheck`: clean.
- `npx vitest run src/main/isAutoUpdateDisabled.test.ts`: 5/5 pass (default ON for null/unset/empty/whitespace, disabled only for explicit false/0, case- and whitespace-insensitive, any unrecognized value fails safe to ON).
- Full main-process suite (secrets + config-health + this): 192/192 pass.
- prettier/eslint: no new warnings on changed lines.
Follow-up hardening on f5b4aba (desktop.auto_update opt-out).
The opt-out decision was duplicated: config.ts had its own isAutoUpdateDisabled() and the renderer's Settings toggle inlined the same `=== "false" || === "0"` check. Two copies of one security-relevant gate is a sibling-asymmetry drift risk: if one side's accepted values changed, the UI and the updater would silently disagree about whether auto-update is on.
Changes
- New src/shared/auto-update-gate.ts: the single source of truth. Takes `unknown` and coerces via String() so the renderer can pass getConfig()'s raw return straight in. ENABLED BY DEFAULT; only explicit "false"/"0" disables; null/unset/empty/whitespace/garbage all fail SAFE to upstream-ON.
- src/main/config.ts now RE-EXPORTS isAutoUpdateDisabled from the shared helper (main gate in setupUpdater() unchanged at the call site).
- src/renderer Settings.tsx calls the shared helper instead of its inline copy.
- src/main/autoUpdateGateParity.test.ts: drift guard. Asserts the main re-export IS the shared helper (same reference) and agrees across the full input matrix. Reds if anyone reintroduces a divergent copy.
- src/shared/auto-update-gate.test.ts: 7 contract/adversarial tests (default-ON, empty/whitespace, explicit disable, case/whitespace-insensitive, fail-safe on garbage, non-string coercion never throws, renderer write-vocabulary round-trip). Supersedes the removed src/main/isAutoUpdateDisabled.test.ts.
- docs/diagrams/auto-update-gate-diagrams.md: logical flow + SECRET/overwrite-gate workflow (Mermaid, both validated to parse).
SDLC
- Step-0: security-relevant (controls the build-overwrite behavior + config parse + renderer surface).
- AppSec two-person rule: independent delegated audit returned a HIGH (the refactor was un-applied on config.ts after a RED-proof `git checkout` reverted it); fixed and re-verified by reading real bytes + live test run. Cataloged as AIR-024 (verification-integrity class). Final verdict: SHIP.
- typecheck (node+web) clean; gitleaks clean; zero deps introduced; semgrep on-diff = 4 benign i18next-key-format style FPs.
- Full main-process + shared suite: 235/235 pass (incl. live vault/TPM probes).
Fail-direction: the gate fails CLOSED to the upstream default (ENABLED). A config typo can never silently disable security updates.
Rollback: revert this commit. f5b4aba's gate keeps working (it just regains a local copy of the decision); no migration, no data touched.
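The shared decision described above could look roughly like the sketch below. The function name and the accepted values come from the commit message; the exact normalization steps and the try/catch around String() are assumptions about how "never throws" might be achieved.

```typescript
// Sketch: only an explicit "false" or "0" disables auto-update.
// Everything else (null, unset, empty, whitespace, garbage) stays enabled.
function isAutoUpdateDisabled(raw: unknown): boolean {
  if (raw === null || raw === undefined) return false;
  let text: string;
  try {
    // String() can throw for exotic inputs (e.g. an object with no prototype),
    // so a failed coercion is treated as "not disabled".
    text = String(raw);
  } catch {
    return false;
  }
  const normalized = text.trim().toLowerCase();
  return normalized === "false" || normalized === "0";
}
```

Taking `unknown` lets both the main process and the renderer pass a raw config value without pre-validating it.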
…decision)
The auto-update opt-out had a tested DECISION (isAutoUpdateDisabled) but the
WIRING gate it feeds — the early return in setupUpdater() that must fire BEFORE
`autoUpdater.autoDownload = true` / `autoInstallOnAppQuit = true` — was an
untested inline boolean (`!app.isPackaged || isPortableBuild || autoUpdateDisabled`).
That early return IS the protection the opt-out exists for (it's what stops the
updater from overwriting a patched /opt build on quit), so it deserves its own
regression test.
- Extract the inline condition to a pure `shouldSkipUpdaterWiring({isPackaged,
isPortableBuild, autoUpdateDisabled})` in config.ts (mirrors the decideCanWrite
/ isAutoUpdateDisabled extraction pattern) so the gate is unit-testable without
the Electron/ipcMain/require("electron-updater") coupling.
- setupUpdater() now calls the predicate; behavior is identical (the truth-table
test pins equivalence to the old inline expression).
- New src/main/shouldSkipUpdaterWiring.test.ts: full 8-row truth table (only a
packaged, non-portable, enabled build wires the updater; every other combo
skips), the safety-critical packaged+opt-out=SKIP case stated explicitly, and
an end-to-end compose-with-isAutoUpdateDisabled check (false/0 => skip; null/
empty/garbage => wire, fail-safe to upstream-ON).
Testing
- typecheck (node+web) clean.
- Full main+shared suite 265/265 pass.
- RED-proven: dropping the autoUpdateDisabled skip condition reds 3/4 cases
(the opt-out path stops skipping) — restored via reversible patch (no
git checkout, per AIR-024).
No production behavior change — this makes an existing safety gate provable.
Rollback: revert this commit; the inline boolean returns, gate behavior unchanged.
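As a rough illustration of the extracted predicate and its truth table: the function name, parameter names, and boolean expression are taken from the commit message, while the `UpdaterGateInput` type name and the `combosThatWire` enumerator are hypothetical.

```typescript
interface UpdaterGateInput {
  isPackaged: boolean;
  isPortableBuild: boolean;
  autoUpdateDisabled: boolean;
}

// Same expression as the old inline boolean, now a pure function.
function shouldSkipUpdaterWiring(input: UpdaterGateInput): boolean {
  return !input.isPackaged || input.isPortableBuild || input.autoUpdateDisabled;
}

// Hypothetical helper: enumerate all 8 combinations and keep those that wire the updater.
function combosThatWire(): UpdaterGateInput[] {
  const bools = [false, true];
  const wired: UpdaterGateInput[] = [];
  for (const isPackaged of bools) {
    for (const isPortableBuild of bools) {
      for (const autoUpdateDisabled of bools) {
        const input = { isPackaged, isPortableBuild, autoUpdateDisabled };
        if (!shouldSkipUpdaterWiring(input)) wired.push(input);
      }
    }
  }
  return wired;
}
```

Exactly one combination survives: a packaged, non-portable build with auto-update enabled.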
… canaries)
Adds the Greptile family-6 (data-not-code) adversarial suite for the SSH remote
command builders. The existing ssh-remote.test.ts proves command STRUCTURE
(quoting shape) + NUL round-trip; this proves SAFETY — that a hostile arg
crossing into the `sh -c` string built by buildRemoteHermesCmd is treated as
inert data and never executes.
Method (the honest one): for each canary, build the real command, run it through
a real shell with a fake `hermes` shim at $HOME/.local/bin/hermes (the absolute
probe path, per the PR-6 CLI-resolution fix), then assert (1) the side-effect the
injection WOULD cause did NOT happen (a canary file is never created) and (2) the
hostile string arrived at the shim verbatim as a single argument.
Canaries: $(touch), backticks, `; cmd`, `&& cmd`, `| cmd`, newline-injected cmd,
redirect overwrite, single-quote breakout, ${IFS} expansion, subshell, plus
$PATH-no-expand and ../traversal-stays-one-arg. Also covers the extraShell
redirect path and the sshSetConfigValue YAML-scalar guard (", \, CR, LF rejected
before any write; a benign URL is NOT rejected).
Testing
- typecheck clean; 17/17 pass; prettier/eslint clean on the new file.
- RED-proven: weakening shellQuote to naive double-quotes reds 11/17 — the
$()/backtick/$PATH/${IFS} canaries fire (canary file created / args mangled),
exactly as a real injection would. Restored via reversible patch (no git
checkout, per AIR-024). Note: a RED-proof of an injection test BY DESIGN
triggers the injection — run such proofs from a temp CWD to avoid littering
the repo root with redirect-target files.
No production code changed — this pins an existing security property.
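The YAML-scalar guard mentioned above can be pictured with a short sketch. `isSafeYamlScalar` is a hypothetical name; the only thing taken from the commit message is the set of rejected characters (double quote, backslash, CR, LF) and that a benign URL passes.

```typescript
// Hypothetical guard: reject values containing any of the four characters
// the commit message lists, before anything is written.
function isSafeYamlScalar(value: string): boolean {
  return !/["\\\r\n]/.test(value);
}
```

A caller would check this first and refuse the write on `false`, so a hostile value never reaches the config file.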
…p-every-launch bug)
A vault-only user saw the first-run Setup screen on EVERY launch.
Root cause: checkInstallStatus() decides whether Setup shows (App.tsx: `!status.hasApiKey -> "setup"`), and its .env check, envHasUsableValue(), did an EXACT `key === expectedKey` match. A vault user stores the Anthropic credential under an ALIAS, CLAUDE_CODE_OAUTH_TOKEN (or ANTHROPIC_TOKEN), not the canonical ANTHROPIC_API_KEY. So the gate returned hasApiKey=false and forced Setup every launch even though a usable credential was present.
This is the install gate joining the credential-name-alias family the other gates already handle (config-health, validation, chat-readiness). Per AIR-018: fix the CLASS across ALL gates; the install gate was the missed one.
- envHasUsableValue() is now alias-aware: it accepts the canonical key OR any of its aliases from the shared KEY_ALIASES map (../shared/url-key-map: ANTHROPIC_API_KEY -> [ANTHROPIC_TOKEN, CLAUDE_CODE_OAUTH_TOKEN]), the same single source of truth config-health.ts and validation.ts use. The match stays an allowlist scoped to the provider's own key names: an unrelated token (TELEGRAM_BOT_TOKEN / MATRIX_ACCESS_TOKEN) does NOT satisfy the gate (no credential bleed).
- Robustness (pre-existing bug surfaced while here): re-trim the value AFTER quote-stripping so a quoted-blank `KEY=" "` no longer falsely satisfies the gate. This applies to the exact-match path too, not just aliases.
- envHasUsableValue() is now exported for unit testing.
Testing
- typecheck clean; full suite 1417 pass / 3 skip.
- New tests/install-gate-credential-alias.test.ts (8 cases): canonical accepted; ANTHROPIC_TOKEN + CLAUDE_CODE_OAUTH_TOKEN aliases accepted (incl. surrounded by unrelated tokens + comments); credential-bleed guard (unrelated token NOT accepted); empty / quoted-blank rejected; null-expectedKey path unchanged.
- RED-proven: removing alias acceptance reds 3/8 (the alias cases); the credential-bleed guard stays green. Restored via reversible patch (no git checkout, per AIR-024).
- Verified LIVE against the real vault .env: BEFORE (exact-match) hasApiKey=false (Setup shown), AFTER (alias-aware) hasApiKey=true (Setup skipped).
- Independent appsec audit: SHIP (allowlist correctly scoped; fail-direction STRICTER not looser; one shared KEY_ALIASES source of truth; no exploitable findings).
Rollback: revert this commit; the exact-match gate returns (and the Setup-every-launch bug with it).
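A minimal sketch of an alias-aware .env check along the lines described in this commit. It is not the project's implementation: the .env parsing rules here are simplified assumptions, the alias table holds only the entry named in the message, and the null-expectedKey path is left out because its behavior is not described.

```typescript
// Only this entry is confirmed by the commit message.
const KEY_ALIASES: Readonly<Record<string, readonly string[]>> = {
  ANTHROPIC_API_KEY: ["ANTHROPIC_TOKEN", "CLAUDE_CODE_OAUTH_TOKEN"],
};

// Returns true when the .env text holds a non-blank value under the expected key
// or one of its aliases. Unrelated *_TOKEN names never match (allowlist only).
function envHasUsableValue(envText: string, expectedKey: string): boolean {
  const own = Object.prototype.hasOwnProperty.call(KEY_ALIASES, expectedKey);
  const aliases = own ? (KEY_ALIASES[expectedKey] ?? []) : [];
  const accepted = new Set<string>([expectedKey, ...aliases]);

  for (const rawLine of envText.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") continue; // blank line or comment
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no key before "="
    const key = line.slice(0, eq).trim();
    if (!accepted.has(key)) continue;

    let value = line.slice(eq + 1).trim();
    const first = value.charAt(0);
    const quoted =
      value.length >= 2 &&
      (first === '"' || first === "'") &&
      value.charAt(value.length - 1) === first;
    if (quoted) value = value.slice(1, -1); // strip one pair of matching quotes

    // Re-trim AFTER quote-stripping so KEY=" " does not count as usable.
    if (value.trim() !== "") return true;
  }
  return false;
}
```

The second trim is the robustness fix: without it, a quoted blank would survive as a one-character value and satisfy the gate.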
…AIR-016)
Greptile follow-up on the security-providers PR: commandWriteSecret / commandDeleteSecret used execFileSync, blocking the Electron main thread up to 5s on the vault-write IPC path, while createVault / sealKeyFileToTpm in the same PR were correctly made async. Closes that consistency gap.
Implementation: rewritten on child_process.spawn (NOT promisify(execFile): async execFile does NOT honor the `input`/stdin option, which would silently break value delivery; verified against a real /bin/sh). A shared runHelper() spawns `/bin/sh -c <command>` and writes the value to child.stdin explicitly.
All security invariants preserved (appsec-reviewed, 8/8 PASS, SAFE TO MERGE):
- value on stdin ONLY: never argv, shell string, or env
- key NAME via HERMES_SECRET_KEY env only; validated /^[A-Za-z_]\w*$/ pre-spawn
- hard timeout SIGKILLs a hung helper; output capped at 1 MiB
- stderr piped + discarded (can echo the value), never inherited
- coarse, secret-free errors (timeout/exit-N/helper-not-found/bad-key)
- single Promise resolution via idempotent settled/finish guard
- EPIPE on early helper exit handled (stdin 'error' listener before write)
Callers: the two async ipcMain.handle handlers now `await` the result and add a defensive `.catch` returning a coarse error (belt-and-suspenders per the review's LOW finding).
Tests: mock rewritten for spawn (EventEmitter child); +2 branch tests (timeout, helper-not-found). 162/162 secrets-suite tests; typecheck + eslint clean. Real-/bin/sh runtime check confirms value reaches stdin and the delete path closes stdin without hanging (the bug the execFile attempt hid).
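A rough sketch of a spawn-based helper with the invariants listed in this commit. It assumes Node.js with its type definitions available. The names runHelper and HERMES_SECRET_KEY and the key-name regex come from the commit message; the signature, the result type, the mapping of exit code 127 to helper-not-found, and the omission of the 1 MiB output cap are assumptions made to keep the example short.

```typescript
import { spawn } from "node:child_process";

type HelperResult = { ok: true } | { ok: false; error: string };

// Key names are validated before anything is spawned.
function isValidKeyName(name: string): boolean {
  return /^[A-Za-z_]\w*$/.test(name);
}

// Assumed signature. `value` is null for a delete (stdin is just closed).
function runHelper(
  command: string,
  keyName: string,
  value: string | null,
  timeoutMs = 5000,
): Promise<HelperResult> {
  if (!isValidKeyName(keyName)) {
    return Promise.resolve({ ok: false, error: "bad-key" });
  }
  return new Promise((resolve) => {
    let settled = false;
    let timer: ReturnType<typeof setTimeout> | undefined;
    // Idempotent guard: the promise resolves exactly once.
    const finish = (result: HelperResult): void => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      resolve(result);
    };

    // Default stdio is pipes. The key NAME travels via env; the value never does.
    const child = spawn("/bin/sh", ["-c", command], {
      env: { ...process.env, HERMES_SECRET_KEY: keyName },
    });
    timer = setTimeout(() => {
      child.kill("SIGKILL"); // hard-kill a hung helper
      finish({ ok: false, error: "timeout" });
    }, timeoutMs);

    child.on("error", () => finish({ ok: false, error: "helper-not-found" }));
    child.on("close", (code) => {
      if (code === 0) finish({ ok: true });
      // Assumption: sh reports "command not found" as exit status 127.
      else if (code === 127) finish({ ok: false, error: "helper-not-found" });
      else finish({ ok: false, error: `exit-${code}` });
    });

    child.stdout.resume(); // drain and discard (no output cap in this sketch)
    child.stderr.resume(); // drain and discard: stderr could echo the value
    child.stdin.on("error", () => undefined); // swallow EPIPE on early helper exit
    if (value !== null) child.stdin.write(value); // value on stdin only
    child.stdin.end();
  });
}
```

An IPC handler would `await runHelper(...)` and return only the coarse `error` string to the renderer.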
…IR-026 / Greptile P1)
The install gate fell through to a broad /(_API_KEY|_TOKEN)$/ scan when the catalogued provider's expected key was not resolved directly, so a vault holding only an unrelated token (GITHUB_TOKEN, SLACK_BOT_TOKEN) and no LLM key falsely cleared the gate and showed chat instead of routing the user back through Setup.
Fix: extract vaultResolvedHasKey(resolved, expectedKey). When expectedKey is known (catalogued provider), accept ONLY that key or one of its accepted aliases via the shared aliasesForEnvKey() / KEY_ALIASES single source of truth. The broad fallback now fires only when expectedKey is null (uncatalogued provider, so no canonical name to match). Mirrors config-health.ts resolvedHasKey(), closing a sibling-asymmetry. Fail-closed: a resolver error leaves hasApiKey false (routes to Setup).
Member of the credential-name-alias-across-gates class (AIR-026), same class as de949a7.
Tests: tests/installer-vault-gate.test.ts (8). The bug-repro reds without the fix (pre-fix broad scan returned true for {GITHUB_TOKEN}); covers exact key, real aliases from the live map, blank/whitespace/non-string values, and the expectedKey-null boundary.
Typecheck clean (node+web); semgrep TS clean on installer.ts. AppSec verdict: SHIP.
Diagrams: docs/diagrams/install-gate-vault-alias-diagrams.md.
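The gate logic described here might be sketched as below. The function name, the regex, and the null-expectedKey rule come from the commit message; the shape of `resolved` (a plain key/value record) and the one-entry alias table are assumptions.

```typescript
// Only this entry is confirmed by the PR text.
const KEY_ALIASES: Readonly<Record<string, readonly string[]>> = {
  ANTHROPIC_API_KEY: ["ANTHROPIC_TOKEN", "CLAUDE_CODE_OAUTH_TOKEN"],
};

function aliasesForEnvKey(envKey: string): readonly string[] {
  const own = Object.prototype.hasOwnProperty.call(KEY_ALIASES, envKey);
  return own ? (KEY_ALIASES[envKey] ?? []) : [];
}

// A value counts only if it is a non-blank string.
function isUsable(value: unknown): boolean {
  return typeof value === "string" && value.trim() !== "";
}

// Assumed shape: `resolved` maps credential names to whatever the vault returned.
function vaultResolvedHasKey(
  resolved: Record<string, unknown>,
  expectedKey: string | null,
): boolean {
  const has = (k: string): boolean =>
    Object.prototype.hasOwnProperty.call(resolved, k) && isUsable(resolved[k]);

  if (expectedKey !== null) {
    // Catalogued provider: accept ONLY the expected key or one of its aliases.
    return [expectedKey, ...aliasesForEnvKey(expectedKey)].some(has);
  }
  // Uncatalogued provider: no canonical name, so the broad scan is the fallback.
  return Object.keys(resolved).some(
    (k) => /(_API_KEY|_TOKEN)$/.test(k) && isUsable(resolved[k]),
  );
}
```

With this ordering, a vault holding only `GITHUB_TOKEN` no longer clears the gate for a catalogued provider, while an uncatalogued provider keeps the previous broad behavior.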
- Replace stale FAKE_VAULT bare variable with mocks.fakeVault - Fix ReturnType<typeof getConnectionConfig> → typeof mockedGetConnectionConfig (getConnectionConfig not in scope as a named import in the vi.hoisted harness)
Rebased onto current main (31 commits behind → 0). All conflicts resolved: Conflicts resolved:
Verification:
…tainer tests/ssh-remote.test.ts wrote its fake `hermes` shim into $HOME/bin and made it reachable only by prepending that dir to PATH. The command under test (buildRemoteHermesCmd) runs under `bash -lc` — a LOGIN shell that re-sources /etc/profile and resets PATH — and probes a list of ABSOLUTE venv paths before falling back to `command -v hermes`. In a clean CI container (node:22-bookworm) there is no real `hermes` anywhere and the prepended PATH entry does not survive the login shell, so `command -v hermes` finds nothing and the command exits 1 with "hermes CLI not found" — 4 failing cases. On a dev box the same test passed for the WRONG reason: `command -v hermes` resolved a real host hermes. Fix is test-only: install the shim at $HOME/.local/bin/hermes — a path buildRemoteHermesCmd probes BY ABSOLUTE PATH ([ -x $HOME/.local/bin/hermes ]) before the PATH-dependent fallback. The shim is now hit deterministically, independent of login-shell PATH behavior and of whether a real hermes exists on the host. PATH is still prepended as belt-and-suspenders. No production code changes. Verified GREEN in the real CI image via `forgejo-runner exec` on node:22-bookworm: tests/ssh-remote.test.ts 17/17 pass (was 4 failing). Typecheck clean on the tracked surface.
Security Providers: vault-backed secret management for the desktop
This PR adds a Security Providers capability to the desktop app: choose and test a
secrets provider (KeePassXC/command, etc.), have vault-resolved credentials feed the
gateway, and surface clear config-health diagnostics, all without writing secrets to
plaintext `.env`. It then hardens that feature against a series of real, live-data-found
correctness and credential-routing bugs (the AIR-017→023 series below).
What it adds
resolves (labeled "Vault Provided"), and Refresh-from-vault.
KeePassXC vault-create flow (picker + prerequisites, not a dead-end).
credentials and never blocks Send on a credential the gateway can actually resolve.
a multi-second crypto operation.
Correctness / credential-routing hardening (found by live-data testing)
Each of these was found by exercising the feature against a real vault + real OAuth
token, not mocks — and each is fixed with a RED-proven regression test:
`secretsProviderStatus` listed vault keys plus all of `process.env` (130 badges, not 6). Now lists vault-only keys.
token (`CLAUDE_CODE_OAUTH_TOKEN`); added alias detection + a "use vault / enter a key" toggle.
`model.default` (→ `model: ""` → API 400). Now never writes an empty model, and a blank field falls back to a model discovered via the provider's `/v1/models`.
token (e.g. `MATRIX_ACCESS_TOKEN`) into `ANTHROPIC_API_KEY`, a credential bleed. Restricted to known aliases.
one-click "copy into ANTHROPIC_API_KEY" fix that would send the OAuth token as `x-api-key` → 401. A populated accepted alias is now treated as satisfied (no banner, no copy).
ANTHROPIC_API_KEY" for vault OAuth users. Now accepts the alias.
the gateway (`KNOWN_API_KEYS`) had drifted out of sync with the gateway provider plugins' `env_vars`, dropping every OAuth/Bearer name and per-vendor alias (anthropic, gemini, copilot, zai, kimi, dashscope, xai, nvidia, …). Extended to mirror the plugins, with a drift-guard test.
Tests
Full suite green except 3 pre-existing `reconcile-streamed-with-db` failures that are
unrelated to this work (present on the base). Each AIR fix ships a behavior-contract test
(not a snapshot), and the credential-name set has a drift-guard so it can't silently fall
out of sync with the provider plugins again.
Notes for reviewers
`buildGatewayEnv` path already overlaid the full provider secret set; the AIR-023 fix brings the CLI-fallback path to parity.
across all five gates (install, config-health warn, config-health mismatch, Setup detect,
chat-readiness) plus the env-forward layer.