fix(tui): credential-isolated env-token auth (ends recurring 401) + reap defunct sessions #141
Merged
…ct sessions

Root cause (PI231 incident): tmux does not forward the parent's env to the pane, so the TUI claude never saw CLAUDE_CODE_OAUTH_TOKEN and fell back to ~/.claude/.credentials.json, whose single-use refresh token got corrupted to an empty string by the per-request spawn + kill-session teardown racing claude's token rotation -> permanent "Please run /login" 401 (re-login re-corrupted on the next spawn).

Connected leak: the pane's claude is a child of the tmux server (not node), so kill-session left <defunct> zombies the server never reaped (25 over 30 days; tmux kill-server dropped it 25->3).

Fix 1: buildTuiCmd now adds CLAUDE_CODE_OAUTH_TOKEN=<shq-escaped> to the pane env prefix when the env is set, so claude authenticates via the long-lived token and never touches the credentials.json refresh path (matching stable hosts). Unset -> no token added (credentials.json-only hosts unaffected).

Fix 2: reapStaleTuiSessions kill-servers after clearing our own sessions ONLY when no foreign tmux session remains (never disrupts a co-hosted olp-tui-*). kill-server is the only node-reachable action that ACTUALLY reaps -- server exit reparents survivors to init, which waitpids them; a per-session kill cannot, since node is not the zombies' parent. Added a 15-min periodic reap (server.mjs) gated on TUI_MODE and on the TUI path being idle. Residual: a request whose pane is created in the idle-check/kill-server window fails cleanly via the existing honesty gates (documented).

ALIGNMENT: Class B (OCP-owned TUI spawn). cli.js does NOT perform either operation -- there is no cli.js analogue for "how the TUI pane authenticates" or "reaping tmux-server-owned zombies"; authorized by ADR 0007 (PR-C amendment) per ALIGNMENT.md's Class B citation requirement. No Class A wire surface, no endpoint shape, no alignment.yml token, and no models.json entry touched.

Tests: +6 in test-features.mjs (buildTuiCmd token set/unset/shq-injection; reaper kill-server ours-only / foreign-present / no-server). 241 passed, 0 failed.

Co-Authored-By: Claude <claude-opus> <noreply@anthropic.com>
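As a rough illustration of Fix 1 and Fix 2 from the commit message above, the sketch below shows one way the two decisions could be written as small pure helpers. This page does not show the real `buildTuiCmd` or `reapStaleTuiSessions` bodies, so everything here is an assumption for illustration: the `shq` implementation (standard POSIX single-quote escaping), the helper names `tokenEnvPrefix` and `shouldKillServer`, and the `ocp-tui-` prefix used to recognise "our own" sessions.

```javascript
// Assumed implementation of a shell-quote helper: wrap in single quotes and
// rewrite each embedded ' as '\'' (close quote, escaped quote, reopen quote).
function shq(value) {
  return "'" + String(value).replace(/'/g, "'\\''") + "'";
}

// Fix 1 (sketch): env prefix placed in front of the pane command.
// Unset or empty token -> empty prefix, so credentials.json-only hosts
// keep exactly the command line they had before.
function tokenEnvPrefix(env) {
  const token = env.CLAUDE_CODE_OAUTH_TOKEN;
  return token ? `CLAUDE_CODE_OAUTH_TOKEN=${shq(token)} ` : "";
}

// Fix 2 (sketch): is `tmux kill-server` safe to run?
// `sessionNames` is assumed to be the list of session names the server
// reports, or null when no tmux server is running.
// `ownPrefix` is a hypothetical naming convention for our sessions.
function shouldKillServer(sessionNames, ownPrefix = "ocp-tui-") {
  if (sessionNames === null) return false; // no server: nothing to reap
  const foreign = sessionNames.filter((name) => !name.startsWith(ownPrefix));
  return foreign.length === 0; // never kill a server that hosts someone else
}

// Usage: a token containing a quote cannot break out of the assignment.
console.log(tokenEnvPrefix({ CLAUDE_CODE_OAUTH_TOKEN: "tok'en" }) + "claude");
console.log(shouldKillServer(["ocp-tui-1", "olp-tui-7"])); // false
console.log(shouldKillServer(["ocp-tui-1"]));              // true
```

Keeping both decisions free of `tmux` calls means the three reaper cases named in the test list (ours-only, foreign-present, no-server) can be checked without a running server.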
…n shadowing)

Passing CLAUDE_CODE_OAUTH_TOKEN to the spawned interactive `claude` (commit 6394ca3) is necessary but INSUFFICIENT to fix the PI231 401: interactive `claude` PREFERS ~/.claude/.credentials.json over the env var (unlike `-p`, where the env token wins), so a stale/corrupt credentials.json SHADOWS the env token.

Decisive live evidence on PI231 (claude 2.1.104):
- env token passed + a broken ~/.claude/.credentials.json present → 401 ("Please run /login · API Error: 401").
- env token passed + credentials.json moved aside → real answer.

Fix: when CLAUDE_CODE_OAUTH_TOKEN is set (and OCP_TUI_HOME is unset), run the TUI `claude` in a CREDENTIAL-FREE scratch home (<HOME>/.ocp-tui/home) that has NO credentials.json: no symlink, no copy. The env token is then the only credential and is authoritative because nothing shadows it. This ALSO ends the original refresh-corruption incident (25-zombie / empty-refresh-token) at the ROOT: with no credentials file, claude never runs the token-refresh path, so the single-use refresh token can never be rotated/corrupted by the per-request spawn+kill cycle.

This RESOLVES (not reintroduces) the ADR 0007 scratch-home concern. The old caveat was about a SYMLINKED credentials.json being forked on token refresh; in env-token mode there is no credentials file to fork and no refresh ever happens.

Mechanism: scratch HOME (not CLAUDE_CONFIG_DIR). The claude binary supports CLAUDE_CONFIG_DIR, but it relocates transcripts to <CONFIG_DIR>/projects/ rather than <HOME>/.claude/projects/, forking the transcript-resolution rule across modes for no benefit. Scratch-HOME reuses the existing, tested prepareTuiHome/ehome plumbing; readTuiTranscript reads from the same home claude runs under, so transcripts land under the scratch home and findTranscriptPath globs them there.

Backward compatible: when CLAUDE_CODE_OAUTH_TOKEN is unset, behaviour is byte-for-byte unchanged (real home + credentials.json) so hosts that intentionally rely on credentials.json are unaffected. Explicit OCP_TUI_HOME still wins.

Onboarding + cwd-trust are seeded in the scratch .claude.json (hasCompletedOnboarding=true + trust ONLY the scratch cwd) so no interactive trust/onboarding dialog can hang the turn.

Changes:
- lib/tui/session.mjs: add resolveTuiHome() (pure) + DEFAULT_TUI_SCRATCH_HOME; prepareTuiHome() gains { envTokenMode }, which skips the credentials symlink and seeds a minimal .claude.json; runTuiTurn derives envTokenMode = token set && ehome!==rhome.
- server.mjs: TUI_HOME computed via resolveTuiHome(); boot log surfaces the auth mode.
- test-features.mjs: env-token credential-free prepareTuiHome test (asserts NO credentials.json created/symlinked, .claude.json seeded with onboarding + cwd trust) + 3 resolveTuiHome decision tests; existing buildTuiCmd-token + reaper + legacy/real-home tests stay green (245 passed, 0 failed).
- docs/adr/0007: PR-D amendment (corrects the PR-C rationale + the original scratch-home caveat); README Troubleshooting #401 + env-var table + TUI section.

ALIGNMENT: Class B (OCP-owned TUI spawn). cli.js has no analogue for the TUI pane's auth/home strategy; authorized by ADR 0007 (PR-D amendment) per ALIGNMENT.md's Class B citation requirement. No Class A wire path, no alignment.yml blacklist token, no models.json touched. server.mjs is touched only to wire TUI_HOME via resolveTuiHome() and surface auth mode in the boot log.

Co-Authored-By: Claude <claude-opus> <noreply@anthropic.com>
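The home-selection rule described in this commit (explicit `OCP_TUI_HOME` wins, then the scratch home when the token is set, else the real home) can be sketched as a pure function. The real `resolveTuiHome()` signature is not shown on this page, so the `{ realHome, env }` parameter shape and the `isEnvTokenMode` helper name are assumptions; only the decision order and the `<HOME>/.ocp-tui/home` default come from the commit message.

```javascript
// Sketch of the decision order stated in the commit message.
// Assumed signature: the real resolveTuiHome() may take different arguments.
function resolveTuiHome({ realHome, env }) {
  // 1. An explicit OCP_TUI_HOME always wins.
  if (env.OCP_TUI_HOME) return env.OCP_TUI_HOME;
  // 2. Token set: use the credential-free scratch home under the real home.
  if (env.CLAUDE_CODE_OAUTH_TOKEN) return `${realHome}/.ocp-tui/home`;
  // 3. Token unset: legacy behaviour, real home + credentials.json.
  return realHome;
}

// Mirrors the stated rule: envTokenMode = token set && ehome !== rhome.
// (Hypothetical helper name; the commit says runTuiTurn derives this inline.)
function isEnvTokenMode({ realHome, effectiveHome, env }) {
  return Boolean(env.CLAUDE_CODE_OAUTH_TOKEN) && effectiveHome !== realHome;
}

// Usage: the three decision cases.
const realHome = "/home/user";
console.log(resolveTuiHome({ realHome, env: {} }));
console.log(resolveTuiHome({ realHome, env: { CLAUDE_CODE_OAUTH_TOKEN: "t" } }));
console.log(
  resolveTuiHome({
    realHome,
    env: { CLAUDE_CODE_OAUTH_TOKEN: "t", OCP_TUI_HOME: "/srv/tui" },
  })
);
```

Because the function only reads its arguments, each branch can be exercised in a unit test by passing a fake `env` object instead of mutating `process.env`.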
dtzp555-max added a commit that referenced this pull request on Jun 13, 2026
…g 401) + defunct-session reaping (#141) (#142)

Bump 3.20.0 → 3.20.1 + CHANGELOG. Ships the already-merged, twice-reviewed #141 (credential-isolated env-token home + zombie reaping). README/docs updated in #141.

Co-authored-by: dtzp555 <dtzp555@gmail.com>
Co-authored-by: Claude <claude-opus> <noreply@anthropic.com>
Summary
Fixes TUI-mode's recurring `Please run /login · API Error: 401` (the PI231 incident), a credential-shadowing + refresh-token-corruption bug, and reaps leaked defunct `claude` sessions.

Root cause (proven live on PI231, claude 2.1.104 Linux): interactive `claude` prefers `~/.claude/.credentials.json` over the `CLAUDE_CODE_OAUTH_TOKEN` env var (unlike `-p` mode, where the env token wins). OCP TUI's per-request spawn + `kill-session` cycle races claude's single-use refresh-token rotation → the refresh token gets corrupted to an empty string → permanent 401; `claude /login` only re-corrupts on the next spawn (the treadmill). macOS hosts read credentials from the Keychain, so this bit Linux/file-based hosts specifically; the Mac mini was immune.

Fixes:

1. `buildTuiCmd` passes `CLAUDE_CODE_OAUTH_TOKEN` to the spawned claude (necessary but, alone, insufficient; see #2).
2. Credential-free scratch home (`<realHome>/.ocp-tui/home`, overridable by `OCP_TUI_HOME`) seeded with onboarding + cwd-trust but no `.credentials.json`, so the env token is the only credential. claude never runs the refresh path → never corrupts it. Recurrence-proof: a future `claude login` can no longer break TUI.
3. `reapStaleTuiSessions` issues `tmux kill-server` only when no foreign session remains + a 15-min idle-gated periodic reap.

When `CLAUDE_CODE_OAUTH_TOKEN` is unset → byte-for-byte the prior real-home + credentials.json behaviour (backward compatible).

ALIGNMENT

Class B (OCP-owned TUI spawn/home strategy; `cli.js` has no analogue). ADR 0007 PR-D amendment. No Class A wire path / `.github/workflows/` / `models.json` touched; CI blacklist unaffected.

Verification

- `npm test` → 245 passed, 0 failed.
- `6394ca3` (token passing + reaping); `52ddb99` (credential isolation). The second binary-verified that the seeded `hasCompletedOnboarding` + `projects[cwd].hasTrustDialogAccepted` flags exactly match claude's load-bearing gate keys, so no trust/onboarding dialog can hang a turn.
- With the broken `credentials.json` (empty refresh token) RESTORED in the real home, a TUI turn returns a real answer via the credential-free scratch home; the real-home credentials are left untouched.

🤖 Generated with Claude Code
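To make the seeded gate keys from the Verification notes concrete, here is a minimal sketch of the `.claude.json` content for the scratch home. Only the two key names (`hasCompletedOnboarding` and `projects[cwd].hasTrustDialogAccepted`) come from this PR; the `buildSeedConfig` helper name and the example cwd are hypothetical, and the file that the real `prepareTuiHome()` writes may carry additional fields.

```javascript
// Sketch: minimal seed for <scratchHome>/.claude.json.
// Trust is granted ONLY to the scratch cwd, as the PR describes.
function buildSeedConfig(scratchCwd) {
  return {
    hasCompletedOnboarding: true,
    projects: {
      [scratchCwd]: { hasTrustDialogAccepted: true },
    },
  };
}

// Usage: serialise the seed; writing it to disk is left to the caller.
const seed = buildSeedConfig("/home/user/.ocp-tui/home/work"); // example path
console.log(JSON.stringify(seed, null, 2));
```

Note that nothing credential-related appears in the seed: in env-token mode the scratch home is meant to contain no `.credentials.json` at all, so the env token stays the only credential.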