Skip to content

Skip sudo wrapper when os_user matches bot process user#192

Merged
dcellison merged 6 commits intomainfrom
fix/self-sudo-skip
Mar 26, 2026
Merged

Skip sudo wrapper when os_user matches bot process user#192
dcellison merged 6 commits intomainfrom
fix/self-sudo-skip

Conversation

@dcellison
Copy link
Copy Markdown
Owner

Summary

When os_user (from users.yaml) or CLAUDE_USER matches the user the bot process is already running as, skip the sudo -u wrapper instead of crashing with "Claude process EOF."

The problem

_ensure_started() unconditionally wraps the Claude command with sudo -u <user> when claude_user is set. Self-sudo (sudo -u kai when already running as kai) fails because sudoers disallows it, and the subprocess dies immediately.

The fix

Compare self.claude_user against pwd.getpwuid(os.getuid()).pw_name at spawn time. If they match, treat it as if claude_user were None (no sudo, no process group, no PGID). Log a warning so the misconfiguration is visible.

Changes

Change Detail
import pwd Added to claude.py
effective_claude_user local Avoids mutating self.claude_user
Self-sudo detection pwd.getpwuid(os.getuid()).pw_name comparison
Warning log "os_user matches the bot process user; skipping sudo"
Downstream refs start_new_session, _pgid, log message use effective_claude_user

Test plan

  • test_self_sudo_skipped - same user skips sudo, no process group, warning logged
  • test_different_user_still_uses_sudo - different user still gets sudo wrapper
  • Existing command construction tests pass
  • All 1175 tests pass, make check clean

Fixes #191

When claude_user equals the current process user, self-sudo fails
because sudoers disallows it. Detect this at spawn time using
pwd.getpwuid(os.getuid()).pw_name and skip the sudo wrapper,
logging a warning about the misconfiguration.

The local effective_claude_user variable avoids mutating the
instance attribute, which may be read elsewhere.

Fixes #191
@dcellison
Copy link
Copy Markdown
Owner Author

Review by Kai

PR Review: Skip sudo wrapper when os_user matches bot process user

The logic is sound and solves a real problem cleanly. Three findings:


WARNING — Unhandled KeyError from pwd.getpwuid (claude.py, ~line 217)

current_user = pwd.getpwuid(os.getuid()).pw_name

pwd.getpwuid() raises KeyError when the UID has no /etc/passwd entry — common in containers with --user <uid> that doesn't map to a named account, or rootless Podman. This would crash _ensure_started() in exactly the environments where you'd expect a bot process to run. The safe fallback is to treat it as "not the same user" and proceed with sudo:

try:
    current_user = pwd.getpwuid(os.getuid()).pw_name
except KeyError:
    current_user = None

SUGGESTION — import pwd inside test method body (test_claude.py, line 175)

async def test_self_sudo_skipped(self, caplog):
    """..."""
    import pwd

import os was correctly added at the top of the file (line 17), but import pwd was left inline. Move it to the module-level imports for consistency.


SUGGESTION — test_different_user_still_uses_sudo doesn't isolate pwd lookup (test_claude.py, ~line 200)

The test passes claude_user="some_other_user" and relies on the real pwd.getpwuid(os.getuid()) call not returning "some_other_user" — which is almost certainly true, but it's an implicit environmental assumption. Patching pwd.getpwuid to return a fixed value would make the test hermetic and document the intent clearly:

with patch("kai.claude.pwd.getpwuid", return_value=MagicMock(pw_name="kai")):
    await claude._ensure_started()

Overall: Clean fix, good use of effective_claude_user to avoid mutating self.claude_user, and start_new_session/_pgid are updated consistently. The KeyError edge case is the only one worth fixing before merge.

…eticity

- Wrap pwd.getpwuid() in try/except KeyError for container environments
  where the UID has no /etc/passwd entry (falls through to sudo path)
- Move import pwd to module level in test_claude.py
- Patch pwd.getpwuid in test_different_user_still_uses_sudo for hermeticity
- Add test_pwd_keyerror_falls_through_to_sudo for the new fallback
@dcellison
Copy link
Copy Markdown
Owner Author

Review by Kai

All three issues from the prior review have been addressed: KeyError is caught in claude.py, import pwd is at module level in the test file, and test_different_user_still_uses_sudo now patches kai.claude.pwd.getpwuid.

One new finding:


SUGGESTION — test_self_sudo_skipped will throw KeyError in unmapped-UID containers (test_claude.py, line 176)

current_user = pwd.getpwuid(os.getuid()).pw_name

This is the test setup line, not production code. In the same container environments the KeyError guard in claude.py was added to protect against, this test will itself fail with an unhandled KeyError before any assertions run. The production code degrades gracefully; the test does not.

A pytest.skip is the cleanest fix:

try:
    current_user = pwd.getpwuid(os.getuid()).pw_name
except KeyError:
    pytest.skip("UID has no passwd entry; self-sudo-skip path not reachable")

Everything else looks clean. The effective_claude_user local avoids mutating self.claude_user, all three start_new_session/_pgid/log references are updated consistently, and test_pwd_keyerror_falls_through_to_sudo is a good addition that directly exercises the new guard.

@dcellison
Copy link
Copy Markdown
Owner Author

Review by Kai

PR Review: Skip sudo wrapper when os_user matches bot process user

All three findings from the first review and the single finding from the second review have been addressed. The pytest.skip guard is in place in test_self_sudo_skipped (lines 176–178), import pwd is at module level, test_different_user_still_uses_sudo patches kai.claude.pwd.getpwuid, and test_pwd_keyerror_falls_through_to_sudo directly exercises the guard.

One new finding:


SUGGESTION — Warning message says "os_user" but applies to CLAUDE_USER too (claude.py, line ~225)

log.warning(
    "os_user %r matches the bot process user; skipping sudo (no isolation)",
    effective_claude_user,
)

self.claude_user can be set via either users.yaml's os_user field or the CLAUDE_USER environment variable. When it comes from the env var, the label "os_user" in the log line is misleading. "claude_user" (or "os_user / CLAUDE_USER") would be accurate in both cases.


Overall: Clean and correct. effective_claude_user avoids mutating self.claude_user, all three downstream references (start_new_session, _pgid, log message) are updated consistently, and the test suite covers the self-sudo, different-user, and unmapped-UID paths. The warning label is the only thing left to tighten.

@dcellison
Copy link
Copy Markdown
Owner Author

Review by Kai

PR Review: Skip sudo wrapper when os_user matches bot process user

All findings from the three prior review rounds have been addressed in this diff:

  • KeyError guard in claude.py — present and correct
  • import pwd at module level in test file — done
  • test_different_user_still_uses_sudo patches kai.claude.pwd.getpwuid — done
  • pytest.skip guard in test_self_sudo_skipped — done
  • Warning label changed from "os_user" to "claude_user" — done

No new findings. The implementation is clean:

  • effective_claude_user correctly avoids mutating self.claude_user
  • All three downstream references (start_new_session, _pgid, log line) use effective_claude_user consistently
  • test_pwd_keyerror_falls_through_to_sudo directly exercises the KeyError guard
  • The is False check in test_self_sudo_skipped (line ~207) is fine — False is a singleton in CPython and the kwarg is always explicitly passed

Ready to merge.

@dcellison
Copy link
Copy Markdown
Owner Author

Review by Kai

PR Review: Skip sudo wrapper when os_user matches bot process user

All findings from the four prior review rounds have been addressed in this diff. A fresh pass finds nothing new.

Implementation (claude.py):

  • effective_claude_user correctly avoids mutating self.claude_user, so repeated _ensure_started() calls (e.g. on process restart) re-evaluate the check rather than inheriting stale state
  • KeyError guard is correct; current_user = None falls through to the sudo path as intended
  • Warning label is "claude_user" — accurate for both users.yaml and CLAUDE_USER sources
  • All three downstream references (cmd, start_new_session, _pgid) use effective_claude_user consistently

Tests (test_claude.py):

  • import pwd is at module level
  • pytest.skip guard in test_self_sudo_skipped protects against unmapped-UID containers at test time
  • test_different_user_still_uses_sudo patches kai.claude.pwd.getpwuid — hermetic
  • test_pwd_keyerror_falls_through_to_sudo directly exercises the KeyError guard
  • assert args[1].get("start_new_session") is False is fine — bool(None) is the False singleton

No new findings. Ready to merge.

@dcellison dcellison merged commit bdd9e54 into main Mar 26, 2026
1 check passed
@dcellison dcellison deleted the fix/self-sudo-skip branch March 26, 2026 23:35
dcellison pushed a commit that referenced this pull request Mar 27, 2026
PR #192 added a self-sudo check to claude.py but missed the identical
sudo pattern in triage.py and review.py. In single-user deployments
where claude_user matches the bot process user, triage and review
subprocesses attempted self-sudo and silently failed.

Move the check into config.resolve_claude_user() and apply it in all
three locations. Both the spawn path (sudo -u prefix) and cleanup path
(killpg vs proc.kill) use the resolved value so they stay in sync.

Fixes #194
dcellison added a commit that referenced this pull request Mar 27, 2026
* Extract self-sudo skip into shared resolve_claude_user() utility

PR #192 added a self-sudo check to claude.py but missed the identical
sudo pattern in triage.py and review.py. In single-user deployments
where claude_user matches the bot process user, triage and review
subprocesses attempted self-sudo and silently failed.

Move the check into config.resolve_claude_user() and apply it in all
three locations. Both the spawn path (sudo -u prefix) and cleanup path
(killpg vs proc.kill) use the resolved value so they stay in sync.

Fixes #194

* Fix review: remove duplicate log, use module-scoped patches, match timeout pattern

* Make test_different_user_returns_unchanged hermetic with monkeypatch
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Skip sudo wrapper when os_user matches the bot process user

2 participants