Skip to content

Extract self-sudo skip into shared resolve_claude_user() utility#205

Merged
dcellison merged 3 commits intomainfrom
fix/self-sudo-triage-review
Mar 27, 2026
Merged

Extract self-sudo skip into shared resolve_claude_user() utility#205
dcellison merged 3 commits intomainfrom
fix/self-sudo-triage-review

Conversation

@dcellison
Copy link
Copy Markdown
Owner

Summary

  • Extract the self-sudo check from claude.py into config.resolve_claude_user() shared utility
  • Apply the check in all three subprocess spawn sites: claude.py, triage.py, review.py
  • Both spawn paths (sudo -u prefix) and cleanup paths (killpg vs proc.kill) use the resolved value
  • Remove import pwd from claude.py (only used for the check that moved to config.py)
  • Existing tests updated to patch kai.config.pwd.getpwuid instead of kai.claude.pwd.getpwuid

Test plan

  • 5 new tests for TestResolveClaudeUser in test_config.py (None, empty, self-sudo, different user, KeyError)
  • 2 new tests in test_triage.py (self-sudo skip, timeout kills directly)
  • 2 new tests in test_review.py (self-sudo skip, timeout kills directly)
  • 4 existing tests updated to patch pwd in config.py instead of claude.py
  • Full suite: 1212 tests passing

Fixes #194

PR #192 added a self-sudo check to claude.py but missed the identical
sudo pattern in triage.py and review.py. In single-user deployments
where claude_user matches the bot process user, triage and review
subprocesses attempted self-sudo and silently failed.

Move the check into config.resolve_claude_user() and apply it in all
three locations. Both the spawn path (sudo -u prefix) and cleanup path
(killpg vs proc.kill) use the resolved value so they stay in sync.

Fixes #194
@dcellison
Copy link
Copy Markdown
Owner Author

Review by Kai

This is a clean refactor with good test coverage. A few findings:


Warning — inconsistent patch target in new timeout tests

tests/test_review.py (new test_self_sudo_timeout_kills_directly) and tests/test_triage.py (new test_self_sudo_timeout_kills_directly) both patch "os.killpg" directly:

patch("os.killpg") as mock_killpg,

The existing tests in the same files use the module-local path:

  • patch("kai.review.os.killpg")test_timeout_with_claude_user_kills_group
  • patch("kai.triage.os.killpg")test_timeout_with_claude_user_kills_group

Both approaches work (same module object), but "os.killpg" patches the attribute globally for the duration of the test rather than scoping it to the module under test. Inconsistent with existing convention in the same files; if a later test ever patches os.killpg for a different reason, interaction could be surprising.


Suggestion — log placement in config.py

log = logging.getLogger(__name__) is defined in the middle of config.py, inside the new "Self-sudo resolution" section, rather than near the top of the file with other module-level setup. Conventional placement would be immediately after the imports. If config.py already had a top-level log definition (not visible in the diff), this creates a redundant reassignment.


Suggestion — timeout trigger mechanism differs between old and new tests

The new test_self_sudo_timeout_kills_directly tests trigger the timeout via:

mock_proc.communicate = AsyncMock(side_effect=TimeoutError)

The existing parallel tests use patch("kai.*.asyncio.wait_for", side_effect=TimeoutError()) to simulate the timeout. Both work on Python 3.11+ (where asyncio.TimeoutError is TimeoutError), but the new approach is slightly more fragile and inconsistent with the existing pattern in the same files.


Core logic is correct. resolve_claude_user() handles all four cases cleanly (None/empty, self-sudo, different user, KeyError). The effective_user propagation to both start_new_session and killpg in review.py and triage.py is the key correctness fix — those two paths previously had no self-sudo protection at all.

@dcellison
Copy link
Copy Markdown
Owner Author

Review by Kai

The two actionable findings from the prior review have been addressed in this revision:

Prior issues resolved:

  • Patch targets in new timeout tests now use module-local paths (kai.review.os.killpg, kai.triage.os.killpg) — consistent with convention.
  • Timeout trigger mechanism now uses patch("kai.*.asyncio.wait_for", side_effect=TimeoutError()) in both new tests — consistent with existing tests in the same files.

New findings:

Suggestion — test_different_user_returns_unchanged calls real pwd.getpwuid

tests/test_config.py, new TestResolveClaudeUser:

def test_different_user_returns_unchanged(self):
    assert resolve_claude_user("some_nonexistent_user_xyz") == "some_nonexistent_user_xyz"

This calls the real pwd.getpwuid(os.getuid()) inside resolve_claude_user without patching. It works as long as no test machine has a user literally named some_nonexistent_user_xyz, which is safe in practice. But unlike the other four tests in the same class, this one isn't isolated. The adjacent test_unknown_uid_returns_unchanged shows how to do this cleanly with monkeypatch; applying the same treatment here would make the suite fully hermetic.


Everything else looks correct. The log.warning call in resolve_claude_user uses the log defined in config.py (a prior-review comment on its placement — not re-raising). The effective_user propagation to both start_new_session and the killpg/proc.kill branch in review.py and triage.py is the key correctness fix and it's applied symmetrically in both files. The KeyError fallback behavior (pass-through to sudo) is correct and tested.

@dcellison
Copy link
Copy Markdown
Owner Author

Review by Kai

The prior review's open suggestion — test_different_user_returns_unchanged calling real pwd.getpwuid without patching — has been addressed: it now uses monkeypatch with MagicMock(pw_name="localuser").

New findings in this revision:


Suggestion — test_self_sudo_skipped / test_self_sudo_timeout_kills_directly use real pwd.getpwuid

tests/test_review.py lines ~621–660, tests/test_triage.py lines ~374–413:

try:
    current_user = pwd.getpwuid(os.getuid()).pw_name
except KeyError:
    pytest.skip("UID has no passwd entry")

Neither new integration test patches pwd.getpwuid inside resolve_claude_user. They rely on the real system call returning the same value both times (setup + inside the function), which is true in practice. The skip guard handles containers. This is consistent with test_self_sudo_returns_none in TestResolveClaudeUser and is intentional — these tests are validating the end-to-end self-sudo detection path with the real user. Acceptable as-is, just non-hermetic like their unit-level counterpart.


Everything else is correct. The existing tests that use claude_user="kai" now all patch kai.config.pwd.getpwuid to return pw_name="notkai", making them machine-independent. The effective_user propagation to start_new_session and the killpg/proc.kill branch is applied symmetrically in both review.py and triage.py. Core logic, error handling, and the KeyError fallback are all sound.

@dcellison dcellison merged commit fa0e25d into main Mar 27, 2026
1 check passed
@dcellison dcellison deleted the fix/self-sudo-triage-review branch March 27, 2026 18:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Apply self-sudo skip to triage and review subprocess spawning

2 participants