Watcher lifecycle email fails with 'Exec format error: clauded' on startup #39

@brando90

Symptom

When the job watcher daemon starts on skampere1 (smart mode with clauded as the agent), it fails to dispatch the startup lifecycle email:

2026-04-16 12:26:25 [skampere1.stanford.edu] INFO Smart-job agent: clauded (/afs/cs.stanford.edu/u/brando9/bin/clauded)
2026-04-16 12:26:25 [skampere1.stanford.edu] WARNING Failed to dispatch lifecycle email: [Errno 8] Exec format error: 'clauded'

The watcher itself keeps running after the warning; only the email step fails.

Diagnosis

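Errno 8 is ENOEXEC ("Exec format error"). The kernel returns it when a file is exec'd directly but is neither a binary format it recognizes nor a script with a valid #! line: typically a shell script whose shebang is missing or malformed, or a binary built for a different architecture. (A shebang that points to a non-existent interpreter fails with ENOENT instead.) A shell falls back to interpreting such a script itself, which would explain why clauded can work from an interactive shell and still fail when exec'd directly. Given the scheduler is Python, likely one of: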

  • A subprocess call runs 'clauded' with shell=False, but the file only works when a shell interprets it (for example, a wrapper script with no #! line).
  • The email-dispatch path pipes the message through clauded as a transport and hits the same exec failure.
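To check the ENOEXEC theory without touching the real watcher, here is a minimal sketch that writes a throwaway script and execs it directly with shell=False. The helper name exec_errno and the file name fake_agent are hypothetical stand-ins, not anything from the scheduler code; the real clauded may fail for a different reason.

```python
import os
import stat
import subprocess
import tempfile


def exec_errno(script_body: str) -> int:
    """Write script_body to an executable temp file, exec it directly,
    and return the errno raised by the exec (0 if the exec succeeded)."""
    with tempfile.TemporaryDirectory() as tmp:
        # "fake_agent" is a hypothetical stand-in for the clauded wrapper.
        path = os.path.join(tmp, "fake_agent")
        with open(path, "w") as fh:
            fh.write(script_body)
        # Add the user execute bit so the failure is about format, not permissions.
        os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
        try:
            # shell=False means the kernel must be able to exec the file itself.
            subprocess.run([path], shell=False, check=False,
                           stdout=subprocess.DEVNULL)
        except OSError as exc:
            return exc.errno
        return 0


if __name__ == "__main__":
    print("no shebang:      ", exec_errno("echo hi\n"))
    print("valid shebang:   ", exec_errno("#!/bin/sh\necho hi\n"))
    print("bad interpreter: ", exec_errno("#!/nonexistent/interp\necho hi\n"))
```

On Linux I would expect the first case to report ENOEXEC (8), the second 0, and the third ENOENT (2). If `head -1` on the real clauded file shows no #! line, that matches the first case.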

The name "lifecycle email" suggests the watcher is trying to use clauded to send the email via the coding-agent path rather than via SMTP/sendmail directly. If so, that's an odd coupling — consider separating transport (SMTP / mail / sendmail) from agent invocation.

Repro

Start the watcher on a SNAP node that has clauded in $PATH. Check the first few lines of ~/dfs/job_queue/logs/watcher_daemon.log for the WARNING.
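For a scripted check of the repro, a small sketch like the following could scan the top of the log for the warning. The function name find_dispatch_warnings and the 50-line cutoff are assumptions for illustration; only the log path and the warning text come from this issue.

```python
from __future__ import annotations

from pathlib import Path

# Warning text copied from the log excerpt in this issue.
NEEDLE = "Failed to dispatch lifecycle email"


def find_dispatch_warnings(log_text: str, max_lines: int = 50) -> list[str]:
    """Return WARNING lines about lifecycle email dispatch found in the
    first max_lines lines of the log text."""
    head = log_text.splitlines()[:max_lines]
    return [line for line in head if "WARNING" in line and NEEDLE in line]


if __name__ == "__main__":
    log_path = Path("~/dfs/job_queue/logs/watcher_daemon.log").expanduser()
    if log_path.exists():
        for line in find_dispatch_warnings(log_path.read_text(errors="replace")):
            print(line)
    else:
        print(f"log not found: {log_path}")
```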

Suggested fix

  1. Decouple lifecycle email from the coding-agent binary — use smtplib / system mail for lifecycle notifications.
  2. If the coupling is intentional, make the exec path robust: subprocess.run([executable, *args], shell=False) with the executable resolved via shutil.which("clauded") and a fallback error that names the file being exec'd.
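Both options above can be sketched roughly as follows. This is not the watcher's actual code: the function names (resolve_agent, run_agent, build_lifecycle_email, send_lifecycle_email), the subject format, and the localhost:25 SMTP relay are all assumptions for illustration.

```python
import shutil
import smtplib
import subprocess
from email.message import EmailMessage


def resolve_agent(name: str = "clauded") -> str:
    """Resolve the agent to an absolute path, failing with a clear message."""
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(f"agent {name!r} not found on PATH")
    return path


def run_agent(name: str, *args: str) -> subprocess.CompletedProcess:
    """Option 2: exec the agent directly and name the file if the exec fails."""
    executable = resolve_agent(name)
    try:
        return subprocess.run([executable, *args], shell=False, check=False,
                              capture_output=True, text=True)
    except OSError as exc:
        # Covers ENOEXEC (bad or missing shebang, wrong binary format) and similar.
        raise RuntimeError(
            f"could not exec {executable!r}: {exc.strerror} (errno {exc.errno}); "
            "check its shebang line and binary format"
        ) from exc


def build_lifecycle_email(host: str, event: str,
                          sender: str, recipient: str) -> EmailMessage:
    """Option 1: build the notification without involving the agent at all."""
    msg = EmailMessage()
    msg["Subject"] = f"[watcher] {event} on {host}"  # hypothetical subject format
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(f"Job watcher lifecycle event '{event}' on {host}.")
    return msg


def send_lifecycle_email(msg: EmailMessage, smtp_host: str = "localhost",
                         smtp_port: int = 25) -> None:
    """Send via SMTP; assumes a relay is reachable at smtp_host:smtp_port."""
    with smtplib.SMTP(smtp_host, smtp_port, timeout=10) as smtp:
        smtp.send_message(msg)
```

Keeping message construction separate from sending means the lifecycle path can be tested without a mail server, and a broken agent binary no longer blocks notifications.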

Context

Discovered during SNAP watcher audit on 2026-04-16. See agents-config commit 594a3cb for the broader two-layer job-checking context.
