fix(llmkube_core): drive metal-agent install via ansible become instead of Makefile sudo #11

Merged
Defilan merged 2 commits into main from fix/metal-agent-sudo-without-tty
May 24, 2026

@Defilan Defilan commented May 24, 2026

What

Replace make install-metal-agent in the llmkube_core role with
explicit ansible tasks that use become: true for the privileged
copy and the modern launchctl bootstrap/bootout API for the
launchd reload.

Why

Fixes #10.

The Mac Studio bootstrap run on 2026-05-23 failed at the
metal-agent install step. The root cause is the same TTY-less sudo
class we keep hitting: the LLMKube Makefile's install-metal-agent
target runs sudo cp bin/llmkube-metal-agent /usr/local/bin/...
internally. A sudo invoked from inside brew or make bypasses
ansible's BECOME plumbing, and with no TTY for the password
prompt the cp fails.

We cannot remove the sudo from the upstream Makefile in this PR
(separate concern; a follow-up bug should be filed against
defilantech/LLMKube to switch to /opt/homebrew/bin on Apple
Silicon). What we CAN do is have llmkube-bootstrap drive the
install through ansible's own BECOME channel, which already has
the password from --ask-become-pass.

How

In roles/llmkube_core/tasks/main.yml, the previous single task:

- name: Build + install metal-agent via LLMKube's Makefile
  ansible.builtin.command: { cmd: make install-metal-agent, ... }

becomes six explicit tasks:

| # | Task | become | Purpose |
|---|------|--------|---------|
| 1 | make build-metal-agent | no | Builds the binary only; no sudo. Idempotent via creates: |
| 2 | file: /usr/local/bin | yes | Ensure dir exists (absent on fresh Apple Silicon Mac) |
| 3 | copy: bin/llmkube-metal-agent → /usr/local/bin/... | yes | The cp that previously needed Makefile sudo |
| 4 | file: ~/Library/LaunchAgents | no | User-space LaunchAgent dir |
| 5 | copy: plist | no | Install plist (no sudo needed) |
| 6 | `shell: launchctl bootout … \|\| true; launchctl bootstrap gui/$(id -u) …` | | Reload the launchd unit, gated on binary or plist change |
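
For orientation, the six rows might translate into tasks roughly like the following. This is a sketch, not the merged diff: the `llmkube_src_dir` variable, the task names, the registered result names, and the file modes are assumptions, and `ansible_env.HOME` assumes fact gathering is enabled. The plist path is the one used in the manual unblock steps in this description.

```yaml
# Sketch only: llmkube_src_dir, task names, register names, and modes are assumed.
- name: Build metal-agent binary (no sudo)
  ansible.builtin.command:
    cmd: make build-metal-agent
    chdir: "{{ llmkube_src_dir }}"
    creates: "{{ llmkube_src_dir }}/bin/llmkube-metal-agent"

- name: Ensure /usr/local/bin exists
  ansible.builtin.file:
    path: /usr/local/bin
    state: directory
    mode: "0755"
  become: true

- name: Install metal-agent binary through ansible BECOME
  ansible.builtin.copy:
    src: "{{ llmkube_src_dir }}/bin/llmkube-metal-agent"
    dest: /usr/local/bin/llmkube-metal-agent
    remote_src: true
    mode: "0755"
  become: true
  register: metal_agent_binary

- name: Ensure ~/Library/LaunchAgents exists
  ansible.builtin.file:
    path: "{{ ansible_env.HOME }}/Library/LaunchAgents"
    state: directory
    mode: "0755"

- name: Install launchd plist
  ansible.builtin.copy:
    src: "{{ llmkube_src_dir }}/deployment/macos/com.llmkube.metal-agent.plist"
    dest: "{{ ansible_env.HOME }}/Library/LaunchAgents/com.llmkube.metal-agent.plist"
    remote_src: true
    mode: "0644"
  register: metal_agent_plist

- name: Reload launchd unit when the binary or plist changed
  ansible.builtin.shell: |
    plist="$HOME/Library/LaunchAgents/com.llmkube.metal-agent.plist"
    launchctl bootout "gui/$(id -u)" "$plist" || true
    launchctl bootstrap "gui/$(id -u)" "$plist"
  when: metal_agent_binary is changed or metal_agent_plist is changed
  changed_when: true
```

Only the two tasks that write under /usr/local/bin carry `become: true`; everything in the user's home directory and the per-user launchd domain runs unprivileged.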

The launchctl call moved from the legacy launchctl load (which
the Makefile uses) to the modern bootstrap/bootout API. Modern
macOS treats load/unload as deprecated and bootstrap is the
documented user-LaunchAgent install path. Idempotent because
bootout swallows "service not loaded" errors via || true.
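
The reload pattern described here can be sketched as a small shell function. The function name and its single argument are hypothetical, chosen for illustration; the `launchctl bootout` and `launchctl bootstrap` subcommands and the `gui/<uid>` domain target are the ones named in this PR.

```shell
#!/bin/sh
# Sketch only: reload_launch_agent and its argument are hypothetical names.
# $1: path to the installed LaunchAgent plist.
reload_launch_agent() {
    plist="$1"
    domain="gui/$(id -u)"   # per-user GUI domain of the invoking user

    # bootout exits non-zero when the service is not loaded yet;
    # ignore that so a first-time install does not fail here.
    launchctl bootout "$domain" "$plist" 2>/dev/null || true

    # bootstrap loads the service described by the plist into the domain.
    # Its exit status is the function's exit status.
    launchctl bootstrap "$domain" "$plist"
}
```

A call such as `reload_launch_agent ~/Library/LaunchAgents/com.llmkube.metal-agent.plist` would then behave the same whether or not the agent was already loaded. Note that `|| true` also hides bootout failures other than "service not loaded"; the subsequent bootstrap is what surfaces a real problem.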

The env vars on the build call
(LLMKUBE_METAL_AGENT_{MODEL_STORE,MEMORY_FRACTION,LLAMA_SERVER_PORT})
are preserved for forward-compat. Today they are no-ops because
the Makefile does not template the plist, but if/when it does the
bootstrap continues to provide the values.
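
As an illustration of what "preserved on the build call" could look like, here is a hedged sketch of the build task with an `environment:` block. The three environment variable names come from the brace expansion above; the right-hand role variables (`model_store`, `memory_fraction`, `llama_server_port`) and `llmkube_src_dir` are assumed names, not confirmed by the diff.

```yaml
# Sketch only: role variable names and llmkube_src_dir are assumptions.
- name: Build metal-agent binary (no sudo)
  ansible.builtin.command:
    cmd: make build-metal-agent
    chdir: "{{ llmkube_src_dir }}"
    creates: "{{ llmkube_src_dir }}/bin/llmkube-metal-agent"
  environment:
    # No-ops today; kept so a future Makefile that templates the plist sees them.
    LLMKUBE_METAL_AGENT_MODEL_STORE: "{{ model_store | string }}"
    LLMKUBE_METAL_AGENT_MEMORY_FRACTION: "{{ memory_fraction | string }}"
    LLMKUBE_METAL_AGENT_LLAMA_SERVER_PORT: "{{ llama_server_port | string }}"
```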

Verification

Local:

  • ansible-playbook --syntax-check: clean
  • Diff: 1 file, +58 / -6

CI:

Live (post-merge):

  • git pull on the Studio + re-run ./bootstrap.sh. Expected:
    binary builds, copies into /usr/local/bin under become, plist
    installs, launchctl bootstraps the daemon, role completes.

Manual unblock for the Studio (while CI runs)

cd ~/src/LLMKube
make build-metal-agent
sudo cp bin/llmkube-metal-agent /usr/local/bin/llmkube-metal-agent   # interactive sudo
mkdir -p ~/Library/LaunchAgents
cp deployment/macos/com.llmkube.metal-agent.plist ~/Library/LaunchAgents/
launchctl bootstrap gui/$(id -u) ~/Library/LaunchAgents/com.llmkube.metal-agent.plist
cd ~/llmkube-bootstrap
./bootstrap.sh    # resumes; metal-agent install is now a no-op

Related

Checklist

  • ansible-playbook --syntax-check passes
  • Uses ansible BECOME instead of Makefile sudo
  • Modern launchctl bootstrap/bootout API
  • Conventional commit + DCO sign-off
  • References the issue this closes

…ad of Makefile sudo

The role's previous `make install-metal-agent` call failed under
ansible because the Makefile target runs
`sudo cp bin/llmkube-metal-agent /usr/local/bin/...` internally.
brew/make's sudo bypasses ansible's BECOME plumbing -- there is no
TTY for the password prompt, so the cp fails with 'a terminal is
required to read the password' (#10).

Replace the single make-target task with explicit ansible tasks:

1. `make build-metal-agent`         -- builds the binary, no sudo
2. ensure /usr/local/bin exists      -- become: true (fresh Apple
                                        Silicon Macs lack it)
3. install binary to /usr/local/bin  -- become: true, ansible.builtin.copy
                                        equivalent to the Makefile's
                                        `sudo cp` but going through
                                        ansible BECOME (--ask-become-pass)
4. ensure ~/Library/LaunchAgents      -- no become; user-space
5. install launchd plist              -- no become; user-space
6. reload launchd unit                -- launchctl bootout + bootstrap,
                                        gated on binary or plist change.
                                        Uses modern bootstrap API instead
                                        of legacy 'launchctl load'.

The existing env vars on the build call (model_store, memory_fraction,
llama_server_port) are preserved for forward-compat with any future
Makefile templating; they are no-ops today.

ansible-playbook --syntax-check: clean.

The LLMKube Makefile itself still has the sudo-cp issue for anyone
running 'make install-metal-agent' manually in a non-TTY context;
a separate bug against the LLMKube repo should ask for either an
Apple Silicon /opt/homebrew/bin install path or clear TTY-requirement
docs. Out of scope for this PR.

Fixes #10
@Defilan Defilan added the bug label May 24, 2026
…ml[line-length]

ansible-lint flagged line 121 as 1 char over the 120-char limit
(launchctl bootstrap invocation with the inlined plist path). Hoist
the path into a task-scoped 'vars:' block; both lines fit cleanly.
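
A hedged sketch of the hoisting pattern: the variable name `metal_agent_plist_path` and the task name are hypothetical, and `ansible_env.HOME` assumes gathered facts. The point is only that a task-scoped `vars:` entry keeps each command line short.

```yaml
# Sketch only: metal_agent_plist_path and the task name are assumed.
- name: Reload metal-agent LaunchAgent
  vars:
    metal_agent_plist_path: "{{ ansible_env.HOME }}/Library/LaunchAgents/com.llmkube.metal-agent.plist"
  ansible.builtin.shell: |
    launchctl bootout gui/$(id -u) "{{ metal_agent_plist_path }}" || true
    launchctl bootstrap gui/$(id -u) "{{ metal_agent_plist_path }}"
  changed_when: true
```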
@Defilan Defilan merged commit af77764 into main May 24, 2026
5 checks passed
@Defilan Defilan deleted the fix/metal-agent-sudo-without-tty branch May 24, 2026 03:43

Development

Successfully merging this pull request may close these issues.

[BUG] llmkube_core: 'make install-metal-agent' fails with sudo TTY error during ansible run
