endy

A tmux control plane that hands a coding task from one CLI agent to another when the first one exhausts its tier's quota.

endy handoff: opencode runs out, cmd picks up mid-task

Recording script: docs/demo.md.

Why

I kept hitting my paid agent's weekly cap on a Thursday afternoon, with a task half done in a tmux window I couldn't extend. The other CLIs I had installed — Gemini, OpenCode, CommandCode, Hermes — were idle, on free tiers, perfectly capable of continuing the work. They just didn't know about each other.

endy is the layer that makes them know.

What it does

One command:

endy handoff <task-id> --to <next-agent>

reads the original prompt, tails the previous agent's output, opens a new tmux window with a different CLI, and tells it:

Here is what was being done. Here is the full output of what your predecessor wrote. The previous agent stopped because of <reason>. Continue.

The new agent picks up. The chain is recorded in the task's meta file (handoff_from=…, handoff_chain=…), so taskA(opencode) → taskB(cmd) → taskC(hermes) is fully traceable. Same .logs/ directory, same web dashboard, same endy watch family of commands.
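To make the traceability claim concrete, here is a minimal sketch of how a frontend might rebuild a chain by following `handoff_from` links. It assumes the meta file is plain `key=value` lines and that `handoff_from` holds the parent task id; the `agent` key and the `parse_meta` / `trace_chain` helpers are hypothetical, not part of endy.

```python
def parse_meta(text):
    """Parse key=value lines into a dict (assumed meta format)."""
    meta = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip blank lines and lines without '='
            meta[key.strip()] = value.strip()
    return meta


def trace_chain(task_id, metas):
    """Walk handoff_from links back to the root; return ids oldest first."""
    chain = []
    seen = set()
    current = task_id
    while current and current not in seen:  # 'seen' guards against cycles
        seen.add(current)
        chain.append(current)
        current = metas.get(current, {}).get("handoff_from")
    return list(reversed(chain))


# Hypothetical meta contents for three chained tasks.
metas = {
    "taskA": parse_meta("agent=opencode\n"),
    "taskB": parse_meta("agent=cmd\nhandoff_from=taskA\n"),
    "taskC": parse_meta("agent=hermes\nhandoff_from=taskB\n"),
}
print(" -> ".join(trace_chain("taskC", metas)))  # taskA -> taskB -> taskC
```

In practice `endy watch tree` already renders this chain; the sketch only shows that the recorded links are enough to reconstruct it.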

If you set ENDY_HANDOFF_RESOLVER to a script that prints an agent name (for example a wrapper around multiplexor), the --to flag becomes optional and routing happens automatically when one tier runs dry.
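A resolver only has to print one agent name. The sketch below is one possible shape, assuming a fixed preference order and using "the CLI is on PATH" as the only eligibility check. The README does not say what arguments or environment endy passes to the resolver, so the `exclude` parameter is left empty when run as a script; `PREFERENCE` and `pick_agent` are hypothetical names.

```python
#!/usr/bin/env python3
"""Hypothetical ENDY_HANDOFF_RESOLVER script: print one agent name."""
import shutil

# Assumed preference order; the names match the CLIs listed in this README.
PREFERENCE = ["opencode", "cmd", "gemini", "hermes"]


def pick_agent(preference, exclude=(), is_installed=None):
    """Return the first preferred agent that is installed and not excluded."""
    if is_installed is None:
        # Default check: the CLI binary can be found on PATH.
        is_installed = lambda name: shutil.which(name) is not None
    for name in preference:
        if name not in exclude and is_installed(name):
            return name
    return None


if __name__ == "__main__":
    agent = pick_agent(PREFERENCE)
    if agent:
        print(agent)  # the resolver contract: print an agent name
```

Saved as an executable file and exported as `ENDY_HANDOFF_RESOLVER`, a script like this would stand in for `--to`. A real resolver such as multiplexor also tracks which tiers are exhausted, which this sketch does not.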

The stack

| Layer | Agent | Tier | Notes |
| --- | --- | --- | --- |
| Orchestrator | codex | paid | Long context, good at planning. You pay only for the conductor. |
| Worker | opencode | free (multiple backends) | Default for refactors, tests, fast edits |
| Worker | cmd (CommandCode / Kimi K2.6) | ~€1 buys a lot of work | Strong taste reviewer; cheapest paid option |
| Worker | hermes (Nous Research) | depends on backend (configurable per provider, incl. Copilot) | Tool-heavy agentic work |
| Worker | gemini (Google Gemini CLI) | free daily quota | Wide reach |
| Smoke testing | bash (offline stub) | free | Spawns a no-op window so you can rehearse handoff chains without burning real-agent credits |

You only install the ones you want. endy doctor shows what is wired up and authenticated.

Local models (Ollama, LM Studio, …)

endy does not spawn a local model directly — endy spawn ollama is deliberately not wired. Local inference is better treated as a backend behind an existing agent than as a peer-level CLI:

  • Via hermes. Hermes supports user-defined providers in ~/.hermes/config.yaml pointing at any OpenAI-compatible endpoint, including ollama's http://localhost:11434/v1. Add the provider once, then endy spawn hermes --model "ollama/llama3.2" (or whichever model you've pulled) routes through hermes to your local ollama.
  • Via /model inside a CLI that supports it. cmd, codex, and others expose local providers in their /model picker (codex has --oss --local-provider ollama / --local-provider lmstudio; cmd has an Ollama provider you can pick interactively). When the picker opens, browse to the local provider, pick a pulled model, and the CLI sends to your local daemon for that session.

multiplexor also knows ollama as a local fallback at the routing layer — multiplexor delegate "task" may pick it when free tiers are exhausted — but the multiplexor-next-provider resolver used by endy handoff does NOT return ollama (endy can't drive it headlessly as a peer agent). The local-model story lives inside hermes and the /model slash command, not in endy's spawn surface.

Quickstart

Prereqs (macOS or Linux): tmux, python3, and at least one of codex / opencode / cmd / hermes / claude / gemini on PATH.

npm install -g @noetiklab/endy
endy install                     # idempotent: symlinks, completion, PATH,
                                 # AND bootstraps multiplexor from PyPI
                                 # (the routing policy — see below)
exec "$SHELL" -l
endy doctor

Or from source:

git clone https://github.com/trentisiete/endy.git
cd endy && ./scripts/install.sh --yes
exec "$SHELL" -l

The 60-second demo

cd ~/work/my-project
endy start                                            # tmux session for this dir

endy spawn opencode -- "refactor src/auth/ to use the new IdentityProvider interface, then run npm test"

endy watch tree                                       # see it running
# (opencode hits a rate limit, log shows "RESOURCE_EXHAUSTED")

endy handoff <task-id> --to cmd --reason "rate limited" --stop-parent
# → new cmd window opens, reads the original prompt + the FULL log of what
#   opencode produced, and continues from where opencode stopped.
#   --stop-parent closes the rate-limited window in the same shot.
#   (Use --lines N to truncate if you're handing off to a small-context
#   target like gemini free.)

Want to rehearse the loop without burning any real-agent credit? Use the offline bash stub:

endy spawn bash -- "pretend to be doing work"
endy handoff <task-id> --to bash --reason "smoke test"
endy watch tree

You get a real handoff chain in .logs/ and a real new tmux window — the agent just doesn't call out to a real model. Useful for testing the dashboard, the tree view, and your demo recording.

That is the loop. Everything else in endy exists to make this one command not feel magical:

  • endy spawn writes a strict .logs/task-<id>.{log,meta,prompt.md} contract so any frontend can read it.
  • endy watch shows the chain across tmux sessions, web dashboard, and your phone over Tailscale.
  • endy chat, endy ask, endy watch followup cover same-agent continuation, interactive takeover, and one-shot questions.
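Because the file naming is fixed, a frontend can discover tasks with nothing more than a directory listing. The following is a rough sketch under that assumption: it relies only on the `task-<id>.{log,meta,prompt.md}` naming stated above, and the helper names `task_files` and `list_task_ids` are hypothetical.

```python
from pathlib import Path


def task_files(logs_dir, task_id):
    """Return the three paths the contract defines for one task."""
    base = Path(logs_dir)
    return {
        "log": base / f"task-{task_id}.log",
        "meta": base / f"task-{task_id}.meta",
        "prompt": base / f"task-{task_id}.prompt.md",
    }


def list_task_ids(logs_dir):
    """List task ids by scanning for task-<id>.meta files, sorted by name."""
    ids = []
    for meta in sorted(Path(logs_dir).glob("task-*.meta")):
        # Strip the 'task-' prefix and the '.meta' suffix to recover the id.
        ids.append(meta.name[len("task-"):-len(".meta")])
    return ids
```

For example, `list_task_ids(".logs")` in a project directory would return the ids of every task that has a meta file, and `task_files(".logs", task_id)["log"]` gives the path a dashboard would tail.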

Status

Honest table — what is shippable today, what is on the roadmap. Phase labels match docs/roadmap.md.

| Phase | Feature | Status |
| --- | --- | --- |
| 0 | README + docs repositioning + LICENSE + PyPI metadata | shipped |
| 1 | endy spawn / ask / chat / watch basic stack | shipped |
| 1 | endy handoff <id> --to <agent> (manual handoff) | shipped |
| 1 | Web dashboard + Tailscale mobile | shipped |
| 1 | Per-directory tmux sessions + global endy overview | shipped |
| 1 | endy watch tree / list render the ↪ handoff from X chain | shipped |
| 1 | Web dashboard cards show ↪ from <short> + full chain panel | shipped |
| 2 | endy install bootstraps multiplexor from PyPI automatically | shipped |
| 2 | ENDY_HANDOFF_RESOLVER auto-routing (no --to needed) | shipped |
| 2 | multiplexor next-provider + multiplexor status --json | shipped |
| 3 | endy state snapshot + auto-prepended environment block | shipped |
| 3 | codex/skills/endy-state Codex skill | shipped |
| 4 | Auto-detection of exhaustion (CLI stderr → auto-handoff) | shipped |
| 5 | Git worktree per spawned task (parallel isolation) | planned |
| 6 | npm 0.6.0+ stable surface, real demo GIF, public launch | planned |

The loop now closes itself. When an agent task exits non-zero with a known exhaustion signal in its log (Gemini's RESOURCE_EXHAUSTED, opencode's ProviderModelNotFoundError, cmd's Reached maximum conversation turns, claude's usage_limit_exceeded, hermes's model_not_supported, etc.), endy invokes endy handoff automatically and multiplexor picks the next eligible agent. Disable per-task with --no-auto-handoff, globally with ENDY_AUTO_HANDOFF=0, per-project with a .endy/no-auto-handoff marker. Chain depth is capped at 5 to prevent runaway loops.
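The decision described above can be sketched as a single predicate. This is an illustration, not endy's implementation: the signal strings and the cap of 5 come from the paragraph above, while the function name, its parameters, plain substring matching, and the reading of "depth" as the number of handoffs already in the chain are assumptions.

```python
import os

# Exhaustion signals named in this README (the real list may be longer).
EXHAUSTION_SIGNALS = (
    "RESOURCE_EXHAUSTED",                  # Gemini
    "ProviderModelNotFoundError",          # opencode
    "Reached maximum conversation turns",  # cmd
    "usage_limit_exceeded",                # claude
    "model_not_supported",                 # hermes
)
MAX_CHAIN_DEPTH = 5


def should_auto_handoff(exit_code, log_text, chain_depth,
                        no_auto_flag=False, project_marker=False, env=None):
    """Return True when a finished task qualifies for an automatic handoff."""
    env = os.environ if env is None else env
    # Any of the three opt-outs disables auto-handoff.
    if no_auto_flag or project_marker or env.get("ENDY_AUTO_HANDOFF") == "0":
        return False
    if exit_code == 0:  # only failed tasks are considered
        return False
    if chain_depth >= MAX_CHAIN_DEPTH:  # cap prevents runaway loops
        return False
    return any(signal in log_text for signal in EXHAUSTION_SIGNALS)
```

Here `no_auto_flag` stands for `--no-auto-handoff` and `project_marker` for the presence of `.endy/no-auto-handoff`. Requiring both a non-zero exit and a known signal keeps ordinary failures, such as a failing test run, from triggering a handoff.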

Multiplexor

multiplexor is the routing layer. It knows which CLIs you have installed, scores them by priority + tier_bonus, and picks the best one. When you wire it as ENDY_HANDOFF_RESOLVER, every endy handoff without an explicit --to calls multiplexor for the next eligible agent.
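As a rough illustration of "priority + tier_bonus", the sketch below scores candidates and picks the highest. Only the additive formula comes from the text above; the `Candidate` type, the numeric values, the tie-breaking rule (first listed wins), and the `exclude` parameter are hypothetical and are not multiplexor's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    priority: int
    tier_bonus: int
    installed: bool = True


def score(candidate):
    """Score as described in the README: priority plus tier bonus."""
    return candidate.priority + candidate.tier_bonus


def pick_best(candidates, exclude=()):
    """Return the name of the highest-scoring installed agent, or None."""
    eligible = [c for c in candidates if c.installed and c.name not in exclude]
    if not eligible:
        return None
    # max() keeps the first of equally scored candidates.
    return max(eligible, key=score).name


# Made-up numbers, purely for illustration.
candidates = [
    Candidate("opencode", priority=50, tier_bonus=20),
    Candidate("cmd", priority=60, tier_bonus=5),
    Candidate("gemini", priority=40, tier_bonus=20),
    Candidate("hermes", priority=90, tier_bonus=0, installed=False),
]
print(pick_best(candidates))                        # opencode
print(pick_best(candidates, exclude={"opencode"}))  # cmd
```

Excluding the agent that just ran dry is what turns a "best agent" picker into a "next agent" picker for handoffs.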

You do not install it separately: endy install already pulls endy-multiplexor from PyPI (via pipx / uv tool / pip --user, in that order of preference) and exports ENDY_HANDOFF_RESOLVER=multiplexor-next-provider into your shell startup. Pass --no-multiplexor to endy install if you want to skip it.

The two repos are independent — you can use either alone — but they are designed to compose. endy is the runtime; multiplexor is the policy.

A note on terms of service

endy executes each CLI under its own contract. You are responsible for using each provider within the terms you agreed to — including any limits on automation, free-tier eligibility, or use as a backing model for other applications. endy does not bypass quotas, scrape balances, or store credentials. It moves work between CLIs you have already authenticated yourself.

Documentation

  • docs/kickoff.md — onboarding for a new agent or contributor: architecture, conventions, design principles to preserve, anti-patterns we've already burned on
  • docs/operations.md — full command reference, manager workflows, the endy watch family, the .logs/ contract, web dashboard internals
  • docs/cli-gotchas.md — per-CLI quirks (opencode --dir, cmd --max-turns, hermes -Q, tmux specifics)
  • docs/demo.md — script for recording the handoff GIF, beat-by-beat
  • docs/roadmap.md — phases 0-6 with closing commits and what's coming next

endy help prints top-level usage. endy help <agent> (where <agent> is one of opencode, cmd, hermes, claude, gemini, bash, tmux) prints the relevant section of the gotchas doc.

License

MIT.
