endy

A tmux control plane that hands a coding task from one CLI agent to another when one runs out of tier.

Recording script: docs/demo.md.

Why

I kept hitting my paid agent's weekly cap on a Thursday afternoon, with a task half done in a tmux window I couldn't extend. The other CLIs I had installed — Gemini, OpenCode, CommandCode, Hermes — were idle, on free tiers, perfectly capable of continuing the work. They just didn't know about each other.

endy is the layer that makes them know.

What it does

One command:

endy handoff <task-id> --to <next-agent>

reads the original prompt, tails the previous agent's output, opens a new tmux window with a different CLI, and tells it:

Here is what was being done. Here is the full output of what your predecessor wrote. The previous agent stopped because of <reason>. Continue.

The new agent picks up. The chain is recorded in the task's meta file (handoff_from=…, handoff_chain=…), so taskA(opencode) → taskB(cmd) → taskC(hermes) is fully traceable. Same .logs/ directory, same web dashboard, same endy watch family of commands.
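
To make the traceability claim concrete, here is a minimal sketch of how a frontend might rebuild a chain from the meta files. It assumes each `.logs/task-<id>.meta` is a plain `key=value` file and that the agent name is stored under a key called `agent`; both are assumptions for illustration, not the documented format. Only `handoff_from` comes from the description above.

```python
import tempfile
from pathlib import Path


def read_meta(path):
    """Parse a key=value meta file into a dict (format assumed, not confirmed)."""
    meta = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that carry no '='
            meta[key.strip()] = value.strip()
    return meta


def trace_chain(logs_dir, task_id, max_hops=32):
    """Follow handoff_from links back to the first task; return oldest first."""
    chain, seen = [], set()
    current = task_id
    while current and current not in seen and len(chain) < max_hops:
        seen.add(current)
        meta_path = Path(logs_dir) / f"task-{current}.meta"
        if not meta_path.exists():
            break
        meta = read_meta(meta_path)
        chain.append((current, meta.get("agent", "?")))  # 'agent' key is hypothetical
        current = meta.get("handoff_from")
    return list(reversed(chain))


# Demo: three fake meta files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    logs = Path(tmp)
    (logs / "task-A.meta").write_text("agent=opencode\n")
    (logs / "task-B.meta").write_text("agent=cmd\nhandoff_from=A\n")
    (logs / "task-C.meta").write_text("agent=hermes\nhandoff_from=B\n")
    demo_chain = trace_chain(logs, "C")

print(" -> ".join(f"{tid}({agent})" for tid, agent in demo_chain))
# prints: A(opencode) -> B(cmd) -> C(hermes)
```

The `seen` set and `max_hops` guard are there so a malformed meta file that points back at itself cannot loop forever.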

If you set ENDY_HANDOFF_RESOLVER to a script that prints an agent name (for example a wrapper around multiplexor), the --to flag becomes optional and routing happens automatically when one tier runs dry.
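
A resolver only has to print one agent name. The sketch below is one hypothetical way to write such a script in Python: it walks a preference list and prints the first agent found on PATH. The preference order, and the convention of passing agents to skip as arguments, are assumptions made for illustration; this README does not specify what endy passes to the resolver.

```python
import shutil
import sys

# Hypothetical preference order; adjust to the agents you actually installed.
PREFERENCE = ["opencode", "cmd", "hermes", "gemini"]


def pick_agent(preference, skip=(), which=shutil.which):
    """Return the first preferred agent that is on PATH and not in `skip`."""
    for name in preference:
        if name not in skip and which(name):
            return name
    return None


def main(argv, which=shutil.which, out=sys.stdout, err=sys.stderr):
    """Print one agent name to stdout; return a shell-style exit code."""
    agent = pick_agent(PREFERENCE, skip=set(argv), which=which)
    if agent is None:
        print("no eligible agent found", file=err)
        return 1
    print(agent, file=out)
    return 0


# Demo with a fake PATH lookup so the result does not depend on this machine.
def fake_which(name):
    return "/usr/bin/" + name if name in {"cmd", "gemini"} else None


demo_code = main(["cmd"], which=fake_which)  # pretend cmd just ran dry

# As a real script, end the file with: raise SystemExit(main(sys.argv[1:]))
```

Saved as an executable file, a script like this could be what ENDY_HANDOFF_RESOLVER points at.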

The stack

| Layer | Agent | Tier | Notes |
| --- | --- | --- | --- |
| Orchestrator | `codex` | paid | Long context, good at planning. You pay only for the conductor. |
| Worker | `opencode` | free (multiple backends) | Default for refactors, tests, fast edits |
| Worker | `cmd` (CommandCode / Kimi K2.6) | ~€1 buys a lot of work | Strong taste reviewer; cheapest paid option |
| Worker | `hermes` (Nous Research) | depends on backend (configurable per provider, incl. Copilot) | Tool-heavy agentic work |
| Worker | `gemini` (Google Gemini CLI) | free daily quota | Wide reach |
| Smoke testing | `bash` (offline stub) | free | Spawns a no-op window so you can rehearse handoff chains without burning real-agent credits |

You only install the ones you want. endy doctor shows what is wired up and authenticated.

Local models (Ollama, LM Studio, …)

endy does not spawn a local model directly — endy spawn ollama is deliberately not wired. Local inference is better treated as a backend behind an existing agent than as a peer-level CLI:

Via hermes. Hermes supports user-defined providers in ~/.hermes/config.yaml pointing at any OpenAI-compatible endpoint, including ollama's http://localhost:11434/v1. Add the provider once, then endy spawn hermes --model "ollama/llama3.2" (or whichever model you've pulled) routes through hermes to your local ollama.
Via /model inside a CLI that supports it. cmd, codex, and others expose local providers in their /model picker (codex has --oss --local-provider ollama / --local-provider lmstudio; cmd has an Ollama provider you can pick interactively). When the picker opens, browse to the local provider, pick a pulled model, and the CLI sends to your local daemon for that session.

multiplexor also knows ollama as a local fallback at the routing layer — multiplexor delegate "task" may pick it when free tiers are exhausted — but the multiplexor-next-provider resolver used by endy handoff does NOT return ollama (endy can't drive it headlessly as a peer agent). The local-model story lives inside hermes and the /model slash command, not in endy's spawn surface.
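
Before routing hermes to a local endpoint, it can help to confirm that the daemon answers and to see which models are pulled. The sketch below is not part of endy; it assumes the OpenAI-compatible endpoint at http://localhost:11434/v1 exposes a `/models` listing in the usual OpenAI list shape, and the function names are made up for this example.

```python
import json
import urllib.request


def parse_model_ids(payload):
    """Extract model ids from an OpenAI-style {"data": [{"id": ...}]} listing."""
    data = json.loads(payload).get("data", [])
    return [item["id"] for item in data if "id" in item]


def list_local_models(base_url="http://localhost:11434/v1", timeout=2.0):
    """Return model ids from the local endpoint, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(base_url + "/models", timeout=timeout) as resp:
            return parse_model_ids(resp.read().decode("utf-8"))
    except (OSError, ValueError):  # connection refused, timeout, or bad JSON
        return None


# Demo on a canned response, so no daemon is needed to run this block.
sample = '{"object": "list", "data": [{"id": "llama3.2:latest", "object": "model"}]}'
demo_models = parse_model_ids(sample)
print(demo_models)
```

Calling `list_local_models()` with no daemon running returns `None` rather than raising, which makes it easy to use as a preflight check.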

Quickstart

Prereqs (macOS or Linux): tmux, python3, and at least one of codex / opencode / cmd / hermes / claude / gemini on PATH.

npm install -g @noetiklab/endy
endy install                     # idempotent: symlinks, completion, PATH,
                                 # AND bootstraps multiplexor from PyPI
                                 # (the routing policy — see below)
exec "$SHELL" -l
endy doctor

Or from source:

git clone https://github.com/trentisiete/endy.git
cd endy && ./scripts/install.sh --yes
exec "$SHELL" -l

The 60-second demo

cd ~/work/my-project
endy start                                            # tmux session for this dir

endy spawn opencode -- "refactor src/auth/ to use the new IdentityProvider interface, then run npm test"

endy watch tree                                       # see it running
# (opencode hits a rate limit, log shows "RESOURCE_EXHAUSTED")

endy handoff <task-id> --to cmd --reason "rate limited" --stop-parent
# → new cmd window opens, reads the original prompt + the FULL log of what
#   opencode produced, and continues from where opencode stopped. The
#   --stop-parent flag closes the rate-limited window in the same shot.
#   (Use --lines N to truncate if you're handing off to a small-context
#   target like gemini free.)
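
For intuition, the message the new agent receives can be pictured as three pieces glued together: the original prompt, the predecessor's log (optionally cut to the last N lines, as with --lines N), and the reason. This is a minimal sketch under that reading; the function names and exact wording are illustrative, not endy's actual implementation.

```python
def tail_lines(text, n=None):
    """Return the last n lines of text, or all of it when n is None."""
    lines = text.splitlines()
    if n is not None:
        # lines[-0:] would return everything, so handle n <= 0 explicitly
        lines = lines[-n:] if n > 0 else []
    return "\n".join(lines)


def build_handoff_prompt(original_prompt, log_text, reason, lines=None):
    """Assemble the handoff message from prompt, predecessor log, and reason."""
    return "\n\n".join([
        "Here is what was being done:\n" + original_prompt.strip(),
        "Here is the output of what your predecessor wrote:\n"
        + tail_lines(log_text, lines),
        f"The previous agent stopped because of {reason}. Continue.",
    ])


# Demo: keep only the last two log lines, as a small-context target might need.
demo_log = "line 1\nline 2\nline 3\nline 4"
demo_prompt = build_handoff_prompt(
    "refactor src/auth/ to use the new IdentityProvider interface",
    demo_log,
    "rate limited",
    lines=2,
)
print(demo_prompt)
```

Truncating from the end keeps the most recent work, which is usually what the next agent needs to continue.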

Want to rehearse the loop without burning any real-agent credit? Use the offline bash stub:

endy spawn bash -- "pretend to be doing work"
endy handoff <task-id> --to bash --reason "smoke test"
endy watch tree

You get a real handoff chain in .logs/ and a real new tmux window — the agent just doesn't call out to a real model. Useful for testing the dashboard, the tree view, and your demo recording.

That is the loop. Everything else in endy exists to make this one command not feel magical:

endy spawn writes a strict .logs/task-<id>.{log,meta,prompt.md} contract so any frontend can read it.
endy watch shows the chain across tmux sessions, web dashboard, and your phone over Tailscale.
endy chat, endy ask, endy watch followup cover same-agent continuation, interactive takeover, and one-shot questions.
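
Because the `.logs/` layout is a file-naming contract, a frontend needs nothing more than a directory scan to discover tasks. The sketch below groups files by task id and reports which tasks have all three pieces (`.log`, `.meta`, `.prompt.md`). The function names are hypothetical, and the contents of each file are deliberately not interpreted here.

```python
import tempfile
from pathlib import Path

# The three file kinds named by the .logs/task-<id>.{log,meta,prompt.md} contract.
SUFFIXES = (".prompt.md", ".log", ".meta")


def list_tasks(logs_dir):
    """Map each task id to the set of contract suffixes present on disk."""
    tasks = {}
    for path in sorted(Path(logs_dir).glob("task-*")):
        rest = path.name[len("task-"):]
        for suffix in SUFFIXES:
            if rest.endswith(suffix):
                tasks.setdefault(rest[: -len(suffix)], set()).add(suffix)
                break
    return tasks


def complete_tasks(logs_dir):
    """Return ids of tasks that have a log, a meta file, and a prompt."""
    wanted = set(SUFFIXES)
    return sorted(tid for tid, found in list_tasks(logs_dir).items() if found == wanted)


# Demo: one complete task and one that only has a log so far.
with tempfile.TemporaryDirectory() as tmp:
    logs = Path(tmp)
    for name in ("task-a1.log", "task-a1.meta", "task-a1.prompt.md", "task-b2.log"):
        (logs / name).write_text("")
    demo_all = list_tasks(logs)
    demo_complete = complete_tasks(logs)

print(demo_complete)
```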

Status

Honest table — what is shippable today, what is on the roadmap. Phase labels match docs/roadmap.md.

| Phase | Feature | Status |
| --- | --- | --- |
| 0 | README + docs repositioning + LICENSE + PyPI metadata | shipped |
| 1 | `endy spawn` / `ask` / `chat` / `watch` basic stack | shipped |
| 1 | `endy handoff <id> --to <agent>` (manual handoff) | shipped |
| 1 | Web dashboard + Tailscale mobile | shipped |
| 1 | Per-directory tmux sessions + global `endy overview` | shipped |
| 1 | `endy watch tree` / `list` render the `↪ handoff from X` chain | shipped |
| 1 | Web dashboard cards show `↪ from <short>` + full chain panel | shipped |
| 2 | `endy install` bootstraps multiplexor from PyPI automatically | shipped |
| 2 | `ENDY_HANDOFF_RESOLVER` auto-routing (no `--to` needed) | shipped |
| 2 | `multiplexor next-provider` + `multiplexor status --json` | shipped |
| 3 | `endy state` snapshot + auto-prepended environment block | shipped |
| 3 | `codex/skills/endy-state` Codex skill | shipped |
| 4 | Auto-detection of exhaustion (CLI stderr → auto-handoff) | shipped |
| 5 | Git worktree per spawned task (parallel isolation) | planned |
| 6 | npm 0.6.0+ stable surface, real demo GIF, public launch | planned |

The loop now closes itself. When an agent task exits non-zero with a known exhaustion signal in its log (Gemini's RESOURCE_EXHAUSTED, opencode's ProviderModelNotFoundError, cmd's Reached maximum conversation turns, claude's usage_limit_exceeded, hermes's model_not_supported, etc.), endy invokes endy handoff automatically and multiplexor picks the next eligible agent. Disable per-task with --no-auto-handoff, globally with ENDY_AUTO_HANDOFF=0, per-project with a .endy/no-auto-handoff marker. Chain depth is capped at 5 to prevent runaway loops.
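
The decision described above boils down to a handful of checks. Here is a minimal sketch of that logic, using only the signals, switches, and depth cap named in this section. The function names are hypothetical, and whether the cap means "fewer than 5 hops so far" or something slightly different is an assumption.

```python
import os
from pathlib import Path

# Exhaustion signals listed in this README; the real list may be longer.
EXHAUSTION_SIGNALS = (
    "RESOURCE_EXHAUSTED",                  # gemini
    "ProviderModelNotFoundError",          # opencode
    "Reached maximum conversation turns",  # cmd
    "usage_limit_exceeded",                # claude
    "model_not_supported",                 # hermes
)
MAX_CHAIN_DEPTH = 5


def find_signal(log_text):
    """Return the first known exhaustion signal found in the log, else None."""
    for signal in EXHAUSTION_SIGNALS:
        if signal in log_text:
            return signal
    return None


def should_auto_handoff(exit_code, log_text, chain_depth,
                        project_dir=".", env=None, no_auto_flag=False):
    """Decide whether a finished task qualifies for an automatic handoff."""
    env = os.environ if env is None else env
    if no_auto_flag:                            # per-task: --no-auto-handoff
        return False
    if env.get("ENDY_AUTO_HANDOFF") == "0":     # global switch
        return False
    if (Path(project_dir) / ".endy" / "no-auto-handoff").exists():  # per-project
        return False
    if chain_depth >= MAX_CHAIN_DEPTH:          # assumed reading of the cap
        return False
    return exit_code != 0 and find_signal(log_text) is not None


demo = should_auto_handoff(1, "error: RESOURCE_EXHAUSTED", chain_depth=0,
                           project_dir="no-such-project", env={})
print(demo)
```

Checking the opt-outs before reading the log keeps the disabled paths cheap and makes the precedence easy to see.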

Multiplexor

multiplexor is the routing layer. It knows which CLIs you have installed, scores them by priority + tier_bonus, and picks the best one. When you wire it as ENDY_HANDOFF_RESOLVER, every endy handoff without an explicit --to calls multiplexor for the next eligible agent.
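
As a rough illustration of "priority + tier_bonus", the sketch below scores a few candidates and picks the highest-scoring one that is installed and not exhausted. The numbers, the `Candidate` type, and the `exhausted` flag are invented for this example; multiplexor's real data model may look different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    priority: int
    tier_bonus: int
    installed: bool = True
    exhausted: bool = False

    @property
    def score(self):
        # The README describes the score as priority + tier_bonus.
        return self.priority + self.tier_bonus


def pick_best(candidates):
    """Return the highest-scoring eligible candidate, or None if none qualify."""
    eligible = [c for c in candidates if c.installed and not c.exhausted]
    return max(eligible, key=lambda c: c.score, default=None)


# Hypothetical values: opencode would win, but its tier has run dry.
demo_candidates = [
    Candidate("opencode", priority=50, tier_bonus=20, exhausted=True),
    Candidate("cmd", priority=40, tier_bonus=10),
    Candidate("gemini", priority=30, tier_bonus=15),
    Candidate("hermes", priority=45, tier_bonus=10, installed=False),
]
demo_best = pick_best(demo_candidates)
print(demo_best.name)  # prints: cmd
```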

You do not install it separately: endy install already pulls endy-multiplexor from PyPI (via pipx / uv tool / pip --user, in that order of preference) and exports ENDY_HANDOFF_RESOLVER=multiplexor-next-provider into your shell startup. Pass --no-multiplexor to endy install if you want to skip it.
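
The "in that order of preference" fallback can be pictured as trying each installer in turn and using the first one that exists. This is a sketch of the idea, not the code endy install actually runs; in particular the exact pip invocation is an assumption.

```python
import shutil

PACKAGE = "endy-multiplexor"

# (binary to look for, command to run), in the README's order of preference.
INSTALLERS = [
    ("pipx", ["pipx", "install", PACKAGE]),
    ("uv", ["uv", "tool", "install", PACKAGE]),
    ("python3", ["python3", "-m", "pip", "install", "--user", PACKAGE]),
]


def choose_install_command(which=shutil.which):
    """Return the argv for the first available installer, or None."""
    for binary, argv in INSTALLERS:
        if which(binary):
            return argv
    return None


# Demo with a fake lookup: pipx is missing, uv is present.
def fake_which(name):
    return "/usr/local/bin/uv" if name == "uv" else None


demo_cmd = choose_install_command(which=fake_which)
print(" ".join(demo_cmd))  # prints: uv tool install endy-multiplexor
```

The returned list is ready to hand to `subprocess.run`, which avoids any shell quoting concerns.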

The two repos are independent — you can use either alone — but they are designed to compose. endy is the runtime; multiplexor is the policy.

A note on terms of service

endy executes each CLI under its own contract. You are responsible for using each provider within the terms you agreed to — including any limits on automation, free-tier eligibility, or use as a backing model for other applications. endy does not bypass quotas, scrape balances, or store credentials. It moves work between CLIs you have already authenticated yourself.

Documentation

docs/kickoff.md — onboarding for a new agent or contributor: architecture, conventions, design principles to preserve, anti-patterns we've already burned on
docs/operations.md — full command reference, manager workflows, the endy watch family, the .logs/ contract, web dashboard internals
docs/cli-gotchas.md — per-CLI quirks (opencode --dir, cmd --max-turns, hermes -Q, tmux specifics)
docs/demo.md — script for recording the handoff GIF, beat-by-beat
docs/roadmap.md — phases 0-6 with closing commits and what's coming next

endy help prints top-level usage. endy help <agent> (where <agent> is one of opencode, cmd, hermes, claude, gemini, bash, tmux) prints the relevant section of the gotchas doc.

License

MIT.
