Agents Fleet

AI coding agents like Claude Code and Codex are powerful, but they have no built-in cost controls—one runaway session can silently burn $20–$50 with no visibility into what’s happening or when to stop. Agents Fleet gives you a local web UI to launch and monitor agent sessions and automatically stop them when they hit a token or USD budget.

Local-first “mission control” for AI coding agent CLIs (and any shell commands): launch sessions in a repo, stream live output to a web UI, stop them, and keep a persisted history.

Visual Overview

✨ Recently Shipped

LiteLLM Spend Analytics tab (latest)
- Real spend data pulled from your LiteLLM proxy (/spend/logs, /user/daily/activity)
- Matches Agents Fleet layout: header stats, This Week chart, Weekly Budget strip, By Model and Daily tabs
- Weekly budget resets Sunday; projects spend and flags over-budget
Budget 80% warning notifications
- Native browser notification + in-app toast when a session reaches 80% of its USD or token budget
- Toast auto-dismisses after 8s; works even when browser notifications are blocked
One-click session resume (latest)
- claude --resume <uuid> and codex resume <uuid> commands are captured automatically on session exit and shown in the Artifacts tab
- Resume button spawns a new shell session instantly — no copy-paste needed
- Backfilled across all historical sessions in the database
Graceful session exit for Claude and Codex (latest)
- Stop button sends Ctrl+C → /exit instead of hard-killing, giving Claude/Codex time to save state and print the resume command before exiting
Interactive Git Diff Viewer
- Side-by-side diff display with line-by-line numbering
- Paired removed/added lines render adjacent for easy comparison
- File-level grouping with syntax coloring
Spend Analytics Dashboard
- View total spend by month, week, or day
- Drill down by repo, command, or model
- Real-time cost tracking with USD budgets
Budget Tracking in Session Header
- Display token budgets (input + output combined) and USD budgets side-by-side with current usage
- Shows on all tabs: Shell, Claude (SDK), LiteLLM
- Example: total 81,772 / 100,000 budget $1.23 / $5.00

This repository contains a working MVP:
pnpm workspace monorepo
React + Vite + TypeScript “Mission Control” web app
Node + Express + TypeScript server:
- SQLite persistence (data/agents_fleet.sqlite)
- session + terminal history HTTP APIs
- WebSocket streaming (/ws):
  - live PTY output for shell/CLI sessions
  - live Claude SDK chat streaming + tool events
shared TypeScript types (packages/shared)

Demo

Screenshots

Mission control overview

Local-first architecture

Create a new session

Shell session

Claude (SDK) session

LiteLLM session

Interactive sessions

Claude Code / PTY session

OpenAI Codex / PTY session

Codex scrollback / persisted terminal replay

Claude SDK chat flow

Chat conversation view

Command approval gate

Approval accepted

Approval rejected

Persisted chat history

Per-session artifacts (git diff snapshots)

Git diff viewer (side-by-side, file tabs)

Spend dashboards / budget tracking

Default spend dashboard

Spend dashboard today

Spend dashboard 7 days

Spend dashboard by repo

Spend dashboard by command

Spend dashboard by model

Session resume

Resume artifact in Artifacts tab with Copy + Resume button

Resumed session live terminal

LiteLLM Spend Analytics

By Model tab — real spend breakdown from proxy

Daily tab — per-day requests, tokens, and spend

Budget warnings

In-app toast (shown when browser notifications are blocked or as a persistent overlay)

Native browser notification

Browser notification permission prompt

SQLite persistence / debug views

Sessions table

Logs table

The MVP persists several tables in data/agents_fleet.sqlite:

sessions: session metadata + budgets + estimated token/cost + stop reason
pty_chunks: raw PTY stream (ANSI included) used for Terminal (persisted) replay
stdin_events: input audit trail (stored separately; not injected into replay)
session_markers: lifecycle markers like stop_requested, budget_exceeded, process_exit
session_artifacts: per-session artifacts (currently: git snapshot with changedFiles[] + combined staged/unstaged diff captured on stop/exit)

Earlier iterations used a line-based logs table. The current design persists terminal history as raw PTY chunks (pty_chunks) for xterm.js replay, which is much closer to real scrollback (especially for TUIs like Claude/Codex).
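
To illustrate why raw chunks replay more faithfully than line-based logs, here is a minimal sketch of rebuilding a session's replay stream. The row shape (`sessionId`, `seq`, `data`) and the function name are assumptions made for illustration; the actual pty_chunks columns may differ.

```typescript
// Hypothetical row shape; the real pty_chunks columns may differ.
interface PtyChunkRow {
  sessionId: string;
  seq: number; // assumed to increase monotonically within a session
  data: string; // raw PTY output, ANSI escape sequences included
}

// Rebuild the replay stream for one session by concatenating its chunks
// in sequence order. No line splitting happens, so an escape sequence that
// was cut across two chunks is restored when the chunks are joined.
function buildReplayStream(rows: PtyChunkRow[], sessionId: string): string {
  return rows
    .filter((row) => row.sessionId === sessionId) // filter() copies, so sort() does not mutate the input
    .sort((a, b) => a.seq - b.seq)
    .map((row) => row.data)
    .join("");
}
```

The resulting string could then be written to an xterm.js terminal in one call. A line-based log would have to split on newlines first, which is where TUI redraws and split escape sequences tend to get lost.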

Videos

Tip: GitHub renders MP4 previews nicely in README. .mov files are ignored by default in .gitignore to avoid bloating git history.

screenshots/AgentFleet__AI_Mission_Control.mp4

Architecture

See ARCHITECTURE.md.

Prerequisites

Node.js 20.x–24.x (Node 25+ not yet supported)
pnpm (Corepack is fine)

Setup

COREPACK_HOME="$PWD/.corepack" pnpm install

Run (dev) — one command

pnpm dev:one

On first run, this may prompt you for ANTHROPIC_API_KEY and save it to .env.local (gitignored). Press Enter to skip.

Environment variables:

ANTHROPIC_API_KEY (required for Claude SDK chat)
LITELLM_BASE_URL and LITELLM_API_KEY (optional; used for LiteLLM Chat via an enterprise proxy)

pnpm dev:one will:

install dependencies (if needed)
start apps/server + apps/web in parallel

Open: http://localhost:5173

Run (dev) — manual (two terminals)

COREPACK_HOME="$PWD/.corepack" pnpm -C apps/server dev
COREPACK_HOME="$PWD/.corepack" pnpm -C apps/web dev

Create a session

Open the web app (Vite prints the URL, typically http://localhost:5173).
Enter:
- Repo path: absolute path to a local repository (must be a directory)
- Command: any shell command to run in that repo

Example commands:

node -e "console.log('hello')"
git status
node -e "setInterval(()=>console.log('tick',Date.now()),200)"
node -e "setInterval(()=>console.log(Date.now()),200)"
claude
codex

Interactive sessions (e.g. Claude)

Claude Code / Codex (PTY)

Start a session with command claude (or codex if installed).

(Recommended) Claude Code status line for accurate budget tracking

Claude Code can run a custom status line command that receives structured JSON about the current session (context window usage, estimated cost, etc.).

For the most reliable budget tracking in Agents Fleet, configure a single-line status line that prints parse-friendly key/value pairs.

Create the script:

#!/bin/bash
input=$(cat)

CTX_IN=$(echo "$input" | jq -r '.context_window.total_input_tokens // 0')
CTX_OUT=$(echo "$input" | jq -r '.context_window.total_output_tokens // 0')
CTX_SIZE=$(echo "$input" | jq -r '.context_window.context_window_size // 0')
CTX_PCT=$(echo "$input" | jq -r '.context_window.used_percentage // 0' | cut -d. -f1)

COST=$(echo "$input" | jq -r '.cost.total_cost_usd // 0')
COST_FMT=$(printf '$%.6f' "$COST")

# Single-line, parse-friendly output:
# Use a unique prefix + delimiter to make parsing reliable even with TUI redraws.
echo "AF|ctx=${CTX_IN}/${CTX_SIZE}(${CTX_PCT}%)|in=${CTX_IN}|out=${CTX_OUT}|cost=${COST_FMT}"

Save it as ~/.claude/agents_fleet_statusline.sh and make it executable:

chmod +x ~/.claude/agents_fleet_statusline.sh

Update ~/.claude/settings.json:

{
  "statusLine": {
    "type": "command",
    "command": "~/.claude/agents_fleet_statusline.sh",
    "padding": 1,
    "refreshInterval": 1
  }
}

Notes:

Requires jq to be installed (brew install jq on macOS).
cost.total_cost_usd is an estimate computed client-side by Claude Code and may differ from your actual bill.
Type directly into the Terminal (live) pane (xterm.js).
Use Terminal (persisted) to replay and scroll through the recorded PTY output (xterm.js replay).
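
As an illustration of why the `AF|` prefix and `|` delimiters help, here is a minimal sketch of a parser for the status line printed by the script above. The function and type names are hypothetical and this is not necessarily how Agents Fleet parses the line. It only assumes the output format shown in the script.

```typescript
// Hypothetical result shape for one parsed status line.
interface StatusLineUsage {
  inputTokens: number;
  outputTokens: number;
  contextSize: number;
  costUsd: number;
}

// Matches e.g. AF|ctx=1200/200000(0%)|in=1200|out=340|cost=$0.012345
const AF_LINE_SOURCE =
  "AF\\|ctx=(\\d+)/(\\d+)\\((\\d+)%\\)\\|in=(\\d+)\\|out=(\\d+)\\|cost=\\$(\\d+(?:\\.\\d+)?)";

// Scan a chunk of PTY output and return the most recent status line in it.
// TUI redraws can repeat the line or surround it with escape sequences,
// so the pattern is searched anywhere in the chunk and the last match wins.
function parseStatusLine(chunk: string): StatusLineUsage | null {
  const pattern = new RegExp(AF_LINE_SOURCE, "g");
  let match: RegExpExecArray | null;
  let last: RegExpExecArray | null = null;
  while ((match = pattern.exec(chunk)) !== null) {
    last = match;
  }
  if (last === null) return null;
  return {
    contextSize: Number(last[2]),
    inputTokens: Number(last[4]),
    outputTokens: Number(last[5]),
    costUsd: Number(last[6]),
  };
}
```

Taking the last match instead of the first means a chunk containing several redraws yields the freshest numbers.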

(Recommended) Codex status line for accurate budget tracking

Codex can also show session usage in a single-line status line. For Agents Fleet, the simplest reliable setup is to keep Codex’s built-in status line enabled and ensure it includes the usage fields below.

Update ~/.codex/config.toml:

[tui]
status_line = ["model-with-reasoning","current-dir","context-remaining","context-used","total-input-tokens","total-output-tokens","weekly-limit","five-hour-limit","run-state","task-progress"]
status_line_use_color = true

Make sure the output stays on one line in the Codex TUI.

Notes:

The config above matches the usage fields Agents Fleet can parse for budget tracking.
If you change the field list, keep it single-line so PTY replay remains parse-friendly.
Type directly into the Terminal (live) pane (xterm.js).
Use Terminal (persisted) to replay and scroll through the recorded PTY output (xterm.js replay).

Claude (SDK) chat (tool-calling)

Prerequisite: set ANTHROPIC_API_KEY (required). The server will reject Claude SDK requests if it’s missing.

Switch to Claude (SDK) in the UI.
Provide a repo path and chat normally.
The assistant can propose run_command tool calls; you must Approve or Reject each command.
Tool output is capped (100KB) and stored as session artifacts.
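
The output cap can be pictured with a short sketch. The names below are hypothetical, and reading "100KB" as 100 × 1024 characters (not bytes) is an assumption made for simplicity; the server's actual limit and unit may differ.

```typescript
// Assumption: "100KB" is treated as 100 * 1024 UTF-16 code units.
const MAX_TOOL_OUTPUT_CHARS = 100 * 1024;

interface CappedOutput {
  text: string;
  truncated: boolean;
}

// Keep at most `limit` characters of a tool's output and report whether
// anything was cut, so the UI or the stored artifact can flag truncation.
function capToolOutput(output: string, limit: number = MAX_TOOL_OUTPUT_CHARS): CappedOutput {
  if (output.length <= limit) {
    return { text: output, truncated: false };
  }
  return { text: output.slice(0, limit), truncated: true };
}
```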

Screenshots:

Claude SDK session stopped by budget

Claude SDK tool call + output

Claude SDK tool permission gate (Approve/Reject)

LiteLLM Chat (proxy support)

Use your enterprise URL and API key to access multiple models through a LiteLLM proxy.

LiteLLM Chat allows you to:

Use your enterprise/custom LiteLLM proxy endpoint
Access models from multiple providers (OpenAI, Anthropic, etc.)
Route requests through your own infrastructure

Setup:

Set environment variables:

export LITELLM_BASE_URL="https://your-litellm-proxy.com"
export LITELLM_API_KEY="your-api-key"

Switch to LiteLLM in the UI.
Provide a repo path and select your desired model from the dropdown.
Chat and use tools normally—the same Approve/Reject workflow as Claude SDK.

Notes:

LITELLM_BASE_URL must be a valid HTTPS URL pointing to your LiteLLM proxy endpoint.
LITELLM_API_KEY is your authentication key for the proxy.
The available models depend on your LiteLLM proxy configuration.
Tool output is capped (100KB) and stored as session artifacts, just like Claude SDK.
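
A quick way to check the HTTPS requirement before starting the server is sketched below. The function name is hypothetical and the trailing-slash normalization is an illustrative choice, not something Agents Fleet is known to do.

```typescript
// Validate a LITELLM_BASE_URL value: it must be set, parse as a URL,
// and use the https scheme. Returns the URL without trailing slashes.
function validateLiteLlmBaseUrl(raw: string | undefined): string {
  if (!raw) {
    throw new Error("LITELLM_BASE_URL is not set");
  }
  let parsed: URL;
  try {
    parsed = new URL(raw);
  } catch {
    throw new Error("LITELLM_BASE_URL is not a valid URL: " + raw);
  }
  if (parsed.protocol !== "https:") {
    throw new Error("LITELLM_BASE_URL must use https, got " + parsed.protocol);
  }
  // Illustrative normalization: drop trailing slashes so paths can be appended.
  return parsed.origin + parsed.pathname.replace(/\/+$/, "");
}
```

For example, `validateLiteLlmBaseUrl("https://your-litellm-proxy.com")` returns the same URL unchanged, while an `http://` value or an unset variable throws with a descriptive message.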

Enterprise/Custom LLM Integration: If you're running a local or enterprise LiteLLM proxy, Agents Fleet will route all requests through your infrastructure, giving you full control and visibility over API costs and usage.

Budgets (estimated)

Optional Budget USD and/or Budget tokens apply to the entire session lifetime.
Token budget: counts input + output tokens combined. When total_tokens >= budget_tokens, the session stops.
USD budget: when estimated_cost >= budget_usd, the session stops.
Token estimation: ceil(text.length / 4).
Cost estimation:
- shell/PTY sessions use the default rates in apps/server/src/budget.ts ($3.00 per 1M input, $15.00 per 1M output by default)
- Claude SDK sessions use a model-based pricing table (computeModelCostUsd) and SDK-reported usage when available.
- LiteLLM sessions use model-specific pricing from your LiteLLM proxy configuration.
If a budget is exceeded, the session is stopped automatically and stop_reason becomes budget_exceeded.
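
The rules above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the rates are the shell/PTY defaults quoted above, and the 80% warning threshold is the one described under Recently Shipped. The real logic lives in apps/server/src/budget.ts and may differ.

```typescript
// Default shell/PTY rates from this README, in USD per 1M tokens.
const INPUT_USD_PER_MTOK = 3.0;
const OUTPUT_USD_PER_MTOK = 15.0;
// Warning threshold described under "Budget 80% warning notifications".
const WARNING_FRACTION = 0.8;

// Token estimation: ceil(text.length / 4).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Estimated cost in USD for a given number of input and output tokens.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  return (inputTokens * INPUT_USD_PER_MTOK + outputTokens * OUTPUT_USD_PER_MTOK) / 1e6;
}

type BudgetState = "ok" | "warning" | "exceeded";

// Compare usage (tokens or USD) against an optional budget.
// A session with no budget never warns or stops.
function budgetState(used: number, budget: number | undefined): BudgetState {
  if (budget === undefined) return "ok";
  if (used >= budget) return "exceeded";
  if (used >= budget * WARNING_FRACTION) return "warning";
  return "ok";
}
```

With the header example from earlier in this README, `budgetState(81772, 100000)` returns `"warning"` because 81,772 is past 80% of the token budget, while `budgetState(1.23, 5.0)` returns `"ok"`.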

Note: USD cost is still an estimate unless you configure model pricing to match your account/contract.

Configure pricing via a remote API (PRICING_API_URL, must be https) or via local overrides (PRICING_JSON inline JSON / PRICING_JSON_PATH file path). See apps/server/src/pricing.ts for schema + env vars.

Stop a session

Select a running session and click Stop.
The server will attempt graceful termination first, then force-kill if needed (best-effort, cross-platform).

Per-session artifacts (git diff snapshots)

On session stop and/or exit, Agents Fleet can capture a git snapshot for the session repo and store it in SQLite.

UI: open the Artifacts tab (next to Terminal tabs) to view changed files + diff.
Storage: session_artifacts table.
Toggle: set AGENTS_FLEET_CAPTURE_GIT_ON_END=0 to disable capture.

Resource metrics

Agents Fleet ships a scripts/metrics script that captures CPU, memory, swap, load, network I/O, disk I/O, open file descriptors, and SQLite DB size — scoped to AgentFleet processes only.

./scripts/metrics              # one-shot pretty print
./scripts/metrics --watch      # live refresh every 3s
./scripts/metrics --log        # continuous CSV log to data/metrics_<timestamp>.csv (every 5s)

To tail the log live while a session runs:

# Terminal 1
./scripts/metrics --log

# Terminal 2
tail -f data/metrics_<timestamp>.csv

Measured profile (Apple M4 Pro, 24GB RAM):

Scenario	CPU%	Memory
Idle (server + vite only)	0–1%	~165MB
Claude Shell session active	1–4%	~788MB
Claude SDK tool calls firing	3–17%	600–665MB
Git diff capture on stop	12–47%	spike, clears fast
LiteLLM / Spend Analytics	0–3%	~165MB

Baseline footprint is small: about 165MB and <1% CPU when no agent is running
Memory is almost entirely the Claude/Codex process itself, not AgentFleet overhead
Git diff capture is the heaviest single event (~12% typical, up to 47% if multiple sessions stop together) — lasts <10s and clears cleanly
No memory leaks observed — memory returns to baseline after every session exits
Swap usage unchanged throughout — AgentFleet does not add swap pressure
Data dir grows ~3MB per SDK session — worth monitoring on frequent use

GPU utilization is not captured without sudo. On Apple Silicon (unified memory) run sudo powermetrics --samplers gpu_power separately if needed.

Scripts

pnpm dev:one installs deps if needed and runs dev for all workspaces (web + server).
pnpm dev runs dev for all workspaces (web + server) in parallel.
pnpm check runs lint + typecheck + test + build.
pnpm build builds all workspaces.
pnpm typecheck runs TypeScript checks across workspaces.

Tests

COREPACK_HOME="$PWD/.corepack" pnpm -C apps/server test

Notes

If you see Corepack cache permission errors, prefix commands with COREPACK_HOME="$PWD/.corepack", which keeps Corepack’s cache inside the repo.
Node version: Node 20–24 are supported; Node 22 and 24 work fine. Node 25+ is blocked by @homebridge/node-pty-prebuilt-multiarch (>=18 <25).

Data location

SQLite DB: data/agents_fleet.sqlite (local only; do not commit).

Known limitations

PTY sessions do not preserve stdout/stderr separation.
Token/cost is an estimate unless the CLI provides actual usage.
Some TUIs (notably Claude) may clear/restore the alternate screen on exit. The persisted replay is a faithful stream replay, so end-of-session scrollback may differ from what you remember seeing just before exit.
No multi-line input in the terminal pane. The xterm.js terminal forwards keystrokes directly to the PTY; the shell owns the line and executes on Enter. Shift+Enter, Ctrl+Enter, etc. all behave the same as plain Enter — there is no way to insert a newline without submitting at the terminal protocol level. Workarounds inside the shell: end a line with \ for continuation, or use $'line1\nline2' quoting. Inside Claude Code's TUI specifically, Option+Enter inserts a newline in the prompt. For free-form multi-line composition, use the Claude (SDK) or LiteLLM chat tabs instead, where Shift+Enter works as expected.
