akhilsinghcodes/agents_fleet

Agents Fleet

AI coding agents like Claude Code and Codex are powerful, but they have no built-in cost controls—one runaway session can silently burn $20–$50 with no visibility into what’s happening or when to stop. Agents Fleet gives you a local web UI to launch and monitor agent sessions and automatically stop them when they hit a token or USD budget.

Local-first “mission control” for AI coding agent CLIs (and any shell commands): launch sessions in a repo, stream live output to a web UI, stop them, and keep a persisted history.

Visual Overview

AgentFleet: Stop Runaway AI Agents with Local Mission Control

✨ Recently Shipped

  • LiteLLM Spend Analytics tab (latest)
    • Real spend data pulled from your LiteLLM proxy (/spend/logs, /user/daily/activity)
    • Matches Agents Fleet layout: header stats, This Week chart, Weekly Budget strip, By Model and Daily tabs
    • Weekly budget resets Sunday; projects spend and flags over-budget
  • Budget 80% warning notifications
    • Native browser notification + in-app toast when a session reaches 80% of its USD or token budget
    • Toast auto-dismisses after 8s; works even when browser notifications are blocked
  • One-click session resume (latest)
    • claude --resume <uuid> and codex resume <uuid> commands are captured automatically on session exit and shown in the Artifacts tab
    • Resume button spawns a new shell session instantly — no copy-paste needed
    • Backfilled across all historical sessions in the database
  • Graceful session exit for Claude and Codex (latest)
    • Stop button sends Ctrl+C → /exit instead of hard-killing, giving Claude/Codex time to save state and print the resume command before exiting
  • Interactive Git Diff Viewer
    • Side-by-side diff display with line-by-line numbering
    • Paired removed/added lines render adjacent for easy comparison
    • File-level grouping with syntax coloring
  • Spend Analytics Dashboard
    • View total spend by month, week, or day
    • Drill down by repo, command, or model
    • Real-time cost tracking with USD budgets
  • Budget Tracking in Session Header
    • Display token budgets (input + output combined) and USD budgets side-by-side with current usage
    • Shows on all tabs: Shell, Claude (SDK), LiteLLM
    • Example: total 81,772 / 100,000 budget $1.23 / $5.00

This repository contains a working MVP:

  • pnpm workspace monorepo
  • React + Vite + TypeScript “Mission Control” web app
  • Node + Express + TypeScript server:
    • SQLite persistence (data/agents_fleet.sqlite)
    • session + terminal history HTTP APIs
    • WebSocket streaming (/ws):
      • live PTY output for shell/CLI sessions
      • live Claude SDK chat streaming + tool events
  • shared TypeScript types (packages/shared)

Demo

Screenshots

Mission control overview

Local-first architecture

Create a new session

  • Shell session

New shell session

  • Claude (SDK) session

New Claude SDK session

  • LiteLLM session

New LiteLLM session

Interactive sessions

  • Claude Code / PTY session

Claude interactive session

  • OpenAI Codex / PTY session

Codex interactive session

  • Codex scrollback / persisted terminal replay

Codex scrollable terminal

Claude SDK chat flow

  • Chat conversation view

Claude SDK chat

  • Command approval gate

Claude SDK approval gate

  • Approval accepted

Claude SDK approval accepted

  • Approval rejected

Claude SDK approval rejected

  • Persisted chat history

Claude SDK history

Per-session artifacts (git diff snapshots)

  • Git diff viewer (side-by-side, file tabs)

Git diff viewer

Spend dashboards / budget tracking

  • Default spend dashboard

Spend dashboard default view

  • Spend dashboard today

Spend dashboard today

  • Spend dashboard 7 days

Spend dashboard 7 days

  • Spend dashboard by repo

Spend dashboard by repo

  • Spend dashboard by command

Spend dashboard by command

  • Spend dashboard by model

Spend dashboard by model

Session resume

  • Resume artifact in Artifacts tab with Copy + Resume button

Session resume artifact

  • Resumed session live terminal

Session resume terminal

LiteLLM Spend Analytics

  • By Model tab — real spend breakdown from proxy

LiteLLM spend by model

  • Daily tab — per-day requests, tokens, and spend

LiteLLM spend daily

Budget warnings

  • In-app toast (shown when browser notifications are blocked or as a persistent overlay)

Budget warning toast

  • Native browser notification

Budget warning notification

  • Browser notification permission prompt

Budget warning permission

SQLite persistence / debug views

  • Sessions table

sessions table

  • Logs table

logs table

The MVP persists several tables in data/agents_fleet.sqlite:

  • sessions: session metadata + budgets + estimated token/cost + stop reason
  • pty_chunks: raw PTY stream (ANSI included) used for Terminal (persisted) replay
  • stdin_events: input audit trail (stored separately; not injected into replay)
  • session_markers: lifecycle markers like stop_requested, budget_exceeded, process_exit
  • session_artifacts: per-session artifacts (currently: git snapshot with changedFiles[] + combined staged/unstaged diff captured on stop/exit)

Earlier iterations used a line-based logs table. The current design persists terminal history as raw PTY chunks (pty_chunks) for xterm.js replay, which is much closer to real scrollback (especially for TUIs like Claude/Codex).
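
As a rough illustration of why raw chunks replay better than split lines, the sketch below feeds stored chunks to a terminal sink in capture order. The PtyChunk shape (seq, data) and the replayChunks helper are hypothetical and are not the actual pty_chunks schema; in the browser the sink would typically be xterm.js's term.write.

```typescript
// Hypothetical row shape; the real pty_chunks columns may differ.
interface PtyChunk {
  seq: number;  // assumed: increasing capture order
  data: string; // raw PTY output, ANSI escapes included
}

// Write chunks to a terminal sink in capture order, unmodified.
// Nothing is split on newlines, so cursor moves and redraws survive replay.
function replayChunks(chunks: PtyChunk[], write: (data: string) => void): void {
  const ordered = [...chunks].sort((a, b) => a.seq - b.seq);
  for (const chunk of ordered) {
    write(chunk.data);
  }
}
```

A line-based log would have to decide where a "\r" progress redraw begins and ends; replaying the stream as captured leaves that decision to the terminal emulator.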

Videos

Tip: GitHub renders MP4 previews nicely in README. .mov files are ignored by default in .gitignore to avoid bloating git history.

  • screenshots/AgentFleet__AI_Mission_Control.mp4

Architecture

See ARCHITECTURE.md.

Prerequisites

  • Node.js 20.x–24.x (Node 25+ not yet supported)
  • pnpm (Corepack is fine)

Setup

COREPACK_HOME="$PWD/.corepack" pnpm install

Run (dev) — one command

pnpm dev:one

On first run, this may prompt you for ANTHROPIC_API_KEY and save it to .env.local (gitignored). Press Enter to skip.

Environment variables:

  • ANTHROPIC_API_KEY (required for Claude SDK chat)
  • LITELLM_BASE_URL and LITELLM_API_KEY (optional; needed only for LiteLLM Chat through an enterprise proxy)

This will:

  • install dependencies (if needed)
  • start apps/server + apps/web in parallel

Open: http://localhost:5173

Run (dev) — manual (two terminals)

COREPACK_HOME="$PWD/.corepack" pnpm -C apps/server dev
COREPACK_HOME="$PWD/.corepack" pnpm -C apps/web dev

Create a session

  1. Open the web app (Vite prints the URL, typically http://localhost:5173).
  2. Enter:
    • Repo path: absolute path to a local repository (must be a directory)
    • Command: any shell command to run in that repo

Example commands:

node -e "console.log('hello')"
git status
node -e "setInterval(()=>console.log('tick',Date.now()),200)"
node -e "setInterval(()=>console.log(Date.now()),200)"
claude
codex

Interactive sessions (e.g. Claude)

Claude Code / Codex (PTY)

  • Start a session with command claude (or codex if installed).

(Recommended) Claude Code status line for accurate budget tracking

Claude Code can run a custom status line command that receives structured JSON about the current session (context window usage, estimated cost, etc.).

For the most reliable budget tracking in Agents Fleet, configure a single-line status line that prints parse-friendly key/value pairs.

  1. Create the script:
#!/bin/bash
input=$(cat)

CTX_IN=$(echo "$input" | jq -r '.context_window.total_input_tokens // 0')
CTX_OUT=$(echo "$input" | jq -r '.context_window.total_output_tokens // 0')
CTX_SIZE=$(echo "$input" | jq -r '.context_window.context_window_size // 0')
CTX_PCT=$(echo "$input" | jq -r '.context_window.used_percentage // 0' | cut -d. -f1)

COST=$(echo "$input" | jq -r '.cost.total_cost_usd // 0')
COST_FMT=$(printf '$%.6f' "$COST")

# Single-line, parse-friendly output:
# Use a unique prefix + delimiter to make parsing reliable even with TUI redraws.
echo "AF|ctx=${CTX_IN}/${CTX_SIZE}(${CTX_PCT}%)|in=${CTX_IN}|out=${CTX_OUT}|cost=${COST_FMT}"

Save it as ~/.claude/agents_fleet_statusline.sh and make it executable:

chmod +x ~/.claude/agents_fleet_statusline.sh
  2. Update ~/.claude/settings.json:
{
  "statusLine": {
    "type": "command",
    "command": "~/.claude/agents_fleet_statusline.sh",
    "padding": 1,
    "refreshInterval": 1
  }
}

Notes:

  • Requires jq to be installed (brew install jq on macOS).
  • cost.total_cost_usd is an estimate computed client-side by Claude Code and may differ from your actual bill.
  • Type directly into the Terminal (live) pane (xterm.js).
  • Use Terminal (persisted) to replay and scroll through the recorded PTY output (xterm.js replay).
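
To show why the single-line, prefixed format is easy to consume, here is a minimal sketch of a parser for the line printed by the status line script above. The names StatusLineUsage, stripAnsi, and parseLatestStatusLine are hypothetical; Agents Fleet's actual parser may work differently.

```typescript
interface StatusLineUsage {
  inputTokens: number;
  outputTokens: number;
  contextSize: number;
  contextPct: number;
  costUsd: number;
}

// Remove CSI escape sequences (colors, cursor moves) that TUIs interleave with text.
function stripAnsi(text: string): string {
  return text.replace(/\x1b\[[0-9;?]*[ -\/]*[@-~]/g, "");
}

// Return the most recent "AF|..." status line found in PTY output, or null if none.
function parseLatestStatusLine(ptyOutput: string): StatusLineUsage | null {
  const pattern =
    /AF\|ctx=(\d+)\/(\d+)\((\d+)%\)\|in=(\d+)\|out=(\d+)\|cost=\$(\d+(?:\.\d+)?)/g;
  const clean = stripAnsi(ptyOutput);
  let latest: RegExpExecArray | null = null;
  let match: RegExpExecArray | null;
  // TUI redraws can repeat the line; keep only the last occurrence.
  while ((match = pattern.exec(clean)) !== null) {
    latest = match;
  }
  if (latest === null) return null;
  return {
    contextSize: Number(latest[2]),
    contextPct: Number(latest[3]),
    inputTokens: Number(latest[4]),
    outputTokens: Number(latest[5]),
    costUsd: Number(latest[6]),
  };
}
```

The unique AF| prefix lets the parser ignore everything else on screen, and taking the last match means stale redraws never overwrite newer usage numbers.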

(Recommended) Codex status line for accurate budget tracking

Codex can also show session usage in a single-line status line. For Agents Fleet, the simplest reliable setup is to keep Codex’s built-in status line enabled and ensure it includes the usage fields below.

  1. Update ~/.codex/config.toml:
[tui]
status_line = ["model-with-reasoning","current-dir","context-remaining","context-used","total-input-tokens","total-output-tokens","weekly-limit","five-hour-limit","run-state","task-progress"]
status_line_use_color = true
  2. Make sure the output stays on one line in the Codex TUI.

Notes:

  • The config above matches the usage fields Agents Fleet can parse for budget tracking.
  • If you change the field list, keep it single-line so PTY replay remains parse-friendly.
  • Type directly into the Terminal (live) pane (xterm.js).
  • Use Terminal (persisted) to replay and scroll through the recorded PTY output (xterm.js replay).

Claude (SDK) chat (tool-calling)

Prerequisite: set ANTHROPIC_API_KEY (required). The server will reject Claude SDK requests if it’s missing.

  • Switch to Claude (SDK) in the UI.
  • Provide a repo path and chat normally.
  • The assistant can propose run_command tool calls; you must Approve or Reject each command.
  • Tool output is capped (100KB) and stored as session artifacts.
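
One way to apply a cap like this without corrupting multi-byte characters is sketched below. It assumes "100KB" means 100 * 1024 bytes of UTF-8, which the README does not specify, and capToolOutput is a hypothetical helper name.

```typescript
// Assumption: "100KB" is 100 * 1024 bytes of UTF-8; the real cap may be defined differently.
const MAX_TOOL_OUTPUT_BYTES = 100 * 1024;

interface CappedOutput {
  text: string;
  truncated: boolean;
}

function capToolOutput(output: string, maxBytes: number = MAX_TOOL_OUTPUT_BYTES): CappedOutput {
  const bytes = new TextEncoder().encode(output);
  if (bytes.length <= maxBytes) {
    return { text: output, truncated: false };
  }
  let end = Math.max(0, maxBytes);
  // Step back while the cut would land on a UTF-8 continuation byte (10xxxxxx),
  // so the kept prefix always ends on a whole character.
  while (end > 0 && (bytes[end] & 0xc0) === 0x80) {
    end--;
  }
  return { text: new TextDecoder().decode(bytes.subarray(0, end)), truncated: true };
}
```

Returning a truncated flag alongside the text lets the UI tell the user that the stored artifact is incomplete.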

Screenshots:

  • Claude SDK session stopped by budget

Claude SDK budget stop

  • Claude SDK tool call + output

Claude SDK tool call

  • Claude SDK tool permission gate (Approve/Reject)

Claude SDK tool permission

LiteLLM Chat (proxy support)

Use your enterprise URL and API key to access multiple models through a LiteLLM proxy.

LiteLLM Chat allows you to:

  • Use your enterprise/custom LiteLLM proxy endpoint
  • Access models beyond Claude (OpenAI, Anthropic, etc.)
  • Route requests through your own infrastructure

Setup:

  1. Set environment variables:
export LITELLM_BASE_URL="https://your-litellm-proxy.com"
export LITELLM_API_KEY="your-api-key"
  2. Switch to LiteLLM in the UI.
  3. Provide a repo path and select your desired model from the dropdown.
  4. Chat and use tools normally, with the same Approve/Reject workflow as Claude SDK.

Notes:

  • LITELLM_BASE_URL must be a valid HTTPS URL pointing to your LiteLLM proxy endpoint.
  • LITELLM_API_KEY is your authentication key for the proxy.
  • The available models depend on your LiteLLM proxy configuration.
  • Tool output is capped (100KB) and stored as session artifacts, just like Claude SDK.
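
The two environment-variable requirements above can be checked up front before any request is sent. The following is a minimal sketch; readLiteLlmConfig and its result shape are hypothetical and are not the server's actual validation code.

```typescript
type LiteLlmConfig =
  | { ok: true; baseUrl: string; apiKey: string }
  | { ok: false; error: string };

function readLiteLlmConfig(env: Record<string, string | undefined>): LiteLlmConfig {
  const rawUrl = env.LITELLM_BASE_URL;
  const apiKey = env.LITELLM_API_KEY;
  if (!rawUrl || !apiKey) {
    return { ok: false, error: "LITELLM_BASE_URL and LITELLM_API_KEY must both be set" };
  }
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return { ok: false, error: "LITELLM_BASE_URL is not a valid URL" };
  }
  if (url.protocol !== "https:") {
    return { ok: false, error: "LITELLM_BASE_URL must use HTTPS" };
  }
  // Drop trailing slashes so request paths can be appended consistently.
  return { ok: true, baseUrl: rawUrl.replace(/\/+$/, ""), apiKey };
}
```

In a Node server this would typically be called as readLiteLlmConfig(process.env) at startup, with the error surfaced in the UI when ok is false.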

Enterprise/Custom LLM Integration: If you're running a local or enterprise LiteLLM proxy, Agents Fleet will route all requests through your infrastructure, giving you full control and visibility over API costs and usage.

Budgets (estimated)

  • Optional Budget USD and/or Budget tokens apply to the entire session lifetime.
  • Token budget: counts input + output tokens combined. When total_tokens >= budget_tokens, the session stops.
  • USD budget: when estimated_cost >= budget_usd, the session stops.
  • Token estimation: ceil(text.length / 4).
  • Cost estimation:
    • shell/PTY sessions use the default rates in apps/server/src/budget.ts ($3.00 per 1M input, $15.00 per 1M output by default)
    • Claude SDK sessions use a model-based pricing table (computeModelCostUsd) and SDK-reported usage when available.
    • LiteLLM sessions use model-specific pricing from your LiteLLM proxy configuration.
  • If a budget is exceeded, the session is stopped automatically and stop_reason becomes budget_exceeded.
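
The rules above can be sketched in a few lines. The function and type names here are hypothetical (the real logic lives in apps/server/src/budget.ts); only the ceil(text.length / 4) estimate, the default $3.00 / $15.00 per 1M rates, and the >= comparisons come from this README.

```typescript
// Hypothetical names; see apps/server/src/budget.ts for the real implementation.
interface Rates {
  inputPerMillionUsd: number;
  outputPerMillionUsd: number;
}
interface Usage {
  inputTokens: number;
  outputTokens: number;
}
interface Budget {
  budgetTokens?: number;
  budgetUsd?: number;
}

// Default shell/PTY rates quoted above.
const DEFAULT_RATES: Rates = { inputPerMillionUsd: 3.0, outputPerMillionUsd: 15.0 };

// Token estimation: ceil(text.length / 4).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function estimateCostUsd(usage: Usage, rates: Rates = DEFAULT_RATES): number {
  return (
    (usage.inputTokens * rates.inputPerMillionUsd +
      usage.outputTokens * rates.outputPerMillionUsd) /
    1_000_000
  );
}

// A session stops when either configured budget is reached (>=, not >).
function isBudgetExceeded(usage: Usage, budget: Budget, rates: Rates = DEFAULT_RATES): boolean {
  const totalTokens = usage.inputTokens + usage.outputTokens;
  if (budget.budgetTokens !== undefined && totalTokens >= budget.budgetTokens) return true;
  if (budget.budgetUsd !== undefined && estimateCostUsd(usage, rates) >= budget.budgetUsd) return true;
  return false;
}
```

For example, 100,000 input tokens and 50,000 output tokens at the default rates cost an estimated $0.30 + $0.75 = $1.05, which would trip a $1.00 budget but not a $5.00 one.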

Note: USD cost is still an estimate unless you configure model pricing to match your account/contract.

Configure pricing via a remote API (PRICING_API_URL, which must use HTTPS) or via local overrides (PRICING_JSON for inline JSON, or PRICING_JSON_PATH for a file path). See apps/server/src/pricing.ts for the schema and environment variables.

Stop a session

  • Select a running session and click Stop.
  • The server will attempt graceful termination first, then force-kill if needed (best-effort, cross-platform).
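
The escalation described above (ask politely, then terminate, then force-kill) might look like the sketch below. Everything here is an assumption for illustration: the StoppableSession interface, the Ctrl+C byte followed by /exit for agent CLIs, the SIGTERM step, and the grace period are not taken from the server's code.

```typescript
// Hypothetical abstraction over a PTY-backed session.
interface StoppableSession {
  write(data: string): void;
  kill(signal: "SIGTERM" | "SIGKILL"): void;
  hasExited(): boolean;
}

type StopOutcome = "graceful" | "terminated" | "killed";
type Sleep = (ms: number) => Promise<void>;

const realSleep: Sleep = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Poll until the session exits or the timeout elapses.
async function waitForExit(
  session: StoppableSession,
  timeoutMs: number,
  sleep: Sleep,
  pollMs = 100,
): Promise<boolean> {
  for (let waited = 0; waited < timeoutMs && !session.hasExited(); waited += pollMs) {
    await sleep(pollMs);
  }
  return session.hasExited();
}

async function stopSession(
  session: StoppableSession,
  graceMs = 3000, // assumed grace period per step
  sleep: Sleep = realSleep,
): Promise<StopOutcome> {
  session.write("\x03"); // Ctrl+C interrupts whatever the agent is doing
  await sleep(100);
  session.write("/exit\r"); // then ask the agent CLI to exit on its own
  if (await waitForExit(session, graceMs, sleep)) return "graceful";

  session.kill("SIGTERM");
  if (await waitForExit(session, graceMs, sleep)) return "terminated";

  session.kill("SIGKILL"); // last resort; the process gets no chance to save state
  return "killed";
}
```

Injecting sleep keeps the escalation logic testable without real delays. Signals behave differently on Windows, so a cross-platform implementation would need a separate force-kill path there.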

Per-session artifacts (git diff snapshots)

On session stop and/or exit, Agents Fleet can capture a git snapshot for the session repo and store it in SQLite.

  • UI: open the Artifacts tab (next to Terminal tabs) to view changed files + diff.
  • Storage: session_artifacts table.
  • Toggle: set AGENTS_FLEET_CAPTURE_GIT_ON_END=0 to disable capture.
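
As an illustration of how a changedFiles[] list could be derived, the sketch below parses the output of git status --porcelain (v1 format) and honors the toggle above. Both helper names are hypothetical, capture is assumed to be on when the variable is unset, and the real snapshot code may gather its file list differently.

```typescript
// Assumption: capture is enabled unless the variable is exactly "0".
function shouldCaptureGit(env: Record<string, string | undefined>): boolean {
  return env.AGENTS_FLEET_CAPTURE_GIT_ON_END !== "0";
}

// Parse `git status --porcelain` (v1) output into a list of changed paths.
// Each line is "XY <path>", or "XY <old> -> <new>" for renames and copies.
// Caveat: paths with special characters are quoted by git; this sketch leaves them as-is.
function parseChangedFiles(porcelain: string): string[] {
  const files: string[] = [];
  for (const line of porcelain.split("\n")) {
    if (line.length < 4) continue; // skip blank or malformed lines
    const path = line.slice(3); // drop the two status letters and the space
    const arrow = path.indexOf(" -> ");
    files.push(arrow >= 0 ? path.slice(arrow + 4) : path);
  }
  return files;
}
```

Note that the raw output must not be trimmed before parsing, because a leading space in the status column (as in " M file") is significant.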

Resource metrics

Agents Fleet ships a scripts/metrics script that captures CPU, memory, swap, load, network I/O, disk I/O, open file descriptors, and SQLite DB size, scoped to Agents Fleet processes only.

./scripts/metrics              # one-shot pretty print
./scripts/metrics --watch      # live refresh every 3s
./scripts/metrics --log        # continuous CSV log to data/metrics_<timestamp>.csv (every 5s)

To tail the log live while a session runs:

# Terminal 1
./scripts/metrics --log

# Terminal 2
tail -f data/metrics_<timestamp>.csv

Measured profile (Apple M4 Pro, 24GB RAM):

Scenario                     | CPU%   | Memory
-----------------------------|--------|-------------------
Idle (server + vite only)    | 0–1%   | ~165MB
Claude Shell session active  | 1–4%   | ~788MB
Claude SDK tool calls firing | 3–17%  | 600–665MB
Git diff capture on stop     | 12–47% | spike, clears fast
LiteLLM / Spend Analytics    | 0–3%   | ~165MB

  • Baseline footprint is small: ~165MB and <1% CPU when no agent is running
  • Memory is almost entirely the Claude/Codex process itself, not Agents Fleet overhead
  • Git diff capture is the heaviest single event (~12% CPU typical, up to 47% if multiple sessions stop together); it lasts <10s and clears cleanly
  • No memory leaks observed: memory returns to baseline after every session exits
  • Swap usage was unchanged throughout, so Agents Fleet does not add swap pressure
  • Data dir grows ~3MB per SDK session, which is worth monitoring under frequent use

GPU utilization is not captured without sudo. On Apple Silicon (unified memory) run sudo powermetrics --samplers gpu_power separately if needed.

Scripts

  • pnpm dev:one installs deps if needed and runs dev for all workspaces (web + server).
  • pnpm dev runs dev for all workspaces (web + server) in parallel.
  • pnpm check runs lint + typecheck + test + build.
  • pnpm build builds all workspaces.
  • pnpm typecheck runs TypeScript checks across workspaces.

Tests

COREPACK_HOME="$PWD/.corepack" pnpm -C apps/server test

Notes

  • If you see Corepack cache permission errors, prefix commands with COREPACK_HOME="$PWD/.corepack", which keeps Corepack’s cache inside the repo.
  • Node version: Node 20 through 24 is supported. Node 25 and later are blocked by the engines range of @homebridge/node-pty-prebuilt-multiarch (>=18 <25). Node 22 and 24 work fine.

Data location

  • SQLite DB: data/agents_fleet.sqlite (local only; do not commit).

Known limitations

  • PTY sessions do not preserve stdout/stderr separation.
  • Token counts and costs are estimates unless the CLI provides actual usage.
  • Some TUIs (notably Claude) may clear/restore the alternate screen on exit. The persisted replay is a faithful stream replay, so end-of-session scrollback may differ from what you remember seeing just before exit.
  • No multi-line input in the terminal pane. The xterm.js terminal forwards keystrokes directly to the PTY; the shell owns the line and executes on Enter. Shift+Enter, Ctrl+Enter, etc. all behave the same as plain Enter — there is no way to insert a newline without submitting at the terminal protocol level. Workarounds inside the shell: end a line with \ for continuation, or use $'line1\nline2' quoting. Inside Claude Code's TUI specifically, Option+Enter inserts a newline in the prompt. For free-form multi-line composition, use the Claude (SDK) or LiteLLM chat tabs instead, where Shift+Enter works as expected.
