dbmcco/claude-tmux-orchestration

TMUX Agent Orchestration

Multi-agent AI development with Codex coordinating Claude via TMUX + Granite Nano

⚠️ EXPERIMENTAL - UNDER ACTIVE DEVELOPMENT ⚠️

This project is an experimental architecture for coordinating multiple AI CLI instances (Claude, Codex) via TMUX with event-driven monitoring using Granite Nano. The system is actively being developed and tested.

Current Status:

  • ✅ Core window-based orchestration (stable)
  • 🚧 Pane-aware Granite monitoring (experimental, new in v2.0)
  • 🚧 Self-bootstrapping coordinator (experimental, new in v2.0)
  • ⚠️ Testing in real-world scenarios (in progress)

Use with caution: This is experimental software. Features may change, break, or be redesigned based on real-world usage.

Overview

This is an experimental multi-agent coordination system where Codex (coordinator) manages Claude (implementer) instances via TMUX, with optional Granite Nano (Ollama) providing event-driven monitoring.

Core Concept:

  • Codex acts as the coordinator/project manager - delegates tasks, enforces quality gates, makes decisions
  • Claude acts as the implementer - writes code following TDD, runs tests, reports status
  • Granite Nano (optional) monitors Claude's output and triggers Codex when events occur
  • Quality gates enforce TDD cycles (RED → GREEN → REFACTOR)

Two Deployment Modes:

  1. Window-Based (v1.0 - Stable)

    • Each agent in separate TMUX window
    • Codex in window 1, Claude in window 0
    • Codex polls via tmux capture-pane and sleep loops
  2. Pane-Based (v2.0 - Experimental) NEW

    • Agents in split panes within same window
    • Codex in pane 0 (left), Claude in pane 1 (right)
    • Granite monitors Claude's pane asynchronously
    • Granite triggers Codex via tmux send-keys (no context waste on sleep!)
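
As a rough illustration of the pane-based layout, the sketch below splits window 0 and starts the implementer in the new right-hand pane. The function name `create_implementer_pane` and the `claude` launch command are assumptions made for this example; the project's `bootstrap-coordinator.sh` may do this differently. It also assumes tmux's default `pane-base-index` of 0.

```bash
# Hypothetical helper, not taken from the repository's scripts.
# Splits a window side by side and launches the implementer CLI on the right.
create_implementer_pane() {
  target_window="${1:-:0}"        # window to split (current session, window 0)
  implementer_cmd="${2:-claude}"  # assumed launch command for the Claude CLI

  # -h places the new pane to the right of the existing one;
  # -d keeps focus on the coordinator pane.
  tmux split-window -h -d -t "$target_window"

  # With the default pane-base-index, the new pane has index 1.
  tmux send-keys -t "${target_window}.1" "$implementer_cmd" C-m
}
```

Calling `create_implementer_pane ":0" "claude"` from the coordinator pane would leave Codex on the left and Claude on the right, matching the layout described above.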

Key Features

Stable (v1.0):

  • TDD-Enforced Development: Manager blocks implementation-before-test attempts
  • Quality Gate Automation: ESLint, TypeScript, test coverage, E2E validation
  • Multi-Agent Coordination: Parallel agent deployment with domain separation
  • Human-in-the-Loop: Critical decision points escalated to developers
  • State Monitoring: Manager tracks agent progress via TMUX capture-pane

Experimental (v2.0-dev):

  • 🚧 Pane-Aware Monitoring: Split panes within same window for side-by-side coordination
  • 🚧 Granite Nano Integration: Event-driven monitoring with Ollama (no context waste on sleep)
  • 🚧 Self-Bootstrap: Codex creates its own infrastructure from single pane
  • 🚧 Async Triggering: Granite detects events and triggers coordinator via tmux send-keys

Architecture

Window-Based Mode (v1.0 - Stable)

MacBook (Development Machine)
└── tmux session: project
    ├── window 0: Claude (implementer)
    │   └── Role: Write tests first, implement code following TDD
    │
    ├── window 1: Codex (coordinator)
    │   └── Role: Delegate tasks, enforce quality gates, manage git
    │
    └── window 2: Claude (tester)
        └── Role: Run E2E tests on deployments

Codex uses synchronous monitoring (sleep + tmux capture-pane)
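
The synchronous pattern can be sketched as a small polling loop. This is a minimal illustration, assuming a hypothetical helper named `poll_until_match`; it is not the repository's `agent-orchestrator.sh monitor` implementation.

```bash
# Hypothetical polling helper illustrating "sleep + tmux capture-pane".
# Returns 0 once the pattern appears in the target, 1 if attempts run out.
poll_until_match() {
  target="$1"      # e.g. ":0" for window 0 of the current session
  pattern="$2"     # extended regex to look for in the captured text
  interval="$3"    # seconds to sleep between polls
  attempts="$4"    # maximum number of polls

  while [ "$attempts" -gt 0 ]; do
    # -p prints the capture to stdout; -S -50 also includes up to
    # 50 lines of scrollback above the visible area.
    if tmux capture-pane -p -t "$target" -S -50 | grep -Eq "$pattern"; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep "$interval"
  done
  return 1
}
```

For example, `poll_until_match ":0" "tests passed" 90 10` would check window 0 every 90 seconds, up to ten times. The coordinator is blocked for the whole wait, which is the cost the pane-based mode tries to avoid.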

Pane-Based Mode (v2.0 - Experimental) ⭐ NEW

MacBook (Development Machine)
└── tmux session: project
    └── window 0 (split horizontally)
        ├── pane 0 (left): Codex (coordinator)
        │   └── Role: Delegate tasks, wait for Granite triggers
        │
        └── pane 1 (right): Claude (implementer)
            └── Role: Write tests, implement code, report status

Background daemon: Granite Nano Monitor
  - Polls Claude's pane every 30-60 seconds
  - Sends output to Granite Nano (Ollama)
  - Detects: tests passed, errors, questions, completion
  - Triggers Codex: tmux send-keys "review-implementer-output 1" C-m

Codex goes idle (no sleep!), woken only when events occur.
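
One monitoring pass might look like the sketch below. To keep the example runnable without Ollama, `classify_output` uses plain keyword matching as a stand-in for the Granite Nano call; that substitution, the function names, and the keyword list are all assumptions for illustration.

```bash
# Hypothetical stand-in for the Granite call: simple keyword matching.
classify_output() {
  case "$1" in
    *"tests passed"*|*"All tests pass"*) echo "COMPLETED" ;;
    *[Ee]rror*|*FAIL*)                   echo "ERROR" ;;
    *"?"*)                               echo "QUESTION" ;;
    *)                                   echo "NONE" ;;
  esac
}

# One pass: capture the implementer pane, classify it, trigger if needed.
monitor_once() {
  implementer_pane="$1"   # e.g. ":0.1"
  coordinator_pane="$2"   # e.g. ":0.0"

  output=$(tmux capture-pane -p -t "$implementer_pane" -S -50)
  event=$(classify_output "$output")

  if [ "$event" != "NONE" ]; then
    # Types the handler command into the coordinator pane and presses Enter.
    # ${implementer_pane##*.} extracts the pane index, e.g. "1" from ":0.1".
    tmux send-keys -t "$coordinator_pane" \
      "review-implementer-output ${implementer_pane##*.}" C-m
  fi
  echo "$event"
}
```

A daemon would wrap this in a loop such as `while true; do monitor_once ":0.1" ":0.0"; sleep 60; done`, so only the monitor sleeps and the coordinator stays idle.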

Agent Roles

Codex (Coordinator)

Window Mode: In window 1

Pane Mode: In pane 0 (left side)

  • Delegates tasks to Claude implementers
  • Enforces TDD: RED → GREEN → REFACTOR
  • Runs quality gates before commits
  • Manages git operations (add, commit, push)
  • Constraint: Does NOT write implementation code - only coordinates
  • Monitoring: Window mode = sleep + poll, Pane mode = event-driven (Granite triggers)

Claude (Implementer)

Window Mode: In window 0

Pane Mode: In pane 1 (right side)

  • Follows strict TDD workflow
  • Writes failing tests first, then minimal implementation
  • Runs tests and reports status
  • Requests quality gate validation
  • Constraint: Cannot commit without Codex approval

Granite Nano (Monitor) 🚧 EXPERIMENTAL - Pane Mode Only

Deployment: Background daemon process

  • Polls Claude's pane/window every 30-60 seconds
  • Sends output to Ollama (granite4:350m model)
  • Detects events: tests passed, errors, questions, blocked
  • Pane Mode: Triggers Codex via tmux send-keys
  • Window Mode: Displays alerts via tmux display-message
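
The model call itself could be as small as the following sketch, which uses the `ollama run` CLI. The prompt wording and the function names `build_prompt` and `ask_granite` are assumptions; the actual `tmux-monitor-granite.sh` may phrase the prompt and parse the reply differently.

```bash
MODEL="${MODEL:-granite4:350m}"

# Builds the classification prompt (the wording here is an assumption).
build_prompt() {
  printf 'Classify this terminal output as exactly one word: COMPLETED, ERROR, BLOCKED, QUESTION, or NONE.\n\n%s\n' "$1"
}

# Sends the prompt to the local model and normalizes the reply to one label.
ask_granite() {
  reply=$(ollama run "$MODEL" "$(build_prompt "$1")" \
    | awk 'NF { print; exit }' \
    | tr '[:lower:]' '[:upper:]' \
    | tr -cd 'A-Z')
  case "$reply" in
    COMPLETED|ERROR|BLOCKED|QUESTION) echo "$reply" ;;
    *) echo "NONE" ;;   # anything unexpected is treated as "no event"
  esac
}
```

Only the first non-empty line of the reply is considered, and any reply that is not exactly one of the four labels is treated as no event, so a chatty answer from a small model never triggers the coordinator by accident.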

Workflow Examples

Pane-Based Mode (Event-Driven) 🚧 EXPERIMENTAL

# ONE-TIME SETUP (Codex does this)
./scripts/bootstrap-coordinator.sh          # Creates pane 1, launches Claude
source .tmux-monitor/handler-function.sh    # Loads event handler

# DELEGATION (Codex → Claude)
tmux send-keys -t :0.1 "Implement user authentication with TDD" C-m

# Codex goes IDLE - no sleep!
# Granite monitors Claude's pane every 60s

# EVENT TRIGGERED (Granite → Codex)
# Granite detects "tests passed" in Claude's output
# Granite sends: review-implementer-output 1<C-m> to Codex pane
# Handler runs automatically:
#
# 📊 GRANITE ALERT from pane 1
# Event: COMPLETED
# Details: tests passed; review needed
# [Shows Claude's output]

# CODEX ACTS
./scripts/agent-orchestrator.sh validate   # Quality gates
./scripts/agent-orchestrator.sh commit "feat: user authentication"

# Next task
tmux send-keys -t :0.1 "Add password reset" C-m
# Repeat cycle...
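
For orientation, a handler along these lines could produce the alert shown above. This is only a sketch: the single-line `EVENT|details` layout of `latest-event.txt` is an assumption, and the function is named `review_implementer_output` with underscores for portability (bash would also accept the hyphenated name used in the workflow).

```bash
EVENT_FILE="${EVENT_FILE:-.tmux-monitor/latest-event.txt}"

# Hypothetical handler: prints the latest event, then the pane's recent output.
review_implementer_output() {
  pane="$1"   # pane index, e.g. 1

  if [ ! -r "$EVENT_FILE" ]; then
    echo "no event recorded in $EVENT_FILE" >&2
    return 1
  fi

  # Assumed file format: EVENT|free-text details
  IFS='|' read -r event details < "$EVENT_FILE"

  echo "GRANITE ALERT from pane $pane"
  echo "Event: $event"
  echo "Details: $details"

  # Show the implementer's recent output for review.
  tmux capture-pane -p -t ":0.$pane" -S -30
}
```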

Window-Based Mode (Polling)

# SETUP
./scripts/tmux-spawn-session.sh --attach   # Creates 3 windows

# DELEGATION (Codex in window 1 → Claude in window 0)
./scripts/agent-orchestrator.sh brief "Implement user authentication following TDD"

# MONITORING (Codex polls)
sleep 90  # Codex waits
./scripts/agent-orchestrator.sh monitor 0  # Check Claude's progress

# QUALITY GATES (Codex validates)
./scripts/agent-orchestrator.sh validate

# COMMIT (Codex executes)
./scripts/agent-orchestrator.sh commit "feat: user authentication"

Quality Gates

Every commit must pass:

  1. ESLint: 0 errors, 0 warnings
  2. TypeScript: 0 errors (strict mode)
  3. Unit Tests: All tests pass
  4. Coverage: >80% test coverage

Manager blocks commits if ANY gate fails.
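
The "any gate fails" rule can be sketched as a short chain of checks. The helper names below are hypothetical, the npm script names follow the configuration example in this README, and `--max-warnings 0` assumes the `lint` script invokes ESLint directly; the real `quality-gates.sh` may differ.

```bash
# Runs one gate: prints PASS/FAIL and propagates the command's exit status.
run_gate() {
  name="$1"; shift
  if "$@"; then
    echo "PASS $name"
  else
    echo "FAIL $name"
    return 1
  fi
}

# True only when coverage is strictly above 80 (accepts decimals like 80.5).
coverage_ok() {
  awk -v c="$1" 'BEGIN { exit !(c + 0 > 80) }'
}

# Stops at the first failing gate, so a commit is blocked if ANY gate fails.
run_all_gates() {
  coverage_pct="$1"   # assumed to be parsed from the coverage report elsewhere
  run_gate "eslint"     npm run lint -- --max-warnings 0 &&
  run_gate "typescript" npm run type-check &&
  run_gate "tests"      npm test &&
  run_gate "coverage"   coverage_ok "$coverage_pct"
}
```

A coordinator could then guard the commit with something like `run_all_gates 85 && git commit -m "feat: user authentication"`.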

TDD Enforcement

Manager strictly enforces the TDD cycle:

RED Phase

Manager: "Write a failing test for [feature]. Show me RED output."
Implementer: [writes test, runs it, shows failure]
Manager: [validates test fails for correct reason] "Proceed to GREEN."

GREEN Phase

Manager: "Implement minimal code to make test pass. Show me GREEN."
Implementer: [writes code, runs test, shows pass]
Manager: [validates test passes] "Proceed to REFACTOR if needed."

REFACTOR Phase

Manager: "Refactor for clarity. Ensure tests stay GREEN."
Implementer: [improves code, reruns tests]
Manager: [validates tests still pass] "Run quality gates."

Intervention Protocol:

  • If implementer writes code before test → STOP IMMEDIATELY
  • If implementer wants to commit without quality gates → BLOCK
  • Manager enforces TDD without exceptions
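
A mechanical version of the RED and GREEN checks might look like this. It is a sketch with hypothetical helper names, and it only inspects the test command's exit status; confirming that a test fails "for the correct reason" still needs the manager to read the output.

```bash
# RED: the new test must fail before any implementation exists.
expect_red() {
  if "$@" >/dev/null 2>&1; then
    echo "STOP: test passed before implementation"
    return 1
  fi
  echo "RED confirmed"
}

# GREEN: the same test command must now succeed.
expect_green() {
  if "$@" >/dev/null 2>&1; then
    echo "GREEN confirmed"
  else
    echo "STOP: test still failing"
    return 1
  fi
}
```

Usage would be along the lines of `expect_red npm test` before the implementation is written and `expect_green npm test` afterwards.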

Quick Start

Pane-Based Mode (Recommended for Codex) 🚧 EXPERIMENTAL

Prerequisites:

  • Ollama with Granite Nano: ollama pull granite4:350m
  • Codex CLI installed
  • Claude CLI installed

3-Step Setup:

# 1. Start tmux and Codex
tmux
codex

# 2. In Codex, bootstrap infrastructure (creates Claude pane, starts Granite monitor)
cd /path/to/tmux-agent-orchestration   # your local clone of this repository
./scripts/bootstrap-coordinator.sh
source .tmux-monitor/handler-function.sh

# 3. Delegate task to Claude
tmux send-keys -t :0.1 "Implement user auth with TDD" C-m

# Codex goes idle - Granite wakes it when Claude has news!

See: QUICKSTART-PANE-AWARE.md for complete guide.

Window-Based Mode (Stable)

Prerequisites:

  • TMUX installed
  • Git version control
  • Node.js and npm (for JavaScript/TypeScript projects)

Setup:

# Clone and setup
git clone https://github.com/dbmcco/tmux-agent-orchestration.git
cd tmux-agent-orchestration
./setup.sh
./validate-setup.sh

# Create TMUX session with agents in separate windows
./scripts/tmux-spawn-session.sh --attach

See: INSTALLATION.md for complete guide.

Detailed Installation

Prerequisites

Required:

  • TMUX (brew install tmux on macOS)
  • Git
  • Bash shell

For Pane Mode (Experimental):

  • Codex CLI
  • Claude CLI
  • Ollama with Granite Nano: ollama pull granite4:350m

For Window Mode (Stable):

  • Claude CLI (or any AI CLI)
  • Node.js and npm (for quality gates in JS/TS projects)

Installation Steps

# Clone repository
git clone https://github.com/dbmcco/tmux-agent-orchestration.git
cd tmux-agent-orchestration

# Run interactive setup (window mode)
./setup.sh

# Validate installation
./validate-setup.sh

Full installation guide: See INSTALLATION.md

Prerequisites

  • TMUX installed (brew install tmux on macOS)
  • Git version control
  • Node.js and npm (optional, for JavaScript/TypeScript projects)
  • Claude Code CLI (optional, for AI agent integration)

Usage

Orchestration Commands

The agent orchestrator provides a simple CLI for coordinating agents:

# Brief implementer with task
./scripts/agent-orchestrator.sh brief "Implement user authentication following TDD"

# Monitor implementer progress
./scripts/agent-orchestrator.sh monitor 0

# Validate quality gates
./scripts/agent-orchestrator.sh validate

# Execute commit (after quality gates pass)
./scripts/agent-orchestrator.sh commit "feat: add user authentication"

# Request E2E tests
./scripts/agent-orchestrator.sh e2e https://your-deployment.com

# Show help
./scripts/agent-orchestrator.sh help

TMUX Session Management

# One-shot bootstrap for manager window + Granite monitor
./scripts/manager-bootstrap.sh

# (Optional) recreate session manually
./scripts/tmux-spawn-session.sh --attach

# Manual snapshot of implementer output (Granite monitor runs automatically)
./scripts/tmux-monitor.sh 0 50

# Delegate command to window
./scripts/tmux-delegate.sh 0 "npm test"

Autonomous Monitoring (Granite + Ollama)

  • scripts/tmux-monitor-granite.sh captures the latest implementer output, sends each chunk to the local Granite 4 (Ollama) model, and surfaces ALERT: summaries in the manager window plus .tmux-monitor/alerts.log.
  • The watcher deduplicates identical snippets, rate-limits payloads, emits heartbeats (.tmux-monitor/monitor.heartbeat), and automatically restarts via scripts/manager-bootstrap.sh if the heartbeat stalls.
  • Granite outages trigger explicit [monitor] warnings and exponential backoff so the manager knows to investigate Ollama before delegating more work.
  • View the rolling log with tail -f .tmux-monitor/alerts.log; restart the watcher with the bootstrap script if you see [monitor] Granite response empty or failed.
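
The deduplication, backoff, and heartbeat behaviours described above can each be expressed in a few lines. The sketch below is illustrative only: the function names and the `last-snippet.cksum` state file are assumptions, while `monitor.heartbeat` is the file named in this README.

```bash
STATE_DIR="${STATE_DIR:-.tmux-monitor}"

# Returns 0 (and records the checksum) only if the snippet differs from the last one.
is_new_snippet() {
  new_sum=$(printf '%s' "$1" | cksum)
  last_file="$STATE_DIR/last-snippet.cksum"   # hypothetical state file
  if [ -f "$last_file" ] && [ "$(cat "$last_file")" = "$new_sum" ]; then
    return 1
  fi
  printf '%s\n' "$new_sum" > "$last_file"
}

# Exponential backoff: doubles the delay, capped at a maximum.
next_backoff() {
  doubled=$(( $1 * 2 ))
  if [ "$doubled" -gt "$2" ]; then echo "$2"; else echo "$doubled"; fi
}

# Records the current time as the last sign of life.
write_heartbeat() {
  date +%s > "$STATE_DIR/monitor.heartbeat"
}

# Returns 0 when the heartbeat is missing or older than max_age seconds.
heartbeat_stalled() {
  max_age="$1"
  [ -f "$STATE_DIR/monitor.heartbeat" ] || return 0
  last=$(cat "$STATE_DIR/monitor.heartbeat")
  now=$(date +%s)
  [ $(( now - last )) -gt "$max_age" ]
}
```

A supervisor could call `heartbeat_stalled 180` periodically and rerun the bootstrap script when it returns success.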

See AGENTS.md for the full manager/implementer playbook.

Working from another repo? Export TMUX_MANAGER_CONFIG=/absolute/path/to/custom-config.sh before invoking any orchestration scripts (including scripts/tmux-spawn-session.sh or scripts/manager-bootstrap.sh). This allows wrappers like work/lfw/lfw-draftforge-v1/scripts/dev-tools/run-granite-monitor.sh to bootstrap sessions with their own defaults without copying config files into this repository.
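
A script might honour this override with a fallback like the one sketched here. The function names are hypothetical, and how the repository's scripts actually resolve their configuration is not shown in this README.

```bash
# Prefers TMUX_MANAGER_CONFIG when set; otherwise uses the given default path.
resolve_config() {
  default_config="$1"
  echo "${TMUX_MANAGER_CONFIG:-$default_config}"
}

# Sources the resolved config into the current shell, failing loudly if missing.
load_config() {
  config_path=$(resolve_config "$1")
  if [ ! -r "$config_path" ]; then
    echo "config not found: $config_path" >&2
    return 1
  fi
  . "$config_path"
}
```

With `export TMUX_MANAGER_CONFIG=/absolute/path/to/custom-config.sh` in place, `load_config config/config.sh` would read the custom file and ignore the in-repo default.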

Git Hook Integration

The setup script can install pre-commit and post-commit hooks:

  • pre-commit.sh: Runs quality gates before every commit
  • post-commit.sh: Logs commit information and triggers post-commit workflows
  • quality-gates.sh: Validates ESLint, TypeScript, tests, and coverage

Hooks are installed in .claude/hooks/ and linked to .git/hooks/.
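
Linking a hook by hand would look roughly like this. The helper name `link_hook` is an assumption; the only fixed detail is that Git expects hook files named without an extension, such as `pre-commit` and `post-commit`.

```bash
# Links <hooks_dir>/<name>.sh to <git_dir>/hooks/<name>.
link_hook() {
  hooks_dir="$1"   # e.g. .claude/hooks
  git_dir="$2"     # e.g. .git
  name="$3"        # pre-commit or post-commit

  chmod +x "$hooks_dir/$name.sh"

  # Use an absolute source path so the link resolves from inside .git/hooks.
  abs_hooks_dir=$(cd "$hooks_dir" && pwd)
  ln -sf "$abs_hooks_dir/$name.sh" "$git_dir/hooks/$name"
}
```

For example, `link_hook .claude/hooks .git pre-commit` makes Git run `pre-commit.sh` before every commit.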

Configuration

Project Structure

tmux-agent-orchestration/
├── config/
│   ├── config.sh.example          # Configuration template (window mode)
│   └── config.sh                  # Your configuration (created by setup)
│
├── hooks/
│   ├── quality-gates.sh           # Quality gate enforcement
│   ├── pre-commit.sh              # Pre-commit validation
│   └── post-commit.sh             # Post-commit workflow
│
├── scripts/
│   ├── bootstrap-coordinator.sh   # 🚧 EXPERIMENTAL: Self-bootstrap pane mode
│   ├── register-coordinator.sh    # 🚧 EXPERIMENTAL: Pane registry setup
│   ├── manager-bootstrap.sh       # Start Granite monitor daemon
│   ├── tmux-monitor-granite.sh    # Granite/Ollama monitoring (window & pane)
│   ├── tmux-monitor.sh            # Manual output snapshot
│   ├── tmux-delegate.sh           # Send commands to windows
│   ├── tmux-spawn-session.sh      # Create multi-window session
│   ├── agent-orchestrator.sh      # Window mode coordination
│   └── verify-discovery.sh        # Test script discovery
│
├── tests/
│   ├── test-tmux-basic.sh         # Basic functionality tests
│   ├── test-monitor-granite.sh    # Granite watcher tests
│   └── test-pane-aware-mode.sh    # 🚧 Pane mode tests
│
├── .tmux-monitor/                 # Created at runtime
│   ├── pane-registry.txt          # Pane layout mapping (pane mode)
│   ├── handler-function.sh        # Event handler for Codex
│   ├── latest-event.txt           # Current event from Granite
│   └── alerts.log                 # Full audit trail
│
├── setup.sh                       # Interactive setup (window mode)
├── validate-setup.sh              # Installation validator
├── AGENTS.md                  # ✅ Agent operating handbook
├── README.md                  # This file
├── INSTALLATION.md            # ✅ Detailed installation guide
├── TESTING.md                 # ✅ Testing procedures
└── TMUX_CLAUDE_CICD_ARCHITECTURE.md  # Architecture documentation

Legend: ✅ = Implemented and working, 🚧 = Experimental

Configuration Example

The config/config.sh file controls all behavior:

# Project Configuration
PROJECT_NAME="your-project-name"
PROJECT_ROOT="/path/to/your/project"

# TMUX Configuration
TMUX_SESSION_NAME="${PROJECT_NAME}-dev"
TMUX_IMPLEMENTER_WINDOW="0"
TMUX_MANAGER_WINDOW="1"
TMUX_TESTING_WINDOW="2"

# Quality Gate Configuration
ENABLE_ESLINT=true
ENABLE_TYPESCRIPT=true
ENABLE_TESTS=true
ENABLE_COVERAGE=true

# NPM Scripts (match your package.json)
LINT_SCRIPT="lint"
TYPE_CHECK_SCRIPT="type-check"
TEST_SCRIPT="test"
COVERAGE_SCRIPT="test:coverage"
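
To show how these switches might be consumed, the sketch below turns the enabled flags into the npm commands that would run. The function name `enabled_gate_commands` is hypothetical, and the real `quality-gates.sh` may read the configuration differently.

```bash
# Prints one npm command per enabled gate, in the order they would run.
# Expects the ENABLE_* and *_SCRIPT variables from config.sh to be loaded.
enabled_gate_commands() {
  if [ "$ENABLE_ESLINT" = true ];     then echo "npm run $LINT_SCRIPT"; fi
  if [ "$ENABLE_TYPESCRIPT" = true ]; then echo "npm run $TYPE_CHECK_SCRIPT"; fi
  if [ "$ENABLE_TESTS" = true ];      then echo "npm run $TEST_SCRIPT"; fi
  if [ "$ENABLE_COVERAGE" = true ];   then echo "npm run $COVERAGE_SCRIPT"; fi
}
```

Setting `ENABLE_TYPESCRIPT=false` in a plain JavaScript project would simply drop the type-check command from the list.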

Testing Status

Implementation Complete ✅

  • Architecture documented - Complete specification in TMUX_CLAUDE_CICD_ARCHITECTURE.md
  • Working scripts extracted - Quality gates, hooks, and orchestration from lfw-draftforge-v1
  • Scripts generalized - Configuration system for any project
  • Setup automation - Interactive installer and validator
  • Documentation complete - Installation, testing, and usage guides
  • Test suite created - Basic functionality validation

Testing Available

Run comprehensive testing:

# Validate installation
./validate-setup.sh

# Run automated tests
./tests/test-tmux-basic.sh

# Follow manual testing guide
cat TESTING.md

Success Metrics

  • Installation: Automated setup with validation
  • Quality Gates: ESLint, TypeScript, tests, coverage enforcement
  • TMUX Coordination: Session creation, monitoring, delegation
  • Git Integration: Pre-commit and post-commit hooks
  • Configuration: Flexible config system for any project
  • Production Use: Ready for real-world testing

Roadmap

Phase 1: Core Window-Based Orchestration ✅ (v1.0 - Stable)

  • ✅ Extract working scripts from lfw-draftforge-v1
  • ✅ Generalize for any project with configuration system
  • ✅ Create orchestration and coordination scripts
  • ✅ Implement setup and validation automation
  • ✅ Complete documentation (installation, testing, usage)

Phase 2: Pane-Aware Event-Driven Monitoring 🚧 (v2.0-dev - Experimental)

  • ✅ Granite Nano integration for event detection
  • ✅ Self-bootstrapping coordinator infrastructure
  • ✅ Pane-aware monitoring with tmux send-keys triggers
  • ✅ Structured event handling (COMPLETED, ERROR, BLOCKED, QUESTION)
  • 🚧 Real-world testing and refinement
  • 🚧 Performance optimization and error handling
  • 🚧 Documentation based on actual usage patterns
  • ⏳ Stability validation before v2.0 release

Phase 3: Production Hardening (Future)

  • ⏳ Comprehensive integration testing
  • ⏳ Error recovery and fault tolerance
  • ⏳ Performance benchmarking
  • ⏳ Production deployment guide
  • ⏳ User feedback incorporation

Phase 4: Advanced Features (Future)

  • 🔮 Multi-pane coordination (3+ agents)
  • 🔮 Journal MCP integration for cross-session learning
  • 🔮 Deployment webhooks and E2E triggers
  • 🔮 Agent coordination patterns library

Contributing

This is an experimental architecture project. Contributions welcome for:

  • Testing the architecture with different project types
  • Improving TMUX coordination scripts
  • Enhancing quality gate enforcement
  • Adding new agent coordination patterns

Credits

Architecture Design: Braydon Fuller + Claude Code CLI

Inspiration:

  • Claude Code CLI by Anthropic
  • obra/superpowers - Agent coordination patterns
  • Claude's sub-agent and output style system
  • TDD and quality gate best practices

License

MIT License - See LICENSE file for details

Architectural Questions

This project explores several open questions:

  1. CI/CD Triggers: Manual vs Git Hooks vs Hybrid?
  2. Agent Persistence: Long-running vs On-demand vs Hybrid?
  3. Deployment Workflow: Polling vs Webhooks vs Hybrid?
  4. Error Recovery: Auto-retry vs Immediate escalation vs Tiered?
  5. State Management: File-based vs Logs vs Database?

See TMUX_CLAUDE_CICD_ARCHITECTURE.md for detailed analysis of each.

Status: ✅ Implementation Complete - Extracted from production, ready for testing

Version: 1.0 (October 2025)

Source: Extracted from lfw-draftforge-v1 production environment
