Multi-agent AI development with Codex coordinating Claude via TMUX + Granite Nano
This project is an experimental architecture for coordinating multiple AI CLI instances (Claude, Codex) via TMUX with event-driven monitoring using Granite Nano. The system is actively being developed and tested.
Current Status:
- ✅ Core window-based orchestration (stable)
- 🚧 Pane-aware Granite monitoring (experimental, new in v2.0)
- 🚧 Self-bootstrapping coordinator (experimental, new in v2.0)
- ⚠️ Testing in real-world scenarios (in progress)
Use with caution: This is experimental software. Features may change, break, or be redesigned based on real-world usage.
This is an experimental multi-agent coordination system where Codex (coordinator) manages Claude (implementer) instances via TMUX, with optional Granite Nano (Ollama) providing event-driven monitoring.
Core Concept:
- Codex acts as the coordinator/project manager - delegates tasks, enforces quality gates, makes decisions
- Claude acts as the implementer - writes code following TDD, runs tests, reports status
- Granite Nano (optional) monitors Claude's output and triggers Codex when events occur
- Quality gates enforce TDD cycles (RED → GREEN → REFACTOR)
Two Deployment Modes:
- Window-Based (v1.0 - Stable)
  - Each agent in separate TMUX window
  - Codex in window 1, Claude in window 0
  - Codex polls via tmux capture-pane and sleep loops
- Pane-Based (v2.0 - Experimental) ⭐ NEW
  - Agents in split panes within same window
  - Codex in pane 0 (left), Claude in pane 1 (right)
  - Granite monitors Claude's pane asynchronously
  - Granite triggers Codex via tmux send-keys (no context waste on sleep!)
Stable (v1.0):
- TDD-Enforced Development: Manager blocks implementation-before-test attempts
- Quality Gate Automation: ESLint, TypeScript, test coverage, E2E validation
- Multi-Agent Coordination: Parallel agent deployment with domain separation
- Human-in-the-Loop: Critical decision points escalated to developers
- State Monitoring: Manager tracks agent progress via TMUX capture-pane
Experimental (v2.0-dev):
- 🚧 Pane-Aware Monitoring: Split panes within same window for side-by-side coordination
- 🚧 Granite Nano Integration: Event-driven monitoring with Ollama (no context waste on sleep)
- 🚧 Self-Bootstrap: Codex creates its own infrastructure from single pane
- 🚧 Async Triggering: Granite detects events and triggers coordinator via tmux send-keys
MacBook (Development Machine)
└── tmux session: project
    ├── window 0: Claude (implementer)
    │   └── Role: Write tests first, implement code following TDD
    │
    ├── window 1: Codex (coordinator)
    │   └── Role: Delegate tasks, enforce quality gates, manage git
    │
    └── window 2: Claude (tester)
        └── Role: Run E2E tests on deployments
Codex uses synchronous monitoring (sleep + tmux capture-pane)
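The synchronous loop can be sketched roughly as follows. This is a minimal illustration, not the project's agent-orchestrator.sh: the function name poll_until, its arguments, and the example pattern are hypothetical. The capture command is passed in as arguments so the loop can be exercised without a live tmux session.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of window-mode polling: run a capture command every
# INTERVAL seconds until its output matches PATTERN or MAX_TRIES is reached.
# Usage: poll_until PATTERN INTERVAL MAX_TRIES CAPTURE_COMMAND [ARGS...]
poll_until() {
  local pattern="$1" interval="$2" max_tries="$3"
  shift 3
  local try=1 output
  while [ "$try" -le "$max_tries" ]; do
    # A failed capture (e.g. window not created yet) counts as "no match".
    output=$("$@" 2>/dev/null) || output=""
    if printf '%s\n' "$output" | grep -Eq -- "$pattern"; then
      printf '%s\n' "$output"
      return 0
    fi
    # Sleep only between attempts, not after the last one.
    if [ "$try" -lt "$max_tries" ]; then
      sleep "$interval"
    fi
    try=$((try + 1))
  done
  return 1
}
```

From the coordinator window this could be called as `poll_until 'passed' 90 10 tmux capture-pane -p -t project:0 -S -50`, which checks the last 50 lines of window 0 every 90 seconds. Every iteration costs the coordinator a turn, which is the overhead the pane-based mode tries to avoid.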
MacBook (Development Machine)
└── tmux session: project
    └── window 0 (split horizontally)
        ├── pane 0 (left): Codex (coordinator)
        │   └── Role: Delegate tasks, wait for Granite triggers
        │
        └── pane 1 (right): Claude (implementer)
            └── Role: Write tests, implement code, report status
Background daemon: Granite Nano Monitor
- Polls Claude's pane every 30-60 seconds
- Sends output to Granite Nano (Ollama)
- Detects: tests passed, errors, questions, completion
- Triggers Codex: tmux send-keys "review-implementer-output 1" C-m
Codex goes idle (no sleep!), woken only when events occur.
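The monitor side of this flow might look something like the sketch below. It is not the project's tmux-monitor-granite.sh: classify_event is a keyword-based stand-in for the Granite call, its patterns are guesses, and the pane targets (:0.0 for Codex, :0.1 for Claude) are assumptions based on the layout above. The event names are the ones this README lists (COMPLETED, ERROR, BLOCKED, QUESTION).

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the Granite call: classify captured pane text
# with keyword heuristics. A real model handles ambiguous output far better
# (e.g. "0 failed" would be misread as an error here).
classify_event() {
  local text="$1"
  if printf '%s\n' "$text" | grep -Eiq 'error|fail(ed|ure)?'; then
    echo "ERROR"
  elif printf '%s\n' "$text" | grep -Eiq 'blocked|cannot proceed'; then
    echo "BLOCKED"
  elif printf '%s\n' "$text" | grep -Eq '\?[[:space:]]*$'; then
    echo "QUESTION"
  elif printf '%s\n' "$text" | grep -Eiq 'tests? passed|all tests pass'; then
    echo "COMPLETED"
  else
    echo "NONE"
  fi
}

# Wake the coordinator by typing the handler command into its pane.
trigger_coordinator() {
  local implementer_pane="$1" target="${2:-:0.0}"
  tmux send-keys -t "$target" "review-implementer-output $implementer_pane" C-m
}

# One possible monitor loop (defined here, not started by this sketch).
# It only classifies when the captured text has changed since the last poll.
monitor_loop() {
  local pane="${1:-:0.1}" interval="${2:-60}"
  local snapshot last="" event
  while true; do
    snapshot=$(tmux capture-pane -p -t "$pane" -S -50)
    if [ "$snapshot" != "$last" ]; then
      last="$snapshot"
      event=$(classify_event "$snapshot")
      if [ "$event" != "NONE" ]; then
        trigger_coordinator "${pane##*.}"
      fi
    fi
    sleep "$interval"
  done
}
```

Running `monitor_loop :0.1 60 &` from a shell inside the session would start the loop in the background. The coordinator spends no turns on waiting and only sees input when the classifier reports an event.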
Codex (Coordinator) - Window Mode: In window 1; Pane Mode: In pane 0 (left side)
- Delegates tasks to Claude implementers
- Enforces TDD: RED → GREEN → REFACTOR
- Runs quality gates before commits
- Manages git operations (add, commit, push)
- Constraint: Does NOT write implementation code - only coordinates
- Monitoring: Window mode = sleep + poll, Pane mode = event-driven (Granite triggers)
Claude (Implementer) - Window Mode: In window 0; Pane Mode: In pane 1 (right side)
- Follows strict TDD workflow
- Writes failing tests first, then minimal implementation
- Runs tests and reports status
- Requests quality gate validation
- Constraint: Cannot commit without Codex approval
Granite Nano (Monitor) - Deployment: Background daemon process
- Polls Claude's pane/window every 30-60 seconds
- Sends output to Ollama (granite4:350m model)
- Detects events: tests passed, errors, questions, blocked
- Pane Mode: Triggers Codex via tmux send-keys
- Window Mode: Displays alerts via tmux display-message
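On the receiving side, the command typed into the Codex pane has to turn the recorded event into something readable. The sketch below shows one way that could work. It is not the contents of .tmux-monitor/handler-function.sh: the function name show_granite_alert is hypothetical, and the file layout (event name on line 1 of latest-event.txt, details on line 2) is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical event handler for the coordinator pane.
# Usage: show_granite_alert PANE [EVENT_FILE]
show_granite_alert() {
  local pane="$1"
  local event_file="${2:-.tmux-monitor/latest-event.txt}"
  local event details
  if [ ! -r "$event_file" ]; then
    echo "No event recorded for pane $pane" >&2
    return 1
  fi
  # Assumed layout: line 1 = event name, line 2 = free-text details.
  event=$(sed -n '1p' "$event_file")
  details=$(sed -n '2p' "$event_file")
  echo "📊 GRANITE ALERT from pane $pane"
  echo "Event: $event"
  echo "Details: $details"
  # Show the implementer's recent output only when running inside tmux.
  if [ -n "${TMUX:-}" ]; then
    tmux capture-pane -p -t ":0.$pane" -S -30
  fi
}
```

A wrapper with the name the monitor sends (review-implementer-output) could simply call this function with the pane number it receives.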
# ONE-TIME SETUP (Codex does this)
./scripts/bootstrap-coordinator.sh # Creates pane 1, launches Claude
source .tmux-monitor/handler-function.sh # Loads event handler
# DELEGATION (Codex → Claude)
tmux send-keys -t :0.1 "Implement user authentication with TDD" C-m
# Codex goes IDLE - no sleep!
# Granite monitors Claude's pane every 60s
# EVENT TRIGGERED (Granite → Codex)
# Granite detects "tests passed" in Claude's output
# Granite sends: review-implementer-output 1<C-m> to Codex pane
# Handler runs automatically:
#
# 📊 GRANITE ALERT from pane 1
# Event: COMPLETED
# Details: tests passed; review needed
# [Shows Claude's output]
# CODEX ACTS
./scripts/agent-orchestrator.sh validate # Quality gates
./scripts/agent-orchestrator.sh commit "feat: user authentication"
# Next task
tmux send-keys -t :0.1 "Add password reset" C-m
# Repeat cycle...
# SETUP
./scripts/tmux-spawn-session.sh --attach # Creates 3 windows
# DELEGATION (Codex in window 1 → Claude in window 0)
./scripts/agent-orchestrator.sh brief "Implement user authentication following TDD"
# MONITORING (Codex polls)
sleep 90 # Codex waits
./scripts/agent-orchestrator.sh monitor 0 # Check Claude's progress
# QUALITY GATES (Codex validates)
./scripts/agent-orchestrator.sh validate
# COMMIT (Codex executes)
./scripts/agent-orchestrator.sh commit "feat: user authentication"
Every commit must pass:
- ESLint: 0 errors, 0 warnings
- TypeScript: 0 errors (strict mode)
- Unit Tests: All tests pass
- Coverage: >80% test coverage
Manager blocks commits if ANY gate fails.
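A fail-fast gate runner along these lines is one way to get that behaviour. This is a sketch under assumptions, not the project's quality-gates.sh: the function names are hypothetical, the npm script names fall back to the config.sh defaults shown in this README, and the coverage percentage is assumed to be extracted elsewhere and passed in as a number.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of fail-fast quality gates.

# run_gate NAME COMMAND [ARGS...]: run one gate and report PASS or FAIL.
run_gate() {
  local name="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS: $name"
  else
    echo "FAIL: $name"
    return 1
  fi
}

# coverage_ok PERCENT [THRESHOLD]: succeed only if PERCENT is strictly
# above THRESHOLD (default 80, matching the ">80%" rule).
coverage_ok() {
  awk -v pct="$1" -v min="${2:-80}" 'BEGIN { exit !(pct + 0 > min + 0) }'
}

# Run the gates in order; && stops at the first failure, so the overall
# status is non-zero if ANY gate fails.
run_all_gates() {
  run_gate "ESLint"     npm run --silent "${LINT_SCRIPT:-lint}" &&
  run_gate "TypeScript" npm run --silent "${TYPE_CHECK_SCRIPT:-type-check}" &&
  run_gate "Unit tests" npm run --silent "${TEST_SCRIPT:-test}"
}
```

A commit wrapper could then be as small as `run_all_gates && git commit -m "$message"`. Enforcing zero ESLint warnings depends on how the lint script itself is configured, which this sketch does not cover.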
Manager strictly enforces the TDD cycle:
Manager: "Write a failing test for [feature]. Show me RED output."
Implementer: [writes test, runs it, shows failure]
Manager: [validates test fails for correct reason] "Proceed to GREEN."
Manager: "Implement minimal code to make test pass. Show me GREEN."
Implementer: [writes code, runs test, shows pass]
Manager: [validates test passes] "Proceed to REFACTOR if needed."
Manager: "Refactor for clarity. Ensure tests stay GREEN."
Implementer: [improves code, reruns tests]
Manager: [validates tests still pass] "Run quality gates."
Intervention Protocol:
- If implementer writes code before test → STOP IMMEDIATELY
- If implementer wants to commit without quality gates → BLOCK
- Manager enforces TDD without exceptions
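Part of the first intervention rule can be approximated mechanically. The sketch below flags a change set that touches source files without touching any test file. It is an illustration only: the function name tdd_order_ok is hypothetical, the file-naming conventions (*.test.*, *.spec.*, tests/, __tests__/) are assumptions, and it cannot tell whether the test was written first or failed for the right reason, which still needs the manager's review of the RED output.

```bash
#!/usr/bin/env bash
# Hypothetical check for "code before test": reads changed file paths on
# stdin and fails when source files changed but no test file did.
tdd_order_ok() {
  local file has_src=0 has_test=0
  while IFS= read -r file; do
    case "$file" in
      # Test patterns come first so auth.test.ts is not counted as source.
      *.test.*|*.spec.*|tests/*|*/tests/*|*/__tests__/*) has_test=1 ;;
      *.ts|*.tsx|*.js|*.jsx) has_src=1 ;;
    esac
  done
  if [ "$has_src" -eq 1 ] && [ "$has_test" -eq 0 ]; then
    echo "STOP: implementation changed without a test" >&2
    return 1
  fi
  return 0
}
```

It could be fed from git, for example `git diff --cached --name-only | tdd_order_ok`, before the quality gates run.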
Prerequisites:
- Ollama with Granite Nano: ollama pull granite4:350m
- Codex CLI installed
- Claude CLI installed
3-Command Setup:
# 1. Start tmux and Codex
tmux
codex
# 2. In Codex, bootstrap infrastructure (creates Claude pane, starts Granite monitor)
cd /path/to/tmux-agent-orchestration # wherever you cloned this repository
./scripts/bootstrap-coordinator.sh
source .tmux-monitor/handler-function.sh
# 3. Delegate task to Claude
tmux send-keys -t :0.1 "Implement user auth with TDD" C-m
# Codex goes idle - Granite wakes it when Claude has news!
See: QUICKSTART-PANE-AWARE.md for complete guide.
Prerequisites:
- TMUX installed
- Git version control
- Node.js and npm (for JavaScript/TypeScript projects)
Setup:
# Clone and setup
git clone https://github.com/dbmcco/tmux-agent-orchestration.git
cd tmux-agent-orchestration
./setup.sh
./validate-setup.sh
# Create TMUX session with agents in separate windows
./scripts/tmux-spawn-session.sh --attach
See: INSTALLATION.md for complete guide.
Required:
- TMUX (brew install tmux on macOS)
- Git
- Bash shell
For Pane Mode (Experimental):
- Codex CLI
- Claude CLI
- Ollama with Granite Nano: ollama pull granite4:350m
For Window Mode (Stable):
- Claude CLI (or any AI CLI)
- Node.js and npm (for quality gates in JS/TS projects)
# Clone repository
git clone https://github.com/dbmcco/tmux-agent-orchestration.git
cd tmux-agent-orchestration
# Run interactive setup (window mode)
./setup.sh
# Validate installation
./validate-setup.sh
Full installation guide: See INSTALLATION.md
- TMUX installed (brew install tmux on macOS)
- Git version control
- Node.js and npm (optional, for JavaScript/TypeScript projects)
- Claude Code CLI (optional, for AI agent integration)
The agent orchestrator provides a simple CLI for coordinating agents:
# Brief implementer with task
./scripts/agent-orchestrator.sh brief "Implement user authentication following TDD"
# Monitor implementer progress
./scripts/agent-orchestrator.sh monitor 0
# Validate quality gates
./scripts/agent-orchestrator.sh validate
# Execute commit (after quality gates pass)
./scripts/agent-orchestrator.sh commit "feat: add user authentication"
# Request E2E tests
./scripts/agent-orchestrator.sh e2e https://your-deployment.com
# Show help
./scripts/agent-orchestrator.sh help
# One-shot bootstrap for manager window + Granite monitor
./scripts/manager-bootstrap.sh
# (Optional) recreate session manually
./scripts/tmux-spawn-session.sh --attach
# Manual snapshot of implementer output (Granite monitor runs automatically)
./scripts/tmux-monitor.sh 0 50
# Delegate command to window
./scripts/tmux-delegate.sh 0 "npm test"
- scripts/tmux-monitor-granite.sh captures the latest implementer output, sends each chunk to the local Granite 4 (Ollama) model, and surfaces ALERT: summaries in the manager window plus .tmux-monitor/alerts.log.
- The watcher deduplicates identical snippets, rate-limits payloads, emits heartbeats (.tmux-monitor/monitor.heartbeat), and automatically restarts via scripts/manager-bootstrap.sh if the heartbeat stalls.
- Granite outages trigger explicit [monitor] warnings and exponential backoff so the manager knows to investigate Ollama before delegating more work.
- View the rolling log with tail -f .tmux-monitor/alerts.log; restart the watcher with the bootstrap script if you see [monitor] Granite response empty or failed.
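Two of the watcher behaviours described above, deduplication and exponential backoff, are small enough to sketch. The helpers below are illustrations under assumptions: the function names, the 5-second base delay, the 300-second cap, and the checksum state file are hypothetical and not taken from tmux-monitor-granite.sh.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for a Granite watcher.

# backoff_delay ATTEMPT [BASE] [MAX]: seconds to wait before retry ATTEMPT
# (1-based), doubling on each attempt and capped at MAX.
backoff_delay() {
  local attempt="$1" base="${2:-5}" max="${3:-300}"
  local delay="$base" i=1
  while [ "$i" -lt "$attempt" ] && [ "$delay" -lt "$max" ]; do
    delay=$((delay * 2))
    i=$((i + 1))
  done
  if [ "$delay" -gt "$max" ]; then
    delay="$max"
  fi
  echo "$delay"
}

# is_duplicate SNIPPET STATE_FILE: succeed if SNIPPET has the same checksum
# as the previously seen snippet; otherwise record the new checksum and fail.
is_duplicate() {
  local sum
  sum=$(printf '%s' "$1" | cksum)
  if [ -f "$2" ] && [ "$(cat "$2")" = "$sum" ]; then
    return 0
  fi
  printf '%s\n' "$sum" > "$2"
  return 1
}
```

With these defaults the retry delays run 5, 10, 20, 40 seconds and so on until they reach the cap, and a snippet is only forwarded to the model when `is_duplicate` fails, which keeps an idle pane from generating repeated alerts.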
See AGENTS.md for the full manager/implementer playbook.
Working from another repo? Export TMUX_MANAGER_CONFIG=/absolute/path/to/custom-config.sh before invoking any orchestration scripts (including scripts/tmux-spawn-session.sh or scripts/manager-bootstrap.sh). This allows wrappers like work/lfw/lfw-draftforge-v1/scripts/dev-tools/run-granite-monitor.sh to bootstrap sessions with their own defaults without copying config files into this repository.
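The lookup order this implies can be sketched as follows. This is an assumption about how a script might resolve its config, not a copy of the project's code: resolve_config is a hypothetical name, and the fallback to config/config.sh is inferred from the project layout in this README.

```bash
#!/usr/bin/env bash
# Hypothetical config lookup: prefer TMUX_MANAGER_CONFIG when it is set,
# otherwise fall back to config/config.sh under the given repository root.
# Usage: resolve_config [REPO_ROOT]
resolve_config() {
  local repo_root="${1:-.}"
  local config_path="${TMUX_MANAGER_CONFIG:-$repo_root/config/config.sh}"
  if [ ! -r "$config_path" ]; then
    echo "Config not found: $config_path" >&2
    return 1
  fi
  printf '%s\n' "$config_path"
}
```

A script could then load its settings with `. "$(resolve_config "$repo_root")"`, and a wrapper in another repository only needs to export TMUX_MANAGER_CONFIG before calling it.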
The setup script can install pre-commit and post-commit hooks:
- pre-commit.sh: Runs quality gates before every commit
- post-commit.sh: Logs commit information and triggers post-commit workflows
- quality-gates.sh: Validates ESLint, TypeScript, tests, and coverage
Hooks are installed in .claude/hooks/ and linked to .git/hooks/.
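Linking of this kind usually comes down to a few symlinks. The sketch below shows one way to do it and is not necessarily what setup.sh does: link_hooks is a hypothetical name, and it assumes the .claude/hooks/ files carry the .sh names listed above.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of hook linking.
# Git runs hooks named without an extension, so pre-commit.sh is linked
# as .git/hooks/pre-commit.
# Usage: link_hooks [REPO_ROOT]
link_hooks() {
  local repo_root="${1:-.}" hook
  mkdir -p "$repo_root/.git/hooks"
  for hook in pre-commit post-commit; do
    if [ -f "$repo_root/.claude/hooks/$hook.sh" ]; then
      chmod +x "$repo_root/.claude/hooks/$hook.sh"
      # Relative target so the link survives moving the repository.
      ln -sf "../../.claude/hooks/$hook.sh" "$repo_root/.git/hooks/$hook"
    fi
  done
}
```

quality-gates.sh is not linked because it is not a git hook name; in this sketch it would be called from pre-commit.sh.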
tmux-agent-orchestration/
├── config/
│ ├── config.sh.example # Configuration template (window mode)
│ └── config.sh # Your configuration (created by setup)
│
├── hooks/
│ ├── quality-gates.sh # Quality gate enforcement
│ ├── pre-commit.sh # Pre-commit validation
│ └── post-commit.sh # Post-commit workflow
│
├── scripts/
│ ├── bootstrap-coordinator.sh # 🚧 EXPERIMENTAL: Self-bootstrap pane mode
│ ├── register-coordinator.sh # 🚧 EXPERIMENTAL: Pane registry setup
│ ├── manager-bootstrap.sh # Start Granite monitor daemon
│ ├── tmux-monitor-granite.sh # Granite/Ollama monitoring (window & pane)
│ ├── tmux-monitor.sh # Manual output snapshot
│ ├── tmux-delegate.sh # Send commands to windows
│ ├── tmux-spawn-session.sh # Create multi-window session
│ ├── agent-orchestrator.sh # Window mode coordination
│ └── verify-discovery.sh # Test script discovery
│
├── tests/
│ ├── test-tmux-basic.sh # Basic functionality tests
│ ├── test-monitor-granite.sh # Granite watcher tests
│ └── test-pane-aware-mode.sh # 🚧 Pane mode tests
│
├── .tmux-monitor/ # Created at runtime
│ ├── pane-registry.txt # Pane layout mapping (pane mode)
│ ├── handler-function.sh # Event handler for Codex
│ ├── latest-event.txt # Current event from Granite
│ └── alerts.log # Full audit trail
│
├── setup.sh # Interactive setup (window mode)
├── validate-setup.sh # Installation validator
├── AGENTS.md # ✅ Agent operating handbook
├── README.md # This file
├── INSTALLATION.md # ✅ Detailed installation guide
├── TESTING.md # ✅ Testing procedures
└── TMUX_CLAUDE_CICD_ARCHITECTURE.md # Architecture documentation
Legend: ✅ = Implemented and working, 🚧 = Experimental
The config/config.sh file controls all behavior:
# Project Configuration
PROJECT_NAME="your-project-name"
PROJECT_ROOT="/path/to/your/project"
# TMUX Configuration
TMUX_SESSION_NAME="${PROJECT_NAME}-dev"
TMUX_IMPLEMENTER_WINDOW="0"
TMUX_MANAGER_WINDOW="1"
TMUX_TESTING_WINDOW="2"
# Quality Gate Configuration
ENABLE_ESLINT=true
ENABLE_TYPESCRIPT=true
ENABLE_TESTS=true
ENABLE_COVERAGE=true
# NPM Scripts (match your package.json)
LINT_SCRIPT="lint"
TYPE_CHECK_SCRIPT="type-check"
TEST_SCRIPT="test"
COVERAGE_SCRIPT="test:coverage"
- ✅ Architecture documented - Complete specification in TMUX_CLAUDE_CICD_ARCHITECTURE.md
- ✅ Working scripts extracted - Quality gates, hooks, and orchestration from lfw-draftforge-v1
- ✅ Scripts generalized - Configuration system for any project
- ✅ Setup automation - Interactive installer and validator
- ✅ Documentation complete - Installation, testing, and usage guides
- ✅ Test suite created - Basic functionality validation
Run comprehensive testing:
# Validate installation
./validate-setup.sh
# Run automated tests
./tests/test-tmux-basic.sh
# Follow manual testing guide
cat TESTING.md
- ✅ Installation: Automated setup with validation
- ✅ Quality Gates: ESLint, TypeScript, tests, coverage enforcement
- ✅ TMUX Coordination: Session creation, monitoring, delegation
- ✅ Git Integration: Pre-commit and post-commit hooks
- ✅ Configuration: Flexible config system for any project
- ⏳ Production Use: Ready for real-world testing
- ✅ Extract working scripts from lfw-draftforge-v1
- ✅ Generalize for any project with configuration system
- ✅ Create orchestration and coordination scripts
- ✅ Implement setup and validation automation
- ✅ Complete documentation (installation, testing, usage)
- ✅ Granite Nano integration for event detection
- ✅ Self-bootstrapping coordinator infrastructure
- ✅ Pane-aware monitoring with tmux send-keys triggers
- ✅ Structured event handling (COMPLETED, ERROR, BLOCKED, QUESTION)
- 🚧 Real-world testing and refinement
- 🚧 Performance optimization and error handling
- 🚧 Documentation based on actual usage patterns
- ⏳ Stability validation before v2.0 release
- ⏳ Comprehensive integration testing
- ⏳ Error recovery and fault tolerance
- ⏳ Performance benchmarking
- ⏳ Production deployment guide
- ⏳ User feedback incorporation
- 🔮 Multi-pane coordination (3+ agents)
- 🔮 Journal MCP integration for cross-session learning
- 🔮 Deployment webhooks and E2E triggers
- 🔮 Agent coordination patterns library
This is an experimental architecture project. Contributions welcome for:
- Testing the architecture with different project types
- Improving TMUX coordination scripts
- Enhancing quality gate enforcement
- Adding new agent coordination patterns
Architecture Design: Braydon Fuller + Claude Code CLI
Inspiration:
- Claude Code CLI by Anthropic
- obra/superpowers - Agent coordination patterns
- Claude's sub-agent and output style system
- TDD and quality gate best practices
Related Projects:
- claude-workspace - Modular memory system for Claude Code CLI
MIT License - See LICENSE file for details
This project explores several open questions:
- CI/CD Triggers: Manual vs Git Hooks vs Hybrid?
- Agent Persistence: Long-running vs On-demand vs Hybrid?
- Deployment Workflow: Polling vs Webhooks vs Hybrid?
- Error Recovery: Auto-retry vs Immediate escalation vs Tiered?
- State Management: File-based vs Logs vs Database?
See TMUX_CLAUDE_CICD_ARCHITECTURE.md for detailed analysis of each.
- TMUX Manual
- Claude Code Documentation
- Full architecture specification: TMUX_CLAUDE_CICD_ARCHITECTURE.md
Status: ✅ Implementation Complete - Extracted from production, ready for testing
Version: 1.0 (October 2025)
Source: Extracted from lfw-draftforge-v1 production environment