Add Codex CLI driver for OpenAI agents by moodmosaic · Pull Request #61 · protocol-security/claude-swarm

moodmosaic · 2026-04-11T16:37:55Z

Summary

- New codex-cli driver (lib/drivers/codex-cli.sh) implementing the full 13-function interface: non-interactive `codex exec --json`, stats extraction by summing turn.completed events, activity streaming for command/reasoning/file_change/mcp/web_search items, fatal and retriable error detection, and OPENAI_API_KEY auth injection
- Dockerfile refactored to share the Node.js 22 install between gemini-cli and codex-cli instead of duplicating it
- Test configs (codex-only.json, codex-mixed.json) and 92 new unit tests across test_drivers.sh and test_config.sh
- Updated USAGE.md, README.md, and tests/configs/README.md with the new driver

Test plan

- `bash tests/test_drivers.sh`: 242 passed, 0 failed
- `bash tests/test_config.sh`: 235 passed, 0 failed
- `bash tests/test_setup.sh`: 66 passed, 0 failed
- `./tests/test.sh --config tests/configs/codex-only.json` with OPENAI_API_KEY set
- `./tests/test.sh --config tests/configs/codex-mixed.json` with both CLAUDE_CODE_OAUTH_TOKEN and OPENAI_API_KEY set
- Verify Docker image builds with SWARM_AGENTS=codex-cli

New driver (lib/drivers/codex-cli.sh) implements the full 13-function interface: agent_run via `codex exec --json`, stats extraction by summing turn.completed events, activity streaming for command_execution/reasoning/file_change/mcp_tool_call/web_search items, fatal/retriable error detection, and OPENAI_API_KEY auth injection. Dockerfile refactored to share Node.js 22 install between gemini-cli and codex-cli drivers. Test configs (codex-only.json, codex-mixed.json) and 92 new unit tests across test_drivers.sh and test_config.sh.

file_change path lives in .changes[].path not .file_path (verified from production logs). Remove phantom reasoning item type -- Codex thinks internally with no event emitted. Update tests to match real event structure with proper fields (id, status, changes, etc).
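The path lookup can be sketched with jq. This is a minimal sketch, not the driver's code: it assumes each JSONL line is an event object carrying an `item` whose `type` is `file_change`, and the helper name `codex_changed_paths` is hypothetical.

```shell
# Hypothetical helper: print the paths touched by file_change items.
# Assumed event shape (only .changes[].path is confirmed by the text above):
#   {"type":"item.completed","item":{"type":"file_change","changes":[{"path":"a.txt"}]}}
codex_changed_paths() {
    # Reads JSONL from the given files, or from stdin when no file is given.
    jq -r 'select(.item?.type? == "file_change")
           | .item.changes[]?.path' "$@"
}
```

Events without an `item` (such as turn.completed) fall through the `select` and print nothing.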

Cover all 15 Codex-compatible models: gpt-5.4 family (5.4/mini/nano), codex-specific (gpt-5.3-codex, gpt-5.2-codex), gpt-5.2, gpt-5 family (5/mini/nano), gpt-4.1 family (4.1/mini/nano), and reasoning models (o3, o4-mini, o3-mini). Prices per 1M tokens from OpenAI's standard tier as of April 2026.

Shows "codex" in the Driver column, matching how claude-code shows "claude" and gemini-cli shows "gemini".

Drivers like Codex CLI that lack native timing in their JSONL output now get wall-clock elapsed time for dur and api_ms, so the dashboard shows Tok/s and Time instead of blanks. Drivers that report their own timing (Claude Code, Gemini CLI) are unaffected.
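A rough sketch of the fallback, assuming the harness records start and end times in epoch seconds around each run. The function name and argument order are hypothetical.

```shell
# Hypothetical helper: keep the driver's own duration when it reports one,
# otherwise fall back to wall-clock time measured by the harness.
# $1 = driver-reported milliseconds (may be empty or 0)
# $2 = start time in epoch seconds, $3 = end time in epoch seconds
resolve_duration_ms() {
    local reported="${1:-0}" start="$2" end="$3"
    if [ "$reported" -gt 0 ] 2>/dev/null; then
        echo "$reported"
    else
        echo $(( (end - start) * 1000 ))
    fi
}
```

With second-level timestamps the fallback is only accurate to the second, which is enough to populate Tok/s and Time.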

Exercise multiple Codex model variants in a single mixed-driver swarm: claude-opus-4-6, gpt-5.4, gpt-5.3-codex, and gpt-5.2.

Codex CLI agent_docker_auth now supports three auth modes:

- chatgpt: bind-mounts ~/.codex/auth.json into containers
- apikey: passes OPENAI_API_KEY only
- auto (default): uses whichever credentials are available

CODEX_AUTH_JSON env var overrides the default auth.json path. Docs updated with per-driver auth tables and usage examples.
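One way the mode resolution could look, as a sketch only. The function name is hypothetical, and the preference order in auto mode (auth.json first, then OPENAI_API_KEY) is an assumption; the description above only says auto uses whichever credentials are available.

```shell
# Hypothetical resolver: map a configured auth mode to the credential to use.
# Prints "chatgpt", "apikey", or "none".
# ASSUMPTION: auto prefers auth.json over OPENAI_API_KEY when both exist.
resolve_codex_auth() {
    local mode="${1:-auto}"
    # CODEX_AUTH_JSON overrides the default auth.json location.
    local auth_json="${CODEX_AUTH_JSON:-$HOME/.codex/auth.json}"
    case "$mode" in
        chatgpt|apikey)
            echo "$mode" ;;
        auto)
            if [ -f "$auth_json" ]; then
                echo "chatgpt"
            elif [ -n "${OPENAI_API_KEY:-}" ]; then
                echo "apikey"
            else
                echo "none"
            fi ;;
        *)
            echo "unknown auth mode: $mode" >&2
            return 1 ;;
    esac
}
```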

New configs: codex-chatgpt.json (chatgpt-only auth) and codex-auth-mixed.json (chatgpt + apikey + auto-detect). test_config.sh gains sections 31-33 that parse these configs and exercise agent_docker_auth resolution for each auth mode, including CODEX_AUTH_JSON env var override and default path.

- agent_settings writes config.toml to ~/.codex/ (where CLI looks) instead of /workspace/.codex/
- Dockerfile pre-creates /home/agent/.codex/ with correct ownership so bind-mounted auth.json doesn't create a root-owned directory
- agent_settings falls back to sudo mkdir if dir isn't writable
- agent_detect_fatal excludes harmless "could not update PATH" warning

Docker's -v silently creates a directory when the source file is missing, corrupting the host filesystem. --mount type=bind errors out cleanly instead, preventing this footgun.
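For illustration, the bind mount could be assembled as below. The helper name is hypothetical, and the default container path is an assumption based on the /home/agent/.codex/ directory mentioned earlier.

```shell
# Hypothetical helper: print the two docker arguments for a bind mount of
# auth.json, one per line. With --mount, docker refuses to start when the
# source path does not exist, instead of creating a directory in its place.
# $1 = host path to auth.json, $2 = container path (assumed default below)
codex_auth_mount_args() {
    local src="$1" dst="${2:-/home/agent/.codex/auth.json}"
    printf '%s\n' "--mount" "type=bind,source=${src},target=${dst}"
}
```

The two printed words can be appended to the argument list passed to `docker run`.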

Adds gpt-5.4-pro, gpt-5.2-pro, gpt-5-pro, gpt-5.1, gpt-5.1-codex, gpt-5.1-codex-max, gpt-5.1-codex-mini, gpt-5-codex, and o3-pro to both codex-only and codex-mixed pricing configs. Pro models omit cached pricing (unsupported).

Pass effort level into Codex containers via CODEX_EFFORT env var and relay it to `codex exec -c model_reasoning_effort=...`. Add effort fields to Codex test configs and corresponding assertions.
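A minimal sketch of the relay, assuming the override is simply omitted when CODEX_EFFORT is unset. The helper name is hypothetical; the `-c model_reasoning_effort=...` form is the one quoted above.

```shell
# Hypothetical helper: print the extra codex exec arguments for the
# configured reasoning effort, one per line, or nothing when unset.
codex_effort_args() {
    if [ -n "${CODEX_EFFORT:-}" ]; then
        printf '%s\n' "-c" "model_reasoning_effort=${CODEX_EFFORT}"
    fi
}
```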

Detect when ~/.codex/auth.json has been turned into a directory (by a stale Docker -v mount) and print a recovery command.
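The check could be as small as the sketch below. The function name and the wording of the recovery hint are assumptions; the actual message printed by the driver is not shown in this PR text.

```shell
# Hypothetical check: warn when auth.json is a directory instead of a file.
# ASSUMPTION: the recovery hint shown here is illustrative. rmdir only
# removes an empty directory, so it cannot delete real credentials.
check_codex_auth_json() {
    local path="${1:-${CODEX_AUTH_JSON:-$HOME/.codex/auth.json}}"
    if [ -d "$path" ]; then
        echo "warning: $path is a directory, not a file" >&2
        echo "recover with: rmdir '$path' and then log in again" >&2
        return 1
    fi
    return 0
}
```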

The tail -20 CI output window hides earlier failures. Collect failed assertion labels and print them in the summary block.
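The idea can be sketched as follows. The helper names and the summary layout are hypothetical, apart from the "N passed, M failed" line, which mirrors the test plan output.

```shell
PASS=0
FAIL=0
FAILED_NAMES=""

# Hypothetical assertion helper: compare two strings and remember the
# label of every failing assertion for the final summary.
assert_eq() {
    local label="$1" expected="$2" actual="$3"
    if [ "$expected" = "$actual" ]; then
        PASS=$((PASS + 1))
    else
        FAIL=$((FAIL + 1))
        FAILED_NAMES="${FAILED_NAMES}  - ${label}
"
    fi
}

# Print the totals last, followed by the failed labels, so they land
# inside the final lines that CI shows.
print_summary() {
    echo "${PASS} passed, ${FAIL} failed"
    if [ "$FAIL" -gt 0 ]; then
        echo "Failed tests:"
        printf '%s' "$FAILED_NAMES"
    fi
}
```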

jq 1.8+ preserves source formatting (2.50 stays 2.50) while jq 1.6 normalizes to 2.5. Use values without trailing zeros in the test config so the assertion works on both versions. Also collect failed test names in summary for CI tail-20 visibility.

Context slim/none was broken for non-Claude drivers: the harness stripped .claude/ once on checkout, but git pull --rebase restored the files. Adds post-merge, post-checkout, and post-rewrite hooks that re-strip automatically. Also adds "usage limit" and "hit your...limit" to Codex CLI's retriable error patterns so ChatGPT subscription caps trigger backoff retry instead of fatal exit.
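The two new retriable patterns could be matched roughly like this. Only the phrases "usage limit" and "hit your...limit" come from the description; the function name and the case-insensitive match are assumptions.

```shell
# Hypothetical matcher: succeed when a log line looks like a retriable
# ChatGPT usage cap, so the caller can back off and retry.
codex_is_retriable() {
    printf '%s\n' "$1" | grep -Eiq 'usage limit|hit your.*limit'
}
```

A caller might use it as `codex_is_retriable "$line" && sleep "$backoff"`, where `$line` and `$backoff` are placeholders.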

codex --version prints "codex-cli 0.120.0"; extract the version number after the last space instead of before the first.
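In shell terms the fix is the difference between two parameter expansions. A small sketch, with a hypothetical function name:

```shell
# Hypothetical helper: return the text after the last space.
# "${out%% *}" would instead keep the text before the first space,
# which yields "codex-cli" for the output quoted above.
codex_version_from() {
    local out="$1"
    echo "${out##* }"
}
```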

Codex activity filter selected both item.started and item.completed, causing every shell command/edit to appear twice. Now shows commands on start (immediate feedback) and edits/mcp on completion (full path data). Push safety net now cleans up stale .git/rebase-merge before retrying, preventing "already a rebase-merge directory" from failing all retries.
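The de-duplicated selection might look like the jq sketch below. It assumes each event has a top-level `type` and a nested `item.type`; the function name is hypothetical and the real filter may format its output differently.

```shell
# Hypothetical filter: show commands on item.started, and file changes and
# MCP tool calls on item.completed, so no item is printed twice.
codex_activity_filter() {
    jq -c 'select(
        (.type == "item.started" and .item.type == "command_execution") or
        (.type == "item.completed" and
            (.item.type == "file_change" or .item.type == "mcp_tool_call"))
    )' "$@"
}
```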

codex exec with --skip-git-repo-check does not load .codex/instructions.md, so the git coordination rules were never reaching the model. Prepend them directly into the prompt argument instead.
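Inlining can be sketched as plain string assembly. The function name and the blank-line separator are assumptions.

```shell
# Hypothetical helper: prepend the system prompt (for example the git
# coordination rules) to the task prompt, separated by a blank line.
codex_build_prompt() {
    local system="$1" task="$2"
    if [ -n "$system" ]; then
        printf '%s\n\n%s\n' "$system" "$task"
    else
        printf '%s\n' "$task"
    fi
}
```

The result would then be passed as the single prompt argument, e.g. `codex exec --json --skip-git-repo-check "$(codex_build_prompt "$rules" "$task")"`, where `$rules` and `$task` are placeholders.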

OpenAI includes cached tokens in input_tokens, but Claude does not. The harness pricing formula adds tok_in * input_price + cache * cached_price, so cached tokens were charged twice. Subtract cached from input in Codex stats extraction for consistency.
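A worked sketch of the corrected arithmetic. The function name is hypothetical, the output-token term is an assumption added for completeness, and the prices in the usage note are made up for the example.

```shell
# Hypothetical helper: cost in USD for one run.
# $1 = input_tokens as reported by OpenAI (includes cached tokens)
# $2 = cached tokens, $3 = output tokens
# $4, $5, $6 = input, cached, and output price per 1M tokens
codex_cost_usd() {
    awk -v in_tok="$1" -v cached="$2" -v out_tok="$3" \
        -v p_in="$4" -v p_cache="$5" -v p_out="$6" 'BEGIN {
        uncached = in_tok - cached      # avoid charging cached tokens twice
        if (uncached < 0) uncached = 0
        cost = (uncached * p_in + cached * p_cache + out_tok * p_out) / 1000000
        printf "%.4f\n", cost
    }'
}
```

With 1,000,000 reported input tokens of which 400,000 are cached, 100,000 output tokens, and illustrative prices of 2.5, 0.25, and 15 per 1M tokens, `codex_cost_usd 1000000 400000 100000 2.5 0.25 15` prints 3.1000. Without the subtraction the same run would be priced at 4.1.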

moodmosaic · 2026-04-13T14:33:52Z

This resolves #26.

Codex CLI reads AGENTS.md for project instructions (not .claude/CLAUDE.md) and .agents/skills/ for skills (not .claude/skills/). In agent_settings, bridge both when the Codex locations are absent:

- Copy .claude/CLAUDE.md (or root CLAUDE.md) to AGENTS.md
- Symlink .claude/skills/ to .agents/skills/

Both are added to .git/info/exclude so the agent doesn't commit the bridged files. The skills symlink only fires when .claude/skills/ exists (context=full); slim/none strip it so it's a natural no-op.
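The bridging steps can be sketched as below. This is an illustration under assumptions, not the code in agent_settings: the function name is hypothetical, and it assumes the target is a git checkout where .git/info/ already exists.

```shell
# Hypothetical bridge: expose .claude/ conventions at the paths Codex reads.
# $1 = repository root. Existing Codex files are never overwritten.
codex_bridge_claude() {
    local repo="$1" exclude="$1/.git/info/exclude"
    if [ ! -e "$repo/AGENTS.md" ]; then
        if [ -f "$repo/.claude/CLAUDE.md" ]; then
            cp "$repo/.claude/CLAUDE.md" "$repo/AGENTS.md"
        elif [ -f "$repo/CLAUDE.md" ]; then
            cp "$repo/CLAUDE.md" "$repo/AGENTS.md"
        fi
        # Only exclude the file when a copy was actually made.
        if [ -f "$repo/AGENTS.md" ]; then
            echo "AGENTS.md" >> "$exclude"
        fi
    fi
    # With context slim/none, .claude/skills is absent and this is a no-op.
    if [ -d "$repo/.claude/skills" ] && [ ! -e "$repo/.agents/skills" ]; then
        mkdir -p "$repo/.agents"
        ln -s ../.claude/skills "$repo/.agents/skills"
        echo ".agents/skills" >> "$exclude"
    fi
    return 0
}
```

The symlink target is relative, so it keeps working if the checkout is moved or mounted at a different path.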

Adds changelog entry covering the full Codex CLI driver feature set: driver implementation, ChatGPT subscription auth, .claude/ convention bridging, context stripping hooks, stale rebase cleanup, inline system prompt, and per-driver effort documentation.

moodmosaic · 2026-04-13T16:15:08Z

Tested extensively overnight on gethfuzz with 3 Codex agents (2× ChatGPT subscription, 1× API key). Verified:

- ChatGPT usage-limit errors retry correctly
- API key agent ran full sessions with accurate cost tracking
- AGENTS.md bridge: Codex agents now follow .claude/CLAUDE.md project rules (commit format, epistemics, repo invariants)
- Skills bridge: .claude/skills/ symlinked to .agents/skills/, Codex agents discover and use existing skills
- Context stripping (slim) survives git pull --rebase
- Stale rebase-merge cleanup prevents push retry failures
- System prompt inlined correctly (no custom branch pushes)
- Provenance trailers show correct version numbers
- 485 unit tests pass (drivers: 285, dashboard: 92, harness: 111)

moodmosaic added 14 commits April 11, 2026 09:35

Add codex-cli to dashboard short_driver() display mapping

747fba6

Shows "codex" in the Driver column, matching how claude-code shows "claude" and gemini-cli shows "gemini".

Expand codex-mixed config with gpt-5.3-codex and gpt-5.2 agents

beb6318

Exercise multiple Codex model variants in a single mixed-driver swarm: claude-opus-4-6, gpt-5.4, gpt-5.3-codex, and gpt-5.2.

Widen dashboard Auth column to fit 'chatgpt' label

1c5ff1a

Use --mount instead of -v for auth.json bind mount

1cbd3b4

Docker's -v silently creates a directory when the source file is missing, corrupting the host filesystem. --mount type=bind errors out cleanly instead, preventing this footgun.

Add missing OpenAI model pricing (pro, codex, 5.1 variants)

cde4ce1

Adds gpt-5.4-pro, gpt-5.2-pro, gpt-5-pro, gpt-5.1, gpt-5.1-codex, gpt-5.1-codex-max, gpt-5.1-codex-mini, gpt-5-codex, and o3-pro to both codex-only and codex-mixed pricing configs. Pro models omit cached pricing (unsupported).

Add Codex reasoning effort support and test coverage

5f01bf8

Pass effort level into Codex containers via CODEX_EFFORT env var and relay it to `codex exec -c model_reasoning_effort=...`. Add effort fields to Codex test configs and corresponding assertions.

Warn when auth.json is a corrupted directory

a9b8a0f

Detect when ~/.codex/auth.json has been turned into a directory (by a stale Docker -v mount) and print a recovery command.

moodmosaic mentioned this pull request Apr 11, 2026

Add optional commit signing and post-processor max_idle #62

Open

1 task

moodmosaic added 3 commits April 11, 2026 16:18

Print failed test names in summary for CI visibility

211e21c

The tail -20 CI output window hides earlier failures. Collect failed assertion labels and print them in the summary block.

moodmosaic marked this pull request as ready for review April 12, 2026 09:21

moodmosaic added 5 commits April 12, 2026 02:31

Fix Codex CLI version in provenance trailers

e11dd79

codex --version prints "codex-cli 0.120.0"; extract the version number after the last space instead of before the first.

Inline system prompt into Codex exec prompt text

7636782

codex exec with --skip-git-repo-check does not load .codex/instructions.md, so the git coordination rules were never reaching the model. Prepend them directly into the prompt argument instead.

Document per-driver effort values in USAGE.md

00ff2d7

moodmosaic force-pushed the add-codex-driver branch from ee4c0f7 to 3bbd5a1 Compare April 13, 2026 15:48

moodmosaic merged commit c00997c into master Apr 13, 2026
5 checks passed

moodmosaic deleted the add-codex-driver branch April 13, 2026 16:15

moodmosaic mentioned this pull request Apr 13, 2026

Agent abstraction #26

Closed

moodmosaic merged 24 commits into master from add-codex-driver
