Add Codex CLI driver for OpenAI agents #61

Merged
moodmosaic merged 24 commits into master from add-codex-driver on Apr 13, 2026
@moodmosaic moodmosaic commented Apr 11, 2026

Summary

  • New codex-cli driver (lib/drivers/codex-cli.sh) implementing the full 13-function interface: non-interactive codex exec --json, stats extraction by summing turn.completed events, activity streaming for command/reasoning/file_change/mcp/web_search items, fatal and retriable error detection, and OPENAI_API_KEY auth injection
  • Dockerfile refactored to share the Node.js 22 install between gemini-cli and codex-cli instead of duplicating it
  • Test configs (codex-only.json, codex-mixed.json) and 92 new unit tests across test_drivers.sh and test_config.sh
  • Updated USAGE.md, README.md, and tests/configs/README.md with the new driver

Test plan

  • bash tests/test_drivers.sh — 242 passed, 0 failed
  • bash tests/test_config.sh — 235 passed, 0 failed
  • bash tests/test_setup.sh — 66 passed, 0 failed
  • ./tests/test.sh --config tests/configs/codex-only.json with OPENAI_API_KEY set
  • ./tests/test.sh --config tests/configs/codex-mixed.json with both CLAUDE_CODE_OAUTH_TOKEN and OPENAI_API_KEY set
  • Verify Docker image builds with SWARM_AGENTS=codex-cli

New driver (lib/drivers/codex-cli.sh) implements the full 13-function
interface: agent_run via `codex exec --json`, stats extraction by
summing turn.completed events, activity streaming for command_execution/
reasoning/file_change/mcp_tool_call/web_search items, fatal/retriable
error detection, and OPENAI_API_KEY auth injection.
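
As a rough illustration of the stats extraction described above, the sketch below sums token usage across turn.completed events with jq. The function name and the field names under .usage (input_tokens, cached_input_tokens, output_tokens) are assumptions for illustration and are not taken from the driver.

```shell
# Hypothetical helper: read a Codex JSONL log on stdin and sum the token
# usage of every turn.completed event. The .usage field names are assumed.
sum_codex_usage() {
  jq -sc '
    [ .[] | select(.type == "turn.completed") | .usage ]
    | { input:  (map(.input_tokens        // 0) | add // 0),
        cached: (map(.cached_input_tokens // 0) | add // 0),
        output: (map(.output_tokens       // 0) | add // 0) }'
}
```

Slurping the stream (-s) keeps the sketch short; a very large log might be better served by a streaming reduce.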

Dockerfile refactored to share Node.js 22 install between gemini-cli
and codex-cli drivers. Test configs (codex-only.json, codex-mixed.json)
and 92 new unit tests across test_drivers.sh and test_config.sh.

file_change path lives in .changes[].path, not .file_path (verified
from production logs). Remove phantom reasoning item type: Codex
thinks internally with no event emitted. Update tests to match
real event structure with proper fields (id, status, changes, etc.).

Cover all 15 Codex-compatible models: gpt-5.4 family (5.4/mini/nano),
codex-specific (gpt-5.3-codex, gpt-5.2-codex), gpt-5.2, gpt-5
family (5/mini/nano), gpt-4.1 family (4.1/mini/nano), and reasoning
models (o3, o4-mini, o3-mini). Prices per 1M tokens from OpenAI's
standard tier as of April 2026.

Shows "codex" in the Driver column, matching how claude-code shows
"claude" and gemini-cli shows "gemini".

Drivers like Codex CLI that lack native timing in their JSONL output
now get wall-clock elapsed time for dur and api_ms, so the dashboard
shows Tok/s and Time instead of blanks. Drivers that report their
own timing (Claude Code, Gemini CLI) are unaffected.

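
A minimal sketch of a wall-clock fallback, assuming the harness simply times the agent invocation itself. The function and variable names are hypothetical, and `date +%s` only gives one-second resolution.

```shell
# Hypothetical helper: run a command and record wall-clock elapsed time in
# ELAPSED_MS, preserving the command's exit status.
run_with_wall_clock() {
  wc_start=$(date +%s)
  "$@"
  wc_rc=$?
  wc_end=$(date +%s)
  ELAPSED_MS=$(( (wc_end - wc_start) * 1000 ))
  return "$wc_rc"
}
```

The same elapsed value could then stand in for both dur and api_ms when the driver's own output carries no timing.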
Exercise multiple Codex model variants in a single mixed-driver
swarm: claude-opus-4-6, gpt-5.4, gpt-5.3-codex, and gpt-5.2.

Codex CLI agent_docker_auth now supports three auth modes:
- chatgpt: bind-mounts ~/.codex/auth.json into containers
- apikey: passes OPENAI_API_KEY only
- auto (default): uses whichever credentials are available

CODEX_AUTH_JSON env var overrides the default auth.json path.
Docs updated with per-driver auth tables and usage examples.
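
The three auth modes could be resolved roughly as follows. This is a sketch, not the driver's actual agent_docker_auth: the function name is hypothetical, and in auto mode it lists every credential source it finds rather than assuming a precedence the source does not state.

```shell
# Hypothetical helper: print which credential sources apply for an auth mode.
# CODEX_AUTH_JSON overrides the default auth.json location.
codex_auth_sources() {
  mode=${1:-auto}
  auth_json=${CODEX_AUTH_JSON:-$HOME/.codex/auth.json}
  have_file=0
  have_key=0
  if [ -f "$auth_json" ]; then have_file=1; fi
  if [ -n "${OPENAI_API_KEY:-}" ]; then have_key=1; fi
  case "$mode" in
    chatgpt)
      if [ "$have_file" -eq 1 ]; then echo "chatgpt"; fi ;;
    apikey)
      if [ "$have_key" -eq 1 ]; then echo "apikey"; fi ;;
    auto)
      if [ "$have_file" -eq 1 ]; then echo "chatgpt"; fi
      if [ "$have_key" -eq 1 ]; then echo "apikey"; fi ;;
    *)
      echo "unknown auth mode: $mode" >&2
      return 2 ;;
  esac
  return 0
}
```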
New configs: codex-chatgpt.json (chatgpt-only auth) and
codex-auth-mixed.json (chatgpt + apikey + auto-detect).
test_config.sh gains sections 31-33 that parse these configs
and exercise agent_docker_auth resolution for each auth mode,
including CODEX_AUTH_JSON env var override and default path.

- agent_settings writes config.toml to ~/.codex/ (where CLI looks)
  instead of /workspace/.codex/
- Dockerfile pre-creates /home/agent/.codex/ with correct ownership
  so bind-mounted auth.json doesn't create a root-owned directory
- agent_settings falls back to sudo mkdir if dir isn't writable
- agent_detect_fatal excludes harmless "could not update PATH" warning

Docker's -v silently creates a directory at the source path on the
host when the source file is missing, leaving a directory where
auth.json should be. --mount type=bind fails with an error
instead, preventing this footgun.

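
To make the difference concrete, here is a small sketch that builds the --mount arguments and refuses to proceed when the source file is missing. The function name and the container target path /home/agent/.codex/auth.json are assumptions for illustration.

```shell
# Hypothetical helper: print the docker arguments for bind-mounting auth.json,
# one per line. Fails early instead of letting a mount create a directory.
codex_auth_mount_args() {
  src=$1
  if [ ! -f "$src" ]; then
    echo "auth file not found: $src" >&2
    return 1
  fi
  printf '%s\n' "--mount" \
    "type=bind,source=$src,target=/home/agent/.codex/auth.json"
}
```

The explicit file check is belt and braces: with --mount type=bind, Docker itself rejects a missing source, whereas `-v src:dst` would create `src` as a directory.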
Adds gpt-5.4-pro, gpt-5.2-pro, gpt-5-pro, gpt-5.1,
gpt-5.1-codex, gpt-5.1-codex-max, gpt-5.1-codex-mini,
gpt-5-codex, and o3-pro to both codex-only and codex-mixed
pricing configs. Pro models omit cached pricing (unsupported).

Pass effort level into Codex containers via CODEX_EFFORT env var and
relay it to `codex exec -c model_reasoning_effort=...`. Add effort
fields to Codex test configs and corresponding assertions.

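
A minimal sketch of the relay, assuming the override is only added when CODEX_EFFORT is set. The function name is hypothetical; the flags are the ones named in the commit message.

```shell
# Hypothetical helper: print the arguments for `codex`, one per line, adding
# the reasoning-effort override only when CODEX_EFFORT is non-empty.
build_codex_args() {
  set -- exec --json
  if [ -n "${CODEX_EFFORT:-}" ]; then
    set -- "$@" -c "model_reasoning_effort=$CODEX_EFFORT"
  fi
  printf '%s\n' "$@"
}
```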
Detect when ~/.codex/auth.json has been turned into a directory
(by a stale Docker -v mount) and print a recovery command.

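
The detection might look like the sketch below. The function name and the suggested recovery command are assumptions; the actual message printed by the driver is not shown in this PR text, and a root-owned directory may additionally need sudo.

```shell
# Hypothetical check: fail with a hint when auth.json is a directory.
check_codex_auth_path() {
  path=${1:-$HOME/.codex/auth.json}
  if [ -d "$path" ]; then
    echo "error: $path is a directory, not a file" >&2
    echo "recover with: rmdir \"$path\"" >&2   # assumed recovery command
    return 1
  fi
  return 0
}
```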
The tail -20 CI output window hides earlier failures. Collect
failed assertion labels and print them in the summary block.

jq 1.8+ preserves source formatting (2.50 stays 2.50) while jq 1.6
normalizes to 2.5. Use values without trailing zeros in the test
config so the assertion works on both versions. Also collect failed
test names in summary for CI tail-20 visibility.

Context slim/none was broken for non-Claude drivers: the harness
stripped .claude/ once on checkout, but git pull --rebase restored
the files. Adds post-merge, post-checkout, and post-rewrite hooks
that re-strip automatically.
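
One way to picture the hooks: each one re-runs the strip step after Git rewrites the working tree. The sketch below is an assumption-laden illustration; the function name is hypothetical, and it removes .claude/ wholesale, whereas the real slim and none modes may strip different subsets.

```shell
# Hypothetical installer: write post-merge, post-checkout, and post-rewrite
# hooks that delete .claude/ again. Git runs hooks from the work tree root.
install_strip_hooks() {
  repo=$1
  mkdir -p "$repo/.git/hooks"
  for hook in post-merge post-checkout post-rewrite; do
    printf '%s\n' '#!/bin/sh' 'rm -rf .claude' > "$repo/.git/hooks/$hook"
    chmod +x "$repo/.git/hooks/$hook"
  done
}
```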

Also adds "usage limit" and "hit your...limit" to Codex CLI's
retriable error patterns so ChatGPT subscription caps trigger
backoff retry instead of fatal exit.
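
A sketch of how the two new patterns could be matched. The function name is hypothetical and case-insensitive matching is an assumption; "hit your...limit" is read here as "hit your", any text, then "limit".

```shell
# Hypothetical matcher: succeed when an error line looks like a usage cap
# that should trigger backoff and retry rather than a fatal exit.
is_retriable_codex_error() {
  printf '%s\n' "$1" | grep -Eiq 'usage limit|hit your.*limit'
}
```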

@moodmosaic marked this pull request as ready for review April 12, 2026 09:21

codex --version prints "codex-cli 0.120.0"; extract the version
number after the last space instead of the text before the first space.

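
For example, POSIX parameter expansion can take everything after the last space. The function name is hypothetical.

```shell
# Hypothetical helper: keep only the text after the last space.
parse_codex_version() {
  raw=$1
  printf '%s\n' "${raw##* }"
}

parse_codex_version "codex-cli 0.120.0"   # prints 0.120.0
```

By contrast, `${raw%% *}` keeps the text before the first space, which yields "codex-cli" for this input.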
Codex activity filter selected both item.started and item.completed,
causing every shell command/edit to appear twice. Now shows commands
on start (immediate feedback) and edits/mcp on completion (full path
data).
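
The de-duplicated filter can be sketched in jq as below. The event nesting (.type on the event, .item.type and .item.command on the item) is an assumption about the JSONL shape; only .changes[].path is confirmed by the commits above. The mcp_tool_call case is left out because its fields are not described here.

```shell
# Hypothetical filter: commands are shown when they start, file edits when
# they complete, so each activity appears once.
codex_activity() {
  jq -r '
    if .type == "item.started" and .item.type == "command_execution" then
      "cmd: " + .item.command
    elif .type == "item.completed" and .item.type == "file_change" then
      .item.changes[] | "edit: " + .path
    else
      empty
    end'
}
```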

Push safety net now cleans up stale .git/rebase-merge before retrying,
preventing "already a rebase-merge directory" from failing all retries.

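
A minimal sketch of the cleanup step, assuming it simply removes the leftover directory before the next attempt. The function name is hypothetical.

```shell
# Hypothetical helper: remove a stale rebase-merge directory so the next
# `git pull --rebase` can start cleanly.
cleanup_stale_rebase() {
  repo=${1:-.}
  if [ -d "$repo/.git/rebase-merge" ]; then
    rm -rf "$repo/.git/rebase-merge"
  fi
  return 0
}
```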
codex exec with --skip-git-repo-check does not load
.codex/instructions.md, so the git coordination rules
were never reaching the model. Prepend them directly
into the prompt argument instead.

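
Inlining could be as simple as concatenating the rules and the task before calling codex exec. The function name and argument layout below are hypothetical.

```shell
# Hypothetical helper: prepend the contents of a rules file to the task
# prompt, separated by a blank line; fall back to the bare task.
build_prompt() {
  rules_file=$1
  task=$2
  if [ -f "$rules_file" ]; then
    printf '%s\n\n%s\n' "$(cat "$rules_file")" "$task"
  else
    printf '%s\n' "$task"
  fi
}
```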
OpenAI reports cached tokens as part of input_tokens, while Claude
reports them separately. The harness pricing formula is
tok_in * input_price + cache * cached_price, so Codex cached tokens
were charged twice: once at the input price and once at the cached
price. Subtract cached tokens from input_tokens in Codex stats
extraction so tok_in counts only uncached input.
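
A worked example of the corrected arithmetic, using made-up prices (2.5 and 0.25 USD per 1M tokens) and ignoring output tokens. The function name is hypothetical.

```shell
# Hypothetical helper: input-side cost in USD after removing cached tokens
# from the reported input count. Prices are per 1M tokens.
codex_input_cost_usd() {
  reported_in=$1
  cached=$2
  in_price=$3
  cached_price=$4
  uncached=$(( reported_in - cached ))
  awk -v u="$uncached" -v c="$cached" -v ip="$in_price" -v cp="$cached_price" \
    'BEGIN { printf "%.6f\n", (u * ip + c * cp) / 1000000 }'
}

codex_input_cost_usd 1000000 400000 2.5 0.25   # prints 1.600000
```

Without the subtraction the same inputs would cost 1000000 * 2.5 / 1M + 400000 * 0.25 / 1M = 2.60, which overcharges by the 400000 cached tokens billed at the full input price.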

@moodmosaic (Member, Author)

This resolves #26.

Codex CLI reads AGENTS.md for project instructions (not
.claude/CLAUDE.md) and .agents/skills/ for skills (not
.claude/skills/). In agent_settings, bridge both when
the Codex locations are absent:

- Copy .claude/CLAUDE.md (or root CLAUDE.md) to AGENTS.md
- Symlink .claude/skills/ to .agents/skills/

Both are added to .git/info/exclude so the agent doesn't
commit the bridged files. The skills symlink is only created
when .claude/skills/ exists (context=full); slim/none
strip it, so the bridge is a natural no-op there.

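
The bridging steps above can be sketched as follows. This is an illustration under assumptions, not the code in agent_settings: the function name is hypothetical, and the exclude entries and relative symlink target are one plausible choice.

```shell
# Hypothetical bridge: make Claude-style project files visible to Codex
# without letting the agent commit them.
bridge_claude_conventions() {
  repo=$1
  exclude="$repo/.git/info/exclude"
  mkdir -p "$repo/.git/info"
  # Copy the first CLAUDE.md found to AGENTS.md, unless AGENTS.md exists.
  if [ ! -e "$repo/AGENTS.md" ]; then
    for src in "$repo/.claude/CLAUDE.md" "$repo/CLAUDE.md"; do
      if [ -f "$src" ]; then
        cp "$src" "$repo/AGENTS.md"
        echo "AGENTS.md" >> "$exclude"
        break
      fi
    done
  fi
  # Link .agents/skills to .claude/skills only when the latter exists.
  if [ -d "$repo/.claude/skills" ] && [ ! -e "$repo/.agents/skills" ]; then
    mkdir -p "$repo/.agents"
    ln -s ../.claude/skills "$repo/.agents/skills"
    echo ".agents/skills" >> "$exclude"
  fi
  return 0
}
```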
Adds changelog entry covering the full Codex CLI driver
feature set: driver implementation, ChatGPT subscription
auth, .claude/ convention bridging, context stripping hooks,
stale rebase cleanup, inline system prompt, and per-driver
effort documentation.

@moodmosaic (Member, Author)

Tested extensively overnight on gethfuzz with 3 Codex agents
(2× ChatGPT subscription, 1× API key). Verified:

  • ChatGPT usage-limit errors retry correctly
  • API key agent ran full sessions with accurate cost tracking
  • AGENTS.md bridge: Codex agents now follow .claude/CLAUDE.md
    project rules (commit format, epistemics, repo invariants)
  • Skills bridge: .claude/skills/ symlinked to .agents/skills/,
    Codex agents discover and use existing skills
  • Context stripping (slim) survives git pull --rebase
  • Stale rebase-merge cleanup prevents push retry failures
  • System prompt inlined correctly (no custom branch pushes)
  • Provenance trailers show correct version numbers
  • 485 unit tests pass (drivers: 285, dashboard: 92, harness: 111)

@moodmosaic merged commit c00997c into master Apr 13, 2026
5 checks passed
@moodmosaic deleted the add-codex-driver branch April 13, 2026 16:15
@moodmosaic mentioned this pull request Apr 13, 2026