fix: make agent spawning work under Codex (inline fallback + native TOML) #18
Merged
added 2 commits
June 10, 2026 11:24
The installer copies our agent specs into .codex/agents/ as Markdown, but Codex custom agents must be TOML in its own schema — so Codex can't spawn any of them, and a scan's "Spawn agent: X" instructions pointed at agents that don't exist there. The orchestrator only knew how to spawn, not how to run an evaluator itself, so a Codex scan would fail outright. Adds a harness-neutral spawn protocol: spawn a named agent as a subagent if the harness supports it, otherwise read its spec and run it inline, in sequence. The orchestrator now falls back to running all 7 evaluators inline against the same evidence bundle — same scores and findings, just sequential. Drift-guarded. This is the robust baseline that makes the skill portable, not just Codex-aware. Native Codex TOML agents for real parallelism can come later if its subagent schema settles.
The inline fallback makes Codex work, but it leans on Codex interpreting our Markdown specs correctly every run. This adds the native path: at install, generate a Codex custom-agent TOML (.codex/agents/<name>.toml) from each spec so Codex spawns them directly. The .md stays put for the inline fallback, so it's belt-and-suspenders — native spawn when it works, inline when it doesn't. agentMdToCodexToml emits the three required fields (name, description, developer_instructions). model and sandbox_mode are omitted so a spawned agent inherits the parent's settings — mapping Claude's "sonnet" onto a Codex model would be wrong. The body goes in a TOML literal block so backslashes and regex in code examples survive verbatim. Uninstall removes the generated pixelslop*.toml. Verified end to end: a real install writes 13 valid TOMLs (6 agents + 7 evaluators). Can't live-test spawning on Codex from here, which is exactly why the inline fallback is the safety net.
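The conversion described in this commit can be sketched roughly as follows. This is an illustrative Python model, not the project's agentMdToCodexToml: it assumes each spec opens with flat `key: value` frontmatter between `---` lines, and the helper names and the sample agent name `pixelslop-example` are hypothetical. It emits only the three fields named above and writes the body as a TOML multi-line literal string, where backslashes are not escape characters.

```python
import re


def split_frontmatter(markdown: str) -> tuple[dict[str, str], str]:
    """Split a spec into flat `key: value` frontmatter and the Markdown body."""
    match = re.match(r"\A---\n(.*?)\n---\n?(.*)\Z", markdown, re.DOTALL)
    if match is None:
        return {}, markdown
    meta: dict[str, str] = {}
    for line in match.group(1).splitlines():
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta, match.group(2)


def toml_basic_string(value: str) -> str:
    """Quote a single-line value as a TOML basic string."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def agent_md_to_codex_toml(markdown: str) -> str:
    """Build a TOML document with name, description, developer_instructions."""
    meta, body = split_frontmatter(markdown)
    for field in ("name", "description"):
        if field not in meta:
            raise ValueError(f"spec frontmatter is missing {field!r}")
    # A literal block ends at the first ''' so the body must not contain one.
    if "'''" in body:
        raise ValueError("body contains ''' and cannot go in a literal block")
    lines = [
        f"name = {toml_basic_string(meta['name'])}",
        f"description = {toml_basic_string(meta['description'])}",
        # model and sandbox_mode are left out on purpose so the spawned
        # agent inherits the parent's settings.
        f"developer_instructions = '''\n{body}'''",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    spec = (
        "---\n"
        "name: pixelslop-example\n"
        "description: Hypothetical evaluator used for illustration\n"
        "model: sonnet\n"
        "---\n"
        "Match class names with the regex \\bbtn-\\w+ and report them.\n"
    )
    print(agent_md_to_codex_toml(spec))
```

Rejecting a body that contains three consecutive single quotes is a choice made for this sketch; the commit does not say how the real generator handles that case.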
Completes the cross-harness correctness work started by the asking-protocol fix. Targets the same held release as the exhaustive-by-default set (PR #16).
The bug
The installer copies our agent specs (orchestrator, setup, fixer, checker, + 7 internal evaluators) into .codex/agents/ as Markdown with Claude Code frontmatter. But Codex custom agents must be TOML in its own schema. So Codex recognizes none of them, and the orchestrator only knew how to spawn evaluators, not run them itself. A scan under Codex would fail.
The fix (inline fallback)
A harness-neutral spawn protocol: spawn a named agent as a subagent if your harness supports it, otherwise read its spec and run it inline, in sequence. The orchestrator now falls back to running all 7 evaluators inline against the same evidence bundle — identical scores and findings, just sequential. Mirrors the asking-protocol pattern, and it's what impeccable's critique does.
Works on Codex and any future harness, with zero dependency on an external schema. Native Codex TOML agents (for real parallelism) are a deliberate later optimization, only worth it once Codex's subagent model stabilizes.
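The protocol itself is written as instructions for the agent, not as code, but its control flow can be modeled in a few lines. The Python sketch below is only that model: `SpawnUnavailable`, `run_agent`, `run_evaluators`, and the two callables are hypothetical names, and treating a failed native spawn as a reason to fall back inline is an assumption of this sketch.

```python
from typing import Callable, Optional

Evidence = dict
Findings = dict
Runner = Callable[[str, Evidence], Findings]


class SpawnUnavailable(Exception):
    """Hypothetical signal that the harness could not spawn the named agent."""


def run_agent(name: str, evidence: Evidence, run_inline: Runner,
              spawn: Optional[Runner] = None) -> Findings:
    """Spawn the agent as a subagent if possible, otherwise run it inline."""
    if spawn is not None:
        try:
            return spawn(name, evidence)
        except SpawnUnavailable:
            pass  # fall through to the inline path
    # Inline path: the caller reads the agent's spec and executes it itself.
    return run_inline(name, evidence)


def run_evaluators(names: list[str], evidence: Evidence, run_inline: Runner,
                   spawn: Optional[Runner] = None) -> dict[str, Findings]:
    """Run every evaluator in order against the same evidence bundle."""
    return {name: run_agent(name, evidence, run_inline, spawn) for name in names}
```

Because every evaluator receives the same evidence bundle on either path, the results should not depend on which path ran; only the execution order and the degree of parallelism differ.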
Tests
1028 passing. The discoverability drift-guard now also asserts the spawn protocol exists, names Codex, gives the inline fallback, and that the orchestrator carries it.
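A drift-guard of this kind might look roughly like the Python sketch below. It is not the project's test: the function name, the marker words it searches for, and the idea of passing the two texts in as strings are all assumptions made for illustration.

```python
def check_spawn_protocol(protocol_text: str, orchestrator_text: str) -> list[str]:
    """Return a list of drift problems; an empty list means the guard passes."""
    problems: list[str] = []
    protocol = protocol_text.lower()
    if not protocol.strip():
        problems.append("spawn protocol section is missing")
    if "codex" not in protocol:
        problems.append("spawn protocol does not name Codex")
    if "inline" not in protocol:
        problems.append("spawn protocol does not give the inline fallback")
    # Assumed marker: the orchestrator spec mentions the inline fallback.
    if "inline" not in orchestrator_text.lower():
        problems.append("orchestrator does not carry the spawn protocol")
    return problems
```

A test would then assert that the returned list is empty for the shipped files, so a later edit that drops the fallback wording fails loudly.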