Fix codex-exec worker invocation + add phylogeny tests #3
Merged
codex-cli 0.130.0 'codex exec' has no --ask-for-approval flag; approval policy is a config override. Use -c approval_policy=on-request (with -c approvals_reviewer=auto_review) so workers self-unblock without an interactive approver. Sync the canonical command in .codex/DELEGATION.md. Found because the first real worker delegation failed: 'unexpected argument --ask-for-approval'.
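A minimal sketch of the corrected invocation shape, as a dry run that only prints the arguments. The helper name `print_codex_cmd` and the prompt text are hypothetical and not taken from scripts/codex-run.sh; passing the prompt as the final positional argument is also an assumption. Only the two `-c` overrides come from the commit message above.

```shell
#!/bin/sh
# Hypothetical helper (not from scripts/codex-run.sh): assembles the worker
# command and prints it one argument per line. Nothing is executed.
print_codex_cmd() {
  # Approval policy goes in as config overrides, not as --ask-for-approval.
  # Assumption: the prompt is the last positional argument.
  set -- codex exec \
    -c approval_policy=on-request \
    -c approvals_reviewer=auto_review \
    "$1"
  printf '%s\n' "$@"
}

# Illustrative prompt only.
print_codex_cmd "Write tests for src/sim/phylogeny.ts"
```

Printing one argument per line makes it easy to confirm that the prompt stays a single argument even when it contains spaces.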
First real Codex-exec worker delegation: tests for the previously untested pure module src/sim/phylogeny.ts — empty input, minPeak filtering, relink-through-pruned-ancestor, alive/extinct, maxNodes significance cap, row/t0 invariants. Independently verified: npx tsc -b clean, npx vitest run 15/15 green (incl. determinism). Also gitignore + untrack tsconfig.tsbuildinfo (build cache, was tracked).
@codex review
Codex Review: Didn't find any major issues. Breezy!
VoidAxiom added a commit that referenced this pull request on May 17, 2026
* Add fast review-gate 'wait' subcommand (tighten polling cadence)

  Fixes VOI-28. The convergence loop was hand-rolled at 30s ticks against a long fixed deadline, so a clean Codex review (lands in ~1-2min) cost ~5.5min of idle polling. New 'scripts/review-gate.sh wait <pr> [maxSec]': polls every 15s and returns the instant Codex acts, with FINDINGS (new unresolved thread) or REVIEWED-CLEAN (chatgpt-codex-connector comment + CI settled, zero open threads), else TIMEOUT at the ceiling. CLAUDE.md and /review-gate now point at it.

  Evidence: bash -n clean; npx tsc -b clean; functional dry-run on merged PR #3 returned REVIEWED-CLEAN in 1.3s (vs. minutes of old polling).

* review-gate wait: require a FRESH Codex comment, not any historical one

  Addresses Codex P1 on PR #4: REVIEWED-CLEAN counted any prior chatgpt-codex-connector comment, so in the resolve->re-review loop a stale earlier clean comment could declare clean before the fresh re-review landed (merge without required re-review). Now baselines the Codex comment count at wait invocation; REVIEWED-CLEAN requires fresh_codex>0 (a new comment since waiting began).

* review-gate wait: establish baseline robustly, never default to 0

  Addresses second Codex P1 on PR #4: a failed/transient/non-JSON baseline fetch previously fell back to BASE_CODEX=0, so a later successful poll made historical Codex comments look fresh and could print REVIEWED-CLEAN before the required re-review. Now retries the baseline up to 5x and ABORTS (exit 3) if it can't be established: fail loud rather than risk a stale-clean shortcut.
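The baseline and freshness rules described in these commits could be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of scripts/review-gate.sh: the function names `establish_baseline` and `classify_poll`, the `RETRY_DELAY` variable, the PENDING state, and the stub fetchers are all hypothetical, and the real script presumably queries GitHub where the stubs simply echo a value. Treating any open thread as FINDINGS is a simplification of "new unresolved thread".

```shell
#!/bin/sh
# Sketch only; names and structure are assumptions, not the real script.

# Retry the baseline fetch up to 5 times and never fall back to 0.
# $1 = command that prints the current Codex comment count.
establish_baseline() {
  fetch="$1"
  attempt=1
  while [ "$attempt" -le 5 ]; do
    if out="$("$fetch" 2>/dev/null)"; then
      case "$out" in
        ''|*[!0-9]*) ;;                          # empty or non-numeric: retry
        *) printf '%s\n' "$out"; return 0 ;;
      esac
    fi
    sleep "${RETRY_DELAY:-1}"
    attempt=$((attempt + 1))
  done
  return 3   # caller should abort instead of assuming a baseline
}

# Decide what a single poll means.
# $1 = baseline count, $2 = current count,
# $3 = open review threads, $4 = "yes" once CI has settled.
classify_poll() {
  fresh=$(( $2 - $1 ))
  if [ "$3" -gt 0 ]; then
    echo FINDINGS
  elif [ "$fresh" -gt 0 ] && [ "$4" = yes ]; then
    echo REVIEWED-CLEAN
  else
    echo PENDING   # keep polling until the ceiling, then report TIMEOUT
  fi
}

# Demo with stub fetchers standing in for the real API call.
fetch_ok()  { echo 2; }
fetch_bad() { echo "not json"; }

RETRY_DELAY=0
base="$(establish_baseline fetch_ok)"
classify_poll "$base" 2 0 yes   # only historical comments -> PENDING
classify_poll "$base" 3 0 yes   # one new comment          -> REVIEWED-CLEAN
establish_baseline fetch_bad || echo "baseline failed with exit $?"
```

Because the baseline is captured once at invocation, two historical clean comments never count as fresh. The failed fetch surfaces as exit 3 rather than as a baseline of 0, which is the failure mode the third commit closes.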
First real Codex-exec worker delegation, end-to-end.
What
* `scripts/codex-run.sh`: codex-cli 0.130.0 `codex exec` has no `--ask-for-approval` flag (caused `unexpected argument --ask-for-approval`, blocking all worker delegation). Now uses `-c approval_policy=on-request` + `-c approvals_reviewer=auto_review`. Canonical command in `.codex/DELEGATION.md` synced.
* `src/sim/phylogeny.test.ts` (new): authored by a Codex worker via the fixed pipeline. Covers the previously untested pure module: empty input, `minPeak` filtering, relink-through-pruned-ancestor, alive/extinct + `endTick`, `maxNodes` significance cap, row uniqueness + `t0` bounds.
* `tsconfig.tsbuildinfo`: untracked and gitignored (build cache, was tracked).

Evidence (independently verified by Claude, not trusted from Codex)
* `npx tsc -b` clean
* `npx vitest run` → 15/15 passed (6 new + 9 existing incl. determinism/reproducibility)
* `.codex-runs/phylo-tests/` (gitignored)

Notes