
Fix codex-exec worker invocation + add phylogeny tests#3

Merged
VoidAxiom merged 2 commits into main from sk/phylogeny-tests
May 17, 2026

Conversation

@VoidAxiom
Owner

First real Codex-exec worker delegation, end-to-end.

What

  • Fix scripts/codex-run.sh: in codex-cli 0.130.0, codex exec has no --ask-for-approval flag. Passing it failed with "unexpected argument --ask-for-approval" and blocked all worker delegation. The script now passes -c approval_policy=on-request and -c approvals_reviewer=auto_review instead. The canonical command in .codex/DELEGATION.md is synced.
  • src/sim/phylogeny.test.ts (new): authored by a Codex worker via the fixed pipeline. Covers the previously untested pure module — empty input, minPeak filtering, relink-through-pruned-ancestor, alive/extinct + endTick, maxNodes significance cap, row uniqueness + t0 bounds.
  • Untrack tsconfig.tsbuildinfo and add it to .gitignore (build cache that was previously tracked).
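
For illustration, the flag change in the first bullet can be sketched as an argv builder. This is a rough sketch only: buildCodexExecArgs is a hypothetical helper, not code from scripts/codex-run.sh (which is a shell script), and the only things taken from this PR are the exec subcommand and the two -c overrides. Placing the prompt as the last positional argument is an assumption.

```typescript
// Hypothetical helper, not part of the repo: builds the argument list for a
// non-interactive `codex exec` run using config overrides instead of the
// rejected --ask-for-approval flag.
function buildCodexExecArgs(prompt: string): string[] {
  // The two overrides named in this PR, in the order they are passed.
  const overrides: Array<[string, string]> = [
    ["approval_policy", "on-request"],
    ["approvals_reviewer", "auto_review"],
  ];

  const args: string[] = ["exec"];
  for (const pair of overrides) {
    // Each override becomes `-c key=value`.
    args.push("-c", pair[0] + "=" + pair[1]);
  }
  // Assumption: the worker prompt is passed as the final positional argument.
  args.push(prompt);
  return args;
}
```

Keeping the overrides in one list makes it easier to keep the script and the canonical command in .codex/DELEGATION.md in step.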

Evidence (independently verified by Claude, not trusted from Codex)

  • npx tsc -b clean
  • npx vitest run: 15/15 passed (6 new + 9 existing, incl. determinism/reproducibility)
  • Codex run packet retained at .codex-runs/phylo-tests/ (gitignored)

Notes

  • No Linear link — Linear MCP was just added to Claude Code (✓ connected) but its tools require a session restart to surface; back-link to follow.

VoidAxiom added 2 commits May 16, 2026 21:33
codex-cli 0.130.0 'codex exec' has no --ask-for-approval flag; approval
policy is a config override. Use -c approval_policy=on-request (with
-c approvals_reviewer=auto_review) so workers self-unblock without an
interactive approver. Sync the canonical command in .codex/DELEGATION.md.
Found because the first real worker delegation failed: 'unexpected
argument --ask-for-approval'.

First real Codex-exec worker delegation: tests for the previously
untested pure module src/sim/phylogeny.ts — empty input, minPeak
filtering, relink-through-pruned-ancestor, alive/extinct, maxNodes
significance cap, row/t0 invariants. Independently verified: npx tsc -b
clean, npx vitest run 15/15 green (incl. determinism). Also gitignore
+ untrack tsconfig.tsbuildinfo (build cache, was tracked).
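
As a rough illustration of two of the cases listed above (minPeak filtering and relink-through-pruned-ancestor), here is a self-contained sketch. It does not reproduce src/sim/phylogeny.ts: the Lineage shape, the pruneAndRelink name, and the "keep if peak >= minPeak" rule are all assumptions made for the example.

```typescript
// Hypothetical types and function, NOT the API of src/sim/phylogeny.ts.
interface Lineage {
  id: number;
  parent: number | null; // id of the parent lineage, or null for a root
  peak: number;          // assumed: peak population reached by the lineage
}

// Drops lineages whose peak is below minPeak and re-attaches each surviving
// lineage to its nearest surviving ancestor (or null if there is none).
function pruneAndRelink(lineages: Lineage[], minPeak: number): Lineage[] {
  const byId = new Map<number, Lineage>();
  for (const l of lineages) byId.set(l.id, l);

  const result: Lineage[] = [];
  for (const l of lineages) {
    if (l.peak < minPeak) continue; // filtered out

    let parent = l.parent;
    const seen = new Set<number>(); // guards against cyclic parent links
    while (parent !== null) {
      const ancestor = byId.get(parent);
      if (ancestor === undefined || seen.has(parent)) {
        parent = null; // unknown or cyclic ancestor: treat as a root
        break;
      }
      if (ancestor.peak >= minPeak) break; // nearest surviving ancestor found
      seen.add(parent);
      parent = ancestor.parent; // skip the pruned ancestor and keep walking up
    }
    result.push({ id: l.id, parent: parent, peak: l.peak });
  }
  return result;
}
```

With lineages 1 (root, peak 10), 2 (child of 1, peak 1) and 3 (child of 2, peak 8) and a minPeak of 5, lineage 2 is pruned and lineage 3 is relinked to lineage 1. Empty input yields an empty result.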
@VoidAxiom
Owner Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Breezy!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@VoidAxiom VoidAxiom merged commit 1fc0fb4 into main May 17, 2026
1 check passed
@VoidAxiom VoidAxiom deleted the sk/phylogeny-tests branch May 17, 2026 01:41
VoidAxiom added a commit that referenced this pull request May 17, 2026
* Add fast review-gate 'wait' subcommand (tighten polling cadence)

Fixes VOI-28. The convergence loop was hand-rolled at 30s ticks against
a long fixed deadline, so a clean Codex review (lands in ~1-2min) cost
~5.5min of idle polling. New 'scripts/review-gate.sh wait <pr> [maxSec]':
polls every 15s and returns the instant Codex acts — FINDINGS (new
unresolved thread) or REVIEWED-CLEAN (chatgpt-codex-connector comment +
CI settled, zero open threads) — else TIMEOUT at the ceiling. CLAUDE.md
and /review-gate now point at it.

Evidence: bash -n clean; npx tsc -b clean; functional dry-run on merged
PR #3 returned REVIEWED-CLEAN in 1.3s (vs. minutes of old polling).

* review-gate wait: require a FRESH Codex comment, not any historical one

Addresses Codex P1 on PR #4: REVIEWED-CLEAN counted any prior
chatgpt-codex-connector comment, so in the resolve->re-review loop a
stale earlier clean comment could declare clean before the fresh
re-review landed (merge without required re-review). Now baselines the
Codex comment count at wait invocation; REVIEWED-CLEAN requires
fresh_codex>0 (a new comment since waiting began).

* review-gate wait: establish baseline robustly, never default to 0

Addresses second Codex P1 on PR #4: a failed/transient/non-JSON
baseline fetch previously fell back to BASE_CODEX=0, so a later
successful poll made historical Codex comments look fresh and could
print REVIEWED-CLEAN before the required re-review. Now retries the
baseline up to 5x and ABORTS (exit 3) if it can't be established —
fail loud rather than risk a stale-clean shortcut.
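
The decision logic described in the three commits above can be sketched as two pure functions. This is only an illustration in TypeScript: the real scripts/review-gate.sh is a shell script whose internals are not shown here, so PrSnapshot, classify, and establishBaseline are hypothetical names. The sketch also treats any open thread as FINDINGS, and it throws where the script is described as exiting with code 3.

```typescript
type Verdict = "FINDINGS" | "REVIEWED-CLEAN" | "PENDING";

// Hypothetical snapshot of PR state at one poll.
interface PrSnapshot {
  codexComments: number; // total chatgpt-codex-connector comments so far
  openThreads: number;   // unresolved review threads
  ciSettled: boolean;    // true once all checks have finished
}

// Classifies one poll. A clean verdict requires a Codex comment that is
// fresh relative to the baseline taken when waiting began.
function classify(snap: PrSnapshot, baseCodex: number): Verdict {
  if (snap.openThreads > 0) return "FINDINGS";
  const freshCodex = snap.codexComments - baseCodex;
  if (freshCodex > 0 && snap.ciSettled) return "REVIEWED-CLEAN";
  return "PENDING"; // caller keeps polling until its timeout ceiling
}

// Establishes the baseline comment count, retrying on failure and never
// falling back to 0. fetchCount is a hypothetical stand-in for the API call.
function establishBaseline(fetchCount: () => number, maxAttempts = 5): number {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const n = fetchCount();
      if (Number.isInteger(n) && n >= 0) return n;
    } catch {
      // transient failure: fall through and retry
    }
  }
  // Fail loudly rather than risk a stale-clean shortcut.
  throw new Error("could not establish Codex comment baseline");
}
```

With a baseline of 1, a snapshot that still shows exactly one Codex comment stays PENDING even when CI has settled and no threads are open, which is the stale-clean case the second commit closes.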