fix(sdlc): reform parity completion — codex-headless + broadcast groups + executor contract + gate-manifest CI + flake-quarantine wiring#3811
Conversation
…s, executor contract, gate-manifest CI, flake-quarantine wiring Five wired-but-incomplete Phase-6 parity gaps (CASE-CROSS-RUNTIME-COMMS-001), closing "feature sleeping" / producer-without-consumer defects. 1. hapax-codex-headless: genuine `codex exec` headless lane (NOT a tmux pane). Streams JSONL to ~/.cache/hapax/codex-headless/<lane>/output.jsonl, writes a PID file for the staleness watchdog, same identity-env + task-gate + hook wiring as hapax-codex, default-deny like hapax-claude-headless. Repoints launch_codex_headless to it; adds launch_codex_interactive for the tmux path. 2. relay broadcast groups: expand_recipients gains antigrav / vibe / workers so every runtime is reachable by a `*:group` broadcast (was: ValueError). 3. Executor adapter contract: scripts/executor_contract.py defines the common adapter CLI + a capability registry. Dispatch consumes supports_route() via a launcher table instead of a hard (platform,mode) if-ladder. New probe `hapax-executor-capabilities` + `hapax-methodology-dispatch --capabilities` emit it as JSON (antigrav's IDE hook-gap is machine-legible). 4. gate-manifest-check wired into CI (lint step, --skip-claude-settings, self- listed in ci.run_markers) and pre-commit (--require-claude-settings, the live ~/.claude/settings.json — the most drift-prone wiring CI cannot see). 5. flake-quarantine / bisection WRITE side wired into the live autoqueue: reconcile_flake_quarantines opens/lifts quarantines and writes on apply; bisection_plan_for_failed_runs halves failed merge-group batches. Renamed the misleading count-freeze names to advisory (DEFAULT_ADVISORY_OPEN_PR_COUNT, ADVISORY_OPEN_PR_COUNT, --advisory-open-pr-count) with deprecated aliases. Deferred: the systemd cc-pr-autoqueue.service comment is outside this task's mutation_scope_refs; the --storm-open-pr-threshold deprecated alias keeps the unit functional. Task: reform-fix-parity-completion-20260531 Parent spec: coordination-reform-master-design-2026-05-30.md (Phase 6) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughWalkthroughThis PR integrates five runtime executors (Claude, Codex, Gemini, Vibe, Antigrav) with a shared capability registry, implements flake-quarantine reconciliation for PR admission, adds a Codex headless launcher, and refactors dispatch routing for capability awareness. It also extends relay broadcast groups and updates advisory thresholds for queue throttling. ChangesExecutor Governance & Quarantine Integration
🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 083a6377bd
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| modes=("headless", "interactive"), | ||
| profiles=("full", "spark"), |
There was a problem hiding this comment.
Register a Codex interactive platform path
Advertising Codex interactive support here does not make the route launchable because hapax-methodology-dispatch still calls route_for(args.platform, args.mode, args.profile) against PLATFORM_PATHS before it reaches the new launcher table, and that table has no ("codex", "interactive", "full") entry. As a result, --platform codex --mode interactive --launch is reported as supported by supports_route()/the capabilities CLI but is refused earlier as an unsupported legacy route, so the new launch_codex_interactive path is unreachable.
Useful? React with 👍 / 👎.
| WORKDIR="${HAPAX_CODEX_HEADLESS_WORKDIR:-$HOME/projects/hapax-council--$SESSION}" | ||
| if [[ ! -d "$WORKDIR" ]]; then | ||
| WORKDIR="$COUNCIL_DIR" | ||
| fi |
There was a problem hiding this comment.
Create the Codex headless worktree instead of falling back
When a new cx-* headless lane has no ~/projects/hapax-council--$SESSION yet, this falls back to $COUNCIL_DIR and launches an autonomous mutating codex exec in the primary checkout. Unlike hapax-codex, which creates the codex-native worktree before launch, the first headless dispatch for a fresh lane will share and dirty the main repo rather than its isolated lane worktree.
Useful? React with 👍 / 👎.
| status = "" | ||
| if validation.task is not None: | ||
| status = strip_scalar(str(validation.task.fields.get("status", ""))) | ||
| args = [str(launcher), "--task", task_id, "--force"] |
There was a problem hiding this comment.
Stop forcing duplicate Codex headless launches
Every dispatch to the new headless launcher includes --force, but hapax-codex-headless only honors its live-PID guard when FORCE != 1. If a dispatcher targets a lane that already has a running headless Codex process, this bypasses the duplicate-launch check, overwrites the PID file, and starts a second autonomous mutator in the same lane/worktree/log instead of refusing or requiring an explicit operator relaunch.
Useful? React with 👍 / 👎.
| profiles=("full", "worker", "flash", "lite"), | ||
| mutates=False, | ||
| claims=False, |
There was a problem hiding this comment.
Expose Gemini worker mutation in capabilities
This capability record lists the worker profile but still reports mutates=False and claims=False, even though hapax-methodology-dispatch has a mutable gemini/headless/worker route and launches it with HAPAX_GEMINI_APPROVAL_MODE=auto_edit. Any cockpit or scheduler using the new machine-readable capabilities will incorrectly treat the governed Gemini worker path as read-only/non-claiming and can route or display it unsafely.
Useful? React with 👍 / 👎.
| # operator machine where the live settings exist; CI checks the rest. | ||
| entry: uv run python scripts/gate-manifest-check.py --require-claude-settings | ||
| language: system | ||
| files: ^(hooks/gate-manifest\.yaml|hooks/scripts/(codex|antigrav)-hook-adapter\.sh|scripts/hapax-(codex|antigrav|vibe|gemini)|config/codex/config\.toml|\.github/workflows/ci\.yml)$ |
There was a problem hiding this comment.
Run the live settings pre-commit check unconditionally
Because this hook has a files: filter and no always_run: true, pre-commit only invokes the --require-claude-settings live-settings check when one of these repo paths is staged. The drift-prone file it is meant to protect (~/.claude/settings.json) is outside the repo, so settings drift followed by an ordinary source/docs commit skips the check entirely instead of enforcing the advertised live wiring validation.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (2)
.pre-commit-config.yaml (1)
58-66: ⚡ Quick winInclude self-change paths in the hook matcher.
gate-manifest-checkcurrently won’t self-trigger when.pre-commit-config.yamlorscripts/gate-manifest-check.pychanges, which weakens local drift detection.Proposed update
- files: ^(hooks/gate-manifest\.yaml|hooks/scripts/(codex|antigrav)-hook-adapter\.sh|scripts/hapax-(codex|antigrav|vibe|gemini)|config/codex/config\.toml|\.github/workflows/ci\.yml)$ + files: ^(hooks/gate-manifest\.yaml|hooks/scripts/(codex|antigrav)-hook-adapter\.sh|scripts/hapax-(codex|antigrav|vibe|gemini)|scripts/gate-manifest-check\.py|config/codex/config\.toml|\.github/workflows/ci\.yml|\.pre-commit-config\.yaml)$🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.pre-commit-config.yaml around lines 58 - 66, The gate-manifest-check hook (id: gate-manifest-check) won't run when its own config or the checker script changes; update the hook's files matcher (the files: pattern in the gate-manifest-check entry) to also match .pre-commit-config.yaml and scripts/gate-manifest-check.py so the hook self-triggers on those self-change paths while keeping pass_filenames: false and the existing matches intact.tests/hooks/test_gate_manifest_check.py (1)
146-160: ⚡ Quick winAssert wiring structurally instead of substring matching.
This test can false-pass if the command appears in comments/text but not the actual hook/step entry.
Proposed update
def test_gate_manifest_check_is_wired_in_ci_and_precommit() -> None: - ci_text = (REPO_ROOT / ".github" / "workflows" / "ci.yml").read_text(encoding="utf-8") - precommit_text = (REPO_ROOT / ".pre-commit-config.yaml").read_text(encoding="utf-8") - # The checker runs in CI (codex/antigrav/vibe/gemini/ci wiring) and in - # pre-commit (the live ~/.claude/settings.json, the most drift-prone wiring). - assert "scripts/gate-manifest-check.py" in ci_text - assert "scripts/gate-manifest-check.py" in precommit_text - assert "--require-claude-settings" in precommit_text # live check in pre-commit - assert "--skip-claude-settings" in ci_text # CI runners have no live settings + ci = yaml.safe_load((REPO_ROOT / ".github" / "workflows" / "ci.yml").read_text(encoding="utf-8")) + precommit = yaml.safe_load((REPO_ROOT / ".pre-commit-config.yaml").read_text(encoding="utf-8")) + # The checker runs in CI (codex/antigrav/vibe/gemini/ci wiring) and in + # pre-commit (the live ~/.claude/settings.json, the most drift-prone wiring). + lint_steps = ci["jobs"]["lint"]["steps"] + assert any( + step.get("name") == "Gate manifest wiring check" + and "scripts/gate-manifest-check.py" in step.get("run", "") + and "--skip-claude-settings" in step.get("run", "") + for step in lint_steps + ) + local_hooks = [ + hook + for repo in precommit["repos"] + if repo.get("repo") == "local" + for hook in repo.get("hooks", []) + ] + gate_hook = next(hook for hook in local_hooks if hook.get("id") == "gate-manifest-check") + assert "scripts/gate-manifest-check.py" in gate_hook.get("entry", "") + assert "--require-claude-settings" in gate_hook.get("entry", "") # The manifest lists its own CI invocation so the checker self-verifies the # CI wiring is present (check_ci run-marker round-trip). manifest = _manifest()🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/hooks/test_gate_manifest_check.py` around lines 146 - 160, The test test_gate_manifest_check_is_wired_in_ci_and_precommit should parse the YAML structures instead of substring-searching: load the CI YAML (the content currently read into ci_text) with yaml.safe_load and walk to the job steps to assert there is a step whose "run" or "uses" invokes "scripts/gate-manifest-check.py" and that its arguments include "--skip-claude-settings"; likewise load the pre-commit YAML (precommit_text) and assert one of the defined hooks in the hooks list has an "entry" that invokes "scripts/gate-manifest-check.py" and includes "--require-claude-settings"; keep the manifest check using _manifest() but change the run_markers assertion in test_gate_manifest_check_is_wired_in_ci_and_precommit to compare markers for the exact substring "gate-manifest-check.py" within each marker element (or exact equality) rather than relying on raw file substring presence.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/cc-pr-autoqueue.py`:
- Around line 987-997: The read/modify/write of the quarantine ledger is racy:
the sequence using read_quarantine(...), reconcile_flake_quarantines(...), and
write_quarantine(...) must be serialized so two concurrent --apply runs don't
clobber each other; wrap the full read → reconcile → write sequence with an
exclusive file lock on quarantine_path (or implement an optimistic
read-verify-retry loop that re-reads the file before writing and retries if it
changed) so that the code that calls read_quarantine,
reconcile_flake_quarantines, and write_quarantine holds the lock until
write_quarantine completes; ensure the lock is acquired before calling
read_quarantine and released only after write_quarantine returns (use a
cross-platform file lock helper or retry-on-conflict loop).
In `@scripts/hapax-codex-headless`:
- Line 225: The script currently hardcodes
mcp_servers.hapax.env.LOGOS_BASE_URL="http://localhost:8051/api"; replace that
literal with the configured/logical LOGOS endpoint used elsewhere in the script
(the same environment variable set earlier) so non-default environments honor
the contract; specifically, remove the hardcoded URL and reference the existing
configured value (e.g., expand the LOGOS_BASE_URL/LOGOS endpoint variable used
around the earlier configuration block) so mcp_servers.hapax.env.LOGOS_BASE_URL
is assigned from the configured variable instead of the literal.
In `@scripts/hapax-methodology-dispatch`:
- Around line 906-909: The args list currently unconditionally appends "--force"
which bypasses the live-lane PID safety check; change the construction so
"--force" is only appended when an explicit intent exists (e.g., a boolean flag
like force_relaunch or a CLI/env var), e.g., build args = [str(launcher),
"--task", task_id] then if force_relaunch: args.append("--force"); retain the
existing conditional that adds "--no-claim" when status in
{"claimed","in_progress"} and still extend with [lane, prompt]; update any
callers or flag parsing to expose the new force_relaunch control so relaunches
are not forced by default.
In `@shared/merge_queue_lineage.py`:
- Around line 1053-1059: The loop currently does an unguarded int(pr) which will
raise on malformed values; wrap the conversion in a try/except (catching
ValueError and TypeError) around int(pr) inside the block that handles run_id/pr
(the same scope using run_id, pr, bucket and by_run), and on conversion failure
simply continue so malformed pr values are skipped as the docstring promises.
- Around line 1003-1029: The reconciliation currently can lift a quarantine
(lift_quarantine) and then immediately re-open it in the same pass; to prevent
that, exclude PRs that were lifted in this pass from being re-quarantined by
adding a guard in the candidate loop: after computing active_after_lift and
before calling should_quarantine_pr/open_quarantine, check if pr_number is in
the lifted collection (the lifted list populated when calling lift_quarantine)
and skip it (continue) if so; keep the existing active_after_lift and newly
handling unchanged.
---
Nitpick comments:
In @.pre-commit-config.yaml:
- Around line 58-66: The gate-manifest-check hook (id: gate-manifest-check)
won't run when its own config or the checker script changes; update the hook's
files matcher (the files: pattern in the gate-manifest-check entry) to also
match .pre-commit-config.yaml and scripts/gate-manifest-check.py so the hook
self-triggers on those self-change paths while keeping pass_filenames: false and
the existing matches intact.
In `@tests/hooks/test_gate_manifest_check.py`:
- Around line 146-160: The test
test_gate_manifest_check_is_wired_in_ci_and_precommit should parse the YAML
structures instead of substring-searching: load the CI YAML (the content
currently read into ci_text) with yaml.safe_load and walk to the job steps to
assert there is a step whose "run" or "uses" invokes
"scripts/gate-manifest-check.py" and that its arguments include
"--skip-claude-settings"; likewise load the pre-commit YAML (precommit_text) and
assert one of the defined hooks in the hooks list has an "entry" that invokes
"scripts/gate-manifest-check.py" and includes "--require-claude-settings"; keep
the manifest check using _manifest() but change the run_markers assertion in
test_gate_manifest_check_is_wired_in_ci_and_precommit to compare markers for the
exact substring "gate-manifest-check.py" within each marker element (or exact
equality) rather than relying on raw file substring presence.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: c920dfa4-c2b4-4938-ae55-53a0b592dae1
📒 Files selected for processing (16)
.github/workflows/ci.yml.pre-commit-config.yamlhooks/gate-manifest.yamlscripts/cc-pr-autoqueue.pyscripts/executor_contract.pyscripts/hapax-codex-headlessscripts/hapax-executor-capabilitiesscripts/hapax-methodology-dispatchscripts/hapax-pr-admissionshared/merge_queue_lineage.pyshared/relay_mq.pytests/hooks/test_gate_manifest_check.pytests/shared/test_relay_mq.pytests/test_cc_pr_autoqueue.pytests/test_executor_contract.pytests/test_merge_queue_lineage.py
| existing_quarantine = read_quarantine(quarantine_path) | ||
| quarantined_prs = active_quarantined_pr_numbers(existing_quarantine, now=now) | ||
| quarantine_reconciliation = reconcile_flake_quarantines( | ||
| existing_quarantine, | ||
| lineage_records, | ||
| candidate_prs={pr.number for pr in prs}, | ||
| policy=throttle_policy, | ||
| now=now, | ||
| ) | ||
| throttle_policy = FleetThrottlePolicy(advisory_open_pr_count=storm_open_pr_threshold) | ||
| if apply and (quarantine_reconciliation.newly_quarantined or quarantine_reconciliation.lifted): | ||
| write_quarantine(quarantine_path, quarantine_reconciliation.records) |
There was a problem hiding this comment.
Serialize the quarantine read/modify/write cycle.
This new write-side path reads the ledger, reconciles in memory, then overwrites the whole file without any lock. Two overlapping --apply runs can start from the same snapshot and the later write_quarantine() will silently drop the earlier run's newly opened/lifted records.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@scripts/cc-pr-autoqueue.py` around lines 987 - 997, The read/modify/write of
the quarantine ledger is racy: the sequence using read_quarantine(...),
reconcile_flake_quarantines(...), and write_quarantine(...) must be serialized
so two concurrent --apply runs don't clobber each other; wrap the full read →
reconcile → write sequence with an exclusive file lock on quarantine_path (or
implement an optimistic read-verify-retry loop that re-reads the file before
writing and retries if it changed) so that the code that calls read_quarantine,
reconcile_flake_quarantines, and write_quarantine holds the lock until
write_quarantine completes; ensure the lock is acquired before calling
read_quarantine and released only after write_quarantine returns (use a
cross-platform file lock helper or retry-on-conflict loop).
| -c "hooks.Stop=[{command=\"$HOOK\",timeout=20,statusMessage=\"Writing Hapax session summary\"}]" | ||
| -c "mcp_servers.hapax.command=\"$HOME/.local/bin/uv\"" | ||
| -c "mcp_servers.hapax.args=[\"--directory\",\"$HOME/projects/hapax-mcp\",\"run\",\"hapax-mcp\"]" | ||
| -c 'mcp_servers.hapax.env.LOGOS_BASE_URL="http://localhost:8051/api"' |
There was a problem hiding this comment.
Use the configured LOGOS endpoint instead of a hardcoded URL.
Line 225 hardcodes the MCP LOGOS_BASE_URL, which ignores the environment contract set at Line 141 and breaks non-default environments.
Proposed fix
- -c 'mcp_servers.hapax.env.LOGOS_BASE_URL="http://localhost:8051/api"'
+ -c "mcp_servers.hapax.env.LOGOS_BASE_URL=\"$LOGOS_BASE_URL\""📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| -c 'mcp_servers.hapax.env.LOGOS_BASE_URL="http://localhost:8051/api"' | |
| -c "mcp_servers.hapax.env.LOGOS_BASE_URL=\"$LOGOS_BASE_URL\"" |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@scripts/hapax-codex-headless` at line 225, The script currently hardcodes
mcp_servers.hapax.env.LOGOS_BASE_URL="http://localhost:8051/api"; replace that
literal with the configured/logical LOGOS endpoint used elsewhere in the script
(the same environment variable set earlier) so non-default environments honor
the contract; specifically, remove the hardcoded URL and reference the existing
configured value (e.g., expand the LOGOS_BASE_URL/LOGOS endpoint variable used
around the earlier configuration block) so mcp_servers.hapax.env.LOGOS_BASE_URL
is assigned from the configured variable instead of the literal.
| for record in existing: | ||
| if record.released_at is None and not quarantine_active(record, now=now): | ||
| out.append(lift_quarantine(record, now=now)) | ||
| lifted.append(record.pr_number) | ||
| else: | ||
| out.append(record) | ||
|
|
||
| active_after_lift = active_quarantined_pr_numbers(out, now=now) | ||
| newly: list[int] = [] | ||
| window_hours = policy.window_seconds / 3600 | ||
| for pr_number in sorted(set(candidate_prs)): | ||
| if pr_number in active_after_lift: | ||
| continue | ||
| if should_quarantine_pr(pr_number, lineage, policy=policy, now=now): | ||
| out.append( | ||
| open_quarantine( | ||
| pr_number, | ||
| reason=( | ||
| f"{counts.get(pr_number, 0)} CI failures within " | ||
| f"{window_hours:.0f}h (>= {policy.quarantine_failures}); " | ||
| "reversible flake quarantine" | ||
| ), | ||
| now=now, | ||
| cooldown_seconds=policy.quarantine_cooldown_seconds, | ||
| ) | ||
| ) | ||
| newly.append(pr_number) |
There was a problem hiding this comment.
Don't reopen a quarantine in the same reconciliation pass after lifting it.
If quarantine_cooldown_seconds is shorter than window_seconds, this pass can lift an expired record at Lines 1004-1006 and then immediately reopen it from the same historical failures at Lines 1013-1028. That makes the cooldown ineffective for supported non-default policies.
Suggested guard
active_after_lift = active_quarantined_pr_numbers(out, now=now)
newly: list[int] = []
+ lifted_set = set(lifted)
window_hours = policy.window_seconds / 3600
for pr_number in sorted(set(candidate_prs)):
+ if pr_number in lifted_set:
+ continue
if pr_number in active_after_lift:
continue
if should_quarantine_pr(pr_number, lineage, policy=policy, now=now):📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| for record in existing: | |
| if record.released_at is None and not quarantine_active(record, now=now): | |
| out.append(lift_quarantine(record, now=now)) | |
| lifted.append(record.pr_number) | |
| else: | |
| out.append(record) | |
| active_after_lift = active_quarantined_pr_numbers(out, now=now) | |
| newly: list[int] = [] | |
| window_hours = policy.window_seconds / 3600 | |
| for pr_number in sorted(set(candidate_prs)): | |
| if pr_number in active_after_lift: | |
| continue | |
| if should_quarantine_pr(pr_number, lineage, policy=policy, now=now): | |
| out.append( | |
| open_quarantine( | |
| pr_number, | |
| reason=( | |
| f"{counts.get(pr_number, 0)} CI failures within " | |
| f"{window_hours:.0f}h (>= {policy.quarantine_failures}); " | |
| "reversible flake quarantine" | |
| ), | |
| now=now, | |
| cooldown_seconds=policy.quarantine_cooldown_seconds, | |
| ) | |
| ) | |
| newly.append(pr_number) | |
| for record in existing: | |
| if record.released_at is None and not quarantine_active(record, now=now): | |
| out.append(lift_quarantine(record, now=now)) | |
| lifted.append(record.pr_number) | |
| else: | |
| out.append(record) | |
| active_after_lift = active_quarantined_pr_numbers(out, now=now) | |
| newly: list[int] = [] | |
| lifted_set = set(lifted) | |
| window_hours = policy.window_seconds / 3600 | |
| for pr_number in sorted(set(candidate_prs)): | |
| if pr_number in lifted_set: | |
| continue | |
| if pr_number in active_after_lift: | |
| continue | |
| if should_quarantine_pr(pr_number, lineage, policy=policy, now=now): | |
| out.append( | |
| open_quarantine( | |
| pr_number, | |
| reason=( | |
| f"{counts.get(pr_number, 0)} CI failures within " | |
| f"{window_hours:.0f}h (>= {policy.quarantine_failures}); " | |
| "reversible flake quarantine" | |
| ), | |
| now=now, | |
| cooldown_seconds=policy.quarantine_cooldown_seconds, | |
| ) | |
| ) | |
| newly.append(pr_number) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@shared/merge_queue_lineage.py` around lines 1003 - 1029, The reconciliation
currently can lift a quarantine (lift_quarantine) and then immediately re-open
it in the same pass; to prevent that, exclude PRs that were lifted in this pass
from being re-quarantined by adding a guard in the candidate loop: after
computing active_after_lift and before calling
should_quarantine_pr/open_quarantine, check if pr_number is in the lifted
collection (the lifted list populated when calling lift_quarantine) and skip it
(continue) if so; keep the existing active_after_lift and newly handling
unchanged.
| run_id = run.get("run_id") | ||
| pr = run.get("pr") | ||
| if run_id is None or pr is None: | ||
| continue | ||
| bucket = by_run[str(run_id)] | ||
| if int(pr) not in bucket: | ||
| bucket.append(int(pr)) |
There was a problem hiding this comment.
Skip malformed pr values instead of crashing.
The docstring says malformed rows are skipped, but int(pr) here will still raise on values like "abc" and take down the reconciler. Guard the conversion and continue on failure.
Suggested hardening
for run in failed_runs:
run_id = run.get("run_id")
pr = run.get("pr")
if run_id is None or pr is None:
continue
+ try:
+ pr_number = int(pr)
+ except (TypeError, ValueError):
+ continue
bucket = by_run[str(run_id)]
- if int(pr) not in bucket:
- bucket.append(int(pr))
+ if pr_number not in bucket:
+ bucket.append(pr_number)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@shared/merge_queue_lineage.py` around lines 1053 - 1059, The loop currently
does an unguarded int(pr) which will raise on malformed values; wrap the
conversion in a try/except (catching ValueError and TypeError) around int(pr)
inside the block that handles run_id/pr (the same scope using run_id, pr, bucket
and by_run), and on conversion failure simply continue so malformed pr values
are skipped as the docstring promises.
…eractive path
The merge-queue full-suite run caught a regression the PR's lighter test job
missed: tests/scripts/test_hapax_methodology_dispatch.py stubbed the codex
launcher via the old HAPAX_METHODOLOGY_CODEX_LAUNCHER env var, but the repointed
launch_codex_headless now reads HAPAX_METHODOLOGY_CODEX_HEADLESS → the real
default-deny hapax-codex-headless ran and exited 77.
- Tests: stub HAPAX_METHODOLOGY_CODEX_HEADLESS; assert the new inline CLI
(--task <id> --force <lane> <prompt>) instead of the old --bootstrap form.
- Drop launch_codex_interactive + the ("codex","interactive") launcher entry:
codex-interactive has no PLATFORM_PATHS route (route-validation rejects it
before launch), so it was unreachable dead code that also made the capability
registry inconsistent. write_codex_bootstrap (only its caller) removed too.
- executor_contract: codex modes -> ("headless",) to match the dispatch's
launchable routes; update the contract tests.
Task: reform-fix-parity-completion-20260531
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…-completion-20260531 # Conflicts: # tests/test_cc_pr_autoqueue.py
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b18d3a7d9a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| existing_quarantine = read_quarantine(quarantine_path) | ||
| quarantined_prs = active_quarantined_pr_numbers(existing_quarantine, now=now) | ||
| quarantine_reconciliation = reconcile_flake_quarantines( |
There was a problem hiding this comment.
Apply quarantines to PR admission decisions
Please feed the active flake quarantine into classify_pr/the mutation loop, not just the fleet-rate calculation. With an existing active quarantine for a PR whose checks and task metadata are otherwise ready, these new lines only exclude it from decide_fleet_throttle/recommend_max_entries_to_build; the later classify_pr(...) calls receive no quarantine context and can still return queue or enable_auto_merge, so apply=True will call gh pr merge for the quarantined flaky PR and defeat the quarantine.
Useful? React with 👍 / 👎.
| ADAPTER_CLI_CONTRACT: tuple[str, ...] = ( | ||
| "--lane", | ||
| "--task", | ||
| "--mode", # headless | interactive | receipt-only | ||
| "--prompt", | ||
| "--no-claim", | ||
| "--force", |
There was a problem hiding this comment.
Align the advertised adapter flags with launchers
The contract emitted by hapax-executor-capabilities --contract is not accepted by the new Codex headless adapter: hapax-codex-headless only parses --task, --no-claim, and --force before positional <cx-session> <governed-initial-message>, so a client following this machine-readable contract with --lane cx-red --prompt ... is rejected before launch. Either implement these flags in the adapters or stop advertising them as accepted by every launcher.
Useful? React with 👍 / 👎.
| return peer.startswith("antigrav") | ||
|
|
||
|
|
||
| def _is_vibe_lane(peer: str) -> bool: | ||
| return peer.startswith("vbe") or peer.startswith("vibe") |
There was a problem hiding this comment.
Exclude relay status files from worker broadcasts
When the relay directory contains status YAMLs such as antigrav-status.yaml (the retire helper checks <role>-status.yaml), these prefix checks include antigrav-status/vibe... as recipients for *:antigrav, *:vibe, and *:workers. Those stems are not live consumers, so group dispatches can leave unconsumable offered messages and make delivery look incomplete; restrict the predicates to actual lane names or filter *-status.
Useful? React with 👍 / 👎.
| bucket = by_run[str(run_id)] | ||
| if int(pr) not in bucket: | ||
| bucket.append(int(pr)) |
There was a problem hiding this comment.
Build bisection input from full merge-group batches
The new bisection plan expects multiple rows with the same run_id, but the live lineage store is keyed by merge_group_run_id in write_jsonl_records, so normal failed_recent_merge_group_runs contains at most one PR per failed run. In that case every bucket is a single-PR batch and bisect_failed_batch() returns no split, meaning the reported recommended_bisections stays empty precisely for the failed merge-group batches it is meant to isolate.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1205979d9c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if status in {"claimed", "in_progress"}: | ||
| args.append("--no-claim") |
There was a problem hiding this comment.
Require the Codex lane claim before launch
When dispatching a task whose note is already claimed or in_progress, this always passes --no-claim to the new headless launcher. That launcher only claims when CLAIM_TASK=1; if the target cx-* lane has no existing ~/.cache/hapax/cc-active-task-$SESSION file (for example after a stale note, reboot, or reassignment where the frontmatter says assigned to this lane but the lease is absent), it still starts codex exec with just --task and no session claim. That bypasses the mandatory cc-task lease for an autonomous mutating headless run; either verify the lane claim exists before adding --no-claim or let the launcher run cc-claim and fail closed.
Useful? React with 👍 / 👎.
Reform parity completion (Phase 6, CASE-CROSS-RUNTIME-COMMS-001)
Closes five wired-but-incomplete parity/queue gaps from the completion audit — all "feature sleeping" or producer-without-consumer.
cc-task:
reform-fix-parity-completion-20260531Parent spec:
coordination-reform-master-design-2026-05-30.md(Phase 6 §6)What changed
hapax-codex-headless— genuinecodex execheadless lane (not a tmux pane). JSONL →~/.cache/hapax/codex-headless/<lane>/output.jsonl, PID file for the staleness watchdog, same identity-env + task-gate + hook wiring ashapax-codex, default-deny likehapax-claude-headless.launch_codex_headlessrepointed to it;launch_codex_interactiveadded for the tmux path.expand_recipientsgainsantigrav/vibe/workers(wasValueError), so every runtime is reachable by a*:groupbroadcast.scripts/executor_contract.py(capability registry: modes/profiles/mutates/claims/hooks_wired/headless/read_only). Dispatch consumessupports_route()via a launcher table instead of a hard(platform,mode)if-ladder. Newhapax-executor-capabilitiesprobe +hapax-methodology-dispatch --capabilitiesemit it as JSON; antigrav's IDE hook-gap is now machine-legible.--skip-claude-settings, self-listed inci.run_markers); pre-commit hook (--require-claude-settings, the live~/.claude/settings.json— the most drift-prone wiring CI can't see).reconcile_flake_quarantinesopens/lifts quarantines and writes them on--apply;bisection_plan_for_failed_runshalves failed merge-group batches. Count-freeze names renamed to advisory (DEFAULT_ADVISORY_OPEN_PR_COUNT,ADVISORY_OPEN_PR_COUNT,--advisory-open-pr-count) with deprecated aliases.Acceptance criteria
hapax-codex-headlessexists (codex exec, streamed JSONL + PID) and headless dispatch uses itexpand_recipientssupports antigrav/vibe/workers (tested)capabilities()probe exists; dispatch consumes itgate-manifest-checkruns in CI + pre-commit (live claude-settings via pre-commit)Tests / evidence
tests/shared/test_relay_mq.py,tests/test_merge_queue_lineage.py,tests/test_cc_pr_autoqueue.py,tests/hooks/test_gate_manifest_check.py,tests/test_executor_contract.py— 163 passed in the affected set; fulltests/sharedsweep 6140 passed (7 pre-existinggoogleapiclientfailures unrelated).ruff check+pyrefly checkclean;gate-manifest-check --skip-claude-settingsand--require-claude-settingsboth green.Deferred (scope)
systemd/units/hapax-cc-pr-autoqueue.servicecomment fix is outside this task'smutation_scope_refs; the--storm-open-pr-thresholddeprecated alias keeps the unit functional.🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Tests