Skip to content

fix(sdlc): reform parity completion — codex-headless + broadcast groups + executor contract + gate-manifest CI + flake-quarantine wiring#3811

Merged
ryanklee merged 3 commits into
mainfrom
eta/reform-fix-parity-completion-20260531
Jun 1, 2026
Merged

fix(sdlc): reform parity completion — codex-headless + broadcast groups + executor contract + gate-manifest CI + flake-quarantine wiring#3811
ryanklee merged 3 commits into
mainfrom
eta/reform-fix-parity-completion-20260531

Conversation

@ryanklee
Copy link
Copy Markdown
Collaborator

@ryanklee ryanklee commented Jun 1, 2026

Reform parity completion (Phase 6, CASE-CROSS-RUNTIME-COMMS-001)

Closes five wired-but-incomplete parity/queue gaps from the completion audit — all "feature sleeping" or producer-without-consumer.

cc-task: reform-fix-parity-completion-20260531
Parent spec: coordination-reform-master-design-2026-05-30.md (Phase 6 §6)

What changed

  1. hapax-codex-headless — genuine codex exec headless lane (not a tmux pane). JSONL → ~/.cache/hapax/codex-headless/<lane>/output.jsonl, PID file for the staleness watchdog, same identity-env + task-gate + hook wiring as hapax-codex, default-deny like hapax-claude-headless. launch_codex_headless repointed to it; launch_codex_interactive added for the tmux path.
  2. Broadcast groupsexpand_recipients gains antigrav / vibe / workers (was ValueError), so every runtime is reachable by a *:group broadcast.
  3. Executor adapter contractscripts/executor_contract.py (capability registry: modes/profiles/mutates/claims/hooks_wired/headless/read_only). Dispatch consumes supports_route() via a launcher table instead of a hard (platform,mode) if-ladder. New hapax-executor-capabilities probe + hapax-methodology-dispatch --capabilities emit it as JSON; antigrav's IDE hook-gap is now machine-legible.
  4. gate-manifest-check in CI + pre-commit — CI lint step (--skip-claude-settings, self-listed in ci.run_markers); pre-commit hook (--require-claude-settings, the live ~/.claude/settings.json — the most drift-prone wiring CI can't see).
  5. flake-quarantine / bisection write-sidereconcile_flake_quarantines opens/lifts quarantines and writes them on --apply; bisection_plan_for_failed_runs halves failed merge-group batches. Count-freeze names renamed to advisory (DEFAULT_ADVISORY_OPEN_PR_COUNT, ADVISORY_OPEN_PR_COUNT, --advisory-open-pr-count) with deprecated aliases.

Acceptance criteria

  • hapax-codex-headless exists (codex exec, streamed JSONL + PID) and headless dispatch uses it
  • expand_recipients supports antigrav/vibe/workers (tested)
  • Executor adapter contract + capabilities() probe exists; dispatch consumes it
  • gate-manifest-check runs in CI + pre-commit (live claude-settings via pre-commit)
  • flake-quarantine/bisection write-side wired into the live autoqueue; count constants renamed to advisory
  • Ruff + pyrefly clean; new tests for all 5 parts pass

Tests / evidence

tests/shared/test_relay_mq.py, tests/test_merge_queue_lineage.py, tests/test_cc_pr_autoqueue.py, tests/hooks/test_gate_manifest_check.py, tests/test_executor_contract.py — 163 passed in the affected set; full tests/shared sweep 6140 passed (7 pre-existing googleapiclient failures unrelated). ruff check + pyrefly check clean; gate-manifest-check --skip-claude-settings and --require-claude-settings both green.

Deferred (scope)

systemd/units/hapax-cc-pr-autoqueue.service comment fix is outside this task's mutation_scope_refs; the --storm-open-pr-threshold deprecated alias keeps the unit functional.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Automated flake-quarantine for managing flaky PRs with persisted reconciliation and reporting.
    • Capability-aware executor registry and routing, plus a CLI to inspect executor capabilities.
    • Headless Codex launch workflow and governed-launch safeguards.
    • CI and pre-commit gate-manifest validation to keep hook/capability wiring consistent.
  • Tests

    • Added extensive tests for quarantine behavior, executor capabilities and CLI, relay recipient expansion, headless dispatch, and merge-queue bisection/reconciliation.

…s, executor contract, gate-manifest CI, flake-quarantine wiring

Five wired-but-incomplete Phase-6 parity gaps (CASE-CROSS-RUNTIME-COMMS-001),
closing "feature sleeping" / producer-without-consumer defects.

1. hapax-codex-headless: genuine `codex exec` headless lane (NOT a tmux pane).
   Streams JSONL to ~/.cache/hapax/codex-headless/<lane>/output.jsonl, writes a
   PID file for the staleness watchdog, same identity-env + task-gate + hook
   wiring as hapax-codex, default-deny like hapax-claude-headless. Repoints
   launch_codex_headless to it; adds launch_codex_interactive for the tmux path.

2. relay broadcast groups: expand_recipients gains antigrav / vibe / workers so
   every runtime is reachable by a `*:group` broadcast (was: ValueError).

3. Executor adapter contract: scripts/executor_contract.py defines the common
   adapter CLI + a capability registry. Dispatch consumes supports_route() via a
   launcher table instead of a hard (platform,mode) if-ladder. New probe
   `hapax-executor-capabilities` + `hapax-methodology-dispatch --capabilities`
   emit it as JSON (antigrav's IDE hook-gap is machine-legible).

4. gate-manifest-check wired into CI (lint step, --skip-claude-settings, self-
   listed in ci.run_markers) and pre-commit (--require-claude-settings, the live
   ~/.claude/settings.json — the most drift-prone wiring CI cannot see).

5. flake-quarantine / bisection WRITE side wired into the live autoqueue:
   reconcile_flake_quarantines opens/lifts quarantines and writes on apply;
   bisection_plan_for_failed_runs halves failed merge-group batches. Renamed the
   misleading count-freeze names to advisory (DEFAULT_ADVISORY_OPEN_PR_COUNT,
   ADVISORY_OPEN_PR_COUNT, --advisory-open-pr-count) with deprecated aliases.

Deferred: the systemd cc-pr-autoqueue.service comment is outside this task's
mutation_scope_refs; the --storm-open-pr-threshold deprecated alias keeps the
unit functional.

Task: reform-fix-parity-completion-20260531
Parent spec: coordination-reform-master-design-2026-05-30.md (Phase 6)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 1, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 348a5fc6-b8d5-4ca5-9edd-f786afc2efac

📥 Commits

Reviewing files that changed from the base of the PR and between b18d3a7 and 1205979.

📒 Files selected for processing (2)
  • scripts/cc-pr-autoqueue.py
  • tests/test_cc_pr_autoqueue.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/test_cc_pr_autoqueue.py
  • scripts/cc-pr-autoqueue.py

📝 Walkthrough

Walkthrough

This PR integrates five runtime executors (Claude, Codex, Gemini, Vibe, Antigrav) with a shared capability registry, implements flake-quarantine reconciliation for PR admission, adds a Codex headless launcher, and refactors dispatch routing for capability awareness. It also extends relay broadcast groups and updates advisory thresholds for queue throttling.

Changes

Executor Governance & Quarantine Integration

Layer / File(s) Summary
Gate Manifest CI & Pre-commit Wiring
.github/workflows/ci.yml, .pre-commit-config.yaml, hooks/gate-manifest.yaml, tests/hooks/test_gate_manifest_check.py
Adds gate-manifest-check.py validation steps to CI workflow and pre-commit hooks with --skip-claude-settings (CI) and --require-claude-settings (local), plus test confirming wiring across all three locations.
Executor Capability Registry & Contract
scripts/executor_contract.py, tests/test_executor_contract.py
Introduces ExecutorCapabilities Pydantic model and EXECUTOR_REGISTRY with five runtime platforms mapping modes, profiles, governance flags, hook wiring, and headless availability. Provides capabilities(), supports_route(), and capabilities_payload() query functions. Comprehensive tests validate registry structure, derived fields, and launchability predicates.
Executor Capabilities CLI Script
scripts/hapax-executor-capabilities
New Python CLI outputs executor registry as JSON; supports --contract mode for canonical CLI contract, per-platform mode for single executor details, and default mode for full registry. Enables capability-based tooling and dispatch decisions.
Codex Headless Executor Launcher
scripts/hapax-codex-headless
New Bash script launches Codex in headless non-interactive mode with JSONL event logging. Enforces session/task governance via claim files, prevents credential leakage, manages lifecycle (logging, PID files, concurrency guard), prepends SDLC preamble, and wires hook adapter for session/tool/audit/stop phases and MCP server invocation.
Flake Quarantine Reconciliation Data Model & Logic
shared/merge_queue_lineage.py, tests/test_merge_queue_lineage.py
Adds QuarantineReconciliation model and reconcile_flake_quarantines() that lifts expired quarantines and opens new ones for PRs exceeding failure thresholds. Introduces bisection_plan_for_failed_runs() to group failed merge-group runs and propose per-run bisection splits. Comprehensive tests cover threshold crossing, non-duplication, cooldown lifting, and deterministic ordering.
PR Admission Advisory Threshold Rename
scripts/hapax-pr-admission
Establishes canonical ADVISORY_OPEN_PR_COUNT constant for fleet throttle policy; keeps AUTO_FREEZE_THRESHOLD as deprecated alias for backward compatibility.
PR Autoqueue Quarantine & Bisection Integration
scripts/cc-pr-autoqueue.py, tests/test_cc_pr_autoqueue.py
Extends StormMode with recommended_bisections field computed from recent failed merge-group runs. Integrates reconcile_flake_quarantines() into run_reconciler() to persist quarantine state, compute newly quarantined/lifted PRs, and emit quarantine status in JSON output. Updates CLI to accept --advisory-open-pr-count (with --storm-open-pr-threshold as deprecated alias). Test verifies quarantine persistence across reconciliation runs.
Relay MQ Broadcast Group Expansion
shared/relay_mq.py, tests/shared/test_relay_mq.py
Adds lane-name predicate functions to classify peers across all runtimes. Refactors expand_recipients() to use new predicates for claude, codex, gemini groups; introduces *:antigrav, *:vibe, and *:workers broadcast groups. Tests verify correct recipient expansion and exclusion of non-worker coordinators.
Methodology Dispatch Capability-Aware Routing
scripts/hapax-methodology-dispatch, tests/scripts/test_hapax_methodology_dispatch.py
Refactors launch path selection from hard-coded if-ladder to table-driven launchers map keyed by (platform, mode) and guarded by supports_route() capability checks. Adds --capabilities flag to emit registry JSON. Replaces launch_codex_headless with configurable headless launcher invocation using --task/--force and Spark-specific model/reasoning flags. On unsupported routes, prints detailed capability guidance (modes, mutates, hooks_wired, headless) and supported governed routes. Updated tests inject the headless launcher env var and assert the new invocation shape.

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

  • hapax-systems/hapax-council#3789: The main PR wires scripts/gate-manifest-check.py into CI/pre-commit and updates hooks/gate-manifest.yaml CI run_markers, directly extending the same gate-manifest drift checker introduced in the retrieved PR (#3789).
  • hapax-systems/hapax-council#3460: Both PRs modify scripts/cc-pr-autoqueue.py’s storm-mode logic—extending/using the StormMode dataclass and updating run_reconciler’s storm-related threshold/parameter wiring—so the changes are code-level related.
  • hapax-systems/hapax-council#3772: The main PR’s updates to scripts/cc-pr-autoqueue.py (using advisory open-PR count and new flake quarantine reconciliation feeding the same throttle/freeze logic) and scripts/hapax-pr-admission (advisory open-PR count renamed/aliased into failure-rate throttling) directly align with the retrieved PR’s failure-rate throttle + quarantine/throttle-policy refactor in those same modules.

"🐰 I hopped through code and built a gate,
Capabilities listed for every slate.
Headless Codex wakes with a governed claim,
Quarantine keeps flaky PRs tame.
Routes now check what each runtime can take."

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 12.70% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically summarizes the main changes: five parity completion gaps including codex-headless, broadcast groups, executor contract, gate-manifest CI, and flake-quarantine wiring.
Description check ✅ Passed The description follows the required template with all critical sections: Summary (comprehensive overview of five changes), AuthorityCase (CASE-SDLC-REFORM-001 / SLICE-NNN), Test plan (detailed test results), and CLAUDE.md hygiene checklist. All acceptance criteria and test evidence provided.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch eta/reform-fix-parity-completion-20260531

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 083a6377bd

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread scripts/executor_contract.py Outdated
Comment on lines +67 to +68
modes=("headless", "interactive"),
profiles=("full", "spark"),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Register a Codex interactive platform path

Advertising Codex interactive support here does not make the route launchable because hapax-methodology-dispatch still calls route_for(args.platform, args.mode, args.profile) against PLATFORM_PATHS before it reaches the new launcher table, and that table has no ("codex", "interactive", "full") entry. As a result, --platform codex --mode interactive --launch is reported as supported by supports_route()/the capabilities CLI but is refused earlier as an unsupported legacy route, so the new launch_codex_interactive path is unreachable.

Useful? React with 👍 / 👎.

Comment on lines +110 to +113
WORKDIR="${HAPAX_CODEX_HEADLESS_WORKDIR:-$HOME/projects/hapax-council--$SESSION}"
if [[ ! -d "$WORKDIR" ]]; then
WORKDIR="$COUNCIL_DIR"
fi
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Create the Codex headless worktree instead of falling back

When a new cx-* headless lane has no ~/projects/hapax-council--$SESSION yet, this falls back to $COUNCIL_DIR and launches an autonomous mutating codex exec in the primary checkout. Unlike hapax-codex, which creates the codex-native worktree before launch, the first headless dispatch for a fresh lane will share and dirty the main repo rather than its isolated lane worktree.

Useful? React with 👍 / 👎.

status = ""
if validation.task is not None:
status = strip_scalar(str(validation.task.fields.get("status", "")))
args = [str(launcher), "--task", task_id, "--force"]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Stop forcing duplicate Codex headless launches

Every dispatch to the new headless launcher includes --force, but hapax-codex-headless only honors its live-PID guard when FORCE != 1. If a dispatcher targets a lane that already has a running headless Codex process, this bypasses the duplicate-launch check, overwrites the PID file, and starts a second autonomous mutator in the same lane/worktree/log instead of refusing or requiring an explicit operator relaunch.

Useful? React with 👍 / 👎.

Comment on lines +78 to +80
profiles=("full", "worker", "flash", "lite"),
mutates=False,
claims=False,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Expose Gemini worker mutation in capabilities

This capability record lists the worker profile but still reports mutates=False and claims=False, even though hapax-methodology-dispatch has a mutable gemini/headless/worker route and launches it with HAPAX_GEMINI_APPROVAL_MODE=auto_edit. Any cockpit or scheduler using the new machine-readable capabilities will incorrectly treat the governed Gemini worker path as read-only/non-claiming and can route or display it unsafely.

Useful? React with 👍 / 👎.

Comment thread .pre-commit-config.yaml
# operator machine where the live settings exist; CI checks the rest.
entry: uv run python scripts/gate-manifest-check.py --require-claude-settings
language: system
files: ^(hooks/gate-manifest\.yaml|hooks/scripts/(codex|antigrav)-hook-adapter\.sh|scripts/hapax-(codex|antigrav|vibe|gemini)|config/codex/config\.toml|\.github/workflows/ci\.yml)$
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Run the live settings pre-commit check unconditionally

Because this hook has a files: filter and no always_run: true, pre-commit only invokes the --require-claude-settings live-settings check when one of these repo paths is staged. The drift-prone file it is meant to protect (~/.claude/settings.json) is outside the repo, so settings drift followed by an ordinary source/docs commit skips the check entirely instead of enforcing the advertised live wiring validation.

Useful? React with 👍 / 👎.

@ryanklee ryanklee added this pull request to the merge queue Jun 1, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (2)
.pre-commit-config.yaml (1)

58-66: ⚡ Quick win

Include self-change paths in the hook matcher.

gate-manifest-check currently won’t self-trigger when .pre-commit-config.yaml or scripts/gate-manifest-check.py changes, which weakens local drift detection.

Proposed update
-        files: ^(hooks/gate-manifest\.yaml|hooks/scripts/(codex|antigrav)-hook-adapter\.sh|scripts/hapax-(codex|antigrav|vibe|gemini)|config/codex/config\.toml|\.github/workflows/ci\.yml)$
+        files: ^(hooks/gate-manifest\.yaml|hooks/scripts/(codex|antigrav)-hook-adapter\.sh|scripts/hapax-(codex|antigrav|vibe|gemini)|scripts/gate-manifest-check\.py|config/codex/config\.toml|\.github/workflows/ci\.yml|\.pre-commit-config\.yaml)$
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.pre-commit-config.yaml around lines 58 - 66, The gate-manifest-check hook
(id: gate-manifest-check) won't run when its own config or the checker script
changes; update the hook's files matcher (the files: pattern in the
gate-manifest-check entry) to also match .pre-commit-config.yaml and
scripts/gate-manifest-check.py so the hook self-triggers on those self-change
paths while keeping pass_filenames: false and the existing matches intact.
tests/hooks/test_gate_manifest_check.py (1)

146-160: ⚡ Quick win

Assert wiring structurally instead of substring matching.

This test can false-pass if the command appears in comments/text but not the actual hook/step entry.

Proposed update
 def test_gate_manifest_check_is_wired_in_ci_and_precommit() -> None:
-    ci_text = (REPO_ROOT / ".github" / "workflows" / "ci.yml").read_text(encoding="utf-8")
-    precommit_text = (REPO_ROOT / ".pre-commit-config.yaml").read_text(encoding="utf-8")
-    # The checker runs in CI (codex/antigrav/vibe/gemini/ci wiring) and in
-    # pre-commit (the live ~/.claude/settings.json, the most drift-prone wiring).
-    assert "scripts/gate-manifest-check.py" in ci_text
-    assert "scripts/gate-manifest-check.py" in precommit_text
-    assert "--require-claude-settings" in precommit_text  # live check in pre-commit
-    assert "--skip-claude-settings" in ci_text  # CI runners have no live settings
+    ci = yaml.safe_load((REPO_ROOT / ".github" / "workflows" / "ci.yml").read_text(encoding="utf-8"))
+    precommit = yaml.safe_load((REPO_ROOT / ".pre-commit-config.yaml").read_text(encoding="utf-8"))
+    # The checker runs in CI (codex/antigrav/vibe/gemini/ci wiring) and in
+    # pre-commit (the live ~/.claude/settings.json, the most drift-prone wiring).
+    lint_steps = ci["jobs"]["lint"]["steps"]
+    assert any(
+        step.get("name") == "Gate manifest wiring check"
+        and "scripts/gate-manifest-check.py" in step.get("run", "")
+        and "--skip-claude-settings" in step.get("run", "")
+        for step in lint_steps
+    )
+    local_hooks = [
+        hook
+        for repo in precommit["repos"]
+        if repo.get("repo") == "local"
+        for hook in repo.get("hooks", [])
+    ]
+    gate_hook = next(hook for hook in local_hooks if hook.get("id") == "gate-manifest-check")
+    assert "scripts/gate-manifest-check.py" in gate_hook.get("entry", "")
+    assert "--require-claude-settings" in gate_hook.get("entry", "")
     # The manifest lists its own CI invocation so the checker self-verifies the
     # CI wiring is present (check_ci run-marker round-trip).
     manifest = _manifest()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/hooks/test_gate_manifest_check.py` around lines 146 - 160, The test
test_gate_manifest_check_is_wired_in_ci_and_precommit should parse the YAML
structures instead of substring-searching: load the CI YAML (the content
currently read into ci_text) with yaml.safe_load and walk to the job steps to
assert there is a step whose "run" or "uses" invokes
"scripts/gate-manifest-check.py" and that its arguments include
"--skip-claude-settings"; likewise load the pre-commit YAML (precommit_text) and
assert one of the defined hooks in the hooks list has an "entry" that invokes
"scripts/gate-manifest-check.py" and includes "--require-claude-settings"; keep
the manifest check using _manifest() but change the run_markers assertion in
test_gate_manifest_check_is_wired_in_ci_and_precommit to compare markers for the
exact substring "gate-manifest-check.py" within each marker element (or exact
equality) rather than relying on raw file substring presence.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/cc-pr-autoqueue.py`:
- Around line 987-997: The read/modify/write of the quarantine ledger is racy:
the sequence using read_quarantine(...), reconcile_flake_quarantines(...), and
write_quarantine(...) must be serialized so two concurrent --apply runs don't
clobber each other; wrap the full read → reconcile → write sequence with an
exclusive file lock on quarantine_path (or implement an optimistic
read-verify-retry loop that re-reads the file before writing and retries if it
changed) so that the code that calls read_quarantine,
reconcile_flake_quarantines, and write_quarantine holds the lock until
write_quarantine completes; ensure the lock is acquired before calling
read_quarantine and released only after write_quarantine returns (use a
cross-platform file lock helper or retry-on-conflict loop).

In `@scripts/hapax-codex-headless`:
- Line 225: The script currently hardcodes
mcp_servers.hapax.env.LOGOS_BASE_URL="http://localhost:8051/api"; replace that
literal with the configured/logical LOGOS endpoint used elsewhere in the script
(the same environment variable set earlier) so non-default environments honor
the contract; specifically, remove the hardcoded URL and reference the existing
configured value (e.g., expand the LOGOS_BASE_URL/LOGOS endpoint variable used
around the earlier configuration block) so mcp_servers.hapax.env.LOGOS_BASE_URL
is assigned from the configured variable instead of the literal.

In `@scripts/hapax-methodology-dispatch`:
- Around line 906-909: The args list currently unconditionally appends "--force"
which bypasses the live-lane PID safety check; change the construction so
"--force" is only appended when an explicit intent exists (e.g., a boolean flag
like force_relaunch or a CLI/env var), e.g., build args = [str(launcher),
"--task", task_id] then if force_relaunch: args.append("--force"); retain the
existing conditional that adds "--no-claim" when status in
{"claimed","in_progress"} and still extend with [lane, prompt]; update any
callers or flag parsing to expose the new force_relaunch control so relaunches
are not forced by default.

In `@shared/merge_queue_lineage.py`:
- Around line 1053-1059: The loop currently does an unguarded int(pr) which will
raise on malformed values; wrap the conversion in a try/except (catching
ValueError and TypeError) around int(pr) inside the block that handles run_id/pr
(the same scope using run_id, pr, bucket and by_run), and on conversion failure
simply continue so malformed pr values are skipped as the docstring promises.
- Around line 1003-1029: The reconciliation currently can lift a quarantine
(lift_quarantine) and then immediately re-open it in the same pass; to prevent
that, exclude PRs that were lifted in this pass from being re-quarantined by
adding a guard in the candidate loop: after computing active_after_lift and
before calling should_quarantine_pr/open_quarantine, check if pr_number is in
the lifted collection (the lifted list populated when calling lift_quarantine)
and skip it (continue) if so; keep the existing active_after_lift and newly
handling unchanged.

---

Nitpick comments:
In @.pre-commit-config.yaml:
- Around line 58-66: The gate-manifest-check hook (id: gate-manifest-check)
won't run when its own config or the checker script changes; update the hook's
files matcher (the files: pattern in the gate-manifest-check entry) to also
match .pre-commit-config.yaml and scripts/gate-manifest-check.py so the hook
self-triggers on those self-change paths while keeping pass_filenames: false and
the existing matches intact.

In `@tests/hooks/test_gate_manifest_check.py`:
- Around line 146-160: The test
test_gate_manifest_check_is_wired_in_ci_and_precommit should parse the YAML
structures instead of substring-searching: load the CI YAML (the content
currently read into ci_text) with yaml.safe_load and walk to the job steps to
assert there is a step whose "run" or "uses" invokes
"scripts/gate-manifest-check.py" and that its arguments include
"--skip-claude-settings"; likewise load the pre-commit YAML (precommit_text) and
assert one of the defined hooks in the hooks list has an "entry" that invokes
"scripts/gate-manifest-check.py" and includes "--require-claude-settings"; keep
the manifest check using _manifest() but change the run_markers assertion in
test_gate_manifest_check_is_wired_in_ci_and_precommit to compare markers for the
exact substring "gate-manifest-check.py" within each marker element (or exact
equality) rather than relying on raw file substring presence.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: c920dfa4-c2b4-4938-ae55-53a0b592dae1

📥 Commits

Reviewing files that changed from the base of the PR and between 40bcb2e and 083a637.

📒 Files selected for processing (16)
  • .github/workflows/ci.yml
  • .pre-commit-config.yaml
  • hooks/gate-manifest.yaml
  • scripts/cc-pr-autoqueue.py
  • scripts/executor_contract.py
  • scripts/hapax-codex-headless
  • scripts/hapax-executor-capabilities
  • scripts/hapax-methodology-dispatch
  • scripts/hapax-pr-admission
  • shared/merge_queue_lineage.py
  • shared/relay_mq.py
  • tests/hooks/test_gate_manifest_check.py
  • tests/shared/test_relay_mq.py
  • tests/test_cc_pr_autoqueue.py
  • tests/test_executor_contract.py
  • tests/test_merge_queue_lineage.py

Comment on lines +987 to +997
existing_quarantine = read_quarantine(quarantine_path)
quarantined_prs = active_quarantined_pr_numbers(existing_quarantine, now=now)
quarantine_reconciliation = reconcile_flake_quarantines(
existing_quarantine,
lineage_records,
candidate_prs={pr.number for pr in prs},
policy=throttle_policy,
now=now,
)
throttle_policy = FleetThrottlePolicy(advisory_open_pr_count=storm_open_pr_threshold)
if apply and (quarantine_reconciliation.newly_quarantined or quarantine_reconciliation.lifted):
write_quarantine(quarantine_path, quarantine_reconciliation.records)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Serialize the quarantine read/modify/write cycle.

This new write-side path reads the ledger, reconciles in memory, then overwrites the whole file without any lock. Two overlapping --apply runs can start from the same snapshot and the later write_quarantine() will silently drop the earlier run's newly opened/lifted records.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/cc-pr-autoqueue.py` around lines 987 - 997, The read/modify/write of
the quarantine ledger is racy: the sequence using read_quarantine(...),
reconcile_flake_quarantines(...), and write_quarantine(...) must be serialized
so two concurrent --apply runs don't clobber each other; wrap the full read →
reconcile → write sequence with an exclusive file lock on quarantine_path (or
implement an optimistic read-verify-retry loop that re-reads the file before
writing and retries if it changed) so that the code that calls read_quarantine,
reconcile_flake_quarantines, and write_quarantine holds the lock until
write_quarantine completes; ensure the lock is acquired before calling
read_quarantine and released only after write_quarantine returns (use a
cross-platform file lock helper or retry-on-conflict loop).

-c "hooks.Stop=[{command=\"$HOOK\",timeout=20,statusMessage=\"Writing Hapax session summary\"}]"
-c "mcp_servers.hapax.command=\"$HOME/.local/bin/uv\""
-c "mcp_servers.hapax.args=[\"--directory\",\"$HOME/projects/hapax-mcp\",\"run\",\"hapax-mcp\"]"
-c 'mcp_servers.hapax.env.LOGOS_BASE_URL="http://localhost:8051/api"'
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use the configured LOGOS endpoint instead of a hardcoded URL.

Line 225 hardcodes the MCP LOGOS_BASE_URL, which ignores the environment contract set at Line 141 and breaks non-default environments.

Proposed fix
-  -c 'mcp_servers.hapax.env.LOGOS_BASE_URL="http://localhost:8051/api"'
+  -c "mcp_servers.hapax.env.LOGOS_BASE_URL=\"$LOGOS_BASE_URL\""
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-c 'mcp_servers.hapax.env.LOGOS_BASE_URL="http://localhost:8051/api"'
-c "mcp_servers.hapax.env.LOGOS_BASE_URL=\"$LOGOS_BASE_URL\""
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/hapax-codex-headless` at line 225, The script currently hardcodes
mcp_servers.hapax.env.LOGOS_BASE_URL="http://localhost:8051/api"; replace that
literal with the configured/logical LOGOS endpoint used elsewhere in the script
(the same environment variable set earlier) so non-default environments honor
the contract; specifically, remove the hardcoded URL and reference the existing
configured value (e.g., expand the LOGOS_BASE_URL/LOGOS endpoint variable used
around the earlier configuration block) so mcp_servers.hapax.env.LOGOS_BASE_URL
is assigned from the configured variable instead of the literal.

Comment thread scripts/hapax-methodology-dispatch
Comment on lines +1003 to +1029
for record in existing:
if record.released_at is None and not quarantine_active(record, now=now):
out.append(lift_quarantine(record, now=now))
lifted.append(record.pr_number)
else:
out.append(record)

active_after_lift = active_quarantined_pr_numbers(out, now=now)
newly: list[int] = []
window_hours = policy.window_seconds / 3600
for pr_number in sorted(set(candidate_prs)):
if pr_number in active_after_lift:
continue
if should_quarantine_pr(pr_number, lineage, policy=policy, now=now):
out.append(
open_quarantine(
pr_number,
reason=(
f"{counts.get(pr_number, 0)} CI failures within "
f"{window_hours:.0f}h (>= {policy.quarantine_failures}); "
"reversible flake quarantine"
),
now=now,
cooldown_seconds=policy.quarantine_cooldown_seconds,
)
)
newly.append(pr_number)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't reopen a quarantine in the same reconciliation pass after lifting it.

If quarantine_cooldown_seconds is shorter than window_seconds, this pass can lift an expired record at Lines 1004-1006 and then immediately reopen it from the same historical failures at Lines 1013-1028. That makes the cooldown ineffective for supported non-default policies.

Suggested guard
     active_after_lift = active_quarantined_pr_numbers(out, now=now)
     newly: list[int] = []
+    lifted_set = set(lifted)
     window_hours = policy.window_seconds / 3600
     for pr_number in sorted(set(candidate_prs)):
+        if pr_number in lifted_set:
+            continue
         if pr_number in active_after_lift:
             continue
         if should_quarantine_pr(pr_number, lineage, policy=policy, now=now):
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
for record in existing:
if record.released_at is None and not quarantine_active(record, now=now):
out.append(lift_quarantine(record, now=now))
lifted.append(record.pr_number)
else:
out.append(record)
active_after_lift = active_quarantined_pr_numbers(out, now=now)
newly: list[int] = []
window_hours = policy.window_seconds / 3600
for pr_number in sorted(set(candidate_prs)):
if pr_number in active_after_lift:
continue
if should_quarantine_pr(pr_number, lineage, policy=policy, now=now):
out.append(
open_quarantine(
pr_number,
reason=(
f"{counts.get(pr_number, 0)} CI failures within "
f"{window_hours:.0f}h (>= {policy.quarantine_failures}); "
"reversible flake quarantine"
),
now=now,
cooldown_seconds=policy.quarantine_cooldown_seconds,
)
)
newly.append(pr_number)
for record in existing:
if record.released_at is None and not quarantine_active(record, now=now):
out.append(lift_quarantine(record, now=now))
lifted.append(record.pr_number)
else:
out.append(record)
active_after_lift = active_quarantined_pr_numbers(out, now=now)
newly: list[int] = []
lifted_set = set(lifted)
window_hours = policy.window_seconds / 3600
for pr_number in sorted(set(candidate_prs)):
if pr_number in lifted_set:
continue
if pr_number in active_after_lift:
continue
if should_quarantine_pr(pr_number, lineage, policy=policy, now=now):
out.append(
open_quarantine(
pr_number,
reason=(
f"{counts.get(pr_number, 0)} CI failures within "
f"{window_hours:.0f}h (>= {policy.quarantine_failures}); "
"reversible flake quarantine"
),
now=now,
cooldown_seconds=policy.quarantine_cooldown_seconds,
)
)
newly.append(pr_number)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@shared/merge_queue_lineage.py` around lines 1003 - 1029, The reconciliation
currently can lift a quarantine (lift_quarantine) and then immediately re-open
it in the same pass; to prevent that, exclude PRs that were lifted in this pass
from being re-quarantined by adding a guard in the candidate loop: after
computing active_after_lift and before calling
should_quarantine_pr/open_quarantine, check if pr_number is in the lifted
collection (the lifted list populated when calling lift_quarantine) and skip it
(continue) if so; keep the existing active_after_lift and newly handling
unchanged.

Comment on lines +1053 to +1059
run_id = run.get("run_id")
pr = run.get("pr")
if run_id is None or pr is None:
continue
bucket = by_run[str(run_id)]
if int(pr) not in bucket:
bucket.append(int(pr))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Skip malformed pr values instead of crashing.

The docstring says malformed rows are skipped, but int(pr) here will still raise on values like "abc" and take down the reconciler. Guard the conversion and continue on failure.

Suggested hardening
     for run in failed_runs:
         run_id = run.get("run_id")
         pr = run.get("pr")
         if run_id is None or pr is None:
             continue
+        try:
+            pr_number = int(pr)
+        except (TypeError, ValueError):
+            continue
         bucket = by_run[str(run_id)]
-        if int(pr) not in bucket:
-            bucket.append(int(pr))
+        if pr_number not in bucket:
+            bucket.append(pr_number)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@shared/merge_queue_lineage.py` around lines 1053 - 1059, The loop currently
does an unguarded int(pr) which will raise on malformed values; wrap the
conversion in a try/except (catching ValueError and TypeError) around int(pr)
inside the block that handles run_id/pr (the same scope using run_id, pr, bucket
and by_run), and on conversion failure simply continue so malformed pr values
are skipped as the docstring promises.

@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 1, 2026
@ryanklee ryanklee added this pull request to the merge queue Jun 1, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 1, 2026
@ryanklee ryanklee added this pull request to the merge queue Jun 1, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 1, 2026
@ryanklee ryanklee added this pull request to the merge queue Jun 1, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 1, 2026
@ryanklee ryanklee added this pull request to the merge queue Jun 1, 2026
@ryanklee ryanklee removed this pull request from the merge queue due to a manual request Jun 1, 2026
ryanklee and others added 2 commits June 1, 2026 07:53
…eractive path

The merge-queue full-suite run caught a regression the PR's lighter test job
missed: tests/scripts/test_hapax_methodology_dispatch.py stubbed the codex
launcher via the old HAPAX_METHODOLOGY_CODEX_LAUNCHER env var, but the repointed
launch_codex_headless now reads HAPAX_METHODOLOGY_CODEX_HEADLESS → the real
default-deny hapax-codex-headless ran and exited 77.

- Tests: stub HAPAX_METHODOLOGY_CODEX_HEADLESS; assert the new inline CLI
  (--task <id> --force <lane> <prompt>) instead of the old --bootstrap form.
- Drop launch_codex_interactive + the ("codex","interactive") launcher entry:
  codex-interactive has no PLATFORM_PATHS route (route-validation rejects it
  before launch), so it was unreachable dead code that also made the capability
  registry inconsistent. write_codex_bootstrap (only its caller) removed too.
- executor_contract: codex modes -> ("headless",) to match the dispatch's
  launchable routes; update the contract tests.

Task: reform-fix-parity-completion-20260531

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…-completion-20260531

# Conflicts:
#	tests/test_cc_pr_autoqueue.py
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b18d3a7d9a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +987 to +989
existing_quarantine = read_quarantine(quarantine_path)
quarantined_prs = active_quarantined_pr_numbers(existing_quarantine, now=now)
quarantine_reconciliation = reconcile_flake_quarantines(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Apply quarantines to PR admission decisions

Please feed the active flake quarantine into classify_pr/the mutation loop, not just the fleet-rate calculation. With an existing active quarantine for a PR whose checks and task metadata are otherwise ready, these new lines only exclude it from decide_fleet_throttle/recommend_max_entries_to_build; the later classify_pr(...) calls receive no quarantine context and can still return queue or enable_auto_merge, so apply=True will call gh pr merge for the quarantined flaky PR and defeat the quarantine.

Useful? React with 👍 / 👎.

Comment on lines +23 to +29
ADAPTER_CLI_CONTRACT: tuple[str, ...] = (
"--lane",
"--task",
"--mode", # headless | interactive | receipt-only
"--prompt",
"--no-claim",
"--force",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Align the advertised adapter flags with launchers

The contract emitted by hapax-executor-capabilities --contract is not accepted by the new Codex headless adapter: hapax-codex-headless only parses --task, --no-claim, and --force before positional <cx-session> <governed-initial-message>, so a client following this machine-readable contract with --lane cx-red --prompt ... is rejected before launch. Either implement these flags in the adapters or stop advertising them as accepted by every launcher.

Useful? React with 👍 / 👎.

Comment thread shared/relay_mq.py
Comment on lines +193 to +197
return peer.startswith("antigrav")


def _is_vibe_lane(peer: str) -> bool:
return peer.startswith("vbe") or peer.startswith("vibe")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Exclude relay status files from worker broadcasts

When the relay directory contains status YAMLs such as antigrav-status.yaml (the retire helper checks <role>-status.yaml), these prefix checks include antigrav-status/vibe... as recipients for *:antigrav, *:vibe, and *:workers. Those stems are not live consumers, so group dispatches can leave unconsumable offered messages and make delivery look incomplete; restrict the predicates to actual lane names or filter *-status.

Useful? React with 👍 / 👎.

Comment on lines +1057 to +1059
bucket = by_run[str(run_id)]
if int(pr) not in bucket:
bucket.append(int(pr))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Build bisection input from full merge-group batches

The new bisection plan expects multiple rows with the same run_id, but the live lineage store is keyed by merge_group_run_id in write_jsonl_records, so normal failed_recent_merge_group_runs contains at most one PR per failed run. In that case every bucket is a single-PR batch and bisect_failed_batch() returns no split, meaning the reported recommended_bisections stays empty precisely for the failed merge-group batches it is meant to isolate.

Useful? React with 👍 / 👎.

@ryanklee ryanklee added this pull request to the merge queue Jun 1, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1205979d9c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 870 to 871
if status in {"claimed", "in_progress"}:
args.append("--no-claim")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Require the Codex lane claim before launch

When dispatching a task whose note is already claimed or in_progress, this always passes --no-claim to the new headless launcher. That launcher only claims when CLAIM_TASK=1; if the target cx-* lane has no existing ~/.cache/hapax/cc-active-task-$SESSION file (for example after a stale note, reboot, or reassignment where the frontmatter says assigned to this lane but the lease is absent), it still starts codex exec with just --task and no session claim. That bypasses the mandatory cc-task lease for an autonomous mutating headless run; either verify the lane claim exists before adding --no-claim or let the launcher run cc-claim and fail closed.

Useful? React with 👍 / 👎.

Merged via the queue into main with commit f9450a9 Jun 1, 2026
37 checks passed
@ryanklee ryanklee deleted the eta/reform-fix-parity-completion-20260531 branch June 1, 2026 13:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant