Fix auto skip-run closure diagnostics#1229
Conversation
Make the Codex/App Ouroboros auto loop distinguish recoverable authoring blockers from transport or system failures, and add diagnostics that explain stale App/runtime state without requiring tmux-only interactive OMX surfaces. Constraint: Codex App outside-tmux cannot rely on tmux-only interactive OMX surfaces, so long auto checks need native MCP or pollable job paths. Rejected: Treat seed ambiguity validation as a generic failed session | it requires more interview/user input and should stay resumable through the seed_generator path. Confidence: high Scope-risk: moderate Directive: Keep App transport diagnostics separate from MCP registration checks; handler reachability and active archive drift are different failure classes. Tested: git diff --check; uv run pytest tests/unit/auto/test_interview_pipeline.py tests/unit/cli/test_codex_command.py tests/unit/orchestrator/test_sandbox_class.py tests/unit/orchestrator/test_codex_cli_runtime.py tests/unit/mcp/tools/test_interview_done_streak.py -q (184 passed). Not-tested: Full implementation execution after A-grade Seed; validation intentionally used skip-run/pollable auto smoke. Co-authored-by: OmX <omx@oh-my-codex.dev>
There was a problem hiding this comment.
Review — ouroboros-agent[bot]
Verdict: APPROVE
Metadata
| Field | Value |
|### Recovery Notes
First recoverable review artifact generated from codex analysis log.
---|---|
| PR | #1229 |
| HEAD checked | e3eb67fae48ebf34ca36c45846ea1c2d2ed783a7 |
| Request ID | req_1779720402_1 |
| Review record | d52cfad8-baf6-4728-802d-611e2b1d1e6a |
What Improved
- Adds Codex doctor diagnostics for live MCP, stale Codex/App processes, OMX state, safety defaults, and Codex App log hints.
- Distinguishes MCP transport closure from missing
ouroboros_autoregistration in auto dispatch failures. - Treats seed-generation ambiguity validation as a blocked/recoverable state instead of a hard failure.
- Allows the explicit auto safe-default synthesis marker to complete an interview without the normal ambiguity threshold.
Issue Requirements
| Requirement | Status |
|---|---|
| No linked issue requirement captured | N/A |
Prior Findings Status
No prior ouroboros-agent bot reviews were present in /tmp/pr_prior_bot_reviews_1229.md; no prior blockers to maintain or withdraw.
Blockers
No in-scope blocking findings remained after policy filtering.
Follow-up Findings
| # | File:Line | Priority | Confidence | Suggestion |
|---|---|---|---|---|
| 1 | src/ouroboros/cli/commands/codex.py:732 | Low | Medium | _safe_default_hotfix_status() probes only lib/python3.12/site-packages/.... The project supports Python >=3.12, so active uv archives created with Python 3.13+ will report unknown even when the patch is present. Consider globbing lib/python*/site-packages/... so the diagnostic stays accurate across supported runtimes. |
Non-blocking Suggestions
| 1 | src/ouroboros/cli/commands/codex.py:732 | Diagnostic fidelity | Hardcoding python3.12 limits the new archive hotfix probe on newer supported Python runtimes. |
Test Coverage Notes
- Reviewed changed unit tests for auto ambiguity blocking, safe-default interview completion, Codex doctor diagnostics, transport-closed classification, and permission provenance logging.
- Could not run pytest because
pytestis not installed in this snapshot environment (python3 -m pytestreportsNo module named pytest). - Ran
python3 -m compileall -qover changed source files successfully under Python 3.11.2; note the project declaresrequires-python >=3.12.
Design Notes
The changes are mostly diagnostic and classification improvements around existing boundaries. The core runtime behavior remains fail-closed for auto dispatch, and the new blocked-vs-failed seed ambiguity state is consistent with recoverable authoring validation.
Design / Roadmap Gate
The affected boundaries are Codex MCP diagnostics, auto interview completion, seed generation error classification, and Codex permission logging. The PR preserves terminal auto dispatch fail-closed behavior, adds clearer transport-session categorization, and persists interview completion through the existing engine completion path. Optional diagnostics may be imperfect on non-3.12 uv archives but do not alter execution state or persistence.
Directional Notes
Review focus was shaped by the runtime/persistence risk around auto resume, MCP dispatch parity, and ambiguity-before-execution contracts. Memory was advisory only; no blocker was raised without current source evidence.
Test Coverage
- Reviewed changed unit tests for auto ambiguity blocking, safe-default interview completion, Codex doctor diagnostics, transport-closed classification, and permission provenance logging.
- Could not run pytest because
pytestis not installed in this snapshot environment (python3 -m pytestreportsNo module named pytest). - Ran
python3 -m compileall -qover changed source files successfully under Python 3.11.2; note the project declaresrequires-python >=3.12.
Merge Recommendation
Approve. I found no blocking runtime, persistence, API, or contract regressions in the changed paths. The only issue is a non-blocking diagnostic path assumption around archive Python version layout.
Review-Metadata:
verdict: APPROVE
head_sha: e3eb67f
request_id: req_1779720402_1
review_profile: memory-aware-zero-trust-v2
advisory_memory_only: true
Reviewed by ouroboros-agent[bot] via Codex deep analysis
SSOT triage note —
|
There was a problem hiding this comment.
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Metadata
| Field | Value |
|---|---|
| PR | #1229 |
| HEAD checked | |
| Review policy | strict OMX/OMC/OpenClaw harness boundary |
Blockers
-
adds a doctor path plus that inspects and reports stale OMX deep-interview locks from the core Ouroboros CLI. That brings external harness/runtime state diagnostics into Q00/ouroboros product code.
-
locks this OMX-specific behavior into the product test suite, so this is not just incidental wording; it makes the harness boundary part of the maintained CLI contract.
Maintainer policy: Ouroboros PRs should not add OMX, OMC, OpenClaw, or other external agent-harness state/diagnostics/runtime surfaces to product code unless the PR is explicitly approved as a maintained integration boundary. Move this logic to the external harness/OpenClaw/OMX side, or split out a separate integration proposal that defines ownership and product boundary guarantees first.
Prior Review Correction
This supersedes the earlier automated approval on the same HEAD. The maintainer policy was clarified after that run; with the stricter repository-boundary rule applied, this PR needs changes.
|
@heomin86 let me look into it. found some OMX state references in the code, which makes me a bit worried about potential state complications between OMX and Ouroboros |
|
Correction to the latest REQUEST_CHANGES review body: shell quoting stripped the inline path/option names from the submitted review text. The intended blocker evidence is:
The review state remains CHANGES_REQUESTED. Maintainer policy is that Q00/ouroboros PRs should not add OMX, OMC, OpenClaw, or other external agent-harness state/diagnostics/runtime surfaces to product code unless explicitly approved as a maintained integration boundary. |
There was a problem hiding this comment.
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Metadata
| Field | Value |
|---|---|
| PR | #1229 |
| HEAD checked | e3eb67f |
| Review policy | strict OMX/OMC/OpenClaw harness boundary |
Blockers
-
src/ouroboros/cli/commands/codex.py adds a --omx-state doctor path plus _omx_deep_interview_state_lines(...) that inspects .omx/state/... and reports stale OMX deep-interview locks from the core Ouroboros CLI. That brings external harness/runtime state diagnostics into Q00/ouroboros product code.
-
tests/unit/cli/test_codex_command.py locks this OMX-specific behavior into the product test suite, so this is not just incidental wording; it makes the harness boundary part of the maintained CLI contract.
Maintainer policy: Ouroboros PRs should not add OMX, OMC, OpenClaw, or other external agent-harness state/diagnostics/runtime surfaces to product code unless the PR is explicitly approved as a maintained integration boundary. Move this logic to the external harness/OpenClaw/OMX side, or split out a separate integration proposal that defines ownership and product boundary guarantees first.
Prior Review Correction
This supersedes the earlier automated approval and the malformed REQUEST_CHANGES body on the same HEAD. With the stricter repository-boundary rule applied, this PR needs changes.
There was a problem hiding this comment.
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Metadata
| Field | Value |
|### Recovery Notes
First recoverable review artifact generated from codex analysis log.
---|---|
| PR | #1229 |
| HEAD checked | e3eb67fae48ebf34ca36c45846ea1c2d2ed783a7 |
| Request ID | req_1779723823_14 |
| Review record | c37f1e8d-6bca-4f17-a94c-911ec5e99872 |
What Improved
- Reclassifies seed ambiguity validation failures as blocked/recoverable instead of failed.
- Adds clearer Codex MCP transport diagnostics and permission provenance logging.
- Adds a handler-side path for explicit safe-default interview completion.
Issue Requirements
| Requirement | Status |
|---|---|
| PR body contains no substantive requirement text | N/A |
| Complete auto skip-run closure when safe-default synthesis closes the persisted interview | Not met |
| Improve Codex/MCP transport diagnostics | Partially met |
| Add OMX state diagnostics | Not met as product-boundary design |
| Reclassify seed ambiguity threshold validation as blocked/recoverable | Met |
Prior Findings Status
Prior automated approvals are superseded. I maintain the later OMX boundary concern with current source evidence at src/ouroboros/cli/commands/codex.py:75 and tests/unit/cli/test_codex_command.py:370. I also maintain the prior non-blocking Python archive layout concern at src/ouroboros/cli/commands/codex.py:730. The earlier safe-default producer concern still applies based on current driver behavior.
Blockers
| # | File:Line | Severity | Finding |
|---|---|---|---|
| 1 | src/ouroboros/cli/commands/codex.py:75 | BLOCKING | ouroboros codex doctor --omx-state adds a core product CLI surface for external OMX harness state by reading .omx/state/... and reporting stale OMX deep-interview locks. That makes OMX runtime/handoff state part of the maintained Ouroboros CLI contract without an approved integration boundary or ownership model. |
Follow-up Findings
tests/unit/cli/test_codex_command.py:370[warning] The new tests assert OMX-specific doctor behavior, including.omx/statelayout and stale lock wording, so this is not incidental diagnostic text. It locks the external harness state model into the product test suite and reinforces the boundary violation above.
| # | File:Line | Priority | Confidence | Suggestion |
|---|-----------|----------|------------|------------|
| 1 | src/ouroboros/cli/commands/codex.py:730 | Low | Medium |_safe_default_hotfix_status()probes onlylib/python3.12/site-packages/...; uselib/python*/site-packages/...so the diagnostic works for supported Python 3.13/3.14 uv archive layouts. |
Non-blocking Suggestions
| 1 | src/ouroboros/cli/commands/codex.py:730 | Diagnostic fidelity | Hardcoded python3.12 limits the archive hotfix probe on newer supported runtimes. |
Test Coverage Notes
- Reviewed changed unit tests for auto ambiguity blocking, handler-side safe-default completion, Codex doctor diagnostics, transport classification, sandbox permission mapping, and Codex runtime permission provenance.
- Could not run pytest:
/usr/bin/python3: No module named pytest. - Ran
python3 -m compileall -qon all changed source/test files successfully. - Coverage gap: no production-path test drives
AutoInterviewDriverto emit the new safe-default completion marker because the driver currently never emits it.
Design Notes
The recoverable seed ambiguity change is narrow and consistent with the auto pipeline state model. The safe-default completion contract is split across handler and driver, and the new OMX doctor path crosses a product/external-harness boundary.
Design / Roadmap Gate
Affected boundaries include auto interview persistence, MCP authoring completion, Codex doctor diagnostics, and CLI runtime error classification. The auto safe-default consumer exists but lacks a production producer, so replay/resume remains blocked in the intended path. The OMX diagnostic reads and interprets external .omx runtime state from the core CLI, which needs to be removed, moved to the harness side, or proposed as an explicit maintained integration boundary.
Directional Notes
Review focus was shaped by auto persistence/resume safety, fail-closed MCP dispatch behavior, and the maintainer boundary against adding OMX/OMC/OpenClaw harness state surfaces to Ouroboros product code. Memory was advisory only; blockers above use current source evidence.
Test Coverage
- Reviewed changed unit tests for auto ambiguity blocking, handler-side safe-default completion, Codex doctor diagnostics, transport classification, sandbox permission mapping, and Codex runtime permission provenance.
- Could not run pytest:
/usr/bin/python3: No module named pytest. - Ran
python3 -m compileall -qon all changed source/test files successfully. - Coverage gap: no production-path test drives
AutoInterviewDriverto emit the new safe-default completion marker because the driver currently never emits it.
Merge Recommendation
REQUEST_CHANGES. Wire the safe-default completion marker through the production auto driver with a regression test, and remove or split the OMX state doctor surface from core Ouroboros unless maintainers explicitly approve that integration boundary.
Policy Notes
- Omitted 1 finding(s) that referenced files outside the current PR changed-files scope.
Review-Metadata:
verdict: REQUEST_CHANGES
head_sha: e3eb67f
request_id: req_1779723823_14
review_profile: memory-aware-zero-trust-v2
advisory_memory_only: true
Reviewed by ouroboros-agent[bot] via Codex deep analysis
There was a problem hiding this comment.
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Metadata
| Field | Value |
|### Recovery Notes
First recoverable review artifact generated from codex analysis log.
---|---|
| PR | #1229 |
| HEAD checked | e3eb67fae48ebf34ca36c45846ea1c2d2ed783a7 |
| Request ID | req_1779724828_15 |
| Review record | 655b24ac-2229-42b1-94b0-552938ce765c |
What Improved
- Reclassifies seed ambiguity validation as blocked/recoverable.
- Adds clearer Codex MCP transport diagnostics and permission provenance logging.
- Adds handler-side support for explicit safe-default interview completion.
Issue Requirements
| Requirement | Status |
|---|---|
| PR body contains no substantive requirement text | N/A |
| Fix auto skip-run closure diagnostics / complete safe-default closure when persisted interview is closed by safe-default synthesis | Not met |
| Improve Codex/MCP transport diagnostics | Partially met |
| Add OMX state diagnostics | Not met as product-boundary design |
| Reclassify seed ambiguity threshold validation as blocked/recoverable | Met |
Prior Findings Status
Maintained the later prior bot concerns: the OMX product-boundary blocker is still present at codex.py:75/:495, and the hardcoded Python archive probe remains non-blocking at codex.py:735. I also maintain the safe-default producer concern: current source still shows the driver blocks at max rounds instead of producing the new completion marker.
Blockers
| # | File:Line | Severity | Finding |
|---|---|---|---|
| 1 | src/ouroboros/cli/commands/codex.py:75 | BLOCKING | ouroboros codex doctor --omx-state adds a maintained product CLI surface for external OMX runtime state: it advertises .omx inspection and _omx_deep_interview_state_lines() reads .omx/state/... to report stale OMX deep-interview locks at src/ouroboros/cli/commands/codex.py:495. That pulls external harness/handoff state into core Ouroboros CLI behavior without an approved integration boundary or ownership contract. |
Follow-up Findings
| # | File:Line | Priority | Confidence | Suggestion |
|---|---|---|---|---|
| 1 | src/ouroboros/cli/commands/codex.py:735 | Low | Medium | _safe_default_hotfix_status() probes only lib/python3.12/site-packages/...; use lib/python*/site-packages/... so the diagnostic works for supported Python 3.13/3.14 uv archive layouts. |
Non-blocking Suggestions
| 1 | tests/unit/cli/test_codex_command.py:370 | Product boundary | The tests assert OMX-specific .omx/state layout and stale lock wording, which makes the external harness state model part of the maintained product test contract. This reinforces blocker #1. |
Test Coverage Notes
- Reviewed changed unit tests for ambiguity blocking, handler-side safe-default completion, Codex doctor diagnostics, transport classification, sandbox mapping, and permission provenance.
- Could not run targeted pytest:
uvis not installed, andpython3 -m pytestreportsNo module named pytest. - Coverage gap: no production-path test drives
AutoInterviewDriverto emit the new[from-auto][safe-default-synthesis]completion marker.
Design Notes
The ambiguity and transport-classification changes are narrow. The risky parts are the OMX diagnostic surface crossing a repository boundary, and the incomplete safe-default producer/consumer contract between auto driver and MCP interview handler.
Design / Roadmap Gate
Affected boundaries include Codex CLI diagnostics, MCP authoring handler completion, auto interview persistence, and auto resume behavior. The PR adds an external .omx/state contract to core CLI output, and it adds a handler-side completion bypass without wiring the production auto driver to produce the audited marker. Those boundaries are not safe to merge as-is.
Directional Notes
Maintainer memory and PR comments shaped review focus around avoiding external OMX/OMC/OpenClaw harness state in product code and checking auto persistence/replay boundaries. Both blockers above are grounded in current source evidence.
Test Coverage
- Reviewed changed unit tests for ambiguity blocking, handler-side safe-default completion, Codex doctor diagnostics, transport classification, sandbox mapping, and permission provenance.
- Could not run targeted pytest:
uvis not installed, andpython3 -m pytestreportsNo module named pytest. - Coverage gap: no production-path test drives
AutoInterviewDriverto emit the new[from-auto][safe-default-synthesis]completion marker.
Merge Recommendation
REQUEST_CHANGES. Remove or split the OMX state diagnostics behind an explicitly approved integration boundary, and wire/test the production auto driver safe-default completion path so the new handler contract is reachable.
Policy Notes
- Omitted 1 finding(s) that referenced files outside the current PR changed-files scope.
Review-Metadata:
verdict: REQUEST_CHANGES
head_sha: e3eb67f
request_id: req_1779724828_15
review_profile: memory-aware-zero-trust-v2
advisory_memory_only: true
Reviewed by ouroboros-agent[bot] via Codex deep analysis
There was a problem hiding this comment.
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Metadata
| Field | Value |
|### Recovery Notes
First recoverable review artifact generated from codex analysis log.
---|---|
| PR | #1229 |
| HEAD checked | e3eb67fae48ebf34ca36c45846ea1c2d2ed783a7 |
| Request ID | req_1779725137_16 |
| Review record | 2bc5352f-372c-4256-961c-94c08f34dab9 |
What Improved
- Reclassifies seed ambiguity validation as blocked/recoverable.
- Adds clearer Codex/MCP transport diagnostics and permission provenance logging.
- Adds a handler-side safe-default interview completion marker.
Issue Requirements
| Requirement | Status |
|---|---|
| PR body contains no substantive requirement text | N/A |
| Fix auto skip-run closure diagnostics / complete safe-default closure when the persisted interview is closed by safe-default synthesis | Not met |
| Improve Codex/MCP transport diagnostics | Partially met |
| Add OMX state diagnostics | Not met as product-boundary design |
| Reclassify seed ambiguity threshold validation as blocked/recoverable | Met |
Prior Findings Status
Prior approvals are superseded. I maintain the OMX boundary blocker with current evidence at src/ouroboros/cli/commands/codex.py:75 and test evidence at tests/unit/cli/test_codex_command.py:370. I also maintain the safe-default producer gap: the handler accepts the marker, but the production driver still never emits it and blocks at max rounds. The hardcoded Python archive probe remains non-blocking.
Blockers
| # | File:Line | Severity | Finding |
|---|---|---|---|
| 1 | src/ouroboros/cli/commands/codex.py:75 | BLOCKING | ouroboros codex doctor --omx-state adds a core product CLI surface for external OMX harness state, then reads .omx/state/... via _omx_deep_interview_state_lines() and reports stale OMX deep-interview locks. That makes OMX runtime/handoff state part of the maintained Ouroboros CLI contract without an approved integration boundary or ownership model. |
Follow-up Findings
| # | File:Line | Priority | Confidence | Suggestion |
|---|---|---|---|---|
| 1 | src/ouroboros/cli/commands/codex.py:730 | Low | Medium | _safe_default_hotfix_status() probes only lib/python3.12/site-packages/...; use lib/python*/site-packages/... so the diagnostic works for supported Python 3.13/3.14 uv archive layouts. |
Non-blocking Suggestions
| 1 | tests/unit/cli/test_codex_command.py:370 | Product boundary | The tests assert OMX .omx/state layout and stale lock wording, reinforcing the external harness state as a product CLI contract. This is supporting evidence for blocker #1. |
Test Coverage Notes
- Ran
SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 /root/.local/bin/uv run python -m pytest ...for changed Codex command tests and targeted auto/MCP/orchestrator/sandbox tests: 28 passed. - Coverage gap remains: no production-path test drives
AutoInterviewDriverto emit the new[from-auto][safe-default-synthesis]completion marker.
Design Notes
The transport and ambiguity changes fit existing fail-closed behavior. The risky pieces are the new external OMX product surface and the incomplete producer/consumer contract for safe-default completion.
Design / Roadmap Gate
Affected boundaries include Codex CLI diagnostics, auto interview persistence/completion, MCP authoring handler completion, and runtime dispatch classification. The OMX diagnostic crosses an ownership boundary by embedding external harness state layout into the product CLI. The safe-default marker adds a handler contract without a production auto-driver producer, so replay/resume still reaches the existing blocked max-rounds path.
Directional Notes
Review focus was shaped by runtime/persistence risk, MCP dispatch parity, and the maintainer boundary against OMX/OMC/OpenClaw harness state in product code. Memory was advisory; blockers above are based on current source evidence.
Test Coverage
- Ran
SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 /root/.local/bin/uv run python -m pytest ...for changed Codex command tests and targeted auto/MCP/orchestrator/sandbox tests: 28 passed. - Coverage gap remains: no production-path test drives
AutoInterviewDriverto emit the new[from-auto][safe-default-synthesis]completion marker.
Merge Recommendation
REQUEST_CHANGES. Remove or split the OMX state diagnostics behind an explicitly approved integration boundary, and wire the auto driver to emit the audited safe-default completion marker with a production-path regression test.
Policy Notes
- Omitted 1 finding(s) that referenced files outside the current PR changed-files scope.
Review-Metadata:
verdict: REQUEST_CHANGES
head_sha: e3eb67f
request_id: req_1779725137_16
review_profile: memory-aware-zero-trust-v2
advisory_memory_only: true
Reviewed by ouroboros-agent[bot] via Codex deep analysis
Summary
Verification
Notes