Skip to content

fix(auto): route Ralph oscillation replay through UNSTUCK_LATERAL#1187

Merged
shaun0927 merged 2 commits into
Q00:mainfrom
shaun0927:fix/pr1175-ralph-oscillation-replay
May 23, 2026
Merged

fix(auto): route Ralph oscillation replay through UNSTUCK_LATERAL#1187
shaun0927 merged 2 commits into
Q00:mainfrom
shaun0927:fix/pr1175-ralph-oscillation-replay

Conversation

@shaun0927
Copy link
Copy Markdown
Collaborator

Summary

Follow-up correction to merged PR #1175 after the latest ouroboros-agent review correctly identified a persistence/replay contract gap.

PR #1175 implemented the L5-a live handoff route:

Ralph failed + stop_reason="oscillation_detected" + complete_product + lateral_thinker -> UNSTUCK_LATERAL

but the same Ralph terminal status could still be consumed by resume polling and the direct re-attach observer as a legacy direct BLOCKED outcome. Since #1157 defines L5-a as plumbing Ralph's existing oscillation_detected signal into the existing ooo unstuck substrate, the terminal consumer boundary needs to be consistent across fresh handoff, resume polling, and re-attach.

What changed

  • Added one shared helper in src/ouroboros/auto/pipeline.py for the L5-a Ralph oscillation route.
  • Reused that helper from:
    • fresh _handoff_to_ralph,
    • _poll_ralph_job resume polling,
    • _reattach_ralph_job terminal observation.
  • Recovered the persisted Seed artifact when a replay/re-attach path does not already have a Seed object in hand.
  • Added regression tests proving resume polling and re-attach both invoke lateral recovery for oscillation_detected and can continue into the existing recovery redispatch path.

Why this stays minimal-substrate

This intentionally does not add a new escalation ladder, new detector, new terminal taxonomy, or new state machine. It only makes the existing L5-a route single-sourced across the three existing Ralph terminal consumers. Other Ralph blocked reasons (iteration_timeout, wall_clock_exhausted, grade_regressing, max_generations reached) still go directly to BLOCKED as budget/regression terminals.

Verification

  • SETUPTOOLS_SCM_PRETEND_VERSION_FOR_OUROBOROS_AI=0.0.0 SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 uv run pytest tests/unit/auto/test_pipeline_oscillation_lateral.py tests/unit/auto/test_pipeline_ralph_handoff.py -q → 46 passed
  • SETUPTOOLS_SCM_PRETEND_VERSION_FOR_OUROBOROS_AI=0.0.0 SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 uv run pytest tests/unit/auto -q → 907 passed
  • uv run ruff check src/ouroboros/auto/pipeline.py tests/unit/auto/test_pipeline_oscillation_lateral.py → clean
  • uv run ruff format --check src/ouroboros/auto/pipeline.py tests/unit/auto/test_pipeline_oscillation_lateral.py → clean
  • SETUPTOOLS_SCM_PRETEND_VERSION_FOR_OUROBOROS_AI=0.0.0 SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 uv run mypy src/ouroboros/auto/pipeline.py → clean

Refs #1175, #1157, #961.

shaun0927 and others added 2 commits May 22, 2026 23:08
Keep PR Q00#1175's minimal-substrate direction intact by applying the existing oscillation_detected to UNSTUCK_LATERAL route consistently at fresh handoff, resume polling, and re-attach terminal consumers instead of adding a new escalation ladder or detector.\n\nConstraint: PR Q00#1175 is already merged, so this lands as a follow-up correction PR on current main.\nRejected: New L5 state machine or new oscillation detector | Q00#1157 explicitly scopes L5-a to plumbing existing Ralph stop_reason into existing unstuck substrate.\nConfidence: high\nScope-risk: narrow\nDirective: Keep non-oscillation Ralph blocked reasons as direct budget-terminal BLOCKED outcomes unless Q00#1157 changes.\nTested: SETUPTOOLS_SCM_PRETEND_VERSION_FOR_OUROBOROS_AI=0.0.0 SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 uv run pytest tests/unit/auto/test_pipeline_oscillation_lateral.py tests/unit/auto/test_pipeline_ralph_handoff.py -q; SETUPTOOLS_SCM_PRETEND_VERSION_FOR_OUROBOROS_AI=0.0.0 SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 uv run pytest tests/unit/auto -q; uv run ruff check src/ouroboros/auto/pipeline.py tests/unit/auto/test_pipeline_oscillation_lateral.py; uv run ruff format --check src/ouroboros/auto/pipeline.py tests/unit/auto/test_pipeline_oscillation_lateral.py; SETUPTOOLS_SCM_PRETEND_VERSION_FOR_OUROBOROS_AI=0.0.0 SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 uv run mypy src/ouroboros/auto/pipeline.py\nNot-tested: Full repository test suite outside tests/unit/auto.
Request bot and CI re-evaluation for the already-pushed Q00#1175 follow-up correction without changing code.

Constraint: GitHub reported no checks or bot reviews after the initial PR push.\nConfidence: high\nScope-risk: narrow\nDirective: Do not treat this commit as a code change; review the previous commit for implementation substance.\nTested: Not rerun; empty trigger commit only.\nNot-tested: Bot availability.

Co-authored-by: OmX <omx@oh-my-codex.dev>
Copy link
Copy Markdown
Contributor

@ouroboros-agent ouroboros-agent Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — ouroboros-agent[bot]

Verdict: APPROVE

Reviewing commit 415b08a for PR #1187

Review record: f8288a27-c679-402c-ae1e-6656b256e2db

Blocking Findings

No in-scope blocking findings remained after policy filtering.

Non-blocking Suggestions

None.

Design Notes

Unable to assess architecture or implementation because the provided source snapshot and diff could not be read in this runner.

Policy Notes

  • Omitted 1 finding(s) that referenced files outside the current PR changed-files scope.

Recovery Notes

First recoverable review artifact generated from codex analysis log.


Reviewed by ouroboros-agent[bot] via Codex deep analysis

@shaun0927
Copy link
Copy Markdown
Collaborator Author

Merge-readiness rationale

What this PR fixes

This PR is a narrow follow-up to merged PR #1175. PR #1175 correctly implemented the L5-a live-path behavior from #1157: when Ralph terminates with stop_reason == "oscillation_detected" in complete-product mode and a lateral_thinker is wired, ooo auto should route through the existing UNSTUCK_LATERAL / ooo unstuck substrate before giving up.

The latest review on #1175 found the real gap: the same Ralph terminal contract had three consumers, but only the fresh handoff consumer used the L5-a route. Resume polling and direct re-attach still mapped oscillation_detected straight to BLOCKED, so a disconnected or replayed long-running session could skip the recovery path that the live session would have taken.

Why this is aligned with #961 / #1157

This keeps the AgentOS / ooo auto direction intentionally minimal-substrate:

  • No new escalation-ladder state machine.
  • No new oscillation detector.
  • No new terminal taxonomy.
  • No new recovery substrate.
  • Only the existing oscillation_detected signal is plumbed into the existing UNSTUCK_LATERAL route consistently across existing Ralph terminal consumers.

That matches #1157's L5-a scope: "plumb the existing detection signal into the existing recovery substrate." It also fits #961's current Track B sequencing: this is a narrow ooo auto self-healing correction outside Track C tier gates, not a broad AgentOS substrate change.

What changed technically

  • Added a single shared helper for the L5-a oscillation_detected -> UNSTUCK_LATERAL decision.
  • Reused it from:
    • fresh _handoff_to_ralph,
    • _poll_ralph_job resume polling,
    • _reattach_ralph_job terminal observation.
  • Recovered the persisted Seed artifact when replay/re-attach paths do not already have a live Seed object.
  • Added regression tests that prove both resume polling and re-attach invoke lateral recovery for oscillation_detected and can continue into the existing recovery redispatch flow.

Why it is safe to merge

  • The change is intentionally narrow: only src/ouroboros/auto/pipeline.py and tests/unit/auto/test_pipeline_oscillation_lateral.py changed.
  • Non-oscillation Ralph blocked reasons (iteration_timeout, wall_clock_exhausted, grade_regressing, max_generations reached) still follow the existing direct BLOCKED path. The existing iteration_timeout regression test remains in place.
  • The implementation removes duplicate policy from individual terminal consumers rather than adding another layer of orchestration.
  • CI is fully green on fix(auto): route Ralph oscillation replay through UNSTUCK_LATERAL #1187: Ruff, MyPy, Python 3.12/3.13/3.14 tests, boundary/envelope checks, and Bridge TypeScript all passed.
  • Local verification also passed: targeted Ralph/L5-a tests, full tests/unit/auto, Ruff, format check, and mypy on pipeline.py.
  • ouroboros-agent[bot] returned APPROVE on commit 415b08a, with no blocking or non-blocking findings remaining.

Current status

This PR improves the already-merged #1175 behavior to the level the later bot review requested. It preserves the SSOT direction, avoids over-engineering, and closes the replay/resume consistency gap. I consider it ready to merge.

@shaun0927
Copy link
Copy Markdown
Collaborator Author

PR Review Summary

Verdict

Approve

Scope Reviewed

  • PR intent: Follow-up to merged feat(auto): route Ralph oscillation_detected through UNSTUCK_LATERAL (L5-a) #1175 to make L5-a Ralph oscillation_detected handling consistent across fresh handoff, resume polling, and re-attach terminal consumers.
  • Main changed areas: src/ouroboros/auto/pipeline.py centralizes the L5-a route in _maybe_route_ralph_oscillation_to_lateral; tests/unit/auto/test_pipeline_oscillation_lateral.py adds resume/re-attach regression coverage.
  • Tests reviewed: new resume/re-attach tests, existing live-path/no-lateral/iteration-timeout tests, and the broader Ralph handoff regression suite.
  • Checks considered: local targeted tests, full tests/unit/auto, Ruff, format check, mypy; GitHub CI on fix(auto): route Ralph oscillation replay through UNSTUCK_LATERAL #1187 is fully green.

Blocking Issues

None.

Warnings

None.

Mutation-Test Thinking

  • Likely mutants that should be killed:
    • Removing the _poll_ralph_job call to _maybe_route_ralph_oscillation_to_lateral would be killed by test_ralph_resume_oscillation_enters_unstuck_lateral_when_wired.
    • Removing the _reattach_ralph_job call to the helper would be killed by test_ralph_reattach_oscillation_enters_unstuck_lateral_when_wired.
    • Broadening the route to iteration_timeout would remain caught by test_ralph_iteration_timeout_does_not_invoke_lateral.
    • Removing the lateral_thinker / complete-product gate would be caught by the existing no-lateral direct-block regression and complete-product state expectations.
  • Mutants current tests may not catch: a malformed legacy state with no persisted Seed artifact on the re-attach helper would fall back to direct blocking; that is acceptable because lateral recovery needs a Seed for redispatch.
  • Additional tests recommended: none required for merge. A future L5-b PR can add explicit unstuck_exhausted tests when that stop reason is introduced.

Complexity / CRAP-style Risk

  • High-risk functions/modules: AutoPipeline is already a complex orchestration module, but this PR reduces policy duplication by moving one branch into a shared helper.
  • Complexity increase: low. The helper replaces duplicated live-path logic and is called from three existing terminal consumers.
  • Test coverage concern: low. The exact bot-requested replay/resume gaps are now pinned by tests, and the broader auto unit suite passed.
  • Refactoring recommendation: no further refactor in this PR. Extracting broader Ralph terminal mapping can wait until another terminal consumer or L5-b typed terminal lands.

Test Quality Assessment (6/7)

  • Strong tests: The new tests exercise the actual terminal-consumer paths, assert that lateral recovery is invoked, verify the oscillation marker reaches the synthetic QA payload, and prove the existing redispatch path can continue afterward.
  • Weak tests: The direct _reattach_ralph_job helper is not currently wired by a runtime caller, but it was the exact code path flagged by the bot and is still worth pinning.
  • Missing edge cases: No explicit no-Seed replay fixture; acceptable because the fallback remains safe direct blocking and no production evidence requires a new behavior there.
  • Mocking concerns: Stubs are minimal and consistent with existing Ralph handoff tests; assertions focus on state transitions and captured integration payloads rather than implementation internals.

Security / Operational Risk

None. No new I/O, no secrets, no auth/authz surface, no migration, and no external dependency. The change only adjusts existing in-process terminal mapping for a known Ralph stop reason.

Looks Good

Final Recommendation

Approve. This PR is merge-ready: it fixes the replay/resume consistency gap from #1175, stays aligned with the #961/#1157 SSOT direction, avoids over-engineering, and has adequate regression coverage plus green CI.

@shaun0927
Copy link
Copy Markdown
Collaborator Author

Independent merge-readiness review — 2026-05-23

I re-reviewed PR #1187 independently against the current #961 / #1157 SSOT direction, ignoring bot-review conclusions and using the local worktree at head 415b08a.

What this PR does

This PR is a narrow follow-up to #1175. It fixes the consumer side of the L5-a contract: when Ralph returns terminal_status="failed" with stop_reason="oscillation_detected" in complete-product mode, the already-defined UNSTUCK_LATERAL / lateral recovery route is now applied consistently across:

  • fresh Ralph handoff,
  • persisted resume polling,
  • direct re-attach terminal observation.

SSOT / AgentOS direction check

This matches the #961 / #1157 direction. L5-a is explicitly the minimal-substrate path: plumb the existing Ralph oscillation_detected signal into the existing ooo unstuck / lateral recovery substrate. This PR does not add a new detector, escalation ladder, terminal taxonomy, state machine, dependency, or control-plane surface.

That is the important over-engineering check: the change centralizes duplicated terminal-consumer policy in one helper and reuses the existing lateral path. Non-oscillation budget/regression terminals still block directly, so the PR does not broaden recovery beyond the SSOT-approved case.

Current action plan and result

Recommended action: merge as-is. I did not push any new commit because the current implementation already satisfies the SSOT and no blocker or cleanup-worthy over-engineering issue was found.

Verification I ran locally from a separate worktree:

  • pytest tests/unit/auto/test_pipeline_oscillation_lateral.py tests/unit/auto/test_pipeline_ralph_handoff.py -q → 46 passed
  • ruff check src/ouroboros/auto/pipeline.py tests/unit/auto/test_pipeline_oscillation_lateral.py → passed
  • ruff format --check src/ouroboros/auto/pipeline.py tests/unit/auto/test_pipeline_oscillation_lateral.py → passed
  • mypy src/ouroboros/auto/pipeline.py → passed
  • pytest tests/unit/auto -q → 907 passed

GitHub checks are also green at head 415b08a: Ruff, MyPy, Bridge TypeScript, Python 3.12/3.13/3.14 tests, enforce-boundary, and enforce-envelope.

Conclusion: PR #1187 is merge-ready. It is a narrow, correct L5-a replay/resume consistency fix and remains aligned with the AgentOS minimal-substrate direction.

@shaun0927
Copy link
Copy Markdown
Collaborator Author

PR Review Summary

Verdict

Approve

Scope Reviewed

  • PR intent: make the L5-a Ralph oscillation_detectedUNSTUCK_LATERAL route consistent across fresh handoff, resume polling, and re-attach terminal consumers.
  • Main changed areas: src/ouroboros/auto/pipeline.py and tests/unit/auto/test_pipeline_oscillation_lateral.py.
  • Tests reviewed: new fresh/resume/re-attach oscillation tests, no-lateral fallback test, non-oscillation iteration_timeout direct-block regression, and the existing Ralph handoff suite.
  • Checks considered: local targeted pytest, local full tests/unit/auto, Ruff check, Ruff format check, mypy on pipeline.py, and current GitHub checks.

Blocking Issues

None.

Warnings

None.

Mutation-Test Thinking

  • Likely mutants that should be killed:
    • Removing the helper call from fresh _handoff_to_ralph should fail the fresh oscillation lateral test.
    • Removing the helper call from resume polling should fail test_ralph_resume_oscillation_enters_unstuck_lateral_when_wired.
    • Removing the helper call from _reattach_ralph_job should fail test_ralph_reattach_oscillation_enters_unstuck_lateral_when_wired.
    • Broadening the route to budget terminals should fail test_ralph_iteration_timeout_does_not_invoke_lateral.
    • Dropping the lateral_thinker gate should fail the no-lateral direct-block test.
  • Mutants current tests may not catch: malformed legacy state with neither an in-memory seed nor persisted seed_artifact still falls back to direct blocking. That is acceptable because redispatch through lateral recovery requires a recoverable Seed contract.
  • Additional tests recommended: none required for merge. Future L5-b should add coverage for unstuck_exhausted once that typed terminal lands.

Complexity / CRAP-style Risk

  • High-risk functions/modules: AutoPipeline remains a broad orchestration class, but this PR reduces policy duplication by centralizing one terminal mapping decision.
  • Complexity increase: low. The added helper is linear and gated by exact stop_reason, lateral availability, and complete-product mode.
  • Test coverage concern: low. The exact live/resume/re-attach consumer boundaries are covered.
  • Refactoring recommendation: no additional refactor in this PR. A broader Ralph terminal mapper can wait until more terminal routes are introduced.

Test Quality Assessment (6/7)

  • Strong tests: cover the real terminal-consumer paths, assert lateral invocation, preserve non-oscillation direct blocking, and verify replay/re-attach can continue into existing recovery redispatch behavior.
  • Weak tests: the re-attach path is invoked directly rather than through a full process-restart integration harness, but that matches the changed consumer boundary and is sufficient for this slice.
  • Missing edge cases: no explicit no-Seed replay fixture; acceptable because the safe fallback is direct blocking.
  • Mocking concerns: stubs are narrow and consistent with existing auto/Ralph handoff tests.

Security / Operational Risk

None. No new external I/O, credentials, subprocess behavior, migration, auth/authz surface, or dependency is introduced.

Looks Good

Final Recommendation

Approve. PR #1187 is merge-ready and should be safe to merge as a minimal-substrate L5-a consistency correction.

Copy link
Copy Markdown
Contributor

@ouroboros-agent ouroboros-agent Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — ouroboros-agent[bot]

Verdict: APPROVE

Reviewing commit 415b08a for PR #1187

Review record: b3c232d6-c112-4ed6-9755-06dce1d49034

Blocking Findings

No in-scope blocking findings remained after policy filtering.

Non-blocking Suggestions

None.

Design Notes

Unable to assess architecture or implementation details because the local snapshot and diff could not be read due to the sandbox namespace failure.

Policy Notes

  • Omitted 1 finding(s) that referenced files outside the current PR changed-files scope.

Recovery Notes

First recoverable review artifact generated from codex analysis log.


Reviewed by ouroboros-agent[bot] via Codex deep analysis

@shaun0927 shaun0927 merged commit 9dcbe9b into Q00:main May 23, 2026
8 checks passed
Copy link
Copy Markdown
Contributor

@ouroboros-agent ouroboros-agent Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

PR #1187
Branch: fix/pr1175-ralph-oscillation-replay | 2 files, +231/-50 | CI: Bridge TypeScript pass 19s https://github.com/Q00/ouroboros/actions/runs/26292809379/job/77397292834
Scope: architecture-level
HEAD checked: 415b08ae7468effa42cdb272f62baa7d50bcc179

What Improved

  • The Ralph oscillation_detected routing is now centralized and used by fresh handoff, resume polling, and direct re-attach.
  • Resume/re-attach paths can recover the persisted Seed artifact before entering UNSTUCK_LATERAL.
  • Added focused L5-a regression coverage; tests/unit/auto/test_pipeline_oscillation_lateral.py -q passes with 5 tests.

Issue #N/A Requirements

Requirement Status
Shared L5-a helper for Ralph oscillation route Met
Fresh _handoff_to_ralph routes oscillation_detected through lateral recovery Met
_poll_ralph_job resume polling routes oscillation_detected through lateral recovery Met
_reattach_ralph_job terminal observation routes oscillation_detected through lateral recovery Met
Replay/re-attach can recover persisted Seed artifact Met
Regression tests for resume and re-attach lateral recovery Met
Plugin/runtime metadata remains consistent with release source of truth Not met
Claude hook commands remain usable from a normal repository checkout Not met

Prior Findings Status

Prior Finding Status
Prior review context MODIFIED — Prior persistence/replay concern was modified: the Ralph oscillation consumer boundary is now shared across fresh handoff, resume polling, and direct re-attach. New current-HEAD blockers remain in the plugin metadata and Claude hook consumer boundaries.

Blockers

# File:Line Severity Confidence Finding
1 .claude-plugin/plugin.json:3 High 95% Current HEAD advertises the Claude plugin as 0.39.1, while the repo's own version source is still v0.24.0-1172-g415b08ae; python scripts/sync-plugin-version.py reports drift for .claude-plugin/plugin.json, .claude-plugin/marketplace.json, and skills/setup/SKILL.md. This breaks the plugin/version publication contract and can cause consumers to install or trust metadata for a release that does not match the Python package being built from this commit.
2 .claude/settings.json:8 High 85% The repo-local Claude hooks now depend solely on ${CLAUDE_PLUGIN_ROOT}; when that variable is unset in a normal checkout, the command expands to `python3 "/scripts/keyword-detector.py"

Follow-ups

# File:Line Priority Confidence Suggestion
None.

Test Coverage

  • Ran SETUPTOOLS_SCM_PRETEND_VERSION_FOR_OUROBOROS_AI=0.0.0 SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 uv run pytest tests/unit/auto/test_pipeline_oscillation_lateral.py -q: 5 passed.
  • Ran python scripts/sync-plugin-version.py: failed drift check, confirming the plugin metadata/version mismatch.
  • Missing coverage: no CI check currently prevents committed plugin metadata from drifting from hatch-vcs, and no hook test covers .claude/settings.json with CLAUDE_PLUGIN_ROOT unset.

Design / Roadmap Gate

Affected-boundary review: the state-machine/persistence/replay boundary for L5-a is coherent at current HEAD: failed Ralph terminal statuses with stop_reason="oscillation_detected" enter the same UNSTUCK_LATERAL path from live handoff, resume polling, and re-attach, while non-oscillation blocked reasons still terminate as BLOCKED.

However, the merged diff also changed consumer-facing plugin metadata and Claude hook execution surfaces. Those boundaries are now inconsistent: plugin manifests advertise a version that the repo's own sync/version tool rejects, and repo-local Claude hooks no longer resolve their scripts unless a plugin-specific environment variable is present. Those are current-HEAD blockers independent of the L5-a state-machine fix.

Merge Recommendation

Post-merge audit recommendation: keep the L5-a replay fix, but follow up immediately on current HEAD to restore plugin version consistency and make Claude hook script resolution work both in plugin-root and normal checkout contexts.

Review-Metadata:
verdict: REQUEST_CHANGES
github_event: COMMENT
review_kind: post_merge_audit
merge_eligible: false
head_sha: 415b08a
source_read_ok: true
diff_read_ok: true
blocking_count: 0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant