Skip to content

fix(tests): align canonical live-run evidence#1195

Merged
shaun0927 merged 1 commit into
mainfrom
fix/canonical-live-evidence-docs
May 25, 2026
Merged

fix(tests): align canonical live-run evidence#1195
shaun0927 merged 1 commit into
mainfrom
fix/canonical-live-evidence-docs

Conversation

@Q00
Copy link
Copy Markdown
Owner

@Q00 Q00 commented May 23, 2026

Summary

  • Update canonical harness docs now that PR feat(tests): L0 live-wire + L1 catalog cross-validate (P1) #1191 wired the opt-in live ouroboros_auto path.
  • Change the copyable canonical summary marker from live=deferred_l0b to live=opt_in.
  • Update summary-line regression tests so PR/SSOT evidence cannot keep advertising a deferred live path after it is wired.

Review context

This follows up the post-v0.39.1 canonical harness sequence: #1174 added the L0 skeleton, #1182 added copyable summary evidence, and #1191 wired the opt-in live run. After #1191, the README and summary line still said the live run was deferred to L0-b, which could mislead maintainers reading the canonical acceptance evidence.

Validation

  • SETUPTOOLS_SCM_PRETEND_VERSION=0.39.2 uv run pytest tests/canonical/ -q -> 17 passed, 1 skipped
  • SETUPTOOLS_SCM_PRETEND_VERSION=0.39.2 uv run pytest tests/canonical/ -q -k cli-todo -> 8 passed, 1 skipped, 9 deselected
  • SETUPTOOLS_SCM_PRETEND_VERSION=0.39.2 uv run ruff check tests/canonical/ -> passed
  • SETUPTOOLS_SCM_PRETEND_VERSION=0.39.2 uv run ruff format --check tests/canonical/ -> passed
  • git diff --check -> passed
  • rg -n "deferred_l0b|Once the live wiring lands|actual invocation lands|once L0-b lands|skips with a typed reason" tests/canonical -> no matches

Copy link
Copy Markdown
Contributor

@ouroboros-agent ouroboros-agent Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — ouroboros-agent[bot]

Verdict: APPROVE

Reviewing commit ae8850c for PR #1195

Review record: afd367e6-14cf-4d95-8e39-12474bd9ce6f

Blocking Findings

No in-scope blocking findings remained after policy filtering.

Non-blocking Suggestions

None.

Design Notes

Unable to assess architecture or implementation design because the PR artifacts and changed files were not readable in the current command sandbox.

Policy Notes

  • Omitted 1 finding(s) that referenced files outside the current PR changed-files scope.

Recovery Notes

First recoverable review artifact generated from codex analysis log.


Reviewed by ouroboros-agent[bot] via Codex deep analysis

Copy link
Copy Markdown
Contributor

@ouroboros-agent ouroboros-agent Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

PR #1195
Branch: fix/canonical-live-evidence-docs | 3 files, +23/-24 | CI: Bridge TypeScript pass 12s https://github.com/Q00/ouroboros/actions/runs/26325082985/job/77501196509
Scope: architecture-level
HEAD checked: ae8850cbd851fc4c0d7ad48322b3c50f563ce9a9

What Improved

  • Canonical summary marker now reports live=opt_in instead of live=deferred_l0b.
  • README copy now describes the live canonical run as an explicit maintainer opt-in.
  • Summary-line regression tests were updated to pin the new marker.

Issue #N/A Requirements

Requirement Status
Change canonical summary marker from live=deferred_l0b to live=opt_in Met
Update README copy so canonical live run is no longer described as deferred to L0-b Partially met; README is updated, but tests/canonical/conftest.py:3 still describes the live invocation as future work
Update summary-line regression tests Met
Remove stale deferred-live evidence from canonical harness boundary Not met

Prior Findings Status

Prior Finding Status
Prior review context MODIFIED — No prior review concerns were available in the provided artifacts, so none were maintained, modified, or withdrawn.

Blockers

# File:Line Severity Confidence Finding
1 tests/canonical/conftest.py:3 Medium 95% The PR is meant to remove stale “live path is deferred” evidence, but the canonical harness module still says this is the “L0-a slice” and that the actual ouroboros_auto invocation lands in a follow-up sub-PR at lines 3-6. Current HEAD already has the live invocation in tests/canonical/test_canonical.py:178 and calls it at tests/canonical/test_canonical.py:246, while the updated README now tells maintainers the path is wired. Leaving this contradiction in the harness boundary preserves exactly the stale acceptance evidence this PR claims to fix.

Follow-ups

# File:Line Priority Confidence Suggestion
None.

Test Coverage

  • Ran SETUPTOOLS_SCM_PRETEND_VERSION=0.39.2 uv run pytest tests/canonical/ -q: 17 passed, 1 skipped.
  • Ran SETUPTOOLS_SCM_PRETEND_VERSION=0.39.2 uv run pytest tests/canonical/ -q -k cli-todo: 8 passed, 1 skipped, 9 deselected.
  • Ran SETUPTOOLS_SCM_PRETEND_VERSION=0.39.2 uv run ruff check tests/canonical/: passed.
  • Ran SETUPTOOLS_SCM_PRETEND_VERSION=0.39.2 uv run ruff format --check tests/canonical/: passed.

Design / Roadmap Gate

Affected-boundary review covered the copyable summary contract, README operator guidance, pytest terminal summary, and the opt-in live-run boundary. The functional summary marker is updated and covered by tests, but the same harness module still carries stale deferred-live documentation, so the acceptance evidence boundary remains internally inconsistent.

Merge Recommendation

Do not merge until the stale tests/canonical/conftest.py module docstring is updated to match the current opt-in live-run contract.

Review-Metadata:
verdict: REQUEST_CHANGES
github_event: COMMENT
review_kind: full
merge_eligible: false
head_sha: ae8850c
source_read_ok: true
diff_read_ok: true
blocking_count: 0

Copy link
Copy Markdown
Collaborator

@shaun0927 shaun0927 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pure docs/evidence-line correction matching the post-#1191 reality. #1157 explicitly names this PR as the remaining L0 cleanup. The live=deferred_l0b → live=opt_in flip is the right copyable summary marker now that OUROBOROS_RUN_CANONICAL=1 is wired, and the README scrub of the "Once live wiring lands in L0-b" / "skips with a typed reason" wording is consistent with the canonical README "live-run path" section. Tests already pin the new marker. Safe to merge first.

Copy link
Copy Markdown
Contributor

@ouroboros-agent ouroboros-agent Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — ouroboros-agent[bot]

Verdict: APPROVE

Metadata

| Field | Value |
|### Recovery Notes
First recoverable review artifact generated from codex analysis log.

---|---|
| PR | #1195 |
| HEAD checked | ae8850cbd851fc4c0d7ad48322b3c50f563ce9a9 |
| Request ID | req_1779668968_23 |
| Review record | 648bfd52-daba-4ab4-89b6-8a891aa5df8b |

What Improved

  • Updates the canonical acceptance harness status from deferred live execution to explicit opt-in live execution.
  • Aligns README examples and summary-line tests with the current OUROBOROS_RUN_CANONICAL=1 behavior.

Issue Requirements

Requirement Status
#1157 / #1170 canonical acceptance harness manual opt-in live path Satisfied by the existing OUROBOROS_RUN_CANONICAL=1 live test path and this PR’s updated summary/docs language.

Prior Findings Status

No prior human or inline review comments were present in the provided artifacts.

Blockers

No in-scope blocking findings remained after policy filtering.

Follow-up Findings

# File:Line Priority Confidence Suggestion
None.

Non-blocking Suggestions

| 1 | tests/canonical/conftest.py:3 | Documentation | The module docstring still says the actual ouroboros_auto invocation lands in a follow-up sub-PR, but tests/canonical/test_canonical.py already contains the opt-in live invocation path. This is stale commentary only; the changed summary-line contract is correct. |

Test Coverage Notes

  • Reviewed changed tests in tests/canonical/test_conftest.py and surrounding live-run coverage in tests/canonical/test_canonical.py.
  • Could not execute the focused pytest suite in this container: uv is unavailable, python is unavailable, and python3 has no pytest installed.

Design Notes

The change is narrow and matches the current harness architecture: hermetic shape/catalog checks remain the default, while live LLM execution is clearly opt-in.

Design / Roadmap Gate

Affected boundary is the canonical harness reporting contract. The changed summary string, README examples, and unit assertions agree with the existing runtime boundary in test_scenario_live_run_or_skip: no live execution occurs unless OUROBOROS_RUN_CANONICAL is set.

Directional Notes

Maintainer-memory signals pushed review focus toward evidence-backed acceptance behavior and avoiding misleading status claims. Current source supports live=opt_in; no blocker found.

Test Coverage

  • Reviewed changed tests in tests/canonical/test_conftest.py and surrounding live-run coverage in tests/canonical/test_canonical.py.
  • Could not execute the focused pytest suite in this container: uv is unavailable, python is unavailable, and python3 has no pytest installed.

Merge Recommendation

Approve. No blocking runtime, contract, persistence, or test-harness regressions were found in the changed files or affected canonical test path.

Review-Metadata:
verdict: APPROVE
head_sha: ae8850c
request_id: req_1779668968_23
review_profile: memory-aware-zero-trust-v2
advisory_memory_only: true


Reviewed by ouroboros-agent[bot] via Codex deep analysis

@shaun0927 shaun0927 merged commit ff9e4be into main May 25, 2026
8 checks passed
@shaun0927 shaun0927 deleted the fix/canonical-live-evidence-docs branch May 25, 2026 07:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants