fix: reject status-masking test pipelines by Q00 · Pull Request #1208 · Q00/ouroboros

Q00 · 2026-05-24T10:29:09Z

Summary

keep presentation plumbing aliases for non-test commands and pipefail-protected test commands
reject clean pytest/test proof when a trailing output pipe can mask the upstream command status
cover the status-masking pytest ... | cat regression

Original PR blocker coverage

Addresses the anti-fabrication command proof blocker reported on #1186.

Verification

uv run pytest tests/unit/orchestrator/test_parallel_executor.py -q
uv run ruff check src/ouroboros/orchestrator/parallel_executor.py tests/unit/orchestrator/test_parallel_executor.py

ouroboros-agent

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

Metadata

| Field | Value |
|### Recovery Notes
First recoverable review artifact generated from codex analysis log.

---|---|
| PR | #1208 |
| HEAD checked | b3034d7952c221a2292ae095738515cf7b513297 |
| Request ID | req_retry_exhausted_new_timeout_1779693693_1208 |
| Review record | 3b8bed75-0e1f-4a7f-8d3c-d4e1bd27d244 |

What Improved

Tightens test-command evidence so pager-style output pipelines such as pytest ... | cat only prove a clean test claim when pipefail is detected.
Adds regression coverage for rejecting a status-masking output pipe.

Issue Requirements

Requirement	Status
No linked issue requirement captured	N/A

Prior Findings Status

No prior human or inline review concerns were present in the provided comment artifacts.

Blockers

#	File:Line	Severity	Finding
1	src/ouroboros/orchestrator/parallel_executor.py:1175	BLOCKING	`_uses_pipefail()` treats any occurrence of the text `pipefail` as proof that the shell preserved upstream pipeline status. This is the new trust gate used before accepting stripped test pipelines at lines 923-929 and 963-969, but it also returns true for commands like `pytest tests/test_pipefail.py \

Follow-up Findings

#	File:Line	Priority	Confidence	Suggestion
1	src/ouroboros/orchestrator/parallel_executor.py:1024	Medium	Medium	Consider accepting common valid pipefail preambles such as `set -e -o pipefail` or `set -euo pipefail`; the current safe-preamble check only recognizes exactly `set -o pipefail`, so some legitimate wrapped test commands with pipefail may still be rejected.

Non-blocking Suggestions

| 1 | tests/unit/orchestrator/test_parallel_executor.py:1188 | Test coverage | Add a regression test where pipefail appears in a test path, redirect target, or shell comment without enabling shell pipefail, because that is the bypass introduced by the current substring detector. |

Test Coverage Notes

Reviewed tests/unit/orchestrator/test_parallel_executor.py coverage around command-claim aliases and output-plumbing pipelines.
Could not run the tests in this container: python is absent, python3 is too old for this project’s Python 3.12+ syntax, and pytest is not installed.

Design Notes

The design direction is sound: pipe status preservation belongs at the evidence-normalization boundary before stripping output-only pipes. The implementation needs a stricter detector for the shell option that actually provides that guarantee.

Design / Roadmap Gate

Affected boundary: runtime Bash command evidence for commands_run / tests_passed. The PR changes when a recorded pipeline can prove a clean test invocation. Because the pipefail check is substring-based, commands that do not preserve upstream pipeline failure can cross that boundary and be treated as reliable test evidence.

Directional Notes

Maintainer memory pushed the review focus toward anti-fabrication integrity and thin, reliable evidence boundaries. The blocker above is based on current changed-source behavior, not memory alone.

Test Coverage

Reviewed tests/unit/orchestrator/test_parallel_executor.py coverage around command-claim aliases and output-plumbing pipelines.
Could not run the tests in this container: python is absent, python3 is too old for this project’s Python 3.12+ syntax, and pytest is not installed.

Merge Recommendation

Do not merge until _uses_pipefail() proves an actual enabled pipefail setting rather than matching arbitrary command text. The current implementation leaves a direct status-masking bypass in the evidence verifier.

Review-Metadata:
verdict: REQUEST_CHANGES
head_sha: b3034d7
request_id: req_retry_exhausted_new_timeout_1779693693_1208
review_profile: memory-aware-zero-trust-v2
advisory_memory_only: true

Reviewed by ouroboros-agent[bot] via Codex deep analysis

Tighten test-command pipeline evidence so the pipefail trust gate recognizes only an actual shell preamble instead of arbitrary command text. This closes the status-masking bypass where a filename or comment containing pipefail let output-filter pipelines prove clean test claims.\n\nServices: shared\nAffected files:\n- src/ouroboros/orchestrator/parallel_executor.py\n- tests/unit/orchestrator/test_parallel_executor.py

ouroboros-agent

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

Metadata

| Field | Value |
|### Recovery Notes
First recoverable review artifact generated from codex analysis log.

---|---|
| PR | #1208 |
| HEAD checked | a0d35f3b8b5142a67c7d1f66c49fced8df862f01 |
| Request ID | req_1779704190_12 |
| Review record | 14b8efa4-411a-46c4-8853-295011943159 |

What Improved

Tightens test-command evidence so output-filter pipelines only support clean test claims when pipefail is detected.
Adds regression coverage for status-masking pipes and arbitrary pipefail text.

Issue Requirements

Requirement	Status
No linked issue requirement captured	N/A

Prior Findings Status

Prior bot concern about arbitrary substring matching for pipefail is withdrawn: current HEAD uses shlex.split() and exact ["set", "-o", "pipefail"], with a regression test for comment/path text. A related blocker remains because the exact pipefail segment can still appear after the masked pipeline and be treated as protecting it.

Blockers

#	File:Line	Severity	Finding
1	src/ouroboros/orchestrator/parallel_executor.py:965	BLOCKING	The pipefail check is not ordered relative to the pipeline it is protecting. `_test_invocation_from_shell_body()` calls `_uses_pipefail(body)` for the whole shell body, and `_uses_pipefail()` accepts any later `set -o pipefail` segment. A transcript command like `/bin/bash -lc 'cd /workspace && pytest tests/test_foo.py \

Follow-up Findings

#	File:Line	Priority	Confidence	Suggestion
1	src/ouroboros/orchestrator/parallel_executor.py:1024	Medium	Medium	Consider accepting common valid pipefail preambles such as `set -e -o pipefail` or `set -euo pipefail`; the current safe-preamble check only recognizes exactly `set -o pipefail`.

Non-blocking Suggestions

| 1 | tests/unit/orchestrator/test_parallel_executor.py:1188 | Test coverage | Add a regression test where set -o pipefail appears after pytest ... \| cat, proving the verifier rejects pipefail settings that occur too late to preserve that pipeline’s status. |

Test Coverage Notes

Reviewed tests/unit/orchestrator/test_parallel_executor.py coverage around pipefail, status-masking output pipes, and arbitrary pipefail text.
Could not run pytest: /usr/bin/python3: No module named pytest. python3 -m py_compile passed for the changed source and test file.

Design Notes

The evidence-normalization boundary is the right place for this check, but the pipefail proof must be tied to shell execution order for the specific pipeline being stripped.

Design / Roadmap Gate

Affected boundary: runtime Bash command evidence for commands_run / tests_passed. The PR changes when a recorded pipeline can prove a clean test invocation. Because pipefail is accepted from anywhere in the shell body, stale or late shell state can still let status-masking output filters cross the evidence boundary.

Directional Notes

Maintainer memory shaped the focus on anti-fabrication integrity and reliable runtime evidence boundaries. The blocker above is based on current HEAD source behavior, not memory alone.

Test Coverage

Reviewed tests/unit/orchestrator/test_parallel_executor.py coverage around pipefail, status-masking output pipes, and arbitrary pipefail text.
Could not run pytest: /usr/bin/python3: No module named pytest. python3 -m py_compile passed for the changed source and test file.

Merge Recommendation

Do not merge until pipefail detection is ordered and scoped to the specific stripped pipeline. The current implementation still accepts a clean test claim for a pipeline whose failing upstream command can be masked before pipefail is enabled.

Review-Metadata:
verdict: REQUEST_CHANGES
head_sha: a0d35f3
request_id: req_1779704190_12
review_profile: memory-aware-zero-trust-v2
advisory_memory_only: true

Reviewed by ouroboros-agent[bot] via Codex deep analysis

ouroboros-agent

Review — ouroboros-agent[bot]

Verdict: APPROVE

Metadata

| Field | Value |
|### Recovery Notes
First recoverable review artifact generated from codex analysis log.

---|---|
| PR | #1208 |
| HEAD checked | 2288efa130c32b25105ebec680e707e8f7f4058c |
| Request ID | req_1779710548_26 |
| Review record | 56037304-fa2b-4f1e-a7b7-e8b1587a8c39 |

What Improved

Tightens runtime evidence matching so pager-style output pipelines for test commands only prove clean commands_run / tests_passed claims when pipefail is enabled before the protected pipeline.
Adds regression coverage for unprotected status-masking pipes, arbitrary pipefail text, and late set -o pipefail.

Issue Requirements

Requirement	Status
No linked issue requirement captured	N/A

Prior Findings Status

Prior bot blockers are withdrawn. The original arbitrary substring concern is addressed by exact shlex.split() based set -o pipefail recognition. The later ordering concern is addressed by _segments_after_safe_shell_preamble_with_pipefail() tracking pipefail state before each shell segment and by the regression at tests/unit/orchestrator/test_parallel_executor.py:1225.

Blockers

No in-scope blocking findings remained after policy filtering.

Follow-up Findings

#	File:Line	Priority	Confidence	Suggestion
None.

Non-blocking Suggestions

| 1 | src/ouroboros/orchestrator/parallel_executor.py:1198 | Cleanup | _uses_pipefail() is now unused after the ordered pipefail checks were introduced. Removing it would reduce duplicate pipefail helper surface. |

Test Coverage Notes

Reviewed changed tests in tests/unit/orchestrator/test_parallel_executor.py, including the new regressions for status-masking output pipes, arbitrary pipefail text, and set -o pipefail appearing after the pipeline.
Ran python3 -m py_compile src/ouroboros/orchestrator/parallel_executor.py tests/unit/orchestrator/test_parallel_executor.py: passed.
Could not run pytest because this container lacks pytest (/usr/bin/python3: No module named pytest).

Design Notes

The change stays at the runtime evidence normalization boundary, which is the right layer for deciding whether a stripped output pipe still preserves command status.

Design / Roadmap Gate

Affected boundary: Bash runtime evidence for commands_run and tests_passed. Current HEAD ties clean test-command aliases to pipefail being enabled before the specific stripped pipeline, rejects stale or late pipefail state, and preserves compatibility for non-test command output plumbing.

Directional Notes

Maintainer memory shaped the focus on anti-fabrication integrity and reliable runtime evidence boundaries. No blocker is based on memory alone.

Test Coverage

Reviewed changed tests in tests/unit/orchestrator/test_parallel_executor.py, including the new regressions for status-masking output pipes, arbitrary pipefail text, and set -o pipefail appearing after the pipeline.
Ran python3 -m py_compile src/ouroboros/orchestrator/parallel_executor.py tests/unit/orchestrator/test_parallel_executor.py: passed.
Could not run pytest because this container lacks pytest (/usr/bin/python3: No module named pytest).

Merge Recommendation

Approve. I found no current blocking evidence-boundary regression in the changed source; test execution remains limited only by missing local pytest.

Review-Metadata:
verdict: APPROVE
head_sha: 2288efa
request_id: req_1779710548_26
review_profile: memory-aware-zero-trust-v2
advisory_memory_only: true

Reviewed by ouroboros-agent[bot] via Codex deep analysis

fix: reject status-masking test pipelines

b3034d7

Q00 mentioned this pull request May 24, 2026

Meta SSOT: AgentOS roadmap sequencing (#920–#960) #961

Open

ouroboros-agent Bot requested changes May 25, 2026

View reviewed changes

Q00 added 2 commits May 25, 2026 20:49

fix(orchestrator): require pipefail before test pipeline

f601310

style(orchestrator): format pipefail helper

2288efa

ouroboros-agent Bot approved these changes May 25, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: reject status-masking test pipelines#1208

fix: reject status-masking test pipelines#1208
Q00 wants to merge 4 commits into
mainfrom
fix/command-proof-pipeline-status

Q00 commented May 24, 2026

Uh oh!

ouroboros-agent Bot left a comment

Uh oh!

ouroboros-agent Bot left a comment

Uh oh!

ouroboros-agent Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Q00 commented May 24, 2026

Summary

Original PR blocker coverage

Verification

Uh oh!

ouroboros-agent Bot left a comment

Choose a reason for hiding this comment

Review — ouroboros-agent[bot]

Metadata

What Improved

Issue Requirements

Prior Findings Status

Blockers

Follow-up Findings

Non-blocking Suggestions

Test Coverage Notes

Design Notes

Design / Roadmap Gate

Directional Notes

Test Coverage

Merge Recommendation

Uh oh!

ouroboros-agent Bot left a comment

Choose a reason for hiding this comment

Review — ouroboros-agent[bot]

Metadata

What Improved

Issue Requirements

Prior Findings Status

Blockers

Follow-up Findings

Non-blocking Suggestions

Test Coverage Notes

Design Notes

Design / Roadmap Gate

Directional Notes

Test Coverage

Merge Recommendation

Uh oh!

ouroboros-agent Bot left a comment

Choose a reason for hiding this comment

Review — ouroboros-agent[bot]

Metadata

What Improved

Issue Requirements

Prior Findings Status

Blockers

Follow-up Findings

Non-blocking Suggestions

Test Coverage Notes

Design Notes

Design / Roadmap Gate

Directional Notes

Test Coverage

Merge Recommendation

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant