fix: project blocked plugin invocations as workflow failures#1215
Conversation
There was a problem hiding this comment.
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Metadata
| Field | Value |
|### Recovery Notes
First recoverable review artifact generated from codex analysis log.
---|---|
| PR | #1215 |
| HEAD checked | da7022664fd675197d7783687d50dddc159db0fe |
| Request ID | req_retry_exhausted_new_timeout_1779693693_1215 |
| Review record | 4875eb95-fd6d-4921-a238-3e0903726c3c |
What Improved
- Moves the plugin firewall to Workflow IR lifecycle projection out of a local test helper and into a production adapter.
- Preserves the blocked-plugin contract that permission denials produce failed lifecycle events instead of node completion.
Issue Requirements
| Requirement | Status |
|---|---|
| #1131 / #956 / #939: blocked plugin invocation must not project as successful node completion | Satisfied for the single-node blocked fixture |
| Production adapter should safely project plugin firewall outcomes into Workflow IR lifecycle history | Not satisfied for multi-action plugin specs because one successful node completes the whole run |
Prior Findings Status
No prior human or inline review concerns were present in the provided comment artifacts.
Blockers
| # | File:Line | Severity | Finding |
|---|---|---|---|
| 1 | src/ouroboros/orchestrator/workflow_ir_adapter.py:326 | BLOCKING | workflow_lifecycle_from_plugin_invocation_result() unconditionally emits RUN_COMPLETED for any successful plugin node. The same adapter can create plugin specs with multiple action nodes, each pointing at the shared terminal (workflow_spec_from_plugin_descriptor() loops over all descriptor actions at lines 222-290). For a multi-action plugin spec, completing one node would falsely terminalize the entire run while sibling plugin nodes remain incomplete, and downstream projections would treat the workflow as successful. The adapter should only emit a node-level completion unless the graph/run state proves the terminal condition is satisfied, or it should reject multi-node specs for this API. |
Follow-up Findings
| # | File:Line | Priority | Confidence | Suggestion |
|---|---|---|---|---|
| None. |
Non-blocking Suggestions
| None. | | | |
Test Coverage Notes
- Reviewed changed conformance test and surrounding Workflow IR/plugin adapter tests by direct file inspection.
- Could not run pytest:
/usr/bin/python3is present, butpytestand project runtime dependencypydanticare not installed in this snapshot environment. - Coverage gap: the new lifecycle adapter is tested only with a single plugin node; it lacks a multi-action/multi-node plugin spec case that would catch the false
RUN_COMPLETEDbehavior above.
Design Notes
The boundary direction is reasonable, but the adapter currently mixes per-node invocation projection with whole-run terminalization without checking graph completion.
Design / Roadmap Gate
Affected boundary: plugin firewall InvocationResult to Workflow IR lifecycle events. The changed adapter creates durable run terminal events, so incorrect terminalization affects replay/projection compatibility and can silently mark incomplete plugin workflows as complete.
Directional Notes
Maintainer memory emphasized strict boundary correctness and low-trust automation. I focused on whether the new public adapter preserves Workflow IR lifecycle semantics for graph shapes produced by the current source.
Test Coverage
- Reviewed changed conformance test and surrounding Workflow IR/plugin adapter tests by direct file inspection.
- Could not run pytest:
/usr/bin/python3is present, butpytestand project runtime dependencypydanticare not installed in this snapshot environment. - Coverage gap: the new lifecycle adapter is tested only with a single plugin node; it lacks a multi-action/multi-node plugin spec case that would catch the false
RUN_COMPLETEDbehavior above.
Merge Recommendation
Do not merge until the success projection is scoped to node completion only, validates that the completed node actually satisfies terminal conditions, or explicitly rejects multi-node specs. Add a regression test with a two-action plugin descriptor.
Review-Metadata:
verdict: REQUEST_CHANGES
head_sha: da70226
request_id: req_retry_exhausted_new_timeout_1779693693_1215
review_profile: memory-aware-zero-trust-v2
advisory_memory_only: true
Reviewed by ouroboros-agent[bot] via Codex deep analysis
Limit plugin invocation success projection to node completion when a WorkflowSpec contains multiple plugin task nodes. Single-node plugin specs still emit run completion, while multi-action specs no longer terminalize the whole run after one action succeeds.\n\nServices: shared\nAffected files:\n- src/ouroboros/orchestrator/workflow_ir_adapter.py\n- tests/unit/orchestrator/test_workflow_ir_plugin_adapter.py
There was a problem hiding this comment.
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Metadata
| Field | Value |
|### Recovery Notes
First recoverable review artifact generated from codex analysis log.
---|---|
| PR | #1215 |
| HEAD checked | 639f32f1e237a0cb4605839acf2c95ff6dc184f9 |
| Request ID | req_1779704190_11 |
| Review record | 2e1d66f7-e281-431b-a0aa-095c838e6d3a |
What Improved
- Moves blocked plugin firewall outcome projection from a conformance-local helper into the production Workflow IR adapter.
- Adds a regression test ensuring a single successful node in a multi-action plugin spec no longer marks the whole run completed.
Issue Requirements
| Requirement | Status |
|---|---|
No linked issue or PR-body requirement captured; /tmp/pr_body_1215.md contains only empty title/body placeholders. |
N/A |
Prior Findings Status
Prior blocker about a single successful node in a multi-action spec emitting RUN_COMPLETED is withdrawn: current HEAD only appends RUN_COMPLETED when there is exactly one plugin task, and tests/unit/orchestrator/test_workflow_ir_plugin_adapter.py:106 covers that. A related blocker remains because the new behavior still cannot compose multiple plugin invocation results into one valid lifecycle run.
Blockers
| # | File:Line | Severity | Finding |
|---|---|---|---|
| 1 | src/ouroboros/orchestrator/workflow_ir_adapter.py:327 | BLOCKING | workflow_lifecycle_from_plugin_invocation_result() still cannot safely project multi-action plugin runs. workflow_spec_from_plugin_descriptor() can create one plugin task per action, all in the same WorkflowSpec (lines 222-290), but each success projection always starts a fresh RUN_CREATED at lines 327-340. If a caller appends projections for two successful action nodes, the second RUN_CREATED occurs while the prior run is still active, which lifecycle conformance explicitly reports as run_created_before_terminal (workflow_lifecycle.py:894-906). If the caller avoids appending and uses only the last projection, it loses the sibling node completion and still never emits a terminal completion for the multi-action run. The adapter needs run-context input or a separate run-start API so multiple plugin invocation results compose into one legal lifecycle history. |
Follow-up Findings
| # | File:Line | Priority | Confidence | Suggestion |
|---|---|---|---|---|
| None. |
Non-blocking Suggestions
| None. | | | |
Test Coverage Notes
- Reviewed the unified diff, changed adapter, changed conformance test, changed unit test, firewall
InvocationResult, manifest descriptor projection, and Workflow lifecycle conformance rules by direct file inspection. - Could not run tests in this environment:
python3is 3.11.2 while the source uses newer syntax and project metadata requires>=3.12;pytestis also not installed. - Current tests cover single blocked projection and single success projection in a multi-action spec, but not composing two invocation projections for the same multi-action workflow.
Design Notes
The adapter boundary is directionally right, but run lifecycle ownership is still mixed into a per-node projection API without enough state to produce valid multi-node histories.
Design / Roadmap Gate
Affected boundary: plugin firewall InvocationResult to Workflow IR lifecycle history. This boundary creates durable run events; duplicate RUN_CREATED rows or missing terminal rows make replay/projection behavior invalid for graph shapes the same adapter creates from multi-action plugin descriptors.
Directional Notes
Maintainer memory emphasized strict runtime-boundary correctness and low-trust automation. I focused on whether plugin firewall outcomes can be replayed into Workflow IR lifecycle rows without corrupting run state.
Test Coverage
- Reviewed the unified diff, changed adapter, changed conformance test, changed unit test, firewall
InvocationResult, manifest descriptor projection, and Workflow lifecycle conformance rules by direct file inspection. - Could not run tests in this environment:
python3is 3.11.2 while the source uses newer syntax and project metadata requires>=3.12;pytestis also not installed. - Current tests cover single blocked projection and single success projection in a multi-action spec, but not composing two invocation projections for the same multi-action workflow.
Merge Recommendation
Do not merge until the lifecycle adapter can compose multiple plugin action results for one WorkflowSpec without duplicate active-run boundaries and with a defined terminalization rule. Add a regression test that projects two successful actions from the same multi-action plugin descriptor and validates lifecycle conformance.
Review-Metadata:
verdict: REQUEST_CHANGES
head_sha: 639f32f
request_id: req_1779704190_11
review_profile: memory-aware-zero-trust-v2
advisory_memory_only: true
Reviewed by ouroboros-agent[bot] via Codex deep analysis
…roduction-contract # Conflicts: # tests/unit/orchestrator/test_workflow_ir_plugin_adapter.py
There was a problem hiding this comment.
Review — ouroboros-agent[bot]
Verdict: APPROVE
Metadata
| Field | Value |
|### Recovery Notes
First recoverable review artifact generated from codex analysis log.
---|---|
| PR | #1215 |
| HEAD checked | 6dc4321f3ad9b961fbac10baa8501cc559b3b34a |
| Request ID | req_1779710664_27 |
| Review record | 8c5cb80e-060b-4e02-b04c-dd150ddb3910 |
What Improved
- Moves plugin firewall result to Workflow IR lifecycle projection into the production adapter.
- Preserves blocked plugin invocation semantics as
NODE_FAILED/RUN_FAILED. - Adds multi-action success composition coverage so separate plugin action results can form one conforming lifecycle run.
Issue Requirements
| Requirement | Status |
|---|---|
No linked issue requirement captured; /tmp/pr_body_1215.md contains only empty title/body placeholders. |
N/A |
| PR intent from changed tests: blocked plugin firewall results must never project as node/run completion. | Met |
| PR intent from prior review context: multi-action plugin success projections must compose into one legal lifecycle run. | Met |
Prior Findings Status
Prior blockers are withdrawn. The first blocker about single-node success terminalizing a multi-action run is addressed by only emitting RUN_COMPLETED when all plugin task ids are present after completion (src/ouroboros/orchestrator/workflow_ir_adapter.py:354-368) and by coverage at tests/unit/orchestrator/test_workflow_ir_plugin_adapter.py:145-175. The second blocker about duplicate RUN_CREATED when composing multi-action results is addressed by the include_run_created=False continuation path plus completed_plugin_node_ids, with conformance coverage at tests/unit/orchestrator/test_workflow_ir_plugin_adapter.py:178-222.
Blockers
No in-scope blocking findings remained after policy filtering.
Follow-up Findings
| # | File:Line | Priority | Confidence | Suggestion |
|---|---|---|---|---|
| None. |
Non-blocking Suggestions
| None. | | | |
Test Coverage Notes
- Reviewed changed adapter, conformance test, unit tests, firewall
InvocationResult, and Workflow lifecycle conformance behavior by direct file inspection. python3 -m py_compilesucceeded for the changed Python files.- Could not run focused pytest because this environment has no
pytestinstalled (/usr/bin/python3: No module named pytest).
Design Notes
The adapter boundary is now coherent for the currently produced plugin specs: per-node plugin results can be projected without falsely completing sibling actions, and callers can compose later action results into the same run.
Design / Roadmap Gate
Affected boundary: plugin firewall InvocationResult to Workflow IR lifecycle rows. Current HEAD validates node identity, rejects non-plugin nodes, preserves blocked outcomes as failed lifecycle rows, and provides explicit caller controls for run-start inclusion and already-completed plugin nodes so replay histories avoid duplicate active run boundaries.
Directional Notes
Maintainer memory and prior reviews focused this pass on runtime-boundary correctness between plugin firewall outcomes and Workflow IR lifecycle history. Blocker decisions were based only on the current source, tests, and diff.
Test Coverage
- Reviewed changed adapter, conformance test, unit tests, firewall
InvocationResult, and Workflow lifecycle conformance behavior by direct file inspection. python3 -m py_compilesucceeded for the changed Python files.- Could not run focused pytest because this environment has no
pytestinstalled (/usr/bin/python3: No module named pytest).
Merge Recommendation
Approve. I found no current blocking issues in the changed boundary; the prior lifecycle composition blocker appears addressed, with focused regression coverage present though not executable in this environment due missing pytest.
Review-Metadata:
verdict: APPROVE
head_sha: 6dc4321
request_id: req_1779710664_27
review_profile: memory-aware-zero-trust-v2
advisory_memory_only: true
Reviewed by ouroboros-agent[bot] via Codex deep analysis
Summary
Verification