refactor(#2429): instruct orchestrator to trust subagent results #2431
fullsend-ai-coder[bot] wants to merge 1 commit into
The review orchestrator was re-executing ~20 external commands (`npm view`, `gh api` for tags/releases/commits) during synthesis that subagents had already run, roughly doubling tool calls on dependency-update PRs. The root cause: the meta-prompt had no explicit instruction to treat subagent investigation outputs as authoritative evidence.

Add a "Trust subagent investigation results" section at the top of step 6 (Synthesis) with three rules: consume subagent evidence as-is, re-investigate only when subagent findings conflict, and do not re-read files subagents already examined. Also add a matching constraint to the Constraints section for reinforcement.

Note: `make lint` could not run in sandbox due to shellcheck-py network restriction (infrastructure issue, not code issue).

Closes #2429
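The orchestrator in this PR is an LLM steered by a meta-prompt, not a program, so the rule "consume subagent evidence as-is" cannot be shown in its real form. As a rough illustration only, the intended behavior resembles a lookup that reuses recorded command output and runs a command only when no subagent has run it yet. The `EvidenceCache` class and its `record`/`get` methods are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class EvidenceCache:
    """Hypothetical store of command outputs that subagents already collected."""

    outputs: dict[str, str] = field(default_factory=dict)
    reruns: list[str] = field(default_factory=list)  # commands that had to be executed

    def record(self, command: str, output: str) -> None:
        # Called once per command a subagent reports in its findings.
        self.outputs[command] = output

    def get(self, command: str, run: Callable[[str], str]) -> str:
        # Reuse recorded evidence as-is; execute only if nothing was recorded.
        if command in self.outputs:
            return self.outputs[command]
        self.reruns.append(command)
        output = run(command)
        self.outputs[command] = output
        return output
```

With this shape, the number of entries in `reruns` is the quantity the PR aims to reduce: commands executed during synthesis even though the evidence could have come from a subagent.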
E2E tests did not run
E2E tests run automatically for org/repo members and collaborators on pull requests. For other contributors, a maintainer must add the […]. See E2E testing guide for details.
Site preview
Preview: https://5f9e4e5f-site.fullsend-ai.workers.dev
Commit:
🤖 Finished Review · ✅ Success · Started 4:10 PM UTC · Completed 4:21 PM UTC
Codecov Report
✅ All modified and coverable lines are covered by tests.
Review
Findings
Low
Info
Labels: PR modifies review agent skill definitions under the scaffold harness.
adds latency without producing new information.
2. **Re-investigate only on conflict.** The only justification for
   re-executing a subagent's command is when two subagents return
   contradictory findings about the same artifact and the orchestrator
[low] edge-case
Rule 2 ('Re-investigate only on conflict') does not explicitly account for subagent commands that returned errors or empty results. If a subagent's external command failed transiently, the orchestrator would be instructed to trust the failed output as authoritative. Step 5 handles complete subagent failures, but the narrower scenario of a subagent completing successfully while one of its internal commands errored is not covered.
Suggested fix: Consider adding a clause: 'or when a subagent command returned an error or empty result indicating a transient failure.'
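To make the edge case concrete, here is a minimal Python sketch of the decision the prompt asks the orchestrator to make, extended with the clause suggested above. It is an illustration under assumptions, not the PR's implementation: `CommandResult`, `Action`, and `decide` are hypothetical names, and comparing raw output strings is a simplification of "contradictory findings".

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    TRUST = "trust subagent evidence"
    RERUN_CONFLICT = "rerun: subagents disagree"
    RERUN_FAILED = "rerun: command errored or returned nothing"


@dataclass(frozen=True)
class CommandResult:
    """Hypothetical record of one external command a subagent ran."""

    subagent: str
    artifact: str
    exit_code: int
    output: str


def decide(results: list[CommandResult]) -> Action:
    """Decide what to do with all results reported for ONE artifact."""
    # Discard results that errored or came back empty (possible transient failure).
    usable = [r for r in results if r.exit_code == 0 and r.output.strip()]
    if not usable:
        return Action.RERUN_FAILED
    # Rule 2: more than one distinct usable answer counts as a conflict.
    if len({r.output.strip() for r in usable}) > 1:
        return Action.RERUN_CONFLICT
    return Action.TRUST
```

One design choice in this sketch: if one subagent's command failed but another subagent returned a usable answer for the same artifact, the usable answer is trusted and nothing is re-run. Whether the prompt should allow that is a judgment call the suggested clause leaves open.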
dimensions, so only the orchestrator can detect overlaps and
cross-references.

**Trust subagent investigation results.** Sub-agents perform thorough
[low] architectural-conflict
The added guidance improves LLM-orchestrator efficiency through prompting. ADR-0018 rejected LLM-based orchestration for deterministic workflows, but SKILL.md lines 13-19 already explicitly acknowledge this departure and flag that a superseding ADR is needed. This PR does not widen the architectural conflict — it optimizes behavior within the already-acknowledged non-conforming implementation.
@@ -828,3 +849,9 @@ wins.
- **In pipeline mode, `gh pr review` is reserved for the post-script.**
  The sandbox token is read-only. Write JSON to
  `$FULLSEND_OUTPUT_DIR/agent-result.json` and exit.
[info] code-organization
The new constraint bullet at the end of the Constraints section restates the 'Trust subagent investigation results' section from step 6. This duplication is consistent with the document's existing pattern — several constraint bullets restate rules from earlier sections to serve as a quick-reference checklist.
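For readers unfamiliar with the pipeline-mode constraint quoted in the diff context above, the following sketch shows roughly what "write JSON to `$FULLSEND_OUTPUT_DIR/agent-result.json` and exit" could look like if done from Python. Only the environment variable and file name come from the quoted text. The `write_agent_result` helper and the payload fields are assumptions made for illustration, since the real schema does not appear in this conversation.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def write_agent_result(result: dict, env: Optional[Mapping[str, str]] = None) -> Path:
    """Write the review result where the post-script is expected to read it.

    Hypothetical helper: the payload shape is NOT taken from the PR.
    """
    env = os.environ if env is None else env
    out_dir = Path(env["FULLSEND_OUTPUT_DIR"])  # raises KeyError if unset
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / "agent-result.json"
    path.write_text(json.dumps(result, indent=2) + "\n", encoding="utf-8")
    return path
```

The agent only writes a file here and never calls `gh pr review`, which matches the stated reason for the constraint: the sandbox token is read-only, so posting the review is left to the post-script.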
Post-script verification
(agent/2429-trust-subagent-results)
(a4a5008a7ef4a358574a29d0936e419cef8272be..HEAD)