
refactor(#2429): instruct orchestrator to trust subagent results #2431

Open
fullsend-ai-coder[bot] wants to merge 1 commit into main from agent/2429-trust-subagent-results
@fullsend-ai-coder

Contributor

The review orchestrator was re-executing ~20 external commands (npm view, gh api for tags/releases/commits) during synthesis that subagents had already run, roughly doubling tool calls on dependency-update PRs. The root cause: the meta-prompt had no explicit instruction to treat subagent investigation outputs as authoritative evidence.

Add a "Trust subagent investigation results" section at the top of step 6 (Synthesis) with three rules: consume subagent evidence as-is, re-investigate only when subagent findings conflict, and do not re-read files subagents already examined. Also add a matching constraint to the Constraints section for reinforcement.
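The change itself is prompt text, not code, but the "re-investigate only on conflict" rule can be sketched as a small decision function. This is a minimal illustration only: the `Finding` type, the subagent names, and the string-equality test for a "conflict" are all assumptions made for the example and do not come from SKILL.md.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One subagent conclusion (hypothetical shape, for illustration only)."""
    subagent: str  # hypothetical subagent name
    artifact: str  # what was investigated, e.g. a package version
    claim: str     # the subagent's conclusion about that artifact


def artifacts_to_reinvestigate(findings: list[Finding]) -> set[str]:
    """Return artifacts on which subagents reported differing claims.

    Anything not returned here is consumed as-is, with no re-run.
    Comparing claim strings is a deliberate simplification.
    """
    claims: dict[str, set[str]] = {}
    for finding in findings:
        claims.setdefault(finding.artifact, set()).add(finding.claim.strip())
    return {artifact for artifact, seen in claims.items() if len(seen) > 1}


findings = [
    Finding("deps", "left-pad@1.3.0", "tag exists upstream"),
    Finding("security", "left-pad@1.3.0", "tag exists upstream"),
    Finding("deps", "lodash@4.17.21", "release notes mention a breaking change"),
    Finding("correctness", "lodash@4.17.21", "no breaking changes"),
]
print(sorted(artifacts_to_reinvestigate(findings)))  # prints ['lodash@4.17.21']
```

Only the artifact with contradictory claims is flagged; the agreed-upon one is left alone, which is the behavior the new section asks the orchestrator to adopt.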

Note: `make lint` could not run in the sandbox because of the shellcheck-py network restriction (an infrastructure issue, not a code issue).


Closes #2429

Post-script verification

  • Branch is not main/master (agent/2429-trust-subagent-results)
  • Secret scan passed (gitleaks — a4a5008a7ef4a358574a29d0936e419cef8272be..HEAD)
  • Pre-commit hooks passed (authoritative run on runner)
  • Tests ran inside sandbox

@github-actions

E2E tests did not run

E2E tests run automatically for org/repo members and collaborators on pull requests.

For other contributors, a maintainer must add the ok-to-test label after the latest push.

See E2E testing guide for details.

@github-actions

Site preview

Preview: https://5f9e4e5f-site.fullsend-ai.workers.dev

Commit: c40fab636ad7766b02023a454b2a8c51569f5c1c

@fullsend-ai-review

fullsend-ai-review Bot commented Jun 18, 2026

🤖 Finished Review · ✅ Success · Started 4:10 PM UTC · Completed 4:21 PM UTC
Commit: c40fab6 · View workflow run →

@codecov

codecov Bot commented Jun 18, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


@fullsend-ai-review

Review

Findings

Low

  • [edge-case] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:435 — Rule 2 ("Re-investigate only on conflict") does not explicitly account for subagent commands that returned errors or empty results. If a subagent's external command failed transiently, the orchestrator would be instructed to trust the failed output as authoritative. Step 5 handles complete subagent failures, but the narrower scenario of a subagent completing successfully while one of its internal commands errored is not covered.
    Remediation: Consider adding a clause: "or when a subagent command returned an error or empty result indicating a transient failure."

  • [architectural-conflict] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:422 — The added guidance improves LLM-orchestrator efficiency through prompting. ADR-0018 rejected LLM-based orchestration for deterministic workflows, but SKILL.md lines 13–19 already explicitly acknowledge this departure and flag that a superseding ADR is needed. This PR does not widen the architectural conflict — it optimizes behavior within the already-acknowledged non-conforming implementation.

Info

  • [code-organization] internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md:851 — The new constraint bullet at the end of the Constraints section restates the "Trust subagent investigation results" section from step 6. This duplication is consistent with the document's existing pattern — several constraint bullets restate rules from earlier sections to serve as a quick-reference checklist.

Labels: PR modifies review agent skill definitions under the scaffold harness.

adds latency without producing new information.
2. **Re-investigate only on conflict.** The only justification for
re-executing a subagent's command is when two subagents return
contradictory findings about the same artifact and the orchestrator

[low] edge-case

Rule 2 ('Re-investigate only on conflict') does not explicitly account for subagent commands that returned errors or empty results. If a subagent's external command failed transiently, the orchestrator would be instructed to trust the failed output as authoritative. Step 5 handles complete subagent failures, but the narrower scenario of a subagent completing successfully while one of its internal commands errored is not covered.

Suggested fix: Consider adding a clause: 'or when a subagent command returned an error or empty result indicating a transient failure.'
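As a rough sketch of what the suggested clause would mean in practice, the check below treats a command result as trustworthy only if it exited successfully and produced non-empty output. The `CommandResult` type and the `example/example` repository are hypothetical, and nothing here reflects how the orchestrator actually records subagent output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommandResult:
    """Outcome of one subagent command (hypothetical shape)."""
    command: str    # command line the subagent ran
    exit_code: int  # 0 means the command succeeded
    output: str     # captured stdout


def is_trustworthy(result: CommandResult) -> bool:
    """Trust a result only if it succeeded and returned something."""
    return result.exit_code == 0 and bool(result.output.strip())


results = [
    CommandResult("npm view left-pad version", 0, "1.3.0\n"),
    CommandResult("gh api repos/example/example/tags", 1, ""),
    CommandResult("gh api repos/example/example/releases", 0, "   "),
]

# Commands whose results should not be treated as authoritative evidence.
rerun = [r.command for r in results if not is_trustworthy(r)]
print(rerun)
```

Here the failed `tags` call and the whitespace-only `releases` call are both flagged for re-execution. An empty result can also be legitimate (a repository with no releases, say), so a real rule would likely need a way to tell "empty because nothing exists" from "empty because the call failed".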

dimensions, so only the orchestrator can detect overlaps and
cross-references.

**Trust subagent investigation results.** Sub-agents perform thorough

[low] architectural-conflict

The added guidance improves LLM-orchestrator efficiency through prompting. ADR-0018 rejected LLM-based orchestration for deterministic workflows, but SKILL.md lines 13-19 already explicitly acknowledge this departure and flag that a superseding ADR is needed. This PR does not widen the architectural conflict — it optimizes behavior within the already-acknowledged non-conforming implementation.

@@ -828,3 +849,9 @@ wins.
- **In pipeline mode, `gh pr review` is reserved for the post-script.**
The sandbox token is read-only. Write JSON to
`$FULLSEND_OUTPUT_DIR/agent-result.json` and exit.

[info] code-organization

The new constraint bullet at the end of the Constraints section restates the 'Trust subagent investigation results' section from step 6. This duplication is consistent with the document's existing pattern — several constraint bullets restate rules from earlier sections to serve as a quick-reference checklist.

@fullsend-ai-review fullsend-ai-review Bot added labels on Jun 18, 2026:

  • ready-for-merge: All reviewers approved — ready to merge
  • component/harness: Agent harness, config, and skills loading

Development

Successfully merging this pull request may close these issues.

Review orchestrator should trust subagent investigation results instead of re-executing commands
