fix(#1835): require file reads before asserting contents in findings by fullsend-ai-coder[bot] · Pull Request #2443 · fullsend-ai/fullsend

fullsend-ai-coder · 2026-06-18T19:52:20Z

The review agent hallucinated file contents on PR konflux-ci/konflux-test#833, claiming a Dockerfile contained
--nogpgcheck when it never did. The root cause was that the
code-review skill and correctness sub-agent had no explicit requirement to read files outside the PR diff before asserting what they contain.

Changes:

code-review SKILL.md step 2: added cross-file verification
bullet requiring the agent to read any file it references
in a finding, even if not in the diff.
code-review SKILL.md step 4: added cross-file finding
self-check requiring verification that referenced files
were read before finalizing findings.
correctness sub-agent: added cross-file verification
section with the same read-before-assert requirement.

Note: make lint could not run due to sandbox network restrictions preventing shellcheck installation.

Closes #1835

Post-script verification

Branch is not main/master (agent/1835-verify-file-contents-before-asserting)
Secret scan passed (gitleaks — 59159d05e2cb01f868054a4c7143303d6049214b..HEAD)
Pre-commit hooks passed (authoritative run on runner)
Tests ran inside sandbox

The review agent hallucinated file contents on PR konflux-ci/konflux-test#833, claiming a Dockerfile contained --nogpgcheck when it never did. The root cause was that the code-review skill and correctness sub-agent had no explicit requirement to read files outside the PR diff before asserting what they contain. Changes: - code-review SKILL.md step 2: added cross-file verification bullet requiring the agent to read any file it references in a finding, even if not in the diff. - code-review SKILL.md step 4: added cross-file finding self-check requiring verification that referenced files were read before finalizing findings. - correctness sub-agent: added cross-file verification section with the same read-before-assert requirement. Note: make lint could not run due to sandbox network restrictions preventing shellcheck installation. Closes #1835

github-actions · 2026-06-18T19:52:34Z

E2E tests did not run

E2E tests run automatically for org/repo members and collaborators on pull requests.

For other contributors, a maintainer must add the ok-to-test label after the latest push.

See E2E testing guide for details.

github-actions · 2026-06-18T19:53:53Z

Site preview

Preview: https://01f829bf-site.fullsend-ai.workers.dev

Commit: 0f5066f722c1862e4ccc8e160b10f8125fc66a91

fullsend-ai-review · 2026-06-18T19:56:01Z

🤖 Finished Review · ✅ Success · Started 7:56 PM UTC · Completed 8:05 PM UTC
Commit: 0f5066f · View workflow run →

codecov · 2026-06-18T20:02:02Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

fullsend-ai-review · 2026-06-18T20:05:44Z

Review

Findings

Medium

[scope-completeness] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/security.md — The security sub-agent makes factual assertions about files outside the PR diff (auth middleware, RBAC config, IAM policies, workflow files) but lacks the same cross-file verification checkpoint added to correctness. The security agent's exploration budget instructs it to "Read the full file for every changed auth/authorization module" and "Read related config files," but there is no self-check step before recording findings. The root cause from Review agent should verify file contents before asserting facts about them #1835 could manifest in the security dimension. See also: [consumer-completeness] finding on correctness.md.
Remediation: Add a cross-file verification section to security.md similar to the one added in correctness.md. Place it after the exploration budget section and before the fail-open evaluation section.

Low

[consumer-completeness] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/correctness.md:76 — The cross-file verification instruction is added to the correctness sub-agent but not to the security or docs-currency sub-agents. The SKILL.md additions do not reach sub-agents dispatched by the pr-review orchestrator because the orchestrator's spawn prompt composes from sub-agent definition + meta-prompt + context package only — code-review/SKILL.md is not included. Security is the higher-risk gap given its exploration budget explicitly instructs reading external files. See also: [scope-completeness] finding on security.md.
[scope-completeness] internal/scaffold/fullsend-repo/skills/pr-review/sub-agents/docs-currency.md — The docs-currency sub-agent searches documentation files for stale references and could theoretically make assertions about documentation contents without reading files first. The risk is lower than security/correctness because the sub-agent's core task is grep-based searching which inherently requires reading. No immediate action required; monitor for hallucinations in this dimension.

Labels: PR modifies review agent skill and sub-agent definitions under the scaffold harness.

ben-alkov · 2026-06-18T20:24:58Z

/fs-fix

github-actions Bot deployed to site-preview June 18, 2026 19:53 View deployment

fullsend-ai-review Bot added requires-manual-review Review requires human judgment component/harness Agent harness, config, and skills loading labels Jun 18, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(#1835): require file reads before asserting contents in findings#2443

fix(#1835): require file reads before asserting contents in findings#2443
fullsend-ai-coder[bot] wants to merge 1 commit into
mainfrom
agent/1835-verify-file-contents-before-asserting

fullsend-ai-coder Bot commented Jun 18, 2026

Uh oh!

github-actions Bot commented Jun 18, 2026

Uh oh!

github-actions Bot commented Jun 18, 2026

Uh oh!

fullsend-ai-review Bot commented Jun 18, 2026 •

edited

Loading

Uh oh!

codecov Bot commented Jun 18, 2026

Uh oh!

fullsend-ai-review Bot commented Jun 18, 2026

Uh oh!

ben-alkov commented Jun 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

fullsend-ai-coder Bot commented Jun 18, 2026

Post-script verification

Uh oh!

github-actions Bot commented Jun 18, 2026

E2E tests did not run

Uh oh!

github-actions Bot commented Jun 18, 2026

Site preview

Uh oh!

fullsend-ai-review Bot commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov Bot commented Jun 18, 2026

Codecov Report

Uh oh!

fullsend-ai-review Bot commented Jun 18, 2026

Review

Findings

Medium

Low

Uh oh!

ben-alkov commented Jun 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

fullsend-ai-review Bot commented Jun 18, 2026 •

edited

Loading