Skip to content

Review agent should not emit findings that contradict each other within the same review #2484

@fullsend-ai-retro

Description

@fullsend-ai-retro

What happened

On PR #543, the review agent (run 27891376450) produced two findings that directly contradicted each other:

  1. High [protected-path]: "This file is under the .github/ protected path. The PR has no linked issue providing explicit authorization for modifying governance/infrastructure files."
  2. Info [implicit-authorization]: "Authorization inferred from renovate.json configuration (lines 32-39) which explicitly enables automated GitHub Actions updates for .github/** files on a weekly schedule."

The Info finding explicitly refutes the premise of the High finding (that no authorization exists). Despite recognizing the authorization, the agent still submitted CHANGES_REQUESTED, forcing two humans to manually approve a trivial Renovate digest-pin update.

What could go better

The review agent's finding generation appears to run multiple independent checks without a reconciliation pass. Each check evaluates a concern in isolation — one checks for protected-path authorization, another checks for implicit bot authorization — but their outputs are never compared for logical consistency before the final verdict.

This is distinct from existing issues like #2057 (verdict choice for bot PRs) and #1392 (suppressing protected-path findings). Those address severity calibration and verdict selection. This issue is about internal consistency: the agent should not assert "no authorization exists" (High) when it simultaneously asserts "authorization is inferred from config" (Info) in the same review.

Confidence: High — the contradiction is plainly visible in the review output. The root cause (no cross-finding reconciliation) is a reasonable hypothesis but may be wrong if the agent uses a single-pass architecture; in that case the issue may be prompt-level rather than architectural.

Proposed change

Add a finding reconciliation step to the review agent's output pipeline. After all sub-checks produce their findings, a reconciliation pass should:

  1. Detect pairs of findings where one finding's evidence directly negates another finding's premise.
  2. When a conflict is detected, either merge them into a single finding at the lower severity, or suppress the higher-severity finding and note the resolution.
  3. Specifically: when an implicit-authorization Info finding cites a config file that authorizes the exact change pattern flagged by a protected-path High finding, the High finding should be suppressed or downgraded.

This could be implemented as a post-processing step in the review skill or agent definition (likely in skills/pr-review/SKILL.md or the review agent's output formatting logic).

Validation criteria

On the next 5 bot-authored dependency-update PRs that touch protected paths (e.g., .github/) and have discoverable authorization config (renovate.json, dependabot.yml), the review agent should not produce contradictory findings. Specifically: if an Info finding acknowledges authorization, no High finding in the same review should assert that authorization is missing for the same file or change.


Generated by retro agent from konflux-ci/konflux-test-tasks#543

Metadata

Metadata

Assignees

No one assigned

    Labels

    agent/reviewReview agentcomponent/harnessAgent harness, config, and skills loadingpriority/mediumNormal priority, plan for next cycleready-to-codeTriaged and ready for the code agent

    Type

    No type
    No fields configured for issues without a type.

    Projects

    Status
    In progress

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions