Code/fix agent should detect missing validation tools and escalate instead of guessing

## What happened

PR [#145](https://github.com/konflux-ci/konflux-build-cli/pull/145) addressed issue [#144](https://github.com/konflux-ci/konflux-build-cli/issues/144) (integration tests fail with docker engine). The code agent produced an initial fix on Jun 15. When humans tested with docker and reported failures, the fix agent ran 4 more iterations (workflow runs [27682539146](https://github.com/konflux-ci/.fullsend/actions/runs/27682539146), [27689678762](https://github.com/konflux-ci/.fullsend/actions/runs/27689678762), [27692981710](https://github.com/konflux-ci/.fullsend/actions/runs/27692981710), [27696404532](https://github.com/konflux-ci/.fullsend/actions/runs/27696404532)). Each iteration tried a different strategy (privileged mode, security opts, vfs storage driver) — none worked because the agent could never run the tests with docker. In fix iteration 2, the agent explicitly acknowledged 'Docker is not available in the sandbox environment' but continued producing unvalidated code. The PR was closed without merge after a human reviewer noted the agent was 'throwing shit at the wall to see what sticks.'

## What could go better

The agent should have recognized at the outset that the task required docker to validate, detected that docker was absent from the sandbox, and escalated with a clear message rather than producing speculative fixes. This is distinct from existing issue #2117, which covers unresolvable *test failures*. Here the tests passed in the sandbox with podman; the problem was that docker, the tool required to validate the fix, was entirely absent. The agent spent roughly 4 fix cycles and 5 review cycles of token budget on code that was never mergeable. Confidence: HIGH. The agent itself acknowledged the limitation in iteration 2 but was not instructed to stop.

## Proposed change

Add a pre-flight check to the code and fix agent definitions (`agents/code.md`, `agents/fix.md`) that instructs the agent to: (1) identify what tools or environments are needed to validate the fix based on the issue description, (2) check whether those tools are available in the sandbox, and (3) if a critical validation tool is missing, post a comment explaining what's needed and why the agent cannot proceed, rather than producing unvalidated code. For example, if an issue says 'fails with docker' and `docker` is not available, the agent should say so and suggest the human either provide docker access or solve this manually.

This could be implemented as a prompt section like: 'Before writing code, verify you have the tools needed to test your changes against the reported failure. If the issue requires a specific runtime (e.g., docker, a specific OS, a cloud service) that is not available in your environment, stop and post a comment explaining the limitation instead of producing untestable code.'
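To make the three pre-flight steps concrete, here is a minimal Python sketch of the same logic as a script rather than a prompt. It is illustrative only: the `KNOWN_TOOLS` list, the function names, and the comment wording are assumptions, not part of the existing agent definitions, and matching tool names by keyword is a deliberately naive stand-in for the agent reading the issue.

```python
import re
import shutil
from typing import Callable, Iterable, Optional

# Hypothetical list of runtimes the pre-flight check knows how to look for.
KNOWN_TOOLS = ("docker", "podman", "kubectl")


def required_tools(issue_text: str, known: Iterable[str] = KNOWN_TOOLS) -> list[str]:
    """Step 1: return the known tools mentioned as whole words in the issue text."""
    text = issue_text.lower()
    return [t for t in known if re.search(rf"\b{re.escape(t)}\b", text)]


def preflight(
    issue_text: str,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Steps 2 and 3: return an escalation comment if a required tool is
    missing from PATH, or None if the agent can proceed."""
    missing = [t for t in required_tools(issue_text) if which(t) is None]
    if not missing:
        return None
    return (
        "Cannot validate a fix: required tool(s) not available in the sandbox: "
        + ", ".join(missing)
        + ". Please provide access or resolve this issue manually."
    )
```

The `which` parameter defaults to `shutil.which` and can be swapped for a stub, so the check can be exercised without the tools actually being installed. A caller would post the returned comment and stop when the result is not `None`.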

## Validation criteria

On the next 3 issues where the required validation tool is absent from the sandbox, the code/fix agent should: (1) post a comment identifying the missing tool within the first response, (2) not produce speculative fixes, and (3) not trigger more than 1 fix iteration before escalating. Measure by checking whether closed-without-merge PRs due to sandbox limitations drop to zero over the next 30 days.
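The three per-issue criteria could be checked mechanically along these lines. This is a sketch under assumptions: the `AgentRun` record and its field names are hypothetical, and how those values would be collected from workflow runs is not specified here.

```python
from dataclasses import dataclass


@dataclass
class AgentRun:
    """Hypothetical summary of agent activity on one issue."""
    first_response_names_missing_tool: bool  # criterion (1)
    speculative_fix_commits: int             # criterion (2)
    fix_iterations_before_escalation: int    # criterion (3)


def meets_criteria(run: AgentRun) -> bool:
    """Return True only if all three validation criteria hold for this issue."""
    return (
        run.first_response_names_missing_tool
        and run.speculative_fix_commits == 0
        and run.fix_iterations_before_escalation <= 1
    )
```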

---
_Generated by retro agent from https://github.com/konflux-ci/konflux-build-cli/pull/145_

Code/fix agent should detect missing validation tools and escalate instead of guessing #2487
