feat(foreman): bounded fix iteration on reviewer NO-GO instead of terminal failure by joryirving · Pull Request #959 · defilantech/LLMKube

joryirving · 2026-07-03T02:57:06Z

What

On a reviewer NO-GO the WorkloadReconciler now appends a bounded fix iteration instead of terminally failing the Workload: a new coder step code-<n>-r<k> (same coder AgentRef, same branch, payload.allowOverwrite: true) whose payload.prompt carries the reviewer's summary + structured findings, chained to fresh verify-<n>-r<k> + review-<n>-<i>-r<k> steps. Bounded by new Workload.spec.maxReviewIterations (*int32; unset defaults to 1, explicit 0 restores fail-on-first-NO-GO). status.reviewIterations records the emitted count and a ReviewIterationTriggered condition names the re-dispatched issues.

Why

The review result carries exactly what a fix needs (findings + summary + branch), but a NO-GO discarded it; external retry systems could only delete + recreate the Workload and re-run the coder blind, reproducing the same rejected patch.

Fixes #946

How

internal/foreman/controller/workload_iteration.go: emission modeled on emitEscalations — pure step-synthesis function (reviewIterationSteps), idempotency by step-name existence (partial creates repaired next reconcile), MaxTasks accounting (MaxTasksIterationCap), sovereignty gates on iteration reviewers, steady-state condition short-circuit, labeled in-flight placeholders so rollup/escalation see the new round in the same pass.
Feedback prompt distills the review task's status.result.extra.modelExtra via pkg/foreman/agent/reviewer.ParseFindings, with a raw-JSON fallback for non-conforming (legacy map-shaped) findings.
Rollup and the escalation trigger now judge each issue by its latest iteration only (activeChildren): a converged retry completes the Workload, a superseded round's NO-GO no longer pins it at Failed, and the escalation tier defers until iterations settle. Escalation semantics are unchanged for workloads that opt out (existing [FEATURE] Foreman v0.2 / v0.8.0: close the reviewer-step empirical gap #546 tests now set maxReviewIterations: 0 since the default changed to 1).
patchReviewAdvisories wires each iteration's reviews to their own round's coder task (escalate reads the latest) instead of always code-<n>.
CRD manifests + chart CRDs regenerated.

Notable default change: unset maxReviewIterations means one fix iteration, so a NO-GO now fails the Workload one round later than before.

Verification: KUBEBUILDER_ASSETS=... go test ./internal/foreman/controller/ -count=1 (envtest specs for NO-GO-with-budget → r1 steps with feedback prompt + allowOverwrite, exhausted budget → Failed/ChildrenIncomplete, GO → no iteration, iteration count in status, idempotency; plus table tests for step synthesis, prompt rendering, and superseded-round filtering); golangci-lint run ./internal/foreman/... ./api/... 0 issues.

AI assistance: authored by Claude (Opus 4.8) operating under the maintainer's direction; reviewed before submitting.

Checklist

Tests added/updated
make test passes locally
make lint passes locally
Commit messages follow conventional commits
All commits are signed off (git commit -s) per DCO
AI assistance (if any) is disclosed above, per CONTRIBUTING.md
Documentation updated (if user-facing change)

…minal failure A reviewer NO-GO used to terminally fail the Workload, discarding the review's structured findings; external retry systems could only delete + recreate the Workload and re-run the coder blind (4 production issues burned 3 attempts each reproducing the same rejected patch). The WorkloadReconciler now appends a bounded fix iteration on NO-GO: - new coder step code-<n>-r<k> (same coderAgentRef, same branch, payload.allowOverwrite=true so the coder can replace its own prior attempt via force-with-lease) whose payload.prompt distills the NO-GO reviewers' summary + structured findings from status.result.extra.modelExtra - fresh verify-<n>-r<k> + review-<n>-<i>-r<k> steps chained behind it - bounded by the new Workload.spec.maxReviewIterations (*int32; nil defaults to 1, explicit 0 restores fail-on-first-NO-GO); exhausting the budget fails the Workload as before - status.reviewIterations records the emitted iteration count and a ReviewIterationTriggered condition names the re-dispatched issues Emission mirrors emitEscalations: pure step-synthesis function, idempotency by step-name existence (partial creates repaired on the next reconcile), MaxTasks accounting (MaxTasksIterationCap), and the sovereignty gates on iteration reviewers. Rollup and the escalation trigger now judge each issue by its latest iteration only (activeChildren), so a converged retry completes the Workload while a superseded round's NO-GO no longer pins it at Failed, and the escalation tier defers until iterations settle. patchReviewAdvisories wires each iteration's review to its own round's coder task (and the escalation tier to the latest one) instead of always reading code-<n>. Fixes defilantech#946 Signed-off-by: Jory Irving <jory.irving@stackadapt.com>

codecov · 2026-07-03T03:04:30Z

Codecov Report

❌ Patch coverage is 82.56579% with 53 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
internal/foreman/controller/workload_iteration.go	83.92%	31 Missing and 14 partials ⚠️
api/foreman/v1alpha1/zz_generated.deepcopy.go	0.00%	4 Missing ⚠️
internal/foreman/controller/workload_controller.go	50.00%	1 Missing and 1 partial ⚠️
...l/foreman/controller/workload_review_advisories.go	87.50%	1 Missing and 1 partial ⚠️

📢 Thoughts on this report? Let us know!

joryirving · 2026-07-03T03:05:52Z

Interaction note for merge ordering: this PR and #956 (open-PR-on-review-GO) touch adjacent payload/synthesis code, and this branch intentionally does not stamp openPullRequest onto iteration review steps because #956 isn't in its base. Whichever merges second should get a one-line follow-up (stamp openPullRequest on the review-<n>-<i>-r<k> payloads, mirroring the base-round stamp) so an iterated-then-approved Workload still opens its PR. Happy to push that follow-up once both land.

Defilan · 2026-07-03T03:23:23Z

Review

Request changes. The controller state-machine work is excellent — but the feature isn't functional yet because the execution half (restoring the prior attempt) is missing, and as written the retry can destroy the base attempt's work.

Revision coder rebuilds from main and force-overwrites the prior attempt (blocking): workload_iteration.go:217,323 sets the iteration coder's Payload.Branch to the same base name with AllowOverwrite: true, and reviewFeedbackPrompt (:281) tells the model "your previous attempt is already pushed on this task's branch." But the executor never restores it: each task runs in a fresh workspace, resolveUpstream(payload.Repo) is non-empty, so repo.CreateBranchFromUpstream runs → git checkout -B <branch> FETCH_HEAD (pkg/foreman/agent/repo/branch.go:164) which resets the branch to a fresh fetch of main. The prior commits live only on origin/foreman/<w>/issue-<n> and are never fetched. AllowOverwrite only feeds push.ReplaceOnReject (force-with-lease). So on the common "you changed too much / revert X" NO-GO (exactly the envtest fixture), the retry starts from main, sees only the feedback, produces a feedback-only diff, and force-with-lease overwrites the base attempt — the branch ends with less than it started. This is the precise failure mode #951 was filed to prevent (its reviseFromBranch: executor fetches + checks out the prior ref so the files are simply present). The envtest can't see it — it asserts task-spec fields and marks the retry GO by hand, never running the executor.

Related: the retry also loses the original ask — fetchIssueBodyIfNeeded (executor_native.go:1791) early-returns when Payload.Prompt != "", so the iteration coder gets the review feedback but never the issue body/acceptance criteria. And it reuses the issue-fix CoderAgentRef (workload_iteration.go:320), i.e. the issue-fix forcing profile — #951's live validation (#478 r1/r2/r3) showed that profile's editFreeTurnsLimit: 4 collapses a revision task; only a revision-tuned profile converged.

Recommendation: land this as the loop skeleton, but it does not supersede #951 — keep #951 open, rescoped to the two load-bearing pieces (executor-owned reviseFromBranch restore + revision profile pairing), and make #946's closure note explicit that today the retry rebuilds rather than amends. Until the restore lands I'd gate the CRITICAL or drop the "previous attempt is on this branch" claim from the prompt.

Verified clean and genuinely well done: the bound truly bounds (round k derived from observed child step-names, no mutable counter, superseded rounds filtered not deleted so re-emission is permanently blocked — no unbounded/off-by-one path), feedback plumbing, state-machine ordering vs escalation/rollup, placeholder anti-double-dispatch, CRD/chart/deepcopy sync.

joryirving requested a review from Defilan as a code owner July 3, 2026 02:57

Defilan mentioned this pull request Jul 3, 2026

[FEATURE] Foreman: coder-tier escalation — re-dispatch a failed coder task to a larger model #963

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

feat(foreman): bounded fix iteration on reviewer NO-GO instead of terminal failure#959

feat(foreman): bounded fix iteration on reviewer NO-GO instead of terminal failure#959
joryirving wants to merge 1 commit into
defilantech:mainfrom
joryirving:feat/review-nogo-iteration

joryirving commented Jul 3, 2026

Uh oh!

codecov Bot commented Jul 3, 2026

Uh oh!

joryirving commented Jul 3, 2026

Uh oh!

Defilan commented Jul 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Uh oh!

Conversation

joryirving commented Jul 3, 2026

What

Why

How

Checklist

Uh oh!

codecov Bot commented Jul 3, 2026

Codecov Report

Uh oh!

joryirving commented Jul 3, 2026

Uh oh!

Defilan commented Jul 3, 2026

Review

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants