
fix(#1835): require file reads before asserting contents in findings #54

Closed
guyoron1 wants to merge 186 commits into main from mirror/2443-1835-verify-file-contents-before-asserting

Conversation

@guyoron1

Owner

Mirror of upstream fullsend-ai#2443

The review agent hallucinated file contents, claiming a Dockerfile contained --nogpgcheck when it never did. This PR adds an explicit requirement to read files outside the PR diff before asserting what they contain.

Benkapner and others added 30 commits June 7, 2026 09:54
Add a new subsection under "CI pipeline for agent configurations"
elaborating on Step 1 (static analysis). Covers component-level
checks (structural integrity, security patterns, token budget),
setup-level analysis (redundancy detection, dependency validation,
token budget distribution, trigger overlap, dimension scoring),
and optional LLM-based rubric scoring. Presents similarity
techniques as options (TF-IDF, embeddings, LLM-based) rather
than prescribing a single approach. Adds three open questions
on thresholds, lint rule universality, and token budgets.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Benjamin Kapner <bkapner@redhat.com>
Introduce --vendor to install vendored binaries, reusable workflows,
actions, and agent content. Vendored upstream mirror content is committed
under .defaults/ (same layout as runtime sparse checkout); layered installs
fetch fullsend-ai/fullsend@v0 into .defaults when the marker file is absent.

Reusable workflows use inline workspace preparation and reference infra
from ./.defaults/, matching the pre-vendor layered design. Thin callers
render local reusable paths when --vendor is set.

--fullsend-source pins the source tree for both content and binary
cross-compile; --fullsend-binary remains an explicit ELF override.

Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Write vendor-manifest.yaml on --vendor installs so cleanup and analyze work
without a local fullsend checkout. Workflows analyze stays embed-only;
vendor layer reports presence, manifest alignment, and optional source
alignment via admin analyze --fullsend-source.

Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Consolidate thin-stage caller registry, reuse resolved source root for
binary vendoring, reject oversized tar members during extraction, restore
workflows scope comment, fix testing-workflows prose, and introduce
InstallFiles as the canonical collector return type.

Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Re-add the full download_test.go suite and append extractSourceTree size
limit coverage.

Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Delete vendored paths atomically via forge.DeleteFiles, reuse resolved
source root for cross-compile, preserve extracted file modes, and tighten
WouldFix deduplication to exact path matches.

Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Document intentional breaking change: old flag callers should use --vendor;
only known usage was e2e, already updated in this branch.

Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Document VendorBinaryLayer legacy naming, restore Uninstall/Analyze
comments, and use Title Case for stale-cleanup progress messages.

Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Batch binary, content, and manifest in one CommitFiles call; validate
manifest version on read; trim leading slash in extractSourceTree; wrap
DeleteFiles ref PATCH in retryOnTransient.

Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Use the existing blob mode from the recursive tree and set type blob
so deletion entries match GitHub Trees API expectations.

Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Guard against regressions in delete-entry construction per review.

Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>

# Conflicts:
#	internal/forge/fake.go
#	internal/forge/forge.go

Signed-off-by: Barak Korren <bkorren@redhat.com>
Encode CommitFiles tree entries as base64 to preserve ELF binaries,
add tar extract containment check, consolidate stale cleanup with a
manifest/binary quick-check, and deduplicate cleanup between CLI and layer.
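
The containment and size checks mentioned in these commits could look roughly like the sketch below. It is not the code from the branch: `safeTarget`, `maxMemberSize`, and the 64 MiB cap are hypothetical names and values chosen for illustration, and the sketch assumes Unix-style paths.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// maxMemberSize is a hypothetical per-file cap; the PR does not state the real limit.
const maxMemberSize = 64 << 20

var (
	errEscapesRoot = errors.New("tar member escapes extraction root")
	errTooLarge    = errors.New("tar member exceeds size limit")
)

// safeTarget returns the destination path for one tar member, or an error
// if the member is oversized or would be written outside root.
func safeTarget(root, name string, size int64) (string, error) {
	if size > maxMemberSize {
		return "", fmt.Errorf("%w: %s (%d bytes)", errTooLarge, name, size)
	}
	// Trim leading slashes so absolute member names are treated as relative.
	rel := strings.TrimLeft(name, "/")
	// filepath.Join cleans the result, so any ".." segments are resolved here.
	target := filepath.Join(root, rel)
	relToRoot, err := filepath.Rel(root, target)
	if err != nil || relToRoot == ".." ||
		strings.HasPrefix(relToRoot, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("%w: %s", errEscapesRoot, name)
	}
	return target, nil
}

func main() {
	for _, name := range []string{"/docs/a.md", "../../etc/passwd"} {
		target, err := safeTarget("/tmp/defaults", name, 1024)
		fmt.Println(target, err)
	}
}
```

Checking the cleaned path with `filepath.Rel` rather than a plain string prefix avoids accepting sibling directories such as `/tmp/defaults-evil` when the root is `/tmp/defaults`.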

Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>

# Conflicts:
#	action.yml
#	docs/guides/dev/testing-workflows.md

Signed-off-by: Barak Korren <bkorren@redhat.com>
Clarify removed distribution-mode artifacts, drop e2e vendor line, and
document action.yml source-build fallback.

Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Empty commit to re-dispatch review; prior synchronize dispatch was cancelled.

Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Keep enumerateVendoredPaths aligned with CollectVendoredAssets after
main added the composite action (fullsend-ai#2106); fixes CI parity test.

Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
…t dispatch

GitHub Actions may return 422 when repo-maintenance is dispatched immediately
after a separate vendor CommitFiles on a fresh .fullsend repo. Merge scaffold
and vendored assets into one atomic commit and retry dispatch on indexing lag.

Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
…nance

Poll GitHub until repo-maintenance.yml is active before dispatch, re-touch
config.yaml after scaffold so the push trigger can run enrollment when
dispatch is still rejected, and fall back to awaiting a push-triggered run.

Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
…nary

Tree entries with encoding:base64 stored base64 text literally on GitHub,
corrupting YAML workflows and vendor-manifest.yaml. Restore UTF-8 inline
content for text and upload binary via the Git Blob API instead.
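
A minimal sketch of the split described in this commit: text stays inline as UTF-8, binaries go through a base64-encoded blob. The `upload` type, `planUpload`, and the text-detection heuristic (valid UTF-8 with no NUL byte) are assumptions for illustration; the commit message does not say how the branch tells text from binary.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
	"unicode/utf8"
)

// upload is a hypothetical plan for sending one file to the forge.
type upload struct {
	Path    string
	ViaBlob bool   // true: create a blob first, then reference its SHA in the tree
	Payload string // inline UTF-8 text, or base64 text for the blob request
}

// planUpload keeps valid UTF-8 text inline and routes everything else
// through a base64-encoded blob, so base64 is never stored as file text.
func planUpload(path string, data []byte) upload {
	isText := utf8.Valid(data) && bytes.IndexByte(data, 0) == -1
	if isText {
		return upload{Path: path, Payload: string(data)}
	}
	return upload{
		Path:    path,
		ViaBlob: true,
		Payload: base64.StdEncoding.EncodeToString(data),
	}
}

func main() {
	workflow := planUpload(".github/workflows/ci.yml", []byte("on: push\n"))
	binary := planUpload("bin/tool", []byte{0x7f, 'E', 'L', 'F', 0x02, 0x01, 0x01, 0x00})
	fmt.Println(workflow.ViaBlob, binary.ViaBlob)
}
```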

Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Design for a new `prerequisites` triage action that replaces `blocked`.
The agent can now express both existing blockers and new issues that need
to be created upstream before progress can happen. Includes allowlist
configuration for cross-repo issue creation and a degraded path when
targets are not authorized.

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ralph Bean <rbean@redhat.com>
…nd-ai#401)

Seven-task plan covering config structs, JSON schema, agent prompt,
post-script, user docs, and caller updates. TDD approach with exact
file paths and code blocks.

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ralph Bean <rbean@redhat.com>
Add CreateIssuesConfig and AllowTargets types to both OrgConfig and
PerRepoConfig. NewOrgConfig populates defaults with the org and
fullsend-ai/fullsend. NewPerRepoConfig populates with the target repo
and fullsend-ai/fullsend.

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ralph Bean <rbean@redhat.com>
…ues (fullsend-ai#401)

Pass org name and target repo to config constructors so create_issues
defaults are populated at install time.

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ralph Bean <rbean@redhat.com>
)

Replace the blocked action and blocked_by field with a prerequisites
action containing existing[] and create[] arrays. At least one array
must be non-empty.
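
The "at least one array must be non-empty" rule can be sketched as a small validation step. The field and type names below are assumptions: the commit names the `existing[]` and `create[]` arrays but not their element types.

```go
package main

import (
	"errors"
	"fmt"
)

// IssueDraft is a hypothetical shape for an upstream issue that still needs filing.
type IssueDraft struct {
	Repo  string
	Title string
}

// Prerequisites mirrors the action described above; element types are assumed.
type Prerequisites struct {
	Existing []string     // references to issues that already block progress
	Create   []IssueDraft // upstream issues that need to be created first
}

var errEmptyPrerequisites = errors.New("prerequisites: existing and create are both empty")

// Validate enforces that the action names at least one blocker of either kind.
func (p Prerequisites) Validate() error {
	if len(p.Existing) == 0 && len(p.Create) == 0 {
		return errEmptyPrerequisites
	}
	return nil
}

func main() {
	fmt.Println(Prerequisites{}.Validate())
	fmt.Println(Prerequisites{Existing: []string{"example-org/example-repo#1"}}.Validate())
}
```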

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ralph Bean <rbean@redhat.com>
…pt (fullsend-ai#401)

The triage agent can now recommend creating upstream issues via the
prerequisites action's create array, in addition to referencing existing
blockers. Adds hard constraint against emitting sufficient when
prerequisites exist.

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ralph Bean <rbean@redhat.com>
…d-ai#401)

Update triage agent docs to explain the new prerequisites action and the
create_issues.allow_targets configuration surface.

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ralph Bean <rbean@redhat.com>
…#401)

Replace the blocked handler with prerequisites. The post-script reads
the create_issues allowlist from config.yaml, creates permitted upstream
issues via gh, and includes collapsed draft bodies for disallowed or
failed creates so humans can file them manually.
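
The allowlist split performed by the post-script can be sketched in Go as follows. The real post-script is a shell script, so this only illustrates the logic. The matching rule is an assumption: an entry is treated as either a whole org (`acme`) or a single repo (`acme/tools`), which fits the defaults described earlier but is not confirmed by the source.

```go
package main

import (
	"fmt"
	"strings"
)

// allowed reports whether target ("owner/repo") matches an allowlist entry.
// Assumption: an entry is either a whole org or one specific repo.
func allowed(target string, allowTargets []string) bool {
	owner, _, ok := strings.Cut(target, "/")
	if !ok {
		return false // not in owner/repo form
	}
	for _, entry := range allowTargets {
		if entry == target || entry == owner {
			return true
		}
	}
	return false
}

// partition splits requested targets into issues to create and
// drafts to include in the comment for a human to file manually.
func partition(targets, allowTargets []string) (create, drafts []string) {
	for _, t := range targets {
		if allowed(t, allowTargets) {
			create = append(create, t)
		} else {
			drafts = append(drafts, t)
		}
	}
	return create, drafts
}

func main() {
	allow := []string{"acme", "fullsend-ai/fullsend"}
	create, drafts := partition(
		[]string{"acme/api", "fullsend-ai/fullsend", "other/repo"}, allow)
	fmt.Println("create:", create)
	fmt.Println("drafts:", drafts)
}
```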

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ralph Bean <rbean@redhat.com>
…ullsend-ai#401)

The agent prompt referenced a nonexistent `prerequisites` label when
checking for prior blockers — the post-script actually applies the
`blocked` label. Also removed unused SOURCE_ORG variable from
post-triage.sh.

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ralph Bean <rbean@redhat.com>
ralphbean and others added 12 commits June 18, 2026 12:17
Two call sites in commitFilesTo were missed during the rename, causing
build failures.

Assisted-by: Claude claude-opus-4-6 <noreply@anthropic.com>
Signed-off-by: Ralph Bean <rbean@redhat.com>
…pdate

The mergeEnrollmentPR function in the e2e test calls
MergeChangeProposal once without handling GitHub's 409 "Head branch
is out of date" response. When the reconcile workflow pushes to the
default branch between PR creation and the merge attempt, the
enrollment PR's base falls behind and the merge is rejected.

Add an UpdatePullRequestBranch method to the forge.Client interface
(wrapping GitHub's PUT /repos/{owner}/{repo}/pulls/{number}/update-branch)
and implement it in the GitHub LiveClient and FakeClient. In
mergeEnrollmentPR, wrap the merge call in a retry loop (up to 3
attempts) that detects 409 errors via the APIError status code,
calls UpdatePullRequestBranch to bring the PR branch up to date,
waits 5 seconds for GitHub to process, and retries the merge.
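
The retry loop described above might be sketched like this. The `merger` interface, the `APIError` stand-in, and the simplified signatures (no context, owner, or repo arguments) are assumptions for illustration; the real forge.Client methods are likely to differ.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"time"
)

// APIError is a stand-in for the forge error type; only the status code matters here.
type APIError struct{ StatusCode int }

func (e *APIError) Error() string { return fmt.Sprintf("api error: status %d", e.StatusCode) }

// merger is the hypothetical slice of forge.Client this sketch needs.
type merger interface {
	MergeChangeProposal(number int) error
	UpdatePullRequestBranch(number int) error
}

// mergeWithRetry retries a merge only when the forge answers 409,
// updating the PR branch and waiting between attempts.
func mergeWithRetry(c merger, number, attempts int, wait time.Duration) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = c.MergeChangeProposal(number); err == nil {
			return nil
		}
		var apiErr *APIError
		if !errors.As(err, &apiErr) || apiErr.StatusCode != http.StatusConflict {
			return err // not a stale-branch conflict, so do not retry
		}
		if i == attempts-1 {
			break // out of attempts; skip the pointless branch update
		}
		if uerr := c.UpdatePullRequestBranch(number); uerr != nil {
			return fmt.Errorf("update branch: %w", uerr)
		}
		time.Sleep(wait) // give the forge time to process the update
	}
	return fmt.Errorf("merge failed after %d attempts: %w", attempts, err)
}

// fakeClient rejects the first `conflicts` merges with 409, then succeeds.
type fakeClient struct{ conflicts, merges, updates int }

func (f *fakeClient) MergeChangeProposal(int) error {
	f.merges++
	if f.merges <= f.conflicts {
		return &APIError{StatusCode: http.StatusConflict}
	}
	return nil
}

func (f *fakeClient) UpdatePullRequestBranch(int) error { f.updates++; return nil }

func main() {
	c := &fakeClient{conflicts: 1}
	err := mergeWithRetry(c, 54, 3, 10*time.Millisecond)
	fmt.Println(err, c.merges, c.updates)
}
```

Returning immediately on any error other than 409 keeps unrelated failures (for example, permission errors) from being masked by retries.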

Note: pre-commit could not run in sandbox (shellcheck install
failed due to network restrictions). The post-script runs it
authoritatively.

Closes fullsend-ai#2432
fix(forge): retry 5xx server errors at the HTTP client level
docs(problems): add static analysis layer to testing-agents
…-merge-409

fix(fullsend-ai#2432): retry enrollment PR merge on 409 with branch update
The heading used `# ADR 0047: Vendored installs with --vendor`
but all other ADRs use `# <number>. <title>` without the ADR
prefix or zero-padded number. Updated to
`# 47. Vendored installs with --vendor` for consistency.

Note: pre-commit could not run in sandbox due to shellcheck
network error (exit 3). Post-script will run authoritatively.

Closes fullsend-ai#2440
…view

refactor(config): make OrgConfig.Agents optional and add Phase 4 plan (ADR-0045 Phase 3 PR 6)
Add ADR recording the decision to instrument fullsend with OpenTelemetry
using a three-level opt-in model (local files → OTLP metadata export →
content capture). Separates telemetry from evaluation concerns.

Key changes:
- ADR 0048: three-level content sensitivity model per OTEL GenAI spec,
  explicit scope boundary (evals consume traces, separate concern),
  multi-backend via OTEL Collector (not multi-endpoint config)
- Infrastructure guide: env var precedence, local dev section, content
  capture warning; backend-agnostic language throughout
- Cross-reference annotation in ADR 0021 (OTel future → now decided)
- Update cross-references in architecture.md and problem doc

Addresses review feedback from ralphbean, maruiz93, and review bot.

Signed-off-by: Adam Scerra <ascerra@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Adam Scerra <ascerra@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Adam Scerra <ascerra@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Adam Scerra <ascerra@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Adam Scerra <ascerra@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Adam Scerra <ascerra@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
docs: ADR 0050 — distributed tracing instrumentation with OpenTelemetry
…dr-heading-format

docs(fullsend-ai#2440): fix ADR 0047 heading to match convention
…n findings

The review agent hallucinated file contents on PR
konflux-ci/konflux-test#833, claiming a Dockerfile contained
--nogpgcheck when it never did. The root cause was that the
code-review skill and correctness sub-agent had no explicit
requirement to read files outside the PR diff before asserting
what they contain.

Changes:
- code-review SKILL.md step 2: added cross-file verification
  bullet requiring the agent to read any file it references
  in a finding, even if not in the diff.
- code-review SKILL.md step 4: added cross-file finding
  self-check requiring verification that referenced files
  were read before finalizing findings.
- correctness sub-agent: added cross-file verification
  section with the same read-before-assert requirement.
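
The change itself is to prompt text (SKILL.md and the sub-agent instructions), not to code. Purely as an illustration, the read-before-assert rule can be expressed as a mechanical self-check like the sketch below, where `Finding`, `unverified`, and the two path sets are hypothetical.

```go
package main

import "fmt"

// Finding is a hypothetical review finding; Files lists every path it makes claims about.
type Finding struct {
	Summary string
	Files   []string
}

// unverified returns findings that reference at least one file which is
// neither part of the PR diff nor among the files the agent actually read.
func unverified(findings []Finding, inDiff, read map[string]bool) []Finding {
	var out []Finding
	for _, f := range findings {
		for _, path := range f.Files {
			if !inDiff[path] && !read[path] {
				out = append(out, f)
				break // one unread file is enough to flag the finding
			}
		}
	}
	return out
}

func main() {
	findings := []Finding{
		{Summary: "Dockerfile passes --nogpgcheck", Files: []string{"Dockerfile"}},
		{Summary: "missing nil check", Files: []string{"internal/forge/forge.go"}},
	}
	inDiff := map[string]bool{"internal/forge/forge.go": true}
	read := map[string]bool{} // the agent never opened the Dockerfile
	for _, f := range unverified(findings, inDiff, read) {
		fmt.Println("re-verify or drop:", f.Summary)
	}
}
```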

Note: make lint could not run due to sandbox network
restrictions preventing shellcheck installation.

Closes fullsend-ai#1835
@guyoron1

Owner Author

/fs-qf

@fullsend-ai-review

fullsend-ai-review Bot commented Jun 21, 2026


🤖 Finished Review · ✅ Success · Started 7:59 AM UTC · Completed 8:10 AM UTC
Commit: 53ee5b2 · View workflow run →

@fullsend-ai-review

fullsend-ai-review Bot commented Jun 21, 2026


Review

Reason: token-limit

This PR was NOT reviewed. Do not count this as an approval.

Previous run

Review

Reason: stale-head

The review agent reviewed commit c954becc75be3be25f268a0c6c46a24df8d8f8b5 but the PR HEAD is now fa3dcc241db4ce6641bb0decf63ec21f4139175c. This review was discarded to avoid approving unreviewed code.

Previous run (2)

Review

Reason: stale-head

The review agent reviewed commit 0f5066f722c1862e4ccc8e160b10f8125fc66a91 but the PR HEAD is now 3d2a530502f4d2d61c45ac55db1c33d5ced8d37a. This review was discarded to avoid approving unreviewed code.

@fullsend-ai-review


/fs-review

@fullsend-ai-review

fullsend-ai-review Bot commented Jun 21, 2026


🤖 Finished Review · ✅ Success · Started 8:12 AM UTC · Completed 8:48 AM UTC
Commit: 53ee5b2 · View workflow run →

QualityFlow and others added 3 commits June 21, 2026 08:25
…rios [skip ci]

- Add patterns field to all 11 scenarios (CRITICAL fix)
- Fix tier values: "Functional" -> "Tier 1" (MAJOR fix)
- Replace vague test step commands with concrete Go expressions
- Add 2 negative/error-path scenarios (012, 013) with Go stubs
- Remove related_prs from document_metadata
- Tighten overly broad search criterion in scenario 006
- Add regexp import, remove unused project imports
- Update review: NEEDS_REVISION -> APPROVED_WITH_FINDINGS (score 72->88)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@fullsend-ai-review


/fs-review

QualityFlow and others added 2 commits June 21, 2026 08:49
Generated 13 Go test implementations from STD YAML across 5 test files:
- eval_document_completeness_test.go (3 tests: project coverage, architecture, capability mapping)
- integration_surface_analysis_test.go (3 tests: forge.Client, harness/sandbox, config.OrgConfig)
- eval_recommendation_test.go (3 tests: recommendation, justification, follow-up)
- repo_accessibility_test.go (2 tests: accessibility status, deprecation handling)
- eval_negative_checks_test.go (2 tests: missing project detection, missing recommendation detection)
Replaces intermediate pipeline artifacts with organized test files.

Total: 5 test files → qf-tests/GH-54/
Jira: GH-54
[skip ci]
@github-actions


QualityFlow Pipeline Summary

| Stage | Agent | Status |
|-------|-------|--------|
| 1 | STP Builder | |
| 2 | STP Reviewer | |
| 3 | STP Refiner | |
| 4 | STD Builder | |
| 5 | STD Reviewer | |
| 6 | STD Refiner | |
| 7 | Test Generator | |

Test Output

| Language | Count | Location |
|----------|-------|----------|
| Go | 5 files | qf-tests/GH-54/go/ |

Issue: GH-54


Generated by QualityFlow

@fullsend-ai-review

fullsend-ai-review Bot commented Jun 21, 2026


🤖 Finished Review · ✅ Success · Started 8:51 AM UTC · Completed 8:53 AM UTC
Commit: 53ee5b2 · View workflow run →

@guyoron1 guyoron1 closed this Jun 21, 2026
@guyoron1 guyoron1 deleted the mirror/2443-1835-verify-file-contents-before-asserting branch June 21, 2026 10:51


9 participants