fix(#1835): require file reads before asserting contents in findings #54
guyoron1 wants to merge 186 commits into
Add a new subsection under "CI pipeline for agent configurations" elaborating on Step 1 (static analysis). Covers component-level checks (structural integrity, security patterns, token budget), setup-level analysis (redundancy detection, dependency validation, token budget distribution, trigger overlap, dimension scoring), and optional LLM-based rubric scoring. Presents similarity techniques as options (TF-IDF, embeddings, LLM-based) rather than prescribing a single approach. Adds three open questions on thresholds, lint rule universality, and token budgets. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Benjamin Kapner <bkapner@redhat.com>
Introduce --vendor to install vendored binaries, reusable workflows, actions, and agent content. Vendored upstream mirror content is committed under .defaults/ (same layout as runtime sparse checkout); layered installs fetch fullsend-ai/fullsend@v0 into .defaults when the marker file is absent. Reusable workflows use inline workspace preparation and reference infra from ./.defaults/, matching the pre-vendor layered design. Thin callers render local reusable paths when --vendor is set. --fullsend-source pins the source tree for both content and binary cross-compile; --fullsend-binary remains an explicit ELF override. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Write vendor-manifest.yaml on --vendor installs so cleanup and analyze work without a local fullsend checkout. Workflows analyze stays embed-only; vendor layer reports presence, manifest alignment, and optional source alignment via admin analyze --fullsend-source. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Consolidate thin-stage caller registry, reuse resolved source root for binary vendoring, reject oversized tar members during extraction, restore workflows scope comment, fix testing-workflows prose, and introduce InstallFiles as the canonical collector return type. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Re-add the full download_test.go suite and append extractSourceTree size limit coverage. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Delete vendored paths atomically via forge.DeleteFiles, reuse resolved source root for cross-compile, preserve extracted file modes, and tighten WouldFix deduplication to exact path matches. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Document intentional breaking change: old flag callers should use --vendor; only known usage was e2e, already updated in this branch. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Document VendorBinaryLayer legacy naming, restore Uninstall/Analyze comments, and use Title Case for stale-cleanup progress messages. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Batch binary, content, and manifest in one CommitFiles call; validate manifest version on read; trim leading slash in extractSourceTree; wrap DeleteFiles ref PATCH in retryOnTransient. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Use the existing blob mode from the recursive tree and set type blob so deletion entries match GitHub Trees API expectations. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Guard against regressions in delete-entry construction per review. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
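The delete-entry shape described in the two commits above can be sketched as follows. This is a minimal illustration, not the project's code: the `treeEntry` and `existingBlob` types, the `deleteEntry` helper, and the example path are hypothetical. It relies on the GitHub "create a tree" behavior that an entry whose `sha` is JSON `null` deletes the file at that path.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// treeEntry mirrors one element of the "tree" array in a "create a tree"
// request. SHA is a pointer without omitempty so that a nil value is
// serialized as JSON null, which the API treats as a deletion.
type treeEntry struct {
	Path string  `json:"path"`
	Mode string  `json:"mode"`
	Type string  `json:"type"`
	SHA  *string `json:"sha"`
}

// existingBlob is a hypothetical view of one blob from the recursive tree listing.
type existingBlob struct {
	Path string
	Mode string // for example "100644" or "100755"
}

// deleteEntry builds a deletion entry that reuses the blob's existing mode
// and always sets the type to "blob".
func deleteEntry(b existingBlob) treeEntry {
	return treeEntry{Path: b.Path, Mode: b.Mode, Type: "blob", SHA: nil}
}

func main() {
	// The path below is an illustrative placeholder.
	e := deleteEntry(existingBlob{Path: ".defaults/example-file", Mode: "100755"})
	out, err := json.Marshal(e)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

A regression test for this shape would mostly assert that the mode is copied through unchanged and that `sha` stays `null` rather than being dropped from the payload.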
Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com> # Conflicts: # internal/forge/fake.go # internal/forge/forge.go Signed-off-by: Barak Korren <bkorren@redhat.com>
Encode CommitFiles tree entries as base64 to preserve ELF binaries, add tar extract containment check, consolidate stale cleanup with a manifest/binary quick-check, and deduplicate cleanup between CLI and layer. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
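The containment check mentioned here, together with the oversized-member rejection and leading-slash trimming from the earlier commits, might look roughly like the sketch below. All names (`safeJoin`, `extract`, `buildTar`, `maxMemberSize`) and the 1 MiB cap are assumptions made for illustration; the real `extractSourceTree` is not shown in this conversation. For simplicity it extracts into an in-memory map instead of writing to disk.

```go
package main

import (
	"archive/tar"
	"bytes"
	"errors"
	"fmt"
	"io"
	"path/filepath"
	"strings"
)

// maxMemberSize is a hypothetical per-member cap; the real limit is not stated here.
const maxMemberSize = 1 << 20

var (
	errEscape   = errors.New("tar member escapes destination")
	errTooLarge = errors.New("tar member exceeds size limit")
)

// safeJoin trims a leading slash, joins name under root, and rejects any
// result that resolves outside root (for example via "../").
func safeJoin(root, name string) (string, error) {
	target := filepath.Join(root, strings.TrimPrefix(name, "/"))
	rel, err := filepath.Rel(root, target)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", errEscape
	}
	return target, nil
}

// extract reads regular files from an in-memory tar archive and returns
// their contents keyed by the contained destination path.
func extract(archive []byte, root string) (map[string][]byte, error) {
	files := map[string][]byte{}
	tr := tar.NewReader(bytes.NewReader(archive))
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return files, nil
		}
		if err != nil {
			return nil, err
		}
		if hdr.Typeflag != tar.TypeReg {
			continue // skip directories, links, and other special members
		}
		if hdr.Size > maxMemberSize {
			return nil, fmt.Errorf("%s: %w", hdr.Name, errTooLarge)
		}
		target, err := safeJoin(root, hdr.Name)
		if err != nil {
			return nil, fmt.Errorf("%s: %w", hdr.Name, err)
		}
		data, err := io.ReadAll(tr) // the tar reader stops at the end of this member
		if err != nil {
			return nil, err
		}
		files[target] = data
	}
}

// buildTar creates a small archive for the demonstration below.
func buildTar(members map[string]string) []byte {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for name, body := range members {
		hdr := &tar.Header{Name: name, Mode: 0o644, Size: int64(len(body)), Typeflag: tar.TypeReg}
		if err := tw.WriteHeader(hdr); err != nil {
			panic(err)
		}
		if _, err := tw.Write([]byte(body)); err != nil {
			panic(err)
		}
	}
	if err := tw.Close(); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

func main() {
	files, err := extract(buildTar(map[string]string{"agents/review.md": "hello"}), "/dest")
	fmt.Println(len(files), err)

	_, err = safeJoin("/dest", "../etc/passwd")
	fmt.Println(errors.Is(err, errEscape))
}
```

Checking the size from the header before reading keeps a hostile archive from forcing a large allocation, and resolving the path before any write keeps a `../` member from landing outside the destination.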
Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com> # Conflicts: # action.yml # docs/guides/dev/testing-workflows.md Signed-off-by: Barak Korren <bkorren@redhat.com>
Clarify removed distribution-mode artifacts, drop e2e vendor line, and document action.yml source-build fallback. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Empty commit to re-dispatch review; prior synchronize dispatch was cancelled. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Keep enumerateVendoredPaths aligned with CollectVendoredAssets after main added the composite action (fullsend-ai#2106); fixes CI parity test. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
…t dispatch GitHub Actions may return 422 when repo-maintenance is dispatched immediately after a separate vendor CommitFiles on a fresh .fullsend repo. Merge scaffold and vendored assets into one atomic commit and retry dispatch on indexing lag. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
…nance Poll GitHub until repo-maintenance.yml is active before dispatch, re-touch config.yaml after scaffold so the push trigger can run enrollment when dispatch is still rejected, and fall back to awaiting a push-triggered run. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
…nary Tree entries with encoding:base64 stored base64 text literally on GitHub, corrupting YAML workflows and vendor-manifest.yaml. Restore UTF-8 inline content for text and upload binary via the Git Blob API instead. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
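One way to picture the split described in this commit is a small planner that keeps text inline and routes binary data to a separate blob upload. The sketch below is an assumption-heavy illustration: the `plannedUpload` type, the `planUpload` function, and the "valid UTF-8 with no NUL byte" heuristic are all hypothetical, and the commit does not say how the real code tells text from binary.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
	"unicode/utf8"
)

// plannedUpload describes how one file would be sent (hypothetical type).
// Exactly one of InlineContent or BlobBase64 is set.
type plannedUpload struct {
	Path          string
	InlineContent string // text: sent as UTF-8 content directly in the tree entry
	BlobBase64    string // binary: uploaded as a base64-encoded blob, then referenced by SHA
}

// looksLikeText is an assumed heuristic: valid UTF-8 and no NUL bytes.
func looksLikeText(data []byte) bool {
	return utf8.Valid(data) && bytes.IndexByte(data, 0) < 0
}

// planUpload chooses the inline path for text and the blob path for binary data.
func planUpload(path string, data []byte) plannedUpload {
	if looksLikeText(data) {
		return plannedUpload{Path: path, InlineContent: string(data)}
	}
	return plannedUpload{Path: path, BlobBase64: base64.StdEncoding.EncodeToString(data)}
}

func main() {
	text := planUpload("vendor-manifest.yaml", []byte("version: 1\n"))
	binary := planUpload("bin/example", []byte{0x7f, 'E', 'L', 'F', 0x00, 0xff})
	fmt.Println(text.InlineContent != "", binary.BlobBase64 != "")
}
```

Keeping the two paths separate avoids the failure mode described above, where base64 text ends up stored literally in a YAML file.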
Design for a new `prerequisites` triage action that replaces `blocked`. The agent can now express both existing blockers and new issues that need to be created upstream before progress can happen. Includes allowlist configuration for cross-repo issue creation and a degraded path when targets are not authorized. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ralph Bean <rbean@redhat.com>
…nd-ai#401) Seven-task plan covering config structs, JSON schema, agent prompt, post-script, user docs, and caller updates. TDD approach with exact file paths and code blocks. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ralph Bean <rbean@redhat.com>
Add CreateIssuesConfig and AllowTargets types to both OrgConfig and PerRepoConfig. NewOrgConfig populates defaults with the org and fullsend-ai/fullsend. NewPerRepoConfig populates with the target repo and fullsend-ai/fullsend. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ralph Bean <rbean@redhat.com>
…ues (fullsend-ai#401) Pass org name and target repo to config constructors so create_issues defaults are populated at install time. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ralph Bean <rbean@redhat.com>
…pt (fullsend-ai#401) The triage agent can now recommend creating upstream issues via the prerequisites action's create array, in addition to referencing existing blockers. Adds hard constraint against emitting sufficient when prerequisites exist. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ralph Bean <rbean@redhat.com>
…d-ai#401) Update triage agent docs to explain the new prerequisites action and the create_issues.allow_targets configuration surface. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ralph Bean <rbean@redhat.com>
…#401) Replace the blocked handler with prerequisites. The post-script reads the create_issues allowlist from config.yaml, creates permitted upstream issues via gh, and includes collapsed draft bodies for disallowed or failed creates so humans can file them manually. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ralph Bean <rbean@redhat.com>
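The allowlist decision can be sketched as below. The actual handler is a post-script that shells out to `gh`; this Go version only illustrates the partitioning logic. The matching rule (an entry is either a whole org such as `acme` or a single `org/repo`, compared case-insensitively) is an assumption inferred from the defaults described in the earlier config commit, and `targetAllowed` and `partition` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// targetAllowed reports whether target ("org/repo") matches an allowlist
// entry. An entry is assumed to be either an org name, which covers every
// repo in that org, or a full "org/repo".
func targetAllowed(allow []string, target string) bool {
	owner, _, ok := strings.Cut(target, "/")
	if !ok {
		return false // not in "org/repo" form
	}
	for _, entry := range allow {
		if strings.EqualFold(entry, target) || strings.EqualFold(entry, owner) {
			return true
		}
	}
	return false
}

// partition splits requested targets into those to create automatically
// and those to surface as collapsed drafts for a human to file.
func partition(allow, targets []string) (create, draft []string) {
	for _, t := range targets {
		if targetAllowed(allow, t) {
			create = append(create, t)
		} else {
			draft = append(draft, t)
		}
	}
	return create, draft
}

func main() {
	allow := []string{"acme", "fullsend-ai/fullsend"}
	create, draft := partition(allow, []string{"acme/widgets", "fullsend-ai/other", "fullsend-ai/fullsend"})
	fmt.Println(create, draft)
}
```

Failed creates would join the draft list as well, so the degraded path is the same whether the target was disallowed or the API call failed.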
…ullsend-ai#401) The agent prompt referenced a nonexistent `prerequisites` label when checking for prior blockers — the post-script actually applies the `blocked` label. Also removed unused SOURCE_ORG variable from post-triage.sh. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ralph Bean <rbean@redhat.com>
Two call sites in commitFilesTo were missed during the rename, causing build failures. Assisted-by: Claude claude-opus-4-6 <noreply@anthropic.com> Signed-off-by: Ralph Bean <rbean@redhat.com>
…pdate
The mergeEnrollmentPR function in the e2e test calls
MergeChangeProposal once without handling GitHub's 409 "Head branch
is out of date" response. When the reconcile workflow pushes to the
default branch between PR creation and the merge attempt, the
enrollment PR's base falls behind and the merge is rejected.
Add an UpdatePullRequestBranch method to the forge.Client interface
(wrapping GitHub's PUT /repos/{owner}/{repo}/pulls/{number}/update-branch)
and implement it in the GitHub LiveClient and FakeClient. In
mergeEnrollmentPR, wrap the merge call in a retry loop (up to 3
attempts) that detects 409 errors via the APIError status code,
calls UpdatePullRequestBranch to bring the PR branch up to date,
waits 5 seconds for GitHub to process, and retries the merge.
Note: pre-commit could not run in sandbox (shellcheck install
failed due to network restrictions). The post-script runs it
authoritatively.
Closes fullsend-ai#2432
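The retry loop described in this commit message can be sketched as follows. The `merger` interface, the local `APIError` type, and both method signatures are simplified stand-ins; the real `forge.Client` methods presumably take more parameters (a context, owner, and repo), which are omitted here.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"time"
)

// APIError is a stand-in for the forge error type; only the status code matters here.
type APIError struct{ StatusCode int }

func (e *APIError) Error() string { return fmt.Sprintf("api error: status %d", e.StatusCode) }

// merger is a hypothetical slice of the forge client interface.
type merger interface {
	MergeChangeProposal(number int) error
	UpdatePullRequestBranch(number int) error
}

// mergeWithRetry attempts the merge up to attempts times. On a 409 it
// updates the PR branch, waits, and tries again; any other error is
// returned immediately.
func mergeWithRetry(c merger, number, attempts int, wait time.Duration) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = c.MergeChangeProposal(number); err == nil {
			return nil
		}
		var apiErr *APIError
		if !errors.As(err, &apiErr) || apiErr.StatusCode != http.StatusConflict {
			return err
		}
		if i == attempts-1 {
			break // no point updating the branch after the final attempt
		}
		if uerr := c.UpdatePullRequestBranch(number); uerr != nil {
			return fmt.Errorf("update branch: %w", uerr)
		}
		time.Sleep(wait)
	}
	return fmt.Errorf("merge still rejected after %d attempts: %w", attempts, err)
}

// fakeClient returns 409 for the first conflicts merge calls, then succeeds.
type fakeClient struct{ conflicts, updates int }

func (f *fakeClient) MergeChangeProposal(int) error {
	if f.conflicts > 0 {
		f.conflicts--
		return &APIError{StatusCode: http.StatusConflict}
	}
	return nil
}

func (f *fakeClient) UpdatePullRequestBranch(int) error { f.updates++; return nil }

func main() {
	c := &fakeClient{conflicts: 2}
	// The commit uses 3 attempts and a 5 second wait; the wait is shortened here.
	err := mergeWithRetry(c, 1, 3, 10*time.Millisecond)
	fmt.Println(err, c.updates)
}
```

Returning non-409 errors immediately keeps the loop from masking unrelated failures such as permission errors or a closed PR.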
fix(forge): retry 5xx server errors at the HTTP client level
docs(problems): add static analysis layer to testing-agents
…-merge-409 fix(fullsend-ai#2432): retry enrollment PR merge on 409 with branch update
The heading used `# ADR 0047: Vendored installs with --vendor` but all other ADRs use `# <number>. <title>` without the ADR prefix or zero-padded number. Updated to `# 47. Vendored installs with --vendor` for consistency. Note: pre-commit could not run in sandbox due to shellcheck network error (exit 3). Post-script will run authoritatively. Closes fullsend-ai#2440
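The heading convention described above (`# <number>. <title>`, with no ADR prefix and no zero padding) is simple enough to check mechanically. The sketch below is a hypothetical lint helper, not something this PR adds; the pattern is one reading of the stated convention.

```go
package main

import (
	"fmt"
	"regexp"
)

// adrHeading matches "# <number>. <title>" where the number has no leading
// zeros and the title is non-empty.
var adrHeading = regexp.MustCompile(`^# [1-9][0-9]*\. \S.*$`)

// validADRHeading reports whether the first line of an ADR follows the convention.
func validADRHeading(line string) bool {
	return adrHeading.MatchString(line)
}

func main() {
	fmt.Println(validADRHeading("# 47. Vendored installs with --vendor"))
	fmt.Println(validADRHeading("# ADR 0047: Vendored installs with --vendor"))
}
```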
…view refactor(config): make OrgConfig.Agents optional and add Phase 4 plan (ADR-0045 Phase 3 PR 6)
Add ADR recording the decision to instrument fullsend with OpenTelemetry using a three-level opt-in model (local files → OTLP metadata export → content capture). Separates telemetry from evaluation concerns.
Key changes:
- ADR 0048: three-level content sensitivity model per OTEL GenAI spec, explicit scope boundary (evals consume traces, separate concern), multi-backend via OTEL Collector (not multi-endpoint config)
- Infrastructure guide: env var precedence, local dev section, content capture warning; backend-agnostic language throughout
- Cross-reference annotation in ADR 0021 (OTel future → now decided)
- Update cross-references in architecture.md and problem doc
Addresses review feedback from ralphbean, maruiz93, and review bot.
Signed-off-by: Adam Scerra <ascerra@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
docs: ADR 0050 — distributed tracing instrumentation with OpenTelemetry
…dr-heading-format docs(fullsend-ai#2440): fix ADR 0047 heading to match convention
…n findings
The review agent hallucinated file contents on PR konflux-ci/konflux-test#833, claiming a Dockerfile contained --nogpgcheck when it never did. The root cause was that the code-review skill and correctness sub-agent had no explicit requirement to read files outside the PR diff before asserting what they contain.
Changes:
- code-review SKILL.md step 2: added cross-file verification bullet requiring the agent to read any file it references in a finding, even if not in the diff.
- code-review SKILL.md step 4: added cross-file finding self-check requiring verification that referenced files were read before finalizing findings.
- correctness sub-agent: added cross-file verification section with the same read-before-assert requirement.
Note: make lint could not run due to sandbox network restrictions preventing shellcheck installation.
Closes fullsend-ai#1835
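The change itself is made in prompt text (SKILL.md and the sub-agent instructions), not in code. Purely as an illustration of the read-before-assert rule, the sketch below expresses it as a data check: a finding counts as verified only if every file it makes claims about was either part of the diff or explicitly read. The `finding` type and `unverifiedFindings` function are hypothetical.

```go
package main

import "fmt"

// finding is a hypothetical record of one review finding and the files
// whose contents it makes claims about.
type finding struct {
	Summary string
	Files   []string
}

// unverifiedFindings returns the findings that cite at least one file the
// agent neither saw in the diff nor read explicitly.
func unverifiedFindings(findings []finding, diffFiles, readFiles []string) []finding {
	seen := map[string]bool{}
	for _, f := range diffFiles {
		seen[f] = true
	}
	for _, f := range readFiles {
		seen[f] = true
	}
	var out []finding
	for _, fd := range findings {
		for _, f := range fd.Files {
			if !seen[f] {
				out = append(out, fd)
				break // one unread file is enough to flag the finding
			}
		}
	}
	return out
}

func main() {
	findings := []finding{
		{Summary: "Dockerfile disables GPG checks", Files: []string{"Dockerfile"}},
		{Summary: "Script lacks error handling", Files: []string{"run.sh"}},
	}
	// Only run.sh is in the diff and nothing else was read, so the
	// Dockerfile claim is flagged as unverified.
	flagged := unverifiedFindings(findings, []string{"run.sh"}, nil)
	fmt.Println(len(flagged), flagged[0].Summary)
}
```

Under this rule the hallucinated Dockerfile claim would have been caught at the self-check step, because the Dockerfile was never read.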
/fs-qf
🤖 Finished Review · ✅ Success · Started 7:59 AM UTC · Completed 8:10 AM UTC
Review
Reason: token-limit
This PR was NOT reviewed. Do not count this as an approval.
Previous run
Review
Reason: stale-head
The review agent reviewed commit
Previous run (2)
Review
Reason: stale-head
The review agent reviewed commit
/fs-review
🤖 Finished Review · ✅ Success · Started 8:12 AM UTC · Completed 8:48 AM UTC
…rios [skip ci]
- Add patterns field to all 11 scenarios (CRITICAL fix)
- Fix tier values: "Functional" -> "Tier 1" (MAJOR fix)
- Replace vague test step commands with concrete Go expressions
- Add 2 negative/error-path scenarios (012, 013) with Go stubs
- Remove related_prs from document_metadata
- Tighten overly broad search criterion in scenario 006
- Add regexp import, remove unused project imports
- Update review: NEEDS_REVISION -> APPROVED_WITH_FINDINGS (score 72->88)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
/fs-review |
Generated 13 Go test implementations from STD YAML across 5 test files:
- eval_document_completeness_test.go (3 tests: project coverage, architecture, capability mapping)
- integration_surface_analysis_test.go (3 tests: forge.Client, harness/sandbox, config.OrgConfig)
- eval_recommendation_test.go (3 tests: recommendation, justification, follow-up)
- repo_accessibility_test.go (2 tests: accessibility status, deprecation handling)
- eval_negative_checks_test.go (2 tests: missing project detection, missing recommendation detection)
QualityFlow Pipeline Summary
Test Output
Issue: GH-54 Generated by QualityFlow
🤖 Finished Review · ✅ Success · Started 8:51 AM UTC · Completed 8:53 AM UTC
Mirror of upstream fullsend-ai#2443
The review agent hallucinated file contents, claiming a Dockerfile contained --nogpgcheck when it never did. Adds explicit requirement to read files outside the PR diff before asserting what they contain.