fix(#1835): require file reads before asserting contents in findings#70
fix(#1835): require file reads before asserting contents in findings#70guyoron1 wants to merge 196 commits into
Conversation
Add a new subsection under "CI pipeline for agent configurations" elaborating on Step 1 (static analysis). Covers component-level checks (structural integrity, security patterns, token budget), setup-level analysis (redundancy detection, dependency validation, token budget distribution, trigger overlap, dimension scoring), and optional LLM-based rubric scoring. Presents similarity techniques as options (TF-IDF, embeddings, LLM-based) rather than prescribing a single approach. Adds three open questions on thresholds, lint rule universality, and token budgets. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Benjamin Kapner <bkapner@redhat.com>
Introduce --vendor to install vendored binaries, reusable workflows, actions, and agent content. Vendored upstream mirror content is committed under .defaults/ (same layout as runtime sparse checkout); layered installs fetch fullsend-ai/fullsend@v0 into .defaults when the marker file is absent. Reusable workflows use inline workspace preparation and reference infra from ./.defaults/, matching the pre-vendor layered design. Thin callers render local reusable paths when --vendor is set. --fullsend-source pins the source tree for both content and binary cross-compile; --fullsend-binary remains an explicit ELF override. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Write vendor-manifest.yaml on --vendor installs so cleanup and analyze work without a local fullsend checkout. Workflows analyze stays embed-only; vendor layer reports presence, manifest alignment, and optional source alignment via admin analyze --fullsend-source. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Consolidate thin-stage caller registry, reuse resolved source root for binary vendoring, reject oversized tar members during extraction, restore workflows scope comment, fix testing-workflows prose, and introduce InstallFiles as the canonical collector return type. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Re-add the full download_test.go suite and append extractSourceTree size limit coverage. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Delete vendored paths atomically via forge.DeleteFiles, reuse resolved source root for cross-compile, preserve extracted file modes, and tighten WouldFix deduplication to exact path matches. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Document intentional breaking change: old flag callers should use --vendor; only known usage was e2e, already updated in this branch. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Document VendorBinaryLayer legacy naming, restore Uninstall/Analyze comments, and use Title Case for stale-cleanup progress messages. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Batch binary, content, and manifest in one CommitFiles call; validate manifest version on read; trim leading slash in extractSourceTree; wrap DeleteFiles ref PATCH in retryOnTransient. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Use the existing blob mode from the recursive tree and set type blob so deletion entries match GitHub Trees API expectations. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Guard against regressions in delete-entry construction per review. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com> # Conflicts: # internal/forge/fake.go # internal/forge/forge.go Signed-off-by: Barak Korren <bkorren@redhat.com>
Encode CommitFiles tree entries as base64 to preserve ELF binaries, add tar extract containment check, consolidate stale cleanup with a manifest/binary quick-check, and deduplicate cleanup between CLI and layer. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com> # Conflicts: # action.yml # docs/guides/dev/testing-workflows.md Signed-off-by: Barak Korren <bkorren@redhat.com>
Clarify removed distribution-mode artifacts, drop e2e vendor line, and document action.yml source-build fallback. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Empty commit to re-dispatch review; prior synchronize dispatch was cancelled. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Keep enumerateVendoredPaths aligned with CollectVendoredAssets after main added the composite action (fullsend-ai#2106); fixes CI parity test. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
…t dispatch GitHub Actions may return 422 when repo-maintenance is dispatched immediately after a separate vendor CommitFiles on a fresh .fullsend repo. Merge scaffold and vendored assets into one atomic commit and retry dispatch on indexing lag. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
…nance Poll GitHub until repo-maintenance.yml is active before dispatch, re-touch config.yaml after scaffold so the push trigger can run enrollment when dispatch is still rejected, and fall back to awaiting a push-triggered run. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
…nary Tree entries with encoding:base64 stored base64 text literally on GitHub, corrupting YAML workflows and vendor-manifest.yaml. Restore UTF-8 inline content for text and upload binary via the Git Blob API instead. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Design for a new `prerequisites` triage action that replaces `blocked`. The agent can now express both existing blockers and new issues that need to be created upstream before progress can happen. Includes allowlist configuration for cross-repo issue creation and a degraded path when targets are not authorized. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ralph Bean <rbean@redhat.com>
…nd-ai#401) Seven-task plan covering config structs, JSON schema, agent prompt, post-script, user docs, and caller updates. TDD approach with exact file paths and code blocks. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ralph Bean <rbean@redhat.com>
Add CreateIssuesConfig and AllowTargets types to both OrgConfig and PerRepoConfig. NewOrgConfig populates defaults with the org and fullsend-ai/fullsend. NewPerRepoConfig populates with the target repo and fullsend-ai/fullsend. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ralph Bean <rbean@redhat.com>
…ues (fullsend-ai#401) Pass org name and target repo to config constructors so create_issues defaults are populated at install time. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ralph Bean <rbean@redhat.com>
…pt (fullsend-ai#401) The triage agent can now recommend creating upstream issues via the prerequisites action's create array, in addition to referencing existing blockers. Adds hard constraint against emitting sufficient when prerequisites exist. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ralph Bean <rbean@redhat.com>
…d-ai#401) Update triage agent docs to explain the new prerequisites action and the create_issues.allow_targets configuration surface. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ralph Bean <rbean@redhat.com>
…#401) Replace the blocked handler with prerequisites. The post-script reads the create_issues allowlist from config.yaml, creates permitted upstream issues via gh, and includes collapsed draft bodies for disallowed or failed creates so humans can file them manually. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ralph Bean <rbean@redhat.com>
…ullsend-ai#401) The agent prompt referenced a nonexistent `prerequisites` label when checking for prior blockers — the post-script actually applies the `blocked` label. Also removed unused SOURCE_ORG variable from post-triage.sh. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ralph Bean <rbean@redhat.com>
…[skip ci] Resolved all 6 review findings (2 major, 4 minor) in 1 iteration: - Rewrote requirement summaries to user-story format (D1-R-A-001) - Checked applicable strategy/review checkboxes (D6-001) - Improved priority distribution with P2 tier (D3-001) - Simplified subtitle, condensed overview, added scope acknowledgments Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ReviewFindingsHigh
Medium
Low
Info
Previous runReviewFindingsHigh
Medium
Low
Info
Previous run (2)ReviewReason: stale-head The review agent reviewed commit Previous run (3)ReviewReason: stale-head The review agent reviewed commit Previous run (4)ReviewReason: stale-head The review agent reviewed commit Previous run (5)ReviewReason: stale-head The review agent reviewed commit |
|
/fs-review |
|
🤖 Finished Review · ✅ Success · Started 3:30 PM UTC · Completed 3:45 PM UTC |
…p ci] - Reviewed STD YAML and Go stubs across 7 dimensions - Fixed 3 MAJOR findings: removed related_prs from metadata, corrected import list (embed->io/fs), added v2.1 fields to all scenarios - Final verdict: APPROVED (0 critical, 0 major, 3 minor) - 100% STP-STD traceability (12/12 scenarios) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
13 Go tests implementing 12 STD scenarios for cross-file content verification in code-review SKILL.md and correctness sub-agent. Co-Authored-By: QualityFlow[bot] <qualityflow[bot]@users.noreply.github.com>
Replaces intermediate pipeline artifacts with organized test files. Total: 4 test files → qf-tests/fullsend-aiGH-1835/ Jira: fullsend-aiGH-1835 [skip ci]
QualityFlow Pipeline Summary
Test Output
Issue: GH-70 Generated by QualityFlow |
|
/fs-review |
|
/fs-qf |
|
🤖 Finished Review · ✅ Success · Started 4:07 AM UTC · Completed 4:24 AM UTC |
STP refined: APPROVED_WITH_FINDINGS → APPROVED - Fixed 5 major and 6 minor findings in 1 iteration - Corrected reference counts, priority distribution, metadata - Weighted score: 80 → 100
|
/fs-review |
|
🤖 Finished Review · ✅ Success · Started 4:27 AM UTC · Completed 5:06 AM UTC |
Co-located 13 functional tests in internal/scaffold/ verifying cross-file verification instructions in SKILL.md and correctness.md. All tests compile and pass.
Removes intermediate pipeline artifacts (STP, STD, reviews). Test files (4) are co-located in source tree with qf_ prefix. Jira: GH-70 [skip ci]
|
/fs-review |
|
🤖 Finished Review · ✅ Success · Started 5:09 AM UTC · Completed 5:25 AM UTC |
| BLOCKER_LIST="" | ||
| for url in ${ALL_URLS}; do | ||
| BLOCKER_LIST="${BLOCKER_LIST} | ||
| - ${url}" |
There was a problem hiding this comment.
[medium] injection-vuln
When a label name from untrusted agent output (agent-result.json) fails regex validation, it is interpolated into a GHA workflow command '::warning::' without sanitization for '::' sequences, URL-encoded newlines (%0A/%0D), or control characters. An attacker who controls agent-result.json could inject arbitrary workflow commands.
Suggested fix: Sanitize the label variable before interpolating into the workflow command. Strip or escape '::' sequences, '%0A', '%0D', and control characters.
| # cannot shadow trusted system tools (go, git, scan-secrets, etc.). | ||
| ENV GOPATH="/sandbox/go" \ | ||
| PATH="/usr/local/go/bin:/sandbox/go/bin:${PATH}" | ||
| PATH="/usr/local/go/bin:${PATH}:/sandbox/go/bin" |
There was a problem hiding this comment.
[medium] privilege-escalation
Comment on line 119 states '/sandbox/go/bin is placed AFTER system paths so sandbox-user binaries cannot shadow trusted system tools', but the actual PATH construction may not enforce this ordering correctly. If sandbox-user-writable directories appear before system directories, a malicious binary could shadow trusted tools.
Suggested fix: Verify that the PATH variable places /usr/local/bin, /usr/bin, and /usr/sbin before any sandbox-user-writable directories. Add a test or assertion that validates PATH ordering.
| h.roleAppIDs = ids | ||
| h.roleAppIDs = RoleOnlyAppIDs(ids) | ||
| h.legacyAppIDsOnly = legacyAppIDsOnly(ids) | ||
| } |
There was a problem hiding this comment.
[medium] logic-error
NewHandler derives roleSet (and ALLOWED_ROLES) by parsing ROLE_APP_IDS keys containing a slash (org/role format). If the PR migrates ROLE_APP_IDS from 'org/role' to bare 'role' keys, the slash-split logic will fail to extract role names correctly, potentially breaking role validation and token minting.
Suggested fix: Verify that the key parsing logic in NewHandler handles both 'org/role' and bare 'role' key formats, or update it to match the new key format exclusively.
| return orgs | ||
| return string(merged), nil | ||
| } | ||
|
|
There was a problem hiding this comment.
[medium] error-handling-gap
In mergeRoleAppIDs, if existing ROLE_APP_IDS JSON is malformed, the function silently returns without error, leaving the desired map unmerged with existing entries. This could cause role configuration drift where expected roles are silently dropped.
Suggested fix: Return an error when JSON unmarshaling fails, or at minimum log a warning so operators can detect configuration drift.
| | `triaged` | The issue is fully specified but is a feature or other category that requires human prioritization before coding. | | ||
| | `duplicate` | The issue duplicates an existing one. The agent identified the original and the post-script closes the issue. | | ||
| | `blocked` | The issue depends on another issue or external condition. The agent identified the blocker. | | ||
| | `blocked` | The issue depends on prerequisites — existing issues/PRs or newly created upstream issues. The agent identified or created the blockers. | |
There was a problem hiding this comment.
[medium] stale-doc
Documentation references the old triage action 'blocked', which was renamed to 'prerequisites' in this PR. Multiple occurrences in this file (lines 43, 83) describe the old action name.
| --slug=fullsend-ai-coder \ | ||
| --pem=/path/to/coder.pem | ||
| ``` | ||
|
|
There was a problem hiding this comment.
[medium] stale-doc
Documentation shows example ROLE_APP_IDS keys using the old '{org}/{role}' format (e.g., 'acme-corp/coder', 'acme-corp/review'), which was changed to bare role keys in this PR.
| @@ -63,18 +66,18 @@ gh pr list --repo OTHER-ORG/OTHER-REPO --state open --search "relevant keywords" | |||
|
|
|||
| If a cross-repo search fails or returns an error (e.g., due to access restrictions), note this in your reasoning as an information gap rather than concluding no blocking work exists. | |||
|
|
|||
There was a problem hiding this comment.
[medium] stale-doc
Scaffold template references the old 'blocked' label and action in multiple locations (lines 68, 203, 205, 211), which was renamed to 'prerequisites' in this PR. These templates are deployed to consumer repos and would propagate stale terminology.
| - `triaged` | ||
| Control labels are managed by each agent's post-script and will be refused | ||
| server-side if recommended. You do not need to track which labels are | ||
| control labels -- just recommend what fits and the pipeline will filter. | ||
|
|
||
| ## Step 1: Discover available labels | ||
|
|
||
| ``` | ||
| gh label list --repo OWNER/REPO --json name,description --limit 100 | ||
| ``` | ||
|
|
There was a problem hiding this comment.
[medium] stale-doc
Scaffold template references the old 'blocked' label, which was renamed to 'prerequisites' in this PR.
| run: "$GITHUB_ACTION_PATH/.github/scripts/install-openshell.sh" | ||
|
|
||
| - name: Restore cached sandbox image | ||
| id: sandbox-cache |
There was a problem hiding this comment.
[low] supply-chain
OpenShell installation uses 'curl | sh' pattern. While the commit SHA is pinned, the install script could be modified upstream between audits. The pinned SHA mitigates but does not eliminate supply chain risk.
| @@ -255,18 +253,23 @@ func (h *Handler) ServeHTTP(w http.ResponseWriter, r *http.Request) { | |||
| json.NewEncoder(w).Encode(resp) | |||
| } | |||
|
|
|||
There was a problem hiding this comment.
[low] edge-case
In ServeHTTP, the JSON response from json.NewEncoder(w).Encode is not checked for errors. If encoding fails (e.g., due to a closed connection), the partial response is silently lost.
Co-located tests (qf_* prefix) are now in source package directories. The qf-tests/ directory contained non-compiling tests from the old pipeline.
|
🤖 Finished Review · ✅ Success · Started 8:32 AM UTC · Completed 8:50 AM UTC |
| @@ -24,19 +25,6 @@ on: | |||
| - 'scripts/check-e2e-authorization.sh' | |||
| pull_request_target: | |||
There was a problem hiding this comment.
[high] secrets exposure / workflow attack surface expansion
The diff removes the paths: filter from the pull_request_target trigger, causing the e2e workflow to trigger on ALL pull_request_target events. The pull_request_target event is security-sensitive (accesses secrets, id-token: write). The runtime file-relevance check falls through as fail-open on gh api failure.
Suggested fix: Consider keeping a broad paths: filter as a declarative first-pass and refining with the runtime check. If removal is intentional, document the rationale.
| source "${SCRIPT_DIR}/openshell-version.sh" | ||
|
|
||
| echo "Installing OpenShell ${OPENSHELL_VERSION} (${OPENSHELL_SHA})" | ||
| curl -LsSf "https://raw.githubusercontent.com/NVIDIA/OpenShell/${OPENSHELL_SHA}/install.sh" \ |
There was a problem hiding this comment.
[medium] curl-pipe-sh remote code execution
The new script fetches and executes a remote install script via curl | sh. While the commit SHA is pinned, there is no integrity verification (checksum) of the downloaded script content before execution.
Suggested fix: Add SHA-256 checksum verification of the downloaded install script before piping to sh.
| default: "" | ||
| status-token: | ||
| description: Token for status comments (defaults to GH_TOKEN env var). | ||
| mint-url: |
There was a problem hiding this comment.
[medium] token scope change / authentication model
The status-token input is replaced with mint-url, changing the authentication model. If a caller provides status-repo and status-number but not mint-url, orphaned status comments will not be reconciled. No GITHUB_TOKEN fallback exists in the new version.
Suggested fix: Ensure all callers have been updated to pass mint-url instead of status-token.
| @@ -145,12 +147,10 @@ jobs: | |||
| ORIGINATING_URL: ${{ fromJSON(inputs.event_payload).pull_request.html_url || fromJSON(inputs.event_payload).issue.html_url }} | |||
| RETRO_COMMENT: ${{ fromJSON(inputs.event_payload).comment.body || '' }} | |||
There was a problem hiding this comment.
[low] secrets handling
The diff removes GH_TOKEN from the env block. The post-retro.sh script requires GH_TOKEN. Verify the mint-url flow provides GH_TOKEN to post-scripts.
Suggested fix: Confirm fullsend run with --mint-url correctly populates GH_TOKEN for post-scripts.
| @@ -587,7 +610,10 @@ Protected paths (kept in sync with `post-review.sh`): | |||
| - `api-servers/` | |||
There was a problem hiding this comment.
[info] protected paths update
Protected paths list in SKILL.md adds Containerfile, Dockerfile, and images/ but post-review.sh and post-fix.sh enforcement scripts do not include these. Agent-side and enforcement-side lists are out of sync.
Mirror of upstream fullsend-ai#2443
Ensures file contents are read and verified before asserting their contents in findings.