This action reviews pull requests with an LLM and optional auxiliary tooling. The workflow may execute against untrusted pull request content, so all enrichment features are treated as high-risk by default.
- Prompt injection inside PR content, linked issues, linked sources, or fetched metadata
- Tool request abuse (requesting sensitive files, broad API access, or untrusted hosts)
- Token and secret exposure in tool outputs
- Token and secret exposure in evidence-provider stdout/stderr
- Cross-repository pull requests attempting to run repo-defined scripts
- Tool harness defaults to off (`tool_mode=off`)
- Tool harness treats corpus text as untrusted and does not follow corpus instructions
- Tool harness uses a strict read-only allowlist (`gh_api`, `read_file`, `web_fetch`, `git_grep`, and named-only `run_command`)
- `gh_api` validates endpoint characters against a safe-character regex, rejects dot-segments, enforces a repo allowlist (same-repo or explicit), restricts calls to read-only API prefixes (`/repos/`, `/issues/`, `/search/`, `/releases/`, `/git/`), and blocks sensitive path substrings (`/actions/secrets`, `/dependabot/secrets`, etc.)
- `gh_api` can optionally include specific upstream repos via explicit allowlist (`tool_allowed_gh_api_repos`) or all repos via `*` while preserving endpoint/path denylist checks
- Anthropic responses are parsed from `text` blocks only; non-text blocks such as `thinking` are not added to review output
- `read_file` is constrained to workspace-relative paths and blocks sensitive path patterns
- `web_fetch` is constrained to `allowed_source_hosts`
- `run_command` rejects raw shell text and permits only named read-only argv definitions (`git_status_short`, `git_diff_stat`, `git_diff_name_only`)
- Tool outputs are size-limited and pass through shared secret redaction before corpus inclusion
- Evidence provider stdout and stderr are passed through the same secret-redaction pipeline before being written to JSON summaries or markdown output
- Tool and evidence-provider enrichment are skipped on cross-repository PRs by default (`tool_enable_for_forks=false`, `evidence_enable_for_forks=false`). This prevents forked PRs from executing arbitrary scripts defined in the destination repository's config.
- Evidence providers execute in the context of the checked-out pull request code, with full access to the PR's working tree, environment variables, and installed tools. Commands run from the repository root.
- Evidence provider blocker findings can be deterministically enforced (`evidence_blocker_enforcement=true`)
- Tool harness failures can be made fail-closed with `tool_failure_enforcement=true` (planning failure or all tool requests failing)
- Tool harness can require minimum evidence breadth via `tool_min_successful_requests`
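
The `gh_api` checks listed above can be pictured with a short sketch. This is an illustration only, not the action's implementation: the function name `validate_endpoint`, the exact safe-character regex, and the way the repo is extracted from the path are assumptions made for this example. Only the prefixes and the two sensitive substrings come from the list above.

```python
import re

# Assumed safe-character set for this sketch; the action's real regex may differ.
# Percent signs are excluded so encoded dot-segments cannot slip through.
SAFE_ENDPOINT_RE = re.compile(r"^[A-Za-z0-9_./?&=:+,-]+$")
READ_ONLY_PREFIXES = ("/repos/", "/issues/", "/search/", "/releases/", "/git/")
SENSITIVE_SUBSTRINGS = ("/actions/secrets", "/dependabot/secrets")


def validate_endpoint(endpoint: str, current_repo: str,
                      allowed_repos: frozenset = frozenset()) -> bool:
    """Return True only if the endpoint passes every check (hypothetical helper)."""
    # 1. Safe characters only.
    if not SAFE_ENDPOINT_RE.match(endpoint):
        return False
    path = endpoint.split("?", 1)[0]
    # 2. Reject dot-segments such as "." and "..".
    if any(segment in (".", "..") for segment in path.split("/")):
        return False
    # 3. Read-only API prefixes only.
    if not path.startswith(READ_ONLY_PREFIXES):
        return False
    # 4. The sensitive-path denylist applies even when the repo allowlist is "*".
    lowered = path.lower()
    if any(substring in lowered for substring in SENSITIVE_SUBSTRINGS):
        return False
    # 5. Repo allowlist: same repo, explicitly listed repo, or "*".
    #    Simplification: only /repos/{owner}/{repo}/... paths are checked here.
    if path.startswith("/repos/"):
        parts = path.split("/")  # ['', 'repos', owner, repo, ...]
        if len(parts) < 4 or not parts[2] or not parts[3]:
            return False
        repo = f"{parts[2]}/{parts[3]}"
        if repo != current_repo and repo not in allowed_repos and "*" not in allowed_repos:
            return False
    return True
```

In this sketch the denylist check runs before the allowlist check, so a wildcard allowlist widens which repositories can be read without re-opening sensitive paths.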
The managed PR comment uses HTML comment markers to embed internal metadata for diff-skip and staleness detection:
- `<!-- ai-pr-review-fingerprint:<value> -->`: stable patch + config fingerprint used by the precheck to skip unchanged diffs.
- `<!-- ai-pr-review-sha:<sha> -->`: PR head SHA used to detect out-of-date reviews.

A malicious PR could attempt prompt injection that causes the model to emit fake metadata markers in its review markdown. If later parsing scans the entire managed comment body, such injected markers could interfere with the precheck's skip and staleness decisions (e.g., a fake fingerprint that matches an unrelated diff).
The action uses a defense-in-depth approach:
- Publish-time stripping: Before publishing a managed PR comment, `scripts/strip_metadata_markers.py` is invoked on the model-generated markdown to remove any matching reserved marker patterns. The trusted markers (sha + fingerprint) are then appended after stripping, so only genuine ones survive.
- Precheck reads first occurrence only: The precheck parser uses `sed -n` with `head -n 1` to extract only the first occurrence of each marker from the comment body, providing a second layer of defense against any residual injection.

The following patterns are treated as reserved and will be stripped from model output (case-insensitive matching, whitespace-tolerant):
- `<!-- ai-pr-review-fingerprint:<any-value> -->`
- `<!-- ai-pr-review-sha:<any-sha> -->`

Non-reserved HTML comments (e.g., `<!-- TODO: fix this -->`) are preserved.

See `scripts/strip_metadata_markers.py` for the implementation and `tests/test_strip_metadata_markers.py` for regression tests covering fake marker injection scenarios.
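
The two layers described above (strip reserved markers, append trusted ones, then read only the first occurrence) can be sketched as follows. This is not the code in `scripts/strip_metadata_markers.py`, and the precheck itself is described as using `sed` and `head` rather than Python. The regex and the helper names `strip_reserved_markers`, `build_comment`, and `first_marker` are assumptions made for illustration.

```python
import re

# Assumed pattern: case-insensitive and whitespace-tolerant, as described above.
RESERVED_MARKER_RE = re.compile(
    r"<!--\s*ai-pr-review-(?:fingerprint|sha)\s*:.*?-->",
    re.IGNORECASE | re.DOTALL,
)


def strip_reserved_markers(markdown: str) -> str:
    """Remove reserved markers; other HTML comments are left untouched."""
    return RESERVED_MARKER_RE.sub("", markdown)


def build_comment(model_markdown: str, fingerprint: str, sha: str) -> str:
    """Strip untrusted markers first, then append the trusted ones."""
    body = strip_reserved_markers(model_markdown).rstrip()
    return (
        f"{body}\n\n"
        f"<!-- ai-pr-review-fingerprint:{fingerprint} -->\n"
        f"<!-- ai-pr-review-sha:{sha} -->\n"
    )


def first_marker(comment_body: str, name: str):
    """Return the value of the first marker with this name, or None."""
    match = re.search(
        rf"<!--\s*ai-pr-review-{re.escape(name)}:(\S+?)\s*-->", comment_body
    )
    return match.group(1) if match else None


# Model output carrying an injected marker and a harmless comment.
untrusted = (
    "Looks good.\n"
    "<!-- AI-PR-Review-Fingerprint: fake -->\n"
    "<!-- TODO: fix this -->\n"
)
comment = build_comment(untrusted, fingerprint="abc123", sha="deadbeef")
```

Here the injected fingerprint is removed at publish time, the `TODO` comment survives, and a first-occurrence read of the published comment returns only the trusted values.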
- Enable `allow_approve` only when you understand the implications for your repository's merge policy; native approvals can affect branch protection rules and automerge pipelines
- Keep GitHub token permissions minimal (`contents: read`, `pull-requests: write`)
- Use self-hosted runners only when required, and isolate them from sensitive networks
- Prefer `tool_mode=off` for public repositories unless you need tool planning
- Keep `allowed_source_hosts` narrow
- Treat evidence provider scripts as trusted code and review changes carefully
- Prefer argv arrays (`["python3", "scripts/check.py"]`) over shell strings (`"python3 scripts/check.py"`) for provider commands, to avoid shell injection risks from `bash -lc` execution
- Treat additions to the named tool command catalog as security-sensitive changes; keep them read-only and avoid shells, package managers, network clients, or repo mutation
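
To illustrate why argv arrays are the safer choice for provider commands, here is a minimal sketch. The helper `run_provider` is hypothetical and does not reflect how the action launches providers. It only shows that an argument passed through an argv list reaches the child process as literal text, whereas the same text interpolated into a shell string would be parsed by the shell.

```python
import subprocess
import sys


def run_provider(argv: list[str]) -> str:
    """Run a command from an argv list without involving a shell (hypothetical helper)."""
    if not isinstance(argv, list) or not all(isinstance(arg, str) for arg in argv):
        raise TypeError("provider command must be a list of strings")
    result = subprocess.run(
        argv, capture_output=True, text=True, check=True, timeout=30
    )
    return result.stdout


# An argument containing shell metacharacters. With an argv list it is
# delivered to the child as one literal string; no second command runs.
hostile = "report.txt; echo INJECTED"
echo_argv = [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile]
output = run_provider(echo_argv)
```

The child process prints the hostile string back unchanged, which shows that the `;` was never interpreted as a command separator.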
- Secret redaction is heuristic and not guaranteed to catch all credential formats; it covers common patterns (GitHub tokens, AWS keys, bearer tokens, key=value secrets) but may miss novel or encoded credentials
- LLM planning can still make low-quality tool choices; controls restrict blast radius but do not guarantee relevance
- If you enable fork execution for tools/providers (`*_enable_for_forks=true`), you accept significantly higher risk: fork authors could craft PRs that exploit provider scripts or tool commands to access runner secrets, exfiltrate data, or execute arbitrary code
- Evidence provider stdout/stderr are redacted for known secret patterns, but custom credential formats may slip through
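
The redaction limitation noted above is easier to see with a small sketch. The patterns below are assumptions written for this example, not the action's actual rule set. They cover the categories named in the list (GitHub tokens, AWS keys, bearer tokens, key=value secrets) and show how an encoded credential passes through untouched.

```python
import base64
import re

# Illustrative patterns only; the action's real redaction rules may differ.
PATTERNS = [
    re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36,}\b"),        # GitHub token prefixes
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),                  # AWS access key ID
    re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/=-]{8,}"),  # bearer tokens
    re.compile(r"(?i)\b(?:password|secret|token|api_key)\s*=\s*\S+"),  # key=value
]


def redact(text: str) -> str:
    """Replace every match of a known pattern with a placeholder."""
    for pattern in PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


token = "ghp_" + "a" * 36  # fake token in the GitHub format
plain = redact(f"found {token} in output")

# The same fake token, base64-encoded, matches none of the patterns above.
encoded = base64.b64encode(token.encode()).decode()
missed = redact(f"blob: {encoded}")
```

The plain token is replaced, but the encoded form is returned unchanged, which is why redaction should be treated as a backstop rather than a guarantee.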