feat(code-review): add the on-demand diff/PR reviewer skill #4
Timmy-Lane wants to merge 1 commit into
A native compound-v skill — the on-demand counterpart to recheck's in-pipeline gate. Point it at a PR, branch, or working set; it scopes the target, scales review depth to the diff (low→ultra), fans parallel review lenses, confidence-gates findings to keep false positives off the PR, and can post to GitHub (--comment) or apply fixes (--fix). The review stays read-only; --fix is a separate, re-verified phase. Wires into the using-compound-v router (Build group) and the README skills table; grounds its load-bearing claims in references/sources.md (new ## code-review section), incl. the confidence-scored multi-agent architecture from Anthropic's official code-review plugin. Claude-Session: https://claude.ai/code/session_01C5WYe368wXRGA7UAGVDgYL
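The confidence gate mentioned above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the `Finding` record, its field names, and the sample data are hypothetical, and the cutoff of 80 is the tunable value quoted in this PR's grounding table.

```python
from dataclasses import dataclass

# Assumed cutoff: the PR's grounding table describes ~80/100 as a tunable knob.
CONFIDENCE_THRESHOLD = 80


@dataclass(frozen=True)
class Finding:
    """Hypothetical record for one candidate review finding."""
    severity: str    # "Critical", "Important", or "Minor"
    confidence: int  # 0-100 score assigned by a scoring pass
    location: str    # e.g. "path/to/file.ext:12"
    issue: str


def confidence_gate(findings, threshold=CONFIDENCE_THRESHOLD):
    """Keep only findings scored at or above the threshold."""
    return [f for f in findings if f.confidence >= threshold]


# Illustrative data only; none of these findings come from the PR.
candidates = [
    Finding("Critical", 92, "src/app.py:10", "unchecked None dereference"),
    Finding("Minor", 40, "src/app.py:55", "possible naming nit"),
    Finding("Important", 80, "src/db.py:7", "connection never closed"),
]
kept = confidence_gate(candidates)
print([f.location for f in kept])
```

Filtering after scoring keeps the lenses free to over-report; only the gate decides what reaches the PR.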
📝 Walkthrough: a new code-review skill
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ 5 passed
Actionable comments posted: 1
🧹 Nitpick comments (2)
skills/code-review/SKILL.md (2)
71-76: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Add language specifier to fenced code block.
The output format block at lines 71–76 is missing a language identifier on its opening fence (e.g., `text` or `plaintext` after the triple backticks instead of the bare fence). This is required by markdownlint-cli2 (MD040).
📝 Suggested fix
````diff
-```
+```text
 [Critical|Important|Minor] (confidence NN) path/to/file.ext:line
 issue: one sentence — what is wrong
 why: one sentence — the concrete impact / the input that triggers it
 fix: one sentence — what would resolve it
-```
+```
````
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@skills/code-review/SKILL.md` around lines 71 - 76, The fenced code block containing the output format specification (starting with [Critical|Important|Minor] and including the issue, why, and fix fields) is missing a language identifier on the opening fence. Add a language specifier such as "text" or "plaintext" to the opening triple backticks to comply with markdownlint-cli2 rule MD040 (fenced code blocks must have a language identifier).
Source: Linters/SAST tools
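For illustration, the finding layout quoted in the suggested fix could be rendered by a small helper like the one below. This is a sketch under assumptions: the function name and parameters are hypothetical, and the exact indentation of the `issue`/`why`/`fix` lines in `SKILL.md` is not visible here, so the sketch emits them flush left.

```python
ALLOWED_SEVERITIES = ("Critical", "Important", "Minor")


def format_finding(severity, confidence, path, line, issue, why, fix):
    """Render one finding as a header line plus issue/why/fix lines."""
    if severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"unknown severity: {severity!r}")
    header = f"[{severity}] (confidence {confidence}) {path}:{line}"
    return "\n".join([header, f"issue: {issue}", f"why: {why}", f"fix: {fix}"])


# Illustrative values only.
rendered = format_finding(
    "Important", 85, "src/db.py", 7,
    issue="connection is never closed",
    why="each request leaks one socket",
    fix="wrap the connection in a context manager",
)
print(rendered)
```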
63-63: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value
Minor style consistency: "on purpose" → "deliberately".
Line 63 uses "on purpose"; later, at line 78, the same concept is phrased "deliberate design choice". Consider aligning to "deliberately" for consistency.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@skills/code-review/SKILL.md` at line 63, The phrase "on purpose" in the sentence about a change the author clearly made should be replaced with "deliberately" to maintain consistency with the "deliberate design choice" phrasing used later in the same sentence and align with terminology used elsewhere in the document at line 78. Update the wording to read "A change the author clearly made deliberately" instead.
Source: Linters/SAST tools
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@references/sources.md`:
- Around line 42-56: Correct three source attribution and URL errors in the
code-review grounding table. First, update the Anthropic plugin repository URL
in the "Confidence-scored, multi-agent" row from claude-plugins-public to
claude-plugins-official (the correct repository). Second, in the "Clean-context
reviewer catches ~2 bugs/PR" row, either remove the unsupported ~2 bugs/PR
metric since the cited Cognition article does not provide this specific
statistic, or replace it with a source that does contain this data point. Third,
in the "Parallelize the read/analysis lenses; keep any write single-threaded"
row, correct the citation from "Don't Build Multi-Agents" to "Multi-Agents:
What's Actually Working" since that is the article where this specific pattern
is actually discussed and the original article does not contain this claim.
---
Nitpick comments:
In `@skills/code-review/SKILL.md`:
- Around line 71-76: The fenced code block containing the output format
specification (starting with [Critical|Important|Minor] and including the issue,
why, and fix fields) is missing a language identifier on the opening fence. Add
a language specifier such as "text" or "plaintext" to the opening triple
backticks to comply with markdownlint-cli2 rule MD040 (fenced code blocks must
have a language identifier).
- Line 63: The phrase "on purpose" in the sentence about a change the author
clearly made should be replaced with "deliberately" to maintain consistency with
the "deliberate design choice" phrasing used later in the same sentence and
align with terminology used elsewhere in the document at line 78. Update the
wording to read "A change the author clearly made deliberately" instead.
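The MD040 nitpick above can be approximated locally with a short script. This is a simplified stand-in, not markdownlint's own implementation: the function name is hypothetical, and it treats any line that starts with three or more backticks or tildes as a fence, ignoring indented code blocks and other edge cases.

```python
FENCE = "`" * 3  # built programmatically so this example nests safely in a fence


def fences_missing_language(markdown_text):
    """Return 1-based line numbers of opening fences with no info string."""
    missing = []
    open_fence = None  # (fence character, run length) while inside a block
    for number, line in enumerate(markdown_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped.startswith((FENCE, "~~~")):
            continue
        char = stripped[0]
        run = len(stripped) - len(stripped.lstrip(char))
        info = stripped[run:].strip()
        if open_fence is None:
            open_fence = (char, run)
            if not info:
                missing.append(number)
        elif char == open_fence[0] and run >= open_fence[1] and not info:
            open_fence = None  # a matching bare fence closes the block
    return missing


sample = "\n".join([
    "Intro",
    FENCE,            # line 2: opening fence without a language
    "[Critical] ...",
    FENCE,            # line 4: closing fence
    FENCE + "text",   # line 5: opening fence with a language
    "ok",
    FENCE,            # line 7: closing fence
])
print(fences_missing_language(sample))  # [2]
```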
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 31b6cd5e-fa45-43f5-9a3e-c3ec6c6687e8
📒 Files selected for processing (4)
- README.md
- references/sources.md
- skills/code-review/SKILL.md
- skills/using-compound-v/SKILL.md
## code-review

| Claim (short) | skill:line | Category | Source / note |
|---|---|---|---|
| **Confidence-scored, multi-agent** review — fan parallel review lenses across a PR/diff, then score each candidate finding and filter the low-confidence ones to keep false positives off the PR | `code-review` (Steps 3-4) | PRIMARY | Anthropic Claude Code — official `code-review` plugin (`claude-plugins-official` marketplace): "Automated code review for pull requests using multiple specialized agents with confidence-based scoring to filter false positives." https://github.com/anthropics/claude-plugins-public/tree/main/plugins/code-review |
| Confidence filter at **~80 / 100** | `code-review` (Step 4) | JUDGMENT-CALL (recipe-knob) | The threshold is a tunable knob (the official plugin filters below 80); the *mechanism* (confidence-gate to drop false positives) is the grounded part above. No separate citation needed. |
| Effort scale **low / medium / high / max / ultra** maps review depth to diff size; route *down* when unsure | `code-review` (Step 2) | JUDGMENT-CALL (recipe-knob) | The kit's tier law applied to a review — owning skill is `using-compound-v` (anti-overkill, `references/sources.md` → using-compound-v). Mirrors the depth tiers surfaced by Claude Code's own `/code-review` effort levels. |
| Clean-context reviewer catches **~2 bugs/PR, most severe** | `code-review` (Step 3) | PRIMARY | Cognition, "Multi-Agents: What's Actually Working" — https://cognition.ai/blog/multi-agents-working. Same source as the recheck table; reused here for the on-demand reviewer. |
| Parallelize the read/analysis lenses; keep any **write single-threaded** | `code-review` (Step 3) | PRIMARY | Walden Yan, Cognition, "Don't Build Multi-Agents" — https://cognition.ai/blog/dont-build-multi-agents. Same as `ai-system-reliability:30`. |
| Reviewer stays **read-only**; `--fix` is a separate, re-verified apply phase | `code-review` (Output) | PRIMARY | Same read-only-reviewer grounding as `recheck:19` (production reviewers mutate nothing); the bug a reviewer introduces is the one nobody reviews. |
| Don't flag **intentional changes** / no **extra rigor** beyond the surrounding code | `code-review` (Step 4) | PRIMARY | OpenAI Codex CLI review prompt (`codex-rs/core/review_prompt.md`), same rows as `recheck:68`. |
| GitHub comments: brief, no emojis, cite file+line with a permalink | `code-review` (Output) | JUDGMENT-CALL | Output-format recipe; matches the official `code-review` plugin's comment conventions (cited above). No separate citation needed. |

---
🎯 Functional Correctness | 🔴 Critical
🧩 Analysis chain
🌐 Web query:
Does the URL https://github.com/anthropics/claude-plugins-public/tree/main/plugins/code-review exist and describe a code-review plugin from Anthropic?
💡 Result:
The URL https://github.com/anthropics/claude-plugins-public/tree/main/plugins/code-review does not exist [1][2][3]. Anthropic maintains a repository called claude-plugins-official, which contains the official code-review plugin [1][4][5]. The code-review plugin is a legitimate tool developed by Anthropic [6][7]. It is an automated code review solution for pull requests that utilizes multiple specialized agents working in parallel to audit changes, check for compliance with repository guidelines (such as CLAUDE.md), detect bugs, and analyze historical context via git blame [1][8][9]. It employs a confidence-based scoring system (0-100) to filter out potential false positives, typically only posting feedback that meets a configurable high-confidence threshold (default 80) [8][9][7]. The plugin is integrated into and available through the Claude Code environment [1][8][7].
Citations:
- 1: https://github.com/anthropics/claude-plugins-official/tree/main/plugins/code-review
- 2: https://github.com/anthropics/claude-plugins-public
- 3: https://github.com/anthropics/claude-plugins-official
- 4: https://github.com/anthropics/claude-plugins-official/blob/main/plugins/code-review/commands/code-review.md
- 5: https://github.com/anthropics/claude-plugins-official/tree/main/plugins
- 6: https://github.com/anthropics/claude-plugins-official/blob/main/.claude-plugin/marketplace.json
- 7: https://github.com/anthropics/claude-code/blob/main/plugins/code-review/README.md
- 8: https://github.com/anthropics/claude-code/blob/2923bc87/plugins/code-review/README.md
- 9: https://claude.com/plugins/code-review
🌐 Web query:
Does the URL https://cognition.ai/blog/multi-agents-working exist and contain analysis about how many bugs clean-context reviewers catch?
💡 Result:
The URL https://cognition.ai/blog/multi-agents-working exists and is titled "Multi-Agents: What's Actually Working" [1]. The blog post mentions that "a clean-context reviewer catches bugs the coder can't see" [1][2]. However, it does not provide specific data or statistics regarding exactly how many bugs such reviewers catch [1][2]. The post discusses the qualitative advantages of a clean-context reviewer, such as the ability to reason backward from an implementation without the original agent's accumulated context, which can help overcome issues related to context rot and help the agent identify nuances that were previously overlooked [2].
Citations:
- 1: https://cognition.ai/blog/multi-agents-working
- 2: https://cognition.com/blog/multi-agents-working
🌐 Web query:
Does the URL https://cognition.ai/blog/dont-build-multi-agents exist and discuss parallelizing read/analysis while keeping writes single-threaded?
💡 Result:
Yes, the URL https://cognition.ai/blog/dont-build-multi-agents exists [1]. It is an article published by Cognition that argues against the use of multi-agent architectures due to issues with context-passing and decision-making fragmentation [1][2]. Regarding the discussion of parallelizing read/analysis while keeping writes single-threaded, this specific topic is addressed in a follow-up post by Cognition titled "Multi-Agents: What's Actually Working" [3]. In that article, the author notes that while their original warnings about multi-agent systems hold true for "parallel-writer swarms," they have found success with "a narrower class of patterns... where multiple agents contribute intelligence to a task while writes stay single-threaded" [3]. The post explains that this approach allows for parallelized intelligence-gathering—such as using subagents for review or analysis—without the coordination risks associated with allowing multiple agents to act or write concurrently [3].
Citations:
- 1: https://cognition.ai/blog/dont-build-multi-agents
- 2: https://agentic-ai.readthedocs.io/en/latest/ContextEngineering/devin/
- 3: https://cognition.ai/blog/multi-agents-working
Fix incorrect repository URL and misattributed claims.
The grounding table contains three critical issues:
- Anthropic plugin repository: Line 44 cites `https://github.com/anthropics/claude-plugins-public/tree/main/plugins/code-review`, but the repository is `claude-plugins-official`, not `claude-plugins-public`. Correct URL: `https://github.com/anthropics/claude-plugins-official/tree/main/plugins/code-review`
- "~2 bugs/PR" statistic unsupported: Line 50 claims the Cognition article "Multi-Agents: What's Actually Working" contains data on how many bugs clean-context reviewers catch. The article discusses qualitatively that reviewers "catch bugs the coder can't see" but does not provide the specific ~2 bugs/PR metric. This claim needs either a different source or removal.
- Parallel read/single-threaded write pattern misattributed: Line 53 cites "Don't Build Multi-Agents" for the claim about parallelizing read/analysis while keeping writes single-threaded. This specific pattern is discussed in the follow-up article "Multi-Agents: What's Actually Working," not the original "Don't Build Multi-Agents" article. Update the citation.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@references/sources.md` around lines 42 - 56, Correct three source attribution
and URL errors in the code-review grounding table. First, update the Anthropic
plugin repository URL in the "Confidence-scored, multi-agent" row from
claude-plugins-public to claude-plugins-official (the correct repository).
Second, in the "Clean-context reviewer catches ~2 bugs/PR" row, either remove
the unsupported ~2 bugs/PR metric since the cited Cognition article does not
provide this specific statistic, or replace it with a source that does contain
this data point. Third, in the "Parallelize the read/analysis lenses; keep any
write single-threaded" row, correct the citation from "Don't Build Multi-Agents"
to "Multi-Agents: What's Actually Working" since that is the article where this
specific pattern is actually discussed and the original article does not contain
this claim.
What
Adds `compound-v:code-review`, a native skill for the on-demand review case and the counterpart to `recheck`'s in-pipeline gate. `code-review` is already a declared keyword in `plugin.json`; this fills it.
You point it at a PR, a branch, or your uncommitted diff and it:
- scales review depth to the diff (`low → ultra`), routing down when unsure: the kit's anti-overkill tier law applied to a review;
- […] `high`+, fans parallel review lenses (conventions, diff-bugs, git history, prior PRs, inline comments);
- […] `code-review` plugin;
- can post to GitHub (`--comment`) or apply the fixes (`--fix`).
Boundary with `recheck` (why this isn't a duplicate)
`recheck` = the in-pipeline gate: the fixed read-only pass an implementer batch hands off to before the next batch. `code-review` = the on-demand reviewer you fire at a change directly: it scopes its own target, scales depth, confidence-gates, and reaches out to GitHub.
The review stays read-only (a reviewer that edits ships its own unreviewed bug); `--fix` is a separate, re-verified apply phase.
Changes
- `skills/code-review/SKILL.md` (new, 95 lines, under the 250 target).
- `skills/using-compound-v/SKILL.md` + `README.md`: wired into the Build group with the recheck boundary stated.
- `references/sources.md`: new `## code-review` section grounding the load-bearing claims (confidence-scored multi-agent architecture → Anthropic's public `claude-plugins-public/code-review`; clean-context-reviewer and single-threaded-writes reuse the existing Cognition rows; FP calibration reuses the Codex review-prompt rows).
Validation
- `bash scripts/check.sh`: my additions pass all four checks (name=dir, line budget, no leak patterns, all cross-refs resolve, no `@path` links).
- `check.sh` currently fails on a pre-existing `BAD_GUIDE` reference in `references/sources.md:445,447` (the `## BAD_GUIDE harvest additions` section, committed on `main` before this branch). It is not introduced by this PR and is untouched by my diff.
Provenance
The `/code-review` capability this is modeled on is a built-in Claude Code command (compiled into the CLI); there was no file to copy, so this is authored natively in compound-v's format/voice rather than lifting any proprietary prompt.
https://claude.ai/code/session_01C5WYe368wXRGA7UAGVDgYL
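The split this PR description draws between parallel read-only lenses and a separate, single-threaded apply phase might look roughly like the sketch below. Everything here is an assumption for illustration: the two lenses, their names, and the string-based "fix" are hypothetical stand-ins, and the re-verification step the description mentions is not modeled.

```python
from concurrent.futures import ThreadPoolExecutor


def todo_lens(diff):
    """Hypothetical lens: flag added lines that contain a TODO marker."""
    return [f"TODO left in: {l[1:].strip()}" for l in diff.splitlines()
            if l.startswith("+") and "TODO" in l]


def debug_print_lens(diff):
    """Hypothetical lens: flag added lines that call print()."""
    return [f"debug print in: {l[1:].strip()}" for l in diff.splitlines()
            if l.startswith("+") and "print(" in l]


def run_review(diff, lenses):
    """Run the read-only lenses in parallel; nothing here mutates the diff."""
    with ThreadPoolExecutor(max_workers=len(lenses)) as pool:
        results = pool.map(lambda lens: lens(diff), lenses)  # keeps lens order
        return [finding for per_lens in results for finding in per_lens]


def apply_fixes(findings):
    """Separate write phase: handle findings one at a time, in order."""
    applied = []
    for finding in findings:  # deliberately sequential, never parallel
        applied.append(f"fixed: {finding}")
    return applied


diff = "+x = 1  # TODO remove\n+print(x)\n-old = 2\n"
findings = run_review(diff, [todo_lens, debug_print_lens])
print(findings)
print(apply_fixes(findings))
```

Because the lenses only read the diff, running them concurrently cannot produce conflicting edits; the only code that writes runs in a single ordered loop.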