feat(code-review): add the on-demand diff/PR reviewer skill #4

Open
Timmy-Lane wants to merge 1 commit into LeventySeven:main from Timmy-Lane:add-code-review-skill

Timmy-Lane commented Jun 23, 2026

What

Adds compound-v:code-review — a native skill for the on-demand review case, the counterpart to recheck's in-pipeline gate. code-review is already a declared keyword in plugin.json; this fills it.

You point it at a PR, a branch, or your uncommitted diff and it:

  • scopes the target and runs a cheap eligibility check (skip drafts/bots/already-reviewed);
  • scales review depth to the diff (low → ultra), routing down when unsure — the kit's anti-overkill tier law applied to a review;
  • at high+, fans parallel review lenses (conventions, diff-bugs, git history, prior PRs, inline comments);
  • confidence-gates findings (drop below ~80) so false positives don't ship — the one genuinely good idea borrowed from Anthropic's official code-review plugin;
  • returns the kit's standard severity-tagged findings + verdict, and can post to GitHub (--comment) or apply the fixes (--fix).
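
The confidence gate in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the skill's actual implementation: the `Finding` dataclass, its field names, and `gate_findings` are hypothetical, and the threshold of 80 is taken from the "~80" figure in this description.

```python
from dataclasses import dataclass

# Assumed threshold: the description says findings below ~80 are dropped.
CONFIDENCE_THRESHOLD = 80


@dataclass(frozen=True)
class Finding:
    """Hypothetical shape for one candidate review finding."""
    severity: str    # "Critical", "Important", or "Minor"
    confidence: int  # 0-100 score assigned during review
    location: str    # e.g. "path/to/file.ext:12"
    issue: str


def gate_findings(findings, threshold=CONFIDENCE_THRESHOLD):
    """Keep only findings whose confidence meets the threshold."""
    return [f for f in findings if f.confidence >= threshold]


# Illustrative candidates; the paths and issues are made up.
candidates = [
    Finding("Critical", 92, "src/app.py:10", "unchecked None dereference"),
    Finding("Minor", 55, "src/app.py:42", "possible naming inconsistency"),
    Finding("Important", 80, "src/util.py:7", "off-by-one in slice bound"),
]
kept = gate_findings(candidates)
```

Here the 55-confidence finding is dropped and the other two survive. Whether a score exactly at the threshold is kept is a choice this sketch makes; the description only says "below ~80" is dropped.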

Boundary with recheck (why this isn't a duplicate)

  • recheck = the in-pipeline gate — the fixed read-only pass an implementer batch hands off to before the next batch.
  • code-review = the on-demand reviewer you fire at a change directly: it scopes its own target, scales depth, confidence-gates, and reaches out to GitHub.

The review stays read-only (a reviewer that edits ships its own unreviewed bug); --fix is a separate, re-verified apply phase.

Changes

  • skills/code-review/SKILL.md (new, 95 lines — under the 250 target).
  • skills/using-compound-v/SKILL.md + README.md — wired into the Build group with the recheck boundary stated.
  • references/sources.md — new ## code-review section grounding the load-bearing claims (confidence-scored multi-agent architecture → Anthropic's public claude-plugins-public/code-review; clean-context-reviewer and single-threaded-writes reuse the existing Cognition rows; FP calibration reuses the Codex review-prompt rows).

Validation

  • bash scripts/check.sh: my additions pass all of its checks (name=dir, line budget, no leak patterns, all cross-refs resolve, no @path links).
  • Heads-up: check.sh currently fails on a pre-existing BAD_GUIDE reference in references/sources.md:445,447 (the ## BAD_GUIDE harvest additions section, committed on main before this branch). It is not introduced by this PR and is untouched by my diff.

Provenance

The /code-review capability this is modeled on is a built-in Claude Code command (compiled into the CLI) — there was no file to copy, so this is authored natively in compound-v's format/voice rather than lifting any proprietary prompt.

https://claude.ai/code/session_01C5WYe368wXRGA7UAGVDgYL

Summary by CodeRabbit

  • New Features

    • Added code-review skill for on-demand PR and branch reviews with configurable depth levels (low/medium/high/max/ultra).
  • Documentation

    • Added code-review skill documentation including usage guidelines, review criteria, confidence-gated findings, and two-phase apply model.
    • Updated skill references to include code-review as an available Build skill.

A native compound-v skill — the on-demand counterpart to recheck's
in-pipeline gate. Point it at a PR, branch, or working set; it scopes
the target, scales review depth to the diff (low→ultra), fans parallel
review lenses, confidence-gates findings to keep false positives off the
PR, and can post to GitHub (--comment) or apply fixes (--fix). The review
stays read-only; --fix is a separate, re-verified phase.

Wires into the using-compound-v router (Build group) and the README
skills table; grounds its load-bearing claims in references/sources.md
(new ## code-review section), incl. the confidence-scored multi-agent
architecture from Anthropic's official code-review plugin.

Claude-Session: https://claude.ai/code/session_01C5WYe368wXRGA7UAGVDgYL

coderabbitai Bot commented Jun 23, 2026

Walkthrough

A new code-review skill is added as a 95-line SKILL.md defining an on-demand, confidence-gated multi-lens review workflow with depth tiers, a two-phase --comment/--fix model, and red flags. The skill is registered in README.md, skills/using-compound-v/SKILL.md, and grounded with source claims in references/sources.md.
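
The depth tiers mentioned above could be routed along these lines. This is only a sketch: the tier names come from the PR (low/medium/high/max/ultra), but the line-count cut-offs, `pick_tier`, and `runs_parallel_lenses` are hypothetical, since the PR does not state how the skill measures a diff.

```python
TIERS = ["low", "medium", "high", "max", "ultra"]

# Hypothetical upper bounds on changed lines per tier (not from the PR).
TIER_LIMITS = [(50, "low"), (200, "medium"), (800, "high"), (3000, "max")]


def pick_tier(changed_lines, unsure=False):
    """Map diff size to a review tier, routing one tier down when unsure."""
    tier = "ultra"  # anything above the largest limit
    for limit, name in TIER_LIMITS:
        if changed_lines <= limit:
            tier = name
            break
    if unsure:
        # "Route down when unsure": step to the next cheaper tier, if any.
        tier = TIERS[max(TIERS.index(tier) - 1, 0)]
    return tier


def runs_parallel_lenses(tier):
    """Per the PR description, parallel lenses fan out at 'high' and above."""
    return TIERS.index(tier) >= TIERS.index("high")
```

For example, `pick_tier(500)` lands on `"high"` under these assumed limits, while `pick_tier(500, unsure=True)` drops to `"medium"` and therefore skips the parallel lenses.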

Changes

code-review skill

| Layer | File(s) | Summary |
|---|---|---|
| Skill concept, when-to-use, and diff scoping | `skills/code-review/SKILL.md` | Adds frontmatter, on-demand vs. in-pipeline distinction, when-to-use rules, and Step 1 diff scoping with eligibility checks and house-rule reading. |
| Depth selection, lenses, confidence gating, output, post/fix, and red flags | `skills/code-review/SKILL.md` | Defines Step 2 depth tier mapping, Step 3 parallel lenses fan-out, Step 4 confidence gating/severity calibration, output format with verdict fields, the two-phase --comment/--fix model, and red flags. |
| Skill registration and source grounding | `README.md`, `skills/using-compound-v/SKILL.md`, `references/sources.md` | Adds code-review to the Build row in both skill tables and inserts the grounding claims table in sources.md. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐇 A new lens to sniff through the diff,
With confidence scores and a fix-phase cliff,
I read all the lenses in parallel hops,
Gate below eighty and know when to stop,
Post comments first, then apply the fix neat —
The warren of code is now tidy and sweet! 🌿

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding a new code-review skill to the compound-v framework. It is concise, uses standard commit convention, and directly matches the primary objective. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
skills/code-review/SKILL.md (2)

71-76: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add language specifier to fenced code block.

The output format block at lines 71–76 is missing a language identifier (e.g., ```text or ```plaintext instead of just ```). This is required by markdownlint-cli2 (MD040).

📝 Suggested fix
-```
+```text
 [Critical|Important|Minor] (confidence NN) path/to/file.ext:line
   issue: one sentence — what is wrong
   why:   one sentence — the concrete impact / the input that triggers it
   fix:   one sentence — what would resolve it
-```
+```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/code-review/SKILL.md` around lines 71 - 76, The fenced code block
containing the output format specification (starting with
[Critical|Important|Minor] and including the issue, why, and fix fields) is
missing a language identifier on the opening fence. Add a language specifier
such as "text" or "plaintext" to the opening triple backticks to comply with
markdownlint-cli2 rule MD040 (fenced code blocks must have a language
identifier).

Source: Linters/SAST tools
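
A rough way to spot the condition MD040 flags is to scan for opening fences that carry no info string. The sketch below is an approximation for illustration, not markdownlint's implementation; `fences_missing_language` is a hypothetical helper, and it handles only plain fenced blocks (not fences nested inside lists or blockquotes).

```python
import re

# An opening or closing fence: up to 3 spaces of indent, then 3+ backticks or tildes.
FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})(.*)$")


def fences_missing_language(markdown_text):
    """Return 1-based line numbers of opening fences with no language."""
    missing = []
    open_fence = None  # (fence character, fence length) while inside a block
    for lineno, line in enumerate(markdown_text.splitlines(), start=1):
        match = FENCE.match(line)
        if not match:
            continue
        marker, info = match.group(1), match.group(2).strip()
        if open_fence is None:
            # Opening fence: record it and check for a language identifier.
            open_fence = (marker[0], len(marker))
            if not info:
                missing.append(lineno)
        elif (marker[0] == open_fence[0]
              and len(marker) >= open_fence[1]
              and not info):
            # Closing fence: same character, at least as long, no info string.
            open_fence = None
    return missing
```

Running it over `skills/code-review/SKILL.md` before and after the suggested change would be one way to confirm the fix, assuming the file has no fence constructs beyond what this sketch handles.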


63-63: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Minor style consistency: "on purpose" → "deliberately".

Line 63 uses "on purpose"; later, at line 78, the same concept is phrased "deliberate design choice". Consider aligning to "deliberately" for consistency.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/code-review/SKILL.md` at line 63, The phrase "on purpose" in the
sentence about a change the author clearly made should be replaced with
"deliberately" to maintain consistency with the "deliberate design choice"
phrasing used later in the same sentence and align with terminology used
elsewhere in the document at line 78. Update the wording to read "A change the
author clearly made deliberately" instead.

Source: Linters/SAST tools

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 31b6cd5e-fa45-43f5-9a3e-c3ec6c6687e8

📥 Commits

Reviewing files that changed from the base of the PR and between f8d1276 and acbde7e.

📒 Files selected for processing (4)
  • README.md
  • references/sources.md
  • skills/code-review/SKILL.md
  • skills/using-compound-v/SKILL.md

Comment thread references/sources.md
Comment on lines +42 to +56
## code-review

| Claim (short) | skill:line | Category | Source / note |
|---|---|---|---|
| **Confidence-scored, multi-agent** review — fan parallel review lenses across a PR/diff, then score each candidate finding and filter the low-confidence ones to keep false positives off the PR | `code-review` (Steps 3-4) | PRIMARY | Anthropic Claude Code — official `code-review` plugin (`claude-plugins-official` marketplace): "Automated code review for pull requests using multiple specialized agents with confidence-based scoring to filter false positives." https://github.com/anthropics/claude-plugins-public/tree/main/plugins/code-review |
| Confidence filter at **~80 / 100** | `code-review` (Step 4) | JUDGMENT-CALL (recipe-knob) | The threshold is a tunable knob (the official plugin filters below 80); the *mechanism* (confidence-gate to drop false positives) is the grounded part above. No separate citation needed. |
| Effort scale **low / medium / high / max / ultra** maps review depth to diff size; route *down* when unsure | `code-review` (Step 2) | JUDGMENT-CALL (recipe-knob) | The kit's tier law applied to a review — owning skill is `using-compound-v` (anti-overkill, `references/sources.md` → using-compound-v). Mirrors the depth tiers surfaced by Claude Code's own `/code-review` effort levels. |
| Clean-context reviewer catches **~2 bugs/PR, most severe** | `code-review` (Step 3) | PRIMARY | Cognition, "Multi-Agents: What's Actually Working" — https://cognition.ai/blog/multi-agents-working. Same source as the recheck table; reused here for the on-demand reviewer. |
| Parallelize the read/analysis lenses; keep any **write single-threaded** | `code-review` (Step 3) | PRIMARY | Walden Yan, Cognition, "Don't Build Multi-Agents" — https://cognition.ai/blog/dont-build-multi-agents. Same as `ai-system-reliability:30`. |
| Reviewer stays **read-only**; `--fix` is a separate, re-verified apply phase | `code-review` (Output) | PRIMARY | Same read-only-reviewer grounding as `recheck:19` (production reviewers mutate nothing); the bug a reviewer introduces is the one nobody reviews. |
| Don't flag **intentional changes** / no **extra rigor** beyond the surrounding code | `code-review` (Step 4) | PRIMARY | OpenAI Codex CLI review prompt (`codex-rs/core/review_prompt.md`), same rows as `recheck:68`. |
| GitHub comments: brief, no emojis, cite file+line with a permalink | `code-review` (Output) | JUDGMENT-CALL | Output-format recipe; matches the official `code-review` plugin's comment conventions (cited above). No separate citation needed. |

---

🎯 Functional Correctness | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Does the URL https://github.com/anthropics/claude-plugins-public/tree/main/plugins/code-review exist and describe a code-review plugin from Anthropic?

💡 Result:

The URL https://github.com/anthropics/claude-plugins-public/tree/main/plugins/code-review does not exist [1][2][3]. Anthropic maintains a repository called claude-plugins-official, which contains the official code-review plugin [1][4][5]. The code-review plugin is a legitimate tool developed by Anthropic [6][7]. It is an automated code review solution for pull requests that utilizes multiple specialized agents working in parallel to audit changes, check for compliance with repository guidelines (such as CLAUDE.md), detect bugs, and analyze historical context via git blame [1][8][9]. It employs a confidence-based scoring system (0-100) to filter out potential false positives, typically only posting feedback that meets a configurable high-confidence threshold (default 80) [8][9][7]. The plugin is integrated into and available through the Claude Code environment [1][8][7].

🌐 Web query:

Does the URL https://cognition.ai/blog/multi-agents-working exist and contain analysis about how many bugs clean-context reviewers catch?

💡 Result:

The URL https://cognition.ai/blog/multi-agents-working exists and is titled "Multi-Agents: What's Actually Working" [1]. The blog post mentions that "a clean-context reviewer catches bugs the coder can't see" [1][2]. However, it does not provide specific data or statistics regarding exactly how many bugs such reviewers catch [1][2]. The post discusses the qualitative advantages of a clean-context reviewer, such as the ability to reason backward from an implementation without the original agent's accumulated context, which can help overcome issues related to context rot and help the agent identify nuances that were previously overlooked [2].

🌐 Web query:

Does the URL https://cognition.ai/blog/dont-build-multi-agents exist and discuss parallelizing read/analysis while keeping writes single-threaded?

💡 Result:

Yes, the URL https://cognition.ai/blog/dont-build-multi-agents exists [1]. It is an article published by Cognition that argues against the use of multi-agent architectures due to issues with context-passing and decision-making fragmentation [1][2]. Regarding the discussion of parallelizing read/analysis while keeping writes single-threaded, this specific topic is addressed in a follow-up post by Cognition titled "Multi-Agents: What's Actually Working" [3]. In that article, the author notes that while their original warnings about multi-agent systems hold true for "parallel-writer swarms," they have found success with "a narrower class of patterns... where multiple agents contribute intelligence to a task while writes stay single-threaded" [3]. The post explains that this approach allows for parallelized intelligence-gathering—such as using subagents for review or analysis—without the coordination risks associated with allowing multiple agents to act or write concurrently [3].

Fix incorrect repository URL and misattributed claims.

The grounding table contains three critical issues:

  1. Anthropic plugin repository: Line 44 cites https://github.com/anthropics/claude-plugins-public/tree/main/plugins/code-review but the repository is claude-plugins-official, not claude-plugins-public. Correct URL: https://github.com/anthropics/claude-plugins-official/tree/main/plugins/code-review

  2. "~2 bugs/PR" statistic unsupported: Line 50 claims the Cognition article "Multi-Agents: What's Actually Working" contains data on how many bugs clean-context reviewers catch. The article discusses qualitatively that reviewers "catch bugs the coder can't see" but does not provide the specific ~2 bugs/PR metric. This claim needs either a different source or removal.

  3. Parallel read/single-threaded write pattern misattributed: Line 53 cites "Don't Build Multi-Agents" for the claim about parallelizing read/analysis while keeping writes single-threaded. This specific pattern is discussed in the follow-up article "Multi-Agents: What's Actually Working," not the original "Don't Build Multi-Agents" article. Update the citation.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@references/sources.md` around lines 42 - 56, Correct three source attribution
and URL errors in the code-review grounding table. First, update the Anthropic
plugin repository URL in the "Confidence-scored, multi-agent" row from
claude-plugins-public to claude-plugins-official (the correct repository).
Second, in the "Clean-context reviewer catches ~2 bugs/PR" row, either remove
the unsupported ~2 bugs/PR metric since the cited Cognition article does not
provide this specific statistic, or replace it with a source that does contain
this data point. Third, in the "Parallelize the read/analysis lenses; keep any
write single-threaded" row, correct the citation from "Don't Build Multi-Agents"
to "Multi-Agents: What's Actually Working" since that is the article where this
specific pattern is actually discussed and the original article does not contain
this claim.
