agent-code-reviewer

Multi-agent code review CLI with long-chain reasoning. Five specialist agents (security, performance, style, architecture, test) review code in parallel; a coordinator agent synthesizes their reasoning chains into one prioritized action list. Works with any OpenAI-compatible endpoint — MiMo, OpenAI, DeepSeek, Moonshot, vLLM, Ollama.

Why does this exist?

Single-LLM code review has a recurring failure mode: the model spreads its attention thin and produces shallow, generic feedback. Asking one model to "find security AND performance AND style AND architecture AND test issues" tends to produce six bullet points of mush.

agent-code-reviewer flips the model. Each concern gets its own specialist agent with its own role prompt and its own reasoning chain. The findings come back as structured JSON, and a coordinator agent consumes the full reasoning traces from all five specialists to:

Deduplicate findings that overlap across concerns
Suppress likely false positives (with justification)
Resolve disagreements between specialists explicitly
Rank the surviving issues by (severity × confidence) ÷ fix-effort

The result is a short, actionable list — not a wall of "could-be-an-issue" noise.

How it works

┌──────────────────────────────────────────────────────────────────┐
│                         your source file                          │
└─────────────────────────────────┬────────────────────────────────┘
                                  │
        ┌────────┬────────┬───────┼────────┬────────┐
        ▼        ▼        ▼       ▼        ▼        ▼
   ┌────────┐┌────────┐┌────────┐┌──────────┐┌─────────────┐
   │Security││  Perf  ││ Style  ││Architectr││TestCoverage │   ← parallel, each with
   │ Agent  ││ Agent  ││ Agent  ││  Agent   ││   Agent     │     a 2-stage prompt:
   └───┬────┘└───┬────┘└───┬────┘└────┬─────┘└──────┬──────┘     1) reasoning chain
       │        │        │           │             │             2) JSON findings
       └────────┴────────┴───────────┴─────────────┘
                            │
                            ▼
                  ┌─────────────────────┐
                  │    Coordinator      │   ← reads all 5 reasoning chains,
                  │  (long-chain reason)│     synthesizes, ranks, suppresses
                  └─────────┬───────────┘
                            │
                ┌───────────┼───────────┐
                ▼           ▼           ▼
            terminal      markdown     json
            (rich)        (PR-ready)   (CI-ready)

Two-stage prompting per agent is the key lever. Stage 1 asks the agent to reason out loud (chain-of-thought) without committing to a JSON schema. Stage 2 hands the agent its own reasoning back and asks it to extract structured findings. The reasoning trace also survives into the coordinator's context — so the coordinator can see why a specialist flagged something, not just what it flagged.

Install

git clone https://github.com/m74567437-maker/agent-code-reviewer.git
cd agent-code-reviewer
pip install -e .

Or with pip install -e ".[dev]" to get test dependencies.

Configure

Copy .env.example to .env and fill in:

LLM_API_KEY=sk-...
LLM_BASE_URL=https://api.openai.com/v1   # or any compatible endpoint
LLM_MODEL=gpt-4o-mini

Works out-of-the-box with any provider that ships /v1/chat/completions:

Provider	`LLM_BASE_URL`	Sample `LLM_MODEL`
OpenAI	`https://api.openai.com/v1`	`gpt-4o-mini`
Xiaomi MiMo	(per the MiMo open platform docs)	`mimo-7b`
DeepSeek	`https://api.deepseek.com/v1`	`deepseek-chat`
Moonshot	`https://api.moonshot.cn/v1`	`moonshot-v1-8k`
Ollama	`http://localhost:11434/v1`	`qwen2.5-coder`
vLLM	`http://localhost:8000/v1`	(whatever you served)

Use

# Review a file, print to terminal
agent-review review examples/sample_buggy.py

# Markdown report, ready to paste into a PR
agent-review review src/foo.py -f md > review.md

# JSON for CI integration
agent-review review src/foo.py -f json -o review.json

# Only specific agents
agent-review review src/foo.py --enable security --enable performance

# Stdin
cat src/foo.py | agent-review review --stdin -l python

# Fail the build on high or critical findings
agent-review review src/foo.py --fail-on high

List available agents

$ agent-review agents

Output (terminal)

─────────────────────────── examples/sample_buggy.py (python) ───────────────────────────

Overall severity: HIGH  •  Total findings: 7  •  Tokens: 4318  •  Duration: 6234ms

╭─ Executive Summary ─────────────────────────────────────────────────────╮
│ Two critical issues block deployment: a SQL injection on line 14 and a │
│ hardcoded API key on line 7. Performance is fine for the current scale │
│ but the N+1 pattern on line 22 will bite at >1k records.               │
╰─────────────────────────────────────────────────────────────────────────╯

                          Prioritized Actions
┏━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ # ┃ Severity  ┃ Title                        ┃ Sources                 ┃
┡━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 1 │ CRITICAL  │ SQL injection via raw f-str  │ security                │
│ 2 │ CRITICAL  │ Hardcoded API key            │ security                │
│ 3 │ HIGH      │ N+1 DB query in loop         │ performance, architecture│
│ 4 │ MEDIUM    │ Missing edge cases in tests  │ test_coverage           │
└───┴───────────┴──────────────────────────────┴─────────────────────────┘

Architecture

See docs/ARCHITECTURE.md for the design walkthrough — agent contracts, the two-stage prompting protocol, why the coordinator gets the full reasoning trace, and how to add your own agent.

Add a custom agent

Subclass BaseAgent, implement system_prompt() and reasoning_prompt(), register it:

# my_agent.py
from agent_reviewer.agents.base import BaseAgent, ReviewContext

class DocsAgent(BaseAgent):
    name = "docs"
    category = "documentation"

    def system_prompt(self) -> str:
        return "You are a docs reviewer. Flag missing or misleading docstrings."

    def reasoning_prompt(self, ctx: ReviewContext) -> str:
        return f"Review docs in this {ctx.language} file...\n```\n{ctx.source}\n```"

Then:

from agent_reviewer.agents import AGENT_REGISTRY
AGENT_REGISTRY["docs"] = DocsAgent

Roadmap

Diff-aware mode (--diff HEAD~1) so reviews focus on what actually changed
GitHub Action so the report shows up as a PR comment
Repo-level review across multiple files with cross-file reasoning
Auto-fix mode — feed the suggested fixes back into a Patch Agent
Caching layer keyed on (file content hash, agent, model) to skip re-reviews

License

MIT. See LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
.github/workflows		.github/workflows
docs		docs
examples		examples
src/agent_reviewer		src/agent_reviewer
tests		tests
.env.example		.env.example
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

agent-code-reviewer

Why does this exist?

How it works

Install

Configure

Use

List available agents

Output (terminal)

Architecture

Add a custom agent

Roadmap

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

agent-code-reviewer

Why does this exist?

How it works

Install

Configure

Use

List available agents

Output (terminal)

Architecture

Add a custom agent

Roadmap

License

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages