Pentest Agent Suite for Claude Code

Autonomous bug-bounty framework for Claude Code and 6 other AI coding tools — 48 agents, 26 commands, 19 CLI tools, 2 MCP servers.

~300 files · 49k+ lines · 48 agents · 26 commands · 19 CLI tools · 6 skills · 2 MCP servers (16 bug-bounty platforms + BYO writeup search) · 2,047 payload lines

A complete bug-bounty framework: a battle-tested hunting methodology with concrete payloads, 7-Question Gate validation, autonomous hunt loops, A→B exploit chain building, a persistent brain with endpoint tracking, optional semantic writeup search (bring your own index), automatic cost tracking via Claude Code (CC) hooks, live platform integration, and a cross-IDE installer that emits the native format for Claude Code, Codex, Gemini, Cursor, Windsurf, VS Code Copilot, and OpenClaw.

Quick Start

# MCP servers are launched via `uv run --with mcp` — no global pip install required.
export HACKERONE_USERNAME=you HACKERONE_TOKEN=your_token
uv run python3 tools/scaffold.py hackerone tesla --type web-app
cd ~/bounties/hackerone-tesla && claude
/model opus             # Opus 4.6 [1M] — subagents inherit via model: "inherit"
/sync hackerone tesla
/brain init && /status
/hunt tesla.com

Install (Claude Code + 6 other AI coding tools)

pentest-agents ships a cross-IDE installer that emits each target's native format — agents, skills, commands, rules, and MCP configuration — so the same framework works everywhere.

# From a clone:
python3 -m tools.installer install --targets all --scope project

PyPI distribution is a work in progress. uv build produces a working wheel, but the installed CLI currently resolves source files relative to a repo clone layout (.claude/agents, .claude/skills, skills/, rules/, rules/payloads.md, mcp-*-server/). Running the CLI through pipx install or uvx pentest-agents executes without error but installs an empty manifest. Until this is fixed, run the installer from a clone.

Target	Agents	Commands	Rules	MCP	Scopes
Claude Code	native `.claude/agents/*.md`	`.claude/skills/<name>/SKILL.md`	`CLAUDE.md`	`.mcp.json` / `~/.claude.json`	global + project
OpenAI Codex	native `.codex/agents/*.toml`	`.codex/commands/*.md`	`AGENTS.md`	`[mcp_servers.*]` in `config.toml`	global + project
Google Gemini	native `.gemini/agents/*.md`	TOML in `.gemini/commands/`	`GEMINI.md`	`mcpServers` in `settings.json`	global + project
Cursor	→ Skills	→ Skills	`.cursor/rules/*.mdc` + `AGENTS.md`	`.cursor/mcp.json`	global + project
Windsurf	→ Skills	Workflows	`.windsurf/rules/*.md` (≤12K / file)	`~/.codeium/windsurf/mcp_config.json`	global + project
VS Code Copilot	`.github/agents/*.agent.md`	`.github/prompts/*.prompt.md`	`.github/copilot-instructions.md` + `.github/instructions/*`	`.vscode/mcp.json`	project + global-MCP
OpenClaw	→ Skills	→ Skills	`~/.openclaw/workspace/AGENTS.md` or `<proj>/AGENTS.md`	`mcp.servers` in `~/.openclaw/openclaw.json`	global + project (skills/rules only; MCP is user-level)

Cursor, Windsurf, and OpenClaw have no native subagent concept, so Claude-format agents are rendered as Skills for those three (the closest analogue). Where a target supports it, the rule digest is written as a single canonical AGENTS.md-compatible file.

OpenClaw specifics (verified against docs.openclaw.ai, April 2026): skills install into ~/.openclaw/skills/<name>/SKILL.md (global) or <project>/.agents/skills/<name>/SKILL.md (project — AgentSkills convention). MCP is always wired into the user-level ~/.openclaw/openclaw.json under mcp.servers.*; project-scope installs emit a warning reminding you to run --scope global once if you need the MCP servers.

Management:

pentest-agents list                      # detect which targets are installed
pentest-agents install --targets claude_code,codex --scope global
pentest-agents install --dry-run         # preview every file + JSON merge
pentest-agents verify                    # check manifest vs. disk (drift)
pentest-agents uninstall                 # reverse, restore .pa-backup files

Every install records a manifest (.pentest-agents/manifest.json for project scope, ~/.config/pentest-agents/manifest.json for global). Uninstall only removes files we wrote and surgically strips only the MCP/JSON keys we merged — your other settings are never touched. Conflicting writes back up the original as <path>.pa-backup and are restored on uninstall.
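The manifest-driven uninstall described above can be sketched as follows. This is a minimal illustration, not the installer's actual code: the manifest layout (a merged_keys list), the mcpServers top-level key, and the function names are all assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path


def merge_mcp_server(config_path: Path, name: str, entry: dict, manifest: dict) -> None:
    """Add one MCP server entry and record the merged key in the manifest."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("mcpServers", {})[name] = entry
    config_path.write_text(json.dumps(config, indent=2))
    # Hypothetical manifest layout: one record per key we merged.
    manifest.setdefault("merged_keys", []).append(
        {"file": str(config_path), "key": name}
    )


def strip_merged_keys(manifest: dict) -> None:
    """Remove only the keys recorded in the manifest; leave all other settings."""
    for record in manifest.get("merged_keys", []):
        path = Path(record["file"])
        if not path.exists():
            continue
        config = json.loads(path.read_text())
        config.get("mcpServers", {}).pop(record["key"], None)
        path.write_text(json.dumps(config, indent=2))


def demo() -> dict:
    """Merge into a config that already has a user entry, then uninstall."""
    with tempfile.TemporaryDirectory() as tmp:
        cfg = Path(tmp) / "mcp.json"
        cfg.write_text(json.dumps({"mcpServers": {"user-server": {"command": "x"}}}))
        manifest: dict = {}
        merge_mcp_server(cfg, "bounty-platforms", {"command": "uv"}, manifest)
        strip_merged_keys(manifest)
        return json.loads(cfg.read_text())
```

Calling demo() leaves the pre-existing user-server entry in place, which is the property the uninstall step is meant to guarantee.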

Workflow

New program:   /new → /sync → /brain init → /analyze → /surface → /hunt
Returning:     /resume <target> → /hunt or /autopilot
After finding: /validate → /chain → /report → /dupcheck → /submit → /learn
Batch triage:  /triage (7-Question Gate on all findings)

MCP Servers (2)

bounty-platforms (16 platforms)

HackerOne (full API), Bugcrowd, Intigriti, Immunefi (public), YesWeHack + 11 stubs. 7 MCP tools: list_platforms, get_program_scope, get_program_policy, search_hacktivity, sync_program, draft_report, submit_report.

writeup-search (BYO index)

Searchable knowledge base agents query during hunting and validation. 4 MCP tools:

search_writeups — semantic search (FAISS) or keyword search for prior art
get_writeup — full writeup content by ID
search_techniques — exploitation techniques by vuln class
search_payloads — curated payloads from rules/payloads.md

The writeup index is not bundled. Bulk-redistributing scraped hacktivity violates most platform ToS, so this repo ships the server only. The search_payloads + search_techniques fallback works out of the box; the semantic/keyword layers activate once you point the server at your own index.

Three search modes (auto-detected, graceful fallback):

Mode	Requires	Searches
FAISS (semantic)	`faiss-cpu`, `sentence-transformers`, your `metadata.db` + `index.faiss`	Your writeup corpus via vector embeddings
SQLite (keyword)	Your `metadata.db` only	Your writeup corpus via `LIKE` over the text column
Local (default)	Nothing — zero deps	`rules/payloads.md` + `skills/` shipped in this repo

Point the server at your index by dropping metadata.db (+ optionally index.faiss) into ~/.local/share/pentest-writeups/, or set WRITEUP_DB_DIR=/path/to/dir.
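The auto-detection with graceful fallback can be pictured roughly like this. It is a sketch under assumptions: the function names and the exact precedence are illustrative, and the real server may check more conditions. Only the file names, the default directory, and the WRITEUP_DB_DIR variable come from the text above.

```python
import importlib.util
import os
from pathlib import Path

DEFAULT_DIR = Path.home() / ".local" / "share" / "pentest-writeups"


def faiss_stack_available() -> bool:
    """True when both optional packages are importable (without importing them)."""
    return all(
        importlib.util.find_spec(module) is not None
        for module in ("faiss", "sentence_transformers")
    )


def detect_mode(db_dir: Path, have_faiss: bool) -> str:
    """Pick the richest mode whose requirements are met, else fall back."""
    has_db = (db_dir / "metadata.db").is_file()
    has_index = (db_dir / "index.faiss").is_file()
    if has_db and has_index and have_faiss:
        return "faiss"   # semantic search over your corpus
    if has_db:
        return "sqlite"  # keyword search over your corpus
    return "local"       # shipped payloads and skills only


def current_mode() -> str:
    """Resolve the index directory from the environment, then detect the mode."""
    db_dir = Path(os.environ.get("WRITEUP_DB_DIR", str(DEFAULT_DIR)))
    return detect_mode(db_dir, faiss_stack_available())
```

Passing have_faiss in as a parameter keeps the decision logic testable on machines where the optional packages are not installed.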

Expected schema (metadata.db): a SQLite file with at least one table containing columns id, title, url, and one text column (content / text / body / writeup). Row order in the table must match vector order in index.faiss when using semantic mode.
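As a concrete illustration of that schema, the sketch below builds a tiny in-memory database and runs a LIKE keyword query over it. The table name writeups, the choice of content as the text column, and the two placeholder rows are assumptions for the example; the server only requires the columns listed above.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with the minimum expected columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE writeups ("
        "id INTEGER PRIMARY KEY, title TEXT, url TEXT, content TEXT)"
    )
    # Placeholder rows, invented purely for illustration.
    conn.executemany(
        "INSERT INTO writeups (id, title, url, content) VALUES (?, ?, ?, ?)",
        [
            (1, "SSRF via PDF renderer", "https://example.com/1",
             "The renderer fetched internal URLs."),
            (2, "Stored XSS in profile", "https://example.com/2",
             "The bio field was not escaped."),
        ],
    )
    return conn


def keyword_search(conn: sqlite3.Connection, term: str, limit: int = 5) -> list:
    """Keyword mode: a parameterized LIKE over the text column."""
    rows = conn.execute(
        "SELECT id, title, url FROM writeups "
        "WHERE content LIKE ? ORDER BY id LIMIT ?",
        (f"%{term}%", limit),
    )
    return rows.fetchall()
```

For semantic mode, inserting rows in the same order the vectors are added to index.faiss is what keeps a vector position aligned with its row.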

Build your own index — `rag-builder/`

The repo now ships a local RAG/FAISS builder under rag-builder/ that turns a list of GitHub / GitLab repositories into a metadata.db + index.faiss pair the writeup-search MCP server consumes. Destructive operations (clone, embed, write) are always gated behind --execute — running the CLI without it prints the plan and changes nothing, so you can never wipe an existing index by accident.

cd rag-builder

# 1. Inspect the plan — no network, no writes.
python3 build.py status
python3 build.py ingest                    # dry-run (the default)

# 2. Opt-in pre-flight: probe every URL with `git ls-remote` (network).
python3 build.py ingest --check-remotes    # ~5s for 141 repos at 16 workers

# 3. Actually clone + index every repo from repos.yaml into ./data/.
python3 build.py ingest --execute
python3 build.py ingest --execute --check-remotes   # skip unreachable first

# 4. Point the MCP server at the output.
export WRITEUP_DB_DIR="$PWD/data"
python3 ../mcp-writeup-server/server.py --test

rag-builder/repos.yaml ships with a 146-entry seed covering CTF archives, bug-bounty reports, payload collections, and research aggregators — edit freely. repos-skipped.yaml is loaded automatically as an exclusion list (override with --skip-list or --no-skip-list). config.yaml controls the embedding model (all-MiniLM-L6-v2 by default), host allowlist, clone size cap, and file-size ceiling. See rag-builder/README.md for the full reference.
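The dry-run-by-default behaviour behind --execute follows a common CLI pattern, sketched below with argparse. This is not the code in build.py: the parser layout and the plan_ingest helper are hypothetical, and only the ingest subcommand and the --execute and --check-remotes flags are taken from the commands above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser where destructive work is opt-in via --execute."""
    parser = argparse.ArgumentParser(prog="build.py")
    sub = parser.add_subparsers(dest="command", required=True)
    ingest = sub.add_parser("ingest")
    ingest.add_argument("--execute", action="store_true",
                        help="actually clone, embed, and write (default: dry run)")
    ingest.add_argument("--check-remotes", action="store_true",
                        help="probe each URL before ingesting")
    return parser


def plan_ingest(repos: list[str], execute: bool) -> list[str]:
    """Return the action list; the real work would run only when execute is True."""
    verb = "clone+index" if execute else "would clone+index"
    return [f"{verb} {url}" for url in repos]
```

Because store_true defaults to False, forgetting the flag yields a plan rather than a write, which is the safer failure mode.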

CC Hooks (automatic cost tracking)

The hooks are configured in settings.json and fire automatically:

SubagentStop → cost_hook.py logs agent name + session to cost-tracking.json
Stop → logs session end
SessionStart → welcome message

The statusline shows the live cost computed from session token data, for example $0.57.
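A hook of this kind typically receives a JSON payload and appends a record to a log file. The sketch below shows the general shape only; it is not the shipped cost_hook.py. The hook_event_name and session_id keys are assumed to be present in the payload, agent_name is a hypothetical field, and the log is assumed to be a JSON list.

```python
import json
import time
from pathlib import Path


def record_event(payload: dict, log_path: Path) -> dict:
    """Append one hook event to a JSON list stored at log_path."""
    entry = {
        "event": payload.get("hook_event_name", "unknown"),
        "session": payload.get("session_id", "unknown"),
        # Hypothetical field name; the real payload key may differ.
        "agent": payload.get("agent_name", "unknown"),
        "ts": time.time(),
    }
    log = json.loads(log_path.read_text()) if log_path.exists() else []
    log.append(entry)
    log_path.write_text(json.dumps(log, indent=2))
    return entry
```

A hook script would call record_event(json.load(sys.stdin), Path("cost-tracking.json")) and exit, assuming the payload arrives as JSON on standard input.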

Commands (26)

Hunting & Analysis

Command	Description
`/hunt <target> [--vuln-class]`	Active hunting — searches writeup DB for techniques first, then tests with concrete payloads
`/autopilot <target>`	Autonomous loop with --paranoid/--normal/--yolo checkpoints
`/surface <target>`	P1/P2/Kill ranked attack surface
`/chain`	Build A→B→C exploit chains (12 patterns, 6 high-value templates)
`/analyze <target>`	AI analysis: crown jewels, attack paths, blind spots
`/mindmap <target>`	Attack surface tree with brain status
`/sast <repo>`	Source-code vulnerability hunting (entry → flow → gap → exploit pipeline)

Validation & Reporting

Command	Description
`/validate <finding>`	7-Question Gate → PASS/KILL/DOWNGRADE/CHAIN REQUIRED
`/triage`	Batch-validate ALL findings, kill weak ones
`/quality <draft>`	Score report 1-10 (blocks below 7)
`/report [format]`	Reports (hard gate: requires /validate PASS)
`/dupcheck <desc>`	Hacktivity + writeup DB for duplicates
`/submit <finding>`	Submit (hard gate: /validate PASS + /quality ≥ 7)
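The two hard gates in the table reduce to simple predicates over a finding's recorded state. The sketch below is one way to express them; the Finding dataclass and its field names are hypothetical, while the verdict strings and the threshold of 7 come from the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    """Hypothetical record of what /validate and /quality produced."""
    title: str
    validate_verdict: Optional[str] = None  # PASS, KILL, DOWNGRADE, CHAIN REQUIRED
    quality_score: Optional[int] = None     # 1-10, set by /quality


def can_report(finding: Finding) -> bool:
    """/report gate: requires a /validate PASS."""
    return finding.validate_verdict == "PASS"


def can_submit(finding: Finding) -> bool:
    """/submit gate: /validate PASS and /quality score of at least 7."""
    return (
        can_report(finding)
        and finding.quality_score is not None
        and finding.quality_score >= 7
    )
```

A finding that was never validated or never scored fails the gate by default, so skipping a step cannot unlock submission.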

Session & Memory

Command	Description
`/resume <target>`	Resume — untested endpoints + suggestions
`/remember`	Log finding/pattern for cross-target learning
`/learn <id> <status>`	Record response — auto-boosts paid techniques
`/brain`	init, brief, status, endpoint, endpoints, record, exhausted

Infrastructure

Command	Description
`/new`, `/sync`, `/status`	Setup + dashboard
`/pipeline`, `/quickscan`, `/fullscan`	Scanning pipelines
`/correlate`	Chain discovery across findings
`/cost`, `/monitor`	Cost tracking, target change detection

Agents (48)

H1 Weakness Specialists (17)

xss-hunter (#60/#61/#62), sqli-hunter (#67), csrf-hunter (#57), ssrf-hunter (#75), ssti-hunter (#74), idor-hunter (#55), auth-tester (#27), info-disclosure (#18), open-redirect (#38), rce-hunter (#70), xxe-hunter (#63), file-upload (#39), cors-hunter (#58), subdomain-takeover (#145), business-logic (#28), race-condition (#29), privilege-escalation (#26)

Hunting & Analysis (3)

validator — 7-Question Gate + never-submit list (PASS/KILL/DOWNGRADE/CHAIN)
chain-builder — A→B chain table, searches writeup DB for proven chains
recon-ranker — P1/P2/Kill surface ranking

Infrastructure / Recon (10)

recon, vuln-scanner, config-auditor, cloud-recon, js-analyzer, waf-profiler, graphql-audit, nuclei-writer, browser-agent (Burp MCP), browser-stealth-agent (Camoufox)

Meta / Validation (9)

brain, correlator, quality-check, monitor, poc-builder, report-writer, scope-check, browser-verifier (client-side PoC proof), dast-devils-advocate (adversarial downgrade)

SAST Pipeline (8)

sast-file-ranker, sast-entry-mapper, sast-danger-mapper, sast-flow-tracer, sast-gap-analyzer, sast-devils-advocate, sast-hunter, sast-exploit-builder

Specialized (1)

web3-auditor — Solidity grep arsenal, Foundry PoC, DeFi patterns

CLI Tools (19)

Tool	Purpose
brain.py	Brain with endpoint tracking + circuit breaker
intel_engine.py	Hacktivity patterns + tech→vuln mapping
journal.py	JSONL session journal for /resume
target_selector.py	Program ROI ranking
cost_hook.py	CC hook: auto-logs agent completions via SubagentStop
statusline.py	Dashboard (--compact/--watch/--json)
scope_check.py	Scope validation with --list
scope_hook.py	PreToolUse hook: blocks out-of-scope Bash commands (exact + wildcard)
cvss_version_guard.py	Enforces H1 = CVSS 3.1, other platforms = CVSS 4.0
file_path_guard.py	Blocks hallucinated file paths in reports
file_safety.py	Shared safety checks for agent-written files
dedup_findings.py	Dedup + hacktivity cross-reference
global_brain.py	Cross-engagement knowledge (incremental hash-based sync)
response_tracker.py	Response learning + auto-boost paid techniques
scaffold.py	Workspace scaffolding with update mode
capture.py	Screenshots + video (WSL2)
cost.py	Token cost tracking + ROI
camofox_ctl.sh	Camoufox (stealth Firefox) lifecycle — Cloudflare/Akamai bypass
pentest-statusline.sh	CC statusline: findings, brain, context, cost
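To make the exact + wildcard matching in scope_hook.py more concrete, here is a simplified sketch. It is not the shipped hook: it takes the scope patterns as a plain list instead of reading scope.yaml, it only inspects URL-shaped tokens (a bare hostname such as an nmap argument would slip through), and all function names are assumptions.

```python
import shlex
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def host_in_scope(host: str, patterns: list[str]) -> bool:
    """Exact or wildcard match, e.g. 'tesla.com' or '*.tesla.com'."""
    host = host.lower().rstrip(".")
    return any(fnmatchcase(host, pattern.lower()) for pattern in patterns)


def hosts_in_command(command: str) -> list[str]:
    """Collect hostnames from URL-shaped tokens in a shell command."""
    hosts = []
    for token in shlex.split(command):
        if "://" in token:
            name = urlsplit(token).hostname
            if name:
                hosts.append(name)
    return hosts


def command_allowed(command: str, patterns: list[str]) -> bool:
    """Allow the command only if every detected host is in scope."""
    return all(host_in_scope(h, patterns) for h in hosts_in_command(command))
```

With patterns ["tesla.com", "*.tesla.com"], a curl against https://shop.tesla.com/api is allowed and one against any other domain is rejected; a PreToolUse hook would turn a rejection into a blocked tool call.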

Rules Library (`rules/`)

Single source of truth for every agent — all hunters, validators, and report-writers read the relevant files at session start.

File	Lines	Purpose
`hunting.md`	—	23 hunting rules (Rule 0 harm check, Rule 8 sibling check, Rule 9 A→B signal, Rule 19 never-submit)
`payloads.md`	2,047	XSS / SSRF / SQLi / IDOR / OAuth / upload / race / SSTI / deser / JWT / LFI / prototype pollution / NoSQLi / DeFi
`techniques.md`	389	Proven attack techniques extracted from real paid engagements
`waf-bypass-protocol.md`	166	WAF bypass iteration ladder for Akamai/Cloudflare/Imperva
`vendor-status.md`	126	Patched vendor vectors, framework fingerprints, cooldown tables
`chain-table.md`	—	Capability→next-bug chain table for `/chain`
`never-submit.md`	—	Never-submit list + conditionally-valid-with-chain table
`mistakes.md`	—	Top 10 most common mistakes — every agent reads this at session start

Key Features

Writeup search MCP: Agents query prior art during hunting — bring your own FAISS/SQLite writeup index, or fall back to the shipped payload/technique library
CC hooks: SubagentStop/Stop auto-log costs, statusline shows live $X.XX from token data
PreToolUse scope hook: Bash commands are matched (exact + wildcard) against scope.yaml; out-of-scope targets are blocked before the tool call fires
7-Question Gate: Every finding validated — first NO = KILL
Depth Engine: /autopilot enforces an anti-shallow protocol — no claim of "exhausted" until the exhaustion matrix is complete
Stacked-encoding mandate: /hunt and /autopilot require multi-layer encoding in every payload attempt before declaring a surface clean
CVSS policy guard: HackerOne findings use CVSS 3.1; every other platform uses CVSS 4.0 — enforced by cvss_version_guard.py
Circuit breaker: 5× consecutive 403/429 → auto-backoff 60s
Endpoint tracking: Brain records every endpoint tested per target
Hard validation gates: /report and /submit refuse without /validate PASS
Never-submit filter: Pipeline auto-kills informational findings
Incremental sync: Global brain hash-based, skips unchanged files
Feedback loop: /learn auto-boosts paid techniques globally
Session journal: JSONL log for /resume continuity
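The circuit breaker in the list above (five consecutive 403/429 responses trigger a 60-second backoff) can be sketched as a small state machine. This is an illustration rather than the code in brain.py; the class and method names are hypothetical, and the caller supplies the clock so the logic stays deterministic.

```python
class CircuitBreaker:
    """Back off after a run of consecutive 403/429 responses."""

    def __init__(self, threshold: int = 5, backoff_seconds: float = 60.0) -> None:
        self.threshold = threshold
        self.backoff_seconds = backoff_seconds
        self.consecutive = 0
        self.blocked_until = 0.0

    def record(self, status: int, now: float) -> None:
        """Feed one HTTP status; trip the breaker when the run hits the threshold."""
        if status in (403, 429):
            self.consecutive += 1
            if self.consecutive >= self.threshold:
                self.blocked_until = now + self.backoff_seconds
                self.consecutive = 0
        else:
            # Any other status breaks the run.
            self.consecutive = 0

    def wait_needed(self, now: float) -> float:
        """Seconds the caller should still sleep before the next request."""
        return max(0.0, self.blocked_until - now)
```

A request loop would call wait_needed(time.monotonic()) before each request, sleep for the returned duration if it is non-zero, and call record() with each response status.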

Requirements

Python 3.10+, uv (MCP servers launch via uv run --with mcp)
Optional: uv pip install faiss-cpu sentence-transformers (for writeup semantic search)
Security tools: nmap, httpx, subfinder, nuclei, ffuf, katana, sqlmap
GraphQL hunter tools: graphql-path-enum — cargo install --git https://gitlab.com/dee-see/graphql-path-enum (auto-installed by setup-mcp.sh if cargo is present)
Evidence: grim/scrot, wf-recorder/ffmpeg
jq (for statusline)

License

For authorized security testing only. Follow responsible disclosure.
