Peppy - Codebase Indexing Plugin for Claude Code

Peppy is a powerful codebase indexing and search plugin designed to work seamlessly with Claude Code. It provides efficient code navigation, symbol search, and intelligent grep capabilities across your entire codebase.

Features

🚀 Fast Indexing: Quickly index large codebases with intelligent caching
🔍 Smart Search: Find functions, classes, variables, and more across multiple languages
🌳 Tree-sitter Parsing: Accurate code understanding using tree-sitter parsers
📁 Gitignore Support: Respects .gitignore patterns automatically
🔌 MCP Integration: Works as an MCP server for Claude Code

Supported Languages

Python
JavaScript/TypeScript
Go
Rust
Java
And more coming soon!

Installation

pip install -e .

Usage with Claude Code

Add to your Claude Code MCP settings:

{
  "mcpServers": {
    "peppy": {
      "command": "python",
      "args": ["-m", "peppy.server"]
    }
  }
}

Available Tools

1. `index_codebase`

Index a directory to enable fast searching.

Parameters:

path (string): Path to the codebase root
force_reindex (boolean, optional): Force re-indexing even if cache exists

2. `search_symbols`

Search for code symbols (functions, classes, variables) across the indexed codebase.

Parameters:

query (string): Search query (supports regex)
symbol_type (string, optional): Filter by type (function, class, variable, etc.)
file_pattern (string, optional): Filter by file pattern (e.g., "*.py")

3. `grep_code`

Perform efficient grep search across the codebase.

Parameters:

pattern (string): Search pattern (regex supported)
file_pattern (string, optional): File glob pattern
context_lines (integer, optional): Number of context lines to show

4. `get_file_symbols`

Get all symbols defined in a specific file.

Parameters:

file_path (string): Path to the file

5. `get_statistics`

Get statistics about an indexed codebase.

Parameters:

codebase_path (string): Path to the indexed codebase

6. `clear_cache`

Clear the index cache.

Parameters:

codebase_path (string, optional): Path to specific codebase, or omit to clear all

📊 Token Savings & Performance

Peppy dramatically reduces token usage when working with Claude Code by providing targeted, indexed search capabilities instead of brute-force file reading.

Real-World Token Comparison

Scenario 1: Finding a Function Definition

Without Peppy:

1. User: "Find the process_data function"
2. Claude: Glob for *.py files → 50 files found
3. Claude: Read file1.py → 2,000 tokens
4. Claude: Read file2.py → 3,500 tokens
5. Claude: Read file3.py → 1,800 tokens
6. Claude: Found in file3.py!

Total tokens: ~7,300
Time: Multiple API calls

With Peppy:

1. User: "Find the process_data function"
2. Claude: search_symbols(query="process_data") → file3.py:45
3. Claude: Read file3.py:40-60 (just the function)

Total tokens: ~500
Time: Single API call

💰 Savings: 93% fewer tokens (6,800 tokens saved)

Scenario 2: Exploring Codebase Structure

Without Peppy:

1. Read directory structure → 500 tokens
2. Read multiple __init__.py files → 2,000 tokens
3. Read example files to understand → 4,000 tokens
4. Read configuration files → 1,500 tokens

Total tokens: ~8,000

With Peppy:

1. get_statistics(codebase_path) → Complete overview
2. search_symbols(query="class") → All classes listed

Total tokens: ~800

💰 Savings: 90% fewer tokens (7,200 tokens saved)

Scenario 3: Finding All TODO Comments

Without Peppy:

1. Glob for all source files → 100+ files
2. Read each file to search for TODO → 50,000+ tokens
3. Filter and summarize results

Total tokens: ~50,000+

With Peppy:

1. grep_code(pattern="TODO", context_lines=2)

Total tokens: ~1,500

💰 Savings: 97% fewer tokens (48,500 tokens saved)

Performance Benchmarks

Operation	Codebase Size	Without Peppy	With Peppy	Savings
Find function	500 files	~10,000 tokens	~600 tokens	94%
List all classes	500 files	~15,000 tokens	~800 tokens	95%
Grep pattern	500 files	~50,000 tokens	~2,000 tokens	96%
Explore structure	500 files	~12,000 tokens	~700 tokens	94%

Cache Benefits

Once indexed, searches are instant across sessions:

First index: One-time cost (~2-5k tokens for medium codebase)
All future searches: Near-zero overhead (~100-500 tokens per query)
Cache persistence: Index survives across conversations

ROI: After 2-3 searches, Peppy pays for itself in token savings!

🎯 Optimization Tips

Maximize your token savings with these best practices:

1. Index Once, Search Many

✅ DO: Index at the start of a session
❌ DON'T: Re-index for every query

# At session start:
index_codebase(path="/path/to/project")

# Then search freely:
search_symbols(query="MyClass")
search_symbols(query="process_.*", use_regex=true)
grep_code(pattern="TODO")

2. Use Specific Queries

✅ DO: Use precise search terms
search_symbols(query="AuthService", symbol_type="class")

❌ DON'T: Use overly broad searches
search_symbols(query=".*")  # Returns everything!

3. Leverage File Patterns

✅ DO: Filter by file type when you know the context
search_symbols(query="handler", file_pattern="*.py")
grep_code(pattern="error", file_pattern="src/**/*.ts")

❌ DON'T: Search all files when you only need specific types

4. Control Context Lines

✅ DO: Request minimal context when you just need location
grep_code(pattern="FIXME", context_lines=0)  # Just the line

✅ DO: Request context when you need understanding
grep_code(pattern="error_handler", context_lines=3)  # See usage

❌ DON'T: Always use max context (wastes tokens)

5. Use Appropriate Tools

✅ DO: Use search_symbols for finding definitions
search_symbols(query="User")  # Find User class/function

✅ DO: Use grep_code for finding usage patterns
grep_code(pattern="User\(")  # Find User instantiation

❌ DON'T: Use grep_code when search_symbols is better
grep_code(pattern="class User")  # Inefficient!

6. Limit Results

✅ DO: Use max_results to control output
grep_code(pattern="import", max_results=20)

❌ DON'T: Return thousands of results
grep_code(pattern=".")  # Returns everything!

7. Check Statistics First

✅ DO: Use get_statistics to understand the codebase
get_statistics(codebase_path="/project")
# Shows: 500 files, 2000 symbols, breakdown by type

Then search intelligently based on what exists

8. Cache Management

✅ DO: Keep caches for active projects
# Cache persists across sessions automatically

✅ DO: Clear cache when codebase changes significantly
clear_cache(codebase_path="/project")
index_codebase(path="/project")  # Re-index

❌ DON'T: Clear cache unnecessarily (wastes re-indexing tokens)

📚 Tool Usage Guide & Best Practices

Workflow Examples

Starting a New Codebase Exploration

Step 1: Index the codebase
→ index_codebase(path="/path/to/project")
  ℹ️ Cost: ~2,000 tokens (one-time)

Step 2: Get overview
→ get_statistics(codebase_path="/path/to/project")
  ℹ️ Cost: ~300 tokens
  ℹ️ Returns: File counts, symbol types, extensions

Step 3: Explore key symbols
→ search_symbols(query=".*Service$", symbol_type="class")
  ℹ️ Cost: ~400 tokens
  ℹ️ Finds: All service classes

Step 4: Deep dive on specific files
→ get_file_symbols(file_path="src/auth.py")
  ℹ️ Cost: ~200 tokens
  ℹ️ Lists: All functions/classes in that file

Debugging Workflow

Step 1: Find error handling
→ grep_code(pattern="raise \w+Error", context_lines=2, file_pattern="*.py")
  ℹ️ Shows: All error raises with context

Step 2: Find specific exception class
→ search_symbols(query="ValidationError", symbol_type="class")
  ℹ️ Shows: Exact definition location

Step 3: Find all usages
→ grep_code(pattern="ValidationError", file_pattern="*.py", max_results=50)
  ℹ️ Shows: Where it's used across codebase

Refactoring Workflow

Step 1: Find all references to old function
→ grep_code(pattern="old_function_name\(", max_results=100)
  ℹ️ Lists: All call sites

Step 2: Find the definition
→ search_symbols(query="old_function_name", symbol_type="function")
  ℹ️ Shows: Where it's defined

Step 3: Search for similar patterns
→ search_symbols(query="old_.*", use_regex=true)
  ℹ️ Finds: Related functions that might need updating

Code Review Workflow

Step 1: Find all TODOs
→ grep_code(pattern="TODO|FIXME", context_lines=1, max_results=50)

Step 2: Check for test coverage
→ search_symbols(query="test_.*", symbol_type="function", file_pattern="*test*.py")

Step 3: Find security-sensitive functions
→ grep_code(pattern="password|secret|token", context_lines=2)

Anti-Patterns to Avoid

❌ Reading files before searching

# Bad: Wastes tokens
Read all files → Then search manually

# Good: Search first
search_symbols(query="target") → Get exact location → Read only that file

❌ Repeating searches

# Bad: Same search multiple times
search_symbols(query="User")
# ... later ...
search_symbols(query="User")  # Same query again!

# Good: Save results in conversation context
# Claude remembers previous search results

❌ Over-indexing

# Bad: Index everything including node_modules
index_codebase(path="/project")  # Contains node_modules!

# Good: Index only source code
index_codebase(path="/project/src")

❌ Ignoring symbol types

# Bad: Search everything
search_symbols(query="process")  # Returns functions, variables, classes...

# Good: Be specific
search_symbols(query="process", symbol_type="function")

Token Budget Guidelines

For a typical coding session with Peppy:

Activity	Token Budget	Frequency
Initial indexing	2,000-5,000	Once per project
Statistics check	200-500	1-2 times
Symbol searches	300-600 each	5-10 times
Grep searches	500-2,000 each	3-5 times
File symbol listing	200-400 each	2-4 times

Total session: ~10,000-15,000 tokens with Peppy Same session without Peppy: ~50,000-100,000 tokens

💰 Net savings: 70-85% fewer tokens per session

Pro Tips

Chain searches efficiently

# Use statistics to guide your searches
get_statistics() → See what exists → Search specifically

Combine with Claude's memory

# Claude remembers index results, so reference them later
"Earlier you found 5 User classes. Show me the one in auth.py"

Use regex for power searches

search_symbols(query="handle_.*_request", use_regex=true)
grep_code(pattern="def (test_|spec_)", use_regex=true)

Incremental exploration

# Start broad, narrow down
get_statistics() → search_symbols() → get_file_symbols() → Read file

🔗 Integration with Siggy

Peppy works seamlessly with Siggy, a workflow orchestration plugin for complex coding tasks. Together, they provide:

70% faster planning - Peppy's indexed search accelerates Siggy's research phase
83% more efficient execution - Precise location finding without reading multiple files
65-75% overall token savings - Dramatic reduction in API costs for complex workflows

Quick Integration (Plugin Configuration)

Enable Peppy as a plugin in your Siggy configuration:

# .siggy.yml
plugins:
  peppy:
    enabled: true
    auto_index: true
    use_enhanced_agents: true

That's it! Siggy will automatically:

Use Peppy-enhanced agents for planning and execution
Auto-index your codebase on session start
Provide 65-75% token savings on workflows

Manual Setup (Alternative)

# 1. Install both plugins
pip install -e .  # Peppy
# Install Siggy per their docs

# 2. Configure Peppy as MCP server (see above)

# 3. Copy enhanced agents to Siggy (optional)
cp integrations/siggy/agents/* /path/to/siggy/.siggy/agents/

# 4. Configure Siggy to use them
# Edit .siggy.yml:
agents:
  planner: .siggy/agents/peppy-planner.md
  executor: .siggy/agents/peppy-executor.md

Token Savings Example

Without Peppy:

Planning: ~8,000 tokens
Execution: ~12,000 tokens
Verification: ~3,000 tokens
Total: ~23,000 tokens

With Peppy:

Indexing (one-time): ~2,000 tokens
Planning: ~2,500 tokens
Execution: ~2,000 tokens
Verification: ~1,500 tokens
Total: ~8,000 tokens (65% savings!)

📚 Complete Integration Guide | Integration README

Development

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Format code
black .
ruff check .

How It Works

Indexing: Peppy scans your codebase, parses files using tree-sitter, and extracts symbols
Caching: Index data is cached for fast subsequent searches
Search: Fast lookups using indexed data with optional regex filtering
MCP Protocol: Exposes functionality as tools that Claude Code can use

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
.opencode/plans		.opencode/plans
docs		docs
examples		examples
integrations/siggy		integrations/siggy
peppy		peppy
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
plan.md		plan.md
pyproject.toml		pyproject.toml
token_simulator.py		token_simulator.py

Folders and files

Latest commit

History

Repository files navigation

Peppy - Codebase Indexing Plugin for Claude Code

Features

Supported Languages

Installation

Usage with Claude Code

Available Tools

1. index_codebase

2. search_symbols

3. grep_code

4. get_file_symbols

5. get_statistics

6. clear_cache

📊 Token Savings & Performance

Real-World Token Comparison

Scenario 1: Finding a Function Definition

Scenario 2: Exploring Codebase Structure

Scenario 3: Finding All TODO Comments

Performance Benchmarks

Cache Benefits

🎯 Optimization Tips

1. Index Once, Search Many

2. Use Specific Queries

3. Leverage File Patterns

4. Control Context Lines

5. Use Appropriate Tools

6. Limit Results

7. Check Statistics First

8. Cache Management

📚 Tool Usage Guide & Best Practices

Workflow Examples

Starting a New Codebase Exploration

Debugging Workflow

Refactoring Workflow

Code Review Workflow

Anti-Patterns to Avoid

Token Budget Guidelines

Pro Tips

🔗 Integration with Siggy

Quick Integration (Plugin Configuration)

Manual Setup (Alternative)

Token Savings Example

Development

How It Works

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

1. `index_codebase`

2. `search_symbols`

3. `grep_code`

4. `get_file_symbols`

5. `get_statistics`

6. `clear_cache`

Packages