CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

DeepScrap is an agent-powered deep research tool that produces LaTeX-to-PDF investment research reports on publicly traded companies. It uses a multi-agent architecture with a Synthesis-driven feedback loop, and it emphasizes forensic due diligence, rigorous valuation, and forward-looking catalyst analysis.

Development Commands

# Install in dev mode
pip install -e ".[dev]"

# Run all tests
pytest tests/ -v

# Run single test
pytest tests/path/test_file.py::TestClass::test_name -v

# Run the tool (from project root; or use python -m deepscrap.cli.main)
deepscrap analyze AAPL --depth shallow
deepscrap analyze AAPL --depth medium
deepscrap analyze AAPL --depth deep --verbose

Architecture

Multi-agent system with hub-and-spoke design:

  • Orchestrator (agents/orchestrator.py) — coordinates generic research agents, runs iterative feedback loop
  • Research Agents (Sonnet, agents/research.py) — N generic agents (agent_1 through agent_N) that autonomously plan which data sources to query based on their directive. Count is configurable via Settings.max_sub_agents (default 6).
  • Synthesis Agent (Opus, agents/synthesis.py) — the brain; evaluates coverage, directs research agents by name, produces final qualitative analysis. Dynamic prompt includes agent count so it only references valid agent names.
  • Adversarial Reviewer (GPT, agents/adversarial.py) — challenges Synthesis output
  • Report Generator (report/generator.py) — Jinja2 LaTeX templates → PDF. Auto-populates appendix (sources list from store metadata, methodology description).
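The hub-and-spoke loop described above can be sketched roughly as follows. All names here are illustrative stand-ins, not the actual classes in agents/; the real agents are LLM-backed.

```python
from dataclasses import dataclass, field

@dataclass
class Directive:
    agent_name: str  # e.g. "agent_1" .. "agent_N"
    task: str

@dataclass
class StoreStub:
    # stand-in for the real ResearchStore; just accumulates findings per agent
    findings: dict = field(default_factory=dict)

    def record(self, agent_name: str, result: str) -> None:
        self.findings.setdefault(agent_name, []).append(result)

def run_feedback_loop(directives, research_fn, evaluate_fn, max_iterations=3):
    """Dispatch directives to research agents, then let Synthesis evaluate
    coverage and issue new directives, until satisfied or the limit is hit."""
    store = StoreStub()
    for _ in range(max_iterations):
        for d in directives:
            store.record(d.agent_name, research_fn(d))
        satisfied, directives = evaluate_fn(store)
        if satisfied or not directives:
            break
    return store
```

In this sketch, `research_fn` stands in for a Sonnet research agent and `evaluate_fn` for the Opus Synthesis agent's coverage evaluation.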

Key layers:

  • llm/ — Provider abstraction (Claude + OpenAI) with role-based model routing via registry.py. Supports adaptive thinking and effort levels (output_config.effort) mapped from analysis depth.
  • sources/ — Pluggable data sources (Yahoo Finance, SEC EDGAR, Serper.dev, Google News, OpenInsider)
  • store/ — Pydantic models + ResearchStore with JSON persistence and coverage tracking
  • report/ — Chart generation (matplotlib) + LaTeX templates (Jinja2 with <% %> delimiters) + PDF compilation
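Jinja2 supports LaTeX-friendly delimiters via its `Environment` constructor. This file only specifies <% %>, so the variable and comment delimiters below are assumptions:

```python
import jinja2

# Jinja2 environment with delimiters that don't collide with LaTeX syntax.
# <% %> (block delimiters) comes from this file; << >> and <# #> are guesses.
latex_env = jinja2.Environment(
    block_start_string="<%",
    block_end_string="%>",
    variable_start_string="<<",
    variable_end_string=">>",
    comment_start_string="<#",
    comment_end_string="#>",
    autoescape=False,  # we emit LaTeX, not HTML
)

template = latex_env.from_string(r"\section{<< title >>}")
rendered = template.render(title="Executive Summary")  # \section{Executive Summary}
```

Custom delimiters matter here because Jinja2's default `{{ }}` and `{% %}` collide with braces that are everywhere in LaTeX source.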

Report sections (12-section structure):

  1. Executive Summary
  2. Business Analysis
  3. Financial Deep Dive (includes accounting quality & red flags)
  4. Valuation Analysis (comps, DCF, sum-of-parts, pricing verdict)
  5. Leadership Assessment
  6. Forensic & Red Flag Analysis (related-party deals, suspicious M&A, SEC inquiries, insider patterns)
  7. Bull Case
  8. Bear Case
  9. Risk Matrix
  10. Forward-Looking Analysis (next earnings preview, catalyst timeline, what to watch)
  11. Conclusion
  12. Confidence Assessment (overall, per-section breakdown, evidence gaps, source quality)

Research flow:

  1. Orchestrator dispatches the research agents (default 6, per Settings.max_sub_agents) with targeted initial directives:
    • Company overview + suspicious deals
    • Deep financial analysis + accounting red flags
    • Valuation models + analyst targets
    • Executive team + insider trading patterns
    • Forward-looking analysis + catalyst events
    • Industry/regulatory/sentiment + short interest
  2. Synthesis evaluates coverage map, issues targeted directives to agent_1..agent_N
  3. Iterate until satisfied (or depth limit reached)
  4. Synthesis produces 12-section qualitative-first analysis
  5. GPT adversarially reviews → Opus revises/rebuts
  6. Report agent renders LaTeX → PDF (appendix auto-populated from store)
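Step 5 of the flow (adversarial review, then revision) reduces to a small review loop. A minimal sketch, where the callables are stand-ins for the real LLM-backed agents:

```python
def adversarial_round(draft, review_fn, revise_fn, rounds=1):
    """GPT critiques the Synthesis draft; Opus revises or rebuts.
    review_fn and revise_fn are hypothetical stand-ins for the agents."""
    for _ in range(rounds):
        critique = review_fn(draft)
        draft = revise_fn(draft, critique)
    return draft
```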

Reference system:

  • Research agents register ReferenceEntry objects in the store as they fetch data
  • SEC filings are downloaded to {output_dir}/references/ during research
  • Report generator builds a numbered bibliography with \href{} links
  • Inline [Source: ...] citations are linked to bibliography entries via keyword matching
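A rough sketch of the bibliography step. The field names on the entry are assumed; the real ReferenceEntry is a Pydantic model in store/:

```python
from dataclasses import dataclass

@dataclass
class ReferenceEntryStub:
    # assumed shape; mirrors what a numbered bibliography would need
    title: str
    url: str
    source: str  # e.g. "SEC EDGAR"

def format_bibliography(entries):
    """Build numbered bibliography lines with \\href{} links,
    roughly as report/generator.py does."""
    return [
        f"[{i}] \\href{{{e.url}}}{{{e.title}}} ({e.source})"
        for i, e in enumerate(entries, start=1)
    ]
```

The entry numbers produced here are what the inline [Source: ...] citations would be matched against by keyword.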

Effort levels (Claude API):

  • --depth shallow → effort medium
  • --depth medium → effort high
  • --depth deep → effort max (Opus only)

All Claude calls use adaptive thinking (thinking: {type: "adaptive"}).
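The depth-to-effort mapping above can be expressed as a small lookup. The fallback to high effort for non-Opus models at deep depth is my reading of the "Opus only" rule, not something this file states:

```python
EFFORT_BY_DEPTH = {"shallow": "medium", "medium": "high", "deep": "max"}

def output_config(depth: str, model_role: str) -> dict:
    """Map CLI --depth to a Claude output_config. 'max' effort is reserved
    for Opus (the synthesis role); the downgrade to 'high' for other
    roles is an assumption."""
    effort = EFFORT_BY_DEPTH[depth]
    if effort == "max" and model_role != "synthesis":
        effort = "high"  # assumed fallback for Sonnet research agents
    return {"effort": effort, "thinking": {"type": "adaptive"}}
```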

Configuration

API keys via environment variables or .env file: ANTHROPIC_API_KEY, OPENAI_API_KEY, SERPER_API_KEY

Model routing defined in llm/config.py — Sonnet for research agents, Opus for synthesis, GPT for adversarial review.
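The role-to-model routing can be sketched as a simple lookup table. Model identifiers are placeholders; the real table lives in llm/config.py and registry.py:

```python
# Hypothetical sketch of role-based model routing.
ROLE_MODELS = {
    "research": {"provider": "anthropic", "family": "sonnet"},
    "synthesis": {"provider": "anthropic", "family": "opus"},
    "adversarial": {"provider": "openai", "family": "gpt"},
}

def resolve_model(role: str) -> dict:
    """Look up the provider and model family for an agent role."""
    if role not in ROLE_MODELS:
        raise ValueError(f"unknown role: {role!r}")
    return ROLE_MODELS[role]
```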