diff --git a/AGENTS.md b/AGENTS.md deleted file mode 100644 index 7e657a8..0000000 --- a/AGENTS.md +++ /dev/null @@ -1,62 +0,0 @@ -# Repository Guidelines - -## Project Structure & Modules -- `hotspots-core/`: Core Rust library (analysis, metrics, policies, git, CFG). -- `hotspots-cli/`: CLI binary (`hotspots`). -- `docs/`: VitePress docs (`npm -C docs run docs:dev|docs:build`). -- `action/`, `packages/`: GitHub Action and TS packages. -- `tests/`, `hotspots-core/tests/`: Rust integration and golden tests. -- `assets/`, `examples/`, `scripts/`: Logos, fixtures, helpers. - -## Build, Test, and Dev Commands -- Build: `cargo build --release` or `make build`. -- Run CLI locally: `./dev.sh analyze src/` or `cargo run --package hotspots-cli -- analyze src/`. -- Unit tests: `cargo test` or `make test` (library tests only). -- Comprehensive tests: `make test-comprehensive` (pytest or `integration/legacy/test_comprehensive.py`). -- All tests: `make test-all`. -- Install hooks: `make install-hooks` (fmt + clippy + tests on commit). - -## Coding Style & Naming -- Language: Rust 2021; format with `cargo fmt`, lint with `cargo clippy -- -D warnings`. -- Naming: crates/modules `snake_case`, types/traits `PascalCase`, consts `SCREAMING_SNAKE_CASE`. -- Indentation: Rust defaults (4 spaces); avoid unnecessary refactors; keep changes focused. -- Workspace lints are enforced; PRs must be warning-free. - -## Testing Guidelines -- Framework: Rust test harness; use `#[test]` functions and `_tests.rs` files. -- Locations: unit tests near sources; integration/golden tests in `hotspots-core/tests/`. -- Useful flags: `cargo test -- --nocapture`, `cargo test -p hotspots-core`. -- Optional coverage: `cargo tarpaulin` (see docs) if installed. - -## Commit & Pull Request Guidelines -- Conventional commits: `<type>: <description>` (`feat`, `fix`, `docs`, `refactor`, `test`, `chore`). -- Commit messages: single line, ≤72 chars (see `CLAUDE.md`). 
-- Before pushing: `cargo fmt --all -- --check && cargo clippy --all-targets --all-features -- -D warnings && cargo test`. -- PRs: small scope, clear description, link issues; include CLI output or screenshots where relevant; update docs when flags/behavior change. - -## Security & Configuration Tips -- Config: `.hotspotsrc.json` controls include/exclude patterns (tests, builds, fixtures are excluded by default). -- Secrets: do not commit tokens or private data; binaries install to `~/.local/bin`. -- Docs deploy: `wrangler.toml` targets Pages; build with `npm -C docs run docs:build`. - -## Agent-Specific Notes -- Follow `CLAUDE.md`: keep diffs minimal, batch edits, and always run fmt + clippy + tests before proposing changes. - -## Understanding Quadrants and Activity Risk - -Every function in a Hotspots snapshot has a `quadrant` field. Use it — not the raw risk score — to determine urgency: - -| Quadrant | Complexity | Recent Activity | What to do | -|---|---|---|---| -| `fire` | High | High | Act now — live regression risk | -| `debt` | High | Low | Schedule proactively — structural debt | -| `simple-active` | Low | High | Monitor only | -| `simple-stable` | Low | Low | Ignore | - -**Critical:** `activity_risk` (the composite score) is a decay function over git history. It **never reaches zero** even if a function hasn't been touched in months. A high score alone does NOT mean a function is actively changing. - -To determine true activity, always check **both**: -- `quadrant` — the authoritative fire/debt classification -- `touches_30d` — commits touching this function in the last 30 days - -A `debt`-quadrant function with `touches_30d == 0` is structural debt (stable but complex). Never describe it as "actively changing." A `fire`-quadrant function with `touches_30d > 0` is a live regression surface. 
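The quadrant guidance that closes the removed `AGENTS.md` can be sketched as a small TypeScript helper. This is only an illustration under stated assumptions, not Hotspots' own code: the `FunctionEntry` shape, the `name` field, and the `triage` function are hypothetical. Only the `quadrant` and `touches_30d` field names and the four quadrant values come from the guidelines, and the two cases the guidelines do not cover are marked as assumptions in the comments.

```typescript
// Hypothetical shape: only `quadrant` and `touches_30d` are named in the guidelines.
type Quadrant = "fire" | "debt" | "simple-active" | "simple-stable";

interface FunctionEntry {
  name: string; // illustrative field, not taken from the guidelines
  quadrant: Quadrant;
  touches_30d: number;
}

// Map a snapshot entry to the action the guidelines recommend.
// The raw activity score is deliberately not an input: it decays but never
// reaches zero, so it cannot tell "actively changing" from "once changed".
function triage(fn: FunctionEntry): string {
  if (fn.quadrant === "fire") {
    return fn.touches_30d > 0
      ? "act now: live regression surface"
      // Assumption: this combination is not covered by the guidelines.
      : "act now: fire quadrant, but no touches in 30 days";
  }
  if (fn.quadrant === "debt") {
    return fn.touches_30d === 0
      ? "schedule proactively: structural debt (stable but complex)"
      // Assumption: this combination is not covered by the guidelines.
      : "schedule proactively: structural debt with recent touches";
  }
  if (fn.quadrant === "simple-active") {
    return "monitor only";
  }
  return "ignore";
}
```

The quadrant is checked first because the guidelines call it the authoritative classification; `touches_30d` only refines the wording.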
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md deleted file mode 100644 index 3f6bba5..0000000 --- a/CONTRIBUTING.md +++ /dev/null @@ -1,212 +0,0 @@ -# Contributing to Hotspots - -Thank you for your interest in contributing to Hotspots! 🎉 - -## Quick Start - -1. **Fork** the repository -2. **Clone** your fork: `git clone https://github.com/YOUR_USERNAME/hotspots.git` -3. **Set up** your development environment (see below) -4. **Create** a feature branch: `git checkout -b feature/your-feature-name` -5. **Make** your changes -6. **Test**: `cargo test` -7. **Commit**: Follow our [commit conventions](./CLAUDE.md) -8. **Push** and create a pull request - -## Development Setup - -### Prerequisites - -- Rust 1.70+ (`rustup install stable`) -- Git -- Node.js 18+ (for action) - -### Build - -```bash -# Clone the repository -git clone https://github.com/Stephen-Collins-tech/hotspots.git -cd hotspots - -# Build the project -cargo build --release - -# Install git hooks (runs fmt + clippy + tests before each commit) -make install-hooks - -# Run tests -cargo test - -# Run the CLI -./target/release/hotspots --help -``` - -See [Development Setup Guide](./docs/contributing/development.md) for detailed instructions. - -## How to Contribute - -### Reporting Bugs - -- Use the [GitHub issue tracker](https://github.com/Stephen-Collins-tech/hotspots/issues) -- Search existing issues first -- Include: - - Hotspots version (`hotspots --version`) - - Operating system - - Steps to reproduce - - Expected vs. 
actual behavior - -### Suggesting Features - -- Open a [GitHub discussion](https://github.com/Stephen-Collins-tech/hotspots/discussions) -- Describe the use case -- Explain why it would be useful - -### Submitting Code - -#### Code Quality - -- Follow Rust conventions (run `cargo fmt` and `cargo clippy`) -- Write tests for new functionality -- Update documentation as needed -- Follow our [coding conventions](./CLAUDE.md) - -#### Commit Messages - -Use conventional commits format: - -``` -<type>: <description> - -[optional body] - -Co-Authored-By: Your Name -``` - -Types: `feat`, `fix`, `docs`, `refactor`, `test`, `chore` - -**Examples:** -``` -feat: add Python language support - -fix: correct cyclomatic complexity calculation for switch statements - -docs: update installation instructions - -refactor: extract metrics calculation into separate module -``` - -See our [CLAUDE.md](./CLAUDE.md) file for detailed commit guidelines. - -#### Pull Request Process - -1. Ensure all tests pass: `cargo test` -2. Update documentation if needed -3. Add yourself to Co-Authors in the commit -4. Request review from maintainers -5. Address review feedback -6. Once approved, maintainers will merge - -### Adding Language Support - -Want to add support for a new language? See our comprehensive guide: - -📖 **[Adding Language Support Guide](./docs/contributing/adding-languages.md)** - -This includes: -- Parser integration -- CFG (Control Flow Graph) builder -- Metrics extraction -- Test fixtures and golden files -- Documentation updates - -## Project Structure - -``` -hotspots/ -├── hotspots-core/ # Core analysis library -│ ├── src/ -│ │ ├── language/ # Language parsers & CFG builders -│ │ ├── metrics.rs # Metrics calculation -│ │ ├── delta.rs # Delta mode logic -│ │ └── ... 
-│ └── tests/ # Integration & unit tests -├── hotspots-cli/ # CLI binary -├── packages/ # TypeScript packages -│ └── types/ # TypeScript type definitions -├── action/ # GitHub Action -├── docs/ # Documentation -└── examples/ # Example code -``` - -## Testing - -```bash -# Run all tests -cargo test - -# Run specific test -cargo test test_name - -# Run with output -cargo test -- --nocapture - -# Run golden tests (deterministic output verification) -cargo test --test golden_tests - -# Run integration tests (pytest suite) -make test-integration - -# Run comprehensive tests (auto-detect pytest; falls back to legacy script) -make test-comprehensive -``` - -### Integration Tests - -- Location: `integration/` (pytest-based E2E tests) and `integration/legacy/` (fallback script). -- Entry points: - - `make test-integration` — runs `pytest -q integration`. - - `make test-comprehensive` — runs pytest if available, else `python3 integration/legacy/test_comprehensive.py`. -- CI runs `make test-comprehensive` and uploads artifacts from `test-repo-comprehensive/`. - -### Golden Files - -- No manual path fixing needed. Golden tests normalize file paths at assertion time for cross-platform consistency. - -## Documentation - -- All docs are in `docs/` directory -- Documentation powers [docs.hotspots.dev](https://docs.hotspots.dev) -- Use markdown with frontmatter for metadata -- Test docs locally before submitting - -## Code of Conduct - -- Be respectful and inclusive -- Provide constructive feedback -- Help others learn and grow -- Follow GitHub's [Community Guidelines](https://docs.github.com/en/site-policy/github-terms/github-community-guidelines) - -## Recognition - -Contributors are recognized in: -- Git commit history (Co-Authored-By) -- Release notes (CHANGELOG.md) -- GitHub contributors page - -## Questions? 
- -- 💬 [GitHub Discussions](https://github.com/Stephen-Collins-tech/hotspots/discussions) -- 📧 [Open an Issue](https://github.com/Stephen-Collins-tech/hotspots/issues) - -## Detailed Guides - -For more detailed information, see: - -- [Development Setup](./docs/contributing/development.md) - Detailed dev environment setup -- [Adding Languages](./docs/contributing/adding-languages.md) - Language implementation guide -- [Release Process](./docs/contributing/releases.md) - How releases are created -- [Architecture](./docs/architecture/overview.md) - System architecture overview - ---- - -**Thank you for contributing!** 🚀 diff --git a/IDEAS.md b/IDEAS.md deleted file mode 100644 index c4f97d7..0000000 --- a/IDEAS.md +++ /dev/null @@ -1,93 +0,0 @@ -# Hotspots — Ideas & Future Directions - -Captured after v1.9.0 (2026-03-24). - ---- - -## Prioritized by Bang for Buck - -### Tier 1 — High Value, Low Effort - -**SARIF output format** -- Add `OutputFormat::Sarif` variant alongside Text/Json/Html/Jsonl -- SARIF is the standard for GitHub code scanning — enables native GitHub Actions integration with zero user config -- Effort: ~1–2 days (new output format, well-specified schema, no core changes) - -**Pre-commit hook template generation** -- `hotspots init --hooks` writes a ready-to-use pre-commit config snippet -- Effort: ~half a day (mostly docs/template string) - -**Kotlin support** -- Java CFG builder is 669 lines; Kotlin is structurally similar (JVM, tree-sitter grammar available) -- Effort: ~2–3 days (new parser + cfg_builder, update FunctionBody enum + all match arms) -- High reach: Kotlin is now the default Android language - ---- - -### Tier 2 — High Value, Moderate Effort - -**File-level aggregation in output** -- Roll up function scores to file level (max CC, avg LRS, total touches) -- `OutputLevel::File` already exists as an enum variant — it may be partially stubbed -- Effort: ~2–3 days -- Makes reports actionable for large codebases where function-level noise is 
overwhelming - -**Configurable thresholds per language** -- Config file (already supported) gains language-specific CC/LRS cutoffs -- e.g., Python tends to have higher CC for idiomatic code than Go -- Effort: ~2–3 days (config schema + per-language filter pass) - -**Historical trend charts in HTML report** -- html.rs is already 2820 lines with a full scatter plot -- Add a time-series chart using existing snapshot/trend data -- Effort: ~3–4 days (JS chart rendering, data serialization, HTML template work) - ---- - -### Tier 3 — Moderate Value, Higher Effort - -**C/C++ support** -- Tree-sitter grammar exists but C/C++ is complex: macros, headers, no clear function boundary convention -- Effort: ~1–2 weeks (parser + cfg_builder + significant edge-case handling) -- High corpus size but requires robust handling to avoid noisy results - -**Cross-function call graph risk propagation** -- `callgraph.rs` (789 lines) already computes fan-in/fan-out and PageRank -- Would propagate risk scores up the call graph so callers of high-CC functions inherit risk -- Effort: ~3–5 days (scoring.rs + risk.rs changes, callgraph integration, golden test updates) - -**Rust/Python `match`/`match-case` CFG accuracy** -- Rust CFG builder is 325 lines, Python 596 — match arms are partially handled -- Getting edge cases right (guards, binding patterns, exhaustiveness) is fiddly -- Effort: ~3–5 days per language - -**Ruby support** -- No existing patterns to reuse (unlike Kotlin/Java) -- Effort: ~1 week - ---- - -### Tier 4 — Lower Priority / Large Effort - -**VS Code extension** -- Separate project, different tech stack (TypeScript extension API) -- Requires language server protocol or subprocess integration -- Effort: ~2–4 weeks - -**async/await control flow modeling** -- Affects JS/TS, Python, Rust — each has different semantics -- Would significantly improve CC accuracy for modern codebases -- Effort: ~1–2 weeks per language, high correctness risk - -**Exception propagation edges in CFG** -- 
Needs an exception overlay graph across the whole call graph -- High complexity, moderate CC accuracy gain -- Effort: ~2–3 weeks - ---- - -## Raw Ideas (Unprioritized) - -- Swift support (mobile) -- PHP support -- Dead code detection (CFG nodes unreachable from entry — infrastructure already exists) diff --git a/README.md b/README.md index 324f3ab..9cae728 100644 --- a/README.md +++ b/README.md @@ -13,653 +13,208 @@ **Find the code that's actually causing problems.** -Your codebase has thousands of functions. Some are messy but never break. Others are complex AND change constantly—those are your **hotspots**, the 20% of code causing 80% of your bugs, incidents, and slowdowns. - -Stop refactoring code that doesn't matter. Focus on what's hurting you right now. - ---- - -## The Problem - -You know your codebase has tech debt. But which code should you actually refactor? - -❌ **Refactor by gut feeling** → Waste weeks on code that rarely causes issues -❌ **Refactor everything** → Impossible, and you'll rewrite stable code that doesn't need touching -❌ **Refactor nothing** → Tech debt compounds until "fix this bug" becomes "rewrite everything" - -**The real question:** Which functions are both complex AND frequently changed? - -Those are the functions causing production incidents, slowing down features, and burning out your team. - ---- - -## The Solution - -Hotspots analyzes your codebase and git history to find functions that are: - -1. **Complex** - High cyclomatic complexity, deep nesting, lots of branching -2. **Volatile** - Changed frequently in recent commits -3. **Risky** - The dangerous combination of both - -Instead of guessing what to refactor, you get a prioritized list: - -![Hotspots Example Report](https://raw.githubusercontent.com/Stephen-Collins-tech/hotspots/main/assets/hotspots-example-report.png) - -*Risk Landscape from a real 7,911-function codebase: 284 Critical (red), 491 High (orange). 
Each dot is a function — top-right are your hotspots.* - -```bash -hotspots analyze src/ - -# Output: -Critical (LRS ≥ 9.0): -processPlanUpgrade src/api/billing.ts:142 LRS 12.4 CC 15 ND 4 FO 8 NS 3 - -High (6.0 ≤ LRS < 9.0): -validateSession src/auth/session.ts:67 LRS 9.8 CC 11 ND 3 FO 7 NS 2 -applySchema src/db/migrations.ts:203 LRS 8.1 CC 10 ND 2 FO 5 NS 2 -``` - -Now you know exactly where to focus. +Your codebase has thousands of functions. Some are messy but never break. Others are complex AND change constantly — those are your **hotspots**, the 20% of code causing 80% of your bugs, incidents, and slowdowns. --- -## What You Get - -### ✅ Refactor What Actually Matters - -Stop wasting time on code that "looks messy" but never causes problems. Focus on the 20% of functions responsible for 80% of your incidents. - -### ✅ Block Complexity Regressions in CI - -Catch risky changes before they merge: +## Install ```bash -# Run in CI with policy checks -hotspots analyze src/ --mode delta --policy -# Exit code 1 if policies fail → CI fails +brew install Stephen-Collins-tech/tap/hotspots # macOS +npm install -g @stephencollinstech/hotspots # any platform +pip install hotspots-cli # any platform +cargo install hotspots-cli # Rust toolchain +curl -fsSL https://raw.githubusercontent.com/Stephen-Collins-tech/hotspots/main/install.sh | sh # Linux ``` -Your CI fails if someone introduces high-risk code. No manual review needed. +Windows: download the binary from [GitHub Releases](https://github.com/Stephen-Collins-tech/hotspots/releases/latest). -**GitHub Actions** — use the native action for zero-config CI integration: +Verify: `hotspots --version` +**GitHub Action:** ```yaml - uses: Stephen-Collins-tech/hotspots/action@v1 with: github-token: ${{ secrets.GITHUB_TOKEN }} ``` -See [docs/guide/github-action.md](docs/guide/github-action.md) for the full action reference. 
- -### ✅ Fine-Tune Rankings with Repo-Specific ML - -The default heuristic ranker works out of the box, but every codebase is different. Train a local RandomForest ranker from your own fix-commit history to score functions based on patterns that actually predict bugs in your repo: - -```bash -# Fit a ranker from the last year of fix commits (precise blame-based labels) -hotspots train . --blame - -# Next analyze picks up the trained ranker automatically -hotspots analyze . -``` - -The trained model learns which structural features (complexity, churn, call graph) correlate with real bug fixes in your history — not a generic heuristic. Scores are saved to `.hotspots/ranker.json` and reused on every subsequent `analyze`. - -**Not sure if training actually helped?** Use `--eval` to check: - -```bash -hotspots train . --eval -``` - -This prints a Precision@K table after training — how many of the top-K ranked functions were genuinely in fix commits: - -``` -P@K evaluation (365-day fix-label window): - K P@K base_rate - 10 0.400 0.084 - 20 0.300 0.084 - 50 0.200 0.084 - 100 0.150 0.084 - 200 0.110 0.084 -``` - -**How to read it:** `base_rate` is the fraction of all functions that appeared in a bug-fix commit. If `P@10` is much higher than `base_rate`, the ranker is genuinely surfacing risky functions at the top. If `P@10` ≈ `base_rate`, the model is no better than random — skip applying it and rely on the default LRS ranking instead. - -### ✅ Ship with Confidence, Not Crossed Fingers - -Know which files are landmines before you touch them. See complexity trends over time. Make informed decisions about refactoring vs rewriting vs leaving it alone. - -### ✅ Get AI-Assisted Refactoring - -Hotspots integrates with Claude Code, Cursor, and GitHub Copilot. Point your AI at the hottest functions and get refactoring suggestions that actually improve your codebase. - -```bash -# Analyze changes in your project -hotspots analyze . 
--mode delta --format json - -# Get agent-optimized output (quadrant buckets + action text) -hotspots analyze . --mode delta --all-functions --format json -``` - --- ## Quick Start -### 1. Install - -**macOS (Homebrew):** -```bash -brew install Stephen-Collins-tech/tap/hotspots -``` - -**npm:** -```bash -npm install -g @stephencollinstech/hotspots -``` - -**pip:** -```bash -pip install hotspots-cli -``` - -**cargo (Rust toolchain):** ```bash -cargo install hotspots-cli -``` +# Find your hotspots +hotspots analyze src/ -**macOS / Linux (shell script):** -```bash -curl -fsSL https://raw.githubusercontent.com/Stephen-Collins-tech/hotspots/main/install.sh | sh +# Critical (LRS ≥ 9.0): +# processPlanUpgrade src/api/billing.ts:142 LRS 12.4 CC 15 ND 4 FO 8 NS 3 +# +# High (6.0 ≤ LRS < 9.0): +# validateSession src/auth/session.ts:67 LRS 9.8 CC 11 ND 3 FO 7 NS 2 ``` -Verify with `hotspots --version`. - -**GitHub Action:** -```yaml -- uses: Stephen-Collins-tech/hotspots/action@v1 - with: - github-token: ${{ secrets.GITHUB_TOKEN }} -``` -See [docs/guide/github-action.md](docs/guide/github-action.md) for inputs, outputs, and examples. +**Critical** = refactor now. **High** = refactor next time you touch it. **Moderate** = block increases. **Low** = leave it alone. -### 2. Analyze Your Code +### Common commands ```bash -# Find your hotspots -hotspots analyze src/ - -# Filter to critical functions only -hotspots analyze src/ --min-lrs 9.0 - -# Get per-function explanations with driver labels +# Per-function explanations with refactoring advice hotspots analyze . 
--mode snapshot --format text --explain --top 10 -# Get JSON for tooling/AI -hotspots analyze src/ --format json - -# Stream JSONL for pipeline processing -hotspots analyze src/ --format jsonl - -# Compare with previous commit (delta mode) +# Block complexity regressions in CI hotspots analyze src/ --mode delta --policy -``` - -**Large repos:** By default, hotspots uses hybrid touch mode — file-level git activity for all functions, with per-function precision only for actively-changed files (≥5 commits in 30 days). This keeps memory usage low and completes reliably on repos of any size. -```bash -# Default: hybrid touch (fast, completes on any repo) -hotspots analyze . --mode snapshot --explain - -# Full precision (slower cold start, more accurate activity scores) -hotspots analyze . --mode snapshot --explain --per-function-touches - -# Fastest possible (file-level only, no per-function git log) -hotspots analyze . --mode snapshot --explain --no-per-function-touches -``` - -### 3. Act on Results - -**Critical functions (LRS ≥ 9.0):** Refactor now. These are your top priority. -**High functions (LRS 6.0-9.0):** Watch closely. Refactor before they become critical. -**Moderate functions (LRS 3.0-6.0):** Keep an eye on them. Block complexity increases. -**Low functions (LRS < 3.0):** You're good. Don't overthink these. +# Compare any two git refs +hotspots diff main HEAD --top 10 --policy ---- +# Interactive HTML report +hotspots analyze src/ --mode snapshot --format html -## Supported Languages +# JSON for tooling/AI +hotspots analyze src/ --format json -- **TypeScript** - `.ts`, `.tsx`, `.mts`, `.cts` -- **JavaScript** - `.js`, `.jsx`, `.mjs`, `.cjs` -- **Go** - `.go` -- **Python** - `.py` -- **Rust** - `.rs` -- **Java** - `.java` -- **C** - `.c`, `.h` -- **C#** - `.cs` +# Track trends over time +hotspots trends . -Full language parity across all metrics and features. See [docs/reference/language-support.md](docs/reference/language-support.md) for details. 
+# Train a repo-specific ranker from your bug history +hotspots train . --blame --eval +``` --- ## How It Works -Hotspots computes a **Local Risk Score (LRS)** for each function based on: - -1. **Cyclomatic Complexity (CC)** - How many paths through the code? -2. **Nesting Depth (ND)** - How deeply nested are your if/for/while statements? -3. **Fan-Out (FO)** - How many other functions does this call? -4. **Non-Structured Exits (NS)** - How many early returns, breaks, throws? - -These metrics combine into a single **Local Risk Score (LRS)**. Higher LRS = higher risk of bugs, incidents, and developer confusion. +Hotspots computes a **Local Risk Score (LRS)** per function from four structural metrics: -LRS is then combined with **Activity Risk** signals from git history and the call graph: +| Metric | What it measures | +|---|---| +| **CC** — Cyclomatic Complexity | Independent decision paths (if/loop/catch/&&/\|\|) | +| **ND** — Nesting Depth | Maximum depth of nested control structures | +| **FO** — Fan-Out | Distinct functions called | +| **NS** — Non-Structured Exits | Early returns, throws, breaks | -- **Churn** — lines changed in the last 30 days (volatile code) -- **Touch frequency** — commit count touching this function -- **Recency** — days since last change (branch-aware) -- **Fan-in** — how many other functions call this one (call graph) -- **Cyclic dependency** — SCC membership (tightly coupled code) -- **Neighbor churn** — lines changed in direct dependencies - -The call graph engine resolves imports to detect fan-in, PageRank, betweenness centrality, and SCC membership. Functions that are both complex AND heavily depended upon by other changing code rise to the top. +``` +LRS = 1.0×R_cc + 0.8×R_nd + 0.6×R_fo + 0.7×R_ns +``` -### Understanding Quadrants +where each component is log-scaled and capped to prevent outliers from dominating. 
-Every function is placed in one of four quadrants based on its structural complexity and recent activity: +In **snapshot mode**, LRS is combined with git history (churn, touch frequency, recency) and call graph metrics (fan-in, PageRank, SCC membership) to compute an **Activity Risk Score** and place each function in a triage quadrant: -| Quadrant | Complexity | Recent Activity | What it means | +| Quadrant | Complexity | Activity | Action | |---|---|---|---| -| **fire** | High | High | Live regression risk — complex AND actively changing right now | -| **debt** | High | Low | Structural debt — complex but not recently touched; high blast radius when next changed | -| **simple-active** | Low | High | Active but manageable — monitor, low structural risk | -| **simple-stable** | Low | Low | Lowest priority | - -**Important:** The activity-weighted risk score (and `lrs`) is a decay function computed over git history — it never reaches zero even if a function hasn't been touched in months. A high risk score alone does **not** mean a function is actively changing. Always check `quadrant` and `touches_30d` to determine whether a function is a live regression risk (fire) or structural debt (debt). 
+| `fire` | High | High | Refactor now | +| `debt` | High | Low | Schedule before next push | +| `watch` | Low | High | Monitor | +| `ok` | Low | Low | Leave it alone | -- **fire**: Refactor now - every commit is landing on a complex function -- **debt**: Schedule proactively - refactor before the next development push into that area, not urgently -- **simple-active**: Watch closely but don't over-invest in refactoring -- **simple-stable**: Leave it alone unless metrics change +**Risk bands:** Low (< 3) · Moderate (3–6) · High (6–9) · Critical (≥ 9) -**Example:** +--- -```typescript -// LRS: 12.4 (Critical) - Complex AND frequently changed -function processPlanUpgrade(user, newPlan, paymentMethod) { - if (!user.isActive) return false; - if (user.plan === newPlan) return true; - - if (paymentMethod.type === "card") { - if (paymentMethod.isExpired) { - try { - paymentMethod = renewPaymentMethod(user); - } catch (error) { - logError(error); - notifyUser(user, "payment_failed"); - return false; - } - } - - if (newPlan.price > user.plan.price) { - const prorated = calculateProration(user, newPlan); - if (!chargeCard(paymentMethod, prorated)) { - return false; - } - } - } else if (paymentMethod.type === "invoice") { - // Different logic for invoice customers... - } - - updateDatabase(user, newPlan); - sendConfirmation(user); - return true; -} -``` +## Supported Languages -**This function:** -- CC: 15 (lots of branching) -- ND: 4 (deeply nested) -- FO: 8 (calls many functions) -- NS: 3 (multiple early returns) -- **LRS: 12.4** ← This is a hotspot +TypeScript · JavaScript · Go · Python · Rust · Java · C/C headers · C# · Vue -Refactor this before it causes a production incident. +All 16 file extensions (`.ts`, `.tsx`, `.mts`, `.cts`, `.js`, `.jsx`, `.mjs`, `.cjs`, `.go`, `.py`, `.rs`, `.java`, `.c`, `.h`, `.cs`, `.vue`) work out of the box. 
--- -## Features - -### 🚦 Policy Enforcement (CI/CD) - -Block risky code before it merges: +## CI/CD Integration -- **Critical Introduction** - Fail CI if new functions exceed LRS 9.0 -- **Excessive Regression** - Fail CI if LRS increases by ≥1.0 -- **Watch/Attention Warnings** - Warn about functions approaching thresholds -- **Rapid Growth Detection** - Catch functions growing >50% in complexity +### GitHub Action (zero config) -```bash -# Run in CI with policy checks -hotspots analyze src/ --mode delta --policy -# Exit code 1 if policies fail → CI fails +```yaml +name: Hotspots +on: [pull_request, push] +jobs: + analyze: + runs-on: ubuntu-latest + permissions: + contents: read + pull-requests: write + steps: + - uses: actions/checkout@v4 + with: + fetch-depth: 0 + - uses: Stephen-Collins-tech/hotspots-action@v1 + with: + github-token: ${{ secrets.GITHUB_TOKEN }} ``` -### 🔍 Driver Labels & Explain Mode +Posts PR comments, generates HTML reports, fails builds on policy violations. -Understand *why* a function is flagged and get concrete refactoring advice: +### Manual CI ```bash -hotspots analyze . --mode snapshot --format text --explain --top 10 -``` - -Each function shows its primary **driver** (`high_complexity`, `deep_nesting`, -`high_churn_low_cc`, `high_fanout_churning`, `high_fanin_complex`, `cyclic_dep`, -`composite`) plus an **Action** line with dimension-specific guidance: - -``` -processPayment /src/billing.ts:89 - LRS: 14.52 | Band: critical | Driver: high_complexity - CC: 15, ND: 4, FO: 8, NS: 3 - Action: Reduce branching; extract sub-functions -``` - -Use `--level file` or `--level module` for higher-level aggregated views. 
- -### 📊 Multiple Output Formats - -**Terminal (human-readable):** -``` -Critical (LRS ≥ 9.0): -processPlanUpgrade src/api/billing.ts:142 LRS 12.4 CC 15 ND 4 FO 8 NS 3 +# Fail CI if new critical functions introduced or LRS increases ≥ 1.0 +hotspots analyze src/ --mode delta --policy ``` -**JSON (machine-readable):** -```json -{ - "schema_version": 2, - "functions": [ - { - "function_id": "src/api/billing.ts::processPlanUpgrade", - "file": "src/api/billing.ts", - "line": 142, - "lrs": 12.4, - "band": "critical", - "driver": "high_complexity", - "metrics": { "cc": 15, "nd": 4, "fo": 8, "ns": 3 } - } - ] -} -``` +Exit code 1 on blocking violations, 0 on warnings only. -**JSONL (streaming per-function):** -```bash -hotspots analyze src/ --mode snapshot --format jsonl | grep '"band":"critical"' -``` -One JSON object per line — ideal for large repos and shell pipeline processing. +--- -**HTML (interactive reports):** -- Sortable, filterable tables -- Risk band visualization -- Shareable with stakeholders -- Upload as CI artifacts +## Key Features -**SARIF (Static Analysis Results Interchange Format):** -```bash -hotspots analyze src/ --format sarif -``` -Compatible with GitHub Code Scanning and any SARIF-aware tool. +**Policy engine** — blocks: new critical-risk functions, LRS regressions ≥ 1.0, net repo regression ≥ 5.0. Warns: approaching thresholds, rapid growth > 50%. -### 🔇 Suppression Comments +**Driver labels** — each function gets a primary diagnosis: `high_complexity`, `deep_nesting`, `exit_heavy`, `high_churn_low_cc`, `high_fanout_churning`, `high_fanin_complex`, `cyclic_dep`, or `composite`. -Have complex code you can't refactor yet? Suppress warnings with a reason: +**Pattern detection** — 13 named patterns in two tiers: structural (always, e.g. `complex_branching`, `god_function`) and enriched (snapshot mode, e.g. `churn_magnet`, `cyclic_hub`, `volatile_god`). 
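The policy engine paragraph above can be made concrete with a simplified TypeScript sketch. It is not the tool's implementation: `FunctionDelta`, `PolicyResult`, and `evaluatePolicies` are hypothetical names, and the sketch assumes that "new" means the function has no score at the base ref and that rapid growth is measured as relative LRS increase. The repo-wide net regression rule and the approaching-threshold warnings are omitted because their exact definitions are not given in this diff.

```typescript
interface FunctionDelta {
  id: string;
  before: number | null; // LRS at the base ref; null for a newly added function (assumption)
  after: number;         // LRS at the head ref
}

interface PolicyResult {
  blocking: string[];
  warnings: string[];
  exitCode: number; // 1 on blocking violations, 0 on warnings only
}

function evaluatePolicies(deltas: FunctionDelta[]): PolicyResult {
  const blocking: string[] = [];
  const warnings: string[] = [];

  deltas.forEach((d) => {
    if (d.before === null) {
      // Critical introduction: a new function lands in the critical band.
      if (d.after >= 9.0) blocking.push(d.id + ": critical introduction");
      return;
    }
    const increase = d.after - d.before;
    // Excessive regression: LRS rises by 1.0 or more.
    if (increase >= 1.0) blocking.push(d.id + ": excessive regression");
    // Rapid growth: more than 50% relative increase (assumed to be measured on LRS).
    if (d.before > 0 && increase / d.before > 0.5) warnings.push(d.id + ": rapid growth");
  });

  return { blocking: blocking, warnings: warnings, exitCode: blocking.length > 0 ? 1 : 0 };
}
```

Keeping blocking violations and warnings in separate lists mirrors the documented exit-code behaviour: only the first list can fail a build.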
+**Suppression comments** — exclude functions from CI failures while keeping them visible: ```typescript // hotspots-ignore: legacy payment processor, rewrite scheduled Q2 2026 -function legacyBillingLogic() { - // Complex but can't touch it yet -} +function legacyBillingLogic() { ... } ``` -Functions with suppressions: -- ✅ Still appear in reports (visibility) -- ❌ Don't fail CI policies (pragmatism) -- 📝 Require a reason (accountability) - -### ⚙️ Configuration +**Trained ranker** — fit a RandomForest from your repo's bug-fix history: +```bash +hotspots train . --blame --eval # train + check P@K vs base rate +``` -Customize thresholds, weights, and file patterns: +**Output formats** — `text` (terminal), `json` (machine), `jsonl` (streaming), `html` (interactive), `sarif` (GitHub Code Scanning). +**Configuration** — `.hotspotsrc.json` in project root (auto-discovered): ```json { - "thresholds": { - "moderate": 3.0, - "high": 6.0, - "critical": 9.0 - }, "include": ["src/**/*.ts"], - "exclude": ["**/*.test.ts", "**/__mocks__/**"] + "exclude": ["**/*.test.ts"], + "thresholds": { "moderate": 3.0, "high": 6.0, "critical": 9.0 }, + "weights": { "cc": 1.0, "nd": 0.8, "fo": 0.6, "ns": 0.7 } } ``` -See [docs/guide/configuration.md](docs/guide/configuration.md) for all options. - -### 🤖 AI Integration - -**Claude Code:** -```bash -# Analyze changes and feed to Claude Code -hotspots analyze . --mode delta --format json - -# Get agent-optimized output -hotspots analyze . --mode delta --all-functions --format json -``` - -See [docs/integrations/ai-agents.md](docs/integrations/ai-agents.md) for complete guide. 
- -**Cursor/GitHub Copilot:** -```bash -hotspots analyze src/ --format json | jq '.functions[] | select(.lrs > 9)' -# Feed results to your AI coding assistant -``` - -### 📈 Git History Analysis - -Track complexity over time: - -```bash -# Create baseline snapshot -hotspots analyze src/ --mode snapshot - -# Compare current code vs baseline -hotspots analyze src/ --mode delta - -# Compare any two git refs (branches, tags, SHAs) -hotspots diff main HEAD -hotspots diff v1.0.0 v2.0.0 --format json -hotspots diff main HEAD --top 10 --policy - -# See complexity trends -hotspots trends . - -# Train a repo-specific ranker from fix-commit history -hotspots train . --blame - -# Check whether the trained model is actually useful (P@K evaluation) -hotspots train . --eval - -# Prune unreachable snapshots (after force-push or branch deletion) -hotspots prune --unreachable --older-than 30 - -# Compact snapshot history -hotspots compact --level 0 -``` - -Delta mode and `hotspots diff` show: -- Functions that got more complex -- Functions that were simplified -- New high-complexity functions introduced -- Overall repository complexity trend - -`hotspots diff` requires snapshots to exist for both refs (run `hotspots analyze --mode snapshot` at each ref first). Use `--auto-analyze` to generate missing snapshots automatically via git worktrees. - -### ⚙️ Configuration Commands - -```bash -# Show resolved configuration (weights, thresholds, filters) -hotspots config show - -# Validate configuration file without running analysis -hotspots config validate -``` - -### 🪝 Hook Templates - -```bash -# Print pre-commit and CI hook templates to stdout -hotspots init --hooks -``` - -Outputs ready-to-use shell hooks and pre-commit framework config for enforcing policies locally. 
- --- ## Documentation -- 🚀 [Quick Start](docs/getting-started/quick-start.md) - Get started in 5 minutes -- 📖 [CLI Reference](docs/reference/cli.md) - All commands and options -- 📊 [Scoring Methodology](docs/reference/scoring.md) - How scores are calculated and ranked -- 🎯 [CI Integration](docs/guide/ci-integration.md) - GitHub Actions, GitLab CI -- 🤖 [AI Integration](docs/integrations/ai-agents.md) - Claude, Cursor, Copilot -- 🏗️ [Architecture](docs/architecture/overview.md) - How it works -- 🤝 [Contributing](docs/contributing/index.md) - Add languages, fix bugs, improve docs - -**Full documentation:** [docs/index.md](docs/index.md) +- [docs/USAGE.md](docs/USAGE.md) — workflows, CI setup, output formats, policy engine, suppression, training, snapshot management +- [docs/REFERENCE.md](docs/REFERENCE.md) — complete CLI reference, all flags, config options, JSON schema, metrics formula +- [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) — system design, analysis pipeline, invariants, design decisions +- [docs/CONTRIBUTING.md](docs/CONTRIBUTING.md) — dev setup, adding languages, release process --- ## Why Hotspots? -### vs ESLint Complexity Rules - -**ESLint:** Checks individual metrics (CC > 10). No context about change frequency or real-world risk. -**Hotspots:** Combines multiple metrics into LRS. Integrates git history. Prioritizes based on actual risk. - -### vs SonarQube / CodeClimate - -**SonarQube:** Enterprise platform, complex setup, slow scans, requires server infrastructure. -**Hotspots:** Single binary, instant analysis, zero config, works offline, git history built-in. - -### vs Code Reviews - -**Reviews:** Catch complexity subjectively. Miss gradual regressions. Don't track trends. -**Hotspots:** Objective metrics. Catches every change. Shows trends over time. Enforces policies automatically. - -**Use both:** Hotspots + code reviews = comprehensive quality control. 
- ---- - -## Real-World Use Cases - -### 🔥 Incident Prevention -"We had 3 production incidents in Q1. All originated from the same 5 functions. Hotspots flagged all 5 as critical. We refactored them in Q2. Zero incidents since." - -### 🚀 Faster Onboarding -"New engineers use Hotspots to identify risky code before touching it. 'This function is LRS 11.2, be careful' = instant context." - -### 🎯 Refactoring Sprints -"We allocate 1 sprint per quarter to reduce our top 10 hotspots. Dropped average LRS from 6.2 to 4.1 over 6 months." - -### 🤖 AI-Guided Refactoring -"Feed hotspots JSON to Claude. It suggests refactorings for critical functions. Accept, commit, verify LRS dropped. Repeat." - -### ⚖️ Technical Debt Metrics -"Execs ask 'How's our tech debt?' I show them: 23 critical functions (down from 31), average LRS 4.8 (down from 5.3). Clear progress." - ---- - -## Installation - -### Quick Install - -**macOS (Homebrew):** -```bash -brew install Stephen-Collins-tech/tap/hotspots -``` - -**npm:** -```bash -npm install -g @stephencollinstech/hotspots -``` - -**pip:** -```bash -pip install hotspots-cli -``` - -**cargo (Rust toolchain):** -```bash -cargo install hotspots-cli -``` - -**macOS / Linux (shell script):** -```bash -curl -fsSL https://raw.githubusercontent.com/Stephen-Collins-tech/hotspots/main/install.sh | sh -``` - -Verify with `hotspots --version`. - -**Install a specific version:** -```bash -HOTSPOTS_VERSION=v1.0.0 curl -fsSL https://raw.githubusercontent.com/Stephen-Collins-tech/hotspots/main/install.sh | sh -``` - -### Build from Source +**vs ESLint complexity rules** — ESLint checks one metric in isolation with no git context. Hotspots combines four metrics, git history, and call graph topology into a single prioritized list. 
-```bash -git clone https://github.com/Stephen-Collins-tech/hotspots.git -cd hotspots -cargo build --release -mkdir -p ~/.local/bin -cp target/release/hotspots ~/.local/bin/ -``` +**vs SonarQube / CodeClimate** — Enterprise platforms requiring server infrastructure. Hotspots is a single binary, zero config, works offline, results in seconds. -**Requirements:** Rust 1.75 or later +**vs code reviews** — Reviews catch complexity subjectively and miss gradual drift. Hotspots enforces objective thresholds automatically on every commit. --- ## Contributing -We welcome contributions! - -- 🐛 [Report bugs](https://github.com/Stephen-Collins-tech/hotspots/issues) -- 💡 [Request features](https://github.com/Stephen-Collins-tech/hotspots/discussions) -- 🔧 [Submit PRs](docs/contributing/index.md) -- 📖 [Improve docs](docs/contributing/index.md) - -**Want to add a language?** See [docs/contributing/adding-languages.md](docs/contributing/adding-languages.md) - we have a proven pattern for adding TypeScript, JavaScript, Go, Python, Rust, and Java. +- Bug reports: [GitHub Issues](https://github.com/Stephen-Collins-tech/hotspots/issues) +- Feature requests: [GitHub Discussions](https://github.com/Stephen-Collins-tech/hotspots/discussions) +- PRs: see [docs/CONTRIBUTING.md](docs/CONTRIBUTING.md) --- ## License -MIT License - see [LICENSE-MIT](LICENSE-MIT) for details. - ---- - -## Next Steps - -1. ⚡ [Install Hotspots](#installation) (2 minutes) -2. 🔍 Run your first analysis: `hotspots analyze src/` -3. 🎯 Identify your top 10 hotspots -4. 🛠️ Refactor the worst offender -5. 📊 Add to CI/CD: `hotspots analyze src/ --mode delta --policy` -6. 🧠 Train a repo-specific ranker: `hotspots train . --blame` -6. 🤖 Integrate with AI: [AI Integration Guide](docs/integrations/ai-agents.md) - -**Questions?** Open a [GitHub Discussion](https://github.com/Stephen-Collins-tech/hotspots/discussions). - -**Found a bug?** Open an [issue](https://github.com/Stephen-Collins-tech/hotspots/issues). 
- ---- - -**Stop refactoring guesswork. Start with Hotspots.** +MIT — see [LICENSE-MIT](LICENSE-MIT). diff --git a/TASKS.md b/TASKS.md deleted file mode 100644 index 313ec32..0000000 --- a/TASKS.md +++ /dev/null @@ -1,477 +0,0 @@ -# Pattern Detection — Implementation Tasks - -Branch: `feature/pattern-detection` - -This file tracks the implementation of pattern detection as specified in `docs/patterns.md`. Work through tasks in order — each task depends on the ones before it. - ---- - -## Context - -Patterns are informational labels derived from existing metrics. They do not affect LRS or risk bands. The authoritative spec is `docs/patterns.md`. - -**Files that will be modified:** - -| File | Change | -|---|---| -| `hotspots-core/src/patterns.rs` | NEW — pattern engine | -| `hotspots-core/src/lib.rs` | expose `pub mod patterns` | -| `hotspots-core/src/callgraph.rs` | make `is_entry_point` pub | -| `hotspots-core/src/config.rs` | add `PatternThresholdsConfig` to `HotspotsConfig`; add `pattern_thresholds` to `ResolvedConfig` | -| `hotspots-core/src/report.rs` | add `patterns` and `pattern_details` to `FunctionRiskReport` | -| `hotspots-core/src/analysis.rs` | compute Tier 1 patterns, pass to report | -| `hotspots-core/src/snapshot.rs` | add `neighbor_churn` to `CallGraphMetrics`; add `patterns` and `pattern_details` to `FunctionSnapshot`; compute neighbor_churn and Tier 2 patterns in enrichment | -| `hotspots-cli/src/main.rs` | add `PATTERNS` column to tabular output; add `--explain-patterns` flag | -| `hotspots-core/src/html.rs` | add `Patterns` column to HTML report | -| `hotspots-core/tests/golden_tests.rs` | add golden assertions for pattern output | -| `hotspots-core/tests/fixtures/` | add synthetic fixture for pattern golden tests | -| `hotspots-core/tests/golden/` | add expected output file for pattern golden tests | - ---- - -## Task 1 — Create `patterns.rs`: pure pattern engine - -**File:** `hotspots-core/src/patterns.rs` (new file) - -Implement the entire 
pattern classification engine as a pure, stateless module. No I/O, no global state. - -### String types, not `&'static str` - -All string fields in `PatternDetail` and `TriggeredBy` must be `String`, not `&'static str`. `&'static str` cannot be used in a `#[derive(Serialize)]` struct without a custom impl. The allocation cost is negligible — patterns are computed once per function per analysis run. - -### Types to define - -```rust -// Input for Tier 1 classification -pub struct Tier1Input { - pub cc: usize, - pub nd: usize, - pub fo: usize, - pub ns: usize, - pub loc: usize, -} - -// Input for Tier 2 classification (all Option — absent outside snapshot mode) -pub struct Tier2Input { - pub fan_in: Option, - pub scc_size: Option, - pub churn_lines: Option, - pub days_since_last_change: Option, - pub neighbor_churn: Option, - pub is_entrypoint: bool, // suppresses middle_man and neighbor_risk when true -} - -// All thresholds in one struct — defaults match docs/patterns.md. -// Passed by reference into classify(); callers use Thresholds::default() unless -// overridden by config. This keeps all threshold logic in one place and makes -// per-project overrides (Task 9) a thin config-loading layer. 
-#[derive(Debug, Clone)] -pub struct Thresholds { - pub complex_branching_cc: usize, - pub complex_branching_nd: usize, - pub deeply_nested_nd: usize, - pub exit_heavy_ns: usize, - pub god_function_loc: usize, - pub god_function_fo: usize, - pub long_function_loc: usize, - pub churn_magnet_churn: u64, - pub churn_magnet_cc: usize, - pub cyclic_hub_scc: u32, - pub cyclic_hub_fan_in: u32, - pub hub_function_fan_in: u32, - pub hub_function_cc: usize, - pub middle_man_fan_in: u32, - pub middle_man_fo: usize, - pub middle_man_cc_max: usize, - pub neighbor_risk_churn: u64, - pub neighbor_risk_fo: usize, - pub shotgun_target_fan_in: u32, - pub shotgun_target_churn: u64, - pub stale_complex_cc: usize, - pub stale_complex_loc: usize, - pub stale_complex_days: u32, -} - -impl Default for Thresholds { ... } // all values from docs/patterns.md - -// A single triggered condition — used by --explain-patterns. -// All string fields are String (not &'static str) for serde compatibility. -#[derive(Debug, Clone, Serialize, Deserialize)] -pub struct TriggeredBy { - pub metric: String, // e.g. "LOC", "CC" - pub op: String, // ">=" or "<=" - pub value: u64, // observed value - pub threshold: u64, // threshold compared against -} - -// Full detail for one fired pattern — used by --explain-patterns. -#[derive(Debug, Clone, Serialize, Deserialize)] -pub struct PatternDetail { - pub id: String, - pub tier: u8, // 1 or 2 - pub kind: String, // "primitive" or "derived" - pub triggered_by: Vec<TriggeredBy>, -} -``` - -### Functions to implement - -```rust -// Returns sorted pattern IDs only (Tier 1 alphabetical, then Tier 2 alphabetical). -// Implemented by calling classify_detailed() and extracting ids — single source of -// threshold logic, no divergence possible. -pub fn classify(t1: &Tier1Input, t2: &Tier2Input, th: &Thresholds) -> Vec<String> - -// Returns full pattern detail for --explain-patterns. -// This is the canonical implementation — classify() delegates to this. 
-pub fn classify_detailed( - t1: &Tier1Input, - t2: &Tier2Input, - th: &Thresholds, -) -> Vec<PatternDetail> - -// Internal helpers — one per primitive pattern. -// Return Some(PatternDetail) if fired, None if not. -// Each helper constructs triggered_by from the metrics that crossed threshold. -fn check_complex_branching(t: &Tier1Input, th: &Thresholds) -> Option<PatternDetail> -fn check_deeply_nested(t: &Tier1Input, th: &Thresholds) -> Option<PatternDetail> -fn check_exit_heavy(t: &Tier1Input, th: &Thresholds) -> Option<PatternDetail> -fn check_god_function(t: &Tier1Input, th: &Thresholds) -> Option<PatternDetail> -fn check_long_function(t: &Tier1Input, th: &Thresholds) -> Option<PatternDetail> -fn check_churn_magnet(t1: &Tier1Input, t2: &Tier2Input, th: &Thresholds) -> Option<PatternDetail> -fn check_cyclic_hub(t2: &Tier2Input, th: &Thresholds) -> Option<PatternDetail> -fn check_hub_function(t1: &Tier1Input, t2: &Tier2Input, th: &Thresholds) -> Option<PatternDetail> -fn check_middle_man(t1: &Tier1Input, t2: &Tier2Input, th: &Thresholds) -> Option<PatternDetail> -fn check_neighbor_risk(t1: &Tier1Input, t2: &Tier2Input, th: &Thresholds) -> Option<PatternDetail> -fn check_shotgun_target(t2: &Tier2Input, th: &Thresholds) -> Option<PatternDetail> -fn check_stale_complex(t1: &Tier1Input, t2: &Tier2Input, th: &Thresholds) -> Option<PatternDetail> -``` - -**Entrypoint suppression:** `check_middle_man` and `check_neighbor_risk` return `None` immediately when `t2.is_entrypoint` is `true`. This is the complete implementation of entrypoint exclusion. - -**`volatile_god` (derived):** computed inside `classify_detailed()` after all primitives are checked. If both `god_function` and `churn_magnet` details are present in the result, append a `volatile_god` entry whose `triggered_by` is the union of both primitives' `triggered_by` lists. Do not re-evaluate raw thresholds. `kind: "derived"`. - -**Ordering:** Tier 1 results first (alphabetical), then Tier 2 (alphabetical). `volatile_god` sorts among Tier 2. - -### Unit tests (required in the same file, under `#[cfg(test)]`) - -For **every primitive pattern**, write four tests using `Thresholds::default()`: -1. 
Just below threshold — must not fire -2. Exactly at threshold — must fire -3. Well above threshold — must fire -4. Compound pattern: one condition met, other not — must not fire - -For **`volatile_god`** (derived): -1. Only `god_function` conditions met — `volatile_god` must not appear -2. Only `churn_magnet` conditions met — `volatile_god` must not appear -3. Both met — must contain all three: `god_function`, `churn_magnet`, `volatile_god` - -For **entrypoint suppression**: -1. `middle_man` conditions met, `is_entrypoint: false` — must fire -2. Same conditions, `is_entrypoint: true` — must not fire -3. Same pair of tests for `neighbor_risk` - -**Ordering test:** inputs that trigger all 5 Tier 1 patterns. Output must be exactly: -`["complex_branching", "deeply_nested", "exit_heavy", "god_function", "long_function"]` - -**Detail test:** trigger `god_function`. Call `classify_detailed()`. Assert the returned `PatternDetail` has `tier: 1`, `kind: "primitive"`, and `triggered_by` contains entries for `LOC` and `FO` with correct `value` and `threshold`. - ---- - -## Task 2 — `is_entry_point` pub + `neighbor_churn` on `CallGraphMetrics` - -### 2a — Make `is_entry_point` pub in `callgraph.rs` - -**File:** `hotspots-core/src/callgraph.rs` - -`is_entry_point` is currently `fn is_entry_point` (private). Change to `pub fn is_entry_point`. No other changes to this file. - -This method uses name-based heuristics (main, handler patterns, etc.) and is cheap to call. It does not need to be stored per-function — snapshot enrichment (Task 6) calls it directly. - -### 2b — Add `neighbor_churn` to `CallGraphMetrics` - -**File:** `hotspots-core/src/snapshot.rs` - -`CallGraphMetrics` is the struct stored in `FunctionSnapshot.callgraph`. Add: - -```rust -pub neighbor_churn: Option, -``` - -Initialize to `None` in all constructors. The value is computed during enrichment (Task 6) once churn data is available for all functions. 
- -**Grep first:** search for all `CallGraphMetrics { ` construction sites in `snapshot.rs` and add `neighbor_churn: None`. - ---- - -## Task 3 — Add `patterns` and `pattern_details` to `FunctionRiskReport` - -**File:** `hotspots-core/src/report.rs` - -Add to `FunctionRiskReport`: - -```rust -pub patterns: Vec<String>, -#[serde(skip_serializing_if = "Option::is_none")] -pub pattern_details: Option<Vec<PatternDetail>>, -``` - -Initialize `patterns` to `vec![]` and `pattern_details` to `None` in the constructor. Task 5 wires in real values. - -`pattern_details` is absent from JSON output by default. It is only populated when `--explain-patterns` is passed (Task 10). - -**Grep first:** search for all `FunctionRiskReport` construction sites and add the new fields. - ---- - -## Task 4 — Add `patterns` and `pattern_details` to `FunctionSnapshot` - -**File:** `hotspots-core/src/snapshot.rs` - -Add to `FunctionSnapshot`: - -```rust -pub patterns: Vec<String>, -#[serde(skip_serializing_if = "Option::is_none")] -pub pattern_details: Option<Vec<PatternDetail>>, -``` - -Initialize `patterns` to `vec![]` and `pattern_details` to `None`. Task 6 wires in real values. - -**Grep first:** search for all `FunctionSnapshot` construction sites and add the new fields. - ---- - -## Task 5 — Wire Tier 1 patterns in analysis mode - -**File:** `hotspots-core/src/analysis.rs` - -After `extract_metrics()` returns `RawMetrics`, compute patterns. Use `config.pattern_thresholds` (added in Task 9; use `Thresholds::default()` until then): - -```rust -let t1 = patterns::Tier1Input { - cc: raw.cc, nd: raw.nd, fo: raw.fo, ns: raw.ns, loc: raw.loc, -}; -let t2 = patterns::Tier2Input { - fan_in: None, scc_size: None, churn_lines: None, - days_since_last_change: None, neighbor_churn: None, - is_entrypoint: false, -}; -let function_patterns = patterns::classify(&t1, &t2, &config.pattern_thresholds); -``` - -Pass `function_patterns` to `FunctionRiskReport`. When `--explain-patterns` is active (Task 10), call `classify_detailed()` instead and populate `pattern_details`. 
- ---- - -## Task 6 — Compute `neighbor_churn` and Tier 2 patterns in snapshot enrichment - -**File:** `hotspots-core/src/snapshot.rs` - -Runs after call graph and git churn data are both available for all functions. - -### 6a — Compute `neighbor_churn` per function - -For each function snapshot, sum `churn_lines` across all direct callees (1-hop outgoing edges). Look up each callee's churn total from the already-built churn map; contribute 0 for callees with no entry. Store in `snap.callgraph.neighbor_churn`. - -### 6b — Compute Tier 2 patterns - -```rust -let t1 = patterns::Tier1Input { - cc: snap.metrics.cc, nd: snap.metrics.nd, - fo: snap.metrics.fo, ns: snap.metrics.ns, loc: snap.metrics.loc, -}; -let t2 = patterns::Tier2Input { - fan_in: snap.callgraph.as_ref().map(|cg| cg.fan_in as u32), - scc_size: snap.callgraph.as_ref().map(|cg| cg.scc_size as u32), - churn_lines: snap.churn.as_ref().map(|c| c.lines_added + c.lines_deleted), - days_since_last_change: snap.days_since_last_change, - neighbor_churn: snap.callgraph.as_ref().and_then(|cg| cg.neighbor_churn), - is_entrypoint: call_graph.is_entry_point(&snap.function_id), -}; -snap.patterns = patterns::classify(&t1, &t2, &config.pattern_thresholds); -// if --explain-patterns (Task 10): -// snap.pattern_details = Some(patterns::classify_detailed(&t1, &t2, &config.pattern_thresholds)); -``` - ---- - -## Task 7 — Tabular output for `patterns` - -**File:** `hotspots-cli/src/main.rs` - -Add a `PATTERNS` column to tabular output: -- Value: comma-separated pattern IDs, or `-` if empty -- Only render the column when at least one function in the result has patterns -- Column appears after `BAND` - -JSON output requires no changes — `patterns: Vec` serialises as a JSON array automatically. 
- ---- - -## Task 8 — HTML report patterns column - -**File:** `hotspots-core/src/html.rs` - -Add a `Patterns` column to the HTML function table: -- Each pattern ID rendered as `` — allows per-pattern CSS -- Show `-` for empty lists -- Add minimal CSS: `.pattern { font-family: monospace; font-size: 0.9em; margin-right: 4px; }` - -No new data pipeline work — `FunctionSnapshot.patterns` is populated by Task 6. - ---- - -## Task 9 — Per-project threshold overrides via `.hotspotsrc.json` - -**Files:** `hotspots-core/src/config.rs` - -Follow the exact same pattern as `ScoringWeightsConfig` → `ScoringWeights` → `ResolvedConfig.scoring_weights`. - -### 9a — Add `PatternThresholdsConfig` to `HotspotsConfig` - -```rust -#[derive(Debug, Clone, Serialize, Deserialize)] -#[serde(deny_unknown_fields)] -pub struct PatternThresholdsConfig { - pub complex_branching_cc: Option, - pub complex_branching_nd: Option, - pub deeply_nested_nd: Option, - pub exit_heavy_ns: Option, - pub god_function_loc: Option, - pub god_function_fo: Option, - pub long_function_loc: Option, - pub churn_magnet_churn: Option, - pub churn_magnet_cc: Option, - pub cyclic_hub_scc: Option, - pub cyclic_hub_fan_in: Option, - pub hub_function_fan_in: Option, - pub hub_function_cc: Option, - pub middle_man_fan_in: Option, - pub middle_man_fo: Option, - pub middle_man_cc_max: Option, - pub neighbor_risk_churn: Option, - pub neighbor_risk_fo: Option, - pub shotgun_target_fan_in: Option, - pub shotgun_target_churn: Option, - pub stale_complex_cc: Option, - pub stale_complex_loc: Option, - pub stale_complex_days: Option, -} -``` - -Add to `HotspotsConfig` (note: `deny_unknown_fields` is on `HotspotsConfig` — the new field must be added here or config parsing will reject the `patterns` key): - -```rust -#[serde(default)] -pub patterns: Option, -``` - -### 9b — Add `pattern_thresholds` to `ResolvedConfig` - -```rust -pub pattern_thresholds: patterns::Thresholds, -``` - -### 9c — Merge in `resolve()` - -Follow the 
`scoring_weights` pattern: start from `patterns::Thresholds::default()`, override field by field for any `Some` value in `PatternThresholdsConfig`. Each field falls back to the default independently — partial configs work. - -### 9d — Validation - -Add a `validate_pattern_thresholds()` helper called from `HotspotsConfig::validate()`. Rules: -- All usize/u32/u64 thresholds must be > 0 -- `middle_man_cc_max` must be < `hub_function_cc` (prevents middle_man and hub_function from being simultaneously impossible to distinguish — warn, not error) - -### 9e — Wire through call sites - -Update Tasks 5 and 6's `classify()` calls to use `&config.pattern_thresholds` instead of `&Thresholds::default()`. This is a one-line change at each call site once `ResolvedConfig` has the field. - ---- - -## Task 10 — `--explain-patterns` flag - -**File:** `hotspots-cli/src/main.rs` - -Add `--explain-patterns` boolean flag to the `analyze` and `snapshot` subcommands. - -When set: -- Call `patterns::classify_detailed()` instead of `patterns::classify()` in analysis (Task 5) and snapshot enrichment (Task 6) -- Store result in `pattern_details` on the report/snapshot struct -- JSON output: `pattern_details` serialises automatically (absent by default via `skip_serializing_if`) -- Tabular output: after the main row for a function, print one indented line per pattern: - ``` - god_function: LOC=85 (≥60), FO=12 (≥10) - ``` -- HTML output: expand each `` into a `` tooltip - -`pattern_details` must remain `None` when `--explain-patterns` is not passed. Do not call `classify_detailed()` in the hot path. - ---- - -## Task 11 — Golden tests for Tier 1 patterns - -**Files:** -- `hotspots-core/tests/fixtures/patterns_tier1.ts` (new) -- `hotspots-core/tests/golden/patterns_tier1.json` (new) -- `hotspots-core/tests/golden_tests.rs` (add test case) - -### Fixture requirements - -Write a TypeScript fixture with five named functions, each engineered to trigger specific Tier 1 patterns: - -1. 
`godAndLong` — triggers `god_function` and `long_function` (LOC ≥ 80, FO ≥ 10) -2. `complexBranching` — triggers `complex_branching` (CC ≥ 10, ND ≥ 4) but not `deeply_nested` -3. `deeplyNested` — triggers `deeply_nested` alone (ND ≥ 5, CC < 10) -4. `exitHeavy` — triggers `exit_heavy` (NS ≥ 5) -5. `allFiveTier1` — triggers all five Tier 1 patterns simultaneously - -After running analysis on the fixture, capture output and commit it as the golden file. The golden file locks the `patterns` array per function. - -### Test assertion - -```rust -// In golden_tests.rs: -// Run analysis, load golden JSON, assert patterns match for each function by name. -``` - ---- - -## Task 12 — Full test suite pass - -``` -cargo fmt --all -- --check -cargo clippy --all-targets --all-features -- -D warnings -cargo test -``` - -**Anticipated failure modes:** - -- **`deny_unknown_fields`** on `HotspotsConfig` — if Task 9 is not complete, any test that serialises/deserialises a config with the `patterns` key will fail. Complete Task 9 before running tests. -- **Struct exhaustiveness** — any `..Default::default()` or struct literal missing new fields will fail to compile. Grep for all construction sites before compiling. -- **Golden file drift** — `patterns: []` will appear in JSON output for all existing golden files. Update them: an empty `patterns` array is valid and expected for existing fixtures that don't trigger any patterns. -- **Clippy `needless_pass_by_value`** — `classify()` takes `&Thresholds` by reference; ensure no call site passes by value. - ---- - -## Out of scope for this branch - -- **Percentile-based thresholds** — requires a two-pass analysis (collect all metric values repo-wide, then classify per-function). The current per-function stateless engine cannot support this without an architectural change to the analysis pipeline. Documented as a planned extension in `docs/patterns.md`. 
- ---- - -## Definition of done - -- [x] All 13 patterns implemented in `patterns.rs` with full unit test coverage -- [x] `Thresholds` struct with `Default` impl; all `classify()` calls accept `&Thresholds` -- [x] Entrypoint suppression for `middle_man` and `neighbor_risk` -- [x] `FunctionRiskReport.patterns` populated in analyze mode (Tier 1) -- [x] `FunctionSnapshot.patterns` populated in snapshot mode (Tier 1 + Tier 2) -- [x] `neighbor_churn` computed and stored in snapshot enrichment -- [x] Tabular output shows `PATTERNS` column when non-empty -- [x] HTML output shows `Patterns` column with colored pill badges and pattern breakdown panel -- [x] JSON output includes `patterns` array; `pattern_details` absent unless `--explain-patterns` -- [x] `--explain-patterns` populates `pattern_details` with triggered conditions -- [x] `.hotspotsrc.json` `patterns` section overrides thresholds per-field; partial configs work -- [x] Golden test fixture and expected output committed -- [x] `cargo fmt`, `cargo clippy`, `cargo test` all pass clean diff --git a/docs/.vitepress/config.js b/docs/.vitepress/config.js index 4b7165e..f40359c 100644 --- a/docs/.vitepress/config.js +++ b/docs/.vitepress/config.js @@ -3,7 +3,6 @@ import { defineConfig } from 'vitepress' export default defineConfig({ title: 'Hotspots', description: 'Find where your engineering attention has the highest expected value.', - head: [ ['link', { rel: 'icon', type: 'image/svg+xml', href: '/logo.svg' }], @@ -22,67 +21,24 @@ export default defineConfig({ logo: '/logo.svg', nav: [ - { text: 'Guide', link: '/guide/usage' }, - { text: 'Reference', link: '/reference/cli' }, - { text: 'Codebase Guide', link: '/code-architecture/' }, + { text: 'Quick Start', link: '/quickstart' }, + { text: 'Usage', link: '/USAGE' }, + { text: 'Reference', link: '/REFERENCE' }, + { text: 'Architecture', link: '/ARCHITECTURE' }, + { text: 'Contributing', link: '/CONTRIBUTING' }, { text: 'hotspots.dev', link: 'https://hotspots.dev' }, { 
text: 'GitHub', link: 'https://github.com/Stephen-Collins-tech/hotspots' } ], sidebar: [ { - text: 'Getting Started', - items: [ - { text: 'Installation', link: '/getting-started/installation' }, - { text: 'Quick Start', link: '/getting-started/quick-start' }, - ] - }, - { - text: 'Guide', - items: [ - { text: 'Usage & Workflows', link: '/guide/usage' }, - { text: 'Training a Ranker', link: '/guide/training' }, - { text: 'Configuration', link: '/guide/configuration' }, - { text: 'CI/CD & GitHub Action', link: '/guide/ci-cd' }, - { text: 'Output Formats', link: '/guide/output-formats' }, - { text: 'AI Integration', link: '/integrations/ai-integration' }, - ] - }, - { - text: 'Reference', - items: [ - { text: 'CLI Reference', link: '/reference/cli' }, - { text: 'Scoring Methodology', link: '/reference/scoring' }, - { text: 'Scoring Changelog', link: '/reference/scoring-changelog' }, - { text: 'Metrics & LRS', link: '/reference/metrics' }, - { text: 'Language Support', link: '/reference/language-support' }, - ] - }, - { - text: 'Codebase Guide', - items: [ - { text: 'Overview', link: '/code-architecture/' }, - { text: 'Analysis Pipeline', link: '/code-architecture/pipeline' }, - { text: 'Core Crate Modules', link: '/code-architecture/core-modules' }, - { text: 'CLI and GitHub Action', link: '/code-architecture/cli-and-action' }, - { text: 'Data Model and Persistence', link: '/code-architecture/data-model' }, - { text: 'Contributor Change Guide', link: '/code-architecture/change-guide' }, - ] - }, - { - text: 'Architecture Notes', + text: 'Docs', items: [ - { text: 'Overview', link: '/architecture/' }, - { text: 'Design Decisions', link: '/architecture/design-decisions' }, - { text: 'Invariants', link: '/architecture/invariants' }, - { text: 'Multi-language Design', link: '/architecture/multi-language' }, - { text: 'Testing Strategy', link: '/architecture/testing' }, - ] - }, - { - text: 'Contributing', - items: [ - { text: 'Contributing Guide', link: '/contributing/' 
}, + { text: 'Quick Start', link: '/quickstart' }, + { text: 'Usage & Workflows', link: '/USAGE' }, + { text: 'CLI & Config Reference', link: '/REFERENCE' }, + { text: 'Architecture', link: '/ARCHITECTURE' }, + { text: 'Contributing', link: '/CONTRIBUTING' }, ] }, ], @@ -103,7 +59,7 @@ export default defineConfig({ editLink: { pattern: 'https://github.com/Stephen-Collins-tech/hotspots/edit/main/docs/:path', text: 'Edit this page on GitHub' - } + }, }, // Ignore internal docs from site diff --git a/docs/.vitepress/theme/custom.css b/docs/.vitepress/theme/custom.css new file mode 100644 index 0000000..65a6cca --- /dev/null +++ b/docs/.vitepress/theme/custom.css @@ -0,0 +1,39 @@ +:root { + /* Brand */ + --vp-c-brand-1: #f97316; + --vp-c-brand-2: #ea6a0a; + --vp-c-brand-3: #c2550a; + --vp-c-brand-soft: rgba(249, 115, 22, 0.14); + + /* Home hero name gradient */ + --vp-home-hero-name-color: transparent; + --vp-home-hero-name-background: linear-gradient(135deg, #f97316 0%, #fb923c 100%); + + /* Home hero image glow */ + --vp-home-hero-image-background-image: radial-gradient(ellipse at center, rgba(249, 115, 22, 0.35) 0%, transparent 70%); + --vp-home-hero-image-filter: blur(44px); + + /* Buttons */ + --vp-button-brand-border: transparent; + --vp-button-brand-text: #ffffff; + --vp-button-brand-bg: #f97316; + --vp-button-brand-hover-border: transparent; + --vp-button-brand-hover-text: #ffffff; + --vp-button-brand-hover-bg: #ea6a0a; + --vp-button-brand-active-border: transparent; + --vp-button-brand-active-text: #ffffff; + --vp-button-brand-active-bg: #c2550a; + + /* Badge */ + --vp-badge-tip-border: transparent; + --vp-badge-tip-text: #ffffff; + --vp-badge-tip-bg: #f97316; +} + +/* Dark mode adjustments */ +.dark { + --vp-c-brand-1: #fb923c; + --vp-c-brand-2: #f97316; + --vp-c-brand-3: #ea6a0a; + --vp-c-brand-soft: rgba(249, 115, 22, 0.16); +} diff --git a/docs/.vitepress/theme/index.js b/docs/.vitepress/theme/index.js new file mode 100644 index 0000000..3d71960 --- 
/dev/null +++ b/docs/.vitepress/theme/index.js @@ -0,0 +1,7 @@ +import DefaultTheme from 'vitepress/theme' +import './custom.css' // This safely registers your custom styles + +export default { + extends: DefaultTheme, + // You can also register custom layout components here if needed +} \ No newline at end of file diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md new file mode 100644 index 0000000..030ea12 --- /dev/null +++ b/docs/ARCHITECTURE.md @@ -0,0 +1,186 @@ +# Architecture + +## Overview + +Hotspots is a Rust workspace with two crates: + +- **`hotspots-core`** — library: parsing, CFG construction, metrics, risk scoring, snapshot persistence, delta, policy, report generation +- **`hotspots-cli`** — binary: argument parsing, file collection, output formatting, command dispatch + +**Technology stack:** Rust 2021 Edition (MSRV 1.75), `swc_ecma_parser` for JS/TS, `tree-sitter-*` for all other languages, `clap` v4.5 for CLI, `serde`/`serde_json` for serialization, `anyhow` for error propagation. + +## Analysis Pipeline + +``` +Source Code + ↓ +[Parser] → Module AST (per language, per file) + ↓ +[Function Discovery] → FunctionNode[] + ↓ (for each function) +[CFG Builder] → Control Flow Graph + ↓ +[Metric Extraction] → CC, ND, FO, NS, LOC + ↓ +[Risk Components] → R_cc, R_nd, R_fo, R_ns (log-scaled, bounded) + ↓ +[LRS + Risk Band] + ↓ +[Pattern Classification] → Tier 1 (structural) + ↓ +[optional: git history + call graph enrichment] + ↓ +[Activity Risk Score + Tier 2 patterns + Driver Label + Quadrant] + ↓ +[Snapshot persistence / Delta computation / Report rendering] + ↓ +[Output: text / JSON / JSONL / HTML / SARIF] +``` + +Stages above the enrichment line run on source code alone. Stages below require a git repository (`--mode snapshot` or `--mode delta`). + +## Phase Details + +### Phase 1 — Parsing and Function Discovery + +Each source file is parsed into an AST by the language-specific parser module. 
All function types are discovered: declarations, expressions, arrow functions, methods, object literal methods, closures. + +Functions are sorted by source position (byte offset) before processing to ensure deterministic output. Anonymous functions are named `@:`. + +**JS/TS:** SWC parser (`swc_ecma_parser`). Decorator support enabled for all `.ts` files (Angular `@Component`, etc.). JSX enabled for `.jsx` and `.js` files (React webpack convention). **All other languages:** tree-sitter parsers. + +### Phase 2 — CFG Construction + +A Control Flow Graph is built for each function. The CFG models all control structures explicitly: +- `if`/`else` — condition node, then/else branches, lazy join node +- Loops (`for`, `while`, `do-while`, `for-in`, `for-of`) — loop header, back edge, lazy break target +- `switch` — switch node, case branches, lazy join (no switch→join edge to avoid CC inflation) +- `try`/`catch`/`finally` — handler edges, lazy join, fallback `try_start→finally_start` edge only when no catch handler +- Early exits (`return`, `throw`, `break`, `continue`) — edge to exit/break-target; `current_node = None` marks dead code + +Key correctness rules: +- Join nodes are created **lazily** — only when needed (when at least one live path reaches them). Eager join node creation causes orphaned nodes with no predecessors, failing CFG validation. +- After a terminating statement sets `current_node = None`, subsequent `visit_*` calls return early (`let Some(from_node) = self.current_node else { return; }`) to prevent panics on dead code. +- `BreakableContext.break_target: Option`: `Some` for condition-guarded loops (break target pre-created); `None` (lazy) for `do-while`, infinite `for`, and `switch`. + +CFG validation: all nodes must be reachable from entry; all non-exit nodes must have at least one successor. 
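The two validation rules above can be sketched as a small reachability check. This is a minimal illustration, not the `hotspots-core` implementation: the `Cfg` struct, its index-based nodes, and the `validate` function are hypothetical names chosen for the example.

```rust
use std::collections::VecDeque;

/// Hypothetical minimal CFG: nodes are the indices 0..node_count and
/// edges are (from, to) pairs. Not the real hotspots-core type.
struct Cfg {
    node_count: usize,
    edges: Vec<(usize, usize)>,
    entry: usize,
    exit: usize,
}

/// Checks both rules: every node is reachable from entry, and every
/// non-exit node has at least one successor.
/// Assumes all edge endpoints are valid node indices.
fn validate(cfg: &Cfg) -> Result<(), String> {
    let mut successors: Vec<Vec<usize>> = vec![Vec::new(); cfg.node_count];
    for &(from, to) in &cfg.edges {
        successors[from].push(to);
    }

    // Breadth-first search from the entry node.
    let mut reachable = vec![false; cfg.node_count];
    let mut queue = VecDeque::new();
    reachable[cfg.entry] = true;
    queue.push_back(cfg.entry);
    while let Some(node) = queue.pop_front() {
        for &next in &successors[node] {
            if !reachable[next] {
                reachable[next] = true;
                queue.push_back(next);
            }
        }
    }

    let orphans: Vec<usize> = (0..cfg.node_count).filter(|&n| !reachable[n]).collect();
    if !orphans.is_empty() {
        return Err(format!("Nodes not reachable from entry: {:?}", orphans));
    }

    for node in 0..cfg.node_count {
        if node != cfg.exit && successors[node].is_empty() {
            return Err(format!("Non-exit node {} has no successor", node));
        }
    }
    Ok(())
}

fn main() {
    // if/else where both branches return: entry(0) -> cond(1) -> then(2) / else(3) -> exit(4).
    let lazy = Cfg {
        node_count: 5,
        edges: vec![(0, 1), (1, 2), (1, 3), (2, 4), (3, 4)],
        entry: 0,
        exit: 4,
    };
    println!("lazy join: {:?}", validate(&lazy));

    // Same function, but with an eagerly created join node (5) that no live path reaches.
    let eager = Cfg {
        node_count: 6,
        edges: vec![(0, 1), (1, 2), (1, 3), (2, 4), (3, 4), (5, 4)],
        entry: 0,
        exit: 4,
    };
    println!("eager join: {:?}", validate(&eager));
}
```

The second graph shows why join nodes are created lazily: when both branches terminate, an eager join node has no predecessors and fails the reachability rule.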
+ +### Phase 3 — Metric Extraction + +From the validated CFG: +- **CC** = `E − N + 2` + one per `&&`/`||` short-circuit + one per `switch` case + one per `catch` clause +- **ND** = maximum nesting depth tracked during AST traversal (if, loops, switch, try; excludes bare blocks and lexical scopes) +- **FO** = count of distinct call expressions during AST traversal; each segment of a chained call counts independently (`a().b().c()` = 3) +- **NS** = count of non-tail `return`, `throw`, `break`, `continue` during traversal +- **LOC** = physical line count of the function body + +### Phase 4 — Risk Scoring + +Log-scale transforms and weighted sum → LRS → risk band. See [REFERENCE.md](REFERENCE.md#lrs-formula) for formulas. + +### Phase 5 — Enrichment (snapshot mode) + +**Git history:** `git log` provides per-file or per-function (with `-L`) churn and touch counts. Results cached in `.hotspots/touch-cache.json.zst`. Hybrid mode: file-level for all functions, per-function for files with ≥ N touches/30d. + +**Call graph:** Import resolution builds a cross-file call graph. Fan-in, fan-out, PageRank, betweenness centrality (exact for < 2000 nodes; Brandes algorithm with k=256 pivots for larger), SCC (Tarjan's algorithm), dependency depth (topological sort). + +**Pattern classification:** Tier 2 patterns check call graph and git data against thresholds. `volatile_god` is derived (fires only when both `god_function` and `churn_magnet` are true). + +**Driver label assignment:** Percentile-relative checks in priority order (see [REFERENCE.md](REFERENCE.md#driver-labels)). + +**Quadrant assignment:** Band × activity → `fire`/`debt`/`watch`/`ok`. + +### Phase 6 — Snapshot Persistence + +Snapshots are stored as `/.hotspots/snapshots/.json.zst` (compressed). They are immutable by default — identified by commit SHA. `--force` regenerates; `--no-persist` skips writing. + +Index at `.hotspots/index.json` tracks known snapshots. 
+ +Delta computation (`Delta::new(head, Some(&base))`) produces per-function status (`new`/`deleted`/`modified`/`unchanged`) and metric deltas. + +## Module Structure + +``` +hotspots-core/src/ +├── language/ +│ ├── typescript/ # SWC-based parser + CFG builder +│ ├── javascript/ # same, JSX-enabled +│ ├── go/ +│ ├── java/ +│ ├── python/ +│ ├── rust/ +│ ├── c/ +│ ├── csharp/ +│ └── vue/ +├── cfg/ +│ ├── builder.rs # generic CFG construction traits +│ └── mod.rs # CfgNode, CfgEdge, Cfg, validation +├── metrics.rs # raw metric extraction +├── risk.rs # LRS, risk components, risk bands +├── patterns.rs # Tier 1 + Tier 2 pattern detection +├── drivers.rs # driver label assignment +├── snapshot.rs # snapshot serialization, persistence, loading +├── delta.rs # delta computation +├── policy.rs # policy rule evaluation +├── analysis.rs # pipeline orchestration +├── aggregates.rs # file_risk, co_change, modules, models +├── callgraph.rs # fan-in/out, PageRank, betweenness, SCC +├── git.rs # git log integration, touch cache, ref resolution +├── config.rs # config loading and resolution +├── html.rs # HTML report rendering +├── sarif.rs # SARIF output +└── report.rs # JSON/JSONL rendering + +hotspots-cli/src/ +├── main.rs # CLI entry, Commands enum +└── cmd/ + ├── analyze.rs + ├── diff.rs + ├── train.rs + ├── trends.rs + ├── prune.rs + ├── compact.rs + ├── config.rs + └── init.rs +``` + +## Global Invariants + +These are non-negotiable. Any violation is a bug. + +1. **Per-function analysis** — each function analyzed independently; no cross-function state during analysis +2. **No global mutable state** — no `static mut`, no shared mutable references between functions +3. **No randomness, clocks, threads, or async** — all operations fully deterministic +4. **Deterministic traversal order** — files sorted by path; functions sorted by source position (byte offset) +5. 
**Formatting/whitespace invariance** — only structural AST nodes used; comments and whitespace do not affect results +6. **Identical input → byte-for-byte identical output** — all JSON key ordering and floating-point formatting are deterministic + +## Testing Strategy + +**Golden tests** (`tests/golden/`) — expected JSON outputs for known fixture files. Automatically verified in CI. Updated when metric behavior intentionally changes. + +**Unit tests** — per-module: CFG construction, metric extraction, risk scoring, pattern detection, config parsing. + +**Integration tests** (`integration/`, pytest-based E2E) — full pipeline tests against real fixture projects. `make test-integration` runs pytest; `make test-comprehensive` auto-detects pytest or falls back to legacy script. + +**Determinism tests** — run analysis twice on identical input, assert byte-for-byte identical output. + +**No manual golden path fixing needed** — golden tests normalize file paths at assertion time for cross-platform consistency. + +## Design Decisions + +**SWC for JS/TS, tree-sitter for everything else.** SWC is Rust-native (no Node.js dependency), fast, and supports full TypeScript syntax. Tree-sitter parsers are widely available for other languages and have a uniform API. + +**Explicit CFG over implicit flow tracking.** More complex to implement, but enables formal CC calculation (`E − N + 2`) and catches edge cases (dead code after return, multiple catch clauses, finally with and without catch). + +**Logarithmic scaling for CC and FO.** The marginal risk of CC 1→4 is larger than CC 40→44. Caps prevent extreme outliers from dominating the aggregate score. + +**Immutable snapshots keyed by commit SHA.** Enables `hotspots diff` between any two historical refs, supports multiple in-flight branches without collision, and makes snapshot storage auditable. 
+ +**Lazy join node creation in CFG builder.** Eager creation of join nodes that never receive incoming edges (e.g., when all branches terminate) causes CFG validation failures (`"Nodes not reachable from entry"`). Lazy creation — only when at least one live path needs to merge — is the correct approach. + +**Percentile-relative driver labels.** Absolute thresholds for driver labels would fire on different functions in a 100-function repo vs a 100k-function repo. Percentile-relative checks (default P75) adapt to the codebase's own distribution. + +**No cross-function analysis in LRS.** LRS is per-function and named "Local" deliberately. Call graph metrics (fan-in, PageRank) are added at the Activity Risk layer, not folded into LRS. This separation keeps LRS a pure structural measure and Activity Risk the combined signal. + +**Betweenness centrality approximation above 2000 nodes.** Exact betweenness is O(V·E), which becomes prohibitive on large call graphs. The Brandes approximation with k=256 random pivot nodes is accurate enough for ranking purposes and runs in bounded time. diff --git a/docs/CONTRIBUTING.md b/docs/CONTRIBUTING.md new file mode 100644 index 0000000..fc7c0ef --- /dev/null +++ b/docs/CONTRIBUTING.md @@ -0,0 +1,194 @@ +# Contributing + +## Setup + +**Prerequisites:** Rust 1.75+ (`rustup install stable`), Git. Node.js 18+ only needed for GitHub Action development. + +```bash +git clone https://github.com/Stephen-Collins-tech/hotspots.git +cd hotspots +cargo build --release +cargo test +make install-hooks # install pre-commit git hooks (fmt + clippy + tests) +``` + +Binaries: `target/debug/hotspots` (fast compile) and `target/release/hotspots` (optimized). 
+ +## Development workflow + +```bash +cargo check # fast error checking +cargo build # debug build +cargo test # run all tests +cargo test --package hotspots-core # core only +cargo test test_name -- --nocapture # specific test with output +cargo test --test golden_tests # golden file tests +make test-integration # pytest E2E tests +make test-comprehensive # pytest or legacy fallback +``` + +**Before every commit** (pre-commit hooks enforce this, but run manually first): +```bash +cargo fmt --all -- --check +cargo clippy --all-targets --all-features -- -D warnings +cargo test +``` + +## Contributing code + +1. **Always branch before making changes.** Never commit directly to `main`. + ```bash + git checkout -b feat/your-feature-name + ``` + Branch naming: `feat/`, `fix/`, `refactor/`, `test/`, `docs/`, `chore/` + +2. Make changes. When modifying a struct or enum, grep all usages first to avoid cascading compile errors: + ```bash + grep -rn "TypeName" hotspots-core/src hotspots-cli/src + ``` + Batch all related edits across files, then compile once. + +3. Run checks (see above), fix all errors. + +4. Commit with a single-line message under 72 characters: + ``` + feat: add Python language support + fix: correct CC calculation for switch statements + docs: compress docs to five pages + ``` + Types: `feat`, `fix`, `refactor`, `test`, `docs`, `chore` + +5. Open a PR. One logical unit of work per branch. + +**Code conventions (from CLAUDE.md):** +- Minimal and focused changes — do not refactor or rename beyond what the task requires +- No comments unless the WHY is non-obvious +- No error handling for scenarios that can't happen +- Before implementing, list all files to be modified; get confirmation if > 5 files + +## Project structure + +``` +hotspots/ +├── hotspots-core/src/ +│ ├── language/ # per-language parsers and CFG builders +│ │ ├── typescript/ +│ │ ├── javascript/ +│ │ ├── go/ +│ │ ├── java/ +│ │ ├── python/ +│ │ ├── rust/ +│ │ └── ... 
+│ ├── cfg/ # CFG types and validation +│ ├── metrics.rs # raw metric extraction +│ ├── risk.rs # LRS formula +│ ├── patterns.rs # pattern detection +│ ├── snapshot.rs # snapshot persistence +│ ├── delta.rs # delta computation +│ ├── policy.rs # policy engine +│ ├── callgraph.rs # fan-in/out, PageRank, betweenness, SCC +│ ├── git.rs # git log, touch cache, ref resolution +│ └── config.rs # config loading +├── hotspots-cli/src/ +│ ├── main.rs +│ └── cmd/ # one file per subcommand +├── tests/ +│ ├── fixtures/ # language-specific test code files +│ └── golden/ # expected JSON outputs (golden tests) +├── integration/ # pytest-based E2E tests +├── action/ # GitHub Action (Node.js) +└── docs/ # documentation (this directory) +``` + +## Adding a language + +Estimated effort: 7–14 days depending on language complexity. + +**Prerequisites:** A tree-sitter parser exists for the target language (check [crates.io](https://crates.io) or [github.com/tree-sitter](https://github.com/tree-sitter)). Read [docs/ARCHITECTURE.md](ARCHITECTURE.md) first. Study an existing implementation — Go (`language/go/`) for a clean simple case, Python (`language/python/`) for complex language features. + +### Step 1 — Dependency and module + +Add to `hotspots-core/Cargo.toml`: +```toml +tree-sitter- = "0.x.y" +``` + +Create `hotspots-core/src/language//mod.rs`, `parser.rs`, `cfg_builder.rs`. 
+ +### Step 2 — Parser + +Implement `LanguageParser` trait in `parser.rs`: +- Parse source with `tree_sitter::Parser` +- Walk the AST to discover all function types (declarations, methods, closures, lambdas) +- Sort by source position (`start_byte`) for determinism +- Handle all function node kinds in the tree-sitter grammar (`tree-sitter parse file.ext --debug` shows node kinds) + +### Step 3 — CFG builder + +Implement `CfgBuilder` trait in `cfg_builder.rs`: +- Build one CFG per function +- Model all control structures: `if`/`else`, all loop types, `switch`, `try`/`catch`/`finally` +- Route early exits (`return`, `throw`, `break`, `continue`) to correct targets +- Create join nodes **lazily** — only when at least one live path reaches them (see Architecture) +- Track loop context stack for `break`/`continue` targets +- Track nesting depth during traversal + +**Critical:** After a terminating statement sets `current_node = None`, subsequent statements must return early. Eager join node creation before confirming live paths causes CFG validation failures. + +### Step 4 — Register + +1. Add language variant to `Language` enum in `language/mod.rs` +2. Map file extensions in `Language::from_extension()` +3. Register parser in `analysis.rs`'s `create_parser()` dispatch +4. Register CFG builder in `cfg_builder.rs`'s `create_cfg_builder()` dispatch +5. Add `FunctionBody` variant if the language uses a different body representation + +### Step 5 — Tests + +Create `tests/fixtures//` with 5–7 test files covering: simple functions, loops, conditionals, early exits, nested control flow, language-specific constructs. + +Generate golden files: +```bash +cargo build --release +./target/release/hotspots analyze tests/fixtures//simple. --format json > tests/golden/-simple.json +``` + +Add unit tests in `hotspots-core/tests/_tests.rs`. 
Verify: +- All function types discovered +- CC/ND/FO/NS match manual calculation +- Golden tests pass (deterministic output) +- No clippy warnings + +**Common pitfalls:** +- Non-deterministic output — always sort by `start_byte` +- Missing function types — check all node kinds in the grammar +- Incorrect CC — debug by printing edge/node counts before `E − N + 2` +- Break/continue routing — maintain a `loop_stack: Vec` with `header` and `exit` nodes + +### Step 6 — Docs and PR + +Update `docs/REFERENCE.md` language support table. Open a PR with all changes including golden files. Update `CHANGELOG.md`. + +## Releases + +Releases are fully automated via CI. **Never manually bump `Cargo.toml` versions** — the release workflow owns version bumps. + +To create a release: +1. Ensure all changes are merged to `main` and CI is green +2. The release workflow triggers on version tags (`v*`) +3. It builds binaries for Linux x86_64, macOS x86_64, macOS ARM64, Windows x86_64 +4. Creates a GitHub release with binaries and generated release notes +5. Updates the `v1` floating tag pointer + +**Rolling back:** delete the release and tag with `gh release delete vX.Y.Z --yes` and `git push origin :refs/tags/vX.Y.Z`. Create a new patch release with the fix. + +We follow [Semantic Versioning](https://semver.org/): MAJOR for breaking API changes, MINOR for new features, PATCH for bug fixes. + +## Reporting bugs and requesting features + +- Bugs: [GitHub Issues](https://github.com/Stephen-Collins-tech/hotspots/issues) — include version (`hotspots --version`), OS, steps to reproduce, expected vs actual +- Features: [GitHub Discussions](https://github.com/Stephen-Collins-tech/hotspots/discussions) — describe the use case, not just the solution + +## Code of conduct + +Be respectful and constructive. Follow GitHub's [Community Guidelines](https://docs.github.com/en/site-policy/github-terms/github-community-guidelines). 
diff --git a/docs/REFERENCE.md b/docs/REFERENCE.md new file mode 100644 index 0000000..19bce1b --- /dev/null +++ b/docs/REFERENCE.md @@ -0,0 +1,492 @@ +# Reference + +## CLI Commands + +### `hotspots analyze ` + +Core analysis command. Scans source files, computes metrics, scores functions. + +``` +hotspots analyze [OPTIONS] +``` + +| Flag | Default | Description | +|---|---|---| +| `--format` | `text` | `text`, `json`, `jsonl`, `html`, `sarif` | +| `--mode` | — | `snapshot`, `delta`, `models` | +| `--top N` | none | Show top N functions by LRS | +| `--min-lrs F` | `0.0` | Filter functions below this LRS | +| `--config PATH` | auto | Path to config file | +| `--output PATH` | `.hotspots/report.html` | Output file (HTML/SARIF) | +| `--explain` | off | Per-function risk breakdown (snapshot+text only) | +| `--explain-patterns` | off | Show pattern trigger conditions | +| `--level` | — | `file` or `module` aggregate view (snapshot+text only) | +| `--policy` | off | Evaluate policies; exit 1 on blocking violations (delta only) | +| `--force` | off | Overwrite existing snapshot | +| `--no-persist` | off | Skip writing snapshot to disk | +| `--per-function-touches` | off | Use `git log -L` for precise touch counts (slow cold start) | +| `--no-per-function-touches` | off | Force file-level touch batching | +| `--skip-touch-metrics` | off | Skip all git log I/O (touch counts reported as 0) | +| `--all-functions` | off | Output flat array instead of triage buckets (snapshot JSON only) | +| `--include-models` | off | Add model risk map to JSON/HTML (snapshot only) | +| `--callgraph-skip-above N` | 50000 | Skip betweenness centrality if call graph > N edges | +| `--skip-gate` | off | Disable suppression gate P@10 check | +| `-j N` / `--jobs N` | CPU count | Parallel worker threads | + +**Notes:** +- `--explain` and `--level` are mutually exclusive +- `--force` and `--no-persist` are mutually exclusive +- Snapshot mode text output requires `--explain` or `--level` +- SARIF 
requires `--mode snapshot`; HTML requires `--mode snapshot` or `--mode delta` +- `--policy` requires `--mode delta` + +### `hotspots diff ` + +Compare snapshots between any two git refs. Both must have existing snapshots. + +``` +hotspots diff [OPTIONS] +``` + +Accepts: branch names, tags, full/short SHAs, `HEAD~N` relative refs. + +| Flag | Description | +|---|---| +| `--format` | `text` (default), `json`, `jsonl`, `html` | +| `--output PATH` | Write output to file | +| `--policy` | Evaluate policies; exit 1 on blocking violations | +| `--top N` | Limit to N changed functions by \|ΔLRS\| | +| `--config PATH` | Config file | +| `--auto-analyze` | Generate missing snapshots via git worktrees | + +Exit codes: 0 = success, 1 = policy failure, 2 = auto-analysis failed, 3 = snapshot missing. + +`--top` applies after policy evaluation — violations outside the top N are still detected. + +### `hotspots train [PATH]` + +Fit a RandomForest ranker from fix-commit history. Model saved to `.hotspots/ranker.json` and auto-loaded by `hotspots analyze`. + +| Flag | Default | Description | +|---|---|---| +| `--blame` | off | Blame-based function-level labels (slower, more precise) | +| `--label-window DAYS` | `365` | Days of history to scan | +| `--n-estimators N` | `200` | Trees in RandomForest | +| `--max-depth N` | `6` | Maximum tree depth | +| `--output PATH` | `.hotspots/ranker.json` | Model output path | +| `--eval` | off | Print Precision@K table after training | +| `--screen` | off | Pre-flight check only; don't fit model | + +Requires: ≥ 50 functions in snapshot, ≥ 5 positive and ≥ 10 negative labels. Fix keywords: `fix:`, `bug`, `patch`, `hotfix`, `regression`, `defect`. + +### `hotspots prune` + +Remove unreachable snapshots (after force-push or branch deletion). + +``` +hotspots prune --unreachable [--older-than DAYS] [--dry-run] +``` + +`--unreachable` is required. Only prunes snapshots unreachable from `refs/heads/*`. 
+ +### `hotspots compact` + +Set compaction level for snapshot storage. + +``` +hotspots compact --level 0 +``` + +Level 0 = full snapshots (current). Levels 1–2 are not yet implemented. + +### `hotspots trends [PATH]` + +Analyze complexity trends across snapshot history. + +``` +hotspots trends . [--window N] [--top K] [--format text|json|html] +``` + +| Flag | Default | Description | +|---|---|---| +| `--window N` | `10` | Number of snapshots to analyze | +| `--top K` | `5` | Top K functions to track | +| `--format` | `json` | Output format | + +Reports: risk velocities (LRS change per snapshot), hotspot stability (consistent top-K), refactor effectiveness (sustained LRS reduction). + +### `hotspots config` + +```bash +hotspots config show # show resolved config (merged defaults + file) +hotspots config show --path FILE # show specific file +hotspots config validate # validate auto-discovered config (exit 1 on failure) +hotspots config validate --path FILE +``` + +### `hotspots init` + +```bash +hotspots init --hooks # print pre-commit and CI hook templates to stdout +``` + +### Global flags + +```bash +hotspots --help +hotspots --version +``` + +### Environment variables + +- `NO_COLOR` — disable ANSI colors in text output +- `GIT_DIR`, `GIT_WORK_TREE` — override git repository location +- `GITHUB_EVENT_NAME=pull_request` — triggers merge-base comparison in delta mode +- `CI_MERGE_REQUEST_IID` (GitLab), `CIRCLE_PULL_REQUEST` (CircleCI), `TRAVIS_PULL_REQUEST` (Travis) — same effect + +### Exit codes + +| Code | Meaning | +|---|---| +| 0 | Success (or warnings only) | +| 1 | Error or blocking policy failure | +| 2 | Auto-analysis failed (`hotspots diff --auto-analyze` only) | +| 3 | Snapshot missing (`hotspots diff` only) | + +--- + +## Metrics + +### The four structural metrics + +**CC — Cyclomatic Complexity** +Number of independent decision paths. Counts: `if`, `else if`, `for`, `while`, `do/while`, `case`, `catch`, `&&`, `||`, ternary. 
A function with no branches has CC 1. + +**ND — Nesting Depth** +Maximum depth of nested control structures (`if`, loops, `try`/`catch`, `switch`). Each additional level degrades readability non-linearly. ND ≥ 5 almost always warrants refactoring. + +**FO — Fan-Out** +Distinct functions called from within this function. Each call segment in a chained expression counts independently (`foo().bar().baz()` = 3). High FO = high external coupling. + +**NS — Non-Structured Exits** +Count of early returns, throws, breaks, and continues (excluding the final tail return). Scattered exits make control flow hard to trace and postconditions hard to reason about. + +**LOC — Lines of Code** +Physical line count. Used for pattern detection only, not the LRS score. + +### LRS formula + +``` +R_cc = min(log2(CC + 1), 6.0) # logarithmic, capped at 6 +R_nd = min(ND, 8.0) # linear, capped at 8 +R_fo = min(log2(FO + 1), 6.0) # logarithmic, capped at 6 +R_ns = min(NS, 6.0) # linear, capped at 6 + +LRS = 1.0×R_cc + 0.8×R_nd + 0.6×R_fo + 0.7×R_ns +``` + +Logarithmic scaling for CC and FO: going from CC 1→4 matters more than CC 40→44. Linear for ND and NS: each additional level contributes uniformly. Caps prevent a single extreme value from dominating. + +**Theoretical range:** 1.0 (trivial) to 20.2 (all four at cap). + +**Weight rationale:** CC (1.0) = primary defect correlate; ND (0.8) = captures complexity CC can miss; NS (0.7) = implicit exit conditions; FO (0.6) = external coupling weighted lower. + +### Risk bands + +| Band | LRS range | Typical action | +|---|---|---| +| Critical | ≥ 9.0 | Refactor now | +| High | 6.0–8.9 | Refactor next time you touch it | +| Moderate | 3.0–5.9 | Monitor; block increases in CI | +| Low | < 3.0 | Leave alone | + +Thresholds are configurable. 
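As a worked example, the LRS formula and the default band thresholds above can be written out directly. This is a sketch under stated assumptions: the `Metrics` struct and the `lrs` and `band` functions are illustrative names, the weights and cut-offs are the documented defaults, and values between 8.9 and 9.0 are assumed to fall in the High band.

```rust
/// Raw structural metrics for one function (hypothetical struct for this example).
struct Metrics {
    cc: u32,
    nd: u32,
    fo: u32,
    ns: u32,
}

/// LRS with the default weights: log-scaled CC and FO, linear ND and NS, each capped.
fn lrs(m: &Metrics) -> f64 {
    let r_cc = (m.cc as f64 + 1.0).log2().min(6.0);
    let r_nd = (m.nd as f64).min(8.0);
    let r_fo = (m.fo as f64 + 1.0).log2().min(6.0);
    let r_ns = (m.ns as f64).min(6.0);
    1.0 * r_cc + 0.8 * r_nd + 0.6 * r_fo + 0.7 * r_ns
}

/// Maps an LRS value to a band using the default thresholds (3.0, 6.0, 9.0).
fn band(score: f64) -> &'static str {
    if score >= 9.0 {
        "critical"
    } else if score >= 6.0 {
        "high"
    } else if score >= 3.0 {
        "moderate"
    } else {
        "low"
    }
}

fn main() {
    let trivial = Metrics { cc: 1, nd: 0, fo: 0, ns: 0 };
    let small = Metrics { cc: 3, nd: 2, fo: 1, ns: 0 };
    let extreme = Metrics { cc: 500, nd: 20, fo: 300, ns: 40 };
    for (name, m) in [("trivial", &trivial), ("small", &small), ("extreme", &extreme)] {
        let score = lrs(m);
        println!("{name}: LRS {score:.2} ({})", band(score));
    }
}
```

The `small` case works out by hand as log2(4) + 0.8×2 + 0.6×log2(2) + 0 = 2.0 + 1.6 + 0.6 = 4.2, which lands in the Moderate band. The `extreme` case hits every cap and gives the theoretical maximum of 20.2.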
+ +### Activity Risk Score (snapshot mode) + +Extends LRS with git history and call graph signals: + +``` +Activity Risk = LRS + + (lines_added + lines_deleted) / 100 × 0.5 # churn + + min(touch_count_30d / 10, 5.0) × 0.3 # touch frequency + + max(0, 5.0 − days_since_change / 7) × 0.2 # recency + + min(fan_in / 5, 10.0) × 0.4 # call graph fan-in + + (scc_size if in cycle else 0) × 0.3 # cyclic dependency + + min(dependency_depth / 3, 5.0) × 0.1 # depth from entrypoints + + neighbor_churn / 500 × 0.2 # churn in callees +``` + +Activity Risk is always ≥ LRS. When no git data is available, Activity Risk = LRS. + +### Call graph metrics (snapshot mode) + +- **Fan-in** — functions that call this function (blast radius) +- **PageRank** — importance/centrality based on call graph topology +- **Betweenness centrality** — fraction of shortest paths that pass through this function (hub detection); exact for graphs < 2000 nodes, approximate (k=256 pivots) for larger +- **SCC size** — strongly connected component size; > 1 = part of a dependency cycle +- **Dependency depth** — longest acyclic path from entrypoints to this function +- **Neighbor churn** — sum of churn in directly-called functions + +### Quadrant assignment + +| | Low activity | High activity | +|---|---|---| +| **High/Critical band** | `debt` | `fire` | +| **Low/Moderate band** | `ok` | `watch` | + +Activity is "high" if: 30-day touch count above population median, OR changed within last 30 days. + +`fire` = live regression risk (refactor now). `debt` = structural debt (schedule proactively). `watch` = monitor. `ok` = no action. 
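The quadrant table above amounts to a two-input lookup. The sketch below is one way to express it, with assumptions labeled: the function names are hypothetical, the median is computed over 30-day touch counts for the whole population, and "changed within last 30 days" is read as `days_since_change <= 30` (the exact boundary is an assumption).

```rust
/// Median of the population's 30-day touch counts (0.0 for an empty population).
fn median(values: &[u32]) -> f64 {
    if values.is_empty() {
        return 0.0;
    }
    let mut sorted = values.to_vec();
    sorted.sort_unstable();
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 0 {
        (sorted[mid - 1] as f64 + sorted[mid] as f64) / 2.0
    } else {
        sorted[mid] as f64
    }
}

/// Activity is "high" if touches exceed the population median OR the function changed recently.
/// The inclusive 30-day boundary is an assumption of this sketch.
fn is_high_activity(touches_30d: u32, days_since_change: u32, median_touches: f64) -> bool {
    (touches_30d as f64) > median_touches || days_since_change <= 30
}

/// Band x activity -> quadrant, following the table.
fn quadrant(band: &str, high_activity: bool) -> &'static str {
    let high_band = matches!(band, "high" | "critical");
    match (high_band, high_activity) {
        (true, true) => "fire",
        (true, false) => "debt",
        (false, true) => "watch",
        (false, false) => "ok",
    }
}

fn main() {
    let population = [0, 0, 2, 12];
    let m = median(&population);
    // Complex function untouched for 200 days: structural debt.
    println!("{}", quadrant("critical", is_high_activity(0, 200, m)));
    // Complex function touched 12 times, last change 3 days ago: live regression risk.
    println!("{}", quadrant("critical", is_high_activity(12, 3, m)));
}
```

Note that the quadrant depends on the band and the activity flag, not on the Activity Risk score itself.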
+ +### Driver labels + +Each function gets a single primary diagnosis, checked in priority order: + +| Label | Condition | Action | +|---|---|---| +| `cyclic_dep` | Part of dependency cycle (SCC > 1) | Break the cycle before adding callers | +| `high_complexity` | CC above P75 | Schedule refactor; extract sub-functions | +| `deep_nesting` | ND above P75 | Flatten with early returns or guard clauses | +| `high_fanout_churning` | FO above P75 AND touches above P50 | Extract interface boundary | +| `high_fanin_complex` | Fan-in above P75 AND CC above P50 | Extract and stabilize; wide blast radius | +| `high_churn_low_cc` | Touches above P75 AND CC below P25 | Add regression tests before next change | +| `composite` | No single dimension clearly dominates | Address the highest dimension first | + +Thresholds are percentile-relative (default P=75, configurable via `driver_threshold_percentile`). `cyclic_dep` is the sole absolute check. + +`driver_detail` (JSON): for `composite` functions, lists up to 3 near-miss dimensions with their percentile rank (e.g. `"cc (P72), nd (P68)"` — notable but below P75 threshold). Omitted when null. + +### Pattern detection + +Patterns are informational labels. A function can have multiple. They do not affect LRS. 
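Because a function can carry several pattern labels at once, pattern checks are independent rather than first-match. The sketch below illustrates this for the Tier 1 triggers in the table that follows, using the documented default thresholds. The `Metrics` struct and `tier1_patterns` function are hypothetical names, not the `patterns.rs` API.

```rust
/// Structural metrics needed for Tier 1 patterns (hypothetical struct for this example).
struct Metrics {
    cc: u32,
    nd: u32,
    fo: u32,
    ns: u32,
    loc: u32,
}

/// Evaluates every Tier 1 trigger independently, so one function can match several.
/// Thresholds are the documented defaults; the real ones are configurable.
fn tier1_patterns(m: &Metrics) -> Vec<&'static str> {
    let mut found = Vec::new();
    if m.cc >= 10 && m.nd >= 4 {
        found.push("complex_branching");
    }
    if m.nd >= 5 {
        found.push("deeply_nested");
    }
    if m.ns >= 5 {
        found.push("exit_heavy");
    }
    if m.loc >= 60 && m.fo >= 10 {
        found.push("god_function");
    }
    if m.loc >= 80 {
        found.push("long_function");
    }
    found
}

fn main() {
    let sprawling = Metrics { cc: 12, nd: 5, fo: 11, ns: 1, loc: 90 };
    println!("{:?}", tier1_patterns(&sprawling));

    let tidy = Metrics { cc: 2, nd: 1, fo: 3, ns: 0, loc: 12 };
    println!("{:?}", tier1_patterns(&tidy));
}
```

None of these labels feed back into LRS; they are computed from the same metrics but reported separately.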
+ +**Tier 1 — structural (all modes):** + +| Pattern | Trigger | +|---|---| +| `complex_branching` | CC ≥ 10 AND ND ≥ 4 | +| `deeply_nested` | ND ≥ 5 | +| `exit_heavy` | NS ≥ 5 | +| `god_function` | LOC ≥ 60 AND FO ≥ 10 | +| `long_function` | LOC ≥ 80 | + +**Tier 2 — enriched (snapshot mode, requires call graph + git data):** + +| Pattern | Trigger | +|---|---| +| `churn_magnet` | churn ≥ 200 lines AND CC ≥ 8 | +| `cyclic_hub` | SCC size ≥ 2 AND fan-in ≥ 6 | +| `hub_function` | fan-in ≥ 10 AND CC ≥ 8 | +| `middle_man` | fan-in ≥ 8 AND FO ≥ 8 AND CC ≤ 4 | +| `neighbor_risk` | neighbor churn ≥ 400 AND FO ≥ 8 | +| `shotgun_target` | fan-in ≥ 8 AND churn ≥ 150 lines | +| `stale_complex` | CC ≥ 10 AND LOC ≥ 60 AND days since change ≥ 180 | +| `volatile_god` | Derived: `god_function` AND `churn_magnet` | + +All thresholds configurable in `.hotspotsrc.json`. Use `--explain-patterns` to see which conditions triggered each pattern. + +--- + +## Configuration + +Config file is auto-discovered from project root in this order: +1. `--config ` CLI flag (explicit override) +2. `.hotspotsrc.json` +3. `hotspots.config.json` +4. `"hotspots"` key in `package.json` + +The project root is determined by walking up from the analyzed path to find `.git`. CLI flags take precedence over config file values. 
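The discovery order above can be sketched as a first-match search from the project root. This is an illustration under assumptions, not the `config.rs` implementation: `ConfigSource`, `find_project_root`, and `discover_config` are hypothetical names, file existence is injected as a closure so the logic can be exercised without touching disk, and the `package.json` key lookup is left to a caller-supplied predicate.

```rust
use std::path::{Path, PathBuf};

/// Where the effective config came from (hypothetical enum for this example).
#[derive(Debug, PartialEq)]
enum ConfigSource {
    Explicit(PathBuf),
    File(PathBuf),
    PackageJsonKey(PathBuf),
    Defaults,
}

/// Walks up from the analyzed path until a directory containing `.git` is found.
fn find_project_root(start: &Path, exists: &dyn Fn(&Path) -> bool) -> Option<PathBuf> {
    start
        .ancestors()
        .find(|dir| exists(dir.join(".git").as_path()))
        .map(Path::to_path_buf)
}

/// Applies the documented precedence: --config, then the two file names, then package.json.
fn discover_config(
    explicit: Option<&Path>,
    root: &Path,
    exists: &dyn Fn(&Path) -> bool,
    has_hotspots_key: &dyn Fn(&Path) -> bool,
) -> ConfigSource {
    if let Some(path) = explicit {
        return ConfigSource::Explicit(path.to_path_buf());
    }
    for name in [".hotspotsrc.json", "hotspots.config.json"] {
        let candidate = root.join(name);
        if exists(candidate.as_path()) {
            return ConfigSource::File(candidate);
        }
    }
    let package_json = root.join("package.json");
    if exists(package_json.as_path()) && has_hotspots_key(package_json.as_path()) {
        return ConfigSource::PackageJsonKey(package_json);
    }
    ConfigSource::Defaults
}

fn main() {
    let on_disk = |p: &Path| p.exists();
    // Placeholder: a real check would parse package.json and look for a "hotspots" key.
    let no_key = |_: &Path| false;
    let cwd = std::env::current_dir().expect("current directory should be readable");
    match find_project_root(&cwd, &on_disk) {
        Some(root) => println!("{:?}", discover_config(None, &root, &on_disk, &no_key)),
        None => println!("no .git directory found above {}", cwd.display()),
    }
}
```

Falling back to built-in defaults when nothing is found is an assumption of this sketch, as is treating an explicit `--config` path as authoritative without checking that the file exists.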
+ +Validate: `hotspots config validate` / Inspect resolved: `hotspots config show` + +### Full schema + +```json +{ + "include": ["src/**/*.ts"], + "exclude": [ + "**/*.test.ts", "**/*.spec.ts", + "**/node_modules/**", "**/__tests__/**", "**/__mocks__/**", + "**/dist/**", "**/build/**", "**/vendor/**", + "**/*.pb.go", "**/zz_generated*.go" + ], + "thresholds": { + "moderate": 3.0, + "high": 6.0, + "critical": 9.0 + }, + "weights": { + "cc": 1.0, + "nd": 0.8, + "fo": 0.6, + "ns": 0.7 + }, + "warning_thresholds": { + "watch_min": 2.5, + "watch_max": 3.0, + "attention_min": 5.5, + "attention_max": 6.0, + "rapid_growth_percent": 50.0 + }, + "min_lrs": 0.0, + "top": null, + "co_change_window_days": 90, + "co_change_min_count": 3, + "driver_threshold_percentile": 75, + "per_function_touches": true +} +``` + +**Validation rules:** +- `moderate < high < critical` (all positive) +- `watch_min < watch_max ≤ moderate < attention_min < attention_max ≤ high` +- All weights non-negative; at least one positive; none > 10.0 +- Unknown fields are rejected (to catch typos) + +**`driver_threshold_percentile`:** default 75 means a function must be in the top 25% of its metric to receive a specific driver label. Lower (50–60) for small/uniform repos; higher (85–90) for large repos with high median complexity. + +**`co_change_window_days`:** days of git history to mine for file co-change pairs. Increase for repos with slow commit cadence. + +**`per_function_touches`:** `true` = use cached `git log -L` per-function counts; `false` = file-level batching always (useful in CI without persistent cache). 
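The validation rules listed above translate into a handful of ordering checks. The following is a minimal sketch with hypothetical struct and function names. It covers only the threshold and weight rules as documented and does not model unknown-field rejection, which would happen during deserialization.

```rust
/// Band thresholds from the `thresholds` block (hypothetical struct).
struct Thresholds {
    moderate: f64,
    high: f64,
    critical: f64,
}

/// Early-warning ranges from the `warning_thresholds` block (hypothetical struct).
struct WarningThresholds {
    watch_min: f64,
    watch_max: f64,
    attention_min: f64,
    attention_max: f64,
}

/// Metric weights from the `weights` block (hypothetical struct).
struct Weights {
    cc: f64,
    nd: f64,
    fo: f64,
    ns: f64,
}

/// Returns the first violated rule as an error message.
fn validate(t: &Thresholds, w: &WarningThresholds, weights: &Weights) -> Result<(), String> {
    if !(t.moderate > 0.0 && t.moderate < t.high && t.high < t.critical) {
        return Err("thresholds must satisfy 0 < moderate < high < critical".to_string());
    }
    let ordered = w.watch_min < w.watch_max
        && w.watch_max <= t.moderate
        && t.moderate < w.attention_min
        && w.attention_min < w.attention_max
        && w.attention_max <= t.high;
    if !ordered {
        return Err("warning thresholds are out of order relative to the bands".to_string());
    }
    let all = [weights.cc, weights.nd, weights.fo, weights.ns];
    if all.iter().any(|&x| x < 0.0 || x > 10.0) {
        return Err("each weight must be between 0.0 and 10.0".to_string());
    }
    if !all.iter().any(|&x| x > 0.0) {
        return Err("at least one weight must be positive".to_string());
    }
    Ok(())
}

fn main() {
    // The defaults from the full schema above.
    let t = Thresholds { moderate: 3.0, high: 6.0, critical: 9.0 };
    let w = WarningThresholds { watch_min: 2.5, watch_max: 3.0, attention_min: 5.5, attention_max: 6.0 };
    let weights = Weights { cc: 1.0, nd: 0.8, fo: 0.6, ns: 0.7 };
    println!("{:?}", validate(&t, &w, &weights));
}
```

The documented defaults pass all three checks, which is a useful sanity test when changing any one of them: raising `moderate` above 5.5, for example, would also require moving `attention_min`.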
+ +--- + +## JSON Schema + +### Schema versions + +| Version | Structure | When | +|---|---|---| +| v4 (default snapshot JSON) | `fire`/`debt`/`watch`/`ok` triage buckets + per-function `action` + `architecture` aggregates | `hotspots analyze --mode snapshot` | +| v2 (full snapshot) | Flat `functions` array + enriched `aggregates` | `--all-functions` | +| v1 (delta) | `deltas` array with before/after | `--mode delta` | + +Always check `schema_version` before consuming output in tooling. + +### Function fields (v2 / `--all-functions`) + +```json +{ + "function_id": "src/api/billing.ts::processPlanUpgrade", + "file": "src/api/billing.ts", + "line": 142, + "language": "TypeScript", + "lrs": 12.4, + "band": "critical", + "quadrant": "fire", + "driver": "high_complexity", + "driver_detail": null, + "metrics": { "cc": 15, "nd": 4, "fo": 8, "ns": 3 }, + "risk": { "r_cc": 4.0, "r_nd": 4.0, "r_fo": 3.0, "r_ns": 3.0 }, + "patterns": ["complex_branching", "churn_magnet"], + "pattern_details": null, + "suppression_reason": null, + "churn": { "lines_added": 156, "lines_deleted": 89, "net_change": 67 }, + "touch_count_30d": 12, + "days_since_last_change": 3, + "activity_risk": 18.5, + "callgraph": { + "fan_in": 8, "fan_out": 8, + "pagerank": 0.0042, "betweenness": 127.3, + "scc_id": 0, "scc_size": 1, "dependency_depth": 5 + } +} +``` + +`pattern_details` is populated only with `--explain-patterns`. `suppression_reason` is omitted (not null) when no suppression is present. 
+ +### Aggregates (`--all-functions`) + +**`aggregates.file_risk`** — per-file ranked by `file_risk_score`: +``` +file_risk_score = max_cc×0.4 + avg_cc×0.3 + log2(fn_count+1)×0.2 + churn_factor×0.1 +``` + +**`aggregates.co_change`** — file pairs that change together in the same commit: +```json +{ + "file_a": "hotspots-cli/src/main.rs", + "file_b": "hotspots-core/src/aggregates.rs", + "co_change_count": 14, + "coupling_ratio": 0.78, + "has_static_dep": false, + "risk": "high" +} +``` +`risk: "expected"` = a static import exists; co-change is explained. + +**`aggregates.modules`** — directory-level instability: +```json +{ + "module": "hotspots-core/src", + "afferent": 8, + "efferent": 3, + "instability": 0.27, + "module_risk": "high" +} +``` +Instability near 0 = everything depends on it (risky to change). Instability near 1 = depends on others (safe to change). + +**`aggregates.models`** / **`architecture.models`** — present with `--include-models`: +```json +{ + "items": [{ + "name": "Snapshot", "file": "...", "line": 219, + "kind": "struct", "score": 52.11, + "critical": 4, "high": 15, "moderate": 17, + "functions": [...] + }], + "links": [{ "source": 0, "target": 2, "shared_functions": 15, "shared_risk": 83.53 }] +} +``` + +### Delta output (v1) + +```json +{ + "schema_version": 1, + "commit": { "sha": "abc123", "parent": "def456" }, + "baseline": false, + "deltas": [{ + "function_id": "src/api/billing.ts::processPlanUpgrade", + "status": "modified", + "before": { "lrs": 11.0, "band": "high", "metrics": { "cc": 13, "nd": 3, "fo": 7, "ns": 2 } }, + "after": { "lrs": 12.4, "band": "critical", "metrics": { "cc": 15, "nd": 4, "fo": 8, "ns": 3 } }, + "delta": { "cc": 2, "nd": 1, "fo": 1, "ns": 1, "lrs": 1.4 }, + "band_transition": { "from": "high", "to": "critical" } + }], + "policy": { + "failed": [{ "id": "critical-introduction", "severity": "blocking", "message": "..." 
}], + "warnings": [] + } +} +``` + +Delta statuses: `new`, `deleted`, `modified`, `unchanged` (unchanged omitted by default). + +--- + +## Supported Languages + +| Language | Extensions | +|---|---| +| TypeScript | `.ts`, `.tsx`, `.mts`, `.cts`, `.mtsx`, `.ctsx` | +| JavaScript | `.js`, `.jsx`, `.mjs`, `.cjs`, `.mjsx`, `.cjsx` | +| Go | `.go` | +| Python | `.py`, `.pyw` | +| Rust | `.rs` | +| Java | `.java` | +| C / C headers | `.c`, `.h` | +| C# | `.cs` | +| Vue | `.vue` | + +All languages have full parity across all metrics and features. + +**JSX note:** `.jsx` and `.tsx` files support JSX syntax. Plain `.js` files also enable JSX parsing (React webpack convention). JSX elements do not add CC; control flow in JSX (`&&`, ternary) does. + +--- + +## Scoring Changelog + +All changes to formulas, weights, thresholds, or ranking rules are tracked in git commit history. The LRS formula and default weights have been stable since v1.0. The trained ranker feature (v3 model, 8 features) was introduced in a later release. Check `CHANGELOG.md` for version-specific details. diff --git a/docs/USAGE.md b/docs/USAGE.md new file mode 100644 index 0000000..82ed449 --- /dev/null +++ b/docs/USAGE.md @@ -0,0 +1,428 @@ +# Usage Guide + +## Basic Analysis + +```bash +hotspots analyze src/ # text output, all functions +hotspots analyze src/ --top 20 # top 20 by LRS +hotspots analyze src/ --min-lrs 5 # only LRS ≥ 5.0 +hotspots analyze src/ --format json +hotspots analyze src/ --format jsonl | grep '"band":"critical"' +``` + +## Snapshot Mode + +Snapshot mode captures a full analysis tied to the current git commit. It enables: +- Delta comparisons (`--mode delta`) +- Trend tracking (`hotspots trends`) +- HTML trend charts +- Trained ranker scoring + +```bash +# Capture current state +hotspots analyze . --mode snapshot + +# Snapshot with detailed per-function explanation +hotspots analyze . 
--mode snapshot --format text --explain --top 10 + +# Snapshot without saving to disk +hotspots analyze . --mode snapshot --no-persist --format json + +# Regenerate an existing snapshot (e.g. after config change) +hotspots analyze . --mode snapshot --force +``` + +Snapshots are stored as `.hotspots/snapshots/.json.zst` and are immutable by default. + +### Higher-level views (snapshot mode only) + +```bash +# File-level risk table (ranked by composite file_risk_score) +hotspots analyze . --mode snapshot --format text --level file + +# Module instability (Robert Martin's metric at directory level) +hotspots analyze . --mode snapshot --format text --level module +``` + +File risk score = `max_cc×0.4 + avg_cc×0.3 + log2(fn_count+1)×0.2 + churn_factor×0.1`. Module instability near 0 = everything depends on it (risky to change); near 1 = safe to change. High-complexity + low-instability modules are the priority targets. + +## Delta Mode + +Delta mode compares the current state against the parent commit snapshot. + +```bash +hotspots analyze . --mode delta --format text +hotspots analyze . --mode delta --policy # exit 1 on blocking violations +hotspots analyze . --mode delta --format json # machine-readable output +``` + +In PR context (GitHub Actions, GitLab CI, CircleCI, Travis), delta mode automatically compares against the merge-base rather than the direct parent. Detection is via environment variables (`GITHUB_EVENT_NAME=pull_request`, `CI_MERGE_REQUEST_IID`, etc.). + +## `hotspots diff` + +Compare snapshots between any two git refs (not just parent → HEAD): + +```bash +hotspots diff main HEAD # compare branch vs main +hotspots diff v1.0.0 v2.0.0 # compare releases +hotspots diff main HEAD --top 10 --policy +hotspots diff main HEAD --format json +``` + +Both refs must have existing snapshots. If one is missing: +```bash +git checkout main && hotspots analyze . --mode snapshot +git checkout my-branch && hotspots analyze . 
--mode snapshot +hotspots diff main HEAD +``` + +`--top N` applies *after* policy evaluation, so violations outside the top N are still detected. + +**Exit codes:** 0 = success, 1 = policy failure, 2 = auto-analysis failed, 3 = snapshot missing. + +## Policy Engine + +The policy engine runs in delta mode (`--mode delta --policy` or `hotspots diff ... --policy`). + +**Blocking (exit code 1):** +- `critical-introduction` — new or existing function crosses LRS ≥ 9.0 +- `excessive-risk-regression` — LRS increases by ≥ 1.0 on a modified function + +**Warnings (exit code 0, informational):** +- `watch-threshold` — function entering watch range (default LRS 2.5–3.0) +- `attention-threshold` — function entering attention range (default LRS 5.5–6.0) +- `rapid-growth` — LRS increase > 50% on any function +- `suppression-missing-reason` — `// hotspots-ignore:` with no reason text +- `net-repo-regression` — total LRS increased across all changes (any positive delta) + +Configure thresholds in `.hotspotsrc.json`: +```json +{ + "warning_thresholds": { + "watch_min": 2.5, + "watch_max": 3.0, + "attention_min": 5.5, + "attention_max": 6.0, + "rapid_growth_percent": 50.0 + } +} +``` + +## CI/CD Setup + +### GitHub Action (recommended) + +```yaml +name: Hotspots +on: + pull_request: + push: + branches: [main] + +jobs: + analyze: + runs-on: ubuntu-latest + permissions: + contents: read + pull-requests: write + steps: + - uses: actions/checkout@v4 + with: + fetch-depth: 0 # required for git history + - uses: Stephen-Collins-tech/hotspots-action@v1 + with: + github-token: ${{ secrets.GITHUB_TOKEN }} +``` + +The action posts PR comments, generates HTML reports, and fails on blocking policy violations. 
+ +**Action inputs:** + +| Input | Default | Description | +|---|---|---| +| `github-token` | — | Required for PR comments | +| `path` | `.` | Path to analyze | +| `policy` | `critical-introduction` | `critical-introduction`, `strict`, `moderate`, `custom` | +| `fail-on` | `error` | `error`, `warn`, `never` | +| `config` | auto-discover | Path to config file | +| `version` | `latest` | Pin a specific version | +| `post-comment` | `true` | Post PR comment | + +**Action outputs:** `violations` (JSON), `passed` (bool), `summary` (markdown), `report-path`, `json-output`. + +**Monorepo:** +```yaml +jobs: + frontend: + steps: + - uses: actions/checkout@v4 + with: { fetch-depth: 0 } + - uses: Stephen-Collins-tech/hotspots-action@v1 + with: + path: packages/frontend + github-token: ${{ secrets.GITHUB_TOKEN }} + backend: + steps: + - uses: actions/checkout@v4 + with: { fetch-depth: 0 } + - uses: Stephen-Collins-tech/hotspots-action@v1 + with: + path: packages/backend + github-token: ${{ secrets.GITHUB_TOKEN }} +``` + +**Upload HTML report as artifact:** +```yaml +- uses: Stephen-Collins-tech/hotspots-action@v1 + id: hotspots + with: + github-token: ${{ secrets.GITHUB_TOKEN }} +- uses: actions/upload-artifact@v4 + if: always() + with: + name: hotspots-report-${{ github.sha }} + path: ${{ steps.hotspots.outputs.report-path }} + retention-days: 30 +``` + +### Manual CI (GitLab, CircleCI, Jenkins, etc.) 
+ +```bash +# Fail CI on blocking violations +hotspots analyze src/ --mode delta --policy +``` + +For GitLab CI: +```yaml +hotspots: + stage: analyze + image: rust:latest + before_script: + - cargo install hotspots-cli + script: + - hotspots analyze src/ --mode delta --policy + artifacts: + paths: [.hotspots/report.html] + expire_in: 1 week + rules: + - if: '$CI_PIPELINE_SOURCE == "merge_request_event"' +``` + +**Troubleshooting:** +- `"failed to extract git context"` — use `fetch-depth: 0` in checkout +- `"merge-base not found"` — fetch the base branch explicitly: `git fetch origin $BASE_BRANCH` +- PR comments not posting — ensure `pull-requests: write` permission and `github-token` is set + +## Output Formats + +### Text + +```bash +hotspots analyze src/ --format text # basic table +hotspots analyze . --mode snapshot --format text --explain # with per-function detail +hotspots analyze . --mode snapshot --format text --level file +``` + +Color-coded by risk band (critical=red, high=yellow, moderate=blue, low=green). Disable: `NO_COLOR=1 hotspots analyze ...`. + +The `--explain` view adds driver label, recommended action, and near-miss dimensions for `composite`-labeled functions. A co-change coupling section at the bottom shows file pairs that frequently change together in the same commit. + +### JSON + +```bash +hotspots analyze src/ --format json +hotspots analyze . --mode snapshot --format json --all-functions # full flat array (schema v2) +hotspots analyze . --mode snapshot --format json --include-models # add model risk map +``` + +Default snapshot JSON uses schema v4 (triage-first structure: `fire`/`debt`/`watch`/`ok` buckets). Use `--all-functions` for the flat `functions` array (schema v2). Always check `schema_version` in tooling. 
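
One way tooling might act on `schema_version` is to map it to the expected payload shape before touching any other field. The sketch below assumes the version has already been read from the parsed JSON as an integer; the enum and function names are hypothetical and not part of Hotspots.

```rust
/// Payload shapes described in this guide (names are illustrative only).
#[derive(Debug, PartialEq)]
enum OutputShape {
    /// v1: `deltas` array with before/after states (`--mode delta`).
    Delta,
    /// v2: flat `functions` array plus `aggregates` (`--all-functions`).
    FlatFunctions,
    /// v4: `fire`/`debt`/`watch`/`ok` triage buckets (default snapshot JSON).
    TriageBuckets,
}

/// Maps a `schema_version` value to the shape a consumer should expect.
/// Unknown versions are rejected rather than guessed at.
fn shape_for_schema_version(version: u64) -> Result<OutputShape, String> {
    match version {
        1 => Ok(OutputShape::Delta),
        2 => Ok(OutputShape::FlatFunctions),
        4 => Ok(OutputShape::TriageBuckets),
        other => Err(format!("unsupported schema_version: {}", other)),
    }
}
```

Failing loudly on an unknown version keeps a consumer written against `.functions[]` from silently reading nothing when it is handed the bucketed v4 output.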
+ +Useful `jq` patterns: +```bash +# Critical functions only +jq '.functions[] | select(.band == "critical")' output.json + +# Count by band +jq '.aggregates.by_band' output.json + +# Top 10 by LRS +jq '.functions | sort_by(.lrs) | reverse | .[0:10]' output.json + +# Functions with a specific pattern +jq '.functions[] | select(.patterns[]? == "god_function") | .function_id' output.json +``` + +### JSONL (streaming) + +One JSON object per line — ideal for pipelines and large repos: + +```bash +hotspots analyze src/ --format jsonl | grep '"band":"critical"' +hotspots analyze src/ --format jsonl | jq -c 'select(.lrs > 9)' +``` + +### HTML + +Interactive self-contained report with sortable table, risk landscape scatter plot, pattern breakdown panel, and trend charts (requires ≥ 2 snapshots): + +```bash +hotspots analyze . --mode snapshot --format html +open .hotspots/report.html # macOS +``` + +### SARIF (GitHub Code Scanning) + +```bash +hotspots analyze . --mode snapshot --format sarif --output .hotspots/results.sarif +``` + +Requires `--mode snapshot`. Maps bands to SARIF levels: critical→error, high→warning, moderate→note. Integrate with GitHub code scanning: + +```yaml +- name: Run Hotspots + run: hotspots analyze . --mode snapshot --format sarif --output .hotspots/results.sarif +- uses: github/codeql-action/upload-sarif@v3 + with: + sarif_file: .hotspots/results.sarif +``` + +## Suppression Comments + +Suppress CI policy failures while keeping the function visible in reports: + +```typescript +// hotspots-ignore: legacy payment processor, rewrite scheduled Q2 2026 +function legacyBillingLogic() { ... 
} +``` + +Rules: +- Comment must be on the line **immediately before** the function (no blank line between) +- Format: `// hotspots-ignore: ` +- Reason is required (missing reason triggers a warning, not a hard failure) +- Suppressed functions still appear in all reports with a `suppression_reason` field +- Suppressed functions still count toward net repo regression + +Good reasons: complex algorithm with test coverage, generated code, migration pending with date. Bad reasons: "TODO fix this later", no reason at all. + +## Touch Metrics + +Touch metrics measure how often functions change in git history. + +```bash +# Default: hybrid mode (file-level for most, per-function for active files) +hotspots analyze . --mode snapshot + +# Full per-function precision (slower cold start, cached after first run) +hotspots analyze . --mode snapshot --per-function-touches + +# File-level only (fastest, disables per-function cache) +hotspots analyze . --mode snapshot --no-per-function-touches + +# Skip all git I/O (for very large repos, 50k+ functions) +hotspots analyze . --mode snapshot --skip-touch-metrics +``` + +Per-function touch results are cached in `.hotspots/touch-cache.json.zst`. First run on a new commit is slow; subsequent runs are fast. + +Configure in `.hotspotsrc.json`: +```json +{ + "per_function_touches": false, + "hybrid_touch_threshold": 5 +} +``` + +## Snapshot Management + +```bash +# Prune snapshots for deleted/force-pushed branches +hotspots prune --unreachable --dry-run +hotspots prune --unreachable --older-than 30 + +# Compact snapshot storage +hotspots compact --level 0 + +# Analyze trends across snapshot history +hotspots trends . +hotspots trends . --window 20 --top 10 --format text +``` + +`hotspots trends` reports risk velocities (LRS change per snapshot), hotspot stability (consistent top-K presence), and refactor effectiveness (sustained LRS reduction). + +## Training a Repo-Specific Ranker + +By default, hotspots ranks by LRS. 
Training fits a RandomForest from your repo's bug-fix history to re-rank based on which structural features actually predict bugs in *your* codebase. + +```bash +# File-level labels (fast) +hotspots train . + +# Blame-based function-level labels (more precise, slower) +hotspots train . --blame + +# Train and immediately check whether the model is better than base LRS +hotspots train . --blame --eval +``` + +Example `--eval` output: +``` +P@K evaluation (365-day fix-label window): + K P@K base_rate + 10 0.400 0.084 + 20 0.300 0.084 +``` + +If `P@K` is well above `base_rate`, apply the model. If `P@K ≈ base_rate`, skip it — the default LRS ranking is just as good. + +Once `.hotspots/ranker.json` exists, every `hotspots analyze` loads it automatically and re-scores triage quadrants. + +Training requires: ≥ 50 functions in snapshot, ≥ 5 positive and ≥ 10 negative labels from fix commits. Fix keywords: `fix:`, `bug`, `patch`, `hotfix`, `regression`, `defect`. + +Training is most valuable on repos with 1+ year of history, recognizable fix-commit conventions, and bug clusters in specific files. Use `--screen` to check suitability before fitting: + +```bash +hotspots train . --screen # check only, no model written +``` + +## AI Integration + +```bash +# Agent-optimized snapshot JSON (triage buckets + action text) +hotspots analyze . --mode snapshot --format json + +# Flat array for tooling that expects a complete function list +hotspots analyze . --mode snapshot --format json --all-functions + +# Delta for PR review context +hotspots analyze . --mode delta --format json + +# Pipe critical functions to Claude/Cursor/Copilot +hotspots analyze src/ --format json | jq '.functions[] | select(.lrs > 9)' +``` + +## Hook Templates + +```bash +# Print pre-commit and CI hook templates +hotspots init --hooks +``` + +Seed a baseline snapshot first: +```bash +hotspots analyze . 
--mode snapshot +hotspots init --hooks # copy the hook template, install it +``` + +## Troubleshooting + +**`"snapshot already exists and differs"`** — regenerate with `--force`. + +**`"no parent snapshot found"` in delta mode** — run `hotspots analyze . --mode snapshot` on the parent commit first. + +**`"failed to extract git context"`** — must be run inside a git repository. + +**Snapshot mode text output requires `--explain` or `--level`** — text format in snapshot mode without one of these flags is an error. + +**`--no-persist` and `--force` are mutually exclusive** — pick one. + +**`--level` and `--explain` are mutually exclusive** — pick one. diff --git a/docs/architecture/ARCHITECTURE.md b/docs/architecture/ARCHITECTURE.md deleted file mode 100644 index a3f09f3..0000000 --- a/docs/architecture/ARCHITECTURE.md +++ /dev/null @@ -1,988 +0,0 @@ -# Hotspots Architecture - -**Version:** 1.0 -**Last Updated:** 2026-02-15 -**Status:** Current - ---- - -## Overview - -Hotspots is a multi-language static analysis tool that identifies high-risk functions by combining code complexity metrics with git activity data. It analyzes TypeScript, JavaScript, Go, Java, Python, and Rust codebases to produce prioritized risk scores that help teams focus refactoring efforts on code that's both complex and frequently changed. - -### Core Value Proposition - -Traditional complexity tools only measure code structure. Hotspots combines: -- **Static metrics** (cyclomatic complexity, nesting depth, fan-out, non-structured exits) -- **Git activity** (churn, touch count, recency) -- **Call graph analysis** (fan-in, PageRank, strongly connected components, dependency depth) - -This produces an **Activity-Weighted Risk Score** that identifies the 20% of functions causing 80% of production issues. 
- ---- - -## Core Principles & Invariants - -Hotspots enforces strict invariants to ensure deterministic, reproducible analysis: - -### Determinism -- **Identical input yields byte-for-byte identical output** — same source code produces identical results across runs -- **Deterministic traversal order** — functions sorted by `(file_index, span.start)` before analysis -- **No randomness** — no use of random number generators, hash map iteration order, or non-deterministic algorithms -- **No clocks** — analysis results don't depend on wall-clock time (uses commit timestamps) - -### Per-Function Isolation -- **One function = one analysis** — each function analyzed independently -- **No cross-function edges in CFG** — control flow graphs are per-function only -- **No global mutable state** — all analysis is pure functional transformations - -### Formatting Independence -- **Whitespace/formatting don't affect metrics** — code style changes don't change risk scores -- **Comments ignored** — comments don't affect complexity calculations -- **AST-based analysis** — metrics derived from abstract syntax trees, not text patterns - -### Immutability -- **Snapshots are immutable** — once written, snapshots are never modified (identified by commit SHA) -- **Atomic writes** — snapshot persistence uses temp-file-plus-rename to prevent corruption -- **Schema versioning** — snapshots carry version numbers for forward compatibility - ---- - -## High-Level Architecture - -Hotspots follows a **pipeline architecture** with clear separation of concerns: - -``` -┌─────────────────────────────────────────────────────────────────┐ -│ CLI Entry Point │ -│ (hotspots-cli/src/main.rs) │ -└────────────────────────────┬────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────────┐ -│ Configuration Loading │ -│ (.hotspotsrc.json, hotspots.config.json, package.json) │ -└────────────────────────────┬────────────────────────────────────┘ - │ - ▼ 
-┌─────────────────────────────────────────────────────────────────┐ -│ File Discovery │ -│ Recursive traversal, include/exclude filtering, sorting │ -└────────────────────────────┬────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────────┐ -│ Per-File Analysis Pipeline │ -│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ -│ │ Parse │→ │ Discover │→ │ Build │→ │ Extract │ │ -│ │ │ │ Functions │ │ CFG │ │ Metrics │ │ -│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │ -│ │ │ │ │ │ -│ └──────────────┴──────────────┴──────────────┘ │ -│ │ │ -│ ▼ │ -│ ┌──────────────┐ │ -│ │ Calculate │ │ -│ │ LRS & Band │ │ -│ └──────────────┘ │ -└────────────────────────────┬────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────────┐ -│ Snapshot Enrichment │ -│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ -│ │ Git │→ │ Churn │→ │ Touch │→ │ Call │ │ -│ │ Context │ │ Metrics │ │ Metrics │ │ Graph │ │ -│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │ -│ │ │ │ │ │ -│ └──────────────┴──────────────┴──────────────┘ │ -│ │ │ -│ ▼ │ -│ ┌──────────────────────────┐ │ -│ │ Activity Risk Scoring │ │ -│ │ (LRS + activity + graph) │ │ -│ └──────────────────────────┘ │ -└────────────────────────────┬────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────────┐ -│ Output Generation │ -│ Snapshot Mode: Persist + JSON/HTML/JSONL/Text │ -│ Delta Mode: Compare vs parent + Policy evaluation │ -└─────────────────────────────────────────────────────────────────┘ -``` - ---- - -## Component Architecture - -### 1. Language Abstraction Layer (`hotspots-core/src/language/`) - -Hotspots supports 6 languages through a unified abstraction: - -#### Language Detection -- **File extension mapping** — `.ts` → TypeScript, `.go` → Go, `.py` → Python, etc. 
-- **Language enum** — `TypeScript`, `TypeScriptReact`, `JavaScript`, `JavaScriptReact`, `Go`, `Java`, `Python`, `Rust` - -#### Parser Trait (`LanguageParser`) -```rust -trait LanguageParser { - fn parse(&self, source: &str, filename: &str) -> Result>; -} -``` - -Each language implements: -- **ECMAScript** (TS/JS) — Uses SWC parser (same as TypeScript compiler) -- **Go** — Uses tree-sitter-go -- **Java** — Uses tree-sitter-java -- **Python** — Uses tree-sitter-python -- **Rust** — Uses syn (same parser as rustc) - -#### ParsedModule Trait -```rust -trait ParsedModule { - fn discover_functions(&self, file_index: usize, source: &str) -> Vec; -} -``` - -Returns language-agnostic `FunctionNode` objects with: -- `FunctionId` (file_index, local_index) -- Function name (or None for anonymous) -- `SourceSpan` (start/end byte, line, column) -- `FunctionBody` (language-specific AST representation) -- Suppression reason (if `// hotspots-ignore: reason` comment present) - -#### CFG Builder Trait (`CfgBuilder`) -```rust -trait CfgBuilder { - fn build(&self, function: &FunctionNode) -> Cfg; -} -``` - -Each language builds a control flow graph from its AST: -- **ECMAScript** — Visits SWC AST, builds CFG nodes for if/switch/loop/break/continue -- **Go/Java/Python** — Re-parse with tree-sitter, traverse AST to build CFG -- **Rust** — Uses syn AST, handles match/if/loop/break/continue - -#### FunctionBody Enum -Wraps language-specific AST representations: -- `ECMAScript(BlockStmt)` — SWC block statement -- `Go { body_node, source }` — Tree-sitter node ID + source -- `Java { body_node, source }` — Tree-sitter node ID + source -- `Python { body_node, source }` — Tree-sitter node ID + source -- `Rust { source }` — Full function source (re-parsed on demand) - -**Why this design?** -- Tree-sitter nodes are tied to tree lifetime → store source + node ID, re-parse when needed -- SWC and syn provide owned ASTs → can store directly -- Unified interface allows language-agnostic metric 
extraction - ---- - -### 2. Analysis Pipeline (`hotspots-core/src/analysis.rs`) - -The per-file analysis pipeline: - -1. **Read source file** — `std::fs::read_to_string()` -2. **Detect language** — `Language::from_path(path)` -3. **Get parser** — Match language to parser implementation -4. **Parse** — `parser.parse(&src, filename)` → `ParsedModule` -5. **Discover functions** — `module.discover_functions(file_index, &src)` → `Vec` -6. **For each function:** - - **Build CFG** — `get_builder_for_function(function).build(function)` → `Cfg` - - **Validate CFG** — Ensure entry/exit nodes, no cycles, reachability - - **Extract metrics** — `metrics::extract_metrics(function, &cfg)` → `RawMetrics` - - **Calculate risk** — `risk::analyze_risk_with_config(&metrics, weights, thresholds)` → `(RiskComponents, LRS, RiskBand)` - - **Create report** — `FunctionRiskReport::new(...)` → `FunctionRiskReport` - -**Output:** `Vec` per file, aggregated across all files - ---- - -### 3. Metrics Extraction (`hotspots-core/src/metrics.rs`) - -Extracts 5 core metrics from AST + CFG: - -#### Cyclomatic Complexity (CC) -- **Formula:** `CC = E - N + 2` (edges - nodes + 2) -- **Computed from:** CFG structure (number of decision points) -- **Language-specific extras:** - - Go: Switch/select cases, boolean operators (`&&`, `||`) - - Java: Switch cases, ternary operators, boolean operators - - Python: Match cases, boolean operators - - ECMAScript/Rust: CFG-based only - -#### Nesting Depth (ND) -- **Definition:** Maximum depth of nested control structures (if/loop/switch/try) -- **Computed from:** AST traversal, tracking depth on entry/exit of control nodes -- **ECMAScript:** Visitor pattern with macro-generated visit methods -- **Tree-sitter languages:** Recursive traversal, depth tracking - -#### Fan-Out (FO) -- **Definition:** Number of unique functions called by this function -- **Computed from:** AST traversal, extracting callee names from call expressions -- **ECMAScript:** `FanOutVisitor` 
walks `CallExpr`, extracts identifier/member names -- **Go/Java/Python:** Tree-sitter traversal, extracts `call_expression` → `identifier`/`selector_expression` -- **Rust:** Syn AST traversal, extracts function/method/macro calls -- **Stored in:** `RawMetrics.callee_names` (used later for call graph) - -#### Non-Structured Exits (NS) -- **Definition:** Number of early returns, exceptions, panics, or other non-structured exits -- **Computed from:** AST traversal, counting: - - `return` statements (excluding final tail return) - - `throw` / `raise` / `panic()` calls - - `defer` statements (Go) - - `?` operator (Rust) - - `unwrap()` / `expect()` calls (Rust) - -#### Lines of Code (LOC) -- **Definition:** Physical lines from function start to end (inclusive) -- **Computed from:** `SourceSpan.end_line - SourceSpan.start_line + 1` -- **Includes:** Blank lines and comments within function body -- **Excludes:** Function signature if on separate line (language-dependent) - ---- - -### 4. Risk Scoring (`hotspots-core/src/risk.rs`) - -Transforms raw metrics into risk scores: - -#### Risk Component Transforms -- **R_cc** = `min(log2(CC + 1), 6.0)` — Logarithmic scaling caps at 6 -- **R_nd** = `min(ND, 8.0)` — Linear, capped at 8 -- **R_fo** = `min(log2(FO + 1), 6.0)` — Logarithmic scaling caps at 6 -- **R_ns** = `min(NS, 6.0)` — Linear, capped at 6 - -#### Local Risk Score (LRS) -Weighted sum of risk components: -``` -LRS = w_cc * R_cc + w_nd * R_nd + w_fo * R_fo + w_ns * R_ns -``` - -**Default weights:** -- `w_cc = 1.0` (cyclomatic complexity) -- `w_nd = 0.8` (nesting depth) -- `w_fo = 0.6` (fan-out) -- `w_ns = 0.7` (non-structured exits) - -**Configurable:** Weights and thresholds can be overridden via config file or CLI flags. - -#### Risk Bands -- **Low:** LRS < 3.0 -- **Moderate:** 3.0 ≤ LRS < 6.0 -- **High:** 6.0 ≤ LRS < 9.0 -- **Critical:** LRS ≥ 9.0 - -**Configurable:** Thresholds can be customized per project. - ---- - -### 5. 
Call Graph Analysis (`hotspots-core/src/callgraph.rs`) - -Builds a directed graph of function calls and computes graph metrics: - -#### Graph Construction -1. **Add nodes** — All functions become graph nodes (ID: `file::function`) -2. **Add edges** — For each function, add edges to all callees in `RawMetrics.callee_names` -3. **Resolve calls** — Match callee names to function IDs: - - Prefer same-file matches - - Fall back to first match if no same-file match - - Handle name collisions (multiple functions with same name) - -> **Known limitation — best-effort static approximation:** Name-based resolution works well -> for direct function calls but cannot resolve interface dispatch, virtual methods, higher-order -> functions, or closures. In Go/Java/Python codebases that use idiomatic interface patterns, -> a meaningful fraction of call edges may be unresolved or resolved to the wrong target. All -> graph metrics derived from the call graph (PageRank, betweenness, fan-in, neighbor churn) -> are proportionally affected. 
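
The name-based resolution rule described above (prefer a same-file match, otherwise take the first match) can be sketched as follows. This is an illustration under stated assumptions, not the code in `callgraph.rs`: `FunctionRef` and `resolve_callee` are hypothetical names, and "first match" here simply means first in the slice order passed in.

```rust
/// Hypothetical minimal view of a discovered function.
struct FunctionRef {
    file: String,
    name: String,
}

impl FunctionRef {
    /// Graph node ID in the `file::function` form used by the call graph.
    fn id(&self) -> String {
        format!("{}::{}", self.file, self.name)
    }
}

/// Resolves a callee name seen in `caller_file` to a known function.
/// A same-file candidate wins; otherwise the first candidate in slice
/// order is used, so callers should pass functions in a deterministic order.
fn resolve_callee<'a>(
    caller_file: &str,
    callee_name: &str,
    functions: &'a [FunctionRef],
) -> Option<&'a FunctionRef> {
    let mut first_match = None;
    for candidate in functions.iter().filter(|f| f.name == callee_name) {
        if candidate.file == caller_file {
            return Some(candidate); // same-file match takes priority
        }
        if first_match.is_none() {
            first_match = Some(candidate); // remember the fallback
        }
    }
    first_match
}
```

The fallback branch is where the limitation noted above shows up: when two files define a function with the same name and the caller lives in a third file, the edge goes to whichever candidate comes first, which may be the wrong target.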
- -#### Graph Metrics - -**Fan-In** — Number of functions calling this function -- Higher fan-in = more dependents = higher change risk - -**Fan-Out** — Number of functions this function calls -- Already computed during metric extraction (reused here) - -**PageRank** — Importance/centrality score -- Iterative algorithm (20-50 iterations, damping factor 0.85) -- Functions called by important functions get higher scores -- Identifies architectural hubs - -**Betweenness Centrality** — Criticality on shortest paths -- Counts how many shortest paths between other functions pass through this one -- High betweenness = architectural bottleneck - -**Strongly Connected Components (SCC)** — Tarjan's algorithm -- Detects cyclic dependencies (functions that call each other) -- `scc_id` and `scc_size` identify cycles -- Functions in larger cycles are riskier - -**Dependency Depth** — Shortest path from entry points -- Entry points: `main`, exported functions, HTTP handlers (heuristic) -- BFS from all entry points computes shortest depth -- Deeper functions = longer dependency chain = more fragile - -**Neighbor Churn** — Sum of churn in all callees -- Indirect change risk (dependencies are changing) -- High neighbor churn = function is affected by volatile dependencies - ---- - -### 6. 
Activity-Weighted Risk Scoring (`hotspots-core/src/scoring.rs`) - -Combines LRS with activity and graph metrics: - -#### Formula -``` -activity_risk = LRS * 1.0 - + churn_factor * 0.5 - + touch_factor * 0.3 - + recency_factor * 0.2 - + fan_in_factor * 0.4 - + scc_penalty * 0.3 - + depth_penalty * 0.1 - + neighbor_churn_factor * 0.2 -``` - -#### Factor Calculations -- **churn_factor** = `(lines_added + lines_deleted) / 100` -- **touch_factor** = `min(touch_count_30d / 10, 5.0)` -- **recency_factor** = `max(0, 5.0 - days_since_last_change / 7)` -- **fan_in_factor** = `min(fan_in / 5, 10.0)` -- **scc_penalty** = `if scc_size > 1 { scc_size } else { 0 }` -- **depth_penalty** = `min(dependency_depth / 3, 5.0)` -- **neighbor_churn_factor** = `neighbor_churn / 500` - -**Configurable:** All weights can be customized via config file. - -> **Known limitation — fan-in/fan-out correlation:** Fan-out enters LRS via `R_fo`, and LRS -> feeds directly into `activity_risk`. Fan-in is added separately via `fan_in_factor`. Because -> fan-in and fan-out are positively correlated in most codebases (hub functions that call many -> things tend to also be called by many things), highly connected functions are penalized more -> than either metric implies in isolation. This is intentional but worth being aware of when -> interpreting scores for architectural hub functions. - ---- - -### 7. 
Snapshot System (`hotspots-core/src/snapshot.rs`) - -Immutable, commit-scoped snapshots of analysis results: - -#### Snapshot Structure -```rust -struct Snapshot { - schema_version: u32, // v2 (current) - commit: CommitInfo, // SHA, parents, timestamp, branch, message, author - analysis: AnalysisInfo, // Scope, tool version - functions: Vec, // Per-function metrics + risk scores - summary: Option, // Repo-level statistics (computed on output) - aggregates: Option, // File risk, co-change, module instability (computed on output) -} -``` - -#### FunctionSnapshot -Contains: -- **Static metrics:** CC, ND, FO, NS, LOC, LRS, band -- **Git activity:** Churn (lines added/deleted), touch_count_30d, days_since_last_change -- **Call graph:** Fan-in, fan-out, PageRank, betweenness, SCC, dependency_depth, neighbor_churn -- **Activity risk:** Unified risk score + risk factor breakdown -- **Percentiles:** `is_top_10_pct`, `is_top_5_pct`, `is_top_1_pct` flags - -#### SnapshotEnricher (Builder Pattern) -Enrichment pipeline with explicit ordering: -```rust -SnapshotEnricher::new(snapshot) - .with_churn(&churn_map) - .with_touch_metrics(repo_root) - .with_callgraph(&call_graph) - .enrich(&scoring_weights) - .build() -``` - -**Why builder pattern?** -- Makes enrichment order explicit and testable -- `Snapshot` remains a pure data container -- Prevents accidental mutation or wrong ordering - -#### Persistence -- **Location:** `.hotspots/snapshots/.json` -- **Index:** `.hotspots/index.json` — tracks all snapshots, commit order, compaction level -- **Atomic writes:** Temp file + rename to prevent corruption -- **Immutable:** Snapshots never overwritten (commit SHA is identity; use `--force` to override) - -#### Schema Versioning -- **Current version:** `SNAPSHOT_SCHEMA_VERSION = 2` -- **Minimum supported:** `SNAPSHOT_SCHEMA_MIN_VERSION = 1` (v1 snapshots load with missing fields defaulting to `None`) -- **Version range check:** `Snapshot::from_json()` explicitly rejects snapshots outside 
the supported range with a clear error: `"unsupported schema version: got X, supported range 1-2"` -- **Versioning contract:** Additive field changes increment the schema version and remain backward-compatible. Structural/breaking changes require a new minimum version and a migration guide. - ---- - -### 8. Delta System (`hotspots-core/src/delta.rs`) - -Compares current snapshot vs parent to identify changes: - -#### Delta Structure -```rust -struct Delta { - schema_version: u32, - commit: DeltaCommitInfo, // Current SHA, parent SHA - baseline: bool, // true if no parent - deltas: Vec, - policy: Option, // Policy evaluation results - aggregates: Option, -} -``` - -#### FunctionDeltaEntry -Tracks function changes: -- **Status:** `New`, `Deleted`, `Modified`, `Unchanged` -- **Before/After:** Function state (metrics, LRS, band) before and after -- **Delta:** Numeric changes (ΔCC, ΔND, ΔFO, ΔNS, ΔLRS) -- **Band transition:** e.g., `low` → `high` (risk band changed) -- **Suppression reason:** If function is ignored - -#### Matching Logic -- **By function_id:** `file::function` (not by file path or line number) -- **File moves:** Treated as delete + add (function_id changes) -- **Line changes:** Don't affect matching (function_id unchanged) - -> **Known limitation — function_id stability:** `function_id` depends on both the file path -> and the function name. File renames, directory moves, and function renames all generate a -> delete+add pair in delta output, losing continuity. This is especially noticeable in -> refactoring commits. Suppression annotations also cannot follow a renamed function since the -> ID changes. A content-hash or signature-based matching fallback is a planned improvement. - -#### Baseline Handling -If no parent snapshot exists: -- All functions marked as `New` -- `baseline = true` -- No `before` state or deltas - ---- - -### 9. 
Git Integration (`hotspots-core/src/git.rs`) - -Extracts git metadata and activity metrics: - -#### GitContext -- **Commit info:** SHA, parent SHAs, timestamp, branch, message, author -- **Event detection:** `is_fix_commit`, `is_revert_commit` (heuristic-based) -- **Ticket IDs:** Extracted from commit message and branch name (JIRA-123, #456, etc.) - -#### Churn Metrics -- **Extraction:** `git show --numstat ` → lines added/deleted per file -- **Mapping:** File-level churn mapped to functions by file path -- **Format:** `\t\t` (binary files skipped) - -#### Touch Metrics -- **Touch count:** `git log --since="30 days ago" --oneline -- ` → commit count -- **Days since last change:** Time from commit timestamp to last file modification -- **Computed at:** File level (all functions in file share same touch metrics) - -> **Known limitation — file-level granularity:** Touch count and days-since-last-change are -> computed once per file and applied uniformly to every function in that file. A file with 50 -> functions where only one was recently touched will report the same `touch_count_30d` for all -> 50. This means `touch_factor` in the activity risk score is a file-level approximation, not -> a per-function signal. Large files with many functions of varying activity levels will have -> noisier touch scores. Function-level touch metrics via `git log -L` are a planned improvement. - -#### PR Context Detection -- **Mechanism:** CI environment variables only (`GITHUB_BASE_REF`, `CI`, `PULL_REQUEST`, etc.) -- **Behavior in CI:** PR commits are detected and snapshot persistence is suppressed -- **Behavior locally:** Running `--mode snapshot` on any branch (including feature branches) - is treated as mainline — a snapshot will be persisted. There is no local branch detection. 
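The churn extraction described under Churn Metrics above depends on the `git show --numstat` line format: added count, deleted count, and path, separated by tabs, with `-` printed for both counts on binary files. A minimal parsing sketch follows. `FileChurn`, `parse_numstat_line`, and `parse_numstat` are hypothetical names for illustration, not the actual `git.rs` API, and rename notation in the path column is not handled.

```rust
/// Per-file churn parsed from one numstat line (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
struct FileChurn {
    path: String,
    lines_added: u32,
    lines_deleted: u32,
}

/// Parses one line of the form `added<TAB>deleted<TAB>path`.
/// Returns `None` for binary files (git prints `-` for both counts),
/// for lines without two tabs (headers, blank lines), and for empty paths.
fn parse_numstat_line(line: &str) -> Option<FileChurn> {
    let mut parts = line.splitn(3, '\t');
    let added = parts.next()?;
    let deleted = parts.next()?;
    let path = parts.next()?;
    if path.is_empty() {
        return None;
    }
    Some(FileChurn {
        path: path.to_string(),
        // `-` fails to parse as a number, so binary files are skipped here.
        lines_added: added.parse().ok()?,
        lines_deleted: deleted.parse().ok()?,
    })
}

/// Collects churn entries from the full command output, skipping unparseable lines.
fn parse_numstat(output: &str) -> Vec<FileChurn> {
    output.lines().filter_map(parse_numstat_line).collect()
}
```

Because the result is keyed by file path, every function in a file would share the same churn figures, which matches the file-to-function mapping described above.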
- -#### Co-Change Pairs -- **`extract_co_change_pairs(repo, window_days, min_count)`** — Walks git log for - the last N days, counts pairwise file co-occurrences, normalizes by minimum file - commit count, and returns `Vec` -- **Default:** 90-day window, `min_count = 3` -- **Filtering:** Ghost files (renamed/deleted) and trivially expected pairs - (e.g., `foo.rs` + `foo_test.rs`, `mod.rs` + sibling) are excluded - -#### Repository-Aware Operations -- **`extract_git_context_at(repo_path)`** — Uses explicit repo root (not CWD) -- **`extract_commit_churn_at(repo_path, sha)`** — Same -- **Graceful degradation:** Shallow clones, missing parents handled without errors - ---- - -### 10. Policy Evaluation (`hotspots-core/src/policy.rs`) - -Evaluates policies on deltas to enforce quality gates: - -#### Policy Types -1. **Critical Introduction** — Block new functions with critical risk -2. **Critical Regression** — Block functions that moved to critical band -3. **High Regression** — Warn on functions that moved to high band -4. **LRS Increase** — Warn on functions with LRS increase > threshold -5. **Metric Regression** — Warn on significant metric increases -6. **Band Transition** — Warn on any band transition (low→moderate, etc.) - -#### Evaluation Flow -1. Filter suppressed functions (`suppression_reason.is_some()`) -2. Check function status (`New`, `Modified`, etc.) -3. Evaluate policy condition (LRS threshold, band change, etc.) -4. Collect failures and warnings -5. Return `PolicyResults` with blocking failures and warnings - -#### CI Integration -- **Blocking failures** → CI fails -- **Warnings** → CI passes but reports issues -- **Suppressed functions** → Ignored (explicit opt-out) - ---- - -### 11. Output Formats - -#### Text Format - -In basic mode (`hotspots analyze src/`) the text output is a simple ranked table -(LRS, File, Line, Function, Risk). 
In snapshot mode (`--mode snapshot --format text`) -the text format requires one of three sub-modes: - -- **`--explain`** — Per-function human-readable breakdown: metric contributions, activity - signals (churn, touch count, fan-in, SCC, depth), plus a co-change coupling section - showing the top 10 high/moderate source-file pairs. -- **`--level file`** — Ranked file risk table (one row per file): max CC, avg CC, - function count, LOC, critical-band count, file churn, composite `file_risk_score`. -- **`--level module`** — Ranked module instability table (one row per directory): - file count, function count, avg CC, afferent/efferent coupling, instability, risk. - -#### JSON Format -- Complete snapshot/delta structure -- Pretty-printed for readability -- Includes all metrics, risk scores, and metadata - -#### JSONL Format -- One JSON object per line (newline-delimited) -- Each line is a complete `FunctionSnapshot` -- Suitable for streaming, database ingestion, DuckDB - -#### HTML Format -- Interactive report with: - - Sortable tables - - Risk band color coding - - Expandable function details - - Call graph visualization (future) -- Written to `.hotspots/report.html` by default - ---- - -### 12. Trend Analysis (`hotspots-core/src/trends.rs`) - -Analyzes the accumulated snapshot history to surface how risk has evolved over time. -Requires multiple snapshots in `.hotspots/snapshots/` (collected by CI or `hotspots replay`). - -**CLI:** `hotspots trends [--window N] [--top K] [--format json|text]` - -#### What It Computes - -**Risk Velocity (`Vec`)** — Rate and direction of LRS change per function over -the analysis window. Each entry includes `velocity: f64`, `direction` (Positive/Negative/Flat), -`first_lrs`, `last_lrs`, and `commit_count`. - -**Hotspot Stability (`Vec`)** — Consistency of a function appearing in the -top-K highest-risk functions across snapshots. 
Classified as: -- `Stable` — consistently in top-K (chronic hotspot) -- `Emerging` — recently appeared in top-K (rising risk) -- `Volatile` — intermittently in top-K (inconsistent) - -**Refactor Effectiveness (`Vec`)** — Detects functions that had a significant -LRS drop and tracks whether the improvement held. Classified as: -- `Successful` — improvement of ≥ 1.0 LRS, sustained, no rebound -- `Partial` — improvement occurred but partially rebounded -- `Cosmetic` — improvement below significance threshold - -#### Limitations -- Requires at least 2 snapshots to compute velocity; more snapshots yield more meaningful trends -- Operates on LRS (complexity-based score), not `activity_risk`, as the stable historical signal -- Snapshots must be on the same branch/mainline for meaningful comparison -- `--format html` not yet implemented (planned) - ---- - -### 13. Aggregate Analysis (`hotspots-core/src/aggregates.rs`, `hotspots-core/src/imports.rs`) - -Computes higher-level risk views from per-function data and git history. All three -aggregates are computed at output time and included in snapshot JSON under `aggregates`. - -#### File Risk (D-1) — `compute_file_risk_views()` - -Folds per-function data into one `FileRiskView` per unique file. No new git calls needed; -all inputs come from the enriched `FunctionSnapshot` list. - -``` -file_risk_score = max_cc × 0.4 - + avg_cc × 0.3 - + log2(function_count + 1) × 0.2 - + file_churn_factor × 0.1 -``` - -Ranked by `file_risk_score` descending. Accessible via `--level file` text output or -`aggregates.file_risk` in JSON. - -#### Co-Change Coupling (D-2) — `git::extract_co_change_pairs()` - -Mined from git log in `git.rs`, surfaced in aggregates. See section 9 (git.rs) for the -extraction details. Pairs are stored in `aggregates.co_change`. Shown in `--explain` -output as a coupling section below the per-function list. 
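The pairwise counting and normalization behind the co-change pairs can be sketched roughly as below. This is an illustration under stated assumptions, not the real `extract_co_change_pairs()`: the input is assumed to be the per-commit file lists already limited to the window, the struct and field names are hypothetical, and the ghost-file and trivial-pair filtering is omitted.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// One co-changing file pair (hypothetical shape for illustration).
#[derive(Debug, PartialEq)]
struct CoChangePair {
    file_a: String,
    file_b: String,
    co_change_count: u32,
    coupling_ratio: f64,
}

/// Counts how often two files appear in the same commit, keeps pairs seen at
/// least `min_count` times, and normalizes by the smaller per-file commit count.
fn co_change_pairs(commits: &[Vec<&str>], min_count: u32) -> Vec<CoChangePair> {
    let mut file_commits: BTreeMap<&str, u32> = BTreeMap::new();
    let mut pair_counts: BTreeMap<(&str, &str), u32> = BTreeMap::new();

    for files in commits {
        // Deduplicate and sort so each pair is counted once per commit as (a, b) with a < b.
        let unique: BTreeSet<&str> = files.iter().copied().collect();
        let sorted: Vec<&str> = unique.into_iter().collect();
        for (i, a) in sorted.iter().enumerate() {
            *file_commits.entry(*a).or_insert(0) += 1;
            for b in &sorted[i + 1..] {
                *pair_counts.entry((*a, *b)).or_insert(0) += 1;
            }
        }
    }

    pair_counts
        .into_iter()
        .filter(|(_, count)| *count >= min_count)
        .map(|((a, b), count)| {
            // A counted pair implies both files have at least one commit, so this is never zero.
            let min_commits = file_commits[a].min(file_commits[b]);
            CoChangePair {
                file_a: a.to_string(),
                file_b: b.to_string(),
                co_change_count: count,
                coupling_ratio: f64::from(count) / f64::from(min_commits),
            }
        })
        .collect()
}
```

Ordered maps are used so the output order is stable, in keeping with the determinism goal stated elsewhere in the document.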
- -#### Module Instability (D-3) — `compute_module_instability()` - -Parses `use`/`import` statements per language (via `imports.rs`) to build a file-level -import graph, then aggregates to directory level: - -- **Afferent coupling** — number of external directories that import from this directory -- **Efferent coupling** — number of external directories this directory imports from -- **Instability** = `efferent / (afferent + efferent)` (0.0 = depended on by all; 1.0 = depends on others only) -- **`module_risk`** = `high` if `instability < 0.3` and `avg_complexity > 10` - -Accessible via `--level module` text output or `aggregates.modules` in JSON. - -> **Resolution quality note:** Import-based resolution is used for D-3 (not name-based -> call graph resolution). This gives better coverage than the function-level call graph -> but is still best-effort — re-exports, conditional imports, and generated code may -> produce inaccurate edge counts. - ---- - -## Data Flow Examples - -### Example 1: Basic Analysis (No Snapshot Mode) - -``` -1. CLI: hotspots analyze src/ -2. Load config (.hotspotsrc.json or defaults) -3. Collect source files (recursive, filtered by include/exclude) -4. For each file: - a. Parse → ParsedModule - b. Discover functions → Vec - c. For each function: - - Build CFG - - Extract metrics (CC, ND, FO, NS, LOC) - - Calculate LRS and risk band -5. Aggregate reports across all files -6. Sort by LRS descending -7. Output: Text table or JSON -``` - -### Example 2: Snapshot Mode - -``` -1. CLI: hotspots analyze src/ --mode snapshot -2. Run basic analysis (steps 1-4 above) -3. Extract git context (SHA, parents, timestamp, etc.) -4. Build call graph from callee_names -5. Extract churn metrics (if parent exists) -6. Extract touch metrics (30-day commit count) -7. 
Enrich snapshot: - - Populate churn → functions - - Populate touch metrics → functions - - Populate call graph metrics → functions - - Compute activity risk → functions - - Compute percentiles → functions - - Compute summary → snapshot -8. Persist snapshot: .hotspots/snapshots/.json -9. Update index: .hotspots/index.json -10. Output: JSON/HTML/JSONL -``` - -### Example 3: Delta Mode - -``` -1. CLI: hotspots analyze src/ --mode delta -2. Run snapshot enrichment (steps 1-7 above) -3. Load parent snapshot: .hotspots/snapshots/.json -4. Compute delta: - - Match functions by function_id - - Compare metrics/LRS/band - - Classify: New/Deleted/Modified/Unchanged - - Calculate numeric deltas - - Detect band transitions -5. Evaluate policies (if --policy flag): - - Check each policy condition - - Collect failures and warnings -6. Output: Delta JSON (with policy results if requested) -``` - ---- - -## Language-Specific Details - -### ECMAScript (TypeScript/JavaScript) - -**Parser:** SWC (Rust-based TypeScript/JavaScript parser; not the TypeScript compiler's own parser) -- **Advantages:** Full TypeScript support, JSX, decorators, type-aware -- **AST:** Owned `swc_ecma_ast::Module` (can store directly) -- **CFG Builder:** Visitor pattern, handles if/switch/loop/break/continue -- **Metrics:** AST-based for all metrics -- **Callee extraction:** `FanOutVisitor` walks `CallExpr`, extracts identifiers/members - -### Go - -**Parser:** tree-sitter-go -- **AST:** Tree-sitter nodes (tied to tree lifetime) → store source + node ID -- **CFG Builder:** Re-parse with tree-sitter, traverse AST -- **Metrics:** Tree-sitter AST traversal -- **Callee extraction:** Extract `call_expression` → `identifier`/`selector_expression` -- **Special handling:** `defer`, `go` statements, `panic()`, `os.Exit()`, `log.Fatal*` - -### Java - -**Parser:** tree-sitter-java -- **AST:** Tree-sitter nodes → store source + node ID -- **CFG Builder:** Re-parse with tree-sitter, handles if/switch/loop/try-catch -- **Metrics:** Tree-sitter AST traversal -- **Callee 
extraction:** Extract method calls, constructor calls -- **Special handling:** Ternary operators, lambda expressions (partial) - -### Python - -**Parser:** tree-sitter-python -- **AST:** Tree-sitter nodes → store source + node ID -- **CFG Builder:** Re-parse with tree-sitter, handles if/elif/else/for/while/try-except-finally -- **Metrics:** Tree-sitter AST traversal -- **Callee extraction:** Extract function calls -- **Special handling:** Match statements (partial CFG support) - -### Rust - -**Parser:** syn (Rust syntax parsing crate; not rustc's internal parser) -- **AST:** Owned `syn::ItemFn` (can store directly) -- **CFG Builder:** Traverse syn AST, handles if/match/loop/break/continue -- **Metrics:** Syn AST traversal -- **Callee extraction:** Extract function/method/macro calls -- **Special handling:** `?` operator, `unwrap()`/`expect()`, `panic!()`, `unreachable!()` - ---- - -## Configuration System (`hotspots-core/src/config.rs`) - -### Config File Discovery -1. Explicit path (`--config path/to/config.json`) -2. `.hotspotsrc.json` in project root -3. `hotspots.config.json` in project root -4. `package.json` under `"hotspots"` key - -### Config Structure -```json -{ - "include": ["src/**/*.ts"], - "exclude": ["**/*.test.ts", "**/node_modules/**"], - "thresholds": { - "moderate": 3.0, - "high": 6.0, - "critical": 9.0 - }, - "weights": { - "cc": 1.0, - "nd": 0.8, - "fo": 0.6, - "ns": 0.7 - }, - "scoring_weights": { - "churn": 0.5, - "touch": 0.3, - "recency": 0.2, - "fan_in": 0.4, - "scc": 0.3, - "depth": 0.1, - "neighbor_churn": 0.2 - }, - "min_lrs": 0.0, - "top": null -} -``` - -### ResolvedConfig -Merges user config with defaults, compiles glob patterns for fast matching. - ---- - -## Suppression System (`hotspots-core/src/suppression.rs`) - -Functions can be suppressed with inline comments: - -```typescript -// hotspots-ignore: This is a legacy function, will be removed in v2 -function oldFunction() { - // ... 
-} -``` - -**Behavior:** -- Suppressed functions are still analyzed and included in snapshots -- They are excluded from policy evaluation -- Suppression reason is stored in snapshot/delta -- Useful for documenting intentional tech debt - ---- - -## Performance Characteristics - -### Time Complexity -- **Parsing:** O(n) per file (n = file size) -- **Function discovery:** O(n) per file (AST traversal) -- **CFG construction:** O(n) per function (n = function size) -- **Metric extraction:** O(n) per function (AST/CFG traversal) -- **Call graph:** O(V + E) where V = functions, E = call edges -- **PageRank:** O(V * E * iterations) ≈ O(V * E * 20) -- **Betweenness:** O(V * E) (Brandes algorithm) -- **SCC:** O(V + E) (Tarjan's algorithm) - -### Space Complexity -- **AST storage:** O(n) per file -- **CFG:** O(n) per function (nodes + edges) -- **Call graph:** O(V + E) -- **Snapshots:** O(V) per snapshot (one entry per function) - -### Typical Performance -- **Small repo** (< 1000 functions): < 1 second -- **Medium repo** (1000-10000 functions): 1-10 seconds -- **Large repo** (> 10000 functions): 10-60 seconds - -**Bottlenecks:** -- Git operations (churn, touch metrics) — can be slow on large repos -- Call graph PageRank — O(V * E) can be expensive for very large graphs -- File I/O — reading many small files -- **Tree-sitter re-parse (Go/Java/Python):** CFG builders for Go, Java, and Python re-parse - the full source file for every function in that file (O(n × m) where n = functions, m = file - size). This is a design trade-off: tree-sitter nodes are tied to the tree's lifetime, so - the source must be re-parsed to find the target function node. ECMAScript and Rust are - unaffected (SWC/syn provide owned ASTs). A parse-result cache scoped per analysis run is - a planned fix. - ---- - -## Extensibility - -### Adding a New Language - -1. **Implement `LanguageParser`:** - - Parse source → `ParsedModule` - - Implement `discover_functions()` → `Vec` - -2. 
**Implement `CfgBuilder`:** - - Build CFG from `FunctionNode` - - Handle control flow (if/loop/switch/break/continue) - -3. **Add metric extraction:** - - Implement `extract_*_metrics()` in `metrics.rs` - - Extract CC, ND, FO, NS, LOC from AST/CFG - - Extract callee names for call graph - -4. **Update language enum:** - - Add language variant - - Add extension mapping - - Add parser/CFG builder dispatch - -5. **Add tests:** - - Golden tests with known outputs - - Language parity tests (equivalent functions across languages) - -**Estimated effort:** 8-16 hours per language - ---- - -## Testing Strategy - -### Unit Tests -- Per-module tests for parsing, CFG building, metric extraction -- Language-specific test fixtures - -### Golden Tests -- Deterministic output comparison -- 31+ golden test files covering various code patterns -- Regenerated when metrics change - -### Integration Tests -- End-to-end analysis of real repositories -- Git history tests (churn, touch metrics) -- Snapshot persistence and delta computation - -### Invariant Tests -- Determinism tests (identical input → identical output) -- Formatting independence tests -- Suppression system tests - ---- - -## Future Architecture Considerations - -### Planned Enhancements -1. **Cross-repo call graph** — Stitch call graphs across multiple repositories (hotspots-cloud) -2. **ML-based scoring** — Learn risk weights from historical incident data (hotspots-cloud) -3. **External event correlation** — Jira tickets, PagerDuty incidents (hotspots-cloud) - -### Architectural Boundaries - -**hotspots CLI (current scope):** -- Single-repo analysis -- Static metrics + git activity -- Call graph within repo -- Snapshot persistence -- Trend analysis from accumulated snapshots (`hotspots trends`) - -**hotspots-cloud (future scope):** -- Multi-repo aggregation -- Time-series analysis -- ML model training -- External API integration -- Team/org dashboards - ---- - -## Key Design Decisions - -### Why Per-Function Analysis? 
-- **Isolation** — Each function analyzed independently -- **Parallelization** — Functions can be analyzed in parallel (future) -- **Incremental** — Only changed functions need re-analysis -- **Clarity** — Results map directly to code units developers understand - -### Why Deterministic Ordering? -- **Reproducibility** — Same code → same results -- **Testing** — Golden tests can compare byte-for-byte -- **Debugging** — Consistent ordering makes issues easier to track - -### Why Immutable Snapshots? -- **History** — Complete analysis history preserved -- **Auditability** — Can verify past analysis results -- **Delta computation** — Compare any two snapshots -- **No corruption** — Atomic writes prevent partial snapshots - -### Why AST-Based Metrics? -- **Accuracy** — Understands code structure, not text patterns -- **Language-aware** — Handles language-specific constructs correctly -- **Formatting-independent** — Whitespace doesn't affect metrics -- **Extensible** — Easy to add new metrics or languages - -### Why Activity-Weighted Scoring? 
-- **Actionable** — Identifies code that's both complex AND changing -- **Prioritization** — Focuses effort on highest-impact refactoring -- **Evidence-based** — Uses git history, not gut feeling - ---- - -## Glossary - -- **LRS** — Local Risk Score (complexity-based risk, 0-20+) -- **Activity Risk** — Unified risk score combining LRS + activity + graph metrics -- **CFG** — Control Flow Graph (representation of function control flow) -- **SCC** — Strongly Connected Component (cyclic dependency group) -- **Fan-In** — Number of functions calling this function -- **Fan-Out** — Number of functions this function calls -- **Churn** — Lines added/deleted in a commit -- **Touch Count** — Number of commits modifying a file in last 30 days -- **Snapshot** — Immutable analysis result for a specific commit -- **Delta** — Comparison between two snapshots -- **Function ID** — Unique identifier: `file_path::function_name` - ---- - -**Document Status:** Current as of 2026-02-19 -**Maintainer:** Stephen Collins -**Questions?** Open an issue or see `docs/` for more details. diff --git a/docs/architecture/ARCHITECTURE_REVIEW.md b/docs/architecture/ARCHITECTURE_REVIEW.md deleted file mode 100644 index 6d83739..0000000 --- a/docs/architecture/ARCHITECTURE_REVIEW.md +++ /dev/null @@ -1,257 +0,0 @@ -# Architecture Review Findings - -**Date:** 2026-02-16 -**Reviewer:** Claude (claude-sonnet-4-5) -**Status:** Findings only — no plan yet -**Scope:** Review of `ARCHITECTURE.md` against actual codebase behavior - ---- - -## Overview - -This document records findings from a review of `ARCHITECTURE.md` against the actual codebase. It is -distinct from `IMPROVEMENTS.md` (which focuses on performance and extensibility). These findings are -primarily about **scoring accuracy**, **correctness**, and **documentation gaps** — places where the -architecture doc either understates limitations or omits known constraints. 
- ---- - -## Findings - -### F-1: Touch Metrics Are File-Level, Not Function-Level - -**Severity:** Medium — affects scoring accuracy -**Location:** `hotspots-core/src/git.rs`, `snapshot.rs` enrichment pipeline - -**Description:** -Touch count (commits in last 30 days) and days-since-last-change are computed at the **file** level -and then attributed uniformly to every function in that file. A file with 50 functions where only -one has been touched in the last month will report `touch_count_30d = N` for all 50 functions. - -The architecture doc acknowledges this in a single line ("Computed at: File level") but does not -flag it as a limitation or note how it affects score interpretation. - -**Impact:** -- `touch_factor` in `activity_risk` is a noisy signal for large files with many functions -- Functions in "hot" files are over-penalized relative to their actual activity -- Especially misleading for large utility files or files with mixed-stability functions - -**What the doc should say:** -Explicitly call out that touch metrics are file-granularity approximations, and that files with many -functions will distribute the same touch count across all of them regardless of which function -actually changed. - ---- - -### F-2: Fan-Out Is Double-Penalized in the Scoring Formula - -**Severity:** Low-Medium — affects score calibration -**Location:** `hotspots-core/src/risk.rs`, `hotspots-core/src/scoring.rs` - -**Description:** -Fan-out enters the score in two distinct ways: - -1. `R_fo = min(log2(FO + 1), 6.0)` feeds into **LRS** (via `w_fo * R_fo`) -2. LRS feeds directly into `activity_risk` with weight 1.0 -3. `fan_in_factor = min(fan_in / 5, 10.0)` also feeds into `activity_risk` - -Fan-in and fan-out are different metrics, so (3) is not literally double-counting. 
However, because -high-fan-out functions tend to also have high fan-in (functions that call many things tend to be -called by many things), there is a systematic correlation that amplifies scores for highly-connected -functions beyond what either metric alone would suggest. - -The architecture doc describes these as independent additive factors without noting the correlation. - -**Impact:** -- Hub functions (high fan-in AND high fan-out) are disproportionately penalized -- Score interpretation is harder without knowing this interaction - -**What the doc should say:** -Note that fan-in and fan-out are correlated in typical codebases and that connected hub functions -will tend to score higher than the formula implies in isolation. - ---- - -### F-3: Name-Based Call Graph Resolution Has Significant Accuracy Limits - -**Severity:** Medium — affects call graph metric accuracy -**Location:** `hotspots-core/src/callgraph.rs` - -**Description:** -Call graph edges are built by matching callee names extracted from ASTs against known function names -using simple rules: prefer same-file match, fall back to first match globally. This approach cannot -resolve: - -- **Method dispatch through interfaces/traits** — calling `trait.method()` where the concrete type - is unknown -- **Higher-order functions** — functions passed as arguments and called indirectly -- **Closures** — anonymous functions called through variables -- **Dynamic dispatch** — virtual methods, function pointers, reflection - -For Go/Java/Python specifically, where interface-based dispatch is idiomatic, a significant -fraction of call edges may be unresolved or incorrectly resolved. - -The architecture doc says "Resolve calls — Match callee names to function IDs" and describes the -preference rules without noting these limitations. 
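The resolution rule described above (prefer a same-file match, otherwise take the first global match) can be sketched as follows. `FunctionRef` and `resolve_callee` are hypothetical names for illustration, not the actual `callgraph.rs` API, and "first match" is assumed here to mean first in input order.

```rust
/// A known function, identified by file and name (hypothetical type for illustration).
#[derive(Debug, Clone, PartialEq)]
struct FunctionRef {
    file: String,
    name: String,
}

/// Resolves a callee name seen in `caller_file` against the known functions:
/// a same-file match wins; otherwise the first function with that name is used.
fn resolve_callee<'a>(
    caller_file: &str,
    callee_name: &str,
    known: &'a [FunctionRef],
) -> Option<&'a FunctionRef> {
    known
        .iter()
        .find(|f| f.name == callee_name && f.file == caller_file)
        .or_else(|| known.iter().find(|f| f.name == callee_name))
}
```

Only the callee's name takes part in the lookup, so a call made through a trait object or interface resolves to whichever same-named function comes first, regardless of the receiver's concrete type. That is the accuracy limit this finding describes.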
- -**Impact:** -- PageRank, betweenness centrality, fan-in, and neighbor_churn are all derived from an incomplete - call graph -- Functions that are primary callee targets via interfaces may appear to have lower fan-in than they - actually do -- "Architectural hub" detection may miss actual hubs that are only reached via interfaces - -**What the doc should say:** -Explicitly characterize the call graph as a "best-effort static approximation" that works well for -direct calls but systematically misses dynamic/interface dispatch. Quantify the expected coverage -where possible (e.g., "Go codebases using idiomatic interface patterns may have 20-40% unresolved -call edges"). - ---- - -### F-4: Tree-Sitter Re-Parse Per Function Is O(n × m) - -**Severity:** Medium — main performance bottleneck for Go/Java/Python -**Location:** `hotspots-core/src/language/{go,java,python}/cfg_builder.rs` - -**Description:** -The Go, Java, and Python CFG builders re-parse the full source file with tree-sitter for every -function in that file. A file with 30 functions is parsed 30 times. This is O(n × m) where n = -number of functions in the file and m = file size, rather than O(m) for parsing once and O(n) for -CFG construction. - -The architecture doc explains _why_ the source is stored (tree-sitter node lifetimes) and notes -"re-parse when needed" as the design choice, but frames this as a rational trade-off rather than -a known bottleneck. - -**Impact:** -- For large Go/Java/Python files (e.g. 
1000+ line files with 30+ functions), CFG building - dominates analysis time -- Contrast: ECMAScript and Rust parse once (owned ASTs), so the problem is language-specific -- The `IMPROVEMENTS.md` "Incremental Analysis" section would partly address this but does not - identify it as the root cause - -**What the doc should say:** -Flag the re-parse pattern as a known performance issue specific to tree-sitter languages, and note -that the fix (parse once, traverse AST to find the target function node) is straightforward but -requires careful handling of tree-sitter lifetimes. - ---- - -### F-5: `function_id` Is Path-Dependent — Refactoring Commits Lose History - -**Severity:** Medium — affects delta accuracy for refactoring -**Location:** `hotspots-core/src/delta.rs`, `hotspots-core/src/snapshot.rs` - -**Description:** -`function_id` is `file_path::function_name`. Any of these operations reset the ID and cause a -delete+add in delta output: - -- Renaming a file -- Moving a file to a different directory -- Renaming a function - -This is documented for file moves ("Treated as delete + add") but not for function renames, and -neither case is framed as a limitation — just as a design choice. - -**Impact:** -- Refactoring commits (which frequently rename/move things) produce delta noise: functions appear - as deleted and re-added with no history -- Trend analysis and policy evaluation see new functions where old ones existed -- Teams that rename functions to improve clarity get penalized by policies that flag "New critical - functions" -- The suppression system cannot easily suppress renames since the ID has changed - -**What the doc should say:** -Explicitly document that `function_id` stability depends on stable file paths and function names. -Refactoring commits that rename or move functions will appear as delete+add pairs. Consider whether -a content-hash or signature-based ID would better serve the delta use case. 
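To make the delete+add behavior concrete, here is a small sketch of matching by `function_id`. It assumes the ID is built as `file_path::function_name` as described in this finding, and it compares only LRS to decide between `Modified` and `Unchanged`, which is a simplification of the real comparison. `classify` and the map-based inputs are hypothetical, not the `delta.rs` API.

```rust
use std::collections::BTreeMap;

/// Change status of a function between two snapshots.
#[derive(Debug, PartialEq)]
enum Status {
    New,
    Deleted,
    Modified,
    Unchanged,
}

/// Builds the path-dependent identifier: `file_path::function_name`.
fn function_id(file_path: &str, function_name: &str) -> String {
    format!("{}::{}", file_path, function_name)
}

/// Classifies every function_id present in either snapshot.
/// Map values are LRS scores (simplification: only LRS is compared).
fn classify(
    parent: &BTreeMap<String, f64>,
    current: &BTreeMap<String, f64>,
) -> BTreeMap<String, Status> {
    let mut out = BTreeMap::new();
    for (id, lrs) in current {
        let status = match parent.get(id) {
            None => Status::New,
            Some(prev) if prev == lrs => Status::Unchanged,
            Some(_) => Status::Modified,
        };
        out.insert(id.clone(), status);
    }
    // IDs that exist only in the parent snapshot are reported as deleted.
    for id in parent.keys() {
        if !current.contains_key(id) {
            out.insert(id.clone(), Status::Deleted);
        }
    }
    out
}
```

With this scheme, moving `parse` from `src/a.rs` to `src/b.rs` without changing its body yields one `Deleted` entry and one `New` entry, even though the score is identical on both sides.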
- ---- - -### F-6: Schema Migration Strategy Is Undefined - -**Severity:** Low-Medium — operational risk -**Location:** `hotspots-core/src/snapshot.rs` - -**Description:** -Snapshots carry a `schema_version: u32` field described as enabling "forward compatibility." The -architecture doc says: "Schema versioning — snapshots carry version numbers for forward -compatibility." But nowhere is the actual migration behavior defined: - -- What happens when a new version of `hotspots` reads an old-schema snapshot? -- Is it silently ignored? Does it fail with an error? Is it automatically migrated? -- What happens to the index if it contains mixed schema versions? - -**Impact:** -- Users who upgrade hotspots may find existing snapshots unreadable or silently dropped -- The index could reference snapshots that the new version cannot parse -- No migration guide exists - -**What the doc should say:** -Document the schema migration policy explicitly: which schema versions are supported, what happens -on mismatch, and whether migration is automatic or manual. - ---- - -### F-7: PR Context Detection Is CI-Only - -**Severity:** Low — documentation gap -**Location:** `hotspots-core/src/git.rs` (`detect_pr_context`) - -**Description:** -PR detection relies on CI environment variables (`GITHUB_BASE_REF`, `CI`, `PULL_REQUEST`, etc.). -Running `hotspots analyze --mode snapshot` locally on a PR branch will be detected as mainline and -will persist a snapshot for that commit, which may not be desired. - -The architecture doc mentions "Detect PR context (best-effort, CI env vars only)" but the -"best-effort" qualifier is easy to overlook and the local-branch implication is not spelled out. 
- -**Impact:** -- Local development on PR branches generates snapshots indexed as mainline -- Could pollute the snapshot history with PR commits if developer runs snapshot mode locally - -**What the doc should say:** -Explicitly note that `--mode snapshot` run locally on any branch (including feature branches) will -persist a snapshot. PR detection only suppresses persistence in CI environments that set standard -PR environment variables. - ---- - -### F-8: Trend Analysis Module Is Undocumented - -**Severity:** Low — documentation gap -**Location:** `hotspots-core/src/trends.rs`, `hotspots-cli/src/main.rs` (`Commands::Trends`) - -**Description:** -The `trends` subcommand (`hotspots trends`) is implemented and exposed in the CLI but does not -appear anywhere in `ARCHITECTURE.md`. The module computes risk velocity and hotspot stability from -snapshot history. - -**Impact:** -- Users and contributors have no architecture-level documentation for this feature -- The scoring formula, window size semantics, and output format are undocumented architecturally - -**What the doc should say:** -Add a section covering the trends system: what it computes, how it reads the snapshot index, what -"risk velocity" and "hotspot stability" mean, and how the window parameter works. 
- ---- - -## Summary Table - -| ID | Finding | Severity | Type | -|-----|------------------------------------------|-------------|-------------------------| -| F-1 | Touch metrics are file-level | Medium | Scoring accuracy | -| F-2 | Fan-out double-penalized | Low-Medium | Score calibration | -| F-3 | Name-based call graph accuracy limits | Medium | Call graph correctness | -| F-4 | Tree-sitter re-parse is O(n×m) | Medium | Performance bottleneck | -| F-5 | function_id is path-dependent | Medium | Delta accuracy | -| F-6 | Schema migration strategy undefined | Low-Medium | Operational risk | -| F-7 | PR detection is CI-only | Low | Documentation gap | -| F-8 | Trends module undocumented | Low | Documentation gap | - ---- - -**Status:** Findings documented. Plan to address not yet determined. -**See also:** [`IMPROVEMENTS.md`](./IMPROVEMENTS.md) for performance and extensibility proposals. diff --git a/docs/architecture/AUDIT_REPORT.md b/docs/architecture/AUDIT_REPORT.md deleted file mode 100644 index 254f513..0000000 --- a/docs/architecture/AUDIT_REPORT.md +++ /dev/null @@ -1,287 +0,0 @@ -# Hotspots Codebase Audit Report - -**Date:** 2026-02-17 (Revised) -**Scope:** Architectural issues and code smells across the Rust codebase -**Codebase Size:** ~18,090 LOC (production + tests) - ---- - -## Executive Summary - -Hotspots is a well-structured Rust project with strong invariants (determinism, immutability, per-function isolation). The codebase shows good patterns: `SnapshotEnricher` builder, shared `tree_sitter_utils`, AST-based call graph, parameter structs. Several issues remain: structural duplication in metrics, repetitive policy evaluation, panic-prone CFG visitor state, dead code, and oversized modules. None are critical; addressing them would improve maintainability and robustness. - ---- - -## 1. 
Architectural Issues - -### 1.1 metrics.rs — Large Monolith with Structural Duplication (~1,708 lines) - -**Severity:** High -**Status:** Partially addressed - -The largest file in the codebase. Tree-sitter metrics (Go, Java, Python) share helpers (`ts_with_function_body`, `ts_find_function_by_start`, `ts_find_child_by_kind`, `ts_nesting_depth`), but still have duplication: - -- **Callee extraction:** `go_extract_callees`, `java_extract_callees`, `python_extract_callees` follow the same pattern (collect nodes, extract identifiers, dedupe) -- **Non-structured exits:** `go_non_structured_exits`, `java_non_structured_exits`, `python_non_structured_exits` traverse AST counting exits -- **CC extras:** `go_count_cc_extras`, `java_count_cc_extras`, `python_count_cc_extras` count language-specific complexity contributors - -**What's good:** Shared helpers reduce duplication. The macro `impl_nesting_visitor!` eliminates ECMAScript visitor duplication. - -**Remaining duplication:** ~300-400 lines could be unified with a generic `TreeSitterMetricsConfig` that parameterizes node kinds per metric. - -**Ref:** ANALYSIS.md §1, CODEBASE_IMPROVEMENTS §2.2 - ---- - -### 1.2 Snapshot — Data Struct with Orchestration Logic - -**Severity:** Medium -**Status:** Partially addressed - -`Snapshot` is both a data container and an orchestrator. It exposes: -- `populate_churn()`, `populate_touch_metrics()`, `populate_callgraph()` -- `compute_activity_risk()`, `compute_percentiles()`, `compute_summary()` - -**Status:** `SnapshotEnricher` builder exists and makes ordering explicit. However, the mutation methods still live on `Snapshot`; enrichment could be moved entirely into the enricher for a pure data struct. - -**Recommendation:** Move all enrichment logic into `SnapshotEnricher`, leaving `Snapshot` as a pure data container. 
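
A minimal sketch of what this recommendation could look like, assuming a consuming builder. The field names and the `activity_risk` formula below are placeholders invented for illustration; they are not the real `Snapshot` schema or scoring logic.

```rust
/// Pure data: no enrichment or mutation methods live on these types.
#[derive(Debug, Clone, PartialEq)]
pub struct FunctionEntry {
    pub name: String,
    pub complexity: f64,
    pub touches_30d: u32,
    /// Filled in by the enricher; `None` until enrichment has run.
    pub activity_risk: Option<f64>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Snapshot {
    pub functions: Vec<FunctionEntry>,
}

/// Owns the snapshot while it is being enriched, so every mutation
/// is expressed as a builder step instead of a `Snapshot` method.
pub struct SnapshotEnricher {
    snapshot: Snapshot,
}

impl SnapshotEnricher {
    pub fn new(snapshot: Snapshot) -> Self {
        Self { snapshot }
    }

    /// Placeholder formula (an assumption for this sketch only).
    pub fn with_activity_risk(mut self) -> Self {
        for f in &mut self.snapshot.functions {
            f.activity_risk = Some(f.complexity * (1.0 + f64::from(f.touches_30d)));
        }
        self
    }

    /// Hands back the finished, plain-data snapshot.
    pub fn build(self) -> Snapshot {
        self.snapshot
    }
}
```

Because each step takes `self` by value and returns it, the enrichment order is visible at the call site, for example `SnapshotEnricher::new(snapshot).with_activity_risk().build()`.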
- -**Ref:** ANALYSIS.md §4 - ---- - -### 1.3 policy.rs — Repetitive Evaluation Pattern (~1,077 lines) - -**Severity:** Medium - -Seven policy evaluators share this pattern: - -```rust -for entry in active_deltas(deltas) { - if entry.status != { continue; } - // condition check... - results.failed.push(...) or results.warnings.push(...); -} -``` - -The loop and status filter are duplicated. `active_deltas()` centralizes suppression filtering, but status checks and result pushing are repeated. - -**Recommendation:** Introduce a `Policy` trait: -```rust -trait Policy { - fn id(&self) -> PolicyId; - fn severity(&self) -> PolicySeverity; - fn target_statuses(&self) -> &[FunctionStatus]; - fn evaluate(&self, entry: &FunctionDeltaEntry, config: &Config) -> Option; -} -``` - -Then a single evaluation loop dispatches to all policies. - -**Ref:** ANALYSIS.md §3 - ---- - -### 1.4 main.rs — Long CLI Module (~1,401 lines) - -**Severity:** Medium -**Status:** Partially addressed (2026-02-18) - -`main()` reduced from cc=50 to ~cc=6 by extracting `handle_analyze`, `handle_prune`, -`handle_compact`, `handle_config`, `handle_trends`. `build_enriched_snapshot` also extracted. - -Remaining: `handle_mode_output` (cc=30, ~170 lines) still interleaves snapshot/delta output -formatting. Output formatting (JSON/HTML/JSONL/Text) and aggregates computation are duplicated -between the two modes. - -**Recommendation:** Extract `emit_snapshot_output()` and `emit_delta_output()` to shrink -`handle_mode_output` to ~50 lines and clarify control flow. - ---- - -### 1.5 html.rs — String-Based Templating (~1,034 lines) - -**Severity:** Low - -HTML is built with `format!()` and inline CSS/JS strings. Drawbacks: -- No compile-time validation of HTML -- No syntax checking for CSS/JS -- Potential XSS if user content is ever interpolated - -**Mitigation:** Data is internal; no user input is currently rendered. Consider `askama` or `include_str!` for CSS/JS if reports grow. 
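
If `format!()`-based templating is kept, the XSS drawback noted above could be reduced by escaping every interpolated value. The helpers below are a hypothetical sketch, not code from `html.rs`; they cover HTML element content and quoted attribute values only, not values placed inside inline `<script>` or `<style>` blocks.

```rust
/// Escapes the five characters that are significant in HTML text
/// and quoted attribute values.
pub fn escape_html(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#x27;"),
            _ => out.push(c),
        }
    }
    out
}

/// Hypothetical helper: renders a table cell with its content escaped.
pub fn render_cell(value: &str) -> String {
    format!("<td>{}</td>", escape_html(value))
}
```

Routing all interpolation through one helper like `render_cell` keeps the escaping decision in a single place, which is the main property a template engine such as `askama` would otherwise provide.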
- -**Ref:** ANALYSIS.md §5 - ---- - -### 1.6 callgraph.rs — Dead Code Path - -**Severity:** Low -**Status:** ✅ Fixed (2026-02-17) - -`CallGraph::from_sources()` has been deleted along with its `use regex::Regex` import. -`build_call_graph` in `lib.rs` uses AST-derived `callee_names` from reports — the correct path. - ---- - -## 2. Code Smells - -### 2.1 Panic-Prone expect/unwrap in Production - -**Severity:** Medium - -| Location | Context | Risk | -|----------|---------|------| -| `metrics.rs:518` | `.unwrap_or(RawMetrics{...})` | ✅ Safe fallback | -| `go/java/python cfg_builder.rs` | `self.current_node.expect("Current node should exist")` | ⚠️ Can panic if visitor state is wrong | -| `cfg/builder.rs` | `self.current_node.expect("Current node should exist")` (8 instances) | ⚠️ Same risk | -| `snapshot.rs:650` | `.unwrap_or(0)` | ✅ Safe fallback | -| `callgraph.rs:371` | `current_depth.unwrap()` after `is_none()` check | ✅ Safe (checked first) | -| `html.rs:639` | `.unwrap_or(Ordering::Equal)` | ✅ Safe fallback | -| `trends.rs:195` | `match (first(), last())` with `_ => continue` | ✅ Handles empty case | -| `git.rs:22,26` | `Regex::new(...).unwrap()` in `OnceLock` | ✅ Compile-time regex, safe | -| `policy.rs:662` | `.unwrap()` in test | ✅ Test-only | - -**Production panic risks:** CFG builder visitor state (15 instances). If visitor callbacks are invoked out of order or without proper state, these will panic. - -**Recommendation:** Replace CFG builder `.expect()` with `?` or explicit error handling. Consider `Option<&Node>` return type from visitor methods. - ---- - -### 2.2 TODOs in Production Code - -**Severity:** Low - -| File | Line | TODO | -|------|------|------| -| `python/cfg_builder.rs` | 367 | "Model match statement CFG more precisely" | -| `java/cfg_builder.rs` | 507-509 | "Check for conditional_expression (ternary)", "binary_expression with && or ||", "lambda_expression with control flow" | - -**Recommendation:** Implement or add tracking in TASKS.md. 
Remove or clarify if deferred. - -**Ref:** CODEBASE_IMPROVEMENTS §5.2 - ---- - -### 2.3 Documentation Gaps - -**Severity:** Trivial -**Status:** ✅ Fixed - -- ✅ `lib.rs:79`: Comment correctly lists "TypeScript, JavaScript, Go, Java, Python, Rust" -- ✅ `lib.rs:137-143`: `collect_source_files` doc correctly lists all languages including Java (.java) and Python (.py, .pyw) - -**Previous issue resolved.** - ---- - -### 2.4 Incomplete Features - -**Severity:** Low -**Status:** ✅ Fixed (2026-02-17) - -- `hotspots compact --level 1|2`: Now exits non-zero with a clear "not yet implemented" error - instead of silently updating metadata. Prevents misleading UX. - ---- - -## 3. Module Size / Complexity - -| File | Lines | Notes | -|------|-------|-------| -| metrics.rs | 1,708 | Largest; tree-sitter duplication | -| main.rs | 1,401 | Long; could split output handling | -| snapshot.rs | 1,213 | Mix of data and orchestration | -| policy.rs | 1,077 | Repetitive evaluators | -| html.rs | 1,034 | String-based templating | -| config.rs | 930 | Config loading and resolution | -| git.rs | 856 | Git operations | -| cfg/builder.rs | 732 | ECMAScript CFG builder | -| trends.rs | 723 | Trend analysis | - -Files over ~800 lines are candidates for splitting or refactoring. - ---- - -## 4. Error Handling Patterns - -- **Strengths:** `anyhow::Result` and `.context()` used consistently; graceful handling for shallow clones and missing parents. -- **Concerns:** CFG builder visitor state uses `.expect()` (15 instances); `eprintln!` for touch-metrics failure instead of structured logging. - ---- - -## 5. Test Code Quality - -- **Tests use `.unwrap()` and `.expect()`:** Acceptable in tests where failure should be immediate. -- **Integration tests:** Use temp dirs, real git, real filesystem. -- **Golden tests:** Good coverage across languages and patterns. - ---- - -## 6. Positive Patterns - -1. **Determinism:** Sorted outputs, no HashMap iteration leaks. -2. 
**SnapshotEnricher:** Explicit enrichment ordering (builder pattern). -3. **`tree_sitter_utils`:** Shared helpers for parsers and CFG builders. -4. **AST-based call graph:** Uses callee names from metrics; no regex in main path. -5. **Parameter structs:** `ModeOutputOptions`, `ActivityRiskInput`, `TarjanState` reduce argument count. -6. **No `#[allow(...)]` in production code:** Clippy compliance. -7. **Safe fallbacks:** Most `.unwrap()` uses have safe fallbacks (`unwrap_or`, `unwrap_or_else`). - ---- - -## 7. Priority Recommendations - -| Priority | Issue | Effort | Impact | -|----------|-------|--------|--------| -| 1 | Replace CFG builder `.expect()` with error handling | Medium | Robustness (prevents panics) | -| 2 | Extract policy trait / table-driven evaluation | Medium | Maintainability (reduces duplication) | -| 3 | Remove or document `CallGraph::from_sources` | Low | Cleanliness (dead code) | -| 4 | Extract TreeSitterMetricsConfig (generic tree-sitter metrics) | Medium | Less duplication (~300-400 lines) | -| 5 | Extract emit_snapshot_output / emit_delta_output from main | Low | Readability (shrink main.rs) | -| 6 | Move enrichment logic entirely into SnapshotEnricher | Low | Separation of concerns | -| 7 | Address CFG builder TODOs or track in TASKS | Low–Medium | Completeness | -| 8 | Document or implement compact subcommand | Low | UX | - ---- - -## 8. Summary - -| Category | Count | Notes | -|----------|-------|-------| -| Architectural issues | 6 | Duplication, dead code, oversized modules | -| Code smells | 4 | Panic risks, TODOs, incomplete features | -| Oversized modules (>800 LOC) | 9 | Candidates for splitting | -| Panic-prone production paths | 15 | CFG builder visitor state | -| TODOs in production | 4 | Python match CFG, Java ternary/binary/lambda | -| Dead code | 1 | `CallGraph::from_sources` | - -**Overall Assessment:** The codebase is in good shape with strong patterns. 
Highest-value improvements: reduce CFG builder panic risk, simplify policy evaluation, remove dead code, and extract generic tree-sitter metrics. - ---- - -## 9. Changes Since Last Audit - -**Fixed:** -- ✅ `lib.rs` documentation now correctly lists all 6 languages (A-3) -- ✅ `html.rs` uses `unwrap_or(Ordering::Equal)` for safe fallback -- ✅ `trends.rs` handles empty slices correctly with `match` -- ✅ `callgraph.rs` — `from_sources()` dead code deleted (A-1) -- ✅ `snapshot.rs` — `to_jsonl()` `.unwrap()` replaced with `.context()` (A-2) -- ✅ `compact --level 1|2` now bails with clear "not implemented" error (A-4) -- ✅ `main()` cc reduced from 50 to ~6 via command handler extraction (A-7 partial) - -**Still Open:** -- ⚠️ CFG builder visitor state panic risk (~28 `.expect()` instances across 4 files) -- ⚠️ Policy evaluation duplication (A-6) -- ⚠️ `handle_mode_output` emit extraction — cc=30 still high (A-7 remainder) -- ⚠️ Tree-sitter metrics duplication (A-8, deferred) -- ⚠️ CFG builder TODOs — Python match CC, Java ternary/lambda (A-5) - ---- - -**References:** -- [ANALYSIS.md](../../ANALYSIS.md) -- [CODEBASE_IMPROVEMENTS.md](../../CODEBASE_IMPROVEMENTS.md) -- [ARCHITECTURE.md](./ARCHITECTURE.md) -- [IMPROVEMENTS.md](./IMPROVEMENTS.md) diff --git a/docs/architecture/IMPROVEMENTS.md b/docs/architecture/IMPROVEMENTS.md deleted file mode 100644 index 22e0b8f..0000000 --- a/docs/architecture/IMPROVEMENTS.md +++ /dev/null @@ -1,664 +0,0 @@ -# Hotspots Architecture Improvements - -**Version:** 1.0 -**Last Updated:** 2026-02-15 -**Status:** Proposal - ---- - -## Overview - -This document outlines potential architectural improvements to Hotspots, organized by priority and impact. These improvements would enhance performance, maintainability, extensibility, and scalability without breaking existing functionality. - ---- - -## High Priority: Performance & Scalability - -### 1. 
Parallel File Analysis - -**Current State:** -- Analysis is single-threaded -- Files are processed sequentially -- No parallelization of independent operations - -**Proposed Improvement:** -- Use `rayon` for parallel file processing -- Parallelize independent stages: - - File parsing (per-file, no shared state) - - Function discovery (per-file) - - CFG building (per-function) - - Metric extraction (per-function) - -**Implementation:** -```rust -// Parallel file analysis -let reports: Vec<_> = files - .par_iter() - .map(|file| analyze_file(file, config)) - .collect(); -``` - -**Benefits:** -- 4-8x speedup on multi-core systems -- Scales with CPU cores -- Minimal code changes (rayon's `par_iter`) - -**Challenges:** -- Must maintain deterministic ordering for output -- Git operations still sequential (external dependency) -- Memory usage increases with parallelism - -**Estimated Impact:** 4-8x faster for large repos - ---- - -### 2. Incremental Analysis & Caching - -**Current State:** -- Full analysis runs every time -- No caching of parsed ASTs or CFGs -- No change detection - -**Proposed Improvement:** -- Cache parsed ASTs per file (hash-based) -- Cache CFGs per function (content hash) -- Only re-analyze changed files/functions -- Store cache in `.hotspots/cache/` - -**Implementation:** -```rust -struct AnalysisCache { - ast_cache: HashMap, // hash -> AST - cfg_cache: HashMap, // hash -> CFG -} - -fn analyze_with_cache(file: &Path, cache: &mut AnalysisCache) -> Result> { - let content_hash = hash_file(file)?; - if let Some((cached_hash, ast)) = cache.ast_cache.get(file) { - if *cached_hash == content_hash { - return Ok(use_cached_ast(ast)); - } - } - // Parse and cache - let ast = parse(file)?; - cache.ast_cache.insert(file.clone(), (content_hash, ast)); - // ... 
-} -``` - -**Benefits:** -- 10-100x faster for incremental changes -- Reduces CPU usage -- Enables faster CI feedback - -**Challenges:** -- Cache invalidation strategy -- Cache size management -- Determinism must be preserved - -**Estimated Impact:** 10-100x faster for incremental runs - ---- - -### 3. Batched Git Operations - -**Current State:** -- Each file's touch metrics require separate `git log` calls -- Sequential git operations -- No caching of git results - -**Proposed Improvement:** -- Batch git operations: `git log --since=X --until=Y -- file1 file2 file3 ...` -- Cache git results per commit SHA -- Parallel git operations where possible - -**Implementation:** -```rust -fn batch_touch_metrics( - files: &[PathBuf], - as_of: i64, -) -> Result> { - let output = git(&[ - "log", - &format!("--since={}", as_of - 30*24*60*60), - &format!("--until={}", as_of), - "--oneline", - "--", - // All files at once - ])?; - // Parse output and group by file -} -``` - -**Benefits:** -- 10-50x faster git operations for many files -- Reduces git process overhead -- Better for large repos - -**Challenges:** -- Git command line length limits -- Output parsing complexity -- Still sequential (git limitation) - -**Estimated Impact:** 10-50x faster git operations - ---- - -### 4. 
Optimize Call Graph Algorithms - -**Current State:** -- PageRank: O(V * E * iterations) ≈ O(V * E * 20) -- Betweenness: O(V * E) (Brandes algorithm) -- Both run on every snapshot - -**Proposed Improvement:** -- Incremental PageRank (only recompute changed nodes) -- Approximate Betweenness (sampling-based) -- Skip graph metrics if call graph unchanged -- Use sparse matrix representations - -**Implementation:** -```rust -// Incremental PageRank -fn incremental_pagerank( - graph: &CallGraph, - previous_scores: &HashMap, - changed_nodes: &HashSet, -) -> HashMap { - // Only recompute affected nodes -} -``` - -**Benefits:** -- 5-10x faster for large call graphs -- Scales better with repo size -- Enables real-time analysis - -**Challenges:** -- Algorithm correctness -- Maintaining determinism -- Testing complexity - -**Estimated Impact:** 5-10x faster call graph computation - ---- - -## Medium Priority: Architecture & Maintainability - -### 5. Plugin System for Metrics - -**Current State:** -- Metrics hardcoded in `metrics.rs` -- Adding new metrics requires core changes -- No way to extend metrics per-project - -**Proposed Improvement:** -- Trait-based metric system -- Plugin registry for custom metrics -- Config-driven metric selection - -**Implementation:** -```rust -trait MetricExtractor { - fn name(&self) -> &str; - fn extract(&self, function: &FunctionNode, cfg: &Cfg) -> f64; - fn weight(&self) -> f64; -} - -struct MetricRegistry { - extractors: Vec>, -} - -// Custom metric example -struct CustomMetric { - name: String, - extractor: fn(&FunctionNode, &Cfg) -> f64, - weight: f64, -} -``` - -**Benefits:** -- Extensibility without core changes -- Project-specific metrics -- Community contributions - -**Challenges:** -- Plugin API design -- Backward compatibility -- Performance overhead - -**Estimated Impact:** High extensibility, low performance impact - ---- - -### 6. 
Policy Trait System - -**Current State:** -- Policy evaluation duplicated across 7 functions -- Manual status filtering and suppression checks -- Hard to add new policies - -**Proposed Improvement:** -- Trait-based policy system -- Single evaluation loop -- Declarative policy definitions - -**Implementation:** -```rust -trait Policy { - fn id(&self) -> PolicyId; - fn severity(&self) -> PolicySeverity; - fn target_statuses(&self) -> &[FunctionStatus]; - fn evaluate(&self, entry: &FunctionDeltaEntry, config: &Config) -> Option; -} - -struct PolicyRegistry { - policies: Vec>, -} - -fn evaluate_all_policies( - deltas: &[FunctionDeltaEntry], - registry: &PolicyRegistry, - config: &Config, -) -> PolicyResults { - let mut results = PolicyResults::new(); - for entry in active_deltas(deltas) { - for policy in &registry.policies { - if policy.target_statuses().contains(&entry.status) { - if let Some(result) = policy.evaluate(entry, config) { - results.add(result); - } - } - } - } - results -} -``` - -**Benefits:** -- Eliminates duplication -- Easier to add policies -- Testable policy logic - -**Challenges:** -- Migration from current system -- Performance (trait objects) -- Backward compatibility - -**Estimated Impact:** Better maintainability, minimal performance impact - ---- - -### 7.
Dependency Injection for Testability - -**Current State:** -- Tight coupling to file system, git, and external dependencies -- Hard to test without real filesystem/git -- No way to mock dependencies - -**Proposed Improvement:** -- Trait-based abstractions for I/O -- Dependency injection container -- Test doubles for git/filesystem - -**Implementation:** -```rust -trait FileSystem { - fn read_file(&self, path: &Path) -> Result; - fn write_file(&self, path: &Path, content: &str) -> Result<()>; -} - -trait GitOperations { - fn get_commit_info(&self, sha: &str) -> Result; - fn get_churn(&self, sha: &str) -> Result>; -} - -struct AnalysisContext { - fs: Box, - git: Box, - config: ResolvedConfig, -} -``` - -**Benefits:** -- Unit tests without real filesystem -- Mock git operations -- Better test coverage - -**Challenges:** -- Large refactoring -- Trait object overhead -- Migration complexity - -**Estimated Impact:** Better testability, minimal runtime impact - ---- - -### 8. Streaming Output for Large Repos - -**Current State:** -- All results collected in memory -- JSON/HTML generated at end -- Memory usage scales with repo size - -**Proposed Improvement:** -- Stream results as they're computed -- Incremental JSON/HTML generation -- Support for very large repos - -**Implementation:** -```rust -trait OutputStream { - fn write_function(&mut self, report: &FunctionRiskReport) -> Result<()>; - fn finish(&mut self) -> Result<()>; -} - -struct StreamingJsonOutput { - writer: BufWriter, - first: bool, -} - -impl OutputStream for StreamingJsonOutput { - fn write_function(&mut self, report: &FunctionRiskReport) -> Result<()> { - if !self.first { - self.writer.write_all(b",\n")?; - } - serde_json::to_writer(&mut self.writer, report)?; - self.first = false; - Ok(()) - } -} -``` - -**Benefits:** -- Constant memory usage -- Handles very large repos -- Faster time-to-first-result - -**Challenges:** -- Output format changes -- HTML streaming complexity -- Backward compatibility - 
-**Estimated Impact:** Enables analysis of very large repos - ---- - -## Lower Priority: Quality of Life - -### 9. Language Plugin System - -**Current State:** -- Languages hardcoded in core -- Adding language requires core changes -- No way to extend language support externally - -**Proposed Improvement:** -- Dynamic language registration -- Plugin-based language support -- External language implementations - -**Implementation:** -```rust -trait LanguagePlugin { - fn name(&self) -> &str; - fn extensions(&self) -> &[&str]; - fn parser(&self) -> Box; - fn cfg_builder(&self) -> Box; -} - -struct LanguageRegistry { - languages: HashMap>, -} -``` - -**Benefits:** -- Community language contributions -- No core changes for new languages -- Experimental language support - -**Challenges:** -- Plugin API complexity -- Version compatibility -- Security considerations - -**Estimated Impact:** Enables community language support - ---- - -### 10. Structured Error Types - -**Current State:** -- Uses `anyhow::Result` everywhere -- Generic error messages -- Hard to handle specific error cases - -**Proposed Improvement:** -- Domain-specific error types -- Structured error information -- Better error recovery - -**Implementation:** -```rust -#[derive(Debug, thiserror::Error)] -enum AnalysisError { - #[error("Parse error in {file}: {message}")] - ParseError { file: PathBuf, message: String }, - #[error("Git error: {0}")] - GitError(#[from] GitError), - #[error("Config error: {0}")] - ConfigError(#[from] ConfigError), -} - -type AnalysisResult = Result; -``` - -**Benefits:** -- Better error messages -- Programmatic error handling -- Error recovery strategies - -**Challenges:** -- Migration effort -- Breaking changes -- Error type proliferation - -**Estimated Impact:** Better error handling and debugging - ---- - -### 11. 
AST Storage Optimization - -**Current State:** -- Full AST stored for all functions -- Tree-sitter nodes require source + node ID -- Memory usage scales with codebase size - -**Proposed Improvement:** -- Lazy AST parsing (parse on demand) -- Compact AST representation -- Shared AST nodes where possible - -**Implementation:** -```rust -enum LazyFunctionBody { - Parsed(FunctionBody), - Unparsed { source: String, language: Language }, -} - -impl LazyFunctionBody { - fn parse(&mut self) -> Result<&FunctionBody> { - match self { - Self::Parsed(body) => Ok(body), - Self::Unparsed { source, language } => { - let parsed = parse_function(source, language)?; - *self = Self::Parsed(parsed); - Ok(match self { - Self::Parsed(body) => body, - _ => unreachable!(), - }) - } - } - } -} -``` - -**Benefits:** -- Reduced memory usage -- Faster initial analysis -- Better for large repos - -**Challenges:** -- Complexity increase -- Parsing overhead on access -- Cache invalidation - -**Estimated Impact:** 30-50% memory reduction - ---- - -### 12. Configuration Validation & Schema - -**Current State:** -- Config validated at runtime -- No schema documentation -- Easy to make config mistakes - -**Proposed Improvement:** -- JSON Schema for config -- Config validation with clear errors -- IDE autocomplete support - -**Implementation:** -```json -{ - "$schema": "https://hotspots.dev/schemas/config-v1.json", - "include": ["src/**/*.ts"], - "thresholds": { - "moderate": 3.0 - } -} -``` - -**Benefits:** -- Better developer experience -- Catch errors early -- Self-documenting config - -**Challenges:** -- Schema maintenance -- Version compatibility -- Tooling support - -**Estimated Impact:** Better UX, fewer config errors - ---- - -## Implementation Roadmap - -### Phase 1: Performance (3-6 months) -1. **Parallel file analysis** (2-3 weeks) - - Add rayon dependency - - Parallelize file processing - - Maintain deterministic ordering - -2. 
**Batched git operations** (1-2 weeks) - - Batch touch metrics queries - - Cache git results - - Measure performance gains - -3. **Incremental analysis** (4-6 weeks) - - Implement cache system - - Change detection - - Cache invalidation - -### Phase 2: Architecture (6-9 months) -4. **Policy trait system** (2-3 weeks) - - Define Policy trait - - Migrate existing policies - - Add tests - -5. **Dependency injection** (4-6 weeks) - - Define I/O traits - - Refactor core to use traits - - Add test doubles - -6. **Streaming output** (2-3 weeks) - - Implement streaming JSON - - Streaming HTML (if feasible) - - Backward compatibility - -### Phase 3: Extensibility (9-12 months) -7. **Metric plugin system** (3-4 weeks) - - Define MetricExtractor trait - - Plugin registry - - Documentation - -8. **Language plugin system** (6-8 weeks) - - Define LanguagePlugin trait - - Dynamic registration - - Example plugins - -9. **Structured errors** (2-3 weeks) - - Define error types - - Migrate error handling - - Update documentation - ---- - -## Trade-offs & Considerations - -### Performance vs. Simplicity -- Parallelization adds complexity but significant speedup -- Caching adds complexity but enables incremental analysis -- **Recommendation:** Start with parallelization, add caching later - -### Extensibility vs. Performance -- Plugin systems add indirection overhead -- Trait objects have vtable cost -- **Recommendation:** Use generics where possible, traits where necessary - -### Memory vs. 
Speed -- Streaming reduces memory but adds complexity -- AST caching speeds up but uses more memory -- **Recommendation:** Make configurable, default to balanced - -### Backward Compatibility -- Most improvements can be additive -- Some require breaking changes (error types, config schema) -- **Recommendation:** Version APIs, provide migration guides - ---- - -## Success Metrics - -### Performance -- **Target:** 5-10x faster for large repos (>10k functions) -- **Measure:** Analysis time, memory usage, CPU utilization - -### Maintainability -- **Target:** Reduce code duplication by 50% -- **Measure:** Lines of code, cyclomatic complexity, test coverage - -### Extensibility -- **Target:** Add new metric/language in <100 lines -- **Measure:** Plugin API complexity, documentation quality - -### Scalability -- **Target:** Handle repos with 100k+ functions -- **Measure:** Analysis time, memory usage, success rate - ---- - -## References - -- [Current Architecture](./ARCHITECTURE.md) -- [Performance Bottlenecks](../reference/limitations.md#performance) -- [Codebase Analysis](../../ANALYSIS.md) -- [Improvement Tasks](../../CODEBASE_IMPROVEMENTS.md) - ---- - -**Document Status:** Proposal -**Next Review:** After Phase 1 completion -**Questions?** Open an issue or see `docs/` for more details. diff --git a/docs/architecture/IMPROVEMENTS_SUMMARY.md b/docs/architecture/IMPROVEMENTS_SUMMARY.md deleted file mode 100644 index 48aa17a..0000000 --- a/docs/architecture/IMPROVEMENTS_SUMMARY.md +++ /dev/null @@ -1,216 +0,0 @@ -# Hotspots Architecture Improvements Summary - -**Date:** 2026-02-15 -**Status:** Proposal -**Full Document:** [IMPROVEMENTS.md](./IMPROVEMENTS.md) - ---- - -## Executive Summary - -This report outlines 12 architectural improvements to Hotspots, organized by priority and impact. The highest-impact improvements focus on performance and scalability, with potential for 10-100x speedup for typical workflows. 
Medium-priority improvements enhance maintainability and extensibility, while lower-priority items improve quality of life and developer experience. - ---- - -## High Priority: Performance & Scalability - -### 1. Parallel File Analysis -- **Impact:** 4-8x speedup on multi-core systems -- **Effort:** 2-3 weeks -- **Approach:** Use `rayon` to parallelize independent file processing -- **Challenge:** Maintain deterministic ordering for output - -### 2. Incremental Analysis & Caching -- **Impact:** 10-100x faster for incremental changes -- **Effort:** 4-6 weeks -- **Approach:** Cache parsed ASTs and CFGs, only re-analyze changed code -- **Challenge:** Cache invalidation strategy and size management - -### 3. Batched Git Operations -- **Impact:** 10-50x faster git operations for many files -- **Effort:** 1-2 weeks -- **Approach:** Batch multiple file queries into single git commands -- **Challenge:** Git command line length limits - -### 4. Optimize Call Graph Algorithms -- **Impact:** 5-10x faster for large call graphs -- **Effort:** 3-4 weeks -- **Approach:** Incremental PageRank, approximate Betweenness, sparse matrices -- **Challenge:** Algorithm correctness and determinism - -**Combined Impact:** 10-100x faster analysis for typical incremental workflows - ---- - -## Medium Priority: Architecture & Maintainability - -### 5. Plugin System for Metrics -- **Impact:** High extensibility, low performance impact -- **Effort:** 3-4 weeks -- **Approach:** Trait-based metric system with plugin registry -- **Benefit:** Extensibility without core changes - -### 6. Policy Trait System -- **Impact:** Better maintainability, eliminates duplication -- **Effort:** 2-3 weeks -- **Approach:** Single evaluation loop with trait-based policies -- **Benefit:** Easier to add new policies, testable logic - -### 7. 
Dependency Injection for Testability -- **Impact:** Better testability, minimal runtime impact -- **Effort:** 4-6 weeks -- **Approach:** Trait-based abstractions for I/O and git operations -- **Benefit:** Unit tests without real filesystem/git - -### 8. Streaming Output for Large Repos -- **Impact:** Constant memory usage, handles very large repos -- **Effort:** 2-3 weeks -- **Approach:** Stream results as computed, incremental JSON/HTML -- **Benefit:** Enables analysis of repos with 100k+ functions - ---- - -## Lower Priority: Quality of Life - -### 9. Language Plugin System -- **Impact:** Enables community language support -- **Effort:** 6-8 weeks -- **Approach:** Dynamic language registration with plugin API -- **Benefit:** No core changes for new languages - -### 10. Structured Error Types -- **Impact:** Better error handling and debugging -- **Effort:** 2-3 weeks -- **Approach:** Domain-specific error types instead of `anyhow` -- **Benefit:** Programmatic error handling and recovery - -### 11. AST Storage Optimization -- **Impact:** 30-50% memory reduction -- **Effort:** 2-3 weeks -- **Approach:** Lazy AST parsing, compact representations -- **Benefit:** Better for large repos - -### 12. Configuration Validation & Schema -- **Impact:** Better developer experience -- **Effort:** 1-2 weeks -- **Approach:** JSON Schema for config, validation with clear errors -- **Benefit:** Catch errors early, IDE autocomplete - ---- - -## Implementation Roadmap - -### Phase 1: Performance (3-6 months) -**Focus:** Speed and scalability -1. Parallel file analysis (2-3 weeks) -2. Batched git operations (1-2 weeks) -3. Incremental analysis (4-6 weeks) - -**Expected Outcome:** 10-100x faster for incremental workflows - -### Phase 2: Architecture (6-9 months) -**Focus:** Maintainability and testability -4. Policy trait system (2-3 weeks) -5. Dependency injection (4-6 weeks) -6. 
Streaming output (2-3 weeks) - -**Expected Outcome:** Reduced duplication, better testability - -### Phase 3: Extensibility (9-12 months) -**Focus:** Plugin systems and extensibility -7. Metric plugin system (3-4 weeks) -8. Language plugin system (6-8 weeks) -9. Structured errors (2-3 weeks) - -**Expected Outcome:** Community extensibility, better error handling - ---- - -## Key Trade-offs - -### Performance vs. Simplicity -- **Decision:** Start with parallelization, add caching later -- **Rationale:** Parallelization provides immediate benefit with manageable complexity - -### Extensibility vs. Performance -- **Decision:** Use generics where possible, traits where necessary -- **Rationale:** Balance between zero-cost abstractions and runtime flexibility - -### Memory vs. Speed -- **Decision:** Make configurable, default to balanced -- **Rationale:** Different repos have different constraints - -### Backward Compatibility -- **Decision:** Version APIs, provide migration guides -- **Rationale:** Most improvements can be additive, some require breaking changes - ---- - -## Success Metrics - -### Performance Targets -- **5-10x faster** for large repos (>10k functions) -- **10-100x faster** for incremental analysis -- **Constant memory** usage for streaming output - -### Maintainability Targets -- **50% reduction** in code duplication -- **<100 lines** to add new metric/language -- **100% test coverage** for core components - -### Scalability Targets -- Handle repos with **100k+ functions** -- Support **10+ languages** via plugins -- **Sub-second** analysis for incremental changes - ---- - -## Risk Assessment - -### Low Risk -- Parallel file analysis (proven pattern, `rayon` is mature) -- Batched git operations (straightforward optimization) -- Configuration validation (additive feature) - -### Medium Risk -- Incremental analysis (cache invalidation complexity) -- Policy trait system (migration effort) -- Dependency injection (large refactoring) - -### High Risk -- 
Call graph optimizations (algorithm correctness) -- Language plugin system (API design complexity) -- Streaming output (HTML streaming complexity) - ---- - -## Recommendations - -### Immediate Actions (Next 3 Months) -1. **Implement parallel file analysis** — Highest ROI, low risk -2. **Batch git operations** — Quick win, significant speedup -3. **Add configuration validation** — Improves DX with minimal effort - -### Short-term (3-6 Months) -4. **Incremental analysis** — Enables faster CI feedback -5. **Policy trait system** — Reduces maintenance burden -6. **Dependency injection** — Improves testability - -### Long-term (6-12 Months) -7. **Plugin systems** — Enables community contributions -8. **Streaming output** — Handles very large repos -9. **Error type improvements** — Better error handling - ---- - -## Conclusion - -The proposed improvements would transform Hotspots from a fast single-threaded tool into a highly scalable, extensible platform. The Phase 1 performance improvements alone could provide 10-100x speedup for typical workflows, making Hotspots viable for very large codebases and faster CI integration. - -The architectural improvements in Phase 2 would reduce maintenance burden and improve testability, while Phase 3's plugin systems would enable community contributions and long-term extensibility. - -**Priority:** Focus on Phase 1 performance improvements first, as they provide the highest immediate value with manageable risk. - ---- - -**Full Details:** See [IMPROVEMENTS.md](./IMPROVEMENTS.md) for complete specifications, code examples, and implementation details. 
diff --git a/docs/architecture/approximate-betweenness.md b/docs/architecture/approximate-betweenness.md deleted file mode 100644 index 0c87be9..0000000 --- a/docs/architecture/approximate-betweenness.md +++ /dev/null @@ -1,311 +0,0 @@ -# Approximate Betweenness Centrality for Large Codebases - -**Status:** Proposed -**Addresses:** Scalability of `hotspots analyze --mode snapshot` on large repositories - ---- - -## Problem - -Betweenness centrality is currently computed exactly using Brandes' algorithm, which -runs O(N × (N + E)) — quadratic in node count for sparse graphs. On the hotspots -codebase (562 functions, 193 edges) this takes under 1 ms and is invisible. On large -codebases it becomes the dominant cost by a wide margin: - -| Codebase scale | N | E | Estimated time | -|---|---|---|---| -| hotspots itself | 562 | 193 | < 1 ms | -| Mid-size service (est.) | 5,000 | 10,000 | ~2 s | -| Large monorepo (est.) | 50,000 | 75,000 | ~34 min | -| Kubernetes (est.) | 100,000 | 150,000 | ~134 min | - -These estimates are calibrated from measured benchmarks on this machine; see the -companion benchmark run at the bottom of this document. - -The key aggravating fact: **betweenness does not feed the risk score**. It is not an -input to `compute_activity_risk`, is not used by any pattern classifier, and is not -referenced in policy enforcement. It is stored on `CallGraphMetrics.betweenness` and -surfaced in JSON and HTML output as an informational signal only. We are paying an -O(N²) cost for a display-only field. - ---- - -## Why Not Skip It - -Betweenness is the only metric in the tool that measures _path criticality_ — how -often a function sits on the shortest call route between two other functions. Fan-in -measures how many callers a function has; betweenness measures how many indirect -dependencies route through it. A function with modest fan-in but high betweenness is -a structural bottleneck: removing or breaking it would disconnect large parts of the -call graph. 
- -That signal is genuinely useful for architectural work (identifying refactoring risks, -finding hidden coupling). Dropping it entirely, or zeroing it out above some threshold, -removes a qualitatively distinct piece of information that no other metric provides. -The goal should be to preserve the signal at a fraction of the cost, not eliminate it. - ---- - -## Proposed Approach: Pivoted Source Sampling - -### Theory - -Brandes' algorithm accumulates betweenness by summing contributions from every node -used as a BFS source. The contribution from a single source is independent of all -others. This means we can sample k sources, accumulate their contributions, and scale -the result by N/k to obtain an unbiased estimator of the exact values. - -Formally, let `B(v)` denote exact normalized betweenness for node `v`, and -`B̂(v)` denote the approximation using k sampled sources `S ⊆ V`, `|S| = k`: - -``` -B̂(v) = (N / k) × Σ_{s ∈ S} δ_s(v) / ((N-1)(N-2)) -``` - -where `δ_s(v)` is the dependency score accumulated by Brandes' BFS from source `s`. - -Properties of this estimator: -- **Unbiased**: E[B̂(v)] = B(v) for any sampling strategy that covers S uniformly -- **Error bound**: relative error decreases as O(1/√k) with high probability - (Bader, Meyerhenke, Sanders, Wagner 2007) -- **Rank preservation**: high-betweenness nodes remain high; the top-K ranking is - stable long before exact values converge -- **Complexity**: O(k × (N + E)), linear in k for fixed graph shape - -For k = 256 and N = 100,000: estimated time drops from ~134 minutes to ~20 seconds. - -### Why Rank Preservation Is What Matters Here - -Since betweenness is not an input to risk scoring or pattern classification, users -interact with it as a relative signal: "this function has notably high betweenness -compared to others." They do not divide betweenness values or compare them across -different snapshots in arithmetic ways. 
Ranking stability is therefore the correct -accuracy criterion, not absolute value precision. - -Empirically, uniform k-source sampling achieves Kendall's τ > 0.95 for the top -quartile of nodes at k ≥ 64, and τ > 0.99 for k ≥ 256 on scale-free graphs (which -call graphs resemble). The top-10 highest-betweenness functions are correctly -identified at k = 32 in almost all practical cases. - -### Source Selection: Deterministic Systematic Sampling - -The codebase has a hard invariant: identical input yields byte-for-byte identical -output. A pseudo-random sampler would require a seed, and any externally visible seed -value (timestamp, thread ID) would break this. A fixed seed (e.g., 42) would work but -is arbitrary and fragile. - -The cleaner solution is **systematic sampling**, with no RNG at all: - -1. Sort the node list lexicographically (already done in `find_strongly_connected_components`) -2. Compute `step = N / k` (integer division) -3. Select nodes at positions `0, step, 2×step, ..., (k-1)×step` - -This is a pure function of the sorted node list. Same graph → same sample → same -output. It also distributes samples evenly across the alphabetical namespace, which -in practice distributes them across files and modules. - -**Edge cases:** -- If `N ≤ k`: use all nodes (exact algorithm, no approximation) -- If `N < 3`: betweenness is already 0 by convention (the normalization denominator `(N-1)(N-2)` is 0, and no node can lie between two others) - ---- - -## Threshold: When to Approximate - -Exact betweenness should be preferred when it is cheap enough that the approximation -error is unjustified. Based on the benchmark data: - -| N | Exact time | Approx (k=256) | Recommendation | -|---|---|---|---| -| ≤ 2,000 | < 2 s | ~1 ms | Use exact | -| 2,000–10,000 | 2 s–50 s | 1–5 ms | Use approx | -| > 10,000 | > 50 s | 5–50 ms | Must use approx | - -**Proposed default threshold: N = 2,000.** - -Below 2,000 nodes, exact Brandes completes in under 2 seconds, which is acceptable -within the enrichment pipeline. 
Above 2,000 nodes, approximation is strictly better -on every axis: faster, uses less memory (no per-source delta accumulation), and the -ranking accuracy is effectively identical. - -The threshold should be configurable via `.hotspotsrc.json` for teams that need to -tune it, but the default should be conservative enough that most users never need to -touch it. - ---- - -## Normalization Adjustment - -The exact algorithm normalises by dividing by `(N-1)(N-2)`. With k-source sampling, -the raw sum is approximately `(k/N)` of the exact raw sum. After scaling by `N/k` -the pre-normalisation total is restored, so the same normalisation denominator -`(N-1)(N-2)` applies unchanged. No special handling is needed. - -However, the JSON field `betweenness` should be accompanied by a snapshot-level -flag `betweenness_approximate: bool` so downstream tools can distinguish exact from -estimated values. This is a non-breaking addition to the snapshot schema. - ---- - -## Accuracy Validation Strategy - -Before shipping, accuracy should be validated on a medium-scale synthetic graph -(N ≈ 5,000, E ≈ 15,000) by: - -1. Running exact betweenness -2. Running approximate with k = 64, 128, 256, 512 -3. Computing Kendall's τ between exact and approximate top-100 rankings at each k -4. Computing max absolute error and mean relative error across all nodes - -Acceptance criteria: -- τ ≥ 0.95 for top-100 at k = 256 -- Max absolute error ≤ 0.05 (on the 0–1 normalised scale) at k = 256 -- No regression in the golden test suite (exact values are currently captured; they - will change to approximate values above the threshold and should be re-goldenised) - ---- - -## Implementation Plan - -### Step 1: Add `betweenness_approximate` to snapshot schema - -Add a bool field to the snapshot summary indicating whether betweenness was computed -exactly or approximately. This is the only schema change and should be done first so -downstream tooling can react to it. 
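As a rough illustration of Step 1, the flag could live on the snapshot summary roughly as sketched below. Only the field name `betweenness_approximate` comes from this proposal; the `SnapshotSummary` struct, its `function_count` member, the constructor, and the hand-written JSON rendering are assumptions made for the example and may not match the real schema code.

```rust
// Hypothetical sketch: only `betweenness_approximate` is named in the proposal.
// `SnapshotSummary`, `function_count`, `new`, and `to_json` are illustrative.

/// Snapshot-level summary (illustrative subset of fields).
#[derive(Debug, Clone, PartialEq)]
pub struct SnapshotSummary {
    /// Number of functions (call graph nodes) in the snapshot.
    pub function_count: usize,
    /// True when betweenness values were estimated by source sampling.
    pub betweenness_approximate: bool,
}

impl SnapshotSummary {
    /// Builds a summary and flags betweenness as approximate when the node
    /// count is strictly above the exact-computation threshold.
    pub fn new(function_count: usize, exact_threshold: usize) -> Self {
        SnapshotSummary {
            function_count,
            betweenness_approximate: function_count > exact_threshold,
        }
    }

    /// Renders the summary as JSON with a fixed key order, so the output
    /// is byte-for-byte stable for identical input.
    pub fn to_json(&self) -> String {
        format!(
            "{{\"function_count\":{},\"betweenness_approximate\":{}}}",
            self.function_count, self.betweenness_approximate
        )
    }
}
```

Because the field is additive, consumers that ignore unknown keys would keep working, while newer tools can check the flag before comparing betweenness values across snapshots.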
- -### Step 2: Implement `betweenness_centrality_approx(k: usize)` - -Add a new method on `CallGraph` alongside `betweenness_centrality()`: - -```rust -pub fn betweenness_centrality_approx(&self, k: usize) -> HashMap<String, f64> { - let n = self.nodes.len(); - // Fall back to exact when N is small enough - if n <= k { - return self.betweenness_centrality(); - } - - // Systematic sample: sorted nodes at stride n/k - let mut sorted_nodes: Vec<&String> = self.nodes.iter().collect(); - sorted_nodes.sort(); - let step = n / k; - let sources: Vec<&String> = (0..k).map(|i| sorted_nodes[i * step]).collect(); - - let mut betweenness: HashMap<String, f64> = - self.nodes.iter().map(|node| (node.clone(), 0.0)).collect(); - - let scale = n as f64 / k as f64; - for source in sources { - let (stack, predecessors, sigma) = brandes_bfs(source, &self.nodes, &self.edges); - let delta = brandes_accumulate(&stack, &predecessors, &sigma); - for w in &stack { - if w != source { - *betweenness.entry(w.clone()).or_insert(0.0) += - delta.get(w).copied().unwrap_or(0.0) * scale; - } - } - } - - if n > 2 { - let normalization = 1.0 / ((n - 1) * (n - 2)) as f64; - for value in betweenness.values_mut() { - *value *= normalization; - } - } - - betweenness -} -``` - -The method reuses the existing `brandes_bfs` and `brandes_accumulate` free functions -unchanged. No new BFS logic is introduced. - -### Step 3: Add threshold config - -Add `betweenness_exact_threshold: Option<usize>` to `HotspotsConfig` (default 2,000) -and `betweenness_approx_k: Option<usize>` (default 256). Validate that k ≥ 1 and -that k ≤ threshold (approximating when N ≤ k would be exact anyway). - -### Step 4: Thread threshold and k into `populate_callgraph` - -`populate_callgraph` currently calls `call_graph.betweenness_centrality()` directly. 
-Change it to accept the threshold and k values and dispatch: - -```rust -let n = call_graph.nodes.len(); -let betweenness_scores = if n > betweenness_exact_threshold { - call_graph.betweenness_centrality_approx(betweenness_approx_k) -} else { - call_graph.betweenness_centrality() -}; -``` - -### Step 5: Propagate `is_approximate` flag - -Set `snapshot.summary.betweenness_approximate = n > betweenness_exact_threshold` so -callers and output renderers can annotate accordingly. - -### Step 6: Update golden tests - -The golden test suite captures exact betweenness values. Test graphs are all small -(N < 2,000), so with the default threshold they will continue to use the exact -algorithm and golden values will not change. Explicit approximation tests should be -added using a graph just above the threshold. - ---- - -## What This Does Not Fix - -This proposal addresses the O(N²) cost of betweenness. Even after this change, the -remaining O(N+E) algorithms (PageRank, SCC, dependency depth, fan-in) are linear and -fast. The O(N log N) sorts for determinism are negligible. - -For Kubernetes-scale codebases the remaining bottleneck after this change would be -**git operations**: `git log` for touch metrics and co-change extraction. Those are -I/O-bound and are a separate concern outside the call graph module. - ---- - -## Rejected Alternatives - -**Skip betweenness above threshold (set to 0):** Simplest implementation, but removes -a qualitatively distinct architectural signal precisely for the large codebases where -users would benefit most from understanding structural bottlenecks. Rejected. - -**Approximate via forest sampling (FOSCA):** Generates random spanning forests to -estimate betweenness. Theoretically stronger guarantees but significantly more complex -to implement correctly, and the practical accuracy advantage over uniform sampling is -small for ranking purposes. The added implementation complexity is not justified. -Rejected for this iteration. 
- -**Parallelize exact Brandes with rayon:** Each source BFS is independent, so the -N outer iterations can be parallelized trivially. This would give a linear speedup -proportional to core count. However, the codebase design note in `lib.rs` explicitly -states that call graph logic is single-threaded, and a 4–8× speedup from parallelism -still leaves Kubernetes-scale codebases taking 20–30 minutes. Approximation is a -strictly better solution. The parallel-exact approach could be layered on top later -if needed. Rejected as primary fix. - -**KADABRA adaptive sampling (Borassi & Natale 2016):** Provides rigorous ε-δ -guarantees by adaptively determining how many sources to sample. Optimal in theory, -but requires a stopping criterion based on online variance estimation, adding -implementation complexity without meaningful practical benefit over fixed-k sampling -for our use case (ranking stability, not ε-approximate absolute values). Rejected. - ---- - -## Benchmark Reference - -Measured on this machine, release build, ring graph with E = 3N: - -| N | Exact betweenness | Approx k=256 (projected) | -|---|---|---| -| 500 | 293 ms | ~1.5 ms | -| 1,000 | 1,155 ms | ~3 ms | -| 2,000 | 5,154 ms | ~6 ms | -| 50,000 | ~34 min | ~150 ms | -| 100,000 | ~134 min | ~300 ms | - -Projected values for N ≥ 5,000 are extrapolated from the O(N²) fit confirmed at -N = 500/1000/2000. Approximation cost is O(k × (N+E)) ≈ O(256 × 4N) = O(1024 N), -linear in N. diff --git a/docs/architecture/design-decisions.md b/docs/architecture/design-decisions.md deleted file mode 100644 index 86d9369..0000000 --- a/docs/architecture/design-decisions.md +++ /dev/null @@ -1,423 +0,0 @@ -# Design Decisions - -This document captures key design decisions made during Hotspots MVP development. These decisions are **final and binding** for the MVP scope. - -## Architecture Decisions - -### Rust Workspace Structure - -**Decision:** Single workspace with two crates only. 
- -**Rationale:** -- Separation of concerns: core library vs CLI -- Library can be reused by other tools -- CLI provides user-facing interface -- Keeps structure simple for MVP - -**Implementation:** -- `hotspots-core` - Library crate with all analysis logic -- `hotspots-cli` - Binary crate with CLI interface - -### Rust Version - -**Decision:** Rust 2021 Edition, MSRV 1.75. - -**Rationale:** -- Stable, widely-supported version -- Avoids nightly-only features -- Ensures compatibility -- No 2024-only features required - -### Parser Selection - -**Decision:** Use `swc_ecma_parser` (SWC - Speedy Web Compiler). - -**Rationale:** -- Rust-native parser (no Node.js dependency) -- Fast and well-maintained -- Supports TypeScript syntax -- Actively developed (mature but evolving) -- Used by major projects (Next.js, etc.) - -**Trade-offs:** -- Requires pinning specific versions for compatibility -- SWC version evolution may require updates -- TypeScript-only (no JSX in MVP) - -### No Kernel Dependency - -**Decision:** Explicitly avoid any kernel or external service dependencies. - -**Rationale:** -- Standalone tool requirement -- Works offline -- No external API calls -- Fully deterministic analysis -- Portable and self-contained - -## Analysis Design Decisions - -### Per-Function Analysis - -**Decision:** Analyze each function in complete isolation. - -**Rationale:** -- Simpler model (no inter-function dependencies) -- Parallelizable (future enhancement) -- Clear boundaries -- Matches "local" in Local Risk Score -- Easier to reason about - -**Implications:** -- No cross-function call graph analysis -- No inter-function metrics -- Each function analyzed independently - -### Control Flow Graph Model - -**Decision:** Build explicit CFG for each function with entry/exit nodes. 
- -**Rationale:** -- Formal model for control flow -- Enables metric calculation (CC uses E-N+2) -- Validates program structure -- Handles complex control flow (try/catch/finally) -- Makes edge cases explicit - -**Structure:** -- One CFG per function -- Entry and exit nodes -- Explicit edges for all control flow -- No global CFG (per-function only) - -### Deterministic Ordering - -**Decision:** Sort functions and results deterministically by (file, span.start). - -**Rationale:** -- Reproducible output -- Stable test fixtures -- Predictable user experience -- Byte-for-byte identical output - -**Ordering Rules:** -- Functions: sorted by `span.lo` (byte offset) -- Reports: sorted by (LRS desc, file asc, line asc, name asc) -- Files: processed in discovery order (deterministic) - -### Anonymous Function Naming - -**Decision:** Format: `@:` - -**Rationale:** -- Stable across runs -- Human-readable -- Includes location context -- Deterministic (uses file and line) -- Example: `@src/api.ts:42` - -**Alternative Considered:** -- Synthetic numeric IDs (too cryptic) -- Hash-based names (not human-readable) -- Context-based names (not stable) - -## Metric Calculation Decisions - -### Cyclomatic Complexity Formula - -**Decision:** `CC = E - N + 2` with additional increments. - -**Rationale:** -- Standard McCabe formula -- Accounts for decision points -- Additional increments for: - - Short-circuit operators (implicit decisions) - - Switch cases (explicit decisions) - - Catch clauses (exception paths) - -**Increments:** -- `&&` and `||`: +1 each -- Switch case: +1 per case -- Catch clause: +1 per catch - -### Fan-Out Chained Calls - -**Decision:** Count each segment of chained calls independently. 
- -**Example:** `foo().bar().baz()` counts as: -- `foo` -- `foo().bar` -- `foo().bar().baz` - -**Rationale:** -- Each call expression is a distinct dependency -- Chained calls represent multiple coupling points -- More accurate representation of complexity -- Matches actual function call sites - -**Alternative Considered:** -- Count only terminal call (under-counts coupling) -- Count only root identifier (misses intermediate calls) - -### Non-Structured Exits - -**Decision:** Count all early exits except final tail return. - -**Rationale:** -- Early exits increase complexity -- Tail return is expected control flow -- Includes: `return`, `break`, `continue`, `throw` -- Excludes: final `return` statement - -**Implication:** -- Functions with multiple exit points have higher NS -- Tail recursion patterns don't inflate NS -- Exception handling increases NS appropriately - -### Nesting Depth Calculation - -**Decision:** Count only control constructs (if, loops, switch, try). - -**Rationale:** -- Focuses on control flow complexity -- Ignores lexical scoping (less relevant) -- Maximum depth tracks worst-case path -- Excludes plain blocks - -**Included:** -- `if`, `else if` -- `for`, `while`, `do-while`, `for-in`, `for-of` -- `switch` -- `try`, `catch`, `finally` - -**Excluded:** -- Lexical scopes (`{ }` blocks) -- Function bodies (separate analysis) -- Object literals -- Array literals - -## Risk Scoring Decisions - -### Risk Transform Functions - -**Decision:** Use logarithmic transforms for CC and FO, linear for ND and NS. 
- -**Rationale:** -- Logarithmic for metrics that can grow unbounded (CC, FO) -- Linear for metrics with natural bounds (ND, NS) -- Bounded to prevent extreme scores -- Monotonic (higher metric → higher risk) - -**Formulas:** -- `R_cc = min(log2(CC + 1), 6)` - Logarithmic, capped at 6 -- `R_nd = min(ND, 8)` - Linear, capped at 8 -- `R_fo = min(log2(FO + 1), 6)` - Logarithmic, capped at 6 -- `R_ns = min(NS, 6)` - Linear, capped at 6 - -### LRS Weights - -**Decision:** Weighted sum with CC having highest weight. - -**Rationale:** -- CC is most established metric -- ND important but secondary -- FO and NS have lower but meaningful weights -- Weights chosen to balance contributions - -**Weights:** -- `R_cc`: 1.0 (highest) -- `R_nd`: 0.8 -- `R_ns`: 0.7 -- `R_fo`: 0.6 (lowest) - -### Risk Bands - -**Decision:** Four bands with specific thresholds. - -**Rationale:** -- Clear categorization -- Actionable thresholds -- Balanced distribution -- Intuitive ranges - -**Bands:** -- **Low:** LRS < 3 -- **Moderate:** 3 ≤ LRS < 6 -- **High:** 6 ≤ LRS < 9 -- **Critical:** LRS ≥ 9 - -## Output Format Decisions - -### JSON Schema - -**Decision:** Include both raw metrics and risk components. - -**Rationale:** -- Transparency (users can see inputs) -- Debugging (verify calculations) -- Flexibility (users can recompute with different weights) -- Complete information - -**Schema:** -```json -{ - "file": "...", - "function": "...", - "line": 42, - "metrics": { "cc": 5, "nd": 2, "fo": 3, "ns": 1 }, - "risk": { "r_cc": 2.58, "r_nd": 2, "r_fo": 2, "r_ns": 1 }, - "lrs": 5.96, - "band": "moderate" -} -``` - -### Text Output Format - -**Decision:** Simple aligned columns, no borders. - -**Rationale:** -- Human-readable -- Easy to scan -- Works in terminals -- Minimal formatting overhead - -**Example:** -``` -LRS File Line Function -11.2 src/api.ts 88 handleRequest -9.8 src/db/migrate.ts 41 runMigration -``` - -### Precision - -**Decision:** Full `f64` precision in JSON, 2 decimals in text. 
- -**Rationale:** -- JSON: Machine-readable, preserve precision -- Text: Human-readable, round for display -- Internal calculations: Full precision -- No rounding of intermediate values - -## Testing Decisions - -### Golden Files - -**Decision:** Snapshot expected JSON outputs in `tests/golden/`. - -**Rationale:** -- Regression testing -- Verify output stability -- Easy to update when needed -- Clear expected vs actual comparison - -**Location:** -- Fixtures: `tests/fixtures/*.ts` -- Golden: `tests/golden/*.json` - -### Determinism Tests - -**Decision:** Explicitly test byte-for-byte identical output. - -**Rationale:** -- Core requirement (invariant #6) -- Catches non-deterministic bugs -- Ensures reproducible results -- Verifies stable sorting - -**Implementation:** -- Run analysis twice -- Compare JSON output byte-for-byte -- Fail if any difference - -## Error Handling Decisions - -### Parse Errors - -**Decision:** Fail fast per file, continue with other files. - -**Rationale:** -- Clear error attribution -- Don't fail entire run for one bad file -- Aggregate errors at end -- Valid results still reported - -**Behavior:** -- Parse error → skip file, report error -- Continue with remaining files -- Exit non-zero if any errors - -### Unsupported Features - -**Decision:** Emit error for unsupported function, skip it, continue. - -**Rationale:** -- Graceful degradation -- Don't fail entire analysis -- Clear error messages -- Continue with supported functions - -**Examples:** -- Generator functions (`function*`) -- JSX syntax (file-level error) - -## Scope Limitations (Intentional) - -### No JSX Support - -**Decision:** Plain TypeScript only, no JSX/TSX. - -**Rationale:** -- MVP scope limitation -- JSX adds complexity -- Can be added later -- Clear error message when encountered - -### No Type-Aware Analysis - -**Decision:** Parse types but don't use for analysis. 
- -**Rationale:** -- Keeps MVP focused -- Types add significant complexity -- Structural analysis sufficient for MVP -- Can be enhanced later - -### No Cross-Function Analysis - -**Decision:** Per-function analysis only. - -**Rationale:** -- Simpler model -- Matches "local" scope -- Can add later if needed -- Sufficient for function-level risk - -### Break/Continue Placeholder - -**Decision:** Route break/continue to exit (placeholder). - -**Rationale:** -- Loop context tracking complex -- MVP placeholder works -- Documented as limitation -- Can be refined later - -**Note:** Labeled break/continue support planned but loop context tracking needs refinement. - -## Trade-offs Summary - -### Chosen Approaches - -1. **Explicit CFG over implicit flow** - More complex but more precise -2. **Deterministic over fast** - Reproducibility over performance -3. **Complete metrics over simplified** - More information for users -4. **Per-function over cross-function** - Simpler, clearer boundaries -5. **Static analysis over dynamic** - No execution required - -### Future Considerations - -- Incremental analysis (cache CFGs) -- Type-aware metrics (use type information) -- Cross-function analysis (call graph) -- Configuration files (custom thresholds) -- Performance optimization (parallel analysis) diff --git a/docs/architecture/index.md b/docs/architecture/index.md deleted file mode 100644 index 691df81..0000000 --- a/docs/architecture/index.md +++ /dev/null @@ -1,26 +0,0 @@ -# Architecture Notes - -This section collects design records, invariants, implementation notes, and historical architecture reviews. - -For the current contributor-facing map of the codebase, start with the [Codebase Guide](/code-architecture/). The pages here provide deeper background and context. 
- -## Current design references - -- [Design Decisions](./design-decisions.md) -- [Invariants](./invariants.md) -- [Multi-language Design](./multi-language.md) -- [Testing Strategy](./testing.md) -- [Approximate Betweenness](./approximate-betweenness.md) - -## Historical notes and reviews - -These pages are useful context, but may describe earlier implementation states or completed improvement plans: - -- [Architecture Overview](./overview.md) -- [Architecture Review](./ARCHITECTURE_REVIEW.md) -- [Audit Report](./AUDIT_REPORT.md) -- [Improvements](./IMPROVEMENTS.md) -- [Improvements Summary](./IMPROVEMENTS_SUMMARY.md) -- [SQLite Pipeline Refactor](./sqlite-pipeline-refactor.md) - -When current behavior conflicts with historical notes, prefer the [Codebase Guide](/code-architecture/) and source code. diff --git a/docs/architecture/invariants.md b/docs/architecture/invariants.md deleted file mode 100644 index 6c21398..0000000 --- a/docs/architecture/invariants.md +++ /dev/null @@ -1,78 +0,0 @@ -# Global Invariants - -These invariants are **non-negotiable** and apply to **all phases** of implementation. - -Any violation is considered a bug. - ---- - -## 1. Analysis is strictly per-function - -* Each function is analyzed independently -* No cross-function state is maintained -* No global analysis state exists -* Results are computed per-function and aggregated only at reporting time - -**Enforcement:** Code must not maintain global mutable state for analysis. - ---- - -## 2. No global mutable state - -* All state must be local to functions or explicit data structures -* No static mutable variables -* No shared mutable state between functions during analysis - -**Enforcement:** Use of `static mut`, global variables, or shared mutable references is prohibited. - ---- - -## 3. 
No randomness, clocks, threads, or async - -* No use of `rand`, `std::time`, `std::thread`, or async runtimes -* All operations must be deterministic -* No time-based or random behavior in analysis - -**Enforcement:** Dependencies on randomness, timing, threading, or async are disallowed. - ---- - -## 4. Deterministic traversal order must be explicit - -* File traversal order must be deterministic (sorted by path) -* Function traversal within files must be deterministic (sorted by span start) -* All iteration over data structures must use explicit ordering - -**Enforcement:** All collections must be sorted before iteration when order affects output. - ---- - -## 5. Formatting, comments, and whitespace must not affect results - -* AST parsing must ignore comments -* Whitespace changes must not affect analysis results -* Code formatting must not change metrics or LRS - -**Enforcement:** Only structural AST nodes are used for analysis, never lexical details. - ---- - -## 6. Identical input yields byte-for-byte identical output - -* Running the same input twice must produce exactly the same output -* No timestamps, IDs, or non-deterministic elements in output -* Output must be deterministic and reproducible - -**Enforcement:** All output serialization must be deterministic, including JSON key ordering and floating-point formatting. - ---- - -## Violations - -If any invariant is violated: - -1. The implementation is incorrect -2. Tests must catch the violation -3. The violation must be fixed immediately - -These invariants ensure that hotspots produces trusted, reproducible results suitable for static analysis workflows. 
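Invariants 4 and 6 above can be illustrated with a small sketch: a `HashMap` has no guaranteed iteration order, so entries are sorted explicitly before anything is written out. The function name `render_sorted`, the `(file, line)` key shape, and the text layout are assumptions made for this example, not the tool's actual reporting code.

```rust
use std::collections::HashMap;

/// Renders per-function scores as text, one line per function.
/// Hypothetical helper: entries are sorted by (file, line) before rendering,
/// so the map's unspecified iteration order never reaches the output.
pub fn render_sorted(scores: &HashMap<(String, u32), f64>) -> String {
    // Collect references, then impose an explicit, total ordering on the keys.
    let mut entries: Vec<(&(String, u32), &f64)> = scores.iter().collect();
    entries.sort_by(|a, b| a.0.cmp(b.0));

    let mut out = String::new();
    for ((file, line), lrs) in entries {
        // Fixed two-decimal formatting keeps float rendering stable.
        out.push_str(&format!("{}:{} {:.2}\n", file, line, lrs));
    }
    out
}
```

A determinism test in this style would build the same logical input twice (for instance with different insertion orders), render both, and require the two strings to be identical.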
diff --git a/docs/architecture/multi-language.md b/docs/architecture/multi-language.md deleted file mode 100644 index 8657477..0000000 --- a/docs/architecture/multi-language.md +++ /dev/null @@ -1,262 +0,0 @@ -# Multi-Language Support Architecture - -**Status:** Planning / Research -**Last Updated:** 2026-02-04 - ---- - -## Overview - -This document outlines the architectural considerations for adding multi-language support to Hotspots beyond TypeScript/JavaScript. - -## Current State - -Hotspots is currently tightly coupled to TypeScript/JavaScript via SWC (Speedy Web Compiler): - -```rust -// Heavy dependency on swc_ecma_ast types -use swc_ecma_ast::*; // BlockStmt, Stmt, Expr, etc. - -pub struct FunctionNode { - pub body: BlockStmt, // SWC-specific type - // ... -} -``` - -**Key language-specific modules:** -- `parser.rs` - SWC parsing logic -- `ast.rs` - SWC AST wrapper -- `cfg/builder.rs` - Heavily coupled to SWC AST types -- `discover.rs` - Function discovery from SWC AST - -**Language-agnostic modules:** -- `metrics.rs` - Works on CFG (language-agnostic) -- `cfg.rs` - CFG representation - -Total language-specific code: ~1,200 lines - ---- - -## Architectural Approaches - -### Option 1: Per-Language Implementations (Isolated) - -Create separate parsers/CFG builders for each language: - -``` -hotspots-core/src/ -├── languages/ -│ ├── typescript/ (existing, refactored) -│ │ ├── parser.rs -│ │ ├── ast.rs -│ │ └── cfg_builder.rs -│ ├── python/ -│ ├── rust/ -│ └── go/ -└── common/ - ├── metrics.rs (shared) - └── cfg.rs (shared CFG representation) -``` - -**Pros:** -- Clean separation -- No language cross-contamination -- Easy to test in isolation -- Can evolve languages independently - -**Cons:** -- Code duplication (CFG logic) -- Larger codebase -- More maintenance burden - -### Option 2: Unified AST Abstraction (Trait-Based) - -Create a language-agnostic AST trait: - -```rust -trait LanguageParser { - fn parse(&self, source: &str) -> Result; - fn 
discover_functions(&self, module: &Module) -> Vec; -} - -trait AstNode { - fn kind(&self) -> NodeKind; - fn children(&self) -> Vec<&dyn AstNode>; -} - -// Implement for each language -impl LanguageParser for TypeScriptParser { ... } -impl LanguageParser for PythonParser { ... } -``` - -**Pros:** -- Shared CFG builder -- Less code duplication -- Unified architecture - -**Cons:** -- Complex trait design -- Language quirks force compromises -- Harder to optimize per-language -- Upfront design cost - ---- - -## Language-Specific Challenges - -### Python (Complexity: HIGH) - -**Unique constructs:** -- List/dict/set comprehensions (implicit loops) -- Context managers (with statement) -- Generators (yield) -- Decorators -- Multiple exception types -- Else clause on loops/try - -**CFG Impact:** -- Comprehensions need implicit loop modeling -- Context managers = implicit try/finally -- Generators = multiple exit points -- Loop else = additional control flow edge - -### Rust (Complexity: VERY HIGH) - -**Unique constructs:** -- Pattern matching (exhaustive, complex branching) -- if let / while let -- Loop labels and break with values -- ? operator (implicit early return) -- Async/await (state machine transformation) -- Closures capturing environment - -**CFG Impact:** -- Match arms = multi-way branching (high CC) -- ? operator = implicit return path -- Async/await = state machine (very complex CFG) -- Pattern matching in conditions - -### Go (Complexity: MEDIUM) - -**Unique constructs:** -- Defer (deferred function calls) -- Goroutines (concurrent execution) -- Select (channel operations) -- Multiple return values -- Error handling idiom (if err != nil) - -**CFG Impact:** -- Defer = implicit finally-like block -- Select = multi-way branching -- Error checking pattern inflates CC/NS -- Multiple returns = multiple exit points - ---- - -## Cross-Language Concerns - -### 1. Metric Consistency - -Challenge: Should LRS be comparable across languages? 
- -Example: -```python -# Python -result = [x for x in range(10) if x % 2 == 0] - -# JavaScript equivalent -result = [...Array(10).keys()].filter(x => x % 2 === 0) -``` - -**Decision needed:** -- Option A: Language-normalized (comprehension = loop) -- Option B: Language-specific (comprehension counts differently) - -### 2. Determinism - -Ensure byte-for-byte identical output across languages: -- Python: Dictionary ordering -- Rust: Macro expansion -- Go: Goroutine scheduling (ignore runtime behavior) - -**Mitigation:** Strict ordering in function discovery, ignore runtime behavior - -### 3. Testing Strategy - -Each language needs: -- 50+ unit tests for parser and CFG builder -- 20+ integration tests for metrics -- 10+ cross-language comparison tests -- Comprehensive language feature coverage - ---- - -## Recommended Approach - -### Phase 1: Architecture Refactoring (2 weeks) - -Before adding languages, refactor current code: - -1. Extract language-agnostic CFG -2. Create language abstraction layer -3. Refactor existing TypeScript/JavaScript to use traits -4. 
Ensure no regression (run full test suite) - -### Phase 2: Add First Additional Language (4 weeks) - -Start with Go (medium complexity) to validate architecture: -- Good validation of approach -- Simpler than Python/Rust -- Useful for analyzing Go projects - -### Phase 3+: Add Other Languages Based on Demand - -Priority based on user requests: -- Python (6 weeks) -- Rust subset (8 weeks, no async initially) -- Other languages as needed - ---- - -## Effort Estimates - -| Language | Complexity | Development | Testing | Total | Risk | -|----------|-----------|-------------|---------|-------|------| -| **Refactoring** | - | 2 weeks | 1 week | **3 weeks** | Medium | -| **Go** | Medium | 3 weeks | 1 week | **4 weeks** | Medium | -| **Python** | High | 4 weeks | 2 weeks | **6 weeks** | High | -| **Rust (subset)** | Very High | 6 weeks | 2 weeks | **8 weeks** | High | - -Total for Go + Python + Rust: ~5 months - ---- - -## Decision Framework - -### Should we add multi-language support? - -**YES, if:** -- Need to analyze polyglot repos holistically -- Want Hotspots to be a universal complexity tool -- Have 3-6 months for focused development - -**NO (or DEFER), if:** -- TypeScript/JavaScript coverage is sufficient -- Want to focus on GitHub Action adoption first -- Need to validate market fit before expanding - -### Recommended Next Step - -**Validate demand first:** -1. Release with TypeScript/JavaScript -2. Gather user feedback -3. Survey: "What languages do you need?" -4. 
Add top-requested language - ---- - -## References - -- [Language Support Documentation](../reference/language-support.md) -- [Design Decisions](design-decisions.md) -- Original analysis: `docs/.internal/archive/multi-language-analysis-2026-02-04.md` diff --git a/docs/architecture/overview.md b/docs/architecture/overview.md deleted file mode 100644 index fa98cef..0000000 --- a/docs/architecture/overview.md +++ /dev/null @@ -1,202 +0,0 @@ -# Hotspots Architecture - -## Overview - -Hotspots is a static analysis tool that computes **Local Risk Scores (LRS)** for TypeScript functions. It analyzes individual functions in isolation, extracting four key metrics and transforming them into a unified risk assessment. - -## System Architecture - -### Workspace Structure - -Hotspots is implemented as a Rust workspace with two crates: - -1. **`hotspots-core`** - Library crate containing all analysis logic - - TypeScript parsing and AST traversal - - Function discovery - - Control Flow Graph (CFG) construction - - Metric extraction - - Risk score calculation - - Report generation - -2. **`hotspots-cli`** - Binary crate providing command-line interface - - Argument parsing - - File collection and traversal - - Output formatting (text/JSON) - - Error handling and reporting - -### Technology Stack - -- **Language:** Rust 2021 Edition (MSRV 1.75) -- **Parser:** `swc_ecma_parser` v33.0.0 (SWC - Speedy Web Compiler) -- **AST Libraries:** `swc_ecma_ast` v20.0.0, `swc_ecma_visit` v20.0.0 -- **Serialization:** `serde` + `serde_json` for JSON output -- **CLI:** `clap` v4.5 for argument parsing -- **Error Handling:** `anyhow` for error propagation - -### Design Principles - -1. **Per-Function Analysis:** Each function is analyzed in complete isolation -2. **Determinism:** Identical input produces byte-for-byte identical output -3. **No Global State:** Stateless analysis with no shared mutable state -4. 
**Explicit Control Flow:** All control flow constructs are explicitly modeled in the CFG -5. **Stable Ordering:** Results are sorted deterministically for reproducible output - -## Analysis Pipeline - -The analysis follows a strict pipeline: - -``` -TypeScript Source - ↓ -[Parser] → Module AST - ↓ -[Function Discovery] → FunctionNode[] - ↓ (for each function) -[CFG Builder] → Control Flow Graph - ↓ -[Metric Extraction] → RawMetrics (CC, ND, FO, NS) - ↓ -[Risk Calculation] → RiskComponents + LRS + RiskBand - ↓ -[Report Generation] → FunctionRiskReport[] - ↓ -[Sorting & Filtering] → Final Reports - ↓ -[Output Rendering] → Text or JSON -``` - -### Phase Breakdown - -#### Phase 1: Parsing and Discovery -- Parse TypeScript source into SWC AST -- Discover all functions (declarations, expressions, arrows, methods) -- Extract function metadata (name, line number, span) -- Assign deterministic function IDs - -#### Phase 2: CFG Construction -- Build control flow graph for each function -- Model all control structures (if/else, loops, switch, try/catch/finally) -- Handle early exits (return, throw, break, continue) -- Validate CFG structure (entry/exit, reachability) - -#### Phase 3: Metric Extraction -- **Cyclomatic Complexity (CC):** `E - N + 2` + short-circuit operators + switch cases + catch clauses -- **Nesting Depth (ND):** Maximum depth of control constructs -- **Fan-Out (FO):** Distinct function call sites (including chained calls) -- **Non-Structured Exits (NS):** Early returns, breaks, continues, throws - -#### Phase 4: Risk Scoring -- Transform each metric into risk components: - - `R_cc = min(log2(CC + 1), 6)` - - `R_nd = min(ND, 8)` - - `R_fo = min(log2(FO + 1), 6)` - - `R_ns = min(NS, 6)` -- Aggregate: `LRS = 1.0*R_cc + 0.8*R_nd + 0.6*R_fo + 0.7*R_ns` -- Assign risk band: Low (<3), Moderate (3-6), High (6-9), Critical (≥9) - -#### Phase 5: Reporting -- Generate structured reports with all metrics and risk data -- Sort by LRS (descending), file, line, function 
name -- Support text and JSON output formats - -## Data Models - -### FunctionNode -Represents a discovered function with: -- `id: FunctionId` - Unique identifier (file_index, local_index) -- `name: Option` - Function name (or None for anonymous) -- `span: Span` - Source location -- `body: BlockStmt` - Function body AST node - -### CFG Components -- `CfgNode` - Graph node with ID and kind (Statement, Condition, LoopHeader, Join, etc.) -- `CfgEdge` - Directed edge connecting nodes -- `Cfg` - Complete graph with entry/exit nodes - -### Metrics -- `RawMetrics` - CC, ND, FO, NS values -- `RiskComponents` - Transformed risk values (R_cc, R_nd, R_fo, R_ns) -- `RiskBand` - Enum (Low, Moderate, High, Critical) - -### Reports -- `FunctionRiskReport` - Complete report including: - - File path, function name, line number - - Raw metrics and risk components - - LRS score and risk band - -## Global Invariants - -These invariants are enforced throughout the system: - -1. **Per-function analysis:** No cross-function dependencies -2. **No global mutable state:** All analysis is stateless -3. **No randomness/clocks/threads/async:** Fully deterministic -4. **Deterministic traversal:** Explicit ordering by (file, span.start) -5. **Formatting invariance:** Whitespace and formatting don't affect results -6. **Output determinism:** Identical input → identical output - -See [`invariants.md`](./invariants.md) for detailed documentation. - -## Supported Features - -### TypeScript Syntax -- Function declarations and expressions -- Arrow functions (with expression and block bodies) -- Class methods -- Object literal methods -- All control flow structures -- Type annotations (parsed but not analyzed) - -### Explicitly Unsupported -- JSX/TSX syntax -- Generator functions (`function*`) -- Experimental decorators -- Async/await analysis (parsed but not modeled in CFG) - -See [`ts-support.md`](./ts-support.md) for complete details. 
- -## Testing Strategy - -### Unit Tests -- Parser tests (syntax validation, error handling) -- Function discovery tests (ordering, anonymous functions) -- CFG tests (construction, validation) -- Metric calculation tests - -### Integration Tests -- End-to-end analysis of fixture files -- Determinism verification (identical outputs) -- Whitespace invariance testing -- Golden file comparisons - -### Golden Files -- Expected JSON outputs for known fixtures -- Automatically verified in CI -- Location: `tests/golden/*.json` - -## Performance Characteristics - -- **Static analysis:** No execution required -- **Per-file parsing:** Files analyzed independently -- **Deterministic algorithms:** O(n) for AST traversal -- **CFG construction:** Linear in function size -- **No caching:** Each run is independent (by design) - -## Limitations - -See [`limitations.md`](./limitations.md) for detailed known limitations, including: -- Break/continue target resolution -- Labeled break/continue -- Generator functions -- Async function CFG modeling -- Type-aware analysis - -## Future Considerations - -Potential enhancements beyond MVP: -- Incremental analysis -- Type-aware metrics -- Cross-function dependency tracking -- Configuration file support -- Custom risk thresholds -- Export to various formats (SARIF, etc.) diff --git a/docs/architecture/sqlite-pipeline-refactor.md b/docs/architecture/sqlite-pipeline-refactor.md deleted file mode 100644 index 4e300df..0000000 --- a/docs/architecture/sqlite-pipeline-refactor.md +++ /dev/null @@ -1,272 +0,0 @@ -# SQLite Pipeline Refactor Plan - -## Problem: Peak Memory in Snapshot Mode - -Running `hotspots analyze . --mode snapshot` on a large monorepo (e.g. 
expo/expo, ~51k -functions) currently holds multiple large structures in memory simultaneously: - -| Phase | Structure | Approx size | -|-------|-----------|-------------| -| After analysis | `Vec` | ~23 MB | -| While building call graph | `Vec` + `CallGraph` | ~48 MB | -| After `Snapshot::new` | `CallGraph` + `Vec` | ~75 MB | -| During enrichment (churn, touch, risk…) | `Vec` | ~50 MB | -| JSON output (pre-T4.7) | `Vec` + JSON `String` | ~200 MB | - -Peak was ~250 MB before T4.7 (streaming JSON output). With T4.7 in place, peak is ~75 MB -(CallGraph + FunctionSnapshot Vec overlap). The goal of the full pipeline refactor is to -reduce this to ~25 MB by never having more than one large structure alive at a time. - ---- - -## How the Current Code Flows - -**Entry point**: `hotspots-cli/src/cmd/analyze.rs → handle_mode_output → Snapshot branch` - -``` -analyze_with_progress(path, ...) - → Vec ← parallel rayon, all collected before returning - -build_call_graph(&reports, repo_root) - → CallGraph { nodes, edges, ... } ← built while Vec still alive - -Snapshot::new(git_context, reports) - → Vec ← consumes + drops Vec - but CallGraph is still alive here - -SnapshotEnricher - .with_churn(...) ← mutates Vec in place - .with_touch_metrics(...) ← mutates Vec in place - .with_callgraph(&graph, ...) ← reads CallGraph, mutates Vec; then graph can drop - .enrich(...) ← activity_risk, percentiles, driver labels, quadrants - .build() - → Snapshot { functions: Vec, summary, ... } - -snapshot.populate_patterns(...) ← mutates Vec in place - -emit_snapshot_output(...) - → snapshot.write_json_to(stdout) ← streams from Vec (T4.7 already done) -``` - ---- - -## Refactored Pipeline: SQLite as the In-Process Buffer - -The refactored pipeline replaces the Vec-based pipeline with a TempDb (in-memory SQLite) -that acts as the single store. Each phase writes results and then the in-memory structure -is dropped. Only one large thing lives in RAM at a time. 
- -``` -analyze_with_progress(path, ...) - → Vec - -db.insert_reports(sha, &reports) ← write all rows to SQLite (includes raw callee lists) -drop(reports) ← Vec freed (~23 MB recovered) - -db.build_call_graph(sha, repo_root) - → CallGraph ← loaded from SQLite: only (function_id, callee_names) - much leaner than loading full FunctionRiskReport - - [graph algorithm phase] - pagerank = graph.pagerank(...) - betweenness = graph.betweenness_centrality_approx(k) - scc_info = graph.find_strongly_connected_components() - depths = graph.compute_dependency_depth() - fan_in_map = graph.build_fan_in_map() - -db.update_callgraph_metrics(sha, &graph, &fan_in_map, &pagerank, ...) - ← SQL UPDATE: write fan_in, fan_out, pagerank, betweenness, scc, depths back to rows - -drop(graph) ← CallGraph freed (~25 MB recovered) -drop(pagerank, betweenness, ...) - -db.update_churn(sha, &churn_by_file) ← SQL UPDATE per file path -db.update_touch(sha, &touch_data) ← SQL UPDATE per file path - -db.update_activity_risk(sha, &weights) - ← streams 1k rows at a time: load (lrs, churn, touch, callgraph cols) - → call scoring::compute_activity_risk per row - → batch UPDATE activity_risk, risk_factors - -db.update_percentile_flags(sha) - ← pure SQL: NTILE(100) OVER (ORDER BY activity_risk) in a CTE → UPDATE - -db.update_driver_and_quadrant(sha, percentile) - ← streams rows to compute distribution thresholds (percentile_idx) - → classify each row via existing Rust logic - → batch UPDATE driver, driver_detail, quadrant - -db.update_patterns(sha, &thresholds) - ← streams rows: load (cc, nd, fo, ns, loc, fan_in, scc_size, touch, days_since) - → call patterns::classify per row - → batch UPDATE patterns - -summary = db.compute_summary(sha, betweenness_approximate) - ← SQL GROUP BY band + window functions for top-k shares - -db.write_snapshot_json_to(sha, &commit, &analysis, &summary, &mut stdout) - ← cursor streams rows, serializes each function as JSON, writes directly -``` - -**Peak memory with refactored 
pipeline**: max(Vec\<FunctionRiskReport\>, CallGraph, 1k-row -batch) ≈ **25 MB**, roughly a 3× reduction from the current ~75 MB peak and about 10× below -the pre-T4.7 ~250 MB peak. - ---- - -## "Raw Partial" Graph in SQLite - -The raw callee lists (the "raw partial graph") are stored as a `callees TEXT` -column (JSON array of callee name strings, straight from the AST) in the `functions` table: - -```sql --- functions table (abbreviated) -function_id TEXT -- e.g. "src/auth/login.ts::validateToken" -file TEXT -- absolute path -callees TEXT -- JSON: ["checkPermission", "hashPassword", ...] -fan_in INTEGER -- written back after graph computation -fan_out INTEGER -pagerank REAL -betweenness REAL -scc_id INTEGER -... -``` - -`build_call_graph` reads just `(function_id, file, callees)` from SQLite — no metrics, no -enrichment data. This is the lean graph load. The CallGraph struct holds only: -- `nodes: HashSet` — function IDs (~4 MB for 51k functions) -- `edges: HashMap>` — callee ID lists (~20 MB) - -After `update_callgraph_metrics` writes computed values back, the CallGraph is dropped. -The "raw partial" (callee lists in the `callees` column) stays in TempDb as audit data -but is no longer duplicated in the in-memory `CallGraph`. - -Degree-only metrics (fan_in, fan_out) could in principle be computed with SQL -`COUNT` queries without loading the graph into memory at all. The iterative algorithms -(PageRank, betweenness, SCC) require the full edge structure in RAM for performance: -doing 30 PageRank iterations via SQL would be ~100× slower than in-memory traversal. 
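The point about degree-only metrics can be made concrete with a small sketch: fan-in and fan-out need only one pass over the callee lists, with no graph traversal. The function name `degree_metrics` is hypothetical, and the sketch assumes the callee lists have already been resolved to function IDs, which the raw `callees` column (callee names from the AST) does not guarantee.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Compute (fan_in, fan_out) per function from caller -> callee lists.
/// Assumptions (not from the Hotspots source): keys and callees are already
/// function IDs, duplicate calls count once, and callees that are not known
/// functions (e.g. external calls) are ignored.
pub fn degree_metrics(
    edges: &BTreeMap<String, Vec<String>>,
) -> BTreeMap<String, (usize, usize)> {
    // Start every known function at (fan_in = 0, fan_out = 0).
    let mut degrees: BTreeMap<String, (usize, usize)> =
        edges.keys().map(|id| (id.clone(), (0, 0))).collect();

    for (caller, callees) in edges {
        // Deduplicate so repeated call sites to the same callee count once.
        let distinct: BTreeSet<&str> = callees.iter().map(String::as_str).collect();
        let mut resolved = 0;
        for callee in distinct {
            if let Some(entry) = degrees.get_mut(callee) {
                entry.0 += 1; // one more distinct caller for this callee
                resolved += 1;
            }
        }
        if let Some(entry) = degrees.get_mut(caller.as_str()) {
            entry.1 = resolved; // distinct resolved callees of this caller
        }
    }
    degrees
}
```

This is the same aggregation a SQL `COUNT` over an edge table would perform. PageRank, betweenness, and SCC differ because they revisit edges many times, which is why the plan keeps them in memory.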
- ---- - -## Files Changed - -| File | Change | -|------|--------| -| `hotspots-core/src/db/mod.rs` | Add `callees TEXT` to schema; add `insert_reports`, `build_call_graph`, `update_callgraph_metrics`, `update_churn`, `update_touch`, `update_activity_risk`, `update_percentile_flags`, `update_driver_and_quadrant`, `update_patterns`, `compute_summary`, `write_snapshot_json_to` to `TempDb` | -| `hotspots-cli/src/cmd/analyze.rs` | New `build_and_stream_snapshot_via_db` function; wire into snapshot mode | - -The delta mode pipeline (`build_enriched_snapshot`) is **unchanged** — it still returns a -`Snapshot` struct because delta computation needs to load two snapshots and diff them. - ---- - -## Enrichment Phases That Use SQL vs Rust Streaming - -| Phase | Approach | Why | -|-------|----------|-----| -| Fan-in/fan-out | In-memory from CallGraph (also writable as SQL COUNT) | Graph already loaded | -| PageRank | In-memory (iterative, 30 passes over all edges) | SQL would be 100× slower | -| Betweenness | In-memory (BFS per source node) | Same reason | -| SCC | In-memory (Tarjan DFS) | Requires global stack/visited state | -| Dependency depth | In-memory (BFS level-by-level) | Graph already loaded | -| Churn | SQL UPDATE per file | Single-pass, file → rows mapping | -| Touch metrics | SQL UPDATE per file | Same | -| Activity risk | Rust streaming: 1k rows at a time → batch UPDATE | Pure function per row | -| Percentile flags | SQL NTILE window function | Pure aggregation | -| Driver labels | Rust streaming: load distribution, then label per row | Needs percentile thresholds first | -| Quadrant | Rust streaming: same pass as driver labels | Depends on driver + touch data | -| Patterns | Rust streaming: 1k rows at a time → batch UPDATE | Pure function per row | -| Summary stats | SQL: GROUP BY band, SUM(activity_risk), etc. 
| Efficient aggregation | -| JSON output | SQL cursor → serde_json per row | Already done (T4.7) | - ---- - -## CPU Utilization Impact - -### Where CPU comes from today - -| Phase | CPU character | Bounded? | -|-------|--------------|----------| -| Analysis (rayon workers) | Parallel, all cores, dominant consumer | Yes — `--jobs N` flag | -| Touch cache cold start | Sequential `git log -1` per stale file → rapid subprocess fan-out | Yes — batch calls now | -| `build_call_graph` | Single-threaded, O(F) construction | No (fast though) | -| PageRank | Single-threaded, 30 × O(E) iterations | No | -| Betweenness approx | Single-threaded, k × O(V+E) BFS | No | -| SCC / dependency depth | Single-threaded, O(V+E) | No | -| Enrichment (churn/touch/risk/patterns) | Single-threaded, O(F) passes | No (fast) | -| JSON serialization (old path) | Single-threaded, O(F), large allocator pressure | Eliminated by T4.7 | - -### What the SQLite pipeline changes for CPU - -**Adds overhead:** - -- **INSERT on write**: Each `FunctionRiskReport` row requires serializing the `callees` - field to a JSON string (Vec\ → text). At 51k functions with an average of ~5 - callees each, this is ~255k string serializations — cheap but not free. -- **SQL UPDATE passes**: Each enrichment phase issues a batch of UPDATE statements rather - than mutating a Vec index. A Vec write is a pointer store; a SQLite UPDATE is a B-tree - key lookup + page dirtying. Rough overhead: 2–5× per-row cost vs direct mutation. -- **Cursor deserialization**: Streaming reads deserialize each row back from SQLite types - into Rust values. Similar overhead to the INSERT path. -- **Multiple passes**: The pipeline makes ~8 passes over the data (graph write, churn, - touch, activity\_risk, percentiles, driver labels, patterns, output) rather than the - current ~6 in-memory passes. Two extra passes for the graph write/read cycle. 
- -**Removes or reduces overhead:** - -- **Allocator pressure**: The current pipeline allocates a 50 MB `Vec` - and then a 150 MB JSON `String` in quick succession, forcing the allocator to find and - manage large contiguous regions. SQLite's page cache avoids both of these large Rust heap - allocations. Fewer large allocations = less allocator CPU and fewer OS page faults. -- **Cache locality per batch**: Processing 1 000 rows at a time from a cursor fits in L3 - cache. Processing 51k functions at once does not. Each enrichment pass has better - spatial locality in the batched model. -- **Eliminated JSON string allocation (T4.7)**: The `serde_json::to_string_pretty` call - that built a 150–200 MB string for the entire snapshot is already gone. That was the - single largest CPU + allocator event in the output phase. -- **No swap / OOM-killer overhead**: On the Docker benchmark (512 MB limit), the old - pipeline was running close to the limit. When the OS is under memory pressure it - spends CPU on page reclaim. Reducing peak memory from ~250 MB to ~25 MB eliminates - that hidden CPU tax. - -### Net CPU effect - -For the graph algorithms (PageRank, betweenness, SCC) there is **no change** — same code, -same complexity. These are the dominant single-threaded CPU consumers after the analysis -phase and the SQLite refactor does not touch them. - -For everything else the overhead of SQLite operations (B-tree, serialization) is offset by -the reduction in allocator pressure and improved cache behavior. On a warm L3 cache with -25 MB working set vs a cold 250 MB working set, the later enrichment phases run faster. 
- -Observed benchmark behaviour is expected to show: -- Analysis phase CPU profile: unchanged (still rayon workers up to `--jobs` limit) -- Post-analysis CPU: flatter, shorter spikes (no large-allocation events) -- Total wall-clock time: roughly neutral to 10–20% slower on small repos (SQLite overhead - dominates), roughly neutral to faster on large repos (cache + allocator pressure win) - -The SQLite refactor is primarily a **memory** optimization. CPU is a secondary benefit -for very large repos where memory pressure was causing OS-level overhead. - ---- - -## What Is NOT Changing - -- The `CallGraph` struct and all graph algorithm implementations — no changes needed -- The `scoring::compute_activity_risk` function — called per-row from the streaming loop -- The `patterns::classify` function — called per-row from the streaming loop -- The `Snapshot` struct and all its serialization — still used for delta mode and persistence -- The `.json.zst` snapshot file format — backward-compatible persistence unchanged -- The `SnapshotEnricher` — still used for delta mode - ---- - -## Persistence (non-benchmark case) - -When `--no-persist` is NOT passed, the snapshot must also be written to -`.hotspots/snapshots/.json.zst`. With the DB pipeline, this means loading a full -`Snapshot` from TempDb after enrichment — a one-time cost only paid when persisting. -Alternatively, the `SnapshotDb` (`.hotspots/snapshots.db`) can be used for persistence, -avoiding the round-trip through the `Snapshot` struct entirely. - -For now: the DB pipeline handles `--no-persist` (the benchmark case). Persistence can -be converted to use `SnapshotDb` in a follow-on. diff --git a/docs/architecture/testing.md b/docs/architecture/testing.md deleted file mode 100644 index b1fcc41..0000000 --- a/docs/architecture/testing.md +++ /dev/null @@ -1,759 +0,0 @@ -# Testing Strategy - -Hotspots testing approach ensuring correctness, determinism, and cross-language consistency. 
- -## Overview - -Hotspots employs a multi-layered testing strategy: - -1. **Unit Tests** - Test individual components in isolation -2. **Integration Tests** - Test end-to-end analysis pipeline -3. **Golden Tests** - Verify deterministic output -4. **Language Parity Tests** - Ensure cross-language consistency -5. **CI Invariant Tests** - Enforce critical invariants -6. **Suppression Tests** - Validate suppression comments - -**Test Coverage:** >80% (target 90%) -**Total Tests:** 220+ tests across all categories - ---- - -## Test Types - -### Unit Tests - -Test individual functions and modules in isolation. - -**Location:** `hotspots-core/src/**/tests.rs` (inline with code) - -**Run Command:** -```bash -cargo test -``` - -**Example:** -```rust -#[cfg(test)] -mod tests { - use super::*; - - #[test] - fn test_cyclomatic_complexity() { - let cfg = build_test_cfg(); - let cc = calculate_cc(&cfg); - assert_eq!(cc, 5); - } -} -``` - -**Coverage:** -- CFG construction -- Metrics calculation (CC, ND, FO, NS) -- LRS formula -- Risk band classification -- Configuration loading -- Suppression parsing - ---- - -### Integration Tests - -Test complete analysis pipeline from source code to reports. 
- -**Location:** `hotspots-core/tests/integration_tests.rs` - -**Run Command:** -```bash -cargo test --test integration_tests -``` - -**What They Test:** -- Full analysis pipeline -- File discovery -- Parser integration -- CFG building -- Metrics extraction -- Report generation - -**Example:** -```rust -#[test] -fn test_analyze_typescript_file() { - let path = PathBuf::from("tests/fixtures/simple.ts"); - let options = AnalysisOptions { - min_lrs: None, - top_n: None, - }; - - let reports = analyze(&path, options).unwrap(); - - assert_eq!(reports.len(), 2); // simpleFunction, complexFunction - assert_eq!(reports[0].metrics.cc, 1); - assert_eq!(reports[1].metrics.cc, 4); -} -``` - -**Key Tests:** -- Single file analysis -- Directory analysis -- Multi-language projects -- Config file loading -- Output formatting (JSON, text) -- Error handling (invalid syntax, missing files) - ---- - -### Golden Tests - -Verify byte-for-byte deterministic output against saved snapshots. - -**Location:** `hotspots-core/tests/golden_tests.rs` - -**Run Command:** -```bash -cargo test --test golden_tests -``` - -**How They Work:** - -1. **Fixture** - Test input file (e.g., `tests/fixtures/simple.ts`) -2. **Analysis** - Run Hotspots analysis -3. **Golden File** - Expected output (e.g., `tests/golden/simple.json`) -4. **Comparison** - Assert actual output matches golden file exactly - -**Example:** -```rust -#[test] -fn test_simple_golden() { - let fixture = "tests/fixtures/simple.ts"; - let golden = "tests/golden/simple.json"; - - let reports = analyze(fixture, options).unwrap(); - let actual = render_json(&reports); - let expected = read_golden(golden); - - // Parse as JSON, normalize paths, compare - assert_eq!(parse_json(actual), parse_json(expected)); -} -``` - -**Path Normalization:** - -Golden files use absolute paths, which vary by machine. 
The test harness normalizes paths: - -```rust -fn normalize_paths(json: &mut Value, project_root: &Path) { - // Extract path after "hotspots/" and normalize to current root - if let Some(idx) = path.find("hotspots/") { - let suffix = &path[idx + "hotspots/".len()..]; - *path = project_root.join(suffix).to_string(); - } -} -``` - -**Generating Golden Files:** - -```bash -# Build release binary -cargo build --release - -# Generate golden output -./target/release/hotspots analyze tests/fixtures/simple.ts --format json > tests/golden/simple.json - -# Verify manually -cat tests/golden/simple.json | jq . - -# Commit golden file -git add tests/golden/simple.json -``` - -**When to Update Golden Files:** -- New feature changes output format -- Metric calculation improved -- Bug fix changes results - -**Never update golden files to make tests pass without understanding why output changed!** - ---- - -### Language Parity Tests - -Ensure identical code structure produces identical metrics across languages. - -**Location:** `hotspots-core/tests/language_parity_tests.rs` - -**Run Command:** -```bash -cargo test --test language_parity_tests -``` - -**Critical Invariant:** - -> Functions with identical control flow structure MUST produce identical complexity metrics regardless of language. 
- -**Example:** - -TypeScript: -```typescript -function example(x: number): number { - if (x > 0) { - return x * 2; - } - return 0; -} -``` - -JavaScript: -```javascript -function example(x) { - if (x > 0) { - return x * 2; - } - return 0; -} -``` - -**Expected:** Both must have CC=2, ND=1, FO=0, NS=1 - -**Test Implementation:** -```rust -#[test] -fn test_typescript_javascript_parity() { - let ts_reports = analyze("tests/fixtures/example.ts", options).unwrap(); - let js_reports = analyze("tests/fixtures/js/example.js", options).unwrap(); - - assert_eq!(ts_reports.len(), js_reports.len()); - - for (ts, js) in ts_reports.iter().zip(js_reports.iter()) { - assert_eq!(ts.metrics.cc, js.metrics.cc, "CC must match"); - assert_eq!(ts.metrics.nd, js.metrics.nd, "ND must match"); - assert_eq!(ts.metrics.fo, js.metrics.fo, "FO must match"); - assert_eq!(ts.metrics.ns, js.metrics.ns, "NS must match"); - assert_eq!(ts.lrs, js.lrs, "LRS must match"); - } -} -``` - -**Fixtures:** -- `tests/fixtures/simple.ts` ↔ `tests/fixtures/js/simple.js` -- `tests/fixtures/nested-branching.ts` ↔ `tests/fixtures/js/nested-branching.js` -- `tests/fixtures/loop-breaks.ts` ↔ `tests/fixtures/js/loop-breaks.js` -- `tests/fixtures/pathological.ts` ↔ `tests/fixtures/js/pathological.js` - ---- - -### CI Invariant Tests - -Enforce critical behavioral invariants. - -**Location:** `hotspots-core/tests/ci_invariant_tests.rs` - -**Run Command:** -```bash -cargo test --test ci_invariant_tests -``` - -**Invariants Tested:** - -#### 1. Determinism - -> Running analysis twice on identical code MUST produce byte-for-byte identical output. - -```rust -#[test] -fn test_determinism() { - let run1 = analyze("tests/fixtures/simple.ts", options).unwrap(); - let run2 = analyze("tests/fixtures/simple.ts", options).unwrap(); - - assert_eq!(render_json(&run1), render_json(&run2)); -} -``` - -#### 2. Ordering - -> Functions MUST always appear in source order. 
- -```rust -#[test] -fn test_function_ordering() { - let reports = analyze("tests/fixtures/multiple-functions.ts", options).unwrap(); - - for i in 1..reports.len() { - assert!(reports[i-1].line < reports[i].line, "Functions must be ordered by line number"); - } -} -``` - -#### 3. Monotonicity - -> Adding control flow MUST increase or maintain CC (never decrease). - -```rust -#[test] -fn test_cc_monotonicity() { - let simple_cc = analyze_snippet("return x;").metrics.cc; - let with_if_cc = analyze_snippet("if (x > 0) return x; return 0;").metrics.cc; - - assert!(with_if_cc >= simple_cc, "Adding 'if' must not decrease CC"); -} -``` - -#### 4. Non-Negativity - -> All metrics MUST be non-negative. - -```rust -#[test] -fn test_non_negative_metrics() { - let reports = analyze("tests/fixtures/pathological.ts", options).unwrap(); - - for report in reports { - assert!(report.metrics.cc >= 0); - assert!(report.metrics.nd >= 0); - assert!(report.metrics.fo >= 0); - assert!(report.metrics.ns >= 0); - assert!(report.lrs >= 0.0); - } -} -``` - ---- - -### Suppression Tests - -Validate suppression comment parsing and behavior. - -**Location:** `hotspots-core/tests/suppression_tests.rs` - -**Run Command:** -```bash -cargo test --test suppression_tests -``` - -**What They Test:** -- Suppression comment detection -- Reason extraction -- Metrics still calculated (suppression doesn't skip analysis) -- Report includes suppression metadata - -**Example:** -```rust -#[test] -fn test_suppression_comment() { - let source = r#" - // @hotspots-ignore: Legacy code, refactor planned - function legacyFunction(x) { - // complex logic... - } - "#; - - let reports = analyze_source(source).unwrap(); - - assert_eq!(reports.len(), 1); - assert!(reports[0].suppressed); - // as_deref() lets the stored reason be compared against a &str literal - assert_eq!(reports[0].suppression_reason.as_deref(), Some("Legacy code, refactor planned")); - assert_eq!(reports[0].metrics.cc, 8); // Still calculated! 
-} -``` - ---- - -## Test Fixtures - -### Directory Structure - -``` -tests/fixtures/ -├── typescript/ -│ ├── simple.ts -│ ├── nested-branching.ts -│ ├── loop-breaks.ts -│ ├── try-catch-finally.ts -│ └── pathological.ts -├── javascript/ -│ ├── simple.js -│ ├── nested-branching.js -│ └── loop-breaks.js -├── go/ -│ ├── simple.go -│ ├── loops.go -│ └── branching.go -├── python/ -│ ├── simple.py -│ ├── comprehensions.py -│ └── exceptions.py -├── rust/ -│ ├── simple.rs -│ ├── pattern-matching.rs -│ └── iterators.rs -└── java/ - ├── Simple.java - ├── Loops.java - └── Exceptions.java -``` - -### Fixture Categories - -#### 1. Simple - -Basic functions with minimal complexity. - -**Purpose:** Baseline testing, smoke tests - -**Example:** `simple.ts` -```typescript -function simple(x: number): number { - return x + 1; -} - -function withEarlyReturn(x: number): number { - if (x < 0) return 0; - return x; -} -``` - -**Expected Metrics:** -- `simple`: CC=1, ND=0, FO=0, NS=0 -- `withEarlyReturn`: CC=2, ND=1, FO=0, NS=1 - -#### 2. Nested Branching - -Deeply nested if/else structures. - -**Purpose:** Test ND calculation, CC with nested conditions - -**Example:** `nested-branching.ts` -```typescript -function nested(x: number): number { - if (x > 0) { - if (x < 100) { - if (x % 2 === 0) { - return x * 2; - } - } - } - return 0; -} -``` - -**Expected Metrics:** CC=4, ND=3 - -#### 3. Loop Breaks - -Loops with break/continue. - -**Purpose:** Test NS calculation, loop CFG routing - -**Example:** `loop-breaks.ts` -```typescript -function loopWithBreak(items: number[]): number { - for (const item of items) { - if (item > 10) break; - if (item < 0) continue; - } - return items[0]; -} -``` - -**Expected Metrics:** CC=3, NS=2 (break + continue) - -#### 4. Try-Catch-Finally - -Exception handling. 
- -**Purpose:** Test CC with catch clauses - -**Example:** `try-catch-finally.ts` -```typescript -function tryCatch(x: number): number { - try { - return x / 2; - } catch (err) { - return 0; - } finally { - console.log("done"); - } -} -``` - -**Expected Metrics:** CC=2 (try + catch), FO=1 (console.log) - -#### 5. Pathological - -Extremely complex functions. - -**Purpose:** Stress testing, performance benchmarks - -**Example:** `pathological.ts` -```typescript -function pathological(data: any): any { - if (data.a) { - if (data.b) { - if (data.c) { - for (let i = 0; i < 10; i++) { - if (i % 2 === 0) { - if (data.items && data.items[i]) { - try { - return process(data.items[i]); - } catch (e) { - if (e.code === 500) { - throw e; - } else { - continue; - } - } - } - } - } - } - } - } - return null; -} -``` - -**Expected Metrics:** CC=11, ND=7 - ---- - -## Running Tests - -### All Tests - -```bash -cargo test -``` - -### Specific Test Suite - -```bash -# Unit tests only -cargo test --lib - -# Integration tests -cargo test --test integration_tests - -# Golden tests -cargo test --test golden_tests - -# Language parity -cargo test --test language_parity_tests - -# CI invariants -cargo test --test ci_invariant_tests -``` - -### Specific Test - -```bash -# By name -cargo test test_simple_golden - -# With output -cargo test test_simple_golden -- --nocapture - -# Filter by pattern -cargo test typescript -``` - -### Watch Mode - -```bash -# Rerun tests on file changes -cargo watch -x test -``` - ---- - -## Continuous Integration - -### GitHub Actions Workflow - -`.github/workflows/test.yml`: -```yaml -name: Tests - -on: [push, pull_request] - -jobs: - test: - runs-on: ubuntu-latest - - steps: - - uses: actions/checkout@v4 - - - uses: dtolnay/rust-toolchain@stable - - - name: Cache cargo - uses: actions/cache@v4 - with: - path: | - ~/.cargo/bin/ - ~/.cargo/registry/index/ - ~/.cargo/registry/cache/ - ~/.cargo/git/db/ - target/ - key: ${{ runner.os }}-cargo-${{ 
hashFiles('**/Cargo.lock') }} - - - name: Run tests - run: cargo test --verbose - - - name: Run golden tests - run: cargo test --test golden_tests - - - name: Run language parity tests - run: cargo test --test language_parity_tests - - - name: Run CI invariant tests - run: cargo test --test ci_invariant_tests -``` - -### Required Tests for PR - -All PRs must pass: -- ✅ All unit tests -- ✅ All integration tests -- ✅ All golden tests -- ✅ All language parity tests -- ✅ All CI invariant tests -- ✅ `cargo clippy` (zero warnings) -- ✅ `cargo fmt -- --check` (formatted) - ---- - -## Coverage - -### Generate Coverage Report - -```bash -# Install tarpaulin -cargo install cargo-tarpaulin - -# Generate coverage -cargo tarpaulin --out Html --output-dir coverage - -# View report -open coverage/index.html -``` - -### Coverage Targets - -- **Overall:** >80% (target 90%) -- **Core modules:** >90% - - `metrics.rs` - 95% - - `cfg/builder.rs` - 90% - - `language/*/parser.rs` - 85% -- **Less critical:** >70% - - `cli.rs` - 70% - - `render.rs` - 75% - ---- - -## Performance Benchmarks - -### Benchmark Suite - -`benches/analysis_benchmark.rs`: -```rust -use criterion::{black_box, criterion_group, criterion_main, Criterion}; -use hotspots_core::analyze; - -fn bench_simple_file(c: &mut Criterion) { - c.bench_function("analyze simple.ts", |b| { - b.iter(|| analyze(black_box("tests/fixtures/simple.ts"), options)) - }); -} - -fn bench_complex_file(c: &mut Criterion) { - c.bench_function("analyze pathological.ts", |b| { - b.iter(|| analyze(black_box("tests/fixtures/pathological.ts"), options)) - }); -} - -criterion_group!(benches, bench_simple_file, bench_complex_file); -criterion_main!(benches); -``` - -### Run Benchmarks - -```bash -cargo bench -``` - -### Performance Targets - -- **Simple file (<50 LOC):** <5ms -- **Medium file (50-500 LOC):** <30ms -- **Complex file (500+ LOC):** <100ms -- **Directory (100 files):** <3s - ---- - -## Best Practices - -### Writing Tests - -1. 
**Test one thing** - Each test should verify a single behavior -2. **Clear names** - Test name should describe what's being tested -3. **Arrange-Act-Assert** - Follow AAA pattern -4. **Independent** - Tests should not depend on each other -5. **Deterministic** - No randomness, no flakiness - -**Good:** -```rust -#[test] -fn test_if_statement_adds_one_to_cc() { - let source = "if (x > 0) return x;"; - let cc = analyze_snippet(source).metrics.cc; - assert_eq!(cc, 2); // Baseline 1 + if statement 1 -} -``` - -**Bad:** -```rust -#[test] -fn test_stuff() { - let reports = analyze("tests/fixtures/simple.ts", options).unwrap(); - assert!(reports.len() > 0); - assert!(reports[0].metrics.cc > 0); -} -``` - -### Updating Golden Files - -1. **Understand why** - Don't blindly regenerate -2. **Manual verification** - Check output looks correct -3. **Document reason** - Explain in commit message -4. **Review carefully** - Golden file changes are high-risk - -**Good workflow:** -```bash -# 1. Make code change -# 2. Test fails -cargo test test_simple_golden - -# 3. Investigate difference -diff <(./target/release/hotspots analyze tests/fixtures/simple.ts --format json) tests/golden/simple.json - -# 4. If change is expected, regenerate -./target/release/hotspots analyze tests/fixtures/simple.ts --format json > tests/golden/simple.json - -# 5. Verify -cat tests/golden/simple.json | jq . - -# 6. Commit with explanation -git add tests/golden/simple.json -git commit -m "test: update simple.json golden file after CC fix" -``` - ---- - -## Related Documentation - -- [Adding Language Support](../contributing/adding-languages.md) - Testing new languages -- [Development Setup](../contributing/development.md) - Running tests locally -- [Invariants](./invariants.md) - Critical invariants enforced by tests - ---- - -**Testing is critical to Hotspots' reliability. 
When in doubt, add more tests!** ✅ diff --git a/docs/code-architecture/change-guide.md b/docs/code-architecture/change-guide.md deleted file mode 100644 index 052f368..0000000 --- a/docs/code-architecture/change-guide.md +++ /dev/null @@ -1,132 +0,0 @@ -# Contributor Change Guide - -Use this guide to choose where a code change belongs. - -## Add or change a metric - -Start in: - -- `hotspots-core/src/metrics.rs` -- `hotspots-core/src/risk.rs` -- `hotspots-core/src/report.rs` - -Then update: - -- golden tests under `hotspots-core/tests/` -- JSON/schema docs if output changes -- user docs if the metric is public - -Be careful: metric changes usually alter golden output and may affect policy behavior. - -## Add a language feature - -Start in the relevant language module under `hotspots-core/src/language/`. - -Common steps: - -1. Parse/discover the construct. -2. Ensure function spans and names are deterministic. -3. Model control flow in the CFG builder. -4. Add metric/golden fixtures. -5. Add parity tests when behavior should match another language. - -## Add a new language - -Typical work areas: - -- `hotspots-core/src/language/mod.rs` -- `hotspots-core/src/language/parser.rs` -- new parser module under `hotspots-core/src/language/` -- CFG builder implementation -- metric extraction support -- fixtures and golden tests -- docs/reference language support updates - -Keep the language layer behind the existing traits so the rest of the pipeline remains language-agnostic. - -## Change scoring or quadrants - -Start in: - -- `hotspots-core/src/scoring.rs` -- `hotspots-core/src/risk.rs` -- `hotspots-core/src/snapshot.rs` - -Also review: - -- policy thresholds -- docs explaining quadrants -- tests involving activity risk, touch counts, and fire/debt classification - -Do not describe `activity_risk` alone as active churn. Use `quadrant` and `touches_30d`. 
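The rule above can be sketched as a small helper that picks wording from `quadrant` and `touches_30d` only. This is a minimal illustration, not code from `hotspots-core`: the `Quadrant` enum, `Quadrant::parse`, and `describe_urgency` are hypothetical names, and the message strings are assumptions based on the quadrant table in `AGENTS.md`.

```rust
/// Hypothetical mirror of the string values in a snapshot's `quadrant` field.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Quadrant {
    Fire,
    Debt,
    SimpleActive,
    SimpleStable,
}

impl Quadrant {
    /// Parse the string form used in snapshot output (assumed spelling).
    pub fn parse(s: &str) -> Option<Self> {
        match s {
            "fire" => Some(Quadrant::Fire),
            "debt" => Some(Quadrant::Debt),
            "simple-active" => Some(Quadrant::SimpleActive),
            "simple-stable" => Some(Quadrant::SimpleStable),
            _ => None,
        }
    }
}

/// Hypothetical helper: choose a description from `quadrant` and `touches_30d`.
/// `activity_risk` is deliberately not a parameter, because the decayed score
/// never reaches zero and cannot show recent change by itself.
pub fn describe_urgency(quadrant: Quadrant, touches_30d: u32) -> &'static str {
    match (quadrant, touches_30d) {
        (Quadrant::Fire, t) if t > 0 => "live regression surface: act now",
        (Quadrant::Fire, _) => "fire quadrant without recent touches: verify before calling it active",
        (Quadrant::Debt, 0) => "stable structural debt: schedule proactively",
        (Quadrant::Debt, _) => "structural debt with recent touches: schedule proactively",
        (Quadrant::SimpleActive, _) => "monitor only",
        (Quadrant::SimpleStable, _) => "ignore",
    }
}
```

Keeping `activity_risk` out of the signature makes it hard to describe a function as actively changing on the strength of the composite score alone.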
- -## Change GitHub Action behavior - -Start in: - -- `action/src/main.ts` -- `action/action.yml` -- `action/__tests__/` -- `.github/workflows/test-action.yml` - -After TypeScript changes, run: - -```bash -npm -C action test -npm -C action run package -``` - -Commit the regenerated `action/dist/` files. Do not commit CLI binaries. - -## Change CI workflow binary reuse - -The intended model is: - -- cache key includes runner OS and commit SHA -- restore exact key only for commit-scoped binaries -- build on GitHub-hosted runners on cache miss -- cache save is best-effort because parallel workflows may race -- pass binaries between jobs in the same workflow via artifacts when a strict dependency is needed - -## Add a CLI command - -Put reusable logic in `hotspots-core`; keep `hotspots-cli` as the command adapter. - -Add command wiring in: - -- `hotspots-cli/src/main.rs` -- `hotspots-cli/src/cmd/mod.rs` -- a new file under `hotspots-cli/src/cmd/` - -Add integration tests for externally visible behavior. - -## Documentation update checklist - -When a change affects public behavior, update docs with the code change. - -| Code change | Docs to review | -|---|---| -| CLI flags or command behavior | `docs/reference/cli.md`, `docs/guide/usage.md` | -| JSON, JSONL, HTML, or SARIF output | `docs/reference/json-schema.md`, `docs/guide/output-formats.md` | -| language support or parser behavior | `docs/reference/language-support.md`, `docs/contributing/adding-languages.md` | -| scoring, LRS, quadrants, or risk bands | `docs/reference/metrics.md`, `docs/reference/lrs-spec.md`, `docs/code-architecture/data-model.md` | -| GitHub Action inputs/outputs | `docs/guide/ci-cd.md`, `docs/guide/github-action.md`, `action/action.yml` | -| config keys or defaults | `docs/guide/configuration.md`, `.hotspotsrc.json` examples | -| suppression comments | `docs/guide/suppression.md` | - -## Required validation - -Before proposing changes, run the relevant subset. 
For broad changes, run: - -```bash -cargo fmt --all -- --check -cargo clippy --all-targets --all-features -- -D warnings -cargo test -``` - -For action changes, also run: - -```bash -npm -C action test -npm -C action run package -``` diff --git a/docs/code-architecture/cli-and-action.md b/docs/code-architecture/cli-and-action.md deleted file mode 100644 index ad5b8ea..0000000 --- a/docs/code-architecture/cli-and-action.md +++ /dev/null @@ -1,79 +0,0 @@ -# CLI and GitHub Action - -This page explains the runtime wrappers around `hotspots-core`: the Rust CLI and the JavaScript GitHub Action. - -## CLI crate - -`hotspots-cli` is the user-facing binary crate. Its job is orchestration, not analysis. - -Important paths: - -| Path | Purpose | -|---|---| -| `hotspots-cli/src/main.rs` | Top-level CLI parser and command dispatch. | -| `hotspots-cli/src/cmd/analyze.rs` | `hotspots analyze`: snapshot and delta analysis modes. | -| `hotspots-cli/src/cmd/diff.rs` | `hotspots diff`: compare snapshots/commits. | -| `hotspots-cli/src/cmd/trends.rs` | Historical trend commands. | -| `hotspots-cli/src/cmd/config.rs` | Config inspection/validation commands. | -| `hotspots-cli/src/cmd/init.rs` | Config initialization. | -| `hotspots-cli/src/output/` | CLI-specific output helpers. | - -The CLI should generally: - -1. Parse arguments with `clap`. -2. Resolve config. -3. Determine git context when needed. -4. Call `hotspots-core` APIs. -5. Render output and set exit status. - -## Command architecture - -The key command families are: - -- `analyze`: produce current reports, snapshots, or delta-mode analysis. -- `diff`: compare base/head snapshots and evaluate policies. -- `trends`: inspect historical risk movement. -- `config`: inspect resolved configuration. -- `prune`: remove old local artifacts. - -When adding a command, keep reusable business logic in `hotspots-core` and make the CLI command a thin adapter. - -## GitHub Action - -The GitHub Action lives in `action/`. 
- -| Path | Purpose | -|---|---| -| `action/action.yml` | Public action inputs/outputs and runtime entry point. | -| `action/src/main.ts` | TypeScript source for the action. | -| `action/dist/index.js` | Committed bundled JavaScript runtime used by GitHub Actions. | -| `action/__tests__/` | Jest tests for action helper behavior. | - -The action does not contain static analysis logic. It resolves a `hotspots` binary and invokes the CLI. - -Binary resolution order: - -1. Use `binary-path` input when provided. -2. Resolve/download a release binary for the configured version. -3. Fall back to building from source when running inside the repository and Rust is available. - -## Why `dist/` is committed but CLI binaries are not - -JavaScript GitHub Actions execute checked-in JavaScript. For this reason, `action/dist/index.js` is committed after running: - -```bash -npm -C action run package -``` - -The Rust CLI binary is not committed. CI can build it on GitHub-hosted runners and pass it to the action through `binary-path` or workflow artifacts. Release assets provide prebuilt binaries for normal external users. - -## Action test workflow - -The action validation workflow builds or restores the CLI binary once, uploads it as a short-lived artifact, and then runs `uses: ./action` with: - -```yaml -with: - binary-path: .hotspots-bin/hotspots -``` - -This tests the local action code against a trusted binary built by GitHub Actions without adding binary files to git. diff --git a/docs/code-architecture/core-modules.md b/docs/code-architecture/core-modules.md deleted file mode 100644 index 32070e3..0000000 --- a/docs/code-architecture/core-modules.md +++ /dev/null @@ -1,84 +0,0 @@ -# Core Crate Modules - -`hotspots-core` is the main library crate. The CLI and GitHub Action exist primarily to drive this crate and present its output. 
- -## Public entry points - -`hotspots-core/src/lib.rs` exports the public analysis entry points: - -- `analyze(path, options)` -- `analyze_with_config(path, options, resolved_config)` -- `analyze_with_progress(path, options, resolved_config, progress)` - -It also re-exports commonly used types such as `FunctionRiskReport`, `ResolvedConfig`, `GitContext`, `CallGraph`, and `TouchMode`. - -## Module groups - -### Source analysis - -| Module | Responsibility | -|---|---| -| `analysis.rs` | Per-file orchestration: parse, discover functions, compute reports. | -| `language/` | Language abstraction layer and parser/CFG implementations. | -| `parser.rs` | ECMAScript parser compatibility layer and parser tests. | -| `discover.rs` | Function discovery helpers and legacy ECMAScript discovery. | -| `cfg.rs`, `cfg/builder.rs` | Control-flow graph representation and ECMAScript CFG construction. | -| `metrics.rs` | Raw metric extraction: cyclomatic complexity, nesting, fan-out, exits. | -| `suppression.rs` | `hotspots-ignore` suppression comment handling. | - -### Risk and prioritization - -| Module | Responsibility | -|---|---| -| `risk.rs` | LRS weights, thresholds, bands, and base risk calculations. | -| `scoring.rs` | Activity-weighted scoring and composite risk logic. | -| `patterns.rs` | Pattern classification such as god function, churn magnet, hub function. | -| `aggregates.rs` | File and directory aggregates. | -| `compact.rs` | Compact/default output data shaping. | - -### Git, snapshots, and deltas - -| Module | Responsibility | -|---|---| -| `git.rs` | Git context, refs, churn extraction, commit metadata. | -| `snapshot.rs` | Snapshot construction, enrichment, persistence model. | -| `delta.rs` | Snapshot/function comparison. | -| `policy.rs` | Policy evaluation over deltas. | -| `touch_cache.rs` | Cache for expensive function touch calculations. | -| `db/` | SQLite-backed storage helpers. | -| `trends.rs` | Historical trend analysis. 
| -| `prune.rs` | Snapshot/cache pruning. | - -### Graph and model enrichment - -| Module | Responsibility | -|---|---| -| `callgraph.rs` | Call graph, fan-in/fan-out, PageRank, betweenness approximations. | -| `imports.rs` | Import extraction and module relationship helpers. | -| `models.rs` | Model/entity declaration extraction and association. | - -### Rendering and output - -| Module | Responsibility | -|---|---| -| `report.rs` | Text/JSON rendering for function reports. | -| `html.rs` | HTML report generation. | -| `sarif.rs` | SARIF output for code scanning integrations. | - -## Language abstraction - -The language layer normalizes TypeScript, JavaScript, Go, Java, Python, Rust, C#, and Vue-flavored inputs into common concepts: - -- `Language` -- `LanguageParser` -- `ParsedModule` -- `FunctionNode` -- `FunctionBody` -- `CfgBuilder` -- `SourceSpan` - -Most new language work should begin in `hotspots-core/src/language/` and then add golden tests under `hotspots-core/tests/fixtures/` and `hotspots-core/tests/golden_tests.rs`. - -## Where not to put logic - -Avoid putting core analysis behavior in `hotspots-cli`. CLI code should parse arguments, load config, call `hotspots-core`, and render command-specific output. If behavior needs tests independent of terminal invocation, it probably belongs in `hotspots-core`. diff --git a/docs/code-architecture/data-model.md b/docs/code-architecture/data-model.md deleted file mode 100644 index c4a58b4..0000000 --- a/docs/code-architecture/data-model.md +++ /dev/null @@ -1,97 +0,0 @@ -# Data Model and Persistence - -Hotspots has three important layers of data: per-function static reports, enriched snapshots, and deltas/policy results. - -## Function reports - -`FunctionRiskReport` is the primary static analysis output. It includes source location, raw metrics, risk components, LRS, risk band, patterns, and related metadata. 
- -Produced by: - -- `hotspots-core/src/analysis.rs` -- `hotspots-core/src/report.rs` - -Used by: - -- CLI rendering -- snapshot construction -- HTML/JSON/SARIF output -- tests and golden fixtures - -## Snapshots - -Snapshots represent a commit-level view of function risk. They are built from current function reports and enriched with git, activity, and graph context. - -Key concepts: - -- A snapshot belongs to a commit SHA. -- Snapshots are immutable once written. -- Snapshot output should be deterministic. -- Activity fields must be interpreted with quadrant context. - -Important fields include: - -| Field | Meaning | -|---|---| -| `lrs` | Static local risk score. | -| `activity_risk` | Composite activity-weighted score; decays but does not become zero. | -| `quadrant` | Authoritative classification: `fire`, `debt`, `simple-active`, or `simple-stable`. | -| `touches_30d` | Recent commit touches for true recent activity. | -| `churn` fields | Change volume over git history/windows. | -| Graph metrics | Fan-in, PageRank, SCC/cycle information, dependency signals. | - -Use `quadrant` and `touches_30d` together when describing urgency. A `debt` function with `touches_30d == 0` is stable structural debt, not actively changing code. - -## Deltas - -Deltas compare a base snapshot to a head snapshot. - -Function states include: - -- new -- modified -- deleted -- unchanged - -Policies consume deltas to decide whether a PR should fail, warn, or pass. - -Relevant files: - -- `hotspots-core/src/delta.rs` -- `hotspots-core/src/policy.rs` -- `hotspots-cli/src/cmd/diff.rs` - -## Persistence - -Persistence is split between snapshot files and SQLite-backed helpers. 
- -| Component | Files | -|---|---| -| Snapshot construction/enrichment | `hotspots-core/src/snapshot.rs` | -| SQLite helpers | `hotspots-core/src/db/` | -| Git context/churn | `hotspots-core/src/git.rs` | -| Touch cache | `hotspots-core/src/touch_cache.rs` | -| Pruning | `hotspots-core/src/prune.rs` | - -## Deterministic persistence rules - -When changing persistence code: - -- Sort collections before serialization. -- Avoid hash-map iteration order in output. -- Write snapshots atomically where possible. -- Preserve backwards compatibility or add explicit schema/version handling. -- Do not mutate existing commit snapshots in place. - -## Output formats - -The same data can be rendered as: - -- human text -- compact text -- JSON -- JSONL -- HTML -- SARIF - -Renderers should not recompute analysis. They should format already-computed reports, snapshots, deltas, or policy results. diff --git a/docs/code-architecture/index.md b/docs/code-architecture/index.md deleted file mode 100644 index cc2e006..0000000 --- a/docs/code-architecture/index.md +++ /dev/null @@ -1,34 +0,0 @@ -# Codebase Guide - -This section documents how the Hotspots codebase is organized for contributors and maintainers. It is intentionally separate from user-facing CLI docs: it explains crates, modules, data flow, invariants, and where to make changes. - -For deeper background, design records, and historical reviews, see [Architecture Notes](/architecture/). - -## Repository map - -| Path | Purpose | -|---|---| -| `hotspots-core/` | Rust library containing analysis, scoring, git enrichment, snapshot/delta logic, policies, and renderers. | -| `hotspots-cli/` | Thin CLI layer: argument parsing, command dispatch, terminal output, and command-specific orchestration. | -| `action/` | JavaScript GitHub Action wrapper around the CLI. The committed runtime is `action/dist/index.js`. | -| `docs/` | VitePress documentation site. 
| -| `tests/`, `hotspots-core/tests/` | Integration, invariant, golden, and language parity tests. | -| `packages/`, `action/` | TypeScript packages and action packaging surface. | - -## Main design rules - -Hotspots relies on a few architectural invariants: - -- **Deterministic output:** identical source and git inputs should produce byte-for-byte stable output. -- **Per-function static analysis:** CFG and raw complexity metrics are computed per function. -- **Explicit ordering after parallel work:** file analysis may run in parallel, but results are sorted before output. -- **Snapshot immutability:** snapshots are keyed by commit SHA and treated as immutable historical records. -- **Quadrant-aware activity:** `quadrant`, not raw `activity_risk` alone, is the primary fire/debt classification. - -## Read next - -- [Analysis Pipeline](./pipeline.md) -- [Core Crate Modules](./core-modules.md) -- [CLI and GitHub Action](./cli-and-action.md) -- [Data Model and Persistence](./data-model.md) -- [Contributor Change Guide](./change-guide.md) diff --git a/docs/code-architecture/pipeline.md b/docs/code-architecture/pipeline.md deleted file mode 100644 index 3fe30a4..0000000 --- a/docs/code-architecture/pipeline.md +++ /dev/null @@ -1,107 +0,0 @@ -# Analysis Pipeline - -Hotspots is a pipeline-oriented analyzer. Most commands eventually turn source files into function reports, optionally enrich those reports with git and graph data, then render or persist results. 
- -## High-level flow - -```text -CLI command - ↓ -configuration discovery and resolution - ↓ -source file discovery and filtering - ↓ -per-file static analysis - ↓ -function reports - ↓ -snapshot/delta/policy/report command behavior -``` - -## Static analysis flow - -The central library entry points are in `hotspots-core/src/lib.rs`: - -- `analyze` -- `analyze_with_config` -- `analyze_with_progress` - -The core per-file flow is: - -```text -source file - ↓ -language detection - ↓ -parse source into language-specific AST - ↓ -discover functions - ↓ -build per-function CFG - ↓ -extract raw metrics - ↓ -score LRS and patterns - ↓ -FunctionRiskReport[] -``` - -Important implementation locations: - -| Stage | Main files | -|---|---| -| Config resolution | `hotspots-core/src/config.rs` | -| Source collection | `hotspots-core/src/lib.rs` | -| File analysis | `hotspots-core/src/analysis.rs` | -| Language detection/parsing | `hotspots-core/src/language/`, `hotspots-core/src/parser.rs` | -| Function discovery | `hotspots-core/src/discover.rs`, language parser modules | -| CFG construction | `hotspots-core/src/cfg.rs`, `hotspots-core/src/cfg/builder.rs`, `hotspots-core/src/language/*/cfg_builder.rs` | -| Metrics | `hotspots-core/src/metrics.rs` | -| LRS/risk bands | `hotspots-core/src/risk.rs`, `hotspots-core/src/scoring.rs` | -| Reports | `hotspots-core/src/report.rs` | - -## Snapshot and delta flow - -Snapshot mode enriches static function reports with commit-aware data: - -```text -FunctionRiskReport[] - ↓ -git context and commit metadata - ↓ -touch/churn metrics - ↓ -call graph metrics - ↓ -quadrant classification - ↓ -snapshot persisted by commit SHA -``` - -Delta mode compares snapshots or analyzes against a base commit: - -```text -base snapshot + head snapshot - ↓ -function matching by stable identifiers - ↓ -new / modified / deleted / unchanged classification - ↓ -policy evaluation - ↓ -text/json/html/sarif output -``` - -Relevant files: - -- 
`hotspots-core/src/snapshot.rs` -- `hotspots-core/src/delta.rs` -- `hotspots-core/src/policy.rs` -- `hotspots-core/src/git.rs` -- `hotspots-core/src/touch_cache.rs` - -## Determinism boundaries - -File analysis uses Rayon in `analyze_with_progress`, so worker completion order is nondeterministic. The implementation restores determinism by sorting intermediate results by file index before producing final reports. - -When adding parallelism, always add an explicit deterministic sort before exposing results, persisting snapshots, or comparing deltas. diff --git a/docs/contributing/adding-languages.md b/docs/contributing/adding-languages.md deleted file mode 100644 index 7d01960..0000000 --- a/docs/contributing/adding-languages.md +++ /dev/null @@ -1,759 +0,0 @@ -# Adding Language Support - -Step-by-step guide to adding a new programming language to Hotspots. - -## Overview - -Adding language support involves: - -1. **Parser Integration** - Integrate tree-sitter parser for the language -2. **CFG Builder** - Implement Control Flow Graph construction -3. **Test Fixtures** - Create test code files for the language -4. **Golden Files** - Generate expected output for determinism testing -5. **Documentation** - Update language support documentation - -**Current Supported Languages:** TypeScript, JavaScript, Go, Java, Python, Rust - -**Estimated Effort:** 7-14 days per language (varies by complexity) - ---- - -## Prerequisites - -Before starting: - -- ✅ Familiarity with Rust programming -- ✅ Understanding of Control Flow Graphs (CFGs) -- ✅ Knowledge of the target language's syntax and semantics -- ✅ tree-sitter parser exists for the target language -- ✅ Read [Architecture Overview](../architecture/overview.md) -- ✅ Review existing language implementations (Go, Python, Rust) - ---- - -## Step-by-Step Guide - -### Step 1: Parser Integration - -#### 1.1 Add tree-sitter Dependency - -Edit `hotspots-core/Cargo.toml`: - -```toml -[dependencies] -# ... existing dependencies ... 
-tree-sitter-<language> = "0.x.y" -``` - -**Find tree-sitter parsers:** Check [tree-sitter GitHub](https://github.com/tree-sitter) or [crates.io](https://crates.io). - -#### 1.2 Create Language Module - -Create `hotspots-core/src/language/<language>/mod.rs`: - -```rust -//! <Language> language support -//! -//! This module provides <Language> parsing, function discovery, and CFG building -//! using the tree-sitter-<language> parser. - -pub mod cfg_builder; -pub mod parser; - -pub use cfg_builder::<Language>CfgBuilder; -pub use parser::<Language>Parser; -``` - -#### 1.3 Implement Parser - -Create `hotspots-core/src/language/<language>/parser.rs`: - -```rust -use tree_sitter::{Node, Parser, TreeCursor}; -use anyhow::{Context, Result}; -use crate::language::function_body::FunctionBody; - -pub struct <Language>Parser { - parser: Parser, -} - -impl <Language>Parser { - pub fn new() -> Result<Self> { - let mut parser = Parser::new(); - let language = tree_sitter_<language>::LANGUAGE.into(); - parser.set_language(&language) - .context("Failed to load <Language> grammar")?; - - Ok(Self { parser }) - } - - /// Parse source code and discover functions - pub fn discover_functions(&mut self, source: &str) -> Result<Vec<FunctionBody>> { - let tree = self.parser.parse(source, None) - .context("Failed to parse <Language> source")?; - - let root = tree.root_node(); - let mut functions = Vec::new(); - - // Walk AST to find function declarations - let mut cursor = root.walk(); - self.visit_node(&mut cursor, source, &mut functions); - - // Sort by source position for determinism - functions.sort_by_key(|f| match f { - FunctionBody::<Language> { body_node, .. 
} => *body_node, - _ => unreachable!(), - }); - - Ok(functions) - } - - fn visit_node(&self, cursor: &mut TreeCursor, source: &str, functions: &mut Vec<FunctionBody>) { - let node = cursor.node(); - - // Check if this node is a function declaration - match node.kind() { - "function_declaration" | "method_declaration" => { - if let Some(func) = self.extract_function(node, source) { - functions.push(func); - } - } - _ => {} - } - - // Recursively visit children - if cursor.goto_first_child() { - loop { - self.visit_node(cursor, source, functions); - if !cursor.goto_next_sibling() { - break; - } - } - cursor.goto_parent(); - } - } - - fn extract_function(&self, node: Node, source: &str) -> Option<FunctionBody> { - // Extract function name - let name_node = node.child_by_field_name("name")?; - let function_name = &source[name_node.byte_range()]; - - // Extract function body - let body_node = node.child_by_field_name("body")?; - let body_source = &source[body_node.byte_range()]; - - Some(FunctionBody::<Language> { - body_node: body_node.id(), - source: body_source.to_string(), - }) - } -} - -impl crate::language::LanguageParser for <Language>Parser { - fn discover_functions(&mut self, source: &str) -> Result<Vec<FunctionBody>> { - self.discover_functions(source) - } -} -``` - -**Key Considerations:** -- **Node kinds:** Check tree-sitter grammar for exact node type names -- **Field names:** Use `child_by_field_name()` for reliable access -- **Determinism:** Always sort functions by source position -- **Multiple function types:** Handle methods, closures, lambdas, etc. - -#### 1.4 Add to Language Enum - -Edit `hotspots-core/src/language/mod.rs`: - -```rust -pub mod <language>; - -pub enum Language { - // ... existing languages ... - <Language>, -} - -impl Language { - pub fn from_extension(ext: &str) -> Option<Self> { - match ext { - // ... existing extensions ... 
- "<ext>" => Some(Language::<Language>), - _ => None, - } - } - - pub fn extensions(&self) -> &[&str] { - match self { - // ... existing languages ... 
- Language::<Language> => &["<ext>"], - } - } - - pub fn name(&self) -> &str { - match self { - // ... existing languages ... - Language::<Language> => "<Language>", - } - } -} - -pub use <language>::{<Language>Parser, <Language>CfgBuilder}; -``` - -#### 1.5 Add FunctionBody Variant - -Edit `hotspots-core/src/language/function_body.rs`: - -```rust -pub enum FunctionBody { - // ... existing variants ... - <Language> { - body_node: usize, - source: String, - }, -} - -impl FunctionBody { - pub fn is_<language>(&self) -> bool { - matches!(self, FunctionBody::<Language> { .. }) - } - - pub fn as_<language>(&self) -> Option<(&usize, &str)> { - if let FunctionBody::<Language> { body_node, source } = self { - Some((body_node, source)) - } else { - None - } - } -} -``` - ---- - -### Step 2: CFG Builder - -#### 2.1 Create CFG Builder - -Create `hotspots-core/src/language/<language>/cfg_builder.rs`: - -```rust -use crate::cfg::{Cfg, CfgNode}; -use anyhow::Result; - -pub struct <Language>CfgBuilder; - -impl <Language>CfgBuilder { - /// Build CFG from function body - pub fn build_cfg(source: &str) -> Result<Cfg> { - // Re-parse source to get tree-sitter tree - let mut parser = tree_sitter::Parser::new(); - parser.set_language(&tree_sitter_<language>::LANGUAGE.into())?; - let tree = parser.parse(source, None) - .ok_or_else(|| anyhow::anyhow!("Failed to parse function body"))?; - - let root = tree.root_node(); - - // Initialize CFG - let mut cfg = Cfg::new(); - let entry = cfg.add_node(CfgNode::Entry); - let exit = cfg.add_node(CfgNode::Exit); - - // Build CFG from AST - let last = Self::visit_block(&mut cfg, root, entry, exit)?; - cfg.add_edge(last, exit); - - Ok(cfg) - } - - fn visit_block(cfg: &mut Cfg, node: tree_sitter::Node, entry: usize, exit: usize) -> Result<usize> { - let mut current = entry; - - // Iterate through statements in block - let mut cursor = node.walk(); - if cursor.goto_first_child() { - loop { - current = Self::visit_statement(cfg, cursor.node(), current, exit)?; 
- if 
!cursor.goto_next_sibling() { - break; - } - } - } - - Ok(current) - } - - fn visit_statement(cfg: &mut Cfg, node: tree_sitter::Node, entry: usize, exit: usize) -> Result<usize> { - match node.kind() { - "if_statement" => Self::visit_if(cfg, node, entry, exit), - "while_statement" => Self::visit_while(cfg, node, entry, exit), - "for_statement" => Self::visit_for(cfg, node, entry, exit), - "return_statement" => { - // Early return - route directly to exit - cfg.add_edge(entry, exit); - Ok(entry) // Dead code after return - } - _ => { - // Simple statement - single node - let stmt_node = cfg.add_node(CfgNode::Statement); - cfg.add_edge(entry, stmt_node); - Ok(stmt_node) - } - } - } - - fn visit_if(cfg: &mut Cfg, node: tree_sitter::Node, entry: usize, exit: usize) -> Result<usize> { - // Create condition node - let cond = cfg.add_node(CfgNode::Condition); - cfg.add_edge(entry, cond); - - // Then branch - let then_node = node.child_by_field_name("consequence").unwrap(); - let then_last = Self::visit_block(cfg, then_node, cond, exit)?; - - // Else branch (optional) - let join = cfg.add_node(CfgNode::Join); - if let Some(else_node) = node.child_by_field_name("alternative") { - let else_last = Self::visit_block(cfg, else_node, cond, exit)?; - cfg.add_edge(else_last, join); - } else { - cfg.add_edge(cond, join); // No else - fall through - } - - cfg.add_edge(then_last, join); - Ok(join) - } - - fn visit_while(cfg: &mut Cfg, node: tree_sitter::Node, entry: usize, exit: usize) -> Result<usize> { - // Loop header (condition) - let header = cfg.add_node(CfgNode::LoopHeader); - cfg.add_edge(entry, header); - - // Loop body - let body_node = node.child_by_field_name("body").unwrap(); - let body_last = Self::visit_block(cfg, body_node, header, exit)?; - - // Back edge to loop header - cfg.add_edge(body_last, header); - - // Loop exit - let join = cfg.add_node(CfgNode::Join); - cfg.add_edge(header, join); // Exit when condition false - - Ok(join) - } - - // ... implement visit_for, visit_switch, etc.
... -} - -impl crate::language::cfg_builder::CfgBuilder for <Language>CfgBuilder { - fn build_cfg(body: &crate::language::function_body::FunctionBody) -> Result<Cfg> { - if let Some((_, source)) = body.as_<language>() { - Self::build_cfg(source) - } else { - anyhow::bail!("Expected <Language> function body") - } - } -} -``` - -**Key Considerations:** -- **Control flow constructs:** Handle if, while, for, switch, try/catch -- **Early exits:** return, throw, break, continue route to appropriate nodes -- **Loop context:** Track loop headers for break/continue routing -- **Nesting depth:** Track depth during traversal for ND metric -- **Language-specific:** Handle language-specific constructs (e.g., Python else-on-loops, Go defer) - -#### 2.2 Register CFG Builder - -Edit `hotspots-core/src/language/cfg_builder.rs`: - -```rust -pub fn create_cfg_builder(body: &FunctionBody) -> Box<dyn CfgBuilder> { - match body { - // ... existing languages ... - FunctionBody::<Language> { .. } => Box::new(<language>::<Language>CfgBuilder), - } -} -``` - -#### 2.3 Update Analysis Dispatcher - -Edit `hotspots-core/src/analysis.rs`: - -```rust -pub fn create_parser(lang: Language) -> Result> { - match lang { - // ... existing languages ... 
- Language::<Language> => { - Ok(Box::new(<language>::<Language>Parser::new() - .context("Failed to create <Language> parser")?)) - } - } -} -``` - ---- - -### Step 3: Testing - -#### 3.1 Create Test Fixtures - -Create `tests/fixtures/<language>/` directory with test files: - -**simple.ext** - Basic functions: -```<language> -// Simple function (low complexity) -function simpleFunction(x) { - return x + 1; -} - -// Function with early return -function withEarlyReturn(x) { - if (x < 0) { - return 0; - } - return x; -} -``` - -**loops.ext** - Loop constructs: -```<language> -// While loop -function whileLoop(n) { - int i = 0; - while (i < n) { - i++; - } - return i; -} - -// For loop with break -function forLoopWithBreak(items[]) { - for (item in items) { - if (item > 10) { - break; - } - } - return items[0]; -} - -// Nested loops -function nestedLoops(matrix[][]) { - for (row in matrix) { - for (col in row) { - if (col == 0) { - continue; - } - } - } -} -``` - -**branching.ext** - Conditional logic: -```<language> -// If/else chains -function ifElseChain(value) { - if (value < 0) { - return "negative"; - } else if (value == 0) { - return "zero"; - } else { - return "positive"; - } -} - -// Switch statement -function switchStatement(value) { - switch (value) { - case 0: - return "zero"; - case 1: - return "one"; - default: - return "other"; - } -} -``` - -**Create 5-7 test files covering:** -- Simple functions (low complexity) -- Loops (while, for, do-while) -- Conditionals (if, else, switch) -- Early exits (return, throw) -- Nesting (nested if, nested loops) -- Language-specific constructs - -#### 3.2 Generate Golden Files - -```bash -# Build release binary -cargo build --release - -# Generate golden output for each fixture -./target/release/hotspots analyze tests/fixtures/<language>/simple.ext --format json > tests/golden/<language>-simple.json -./target/release/hotspots analyze tests/fixtures/<language>/loops.ext --format json > tests/golden/<language>-loops.json 
-./target/release/hotspots analyze tests/fixtures/<language>/branching.ext --format json > tests/golden/<language>-branching.json - -# Verify output manually -cat tests/golden/<language>-simple.json | jq . -``` - -**Golden file checklist:** -- ✅ Includes all expected functions -- ✅ Metrics (CC, ND, FO, NS) look correct -- ✅ LRS calculations are reasonable -- ✅ Risk bands are appropriate -- ✅ No parsing errors - -#### 3.3 Add Unit Tests - -Create `hotspots-core/tests/<language>_tests.rs`: - -```rust -#[cfg(test)] -mod <language>_tests { - use hotspots_core::language::<language>::<Language>Parser; - use hotspots_core::analyze_function; - - #[test] - fn test_simple_function() { - let source = r#" - function simple(x) { - return x + 1; - } - "#; - - let mut parser = <Language>Parser::new().unwrap(); - let functions = parser.discover_functions(source).unwrap(); - - assert_eq!(functions.len(), 1); - - let report = analyze_function(&functions[0], "test.ext").unwrap(); - assert_eq!(report.metrics.cc, 1); // No branches - assert_eq!(report.metrics.nd, 0); // No nesting - } - - #[test] - fn test_if_statement() { - let source = r#" - function withIf(x) { - if (x > 0) { - return x; - } - return 0; - } - "#; - - let mut parser = <Language>Parser::new().unwrap(); - let functions = parser.discover_functions(source).unwrap(); - let report = analyze_function(&functions[0], "test.ext").unwrap(); - - assert_eq!(report.metrics.cc, 2); // if adds +1 - assert_eq!(report.metrics.nd, 1); // one level deep - assert_eq!(report.metrics.ns, 1); // early return - } - - // Add more tests... 
-} -``` - -#### 3.4 Run Tests - -```bash -# Run all tests -cargo test - -# Run language-specific tests -cargo test <language> - -# Run with output -cargo test <language> -- --nocapture - -# Verify golden tests -cargo test --test golden_tests -``` - ---- - -### Step 4: Documentation - -#### 4.1 Update Language Support Docs - -Edit `docs/reference/language-support.md`: - -```markdown -## <Language> - -**Supported:** Yes (v1.x.x+) -**File Extensions:** `.<ext>` -**tree-sitter Parser:** `tree-sitter-<language>` - -### Function Detection - -- Function declarations -- Method declarations -- Class methods (if applicable) -- Lambda expressions / closures - -### Complexity Metrics - -**Cyclomatic Complexity (CC):** -- `if`, `else if` statements (+1 each) -- `while`, `for`, `do-while` loops (+1 each) -- `switch` cases (+1 per case) -- Ternary operators `? :` (+1) -- Logical operators `&&`, `||` (+1 each) -- Language-specific constructs... - -**Nesting Depth (ND):** -- Maximum depth of nested control structures - -**Fan-Out (FO):** -- Function/method calls -- Constructor calls - -**Non-Structured Exits (NS):** -- `return` statements -- `throw` statements -- `break` statements -- `continue` statements - -### Language-Specific Behavior - -**[Document unique behaviors]** - -Example: Python's else-on-loops, Go's defer statements, etc. - -### Example Analysis - -```<language> -function complexFunction(data) { - if (data.type === 'A') { - for (item in data.items) { - if (item.active) { - return processItem(item); - } - } - } - return null; -} -``` - -**Metrics:** -- CC: 4 (if type, for loop, if active, implicit +1) -- ND: 2 (if nested in for) -- FO: 1 (processItem call) -- NS: 1 (early return) -- LRS: ~4.8 (moderate risk) -``` - -#### 4.2 Update README.md - -Update supported languages list and examples. - -#### 4.3 Add Examples - -Create `examples/<language>/` with sample projects. 
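The `CC: 4` figure in the 4.1 example analysis can be cross-checked against the CFG formula, CC = E - N + 2 for a single connected graph. The sketch below is illustrative only: `SketchCfg` and `example_cfg` are hypothetical stand-ins written for this guide, not the `Cfg` type in `hotspots-core`, and the node layout is one plausible hand-built CFG for `complexFunction`.

```rust
/// Hypothetical stand-in for a CFG: nodes are numbered 0..node_count and
/// edges are directed (from, to) pairs. Not the hotspots-core `Cfg` type.
pub struct SketchCfg {
    node_count: usize,
    edges: Vec<(usize, usize)>,
}

impl SketchCfg {
    pub fn new() -> Self {
        SketchCfg { node_count: 0, edges: Vec::new() }
    }

    /// Adds a node and returns its index.
    pub fn add_node(&mut self) -> usize {
        let id = self.node_count;
        self.node_count += 1;
        id
    }

    /// Adds a directed edge between two existing nodes.
    pub fn add_edge(&mut self, from: usize, to: usize) {
        self.edges.push((from, to));
    }

    /// McCabe's formula for one connected graph: CC = E - N + 2.
    /// A connected graph has E >= N - 1, so the subtraction cannot underflow.
    pub fn cyclomatic_complexity(&self) -> usize {
        self.edges.len() + 2 - self.node_count
    }
}

/// Hand-built CFG for `complexFunction` from the example analysis.
pub fn example_cfg() -> SketchCfg {
    let mut cfg = SketchCfg::new();
    let entry = cfg.add_node();
    let if_type = cfg.add_node(); // if (data.type === 'A')
    let for_header = cfg.add_node(); // for (item in data.items)
    let if_active = cfg.add_node(); // if (item.active)
    let ret_item = cfg.add_node(); // return processItem(item)
    let ret_null = cfg.add_node(); // return null
    let exit = cfg.add_node();

    cfg.add_edge(entry, if_type);
    cfg.add_edge(if_type, for_header); // type is 'A'
    cfg.add_edge(if_type, ret_null); // type is not 'A'
    cfg.add_edge(for_header, if_active); // another item to visit
    cfg.add_edge(for_header, ret_null); // items exhausted
    cfg.add_edge(if_active, ret_item); // item is active
    cfg.add_edge(if_active, for_header); // inactive: back edge to the loop
    cfg.add_edge(ret_item, exit);
    cfg.add_edge(ret_null, exit);
    cfg
}
```

With 9 edges and 7 nodes, `example_cfg().cyclomatic_complexity()` evaluates to 4, which agrees with the manual count of three decision points plus the implicit +1.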
- ---- - -## Testing Checklist - -Before submitting a PR: - -- [ ] Parser correctly discovers all function types -- [ ] CFG builder handles all control flow constructs -- [ ] Metrics match manual calculation -- [ ] All unit tests pass -- [ ] Golden tests pass (deterministic output) -- [ ] No compilation warnings -- [ ] Code follows Rust conventions (`cargo fmt`, `cargo clippy`) -- [ ] Documentation updated -- [ ] Examples added - ---- - -## Common Pitfalls - -### 1. Non-Deterministic Output - -**Problem:** Functions appear in random order. - -**Solution:** Always sort by source position: -```rust -functions.sort_by_key(|f| f.start_position()); -``` - -### 2. Missing Function Types - -**Problem:** Only detecting some functions (e.g., missing methods, closures). - -**Solution:** Check all function node kinds in tree-sitter grammar: -```bash -# Inspect grammar -tree-sitter parse examples/code.ext --debug -``` - -### 3. Incorrect CC Calculation - -**Problem:** CC doesn't match manual count. - -**Solution:** Debug CFG visualization, check: -- All decision points counted (+1 for each) -- No double-counting -- Implicit +1 for function entry - -### 4. Break/Continue Routing - -**Problem:** Break/continue go to wrong CFG nodes. - -**Solution:** Track loop context stack: -```rust -struct LoopContext { - header: usize, - exit: usize, -} - -let mut loop_stack: Vec<LoopContext> = Vec::new(); -``` - -### 5. Parser Crashes - -**Problem:** Parser panics on certain code. - -**Solution:** Add error handling: -```rust -let body_node = node.child_by_field_name("body") - .ok_or_else(|| anyhow::anyhow!("Missing function body"))?; -``` - ---- - -## Reference Implementations - -**Good examples to study:** -1. **Go** (`hotspots-core/src/language/go/`) - Simple, clean implementation -2. **Python** (`hotspots-core/src/language/python/`) - Complex language features -3.
**Rust** (`hotspots-core/src/language/rust/`) - Pattern matching, complex control flow - ---- - -## Getting Help - -- 💬 [GitHub Discussions](https://github.com/Stephen-Collins-tech/hotspots/discussions) -- 📧 [Open an Issue](https://github.com/Stephen-Collins-tech/hotspots/issues) -- 📖 [Architecture Docs](../architecture/overview.md) - ---- - -## After Implementation - -Once language support is complete: - -1. **Open PR** with all changes -2. **Update CHANGELOG.md** -3. **Announce** in discussions -4. **Add to roadmap** for future enhancements - -**Congratulations!** You've added language support to Hotspots. 🎉 diff --git a/docs/contributing/development.md b/docs/contributing/development.md deleted file mode 100644 index b195774..0000000 --- a/docs/contributing/development.md +++ /dev/null @@ -1,594 +0,0 @@ -# Development Setup - -Complete guide to setting up your development environment for Hotspots. - -## Prerequisites - -### Required - -- **Rust 1.75+** - Core implementation language -- **Git** - Version control -- **Cargo** - Comes with Rust - -### Optional (for full development) - -- **Node.js 18+** - For GitHub Action development -- **npm 8+** - For JavaScript packages -- **jq** - For JSON validation and testing - -## Quick Start - -### 1. Clone Repository - -```bash -# Fork the repository first (on GitHub) -# Then clone your fork -git clone https://github.com/YOUR_USERNAME/hotspots.git -cd hotspots -``` - -### 2. Install Rust (if needed) - -```bash -# Install rustup (Rust version manager) -curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh - -# Verify installation -rustc --version # Should be 1.75 or higher -cargo --version -``` - -### 3. Build the Project - -```bash -# Build in debug mode (faster compilation) -cargo build - -# Build in release mode (optimized, slower compilation) -cargo build --release -``` - -**Output:** -- Debug binary: `./target/debug/hotspots` -- Release binary: `./target/release/hotspots` - -### 4. 
Run Tests - -```bash -# Run all tests -cargo test - -# Run specific package tests -cargo test --package hotspots-core -cargo test --package hotspots-cli - -# Run with output visible -cargo test -- --nocapture -``` - -### 5. Run the CLI - -```bash -# Using debug build -./target/debug/hotspots --help -./target/debug/hotspots analyze src/ - -# Using release build (faster) -./target/release/hotspots --help -./target/release/hotspots analyze src/ -``` - ---- - -## Project Structure - -``` -hotspots/ -├── hotspots-core/ # Core analysis library (Rust) -│ ├── src/ -│ │ ├── language/ # Language parsers & CFG builders -│ │ │ ├── typescript/ -│ │ │ ├── javascript/ -│ │ │ ├── go/ -│ │ │ ├── java/ -│ │ │ ├── python/ -│ │ │ └── rust/ -│ │ ├── metrics.rs # Metrics calculation (CC, ND, FO, NS) -│ │ ├── delta.rs # Delta mode logic -│ │ ├── snapshot.rs # Snapshot persistence -│ │ ├── policy.rs # Policy evaluation -│ │ ├── config.rs # Configuration loading -│ │ └── ... -│ └── tests/ # Integration & golden tests -│ ├── fixtures/ # Test code files -│ └── golden/ # Expected output files -│ -├── hotspots-cli/ # CLI binary (Rust) -│ └── src/ -│ └── main.rs # CLI entry point, argument parsing -│ -├── packages/ # TypeScript packages -│ └── types/ # TypeScript type definitions -│ └── package.json -│ -├── action/ # GitHub Action -│ ├── action.yml -│ └── dist/ -│ -├── docs/ # Documentation (VitePress) -│ ├── .vitepress/ -│ ├── index.md -│ └── ... -│ -├── examples/ # Example code and integrations -│ └── ai-agents/ -│ -├── tests/ # Repository-level tests -│ └── fixtures/ -│ -├── Cargo.toml # Workspace configuration -├── CLAUDE.md # Coding conventions -├── CONTRIBUTING.md # Contribution guide -└── README.md # Project overview -``` - ---- - -## Development Workflow - -### Making Changes - -1. **Create a feature branch:** - ```bash - git checkout -b feature/your-feature-name - ``` - -2. 
**Make your changes** in the appropriate directory - - Core logic: `hotspots-core/src/` - - CLI: `hotspots-cli/src/` - - Docs: `docs/` - -3. **Run formatting and linting:** - ```bash - cargo fmt - cargo clippy - ``` - -4. **Build and test:** - ```bash - cargo build - cargo test - ``` - -5. **Commit your changes:** - ```bash - git add . - git commit -m "feat: add your feature description" - ``` - -See [CLAUDE.md](../../CLAUDE.md) for commit message conventions. - -### Testing Your Changes - -#### Unit Tests - -```bash -# Run all tests -cargo test - -# Run specific module tests -cargo test metrics -cargo test snapshot - -# Run with output visible -cargo test test_name -- --nocapture - -# Run tests for a specific crate -cargo test --package hotspots-core -``` - -#### Integration Tests - -```bash -# Run integration tests -cargo test --test integration_tests - -# Run golden tests (deterministic output verification) -cargo test --test golden_tests -``` - -#### Manual Testing - -```bash -# Test on a real file -./target/debug/hotspots analyze tests/fixtures/typescript/simple.ts - -# Test JSON output -./target/debug/hotspots analyze src/ --format json | jq . - -# Test snapshot mode -./target/debug/hotspots analyze src/ --mode snapshot --format json - -# Test delta mode -./target/debug/hotspots analyze src/ --mode delta --policy --format text -``` - -### Adding Test Fixtures - -When adding new functionality: - -1. **Create test fixture** in `tests/fixtures/<language>/` - ```bash - echo 'function test() { if (x) { return 1; } return 0; }' > tests/fixtures/typescript/new_test.ts - ``` - -2. **Generate golden output:** - ```bash - cargo build --release - ./target/release/hotspots analyze tests/fixtures/typescript/new_test.ts --format json > tests/golden/new_test.json - ``` - -3. **Verify output manually** before committing - -4.
**Add golden test case** to `tests/golden_tests.rs` - ---- - -## IDE Setup - -### VS Code - -**Recommended Extensions:** -- `rust-lang.rust-analyzer` - Rust language server -- `tamasfe.even-better-toml` - TOML syntax highlighting -- `vadimcn.vscode-lldb` - Debugger - -**Settings (`.vscode/settings.json`):** -```json -{ - "rust-analyzer.checkOnSave.command": "clippy", - "editor.formatOnSave": true, - "[rust]": { - "editor.defaultFormatter": "rust-lang.rust-analyzer" - } -} -``` - -**Launch Configuration (`.vscode/launch.json`):** -```json -{ - "version": "0.2.0", - "configurations": [ - { - "type": "lldb", - "request": "launch", - "name": "Debug hotspots", - "cargo": { - "args": ["build", "--bin=hotspots", "--package=hotspots-cli"] - }, - "args": ["analyze", "tests/fixtures/typescript/simple.ts"], - "cwd": "${workspaceFolder}" - } - ] -} -``` - -### Cursor - -Cursor inherits VS Code configuration. Same extensions and settings work. - -**Additional for AI-assisted development:** -- Use Hotspots MCP server for complexity analysis during development -- Configure `.cursorrules` for project-specific AI guidelines - -### IntelliJ IDEA / CLion - -**Recommended Plugins:** -- Rust (official JetBrains plugin) -- TOML - -**Run Configuration:** -- Name: `Hotspots CLI` -- Command: `run` -- Arguments: `--bin hotspots -- analyze tests/fixtures/typescript/simple.ts` - ---- - -## Building Components - -### Core Library - -```bash -# Build core library only -cargo build --package hotspots-core - -# Run core tests only -cargo test --package hotspots-core - -# Build with specific features (if any) -cargo build --package hotspots-core --features "feature-name" -``` - -### CLI Binary - -```bash -# Build CLI only -cargo build --package hotspots-cli - -# Run CLI directly with cargo -cargo run --package hotspots-cli -- analyze src/ - -# Install locally for testing -cargo install --path hotspots-cli -hotspots --version -``` - -### GitHub Action - -```bash -cd action - -# Install dependencies 
-npm install - -# Build -npm run build - -# Test locally (requires act) -act pull_request -``` - ---- - -## Debugging - -### Rust Debugging with LLDB - -```bash -# Debug with lldb -rust-lldb ./target/debug/hotspots - -# In lldb: -(lldb) run analyze tests/fixtures/typescript/simple.ts -(lldb) breakpoint set --name analyze_with_config -(lldb) continue -``` - -### Print Debugging - -```rust -// Use dbg! macro for quick debugging -dbg!(&some_variable); - -// Use eprintln! to print to stderr (doesn't pollute stdout) -eprintln!("Debug: value = {}", value); -``` - -### Environment Variables - -```bash -# Enable Rust backtrace -RUST_BACKTRACE=1 cargo test - -# Full backtrace -RUST_BACKTRACE=full cargo test - -# Enable logging (if using env_logger) -RUST_LOG=debug cargo run -- analyze src/ -``` - ---- - -## Performance Profiling - -### Cargo Flamegraph - -```bash -# Install flamegraph -cargo install flamegraph - -# Generate flamegraph -cargo flamegraph --bin hotspots -- analyze large-project/ - -# Output: flamegraph.svg -``` - -### Cargo Bench - -```bash -# Run benchmarks (if configured) -cargo bench - -# Run specific benchmark -cargo bench benchmark_name -``` - -### Manual Timing - -```bash -# Use time command -time ./target/release/hotspots analyze src/ - -# Or hyperfine for statistical analysis -brew install hyperfine # macOS -hyperfine './target/release/hotspots analyze src/' -``` - ---- - -## Common Tasks - -### Update Dependencies - -```bash -# Check for outdated dependencies -cargo outdated - -# Update dependencies -cargo update - -# Update specific dependency -cargo update -p tree-sitter -``` - -### Clean Build - -```bash -# Remove build artifacts -cargo clean - -# Rebuild from scratch -cargo build --release -``` - -### Generate Documentation - -```bash -# Generate Rust docs -cargo doc --open - -# Build VitePress docs -cd docs -npm install -npm run docs:dev # Development server -npm run docs:build # Production build -``` - -### Run Linters - -```bash -# Format code 
-cargo fmt - -# Check formatting without changing files -cargo fmt -- --check - -# Run clippy (linter) -cargo clippy - -# Clippy with all warnings -cargo clippy -- -W clippy::all - -# Fix clippy suggestions automatically -cargo clippy --fix -``` - ---- - -## Troubleshooting - -### "cargo: command not found" - -**Solution:** Install Rust via rustup: -```bash -curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -source $HOME/.cargo/env -``` - -### "linker `cc` not found" - -**Solution (macOS):** -```bash -xcode-select --install -``` - -**Solution (Linux):** -```bash -# Debian/Ubuntu -sudo apt-get install build-essential - -# Fedora -sudo dnf install gcc -``` - -### "could not compile `tree-sitter-xyz`" - -**Solution:** Update tree-sitter parsers: -```bash -cargo update -cargo clean -cargo build -``` - -### Tests Failing - -```bash -# Run specific test to see output -cargo test test_name -- --nocapture - -# Check if golden files are outdated -# Regenerate golden files if needed: -./target/release/hotspots analyze tests/fixtures/typescript/simple.ts --format json > tests/golden/simple.json -``` - -### Slow Compilation - -**Solutions:** -- Use debug builds during development: `cargo build` (not `--release`) -- Enable incremental compilation (already default) -- Use `cargo check` instead of `cargo build` for quick error checking -- Consider using `sccache` for distributed compilation caching - -```bash -# Quick error checking (no binary output) -cargo check - -# Install sccache -cargo install sccache -export RUSTC_WRAPPER=sccache -``` - ---- - -## Continuous Integration - -Our CI runs: - -1. **Format Check:** `cargo fmt -- --check` -2. **Linting:** `cargo clippy -- -D warnings` -3. **Tests:** `cargo test` -4. 
**Build:** `cargo build --release` - -**Run locally before pushing:** -```bash -cargo fmt -- --check && cargo clippy -- -D warnings && cargo test && cargo build --release -``` - ---- - -## Release Builds - -For release builds with maximum optimization: - -```bash -# Build optimized binary -cargo build --release - -# Strip debug symbols (smaller binary) -strip ./target/release/hotspots - -# Check binary size -ls -lh ./target/release/hotspots -``` - ---- - -## Related Documentation - -- [Adding Language Support](./adding-languages.md) - Implement a new language parser -- [Architecture Overview](../architecture/overview.md) - System design and components -- [Testing Strategy](../architecture/testing.md) - Testing approach and patterns -- [Release Process](./releases.md) - How to create releases -- [CLAUDE.md](../../CLAUDE.md) - Coding conventions and rules - ---- - -## Getting Help - -- 💬 [GitHub Discussions](https://github.com/Stephen-Collins-tech/hotspots/discussions) -- 📧 [Open an Issue](https://github.com/Stephen-Collins-tech/hotspots/issues) -- 📖 [Documentation](https://docs.hotspots.dev) - ---- - -**Ready to contribute?** Check out [good first issues](https://github.com/Stephen-Collins-tech/hotspots/labels/good%20first%20issue) to get started! diff --git a/docs/contributing/index.md b/docs/contributing/index.md deleted file mode 100644 index 235db41..0000000 --- a/docs/contributing/index.md +++ /dev/null @@ -1,245 +0,0 @@ -# Contributing to Hotspots - -Thank you for your interest in contributing! This guide covers everything you need to get started. - -## Prerequisites - -- **Rust 1.75+** — Core implementation language (`rustc --version`) -- **Git** -- **Cargo** — Comes with Rust - -Optional for full development: -- **Node.js 18+** — For GitHub Action development -- **jq** — For JSON validation and testing - -## Quick Setup - -```bash -# 1. 
Fork the repository on GitHub, then clone your fork -git clone https://github.com/YOUR_USERNAME/hotspots.git -cd hotspots - -# 2. Install Rust if needed -curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh - -# 3. Build -cargo build - -# 4. Run tests -cargo test - -# 5. Verify the CLI works -./target/debug/hotspots analyze tests/fixtures/typescript/simple.ts -``` - ---- - -## Project Structure - -``` -hotspots/ -├── hotspots-core/ # Core analysis library (Rust) -│ ├── src/ -│ │ ├── language/ # Language parsers & CFG builders -│ │ │ ├── typescript/ -│ │ │ ├── javascript/ -│ │ │ ├── go/ -│ │ │ ├── java/ -│ │ │ ├── python/ -│ │ │ └── rust/ -│ │ ├── metrics.rs # CC, ND, FO, NS calculation -│ │ ├── delta.rs # Delta mode logic -│ │ ├── snapshot.rs # Snapshot persistence -│ │ ├── policy.rs # Policy evaluation -│ │ └── config.rs # Configuration loading -│ └── tests/ -│ ├── fixtures/ # Test code files -│ └── golden/ # Expected output files -│ -├── hotspots-cli/ # CLI binary (Rust) -│ └── src/main.rs # CLI entry point -│ -├── packages/ # TypeScript packages -│ └── types/ # TypeScript type definitions -│ -├── action/ # GitHub Action -├── docs/ # VitePress documentation -├── Cargo.toml # Workspace configuration -├── CLAUDE.md # Coding conventions -└── CONTRIBUTING.md # Contribution guide -``` - ---- - -## Development Workflow - -### Making Changes - -```bash -# 1. Create a feature branch -git checkout -b feature/your-feature-name - -# 2. Make changes in the appropriate directory - -# 3. Check formatting and linting -cargo fmt --all -- --check -cargo clippy --all-targets --all-features -- -D warnings - -# 4. Build and test -cargo build -cargo test - -# 5. Commit (see commit conventions below) -git add . 
-git commit -m "feat: your feature description" -``` - -### Commit Conventions - -All commit messages must be a single line, under 72 characters: - -``` -<type>: <description> -``` - -**Types:** `feat`, `fix`, `refactor`, `test`, `docs`, `chore` - -**Examples:** -``` -feat: add suppression comments support -fix: correct ND calculation for try blocks -docs: update quick start for React projects -``` - -Never include a multi-paragraph body unless explicitly requested. - -### Before Committing - -Run these checks; they are enforced by pre-commit hooks: - -```bash -cargo fmt --all -- --check -cargo clippy --all-targets --all-features -- -D warnings -cargo test -``` - ---- - -## Running Tests - -```bash -# All tests -cargo test - -# Specific package -cargo test --package hotspots-core - -# With visible output -cargo test -- --nocapture - -# Golden tests (deterministic output verification) -cargo test --test golden_tests - -# Integration tests -cargo test --test integration_tests -``` - -### Adding Test Fixtures - -When adding new functionality: - -```bash -# 1. Create a test fixture -echo 'function test() { if (x) { return 1; } return 0; }' \ - > tests/fixtures/typescript/new_test.ts - -# 2. Generate golden output -cargo build --release -./target/release/hotspots analyze tests/fixtures/typescript/new_test.ts \ - --format json > tests/golden/new_test.json - -# 3. Verify output manually, then add a golden test case in tests/golden_tests.rs -``` - ---- - -## Adding a Language - -Adding a new language parser involves: - -1. **Create `hotspots-core/src/language/<language>/`** with: - - `parser.rs`: Tree-sitter or syn-based parser - - `cfg_builder.rs`: Control Flow Graph builder - -2. **Register in `hotspots-core/src/metrics.rs`**: add an `extract_<language>_metrics()` function - -3. **Add file extensions** to the language detection logic - -4. **Add test fixtures** in `tests/fixtures/<language>/` with golden files - -5.
**Implement metric semantics** consistent with existing languages: - - CC via CFG formula, with language-specific increments (switch cases, catch clauses, boolean ops) - - ND as maximum depth of control structures - - FO as distinct function calls (deduplicated) - - NS as early exits excluding final tail return - -See `hotspots-core/src/language/go/` for a well-documented example. Full guide: [adding-languages.md](./adding-languages.md). - ---- - -## CI - -Our CI runs on every PR: - -1. `cargo fmt --all -- --check` — Format check -2. `cargo clippy --all-targets --all-features -- -D warnings` — Linting -3. `cargo test` — Full test suite -4. `cargo build --release` — Release build - -Run locally before pushing: -```bash -cargo fmt --all -- --check && \ -cargo clippy --all-targets --all-features -- -D warnings && \ -cargo test && \ -cargo build --release -``` - ---- - -## Release Checklist - -When cutting a release: - -- [ ] Update version in `Cargo.toml` -- [ ] Update version in `action/package.json` -- [ ] Update `CHANGELOG.md` with release notes -- [ ] Run full CI checks locally -- [ ] Tag the release: `git tag v1.x.x` -- [ ] Build binaries for all platforms (Linux x86_64, macOS x86_64, macOS ARM64, Windows x86_64) -- [ ] Create GitHub release with binaries attached -- [ ] Test the GitHub Action with the new release tag -- [ ] Update documentation if needed - -**Building release binaries:** - -```bash -# macOS ARM64 (Apple Silicon) -cargo build --release --target aarch64-apple-darwin -tar -czf hotspots-darwin-aarch64.tar.gz -C target/aarch64-apple-darwin/release hotspots - -# Linux x86_64 -cargo build --release --target x86_64-unknown-linux-gnu -tar -czf hotspots-linux-x86_64.tar.gz -C target/x86_64-unknown-linux-gnu/release hotspots -``` - -Full release process: [releases.md](./releases.md). 
- ---- - -## Getting Help - -- [GitHub Discussions](https://github.com/Stephen-Collins-tech/hotspots/discussions) — Questions and ideas -- [Open an Issue](https://github.com/Stephen-Collins-tech/hotspots/issues) — Bug reports and feature requests -- [Good First Issues](https://github.com/Stephen-Collins-tech/hotspots/labels/good%20first%20issue) — Start here - -See also: [CLAUDE.md](../../CLAUDE.md) for detailed coding conventions, [CONTRIBUTING.md](../../CONTRIBUTING.md) in the repository root. diff --git a/docs/contributing/releases.md b/docs/contributing/releases.md deleted file mode 100644 index eda9251..0000000 --- a/docs/contributing/releases.md +++ /dev/null @@ -1,280 +0,0 @@ -# Hotspots Release Process - -This document describes how to create releases for hotspots, including building binaries for the GitHub Action. - -## Prerequisites - -- Rust toolchain installed -- GitHub CLI (`gh`) installed and authenticated -- Write access to the repository - -## Release Checklist - -- [ ] Update version in `Cargo.toml` -- [ ] Update version in `action/package.json` -- [ ] Update `CHANGELOG.md` with release notes -- [ ] Build binaries for all platforms -- [ ] Create GitHub release with binaries -- [ ] Test GitHub Action with new release -- [ ] Update documentation - -## Building Release Binaries - -### Method 1: Using GitHub Actions (Recommended) - -We'll create a release workflow that builds for all platforms automatically. 
- -### Method 2: Manual Cross-Compilation - -#### Linux x86_64 - -```bash -cargo build --release --target x86_64-unknown-linux-gnu -strip target/x86_64-unknown-linux-gnu/release/hotspots -tar -czf hotspots-linux-x86_64.tar.gz \ - -C target/x86_64-unknown-linux-gnu/release hotspots -``` - -#### macOS x86_64 (Intel) - -```bash -cargo build --release --target x86_64-apple-darwin -strip target/x86_64-apple-darwin/release/hotspots -tar -czf hotspots-darwin-x86_64.tar.gz \ - -C target/x86_64-apple-darwin/release hotspots -``` - -#### macOS ARM64 (Apple Silicon) - -```bash -cargo build --release --target aarch64-apple-darwin -strip target/aarch64-apple-darwin/release/hotspots -tar -czf hotspots-darwin-aarch64.tar.gz \ - -C target/aarch64-apple-darwin/release hotspots -``` - -#### Windows x86_64 - -```bash -cargo build --release --target x86_64-pc-windows-msvc -# Or cross-compile from Linux: -cargo build --release --target x86_64-pc-windows-gnu -zip hotspots-windows-x86_64.zip \ - target/x86_64-pc-windows-*/release/hotspots.exe -``` - -## Creating a Release - -### 1. Update Version Numbers - -```bash -# Update Cargo.toml -sed -i '' 's/version = ".*"/version = "1.0.0"/' Cargo.toml - -# Update action/package.json -sed -i '' 's/"version": ".*"/"version": "1.0.0"/' action/package.json - -# Update CHANGELOG.md -echo "## [1.0.0] - $(date +%Y-%m-%d)" >> CHANGELOG.md -``` - -### 2. Build Action - -```bash -cd action -npm install -npm run package -git add dist/ -cd .. -``` - -### 3. Commit and Tag - -```bash -git add Cargo.toml action/package.json CHANGELOG.md action/dist/ -git commit -m "chore: release v1.0.0" -git tag v1.0.0 -git push origin main -git push origin v1.0.0 -``` - -### 4. 
Create GitHub Release with Binaries - -```bash -# Create release -gh release create v1.0.0 \ - --title "v1.0.0" \ - --notes-file CHANGELOG.md \ - hotspots-linux-x86_64.tar.gz \ - hotspots-darwin-x86_64.tar.gz \ - hotspots-darwin-aarch64.tar.gz \ - hotspots-windows-x86_64.zip -``` - -## Automated Release Workflow - -Create `.github/workflows/release.yml`: - -```yaml -name: Release - -on: - push: - tags: - - 'v*' - -permissions: - contents: write - -jobs: - build-binaries: - name: Build ${{ matrix.target }} - runs-on: ${{ matrix.os }} - strategy: - matrix: - include: - - os: ubuntu-latest - target: x86_64-unknown-linux-gnu - archive: tar.gz - - os: macos-latest - target: x86_64-apple-darwin - archive: tar.gz - - os: macos-latest - target: aarch64-apple-darwin - archive: tar.gz - - os: windows-latest - target: x86_64-pc-windows-msvc - archive: zip - - steps: - - uses: actions/checkout@v4 - - - name: Install Rust - uses: dtolnay/rust-toolchain@stable - with: - targets: ${{ matrix.target }} - - - name: Build - run: cargo build --release --target ${{ matrix.target }} --bin hotspots - - - name: Create archive (Unix) - if: matrix.archive == 'tar.gz' - run: | - cd target/${{ matrix.target }}/release - tar -czf ../../../hotspots-${{ matrix.target }}.${{ matrix.archive }} hotspots - cd ../../.. - - - name: Create archive (Windows) - if: matrix.archive == 'zip' - shell: pwsh - run: | - cd target/${{ matrix.target }}/release - Compress-Archive -Path hotspots.exe -DestinationPath ../../../hotspots-${{ matrix.target }}.${{ matrix.archive }} - cd ../../.. 
- - - name: Upload artifact - uses: actions/upload-artifact@v4 - with: - name: hotspots-${{ matrix.target }} - path: hotspots-${{ matrix.target }}.${{ matrix.archive }} - - create-release: - name: Create Release - needs: build-binaries - runs-on: ubuntu-latest - - steps: - - uses: actions/checkout@v4 - - - name: Download artifacts - uses: actions/download-artifact@v4 - with: - path: artifacts - - - name: Create Release - uses: softprops/action-gh-release@v1 - with: - files: artifacts/**/* - generate_release_notes: true - draft: false - prerelease: false -``` - -## Post-Release Testing - -After creating a release, test the GitHub Action: - -```yaml -# In a test repository -- uses: yourorg/hotspots@v1.0.0 -``` - -Verify: -- [ ] Binary downloads successfully -- [ ] Analysis runs correctly -- [ ] PR comments are posted -- [ ] HTML report is generated -- [ ] Job summary is displayed - -## Version Numbering - -We follow [Semantic Versioning](https://semver.org/): - -- **MAJOR** (1.0.0): Breaking API changes -- **MINOR** (0.1.0): New features, backwards compatible -- **PATCH** (0.0.1): Bug fixes, backwards compatible - -## Release Cadence - -- **Patch releases**: As needed for bug fixes -- **Minor releases**: Monthly for new features -- **Major releases**: Annually or when breaking changes are necessary - -## Changelog Format - -Follow [Keep a Changelog](https://keepachangelog.com/): - -```markdown -## [1.0.0] - 2026-02-04 - -### Added -- GitHub Action for CI/CD integration -- HTML report generation -- Proactive warning system - -### Changed -- Improved policy engine performance - -### Fixed -- CFG routing for break/continue statements -``` - -## Rolling Back a Release - -If a release has critical bugs: - -```bash -# Delete the release -gh release delete v1.0.0 --yes - -# Delete the tag -git tag -d v1.0.0 -git push origin :refs/tags/v1.0.0 - -# Create a new patch release with fixes -``` - -## Updating the Action After Release - -Users can reference the action by major 
version: - -```yaml -- uses: yourorg/hotspots@v1 # Automatically uses latest v1.x.x -``` - -To update the major version pointer: - -```bash -git tag -fa v1 -m "Update v1 to v1.2.0" -git push origin v1 --force -``` diff --git a/docs/diff-command.md b/docs/diff-command.md deleted file mode 100644 index ea22dfd..0000000 --- a/docs/diff-command.md +++ /dev/null @@ -1,204 +0,0 @@ -# `hotspots diff` — Feature Requirements - -## Overview - -`hotspots diff ` shows only the functions whose LRS or CC changed between two git refs. It is the PR-focused complement to `hotspots analyze --mode delta`, which compares HEAD against its immediate parent. - -Intended workflows: -- **Local review:** `hotspots diff main HEAD` before opening a PR -- **CI/PR check:** `hotspots diff $BASE_SHA $HEAD_SHA` in GitHub Actions -- **Historical comparison:** `hotspots diff v1.0.0 v2.0.0` - ---- - -## CLI Interface - -``` -hotspots diff [OPTIONS] - -Arguments: - Git ref for the baseline (branch name, tag, or SHA) - Git ref for the comparison point (branch name, tag, or SHA) - -Options: - --format Output format: text (default), json, html [default: text] - --output Write output to file instead of stdout - --policy Evaluate policy rules; exit 1 on blocking failures - --auto-analyze Analyze missing refs automatically using git worktrees - --top Limit output to top N changed functions (by |ΔLRS|) - --config Path to hotspots config file -``` - -### Argument forms accepted for `` and `` - -- Branch names: `main`, `origin/main` -- Tags: `v1.2.0` -- Commit SHAs (full or abbreviated): `abc1234`, `abc1234def5678` -- Relative refs: `HEAD`, `HEAD~1`, `HEAD~3` - -Resolution: all refs are passed through `git rev-parse ` to obtain the canonical full SHA before snapshot lookup. - ---- - -## Behavior - -### Snapshot lookup - -Both refs are resolved to full SHAs. Snapshots are loaded from `.hotspots/snapshots/.json.zst`. 
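The lookup described above can be sketched in Python. This is an illustrative sketch, not the project's Rust implementation. It assumes the snapshot file is named after the resolved full SHA (the placeholder in the path above is elided, so `<sha>.json.zst` is an assumption). The helper names `snapshot_path` and `missing_snapshots` are hypothetical, and the `--verify` flag and `^{commit}` suffix are additions chosen for this sketch. The `run` parameter exists only so the code can be exercised without a real repository.

```python
import subprocess
from pathlib import Path


def resolve_ref_to_sha(repo_root, ref, run=subprocess.run):
    """Resolve a branch, tag, or abbreviated SHA to a full commit SHA."""
    # `--verify` makes git fail on refs that do not resolve;
    # `^{commit}` peels annotated tags down to the commit they point at.
    result = run(
        ["git", "-C", str(repo_root), "rev-parse", "--verify", f"{ref}^{{commit}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def snapshot_path(repo_root, sha):
    """Assumed layout: one compressed snapshot file per full commit SHA."""
    return Path(repo_root) / ".hotspots" / "snapshots" / f"{sha}.json.zst"


def missing_snapshots(repo_root, refs, run=subprocess.run):
    """Return (ref, sha) pairs whose snapshot file does not exist yet."""
    missing = []
    for ref in refs:
        sha = resolve_ref_to_sha(repo_root, ref, run=run)
        if not snapshot_path(repo_root, sha).is_file():
            missing.append((ref, sha))
    return missing
```

Calling `missing_snapshots(".", ["main", "HEAD"])` inside a repository returns the refs that still need analysis, which is the information the error output and `--auto-analyze` both depend on.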
- -**If a snapshot is missing (default — no `--auto-analyze`):** - -Print a clear error for each missing ref and exit non-zero: - -``` -error: no snapshot found for base ref 'main' (abc1234ef...) - → run: git checkout main && hotspots analyze - -error: no snapshot found for head ref 'HEAD' (def5678ab...) - → run: git checkout def5678ab && hotspots analyze - -Once both snapshots exist, re-run: hotspots diff main HEAD -``` - -**If `--auto-analyze` is set:** - -For each missing snapshot, analyze that ref in an isolated git worktree: - -1. `git worktree add ` -2. Run the same analysis that `hotspots analyze` would run, using the main repo's config -3. Persist the resulting snapshot to the main repo's `.hotspots/snapshots/.json.zst` -4. `git worktree remove --force ` - -Progress is printed to stderr: -``` -[hotspots] no snapshot for 'main' (abc1234) — analyzing in temp worktree... -[hotspots] no snapshot for 'HEAD' (def5678) — analyzing in temp worktree... -[hotspots] computing diff... -``` - -The current working tree is never modified. If worktree creation or analysis fails, any created worktrees are cleaned up before exiting. - -### Delta computation - -Once both snapshots are loaded, call `Delta::new(head_snapshot, Some(&base_snapshot))`. This is the same engine used by `hotspots analyze --mode delta`. - -The delta contains: -- `New` — functions present in head but not base -- `Deleted` — functions present in base but not head -- `Modified` — functions present in both with changed metrics -- `Unchanged` — functions present in both with identical metrics - -### Filtering - -By default, `Unchanged` functions are omitted from output. All other statuses are shown. - -`--top N` retains only the N functions with the largest `|ΔLRS|` (absolute value), after the `Unchanged` filter. 
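The status classification and filtering rules above can be sketched as follows. This is a simplified Python illustration, not the `Delta` engine itself. It compares only LRS (the real engine also considers other metrics such as CC). The `FunctionDelta` type and function names are hypothetical. As an assumption, a function missing from one side is treated as LRS 0.0, so a New function ranks by its LRS in head and a Deleted one by its LRS in base.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FunctionDelta:
    name: str
    before_lrs: Optional[float]  # None when the function is absent from base
    after_lrs: Optional[float]   # None when the function is absent from head

    @property
    def status(self) -> str:
        if self.before_lrs is None:
            return "New"
        if self.after_lrs is None:
            return "Deleted"
        return "Unchanged" if self.before_lrs == self.after_lrs else "Modified"

    @property
    def abs_delta_lrs(self) -> float:
        # Assumption: a missing side counts as 0.0.
        return abs((self.after_lrs or 0.0) - (self.before_lrs or 0.0))


def compute_delta(base, head):
    """base and head map function name -> LRS for each snapshot."""
    names = sorted(set(base) | set(head))
    return [FunctionDelta(n, base.get(n), head.get(n)) for n in names]


def changed(deltas, top=None):
    """Drop Unchanged rows, then optionally keep the top N by |delta LRS|."""
    rows = [d for d in deltas if d.status != "Unchanged"]
    if top is not None:
        rows = sorted(rows, key=lambda d: d.abs_delta_lrs, reverse=True)[:top]
    return rows
```

Applying the `Unchanged` filter before truncation matters: with `--top 2`, an unchanged function can never take a slot away from a function that actually moved.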
- -### Output formats - -| Format | Content | -|--------|---------| -| `text` (default) | Human-readable table of changed functions, printed to stdout | -| `json` | Full `Delta` struct as pretty-printed JSON | -| `jsonl` | One JSON object per changed function, newline-delimited | -| `html` | Interactive HTML report via `render_html_delta()` — same as delta mode | -| `sarif` | Not yet implemented for diff — returns an error (deferred to a future release) | - -Text format columns: `STATUS`, `FUNCTION`, `FILE`, `LRS (before→after)`, `CC (before→after)`, `BAND`. Note: LINE is not available in text output because `FunctionState` does not carry a line number; use `--format json` to get line numbers. - -`--top N` limits the number of functions shown across all formats. Selection uses a risk-aware sort key: New functions rank by `after.lrs`, Deleted by `before.lrs`, Modified by `|Δlrs|`. This ensures a newly-introduced critical function is never buried below a trivial modification. - -### Policy evaluation - -When `--policy ` is provided, policy rules are evaluated against the delta (same as `hotspots analyze --mode delta --policy`). Exit code 1 if any blocking failures. - -### Exit codes - -| Code | Meaning | -|------|---------| -| 0 | Success; no policy failures | -| 1 | Policy failure (blocking rule violated) | -| 2 | Usage error (bad args, ref doesn't resolve, worktree failure) | -| 3 | Snapshot not found — re-run with `--auto-analyze` to generate missing snapshots | - -Exit code 3 is intentionally distinct from 2 so CI scripts can detect "snapshots not yet generated" and either fail with a clear message or retry with `--auto-analyze`. - ---- - -## `--auto-analyze` Details - -### Worktree lifecycle - -``` -tmp_dir = tempdir() in system temp (e.g. 
/tmp/hotspots-diff-abc1234) -git worktree add -run analysis with repo_root = tmp_dir, config from main repo -write snapshot to /.hotspots/snapshots/.json.zst -git worktree remove --force -``` - -### Config resolution - -The config file used for auto-analysis is resolved from the **main repo root**, not the worktree. This ensures consistent thresholds and weights across both refs. - -### Touch/churn metrics - -Churn and per-function touch metrics require git history, which is fully available in a normal worktree. However, **shallow clones** (e.g. CI pipelines using `fetch-depth: 1`) will have truncated history, causing touch/churn metrics to silently degrade or return zero. When a shallow clone is detected (via `git rev-parse --is-shallow-repository`), print a warning to stderr: - -``` -warning: shallow clone detected — touch/churn metrics may be incomplete - → consider: fetch-depth: 0 in your GitHub Actions checkout step -``` - -### Failure handling - -If worktree creation fails (e.g. detached HEAD, ref doesn't exist), print a clear error and exit code 2. If analysis within the worktree fails, clean up the worktree before exiting. - -### Dirty working tree - -`git worktree` does not touch the current checkout. The user's working tree, index, and stash are unaffected regardless of their state. 
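The worktree lifecycle and shallow-clone check in this section can be sketched in Python as a cleanup guard. This is a sketch under stated assumptions rather than the actual implementation: the helper names `git`, `warn_if_shallow`, and `temp_worktree` are hypothetical, the `--detach` flag is an addition chosen here so that no branch is created, and the warning text is paraphrased. The `run` parameter is only there so the flow can be exercised without invoking git.

```python
import shutil
import subprocess
import sys
import tempfile
from contextlib import contextmanager


def git(repo_root, *args, run=subprocess.run):
    """Run a git command inside repo_root and return its trimmed stdout."""
    result = run(
        ["git", "-C", str(repo_root), *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def warn_if_shallow(repo_root, run=subprocess.run, stream=sys.stderr):
    """Warn when history is truncated; return True if the clone is shallow."""
    shallow = git(repo_root, "rev-parse", "--is-shallow-repository", run=run) == "true"
    if shallow:
        print(
            "warning: shallow clone detected; touch/churn metrics may be incomplete",
            file=stream,
        )
    return shallow


@contextmanager
def temp_worktree(repo_root, sha, run=subprocess.run):
    """Yield a temporary worktree checked out at sha and always clean it up."""
    tmp_dir = tempfile.mkdtemp(prefix=f"hotspots-diff-{sha[:7]}-")
    added = False
    try:
        git(repo_root, "worktree", "add", "--detach", tmp_dir, sha, run=run)
        added = True
        yield tmp_dir
    finally:
        try:
            if added:
                git(repo_root, "worktree", "remove", "--force", tmp_dir, run=run)
        finally:
            # Covers the case where `worktree add` failed after mkdtemp.
            shutil.rmtree(tmp_dir, ignore_errors=True)
```

Because removal sits in a `finally` block, the temporary worktree is removed whether the analysis inside the `with` body succeeds or raises, which mirrors the cleanup requirement under failure handling.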
- ---- - -## Implementation Plan - -### Files to create -- `hotspots-cli/src/cmd/diff.rs` — new subcommand handler - -### Files to modify -- `hotspots-cli/src/main.rs` — add `Diff` variant to `Commands` enum -- `hotspots-cli/src/cmd/mod.rs` — `pub mod diff` -- `hotspots-core/src/git.rs` — add `resolve_ref_to_sha(repo_root, ref) -> Result` -- `hotspots-core/src/lib.rs` — export new public API if needed - -### Reused without modification -- `hotspots-core/src/delta.rs` — `Delta::new()`, `DeltaAggregates` -- `hotspots-core/src/snapshot.rs` — `load_snapshot()`, `persist_snapshot()` -- `hotspots-core/src/html.rs` — `render_html_delta()` -- `hotspots-core/src/report.rs` — `render_json()` -- `hotspots-core/src/policy.rs` — policy evaluation -- `hotspots-cli/src/cmd/analyze.rs` — reference for delta output + policy wiring - -### Phase 1 — Core diff (no `--auto-analyze`) -1. Add `resolve_ref_to_sha()` to `git.rs` -2. Add `Diff` subcommand to CLI with `base`, `head`, `--format`, `--output`, `--policy`, `--top` -3. Implement `cmd/diff.rs`: resolve refs → load snapshots → error if missing → compute delta → render -4. Wire exit codes -5. Tests: unit tests for ref resolution; integration test with two pre-built snapshots - -### Phase 2 — `--auto-analyze` -1. Implement `analyze_ref_in_worktree(repo_root, sha, config) -> Result` -2. Integrate into diff flow: check for missing snapshots before erroring, run worktree analysis -3. Cleanup guard (ensure worktree removed even on panic/early return) -4. Tests: integration test that triggers auto-analysis for a missing ref - ---- - -## Resolved Design Decisions - -- **Summary line in text output:** Yes — implemented as `N modified, N new, N deleted` before the table. -- **`--mode` flag:** Not added — diff is always a delta by definition. -- **`--auto-analyze` head ref persistence:** Snapshots are persisted under the resolved SHA, which is correct regardless of whether head is a branch or detached. 
-- **`--output` scope:** Works for all formats (text, json, jsonl, html), not just HTML. -- **SARIF for diff:** Deferred. Requires a new `render_sarif_delta()` function. Currently returns exit code 2 with a clear message. -- **LINE column in text output:** Omitted — `FunctionState` does not carry line numbers. Use `--format json` to get line numbers. diff --git a/docs/getting-started/installation.md b/docs/getting-started/installation.md deleted file mode 100644 index 6b03212..0000000 --- a/docs/getting-started/installation.md +++ /dev/null @@ -1,132 +0,0 @@ -# Installation - -## Install options - -### macOS (Homebrew) - -```bash -brew install Stephen-Collins-tech/tap/hotspots -``` - -### npm - -```bash -npm install -g @stephencollinstech/hotspots -``` - -Works on macOS, Linux, and Windows. No Rust toolchain required. - -### pip - -```bash -pip install hotspots-cli -``` - -Available on [PyPI](https://pypi.org/project/hotspots-cli/). Works on macOS, Linux, and Windows. - -### Linux - -```bash -curl -fsSL https://raw.githubusercontent.com/Stephen-Collins-tech/hotspots/main/install.sh | sh -``` - -Installs to `~/.local/bin/hotspots`. The script checks your current version first — if you're already up to date, it exits immediately. If an update is available, it shows the versions and asks before installing. - -Install a specific version: - -```bash -HOTSPOTS_VERSION=v1.23.0 curl -fsSL https://raw.githubusercontent.com/Stephen-Collins-tech/hotspots/main/install.sh | sh -``` - -### Any platform (Rust required) - -```bash -cargo install hotspots-cli -``` - -Available on [crates.io](https://crates.io/crates/hotspots-cli). Works on macOS, Linux, and Windows. - -### Windows - -Download the latest binary from [GitHub Releases](https://github.com/Stephen-Collins-tech/hotspots/releases/latest) and add it to your PATH. 
- -### GitHub Action - -Use Hotspots in CI without installing anything locally: - -```yaml -- uses: Stephen-Collins-tech/hotspots@v1 - with: - github-token: ${{ secrets.GITHUB_TOKEN }} -``` - -See [CI/CD Guide](/guide/ci-cd) for full usage. - -### Build from source - -```bash -git clone https://github.com/Stephen-Collins-tech/hotspots.git -cd hotspots -cargo build --release -cp target/release/hotspots ~/.local/bin/ -``` - -Requires Rust 1.75+. - ---- - -## Verify - -```bash -hotspots --version -hotspots analyze --help -``` - ---- - -## Upgrade - -**macOS:** -```bash -brew upgrade Stephen-Collins-tech/tap/hotspots -``` - -**npm:** -```bash -npm update -g @stephencollinstech/hotspots -``` - -**pip:** -```bash -pip install --upgrade hotspots-cli -``` - -**Linux:** -```bash -curl -fsSL https://raw.githubusercontent.com/Stephen-Collins-tech/hotspots/main/install.sh | sh -``` - ---- - -## Troubleshoot - -**`command not found` after Linux install** — `~/.local/bin` may not be in your PATH: - -```bash -export PATH="$HOME/.local/bin:$PATH" -``` - -Add that line to your `~/.zshrc` or `~/.bashrc` to make it permanent. - -**Build from source fails** — check your Rust version: - -```bash -rustc --version # need 1.75+ -rustup update stable -``` - ---- - -## Next - -[Quick Start →](/getting-started/quick-start) — analyze your first codebase in 5 minutes. diff --git a/docs/getting-started/quick-start-react.md b/docs/getting-started/quick-start-react.md deleted file mode 100644 index 20a8a8c..0000000 --- a/docs/getting-started/quick-start-react.md +++ /dev/null @@ -1,221 +0,0 @@ -# Quick Start: Analyzing React Projects - -Hotspots now fully supports React projects in both TypeScript and JavaScript! 
- -## Analyze React Components - -```bash -# Single React component -hotspots analyze src/components/Button.tsx -hotspots analyze src/components/Modal.jsx - -# Entire React project -hotspots analyze src/ - -# Get JSON output for CI -hotspots analyze src/ --format json -``` - -## What Gets Analyzed - -**Main Component Functions:** -```tsx -function UserProfile({ user }) { - // This function analyzed: UserProfile - return
{user.name}
; -} -``` - -**Event Handlers:** -```tsx -function TodoList({ items }) { - // This function analyzed separately: handleDelete - const handleDelete = (id) => { - if (confirm("Delete?")) { - deleteItem(id); - } - }; - - return ( -
    - {items.map(item => ( - // This arrow function analyzed separately -
  • handleDelete(item.id)}> - {item.text} -
  • - ))} -
- ); -} -// Result: 3 functions analyzed independently -``` - -## How JSX Affects Complexity - -**✅ JSX Elements Don't Add Complexity:** -```tsx -function SimpleComponent() { - return ( -
-

Title

-

Lots of nested JSX here...

-
-
-
Even more JSX...
-
-
-
- ); -} -// Metrics: CC=1, LRS=1.0 (simple function despite lots of JSX) -``` - -**✅ Control Flow in JSX DOES Add Complexity:** -```tsx -function ConditionalComponent({ isActive, hasData }) { - return ( -
- {isActive && Active} - {hasData ? : } -
- ); -} -// Metrics: CC=2 (one for && operator) -// Note: && operator increases CC, but ternary in JSX may not -``` - -## Example Analysis Output - -**Input: UserProfile.tsx** -```tsx -function UserProfile({ user, onUpdate }) { - const handleStatusToggle = () => { - if (!user) { - console.error("No user"); - return; - } - if (user.role === 'admin') { - const confirmed = confirm("Toggle admin?"); - if (!confirmed) return; - } - onUpdate({ ...user, isActive: !user.isActive }); - }; - - return ( -
-

{user.name}

- -
- ); -} -``` - -**Output:** -``` -LRS Function Metrics Band -7.57 UserProfile CC=4 ND=2 FO=5 NS=3 high -7.37 handleStatusToggle CC=8 ND=2 FO=3 NS=2 high -``` - -**Interpretation:** -- `UserProfile` component: High complexity (LRS 7.57, `high` band) -- `handleStatusToggle` event handler: High complexity (LRS 7.37, `high` band) -- Consider refactoring the event handler (multiple nested conditions) - -## Common Patterns - -**1. Simple Component (Good!)** -```tsx -function Badge({ type, children }) { - return {children}; -} -// LRS: 1.0 ✅ -``` - -**2. Component with Logic (Watch)** -```tsx -function UserCard({ user }) { - const displayName = user.firstName + " " + user.lastName; - const role = user.isAdmin ? "Admin" : "User"; - - return ( -
-

{displayName}

- {role} -
- ); -} -// LRS: ~2-3 ✅ (some logic, still reasonable) -``` - -**3. Complex Component (Refactor!)** -```tsx -function DataTable({ data, sortBy, filters }) { - const filteredData = data.filter(row => { - for (const [key, value] of Object.entries(filters)) { - if (!value) continue; - if (row[key] !== value) return false; - } - return true; - }); - - const sortedData = [...filteredData].sort((a, b) => { - if (sortBy.direction === 'asc') { - return a[sortBy.key] < b[sortBy.key] ? -1 : 1; - } else { - return a[sortBy.key] > b[sortBy.key] ? -1 : 1; - } - }); - - return {/* render sortedData */}
; -} -// LRS: 8-10 ⚠️ (consider extracting filtering/sorting logic) -``` - -## CI/CD Integration - -**GitHub Actions:** -```yaml -- name: Analyze complexity - run: | - cargo install hotspots-cli - hotspots analyze src/ --format json > complexity.json - - # Check for critical complexity - hotspots analyze src/ --min-lrs 9.0 -``` - -**Pre-commit Hook:** -```bash -#!/bin/bash -# .git/hooks/pre-commit -hotspots analyze src/ --min-lrs 9.0 || { - echo "❌ Critical complexity detected!" - exit 1 -} -``` - -## Tips - -1. **Focus on High LRS Functions**: Prioritize refactoring functions with LRS > 9.0 -2. **Event Handlers Often Complex**: Event handlers frequently have higher complexity - this is normal -3. **Extract Helper Functions**: Break complex logic into smaller, testable functions -4. **JSX Doesn't Count**: Don't worry about JSX structure - focus on logic complexity -5. **Anonymous Functions**: Use meaningful function names instead of inline arrows when complexity is high - -## Next Steps - -- Run `hotspots analyze src/ --top 10` to see your most complex functions -- Set up CI integration to track complexity over time -- Use `--format json` for programmatic analysis -- Check out `docs/language-support.md` for more details - -## Supported File Types - -All of these work out of the box: -- `.ts` `.tsx` (TypeScript) -- `.js` `.jsx` (JavaScript) -- `.mts` `.mtsx` `.cts` `.ctsx` (Module formats) -- `.mjs` `.mjsx` `.cjs` `.cjsx` (Module formats) - -**12 file extensions total** - just point it at your `src/` directory! diff --git a/docs/getting-started/quick-start.md b/docs/getting-started/quick-start.md deleted file mode 100644 index 1d86634..0000000 --- a/docs/getting-started/quick-start.md +++ /dev/null @@ -1,94 +0,0 @@ -# Quick Start - -Get your first results in under 5 minutes. - -## 1.
Install - -```bash -# macOS -brew install Stephen-Collins-tech/tap/hotspots - -# Linux -curl -fsSL https://raw.githubusercontent.com/Stephen-Collins-tech/hotspots/main/install.sh | sh - -# Any platform with Rust -cargo install hotspots-cli -``` - -Verify it worked: - -```bash -hotspots --version -``` - -## 2. Analyze your codebase - -Navigate to any git repository and run: - -```bash -hotspots analyze src/ -``` - -You'll see output like this: - -``` -CRITICAL (LRS ≥ 9.0) -━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ -src/auth/validateUser.ts:142 validateUser LRS: 12.4 -src/api/billing.ts:89 processPlanUpgrade LRS: 10.1 - -HIGH (LRS 6.0–9.0) -━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ -src/db/migrations.ts:203 applySchema LRS: 8.1 -``` - -**Critical** means: refactor this now, or at minimum don't make it worse. -**High** means: refactor the next time you touch it. -**Moderate/Low** means: not worth the risk of disturbing it unless you have a specific reason. - -## 3. Understand a result - -LRS is a single number that combines four dimensions of structural complexity: - -- **CC** — how many decision branches (if, switch, loops, try/catch) -- **ND** — how deep the nesting goes -- **FO** — how many other functions this one calls -- **NS** — non-structured exits (early returns, throws) - -A function with LRS 12 isn't just "complex" — it's complex in ways that make it hard to test, hard to review, and likely to hide bugs. That's the starting point for a refactor conversation. - -See [Metrics Reference](/reference/metrics) for how LRS is calculated. - ---- - -## What next - -**Focus on Critical first.** Pick the top offender. One critical function refactored per sprint moves the number. - -**Set up CI** to block new critical functions from merging: - -```bash -hotspots analyze src/ --mode delta --policy -``` - -Exit code 1 if a function crosses the critical threshold — CI fails, the author knows immediately. 
See [CI/CD Setup](/guide/ci-cd). - -**Track progress over time** with snapshot mode: - -```bash -hotspots analyze src/ --mode snapshot -# ... make changes ... -hotspots analyze src/ --mode delta -# Shows exactly what improved and what got worse -``` - -**Generate an HTML report** to share with your team: - -```bash -hotspots analyze src/ --mode snapshot --format html -open .hotspots/report.html -``` - ---- - -Stuck? [Open an issue](https://github.com/Stephen-Collins-tech/hotspots/issues) or check the [full CLI reference](/reference/cli). diff --git a/docs/guide/ci-cd.md b/docs/guide/ci-cd.md deleted file mode 100644 index 5d9a154..0000000 --- a/docs/guide/ci-cd.md +++ /dev/null @@ -1,440 +0,0 @@ -# CI/CD & GitHub Action - -Integrate Hotspots into your CI/CD pipeline to catch complexity regressions before they reach production. - -## GitHub Action (Recommended) - -### Quick Start - -Create `.github/workflows/hotspots.yml`: - -```yaml -name: Hotspots - -on: - pull_request: - push: - branches: [main] - -jobs: - analyze: - runs-on: ubuntu-latest - permissions: - contents: read - pull-requests: write # For PR comments - - steps: - - uses: actions/checkout@v4 - with: - fetch-depth: 0 # Required for delta analysis - - - uses: Stephen-Collins-tech/hotspots-action@v1 - with: - github-token: ${{ secrets.GITHUB_TOKEN }} -``` - -**That's it!** The action will: -- Analyze your code on every PR -- Post results as PR comments -- Generate HTML reports -- Fail builds on policy violations - ---- - -## Action Inputs - -### Required - -| Input | Description | Default | -|-------|-------------|---------| -| `github-token` | GitHub token for posting PR comments | `github.token` (auto) | - -### Optional - -| Input | Description | Default | -|-------|-------------|---------| -| `path` | Path to analyze | `.` (repo root) | -| `policy` | Policy to enforce | `critical-introduction` | -| `min-lrs` | Minimum LRS threshold (overrides policy) | - | -| `config` | Path to config file | 
Auto-discover | -| `fail-on` | When to fail: `error`, `warn`, `never` | `error` | -| `version` | Hotspots version to use | `latest` | -| `post-comment` | Post PR comment | `true` | - -**Available policies:** `critical-introduction` (default), `strict`, `moderate`, `custom` - ---- - -## Action Outputs - -| Output | Type | Description | -|--------|------|-------------| -| `violations` | JSON array | Policy violations | -| `passed` | boolean | Whether analysis passed | -| `summary` | string | Markdown summary | -| `report-path` | string | Path to HTML report | -| `json-output` | string | Path to JSON output | - ---- - -## How It Works - -### PR Context (Delta Mode) - -When run on a pull request: -1. Detects merge-base automatically -2. Analyzes only modified functions -3. Compares complexity before vs. after -4. Evaluates policies and checks for violations -5. Posts results as a PR comment (updates on each push) - -### Push Context (Snapshot Mode) - -When run on the main branch: -1. Analyzes entire codebase -2. Creates a snapshot (stored in `.hotspots/snapshots/`) -3. Reports violations in job summary -4. 
Snapshot used as baseline for future PRs - ---- - -## Workflow Examples - -### Basic PR Check - -```yaml -name: Complexity Check - -on: [pull_request] - -jobs: - hotspots: - runs-on: ubuntu-latest - permissions: - contents: read - pull-requests: write - - steps: - - uses: actions/checkout@v4 - with: - fetch-depth: 0 - - - uses: Stephen-Collins-tech/hotspots-action@v1 - with: - github-token: ${{ secrets.GITHUB_TOKEN }} -``` - -### With Custom Config - -```yaml -- uses: Stephen-Collins-tech/hotspots-action@v1 - with: - config: .hotspots.ci.json - github-token: ${{ secrets.GITHUB_TOKEN }} -``` - -`.hotspots.ci.json`: -```json -{ - "exclude": ["**/*.test.ts", "**/__tests__/**"], - "min_lrs": 6.0, - "thresholds": { - "moderate": 5.0, - "high": 8.0, - "critical": 10.0 - } -} -``` - -### Monorepo Setup - -```yaml -jobs: - frontend: - runs-on: ubuntu-latest - steps: - - uses: actions/checkout@v4 - with: - fetch-depth: 0 - - uses: Stephen-Collins-tech/hotspots-action@v1 - with: - path: packages/frontend - github-token: ${{ secrets.GITHUB_TOKEN }} - - backend: - runs-on: ubuntu-latest - steps: - - uses: actions/checkout@v4 - with: - fetch-depth: 0 - - uses: Stephen-Collins-tech/hotspots-action@v1 - with: - path: packages/backend - github-token: ${{ secrets.GITHUB_TOKEN }} -``` - -### Upload HTML Report as Artifact - -```yaml -- uses: Stephen-Collins-tech/hotspots-action@v1 - id: hotspots - with: - github-token: ${{ secrets.GITHUB_TOKEN }} - -- uses: actions/upload-artifact@v4 - if: always() - with: - name: hotspots-report-${{ github.sha }} - path: ${{ steps.hotspots.outputs.report-path }} - retention-days: 30 -``` - -### Warning Mode (Don't Fail Builds) - -```yaml -- uses: Stephen-Collins-tech/hotspots-action@v1 - with: - fail-on: never - github-token: ${{ secrets.GITHUB_TOKEN }} -``` - -### Scheduled Analysis - -```yaml -name: Weekly Complexity Report - -on: - schedule: - - cron: '0 0 * * 0' # Every Sunday at midnight - -jobs: - analyze: - runs-on: ubuntu-latest - steps: - 
- uses: actions/checkout@v4 - - uses: Stephen-Collins-tech/hotspots-action@v1 - with: - fail-on: never - - uses: actions/upload-artifact@v4 - with: - name: weekly-complexity-report - path: .hotspots/report.html -``` - ---- - -## Permissions - -```yaml -permissions: - contents: read # Checkout code - pull-requests: write # Post PR comments -``` - -If you don't want PR comments, use `post-comment: false` and only `contents: read`. - ---- - -## Troubleshooting (GitHub Actions) - -### "failed to extract git context" - -Use `fetch-depth: 0` in checkout: -```yaml -- uses: actions/checkout@v4 - with: - fetch-depth: 0 -``` - -### "merge-base not found" - -Fetch the base branch explicitly: -```yaml -- uses: actions/checkout@v4 - with: - fetch-depth: 0 - ref: ${{ github.event.pull_request.head.ref }} - -- name: Fetch base branch - run: git fetch origin ${{ github.event.pull_request.base.ref }} -``` - -### PR Comments Not Posting - -Ensure `pull-requests: write` permission and `github-token` is provided. - ---- - -## Other CI Systems - -### GitLab CI - -```yaml -stages: - - analyze - -hotspots: - stage: analyze - image: rust:latest - before_script: - - cargo install hotspots-cli - script: - - hotspots analyze src/ --mode delta --policy --format json - artifacts: - paths: - - .hotspots/report.html - expire_in: 1 week - rules: - - if: '$CI_PIPELINE_SOURCE == "merge_request_event"' -``` - -**Tip:** Use a pre-built Docker image for faster startup: -```yaml - image: ghcr.io/stephen-collins-tech/hotspots:latest -``` - -### CircleCI - -```yaml -version: 2.1 - -jobs: - hotspots: - docker: - - image: rust:latest - steps: - - checkout - - restore_cache: - keys: - - cargo-v1-{{ checksum "Cargo.lock" }} - - run: - name: Install Hotspots - command: | - if ! 
command -v hotspots &> /dev/null; then - cargo install hotspots-cli - fi - - save_cache: - key: cargo-v1-{{ checksum "Cargo.lock" }} - paths: - - ~/.cargo - - run: - name: Run Analysis - command: hotspots analyze src/ --mode delta --policy - - store_artifacts: - path: .hotspots/report.html - -workflows: - analyze: - jobs: - - hotspots -``` - -### Travis CI - -```yaml -language: rust -rust: - - stable - -install: - - cargo install hotspots-cli - -script: - - hotspots analyze src/ --mode delta --policy --format json -``` - -### Jenkins - -```groovy -pipeline { - agent any - - stages { - stage('Setup') { - steps { - sh 'cargo install hotspots-cli || true' - } - } - - stage('Analyze') { - steps { - sh 'hotspots analyze src/ --mode delta --policy --format json > hotspots-output.json' - } - } - - stage('Report') { - steps { - archiveArtifacts artifacts: '.hotspots/report.html', fingerprint: true - publishHTML([ - reportDir: '.hotspots', - reportFiles: 'report.html', - reportName: 'Hotspots Report' - ]) - } - } - } - - post { - failure { - echo 'Hotspots found blocking violations' - } - } -} -``` - -### Bitbucket Pipelines - -```yaml -pipelines: - pull-requests: - '**': - - step: - name: Hotspots PR Check - image: rust:latest - caches: - - cargo - script: - - cargo install hotspots-cli - - hotspots analyze src/ --mode delta --policy - artifacts: - - .hotspots/report.html -``` - ---- - -## Exit Codes - -| Code | Meaning | -|------|---------| -| `0` | Success (no violations, or only warnings) | -| `1` | Failure (blocking policy violations) | - -Control exit behavior: -```bash -# Never fail (reporting only) -hotspots analyze src/ --mode delta --policy --fail-on never - -# Ignore exit code in CI script -hotspots analyze src/ --mode delta --policy || true -``` - ---- - -## Environment Variables for PR Detection - -When the following env vars are set, delta mode compares against the merge-base instead of direct parent: - -- **GitHub Actions:** `GITHUB_EVENT_NAME=pull_request` 
-- **GitLab CI:** `CI_MERGE_REQUEST_IID` -- **CircleCI:** `CIRCLE_PULL_REQUEST` -- **Travis CI:** `TRAVIS_PULL_REQUEST` - ---- - -## Best Practices - -1. **Use delta mode for PRs** — Only shows what changed, fast and focused -2. **Persist snapshots on main** — Creates baselines for future PRs -3. **Start with `fail-on: never`** — Observe without blocking, then tighten -4. **Archive HTML reports** — Keep 30-day history for trend review -5. **Separate dev and CI configs** — Stricter thresholds in CI, lenient locally diff --git a/docs/guide/configuration.md b/docs/guide/configuration.md deleted file mode 100644 index 9dc5022..0000000 --- a/docs/guide/configuration.md +++ /dev/null @@ -1,504 +0,0 @@ -# Configuration - -Configure Hotspots behavior using JSON configuration files. - -## Config File Locations - -Hotspots searches for configuration in the following order: - -1. **Explicit path** - Via `--config` CLI flag -2. **`.hotspotsrc.json`** - Recommended location (project root) -3. **`hotspots.config.json`** - Alternative location (project root) -4. **`package.json`** - Under `"hotspots"` key - -The first config found wins. If no config is found, default values are used. - -## Basic Example - -Create `.hotspotsrc.json` in your project root: - -```json -{ - "exclude": [ - "**/*.test.ts", - "**/*.test.tsx", - "**/__tests__/**", - "**/node_modules/**" - ], - "min_lrs": 3.0, - "top": 50 -} -``` - -## Complete Configuration Reference - -### File Filtering - -#### `include` -Glob patterns for files to include (default: all supported extensions). - -```json -{ - "include": [ - "src/**/*.ts", - "lib/**/*.js" - ] -} -``` - -**Type:** `string[]` -**Default:** `[]` (include all supported files) - -#### `exclude` -Glob patterns for files to exclude. 
- -```json -{ - "exclude": [ - "**/*.test.ts", - "**/*.test.tsx", - "**/*.test.js", - "**/*.test.jsx", - "**/*.spec.ts", - "**/*.spec.tsx", - "**/*.spec.js", - "**/*.spec.jsx", - "**/node_modules/**", - "**/__tests__/**", - "**/__mocks__/**", - "**/dist/**", - "**/build/**", - "**/vendor/**", - "**/*.pb.go", - "**/zz_generated*.go" - ] -} -``` - -**Type:** `string[]` -**Default:** Test files, node_modules, dist, build, Go vendor/generated - -### Risk Band Thresholds - -#### `thresholds` -Customize LRS (Leverage Risk Score) thresholds for risk bands. - -```json -{ - "thresholds": { - "moderate": 3.0, - "high": 6.0, - "critical": 9.0 - } -} -``` - -**Fields:** -- `moderate` - LRS threshold for moderate risk (default: `3.0`) -- `high` - LRS threshold for high risk (default: `6.0`) -- `critical` - LRS threshold for critical risk (default: `9.0`) - -**Validation:** -- All thresholds must be positive -- Must be ordered: `moderate < high < critical` - -**Risk Bands:** -- **Low:** LRS < `moderate` -- **Moderate:** `moderate` ≤ LRS < `high` -- **High:** `high` ≤ LRS < `critical` -- **Critical:** LRS ≥ `critical` - -### Metric Weights - -#### `weights` -Customize how metrics contribute to LRS calculation. - -```json -{ - "weights": { - "cc": 1.0, - "nd": 0.8, - "fo": 0.6, - "ns": 0.7 - } -} -``` - -**Fields:** -- `cc` - Cyclomatic Complexity weight (default: `1.0`) -- `nd` - Nesting Depth weight (default: `0.8`) -- `fo` - Fan-Out weight (default: `0.6`) -- `ns` - Non-Structured exits weight (default: `0.7`) - -**Validation:** -- All weights must be non-negative (≥ 0.0) -- At least one weight must be positive (> 0.0) -- Weights cannot exceed 10.0 - -Weights scale the log-transformed contribution of each metric to LRS. See [LRS Specification](/reference/lrs-spec) for the full formula. - -### Warning Thresholds - -#### `warning_thresholds` -Configure proactive warning thresholds for policy engine. 
- -```json -{ - "warning_thresholds": { - "watch_min": 2.5, - "watch_max": 3.0, - "attention_min": 5.5, - "attention_max": 6.0, - "rapid_growth_percent": 50.0 - } -} -``` - -**Fields:** -- `watch_min` - Lower bound for "watch" range (default: `2.5`) -- `watch_max` - Upper bound for "watch" range (default: `3.0`) -- `attention_min` - Lower bound for "attention" range (default: `5.5`) -- `attention_max` - Upper bound for "attention" range (default: `6.0`) -- `rapid_growth_percent` - Percent increase threshold (default: `50.0`) - -**Validation:** -- All thresholds must be positive -- Must be ordered: `watch_min < watch_max ≤ moderate < attention_min < attention_max ≤ high` - -### Output Filtering - -#### `min_lrs` -Minimum LRS to report (filter out low-complexity functions). - -```json -{ - "min_lrs": 3.0 -} -``` - -**Type:** `number` -**Default:** `0.0` (report all functions) - -#### `top` -Maximum number of functions to show. - -```json -{ - "top": 50 -} -``` - -**Type:** `number` -**Default:** No limit (show all) - -### Co-Change Analysis - -#### `co_change_window_days` -Number of days of git history to mine for co-change pairs. - -```json -{ - "co_change_window_days": 180 -} -``` - -**Type:** `number` (integer ≥ 1) -**Default:** `90` - -Projects with a slow commit cadence (e.g. once a week) benefit from a larger window. - -#### `co_change_min_count` -Minimum number of co-changes required to report a pair. Pairs that appear fewer times are -filtered out as noise. - -```json -{ - "co_change_min_count": 5 -} -``` - -**Type:** `number` (integer ≥ 1) -**Default:** `3` - -High-traffic repositories (50+ commits/day) may want a higher threshold to reduce noise. - -#### `driver_threshold_percentile` -Percentile of each metric that a function must exceed to receive a specific driver label. 
- -```json -{ - "driver_threshold_percentile": 75 -} -``` - -**Type:** integer 1–99 -**Default:** `75` - -At the default of 75, a function must have a cyclomatic complexity above the 75th percentile of -all functions in the snapshot to trigger the `high_complexity` label (i.e. top 25%). The same -percentile gate applies to `nd` (deep_nesting), `fan_out` (high_fanout_churning), `fan_in` -(high_fanin_complex), and `touch_count` (high_churn_low_cc, high_fanout_churning). - -Compound checks: -- `high_churn_low_cc`: touch above Pth percentile **and** cc below the (100-P)th percentile -- `high_fanout_churning`: fan_out above Pth percentile **and** touch above the 50th percentile - -`cyclic_dep` stays absolute — being in a cycle is binary, not distribution-relative. - -**When to tune:** -- Small or uniform repos → lower to 50–60 so more functions get specific labels -- Large repos with high median complexity → raise to 85–90 to reduce noise - -#### `per_function_touches` -Whether to use per-function `git log -L` for touch metrics instead of file-level batching. - -```json -{ - "per_function_touches": false -} -``` - -**Type:** `boolean` -**Default:** `true` - -Per-function touch metrics are more accurate (each function gets its own 30-day touch count -rather than sharing the file's count). Warm runs use the on-disk cache -(`.hotspots/touch-cache.json.zst`) and match file-level speed (~230 ms vs ~268 ms on this -repo). The first run on a new commit is slow (~6 s for ~200 functions); subsequent runs are -fast. Set to `false` to always use file-level batching (useful in CI without a persistent -cache layer). - -For very large repositories (50k+ functions), consider skipping touch metrics entirely with -the `--skip-touch-metrics` CLI flag. This avoids all git log I/O and can reduce analysis time -significantly (e.g. ~66 s savings on expo/expo). Touch counts will be reported as `0`. 
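Returning to `driver_threshold_percentile` above, the percentile gate can be sketched in a few lines of Python. This is a minimal illustration, not the tool's implementation: the nearest-rank percentile method, the helper names `percentile_cutoff` and `driver_labels`, and the input dictionary shape are all assumptions made for the example. Only three of the labels described above are covered.

```python
import math


def percentile_cutoff(values, p):
    """Nearest-rank percentile (an assumed method): the smallest value
    such that at least p% of the values are at or below it."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def driver_labels(functions, p=75):
    """Assign illustrative driver labels using a percentile gate.

    `functions` is a list of dicts with hypothetical keys
    "name", "cc", "nd" and "touch_count".
    """
    cc_values = [f["cc"] for f in functions]
    cc_high = percentile_cutoff(cc_values, p)        # Pth percentile of cc
    cc_low = percentile_cutoff(cc_values, 100 - p)   # (100-P)th percentile of cc
    nd_high = percentile_cutoff([f["nd"] for f in functions], p)
    touch_high = percentile_cutoff([f["touch_count"] for f in functions], p)

    labels = {}
    for f in functions:
        found = []
        if f["cc"] > cc_high:
            found.append("high_complexity")
        if f["nd"] > nd_high:
            found.append("deep_nesting")
        # Compound check: heavily touched but structurally simple.
        if f["touch_count"] > touch_high and f["cc"] < cc_low:
            found.append("high_churn_low_cc")
        labels[f["name"]] = found
    return labels
```

Because every cutoff is derived from the snapshot's own distribution, lowering `p` gives more functions a specific label and raising it gives fewer, which is the tuning trade-off described above.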
- -## Complete Example - -```json -{ - "include": [ - "src/**/*.ts", - "src/**/*.tsx" - ], - "exclude": [ - "**/*.test.ts", - "**/*.test.tsx", - "**/__tests__/**", - "**/node_modules/**", - "**/dist/**", - "**/coverage/**" - ], - "thresholds": { - "moderate": 4.0, - "high": 8.0, - "critical": 12.0 - }, - "weights": { - "cc": 1.0, - "nd": 0.9, - "fo": 0.5, - "ns": 0.8 - }, - "warning_thresholds": { - "watch_min": 3.5, - "watch_max": 4.0, - "attention_min": 7.5, - "attention_max": 8.0, - "rapid_growth_percent": 40.0 - }, - "min_lrs": 3.0, - "top": 100 -} -``` - -## Using in package.json - -Add configuration under `"hotspots"` key: - -```json -{ - "name": "my-project", - "version": "1.0.0", - "hotspots": { - "exclude": [ - "**/*.test.ts", - "**/node_modules/**" - ], - "min_lrs": 3.0 - } -} -``` - -## CLI Override - -Config file settings can be overridden by CLI flags: - -```bash -# Config file says min_lrs: 3.0, but CLI overrides to 5.0 -hotspots analyze src/ --min-lrs 5.0 - -# Use specific config file -hotspots analyze src/ --config custom-config.json -``` - -**CLI flags take precedence over config file values.** - -## Environment-Specific Configs - -### Development - -`.hotspotsrc.json`: -```json -{ - "exclude": ["**/*.test.ts"], - "min_lrs": 0.0, - "top": 20 -} -``` - -### CI/CD - -`hotspots.ci.json`: -```json -{ - "exclude": ["**/*.test.ts"], - "min_lrs": 5.0, - "thresholds": { - "moderate": 5.0, - "high": 8.0, - "critical": 10.0 - } -} -``` - -Use in CI: -```yaml -- run: hotspots analyze src/ --config hotspots.ci.json --policy --fail-on blocking -``` - -## Configuration Validation - -Hotspots validates configuration on load: - -**Valid:** -```json -{ - "thresholds": { - "moderate": 3.0, - "high": 6.0, - "critical": 9.0 - } -} -``` - -**Invalid (not ordered):** -```json -{ - "thresholds": { - "moderate": 6.0, - "high": 3.0, // ❌ Error: high must be > moderate - "critical": 9.0 - } -} -``` - -**Invalid (negative weight):** -```json -{ - "weights": { - "cc": -1.0 // 
❌ Error: weights must be non-negative - } -} -``` - -## Default Values - -If no config file is found, these defaults are used: - -```json -{ - "include": [], - "exclude": [ - "**/*.test.ts", - "**/*.test.tsx", - "**/*.test.js", - "**/*.test.jsx", - "**/*.spec.ts", - "**/*.spec.tsx", - "**/*.spec.js", - "**/*.spec.jsx", - "**/node_modules/**", - "**/__tests__/**", - "**/__mocks__/**", - "**/dist/**", - "**/build/**", - "**/vendor/**", - "**/*.pb.go", - "**/zz_generated*.go" - ], - "thresholds": { - "moderate": 3.0, - "high": 6.0, - "critical": 9.0 - }, - "weights": { - "cc": 1.0, - "nd": 0.8, - "fo": 0.6, - "ns": 0.7 - }, - "warning_thresholds": { - "watch_min": 2.5, - "watch_max": 3.0, - "attention_min": 5.5, - "attention_max": 6.0, - "rapid_growth_percent": 50.0 - }, - "min_lrs": 0.0, - "top": null -} -``` - -## Troubleshooting - -### Config not being loaded - -Check the config file is in project root: -```bash -ls -la .hotspotsrc.json -``` - -Verify JSON syntax: -```bash -cat .hotspotsrc.json | jq . -``` - -### Unknown fields error - -Hotspots rejects unknown fields to catch typos: - -```json -{ - "min_lrs": 3.0, - "minLRS": 5.0 // ❌ Error: unknown field -} -``` - -Use exact field names from this guide. - -### Config validation fails - -Read the error message carefully - it tells you exactly what's wrong: - -``` -Error: thresholds.moderate (6.0) must be less than thresholds.high (5.0) -``` - -Fix the ordering and try again. 
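The validation rules in this guide (ordered thresholds, weight bounds, unknown-field rejection) can be checked before running Hotspots with a short pre-flight script. The sketch below is illustrative only: `validate_config` is a hypothetical helper, merging partial objects with the documented defaults is an assumption, and apart from the ordering error shown above the message wording is invented for the example.

```python
# Defaults and field names are taken from this guide; everything else is assumed.
DEFAULT_THRESHOLDS = {"moderate": 3.0, "high": 6.0, "critical": 9.0}
DEFAULT_WEIGHTS = {"cc": 1.0, "nd": 0.8, "fo": 0.6, "ns": 0.7}
KNOWN_FIELDS = {
    "include", "exclude", "thresholds", "weights", "warning_thresholds",
    "min_lrs", "top", "co_change_window_days", "co_change_min_count",
    "driver_threshold_percentile", "per_function_touches",
}


def validate_config(config):
    """Return a list of error strings; an empty list means no problems found."""
    errors = []

    # Unknown top-level fields usually indicate a typo such as "minLRS".
    for key in sorted(set(config) - KNOWN_FIELDS):
        errors.append(f"unknown field: {key}")

    # Thresholds must be positive and ordered: moderate < high < critical.
    thresholds = {**DEFAULT_THRESHOLDS, **config.get("thresholds", {})}
    order = ["moderate", "high", "critical"]
    for name in order:
        if thresholds[name] <= 0:
            errors.append(f"thresholds.{name} ({thresholds[name]}) must be positive")
    for lower, upper in zip(order, order[1:]):
        if thresholds[lower] >= thresholds[upper]:
            errors.append(
                f"thresholds.{lower} ({thresholds[lower]}) must be less than "
                f"thresholds.{upper} ({thresholds[upper]})"
            )

    # Weights must lie in [0.0, 10.0], with at least one strictly positive.
    weights = {**DEFAULT_WEIGHTS, **config.get("weights", {})}
    for name, value in weights.items():
        if value < 0:
            errors.append(f"weights.{name} ({value}) must be non-negative")
        elif value > 10.0:
            errors.append(f"weights.{name} ({value}) cannot exceed 10.0")
    if not any(value > 0 for value in weights.values()):
        errors.append("at least one weight must be positive")

    return errors
```

To use it, load the file with `json.load` and print whatever `validate_config` returns. Note that a file containing `//` comments, like the annotated invalid examples in this guide, will not parse as JSON in the first place.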
- -## Related Documentation - -- [CLI Reference](../reference/cli.md) - Command-line options -- [Metrics & LRS](../reference/metrics.md) - How LRS is calculated -- [Suppression Comments](../guide/suppression.md) - Excluding functions from policy -- [Policy Engine](../guide/usage.md#policy-engine) - Using policies diff --git a/docs/guide/github-action.md b/docs/guide/github-action.md deleted file mode 100644 index be419bd..0000000 --- a/docs/guide/github-action.md +++ /dev/null @@ -1,716 +0,0 @@ -# GitHub Action - -Complete guide to using the Hotspots GitHub Action in your workflows. - -## Overview - -The Hotspots GitHub Action provides zero-config complexity analysis for pull requests and commits. - -**Key Features:** -- 🚀 **Zero configuration** - Works out of the box -- 🎯 **PR-aware** - Automatically detects PRs and runs delta analysis -- 📊 **HTML Reports** - Interactive reports as workflow artifacts -- 💬 **PR Comments** - Posts results directly to pull requests -- ⚡ **Fast** - Cached binary downloads, incremental analysis -- 🔒 **Deterministic** - Byte-for-byte reproducible results - -**Supported Languages:** TypeScript, JavaScript, Go, Java, Python, Rust - ---- - -## Quick Start - -### Basic Setup - -Create `.github/workflows/hotspots.yml`: - -```yaml -name: Hotspots - -on: - pull_request: - push: - branches: [main] - -jobs: - analyze: - runs-on: ubuntu-latest - permissions: - contents: read - pull-requests: write # For PR comments - - steps: - - uses: actions/checkout@v4 - with: - fetch-depth: 0 # Required for delta analysis - - - uses: Stephen-Collins-tech/hotspots-action@v1 - with: - github-token: ${{ secrets.GITHUB_TOKEN }} -``` - -**That's it!** The action will: -- Analyze your code on every PR -- Post results as PR comments -- Generate HTML reports -- Fail builds on policy violations - ---- - -## Inputs - -### Required Inputs - -| Input | Description | Default | -|-------|-------------|---------| -| `github-token` | GitHub token for posting PR comments | 
`github.token` (auto) | - -### Optional Inputs - -| Input | Description | Default | -|-------|-------------|---------| -| `path` | Path to analyze | `.` (repo root) | -| `policy` | Policy to enforce | `critical-introduction` | -| `min-lrs` | Minimum LRS threshold (overrides policy) | - | -| `config` | Path to config file | Auto-discover | -| `fail-on` | When to fail: `error`, `warn`, `never` | `error` | -| `version` | Hotspots version to use | `latest` | -| `post-comment` | Post PR comment | `true` | - -### Input Details - -#### `path` - -Path to analyze (file or directory). - -```yaml -# Analyze specific directory -- uses: Stephen-Collins-tech/hotspots-action@v1 - with: - path: src/ - -# Analyze entire repository -- uses: Stephen-Collins-tech/hotspots-action@v1 - with: - path: . -``` - -#### `policy` - -Policy mode for enforcement. - -**Available policies:** -- `critical-introduction` (default) - Block new critical-risk functions -- `strict` - Block any complexity increase -- `moderate` - Allow moderate increases, block high/critical -- `custom` - Use config file thresholds - -```yaml -# Strict policy (no regressions) -- uses: Stephen-Collins-tech/hotspots-action@v1 - with: - policy: strict - -# Custom policy via config file -- uses: Stephen-Collins-tech/hotspots-action@v1 - with: - policy: custom - config: .hotspots.ci.json -``` - -#### `min-lrs` - -Override policy with minimum LRS threshold. - -```yaml -# Only flag functions with LRS ≥ 8.0 -- uses: Stephen-Collins-tech/hotspots-action@v1 - with: - min-lrs: 8.0 -``` - -#### `fail-on` - -Control when the action fails the build. - -**Options:** -- `error` (default) - Fail on blocking policy violations -- `warn` - Fail on warnings too -- `never` - Never fail (reporting only) - -```yaml -# Warning mode (don't fail builds) -- uses: Stephen-Collins-tech/hotspots-action@v1 - with: - fail-on: never -``` - -#### `version` - -Specify Hotspots version. 
- -```yaml -# Pin to specific version -- uses: Stephen-Collins-tech/hotspots-action@v1 - with: - version: 1.2.3 - -# Use latest (default) -- uses: Stephen-Collins-tech/hotspots-action@v1 - with: - version: latest -``` - ---- - -## Outputs - -The action provides structured outputs for use in subsequent steps. - -| Output | Type | Description | -|--------|------|-------------| -| `violations` | JSON array | Policy violations | -| `passed` | boolean | Whether analysis passed | -| `summary` | string | Markdown summary | -| `report-path` | string | Path to HTML report | -| `json-output` | string | Path to JSON output | - -### Using Outputs - -```yaml -- uses: Stephen-Collins-tech/hotspots-action@v1 - id: hotspots - with: - github-token: ${{ secrets.GITHUB_TOKEN }} - -- name: Check Results - run: | - echo "Passed: ${{ steps.hotspots.outputs.passed }}" - echo "Summary: ${{ steps.hotspots.outputs.summary }}" - -- name: Upload Report - uses: actions/upload-artifact@v4 - if: always() - with: - name: hotspots-report - path: ${{ steps.hotspots.outputs.report-path }} -``` - ---- - -## How It Works - -### PR Context (Delta Mode) - -When run on a pull request: - -1. **Detects merge-base** - Automatically finds common ancestor -2. **Analyzes changes** - Only checks modified functions -3. **Compares complexity** - Before vs. after -4. **Evaluates policies** - Checks for violations -5. **Posts PR comment** - Results directly in PR -6. **Updates on push** - Edits existing comment - -**Behavior:** -- First run: Creates new comment -- Subsequent runs: Updates existing comment -- Multiple commits: Shows latest analysis - -### Push Context (Snapshot Mode) - -When run on main/default branch: - -1. **Analyzes entire codebase** - All functions -2. **Creates snapshot** - Stores baseline -3. **Reports violations** - All high-complexity functions -4. 
**Shows job summary** - In workflow run - -**Behavior:** -- Snapshots stored in `.hotspots/snapshots/` -- Used as baseline for future PRs -- No PR comments (not applicable) - ---- - -## PR Comments - -### Example Comment - -```markdown -# 🔍 Hotspots Analysis - -**Status:** ❌ 2 blocking violation(s) - -### Summary -- **Mode:** Delta (comparing vs. `main`) -- **Changed functions:** 5 -- **New functions:** 2 -- **Removed functions:** 1 - -### ❌ Blocking Violations - -| Function | File | LRS | Change | Policy | -|----------|------|-----|--------|--------| -| `processPayment` | `src/payment.ts:120` | 9.2 | +1.5 | Critical introduction | -| `validateOrder` | `src/orders.ts:45` | 8.7 | +2.1 | Band transition (moderate → high) | - -### ⚠️ Warnings - -| Function | File | LRS | Reason | -|----------|------|-----|--------| -| `parseInput` | `src/parser.ts:78` | 5.8 | Approaching high threshold | - -### 👀 Watch - -3 function(s) approaching moderate threshold - ---- - -[View full HTML report](https://github.com/yourorg/repo/actions/runs/123456789) -``` - -### Comment Behavior - -- **Single comment per PR** - Updates existing comment -- **Collapsible details** - Large reports are collapsed -- **Direct links** - Click function names to jump to code -- **Color-coded** - Risk levels visually distinct -- **Dismissable** - Can be minimized if needed - ---- - -## Workflow Examples - -### Basic PR Check - -```yaml -name: Complexity Check - -on: [pull_request] - -jobs: - hotspots: - runs-on: ubuntu-latest - permissions: - contents: read - pull-requests: write - - steps: - - uses: actions/checkout@v4 - with: - fetch-depth: 0 - - - uses: Stephen-Collins-tech/hotspots-action@v1 - with: - github-token: ${{ secrets.GITHUB_TOKEN }} -``` - -### With Custom Config - -```yaml -name: Hotspots - -on: [pull_request, push] - -jobs: - analyze: - runs-on: ubuntu-latest - steps: - - uses: actions/checkout@v4 - with: - fetch-depth: 0 - - - uses: Stephen-Collins-tech/hotspots-action@v1 - with: - config: 
.hotspots.ci.json - github-token: ${{ secrets.GITHUB_TOKEN }} -``` - -Create `.hotspots.ci.json`: -```json -{ - "exclude": ["**/*.test.ts", "**/__tests__/**"], - "min_lrs": 6.0, - "thresholds": { - "moderate": 5.0, - "high": 8.0, - "critical": 10.0 - } -} -``` - -### Monorepo Setup - -```yaml -name: Hotspots - -on: [pull_request] - -jobs: - frontend: - runs-on: ubuntu-latest - steps: - - uses: actions/checkout@v4 - with: - fetch-depth: 0 - - - uses: Stephen-Collins-tech/hotspots-action@v1 - with: - path: packages/frontend - github-token: ${{ secrets.GITHUB_TOKEN }} - - backend: - runs-on: ubuntu-latest - steps: - - uses: actions/checkout@v4 - with: - fetch-depth: 0 - - - uses: Stephen-Collins-tech/hotspots-action@v1 - with: - path: packages/backend - github-token: ${{ secrets.GITHUB_TOKEN }} -``` - -### Upload HTML Report as Artifact - -```yaml -name: Hotspots - -on: [pull_request] - -jobs: - analyze: - runs-on: ubuntu-latest - steps: - - uses: actions/checkout@v4 - with: - fetch-depth: 0 - - - uses: Stephen-Collins-tech/hotspots-action@v1 - id: hotspots - with: - github-token: ${{ secrets.GITHUB_TOKEN }} - - - uses: actions/upload-artifact@v4 - if: always() - with: - name: hotspots-report-${{ github.sha }} - path: ${{ steps.hotspots.outputs.report-path }} - retention-days: 30 -``` - -### Warning Mode (Don't Fail Builds) - -```yaml -name: Hotspots - -on: [pull_request] - -jobs: - analyze: - runs-on: ubuntu-latest - steps: - - uses: actions/checkout@v4 - with: - fetch-depth: 0 - - - uses: Stephen-Collins-tech/hotspots-action@v1 - with: - fail-on: never - github-token: ${{ secrets.GITHUB_TOKEN }} -``` - -### Custom Failure Handling - -```yaml -name: Hotspots - -on: [pull_request] - -jobs: - analyze: - runs-on: ubuntu-latest - steps: - - uses: actions/checkout@v4 - with: - fetch-depth: 0 - - - uses: Stephen-Collins-tech/hotspots-action@v1 - id: hotspots - with: - fail-on: never - github-token: ${{ secrets.GITHUB_TOKEN }} - - - name: Custom Failure Logic - if: 
steps.hotspots.outputs.passed == 'false' - run: | - echo "::warning::Hotspots found violations" - echo "Summary: ${{ steps.hotspots.outputs.summary }}" - # Custom notification, Slack message, etc. -``` - -### Multi-Language Project - -```yaml -name: Hotspots - -on: [pull_request] - -jobs: - analyze: - runs-on: ubuntu-latest - steps: - - uses: actions/checkout@v4 - with: - fetch-depth: 0 - - - uses: Stephen-Collins-tech/hotspots-action@v1 - with: - path: . # Analyzes all supported files - config: .hotspotsrc.json - github-token: ${{ secrets.GITHUB_TOKEN }} -``` - -`.hotspotsrc.json`: -```json -{ - "include": [ - "src/**/*.{ts,js}", - "backend/**/*.go", - "scripts/**/*.py" - ], - "exclude": [ - "**/*.test.*", - "**/node_modules/**" - ] -} -``` - ---- - -## Permissions - -### Required Permissions - -```yaml -permissions: - contents: read # Checkout code - pull-requests: write # Post PR comments -``` - -### Minimal Permissions - -If you don't want PR comments: - -```yaml -permissions: - contents: read - -jobs: - analyze: - steps: - - uses: Stephen-Collins-tech/hotspots-action@v1 - with: - post-comment: false -``` - ---- - -## Troubleshooting - -### "failed to extract git context" - -**Cause:** Shallow git clone. - -**Fix:** Use `fetch-depth: 0`: -```yaml -- uses: actions/checkout@v4 - with: - fetch-depth: 0 -``` - -### "merge-base not found" - -**Cause:** Base branch not available. - -**Fix:** Fetch base branch: -```yaml -- uses: actions/checkout@v4 - with: - fetch-depth: 0 - ref: ${{ github.event.pull_request.head.ref }} - -- name: Fetch base branch - run: git fetch origin ${{ github.event.pull_request.base.ref }} -``` - -### PR Comments Not Posting - -**Causes:** -1. Missing `pull-requests: write` permission -2. `github-token` not provided -3. 
`post-comment: false` - -**Fix:** -```yaml -permissions: - pull-requests: write - -steps: - - uses: Stephen-Collins-tech/hotspots-action@v1 - with: - github-token: ${{ secrets.GITHUB_TOKEN }} - post-comment: true -``` - -### Binary Download Fails - -**Cause:** Network issues or unsupported platform. - -**Fix:** Build from source: -```yaml -- uses: dtolnay/rust-toolchain@stable - -- uses: Stephen-Collins-tech/hotspots-action@v1 - with: - version: build-from-source -``` - -### "Path does not exist" - -**Cause:** Invalid `path` input. - -**Fix:** Verify path exists: -```yaml -- name: List directory - run: ls -la src/ - -- uses: Stephen-Collins-tech/hotspots-action@v1 - with: - path: src/ -``` - ---- - -## Advanced Usage - -### Conditional Execution - -```yaml -# Only on specific branches -on: - pull_request: - branches: [main, develop] - -# Only on specific paths -on: - pull_request: - paths: - - 'src/**' - - '**.ts' - - '**.js' -``` - -### Matrix Strategy - -```yaml -jobs: - analyze: - runs-on: ubuntu-latest - strategy: - matrix: - path: [frontend, backend, shared] - steps: - - uses: actions/checkout@v4 - with: - fetch-depth: 0 - - - uses: Stephen-Collins-tech/hotspots-action@v1 - with: - path: packages/${{ matrix.path }} - github-token: ${{ secrets.GITHUB_TOKEN }} -``` - -### Scheduled Analysis - -```yaml -name: Weekly Complexity Report - -on: - schedule: - - cron: '0 0 * * 0' # Every Sunday at midnight - -jobs: - analyze: - runs-on: ubuntu-latest - steps: - - uses: actions/checkout@v4 - - - uses: Stephen-Collins-tech/hotspots-action@v1 - with: - fail-on: never # Just report, don't fail - - - uses: actions/upload-artifact@v4 - with: - name: weekly-complexity-report - path: .hotspots/report.html -``` - ---- - -## Performance - -### Caching - -The action automatically caches: -- Hotspots binary (per version) -- Analysis snapshots (`.hotspots/`) - -**Cache behavior:** -- Binary: Cached for 7 days -- Snapshots: Persisted in repository - -### Execution Time - -**Typical 
execution times:** -- Small project (<100 files): 10-30 seconds -- Medium project (100-500 files): 30-60 seconds -- Large project (500+ files): 1-3 minutes - -**Optimization tips:** -- Use `path` to analyze specific directories -- Exclude test files with config -- Pin to specific `version` (avoids version checks) - ---- - -## Related Documentation - -- [CLI Reference](../reference/cli.md) - Command-line interface -- [CI/CD Guide](./ci-cd.md) - CI/CD setup for all systems -- [Configuration](./configuration.md) - Config file options -- [Output Formats](./output-formats.md) - JSON, HTML, Text formats -- [Policy Engine](./usage.md#policy-engine) - Policy rules - ---- - -## Getting Help - -- 💬 [GitHub Discussions](https://github.com/Stephen-Collins-tech/hotspots/discussions) -- 📧 [Open an Issue](https://github.com/Stephen-Collins-tech/hotspots/issues) -- 📖 [Documentation](https://docs.hotspots.dev) - ---- - -## Next Steps - -After setting up the action: - -1. **Tune thresholds** - Adjust for your codebase -2. **Add config file** - Customize behavior -3. **Review reports** - Understand complexity patterns -4. **Refactor hotspots** - Reduce high-complexity functions -5. **Monitor trends** - Track complexity over time - -**Happy analyzing!** 🚀 diff --git a/docs/guide/output-formats.md b/docs/guide/output-formats.md deleted file mode 100644 index 2ec9761..0000000 --- a/docs/guide/output-formats.md +++ /dev/null @@ -1,838 +0,0 @@ -# Output Formats - -Hotspots supports five output formats: JSON (machine-readable), HTML (interactive reports), Text (human-readable terminal output), SARIF (GitHub code scanning), and JSONL (streaming). 
- -## Format Overview - -| Format | Use Case | Features | -|--------|----------|----------| -| **JSON** | CI/CD, tooling, AI agents | Structured, versioned schema, machine-parseable | -| **HTML** | Reports, dashboards, sharing | Interactive, charts, filterable, standalone | -| **Text** | Terminal, quick inspection | Color-coded, compact, human-friendly | -| **SARIF** | GitHub code scanning | SARIF 2.1.0, annotations on PRs and files | -| **JSONL** | Streaming pipelines | One JSON object per line | - ---- - -## JSON Format - -Machine-readable structured output for programmatic consumption. - -### Basic Usage - -```bash -# Snapshot mode JSON -hotspots analyze src/ --format json - -# Delta mode JSON -hotspots analyze src/ --mode delta --format json - -# With policy evaluation -hotspots analyze src/ --mode delta --policy --format json -``` - -### Output Structure - -```json -{ - "schema_version": 2, - "commit": { - "sha": "abc123def456...", - "parents": ["def456..."], - "timestamp": 1704067200, - "branch": "main" - }, - "analysis": { - "scope": "full", - "tool_version": "1.0.0" - }, - "functions": [ - { - "function_id": "/path/to/file.ts::functionName", - "file": "/absolute/path/to/file.ts", - "line": 42, - "metrics": { - "cc": 8, - "nd": 2, - "fo": 4, - "ns": 2 - }, - "lrs": 7.2, - "band": "high", - "patterns": ["complex_branching", "god_function"], - "pattern_details": [ - { - "id": "complex_branching", - "tier": 1, - "kind": "primitive", - "triggered_by": [ - { "metric": "cc", "op": ">=", "value": 8, "threshold": 10 }, - { "metric": "nd", "op": ">=", "value": 2, "threshold": 4 } - ] - } - ] - } - ], - "aggregates": { - "total_functions": 150, - "by_band": { - "low": 80, - "moderate": 45, - "high": 20, - "critical": 5 - }, - "average_lrs": 4.3 - }, - "policy_results": { - "failed": [], - "warnings": [ - { - "id": "watch-threshold", - "level": "info", - "function_id": "/path/to/file.ts::someFunction", - "message": "Function approaching moderate threshold", - 
"metadata": { - "current_lrs": 2.8, - "threshold": 3.0 - } - } - ] - } -} -``` - -### Schema Documentation - -Complete JSON schema definitions available: - -- **`hotspots-output.schema.json`** - Full output structure -- **`function-report.schema.json`** - Function analysis format -- **`metrics.schema.json`** - Raw metrics (CC, ND, FO, NS) -- **`policy-result.schema.json`** - Policy violations/warnings - -See the [JSON Schema Reference section](#json-schema-reference) below for complete documentation. - -### Fields Reference - -#### Top-Level Fields - -| Field | Type | Description | -|-------|------|-------------| -| `schema_version` | number | Schema version (currently `2` for snapshot/delta output) | -| `commit` | object | Git commit metadata | -| `analysis` | object | Analysis metadata | -| `functions` | array | Function analysis results | -| `aggregates` | object | Optional aggregated statistics | -| `policy_results` | object | Optional policy evaluation results | - -#### Commit Metadata - -| Field | Type | Description | -|-------|------|-------------| -| `sha` | string | Git commit SHA (40 chars) | -| `parents` | string[] | Parent commit SHAs | -| `timestamp` | number | Unix timestamp | -| `branch` | string | Branch name (optional) | - -#### Function Report - -| Field | Type | Description | -|-------|------|-------------| -| `function_id` | string | Unique identifier: `file::functionName` | -| `file` | string | Absolute file path | -| `line` | number | Line number where function starts | -| `metrics` | object | Raw complexity metrics | -| `lrs` | number | Leverage Risk Score | -| `band` | string | Risk band: `low`, `moderate`, `high`, `critical` | -| `suppression_reason` | string | Optional suppression comment | -| `patterns` | string[] | Code patterns detected (e.g. `"god_function"`, `"complex_branching"`). Omitted when empty. | -| `pattern_details` | object[] | Per-pattern trigger details. Present only when `--explain-patterns` is set. 
| - -**`pattern_details` entry:** - -| Field | Type | Description | -|-------|------|-------------| -| `id` | string | Pattern identifier (e.g. `"complex_branching"`) | -| `tier` | number | `1` = structural (always available), `2` = enriched (snapshot mode only) | -| `kind` | string | `"primitive"` or `"derived"` | -| `triggered_by` | object[] | Each condition that caused the pattern to fire: `metric`, `op`, `value`, `threshold` | - -#### Metrics Object - -| Field | Type | Description | -|-------|------|-------------| -| `cc` | number | Cyclomatic Complexity | -| `nd` | number | Nesting Depth | -| `fo` | number | Fan-Out (function calls) | -| `ns` | number | Non-Structured exits | - -### jq Examples - -```bash -# Extract high-risk functions -jq '.functions[] | select(.band == "high" or .band == "critical")' output.json - -# Count by risk band -jq '.aggregates.by_band' output.json - -# Top 10 by LRS -jq '.functions | sort_by(.lrs) | reverse | .[0:10]' output.json - -# Average LRS -jq '[.functions[].lrs] | add / length' output.json - -# Functions with policy violations -jq '.policy_results.failed[] | .function_id' output.json - -# Functions carrying any pattern -jq '.functions[] | select(.patterns | length > 0) | {id: .function_id, patterns}' output.json - -# Count functions per pattern -jq '[.functions[].patterns[]?] | group_by(.) | map({pattern: .[0], count: length})' output.json - -# Functions with god_function pattern -jq '.functions[] | select(.patterns[]? == "god_function") | .function_id' output.json -``` - ---- - -## HTML Format - -Interactive HTML reports with charts and visualizations. 
- -### Basic Usage - -```bash -# Snapshot mode HTML -hotspots analyze src/ --mode snapshot --format html - -# Snapshot HTML with model risk map -hotspots analyze src/ --mode snapshot --format html --include-models - -# Delta mode HTML -hotspots analyze src/ --mode delta --format html - -# Custom output path -hotspots analyze src/ --mode snapshot --format html --output reports/complexity.html -``` - -**Default output:** `.hotspots/report.html` - -### Report Features - -**Overview Dashboard:** -- Total functions analyzed -- Risk band distribution (pie chart) -- Average LRS -- Policy violations summary - -**Pattern Breakdown Panel** (shown above the function table when any patterns are detected): -- Frequency chips for each detected pattern, sorted by count descending -- One chip per pattern: count, name, and short description -- Per-pattern color coding — warm reds/ambers for Tier 1 structural patterns, cool blues/purples for Tier 2 enriched patterns, dark crimson for `volatile_god` -- Dark mode support - -**Function Table:** -- Sortable by LRS, CC, ND, FO, NS -- Filterable by risk band and driver label -- Searchable by function name -- Color-coded risk levels and driver badges -- Pattern column: colored pill badges per detected pattern -- Action column: per-function refactoring recommendation (driver × quadrant) - -**Charts (snapshot mode):** -- Risk band distribution (donut chart) -- **Risk Landscape scatter plot:** all functions plotted by Complexity (LRS, x-axis) vs Change Frequency (recent touches, y-axis). Dots are color-coded by risk band (critical = orange, high = amber, moderate = teal, low = grey). Dashed median lines divide the chart into four quadrants — the top-right quadrant (high complexity + high churn) is the classic Tornhill hotspot zone. Hover any dot to see function name, file path, LRS, and touch count. 
-- Historical trend charts: stacked band count, activity risk line, top-1% share line - (requires ≥2 prior snapshots; up to 30 history points; hover for per-bar detail) - -**Model Risk Map** (when generated with `--include-models`): -- Draggable relationship graph of top data/control models -- Node size reflects concentrated associated function risk -- Edge width reflects shared AST/CFG-derived associated references -- Collapsed raw model table for audit detail - -**Delta Mode Additions:** -- Before/after comparison -- Complexity changes (Δ LRS) -- New functions highlighted -- Removed functions shown -- Policy violation details - -### Opening Reports - -```bash -# Generate and open in browser -hotspots analyze src/ --mode snapshot --format html -open .hotspots/report.html # macOS -xdg-open .hotspots/report.html # Linux -start .hotspots/report.html # Windows -``` - -### CI/CD Artifacts - -**GitHub Actions:** -```yaml -- uses: Stephen-Collins-tech/hotspots-action@v1 - id: hotspots - with: - github-token: ${{ secrets.GITHUB_TOKEN }} - -- uses: actions/upload-artifact@v4 - if: always() - with: - name: hotspots-report - path: ${{ steps.hotspots.outputs.report-path }} - retention-days: 30 -``` - -**GitLab CI:** -```yaml -artifacts: - paths: - - .hotspots/report.html - expire_in: 1 week -``` - -### Sharing Reports - -HTML reports are self-contained (embedded CSS/JS): - -```bash -# Email report -echo "See attached complexity report" | mail -a .hotspots/report.html team@example.com - -# Upload to S3 -aws s3 cp .hotspots/report.html s3://my-bucket/reports/complexity-$(date +%Y%m%d).html - -# Serve with HTTP server -python3 -m http.server 8000 --directory .hotspots -# Open http://localhost:8000/report.html -``` - ---- - -## Text Format - -Human-readable terminal output with color coding. 
- -### Basic Usage - -```bash -# Default text output -hotspots analyze src/ - -# Explicit text format -hotspots analyze src/ --format text - -# Delta mode with policy -hotspots analyze src/ --mode delta --policy --format text - -# Show pattern trigger details inline -hotspots analyze src/ --explain-patterns -hotspots analyze src/ --mode snapshot --format text --explain --explain-patterns -``` - -### Snapshot Mode Output - -``` -Hotspots Analysis -================================================================================ - -Functions by Risk Band: - -Critical (LRS ≥ 9.0): -processComplexOrder /src/orders.ts:142 LRS 10.2 CC 15 ND 4 FO 8 NS 3 -handlePaymentFlow /src/payments.ts:89 LRS 9.5 CC 12 ND 3 FO 6 NS 4 - -High (6.0 ≤ LRS < 9.0): -validateUserInput /src/validation.ts:23 LRS 7.8 CC 10 ND 2 FO 5 NS 2 -generateReport /src/reports.ts:156 LRS 6.5 CC 8 ND 3 FO 4 NS 1 - -Moderate (3.0 ≤ LRS < 6.0): -formatDate /src/utils.ts:45 LRS 4.2 CC 5 ND 1 FO 2 NS 1 - -Summary: - Total functions: 150 - Critical: 2 - High: 2 - Moderate: 45 - Low: 101 - Average LRS: 3.2 -``` - -### Snapshot Mode with `--explain` - -The `--explain` flag adds per-function detail: driver label, recommended action, and — for -`composite`-labeled functions — the top near-miss dimensions with their percentile ranks. 
- -``` -hotspots analyze src/ --mode snapshot --format text --explain -``` - -``` -processComplexOrder /src/orders.ts:142 - LRS: 10.2 | Band: critical | Driver: cc - CC: 15, ND: 4, FO: 8, NS: 3 - Action: Reduce branching; extract sub-functions - -handlePaymentFlow /src/payments.ts:89 - LRS: 9.5 | Band: critical | Driver: composite - CC: 12, ND: 3, FO: 6, NS: 4 - Action: Multiple complexity dimensions — address the highest first - Near-threshold: fan_out (P78), cc (P72), nd (P61) - -validateUserInput /src/validation.ts:23 - LRS: 7.8 | Band: high | Driver: nd - CC: 10, ND: 2, FO: 5, NS: 2 - Action: Reduce nesting depth; early returns help -``` - -`Near-threshold` appears only for `composite` functions and lists up to 3 dimensions at -or above the 40th percentile across all analysed functions (a cutoff slightly below the -median), sorted by percentile rank descending. This makes multi-factor functions interpretable without -changing how the driver label is computed. - -### Delta Mode Output - -``` -Delta Analysis -================================================================================ - -Changed Functions: - -processOrder /src/orders.ts:42 - Before: LRS 6.5 (high) CC 8 ND 2 FO 4 NS 1 - After: LRS 7.2 (high) CC 10 ND 2 FO 5 NS 2 - Change: +0.7 LRS (+10.8%) ⚠️ Regression - -validateInput /src/validation.ts:15 - Before: LRS 5.2 (moderate) CC 6 ND 2 FO 3 NS 1 - After: LRS 4.8 (moderate) CC 5 ND 2 FO 3 NS 1 - Change: -0.4 LRS (-7.7%) ✅ Improvement - -New Functions: -handleNewFeature /src/features.ts:89 LRS 3.2 (moderate) - -Removed Functions: -deprecatedFunction /src/legacy.ts:123 LRS 8.5 (high) - -Summary: - Changed: 2 - New: 1 - Removed: 1 - Net ΔLRS: +0.3 -``` - -### Delta Mode with Policy Output - -``` -Policy Evaluation Results -================================================================================ - -Policy failures: -- no-regressions: /src/orders.ts::processOrder - -Violating functions: -Function Before After ΔLRS Policy 
----------------------------------------------------------------------------------------------- -processOrder high high +0.70 no-regressions - -Watch Level (approaching moderate threshold): -Function Current LRS Band ----------------------------------------------------------------- -formatCurrency 2.8 low - -Attention Level (approaching high threshold): -Function Current LRS Band ----------------------------------------------------------------- -validateEmail 5.7 moderate - -Summary: - Blocking failures: 1 - Watch warnings: 1 - Attention warnings: 1 -``` - -### Color Coding - -Terminal output uses ANSI colors: - -- **Critical (red):** LRS ≥ 9.0 -- **High (yellow):** 6.0 ≤ LRS < 9.0 -- **Moderate (blue):** 3.0 ≤ LRS < 6.0 -- **Low (green):** LRS < 3.0 - -**Disable colors:** -```bash -NO_COLOR=1 hotspots analyze src/ --format text -``` - ---- - -## SARIF Format - -SARIF 2.1.0 output for GitHub code scanning. Produces inline annotations on pull requests and file views in the GitHub Security tab. - -### Basic Usage - -```bash -# Snapshot mode SARIF — write to a file with --output -hotspots analyze . --mode snapshot --format sarif --output .hotspots/results.sarif - -# Or capture stdout directly -hotspots analyze . --mode snapshot --format sarif > results.sarif -``` - -**Requires:** `--mode snapshot`. SARIF is not supported with `--mode delta`. - -**Note:** Unlike HTML, SARIF has no default output file. Without `--output`, SARIF is written to stdout. Always pass `--output ` or redirect stdout when using `upload-sarif`. - -### GitHub Code Scanning Integration - -Upload the SARIF file as a GitHub code scanning result using the `upload-sarif` action: - -```yaml -- uses: actions/checkout@v4 - with: - fetch-depth: 0 - -- name: Run Hotspots (snapshot) - run: hotspots analyze . 
--mode snapshot --format sarif --output .hotspots/results.sarif - -- name: Upload SARIF to GitHub - uses: github/codeql-action/upload-sarif@v3 - with: - sarif_file: .hotspots/results.sarif -``` - -This produces: -- **Inline annotations** on changed files in pull requests -- **Security tab alerts** grouped by rule (critical / high / moderate) -- **Dismissable alerts** for acknowledged risk - -### Risk Band Mapping - -| Hotspots band | SARIF level | GitHub display | -|---------------|-------------|----------------| -| `critical` | `error` | Red annotation | -| `high` | `warning` | Yellow annotation | -| `moderate` | `note` | Blue annotation | -| `low` | — | Not emitted | - -Only functions at `moderate` risk or above are included in the SARIF output. - -### Output Structure - -The emitted file follows SARIF 2.1.0 with `uriBaseId: "%SRCROOT%"` so GitHub resolves file paths correctly: - -```json -{ - "$schema": "https://docs.oasis-open.org/sarif/sarif/v2.1.0/...", - "version": "2.1.0", - "runs": [{ - "tool": { - "driver": { - "name": "hotspots", - "version": "1.9.0", - "informationUri": "https://hotspots.dev", - "rules": [ - { "id": "hotspots/critical-risk", "defaultConfiguration": { "level": "error" } }, - { "id": "hotspots/high-risk", "defaultConfiguration": { "level": "warning" } }, - { "id": "hotspots/moderate-risk", "defaultConfiguration": { "level": "note" } } - ] - } - }, - "results": [ - { - "ruleId": "hotspots/high-risk", - "level": "warning", - "message": { "text": "Function `processOrder` has a high risk score (LRS=7.20, CC=10)." 
}, - "locations": [{ - "physicalLocation": { - "artifactLocation": { "uri": "src/orders.ts", "uriBaseId": "%SRCROOT%" }, - "region": { "startLine": 42 } - } - }] - } - ] - }] -} -``` - ---- - -## Format Comparison - -### When to Use Each Format - -**JSON:** -- ✅ CI/CD pipelines -- ✅ Tooling integration -- ✅ AI agent consumption -- ✅ Data analysis -- ✅ Long-term storage -- ❌ Human inspection (use HTML/Text) - -**HTML:** -- ✅ Sharing reports -- ✅ Dashboards -- ✅ Historical tracking -- ✅ Presentations -- ✅ Non-technical stakeholders -- ❌ Programmatic parsing (use JSON) - -**Text:** -- ✅ Terminal inspection -- ✅ Quick checks -- ✅ Git hooks -- ✅ Local development -- ❌ Automation (use JSON) -- ❌ Visual analysis (use HTML) - -**SARIF:** -- ✅ GitHub code scanning annotations -- ✅ Security tab integration -- ✅ PR inline comments on risk functions -- ❌ Programmatic analysis (use JSON) -- ❌ Human inspection (use Text/HTML) - ---- - -## Output Redirection - -### Save to File - -```bash -# JSON -hotspots analyze src/ --format json > analysis.json - -# Text -hotspots analyze src/ --format text > analysis.txt - -# HTML (use --output instead) -hotspots analyze src/ --mode snapshot --format html --output report.html -``` - -### Pipe to Tools - -```bash -# Parse with jq -hotspots analyze src/ --format json | jq '.functions[] | select(.band == "critical")' - -# Filter with grep -hotspots analyze src/ --format text | grep "Critical" - -# Count lines -hotspots analyze src/ --format text | wc -l -``` - ---- - -## Related Documentation - -- [CLI Reference](../reference/cli.md) - Command-line options -- [CI/CD & GitHub Action](./ci-cd.md) - Using output in pipelines -- [Configuration](./configuration.md) - Filter and threshold options - ---- - -## JSON Schema Reference - -Hotspots produces versioned JSON output. The `schema_version` field indicates the format. 
- -### Schema Versions - -| Version | Scope | Added fields | -|---------|-------|--------------| -| **v4** (current default snapshot JSON) | Agent-optimized snapshot output | `summary`, `fire`/`debt`/`watch`/`ok` triage buckets, per-function `action` text, `architecture` aggregates, optional `architecture.models` | -| **v3** | Agent-optimized snapshot output | triage buckets and per-function `action` text before architecture aggregates were nested | -| **v2** | Full snapshot JSON (`--all-functions`) and delta output | `driver`, `driver_detail`, `patterns`, `pattern_details`, enriched `aggregates` (file_risk, co_change, modules) | -| **v1** | Delta output (legacy constant) | — | - -Always check `schema_version` before consuming output in tooling. - -### Driver Labels - -Each function includes an optional `driver` string identifying the primary source of risk: - -| Label | Condition | Recommended action | -|-------|-----------|-------------------| -| `cyclic_dep` | Function is in a dependency cycle (SCC size > 1) | Break the cycle before adding more callers | -| `high_complexity` | CC above the Pth percentile | Schedule a refactor; extract sub-functions | -| `high_churn_low_cc` | touch_count above Pth percentile and CC below (100-P)th | Add regression tests before next change | -| `high_fanout_churning` | fan_out above Pth percentile and touch above 50th | Extract an interface boundary | -| `deep_nesting` | ND above the Pth percentile | Flatten with early returns or guard clauses | -| `high_fanin_complex` | fan_in above Pth percentile and CC above 50th | Extract and stabilize; wide blast radius | -| `composite` | None of the above specific drivers | Monitor complexity trends | - -Thresholds are percentile-relative (default P=75, configurable via `driver_threshold_percentile`). `cyclic_dep` is the sole absolute check. 
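The driver table above can be read as a chain of percentile checks. The following is a minimal sketch of that reading, not the Hotspots implementation: the `Percentiles` shape, the `driverLabel` name, and in particular the order in which the rules are tried (taken from the table's row order) are assumptions made for illustration.

```typescript
// Hypothetical sketch: type names, function name, and rule ordering are
// assumptions for illustration, not taken from the Hotspots source.
interface Percentiles {
  cc: number;      // percentile rank (0-100) of cyclomatic complexity
  nd: number;      // percentile rank of nesting depth
  fanOut: number;  // percentile rank of fan-out
  fanIn: number;   // percentile rank of fan-in
  touches: number; // percentile rank of touch count
}

type Driver =
  | "cyclic_dep"
  | "high_complexity"
  | "high_churn_low_cc"
  | "high_fanout_churning"
  | "deep_nesting"
  | "high_fanin_complex"
  | "composite";

function driverLabel(p: Percentiles, inCycle: boolean, threshold = 75): Driver {
  // The only absolute check: membership in a dependency cycle.
  if (inCycle) return "cyclic_dep";
  // Every other check is relative to the configured percentile P.
  if (p.cc > threshold) return "high_complexity";
  if (p.touches > threshold && p.cc < 100 - threshold) return "high_churn_low_cc";
  if (p.fanOut > threshold && p.touches > 50) return "high_fanout_churning";
  if (p.nd > threshold) return "deep_nesting";
  if (p.fanIn > threshold && p.cc > 50) return "high_fanin_complex";
  // Nothing fired: fall back to the catch-all label.
  return "composite";
}
```

Under these assumptions, a function at P72 for CC and P68 for ND with low churn and fan-in falls through every specific rule and is labeled `composite`, which is the case the near-miss detail is meant to explain.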
- -### `driver_detail` — Near-miss context for composite functions - -When a function receives the `composite` label, `driver_detail` lists the top dimensions (up to 3) that came closest to firing a specific label, with their percentile rank. Example: `"cc (P72), nd (P68)"` means CC is at the 72nd percentile and ND at the 68th — notable but below the P75 threshold. Only dimensions above the 40th percentile are included. - -`driver_detail` is omitted from JSON when null (forward-compatible). - -### Architecture Aggregates (Default Snapshot JSON) - -Default snapshot JSON groups architectural concentration under `architecture`: -`architecture.file_risk`, `architecture.modules`, and, when requested, -`architecture.models`. - -### Aggregates (`--all-functions` Snapshot JSON) - -Snapshot output includes an `aggregates` object with these arrays: - -#### `aggregates.file_risk` — File-Level Risk - -Each entry covers one source file, ranked by `file_risk_score` descending. - -```typescript -{ - file: "src/api.ts", - function_count: 12, - loc: 340, - max_cc: 14, - avg_cc: 6.8, - critical_count: 2, - file_churn: 180, - file_risk_score: 8.3 // max_cc×0.4 + avg_cc×0.3 + log2(fn_count+1)×0.2 + churn×0.1 -} -``` - -#### `aggregates.co_change` — Co-Change Coupling - -Pairs of files that frequently change together in the same commit. High coupling with no static dependency = hidden implicit coupling. - -```typescript -{ - file_a: "hotspots-cli/src/main.rs", - file_b: "hotspots-core/src/aggregates.rs", - co_change_count: 14, - coupling_ratio: 0.78, - has_static_dep: false, - risk: "high" // "high" | "moderate" | "expected" | "low" -} -``` - -`risk: "expected"` means a static import exists between the files — the co-change is explained. Default window: 90 days; minimum count: 3. - -#### `aggregates.modules` — Module Instability - -Robert Martin's instability metric at the directory level: `instability = efferent / (afferent + efferent)`. 
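As a quick worked check of the instability formula, here is a small sketch using the values `afferent = 8` and `efferent = 3` from this section's example entry. The function name and the handling of a module with no couplings are assumptions, not part of Hotspots.

```typescript
// Hypothetical helper illustrating instability = efferent / (afferent + efferent).
function instability(afferent: number, efferent: number): number {
  const total = afferent + efferent;
  // The ratio is undefined for a module with no couplings at all;
  // returning 0 in that case is an assumption made for this sketch.
  return total === 0 ? 0 : efferent / total;
}

console.log(instability(8, 3).toFixed(2)); // prints "0.27"
```

With 8 incoming and 3 outgoing dependencies, the ratio is 3 / 11, which rounds to 0.27: a module that many others depend on, and therefore one that is risky to change.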
- -```typescript -{ - module: "hotspots-core/src", - file_count: 12, - function_count: 409, - avg_complexity: 3.2, - afferent: 8, // external modules depending on this one - efferent: 3, // external modules this one depends on - instability: 0.27, // near 0 = risky to change; near 1 = safe - module_risk: "high" -} -``` - -#### `aggregates.models` / `architecture.models` — Model Risk Map - -Present only when snapshot JSON is generated with `--include-models`. - -```typescript -{ - items: [{ - name: "Snapshot", - file: "hotspots-core/src/snapshot.rs", - line: 219, - kind: "struct", - score: 52.11, - critical: 4, - high: 15, - moderate: 17, - functions: [{ - function: "Snapshot::populate_callgraph", - file: "hotspots-core/src/snapshot.rs", - line: 661, - lrs: 9.24, - quadrant: "debt", - association: "same-file" - }] - }], - links: [{ - source: 0, - target: 2, - shared_functions: 15, - shared_risk: 83.53, - functions: [ - "GoCfgBuilderState::visit_switch", - "GoCfgBuilderState::visit_select" - ] - }] -} -``` - -### Schema Files - -JSON Schema definitions are available in the `schemas/` directory: - -- `hotspots-output.schema.json` — Complete output schema -- `function-report.schema.json` — Individual function analysis -- `metrics.schema.json` — Raw metrics (CC, ND, FO, NS) -- `policy-result.schema.json` — Policy violation/warning format - -All schemas follow JSON Schema Draft 07. 
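To follow the advice to check `schema_version` before consuming output, a consumer might guard its parsing along these lines. This is a sketch only: the function name and the list of versions a tool chooses to accept are hypothetical.

```typescript
// Hypothetical guard: the accepted-version list is an assumption chosen by
// the consuming tool, not something Hotspots prescribes.
const SUPPORTED_VERSIONS: number[] = [2, 4];

function assertSupportedSchema(raw: string): Record<string, unknown> {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("expected a JSON object");
  }
  const doc = parsed as Record<string, unknown>;
  const version = doc["schema_version"];
  // Reject anything that is missing, non-numeric, or not explicitly supported.
  if (typeof version !== "number" || SUPPORTED_VERSIONS.indexOf(version) === -1) {
    throw new Error(`unsupported schema_version: ${String(version)}`);
  }
  return doc;
}
```

Failing fast here keeps a tool written against one layout (for example `aggregates`) from silently misreading output that nests the same data elsewhere (for example under `architecture`).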
- ---- - -## Examples - -### Extract Critical Functions (JSON) - -```bash -hotspots analyze src/ --format json | \ - jq -r '.functions[] | select(.band == "critical") | "\(.function_id): \(.lrs)"' -``` - -### Generate Report Card (Text) - -```bash -hotspots analyze src/ --format text | \ - grep -A 10 "Summary:" -``` - -### Track Complexity Over Time (JSON) - -```bash -# Daily snapshots -hotspots analyze src/ --mode snapshot --format json > "reports/$(date +%Y%m%d).json" - -# Compare last 7 days -for file in reports/*.json; do - avg_lrs=$(jq '.aggregates.average_lrs' "$file") - echo "$(basename $file .json): $avg_lrs" -done -``` - -### Custom HTML Dashboard - -Embed HTML report in iframe: - -```html - - - - Complexity Dashboard - - -

Latest Complexity Report

- - - -``` - ---- - -**Need more examples?** Check out [examples/output-formats/](https://github.com/Stephen-Collins-tech/hotspots/tree/main/examples/output-formats). - -**Using in CI/CD?** See [CI/CD & GitHub Action](./ci-cd.md) for pipeline integration examples. diff --git a/docs/guide/suppression.md b/docs/guide/suppression.md deleted file mode 100644 index c2b74bb..0000000 --- a/docs/guide/suppression.md +++ /dev/null @@ -1,304 +0,0 @@ -# Suppression Comments - -Suppression comments allow you to exclude specific functions from policy violations while keeping them tracked in reports. This is useful for handling false positives, legacy code, and intentionally complex algorithms. - -## Quick Start - -Place a comment immediately before the function: - -```typescript -// hotspots-ignore: legacy code, refactor planned for Q2 2026 -function complexLegacyParser(input: string) { - // High complexity code... -} -``` - -## Comment Format - -**Required format:** -``` -// hotspots-ignore: -``` - -**Rules:** -1. Comment must be on the line **immediately before** the function -2. Format starts with `// hotspots-ignore:` -3. Reason is **required** after the colon (warning if missing) -4. Blank lines between comment and function break the suppression - -## Examples - -### Valid Suppressions - -```typescript -// hotspots-ignore: complex algorithm with proven test coverage -function fibonacci(n: number): number { - // ... -} - -// hotspots-ignore: generated code from protocol buffers -class MessageHandler { - handle() { /* ... */ } -} - -// hotspots-ignore: legacy parser, migration to new implementation in progress -const parse = (input: string) => { - // ... 
-}; -``` - -### Invalid Suppressions - -```typescript -// hotspots-ignore: reason here - -function foo() { } // ❌ Blank line breaks suppression - -// This is a comment -// hotspots-ignore: reason -function bar() { } // ❌ Other comment in between - -// hotspots-ignore -function baz() { } // ⚠️ Missing colon - treated as missing reason - -// hotspots-ignore: -function qux() { } // ⚠️ Warning: suppression without reason -``` - -## What Suppressions Affect - -### Excluded From (Policy Filtering) - -Suppressed functions are **excluded** from: - -1. **Critical Introduction** - Won't fail if function becomes critical -2. **Excessive Risk Regression** - Won't fail if LRS increases by ≥1.0 -3. **Watch Threshold** - No warning when entering watch range -4. **Attention Threshold** - No warning when entering attention range -5. **Rapid Growth** - No warning for rapid LRS increases - -### Included In (Still Tracked) - -Suppressed functions are **included** in: - -1. **Analysis Reports** - Visible with `suppression_reason` field -2. **Net Repo Regression** - Counted in total repository LRS -3. **Snapshots** - Persisted to `.hotspots/snapshots/` -4. **HTML Reports** - Displayed with suppression indicator -5. **JSON Output** - Contains `suppression_reason` field - -## Validation - -### Missing Reason Warning - -Functions suppressed without a reason trigger a warning: - -```typescript -// hotspots-ignore: -function foo() { } -``` - -**Policy output:** -```json -{ - "warnings": [ - { - "id": "suppression-missing-reason", - "severity": "warning", - "function_id": "src/foo.ts::foo", - "message": "Function src/foo.ts::foo suppressed without reason" - } - ] -} -``` - -This is a **warning only** (non-blocking), but encourages documenting suppressions. 
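The four rules under Comment Format can be sketched as a small lookup on the line directly above a function. This is an illustrative model, not the Hotspots extractor: `findSuppression`, its return shape, and the line-index convention are assumptions, and the sketch does not try to reproduce how neighbouring comments are handled.

```typescript
// Hypothetical model of the documented rules; names and shapes are assumptions.
interface Suppression {
  reason: string;         // empty string when no reason was given
  missingReason: boolean; // true corresponds to the documented warning case
}

const MARKER = "// hotspots-ignore";

// `lines` is the file split into lines; `fnLine` is the 0-based index of the
// line on which the function starts.
function findSuppression(lines: string[], fnLine: number): Suppression | null {
  if (fnLine <= 0) return null;
  const prevRaw = lines[fnLine - 1];
  if (prevRaw === undefined) return null;
  const prev = prevRaw.trim();
  // A blank line or an unrelated comment directly above means no suppression.
  if (prev.indexOf(MARKER) !== 0) return null;
  const rest = prev.slice(MARKER.length);
  // Anything other than end-of-line or a colon after the marker is not a match.
  if (rest !== "" && rest.charAt(0) !== ":") return null;
  const reason = rest.slice(1).trim();
  return { reason, missingReason: reason === "" };
}
```

In this model both `// hotspots-ignore` and `// hotspots-ignore:` suppress the function but set `missingReason`, matching the warning-only behaviour described above.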
- -## JSON Output - -Suppressed functions include a `suppression_reason` field: - -```json -{ - "file": "src/legacy.ts", - "function": "oldParser", - "line": 42, - "metrics": { "cc": 15, "nd": 5, "fo": 8, "ns": 3 }, - "lrs": 12.5, - "band": "critical", - "suppression_reason": "legacy code, refactor planned for Q2 2026" -} -``` - -**Notes:** -- Functions without suppressions omit this field (not `null`) -- Empty reason shows as `"suppression_reason": ""` - -## Best Practices - -### When to Suppress - -✅ **Good reasons to suppress:** -- Complex algorithms with established test coverage -- Legacy code pending scheduled migration -- Generated code (protocol buffers, GraphQL, etc.) -- Intentionally complex code (e.g., optimized parsers, state machines) -- Well-documented algorithms (e.g., cryptographic functions) - -### When NOT to Suppress - -❌ **Bad reasons to suppress:** -- New code that should be refactored -- "I'll fix it later" without a concrete plan -- Code that could be simplified but you don't want to -- Avoiding code review feedback -- Hiding poor design choices - -### Documentation Guidelines - -**Good suppression reasons include:** - -1. **What** - What makes this complex -2. **Why** - Why it's intentionally complex or not being fixed now -3. **When** - When it will be addressed (if applicable) - -**Examples:** - -```typescript -// ✅ Good: Specific, actionable, dated -// hotspots-ignore: RSA encryption algorithm, well-tested, cannot be simplified - -// ✅ Good: Clear plan -// hotspots-ignore: legacy parser, migration to TreeSitter in Q2 2026 - -// ❌ Bad: Vague, no plan -// hotspots-ignore: TODO fix this later - -// ❌ Bad: No reason -// hotspots-ignore: -``` - -### Code Review Guidelines - -**Require review for suppressions:** -1. All new suppressions should be reviewed -2. Ensure reason is documented and valid -3. Verify suppression is the right choice (vs. refactoring) -4. 
Consider adding a tracking issue/ticket - -**Periodic audits:** -- Review suppressed functions quarterly -- Check if reasons are still valid -- Remove suppressions when code is refactored -- Update reasons if plans change - -## Technical Details - -### Determinism - -Suppression extraction is **fully deterministic**: -- Pure function of (source code, function span, source map) -- No I/O, randomness, or timestamps -- Same source → same suppression → same results -- Byte-for-byte identical snapshots - -### Persistence - -Suppressions are persisted in snapshots: -- `FunctionSnapshot.suppression_reason` field -- `FunctionDeltaEntry.suppression_reason` field -- Tracked across commits in delta mode -- Auditable in git history - -### Schema Compatibility - -Suppression fields are **backward compatible**: -- Optional fields with `skip_serializing_if = "Option::is_none"` -- Old snapshots work with new code (field is `None`) -- New snapshots work with old code (field is ignored) -- No schema version bump required - -## Examples in Different Contexts - -### CI/CD Integration - -```yaml -# .github/workflows/complexity.yml -- name: Check complexity - run: | - hotspots analyze . --mode delta --policies --format json > delta.json - - # Exit code 1 if any blocking policies failed - # Suppressed functions won't cause failures -``` - -### Delta Mode - -When comparing commits, suppression status comes from the **current** version: - -```typescript -// Commit A -function foo() { } // Not suppressed - -// Commit B -// hotspots-ignore: newly suppressed -function foo() { } // Now suppressed - won't trigger policies -``` - -### Refactoring Workflow - -```bash -# 1. Suppress complex function before refactor -# Add: // hotspots-ignore: refactoring in progress - -# 2. Refactor the function -# ... make changes ... - -# 3. Remove suppression comment -# Function now subject to policies again -``` - -## Troubleshooting - -### Suppression Not Working - -**Check:** -1. 
Comment is immediately before function (no blank lines) -2. Format is correct: `// hotspots-ignore: reason` -3. Running in delta mode with `--policies` flag -4. Verify in JSON output: `suppression_reason` field present - -### Warning About Missing Reason - -**Fix:** -```typescript -// Before (warning) -// hotspots-ignore: -function foo() { } - -// After (no warning) -// hotspots-ignore: legacy code, refactor planned -function foo() { } -``` - -### Suppression Applies to Wrong Function - -**Cause:** Comment on wrong line or blank line in between - -**Fix:** -```typescript -// Wrong -// hotspots-ignore: reason - -function foo() { } // Suppression not applied - -// Correct -// hotspots-ignore: reason -function foo() { } // Suppression applied -``` - -## See Also - -- [Policy Engine Documentation](USAGE.md#policy-engine) -- [Configuration Guide](USAGE.md#configuration) -- [CI/CD Integration](USAGE.md#cicd-integration) diff --git a/docs/guide/training.md b/docs/guide/training.md deleted file mode 100644 index 616ceec..0000000 --- a/docs/guide/training.md +++ /dev/null @@ -1,155 +0,0 @@ -# Training a Ranker - -By default, Hotspots ranks functions by **LRS** — a structural complexity score derived from the code itself. This works out of the box on any repo. - -`hotspots train` fits a local RandomForest model against your repo's own git history, learning which structural features correlate with bug-fix commits in *your* codebase. Once trained, `hotspots analyze` picks up the model automatically and uses it to re-rank functions and update triage quadrants. 
- ---- - -## When it's worth training - -Training is most valuable when: - -- Your repo has **at least a year of meaningful commit history** -- Fix commits follow recognizable conventions (`fix:`, `bug`, `patch`, `hotfix`, `regression`, `defect`) -- Bugs have been concentrated in specific files or functions rather than spread uniformly - -Training is unlikely to help on: - -- Library or framework repos with low bug rates and tiny base rates -- Repos with less than ~1 year of history -- Repos with poor commit hygiene (unflagged fix commits can't be scanned) -- Stable mature codebases where bugs are subtle and spread across the whole codebase - -Use `--eval` (described below) to verify the model actually learned something before relying on it. - ---- - -## Basic usage - -```bash -# Train with file-level labels (fast) -hotspots train . - -# Train with blame-based function-level labels (slower, more precise) -hotspots train . --blame -``` - -The model is saved to `.hotspots/ranker.json`. Once it exists, `hotspots analyze` loads it automatically on every run. - ---- - -## Label modes - -### File-level (default) - -Every function in a file touched by a fix commit is marked as a positive training example. Fast, but noisy — a bug in one function marks every function in that file. - -### Blame-based (`--blame`) - -`git diff-tree` hunk headers are parsed to identify which exact function owned the changed lines. Only that function is marked positive. - -More precise signal, but requires one subprocess per fix commit. Recommended for repos with large files where a single file can contain many unrelated functions. - -```bash -# Recommended for most production repos -hotspots train . --blame -``` - ---- - -## Checking whether training helped (`--eval`) - -`hotspots train` always produces a model, but that doesn't mean the model is better than the default LRS ranking. 
`--eval` measures this by computing **Precision@K** — how many of the top K ranked functions actually appeared in a bug-fix commit: - -```bash -hotspots train . --blame --eval -``` - -Example output: - -``` -P@K evaluation (365-day fix-label window): - K P@K base_rate - 10 0.400 0.084 - 20 0.300 0.084 - 50 0.200 0.084 - 100 0.150 0.084 - 200 0.110 0.084 -``` - -**How to read it:** - -- `base_rate` is the fraction of all functions that appeared in any fix commit. It's the score a random ranker would achieve. -- If `P@K` is meaningfully above `base_rate` (especially at low K), the model is surfacing real bug-prone functions at the top of the list. -- If `P@K ≈ base_rate`, the model learned nothing useful. In that case, stick with the default LRS ranking — applying a weak model can demote functions that are genuinely risky. - ---- - -## Label window - -By default, the scanner looks back 365 days for fix commits. Adjust with `--label-window`: - -```bash -# Use 6 months of history -hotspots train . --blame --label-window 180 - -# Use 2 years for repos with sparse bug fixes -hotspots train . --blame --label-window 730 -``` - -Larger windows provide more training examples but may include stale patterns that no longer reflect the codebase. - ---- - -## What the model trains on - -The v3 model trains on 8 structural features: `lrs`, `cc`, `nd`, `loc`, `fo`, `fan_in`, `total_churn`, `authors_90d`. - -Windowed activity signals (`touch_count_30d`, `days_since_last_change`, `activity_risk`) are deliberately excluded to prevent temporal leakage — these signals would be computed from the same time window being used for labels, inflating apparent performance. - ---- - -## How it integrates with `hotspots analyze` - -Once `.hotspots/ranker.json` exists, every `hotspots analyze` run automatically: - -1. Loads the ranker and scores every function -2. Uses RF scores to determine triage quadrants (in place of the activity heuristic) -3. 
Promotes high-probability functions from `debt` → `fire` where the model score warrants it - -A suppression gate runs on every analysis to detect poor model performance. If `P@10` is at or below the base rate, the output will note that the model is not performing above baseline and suggest re-training or relying on the default LRS ranking. - ---- - -## Minimum requirements - -Training will return an error if: - -- The snapshot has fewer than **50 functions**, or -- The fix scan yields fewer than **5 positive** or **10 negative** labels - -If this happens: - -- Try a larger `--label-window` -- Run `hotspots analyze . --mode snapshot --force` to regenerate a fresh snapshot -- Verify fix commits use recognizable keywords (`fix:`, `bug`, `patch`, `hotfix`, `regression`, `defect`) - ---- - -## Re-training - -The model reflects the git history at the time it was trained. Re-train periodically as new fix commits accumulate: - -```bash -# Re-train and immediately re-analyze -hotspots train . --blame && hotspots analyze . -``` - -There is no automatic re-training — the model file is updated only when you explicitly run `hotspots train`. - ---- - -## Full options reference - -See [`hotspots train`](/reference/cli#hotspots-train) in the CLI reference for all flags and defaults. 
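
For readers who want to sanity-check the `--eval` table by hand, the Precision@K and base-rate figures described earlier in this guide can be sketched as follows. The function names and the boolean-array input are assumptions for illustration, and dividing by the number of available items when fewer than K exist is a choice made for this sketch.

```typescript
// `ranked` holds one flag per function, best-ranked first; true means the
// function appeared in a fix commit. Names and input shape are hypothetical.
function precisionAtK(ranked: boolean[], k: number): number {
  const top = ranked.slice(0, k);
  if (top.length === 0) return 0;
  // Fraction of the top-K functions that carry a positive label.
  return top.filter((hit) => hit).length / top.length;
}

// Fraction of all functions with a positive label: what a random ranker scores.
function baseRate(ranked: boolean[]): number {
  if (ranked.length === 0) return 0;
  return ranked.filter((hit) => hit).length / ranked.length;
}
```

A ranking is only worth keeping when `precisionAtK` at small K sits clearly above `baseRate`; when the two are about equal, the ordering carries no more signal than chance.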
diff --git a/docs/guide/usage.md b/docs/guide/usage.md deleted file mode 100644 index 4f9f6b9..0000000 --- a/docs/guide/usage.md +++ /dev/null @@ -1,619 +0,0 @@ -# Usage & Workflows - -## Basic Usage - -### Point-in-Time Analysis - -**Analyze a file or directory:** - -```bash -# Text output (default) -hotspots analyze src/ - -# JSON output -hotspots analyze src/ --format json - -# Analyze specific file -hotspots analyze src/api.ts -``` - -**Filter results:** - -```bash -# Show only top 10 most complex functions -hotspots analyze src/ --top 10 - -# Show only functions with LRS >= 5.0 -hotspots analyze src/ --min-lrs 5.0 - -# Combine filters -hotspots analyze src/ --top 10 --min-lrs 5.0 --format json -``` - -**Example output (text):** -``` -LRS File Line Function -11.2 src/api.ts 88 handleRequest -9.8 src/db/migrate.ts 41 runMigration -7.5 src/utils.ts 15 processData -``` - ---- - -## Git History Tracking - -### Prerequisites - -Hotspots must be run from within a git repository for snapshot/delta modes. - -### Creating Snapshots - -**Create a snapshot for the current commit:** - -```bash -# In a git repository -cd my-repo -hotspots analyze . --mode snapshot --format json -``` - -This will: -- Analyze all TypeScript files in the repository -- Create a snapshot with commit metadata (SHA, parents, timestamp, branch) -- Persist to `.hotspots/snapshots/.json` -- Update `.hotspots/index.json` - -**What gets stored:** -- All functions with their metrics (CC, ND, FO, NS, LRS, band) -- Commit information (SHA, parents, timestamp, branch) -- Function IDs (`::`) - -### Computing Deltas - -**Compare current state vs parent commit:** - -```bash -hotspots analyze . 
--mode delta --format json -``` - -This will: -- Load the parent snapshot (from `parents[0]`) -- Compare functions by `function_id` -- Show what changed: new, deleted, modified, unchanged -- Display metric deltas and band transitions - -**Delta output shows:** -- Function status (new/deleted/modified/unchanged) -- Before/after metrics and LRS -- Numeric deltas (cc, nd, fo, ns, lrs) -- Band transitions (e.g., "moderate" → "high") - -### Comparing Any Two Refs (`hotspots diff`) - -`hotspots diff` compares snapshots between any two git refs — not just a commit and its parent. Both refs must have existing snapshots. - -```bash -# Compare current branch against main -hotspots diff main HEAD - -# Compare two release tags -hotspots diff v1.0.0 v2.0.0 --format json - -# Review top 10 riskiest changes with policy check -hotspots diff main HEAD --top 10 --policy -``` - -**Prerequisites:** snapshots must exist for both refs. Create them with: - -```bash -# Create snapshot for each ref (use --force if one already exists) -git checkout main && hotspots analyze . --mode snapshot -git checkout my-branch && hotspots analyze . --mode snapshot -``` - -Policy is evaluated on the **full** changed set before any `--top` truncation, so violations in lower-ranked functions are never silently dropped. 
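The ordering guarantee above (policy first, truncation second) can be sketched as below. The `Delta` shape, the ranking key, and the example rule that flags an LRS increase of 1.0 or more are hypothetical stand-ins, not the actual policy engine.

```typescript
// Hypothetical shapes and rule, for illustration only.
interface Delta {
  functionId: string;
  lrsAfter: number; // LRS at the newer ref, used here as the ranking key
  lrsDelta: number; // change in LRS between the two refs
}

// Assumed example rule: flag any LRS increase of 1.0 or more.
const violates = (d: Delta): boolean => d.lrsDelta >= 1.0;

function reviewChanges(deltas: Delta[], top: number): { shown: Delta[]; violations: Delta[] } {
  // 1. The policy sees every changed function.
  const violations = deltas.filter(violates);
  // 2. Only the displayed list is ranked and truncated.
  const shown = deltas
    .slice()
    .sort((a, b) => b.lrsAfter - a.lrsAfter)
    .slice(0, top);
  return { shown, violations };
}
```

Because the filter runs on the untruncated list, a low-ranked function that regressed sharply still appears in `violations` even when `top` hides it from the displayed table.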
- -**Example delta output:** -```json -{ - "schema_version": 1, - "commit": { - "sha": "abc123", - "parent": "def456" - }, - "baseline": false, - "deltas": [ - { - "function_id": "src/api.ts::handleRequest", - "status": "modified", - "before": { - "metrics": {"cc": 5, "nd": 2, "fo": 3, "ns": 1}, - "lrs": 4.8, - "band": "moderate" - }, - "after": { - "metrics": {"cc": 7, "nd": 3, "fo": 3, "ns": 1}, - "lrs": 6.2, - "band": "high" - }, - "delta": { - "cc": 2, - "nd": 1, - "fo": 0, - "ns": 0, - "lrs": 1.4 - }, - "band_transition": { - "from": "moderate", - "to": "high" - } - } - ] -} -``` - ---- - -## Higher-Level Analysis - -In addition to per-function output, snapshot mode offers three higher-level views accessible -with `--level` or `--explain`. - -### File-Level Risk View (`--level file`) - -Aggregate per-function data up to the file level and see a ranked file table: - -```bash -# Ranked file risk table (requires --mode snapshot --format text) -hotspots analyze . --mode snapshot --format text --level file - -# Limit to top 20 files -hotspots analyze . --mode snapshot --format text --level file --top 20 -``` - -Columns: `#`, `file`, `fns`, `loc`, `max_cc`, `avg_cc`, `critical`, `churn`, `file_risk`. - -A file with 40 functions averaging cc=12 is a maintenance liability even if no single -function individually tops the per-function list. The composite `file_risk_score` captures this: - -``` -file_risk = max_cc × 0.4 + avg_cc × 0.3 + log2(fn_count + 1) × 0.2 + churn_factor × 0.1 -``` - -### Module Instability View (`--level module`) - -See Robert Martin's instability metric at the directory level: - -```bash -hotspots analyze . --mode snapshot --format text --level module -``` - -Columns: `#`, `module`, `files`, `fns`, `avg_cc`, `afferent`, `efferent`, `instability`, `risk`. 
- -- **Instability near 0.0** — everything depends on this module; risky to change -- **Instability near 1.0** — depends on others but nothing depends on it; safe to change -- **`module_risk = high`** — when `instability < 0.3` AND `avg_complexity > 10` - -The interesting hotspots are high-complexity modules with low instability (hard to change -AND everything depends on them). - -### Per-Function Explanations (`--explain`) - -See a human-readable breakdown of each function's risk score including individual metric -contributions, activity signals, and a co-change coupling section: - -```bash -hotspots analyze . --mode snapshot --format text --explain -hotspots analyze . --mode snapshot --format text --explain --top 10 -``` - -The co-change section at the bottom shows pairs of files that frequently change together -in the same commit. High co-change with no static dependency = hidden implicit coupling. -This signal is mined from the last 90 days of git history. - -### Snapshot Without Persisting (`--no-persist`) - -Run analysis in snapshot mode without writing to disk — useful for one-off inspection: - -```bash -hotspots analyze . --mode snapshot --no-persist --format json | jq .aggregates.file_risk -``` - -### Regenerating a Snapshot (`--force`) - -Snapshots are immutable by default. Use `--force` if you need to regenerate one: - -```bash -hotspots analyze . --mode snapshot --force -``` - -### Precise Per-Function Touch Metrics (`--per-function-touches`) - -By default, touch metrics are file-level. For more accurate per-function activity signals: - -```bash -hotspots analyze . --mode snapshot --per-function-touches -``` - -**Warning:** Approximately 50× slower. Only use when precise per-function touch counts are needed. - ---- - -## Common Workflows - -### Daily Development - -```bash -# 1. Check current complexity -hotspots analyze . --format text - -# 2. Before making changes, create snapshot -hotspots analyze . --mode snapshot --format json - -# 3. 
Make your changes... - -# 4. See what changed -hotspots analyze . --mode delta --format json - -# 5. Commit changes -git commit -m "Add feature" - -# 6. Create snapshot for new commit -hotspots analyze . --mode snapshot --format json -``` - -### CI/CD Integration - -**Mainline branch (persist snapshot for the merge commit):** - -```yaml -- name: Create snapshot - run: hotspots analyze . --mode snapshot --force -- name: Cache snapshot - uses: actions/cache/save@v4 - with: - path: .hotspots/snapshots - key: hotspots-snapshot-${{ github.sha }} -``` - -**PR branch (diff against base snapshot):** - -```yaml -- name: Restore base snapshot - uses: actions/cache/restore@v4 - with: - path: .hotspots/snapshots - key: hotspots-snapshot-${{ github.event.pull_request.base.sha }} -- name: Create HEAD snapshot - run: hotspots analyze . --mode snapshot --force -- name: Diff PR vs base - run: | - hotspots diff \ - ${{ github.event.pull_request.base.sha }} \ - ${{ github.sha }} \ - --format text --policy -``` - -The `--policy` flag evaluates policy rules on the full changed set and exits 1 on blocking failures. For a ready-made GitHub Action see [GitHub Action](github-action.md). - -### Refactoring Validation - -```bash -# Before refactoring -hotspots analyze . --mode snapshot --format json > before.json - -# Make refactoring changes... - -# After refactoring -hotspots analyze . --mode snapshot --format json > after.json - -# See the improvement -hotspots analyze . 
--mode delta --format json -``` - -Look for: -- Negative deltas (CC, ND, LRS decreased) -- Band transitions to lower risk (e.g., "high" → "moderate") -- Overall LRS reduction - ---- - -## History Management - -### Prune Unreachable Snapshots - -After force-pushes or branch deletions, clean up orphaned snapshots: - -```bash -# Dry-run: see what would be pruned -hotspots prune --unreachable --dry-run - -# Prune unreachable snapshots older than 30 days -hotspots prune --unreachable --older-than 30 - -# Prune all unreachable snapshots -hotspots prune --unreachable -``` - -**Safety:** Only prunes snapshots unreachable from `refs/heads/*` (local branches). Never prunes reachable snapshots. - -### Set Compaction Level - -```bash -# Set compaction level (currently only Level 0 is implemented) -hotspots compact --level 0 -``` - -**Note:** Levels 1-2 are metadata placeholders for future implementation. - ---- - -## Output Formats - -### Text Format (Default) - -``` -LRS File Line Function -11.2 src/api.ts 88 handleRequest -9.8 src/db/migrate.ts 41 runMigration -``` - -### JSON Format - -```bash -hotspots analyze src/ --format json -``` - -Outputs structured JSON with all metrics, risk components, LRS, and band. - ---- - -## Examples - -### Find Most Complex Functions - -```bash -hotspots analyze src/ --top 5 --format text -``` - -### Find Functions Needing Refactoring - -```bash -hotspots analyze src/ --min-lrs 9.0 --format json -``` - -### Track Complexity Over Time - -```bash -# On every commit (e.g., in pre-commit hook or CI) -hotspots analyze . --mode snapshot --format json -``` - -Then use deltas to see trends: -```bash -hotspots analyze . --mode delta --format json -``` - -### Compare Two Commits - -```bash -# Checkout first commit -git checkout -hotspots analyze . --mode snapshot --format json > commit1.json - -# Checkout second commit -git checkout -hotspots analyze . 
--mode snapshot --format json > commit2.json - -# Compare manually or use delta mode -hotspots analyze . --mode delta --format json -``` - ---- - -## Troubleshooting - -### "Path does not exist" - -Make sure you're pointing to a valid file or directory: -```bash -hotspots analyze ./src # Correct -hotspots analyze src # Also correct (relative path) -``` - -### "failed to extract git context" - -Snapshot/delta modes require a git repository: -```bash -# Make sure you're in a git repo -cd my-git-repo -hotspots analyze . --mode snapshot -``` - -### "snapshot already exists and differs" - -Snapshots are immutable by default. This error means a snapshot already exists for the -current commit but its content differs from the freshly-computed result. - -Repeated `analyze` runs on the same commit should be idempotent. If you see this error, -common causes are: -- A config change between runs that altered scores -- A tool version upgrade that changed metric computation - -To regenerate the snapshot intentionally: -```bash -hotspots analyze . --mode snapshot --force -``` - -### No output in delta mode - -If delta shows no changes: -- Check that parent snapshot exists (should be in `.hotspots/snapshots/`) -- Verify you're comparing against the correct parent (uses `parents[0]`) -- First commit will show `baseline: true` with all functions marked `new` - ---- - -## Development Mode - -For development, use the `dev` script: - -```bash -# Run without building -./dev analyze src/ - -# Equivalent to: cargo run -- analyze src/ -``` - ---- - -## Policy Engine - -The policy engine evaluates complexity regressions and enforces quality gates in CI/CD. - -### Running Policy Checks - -```bash -# Analyze with policy evaluation -hotspots analyze . --mode delta --policy --format json -``` - -Output includes a `policy` section with failures and warnings. - -### Built-in Policies - -**Blocking Policies (cause non-zero exit code):** - -1. 
**Critical Introduction** - Triggers when a function becomes Critical - - New functions introduced as Critical - - Existing functions crossing into Critical band - -2. **Excessive Risk Regression** - Triggers when LRS increases by ≥1.0 - - Modified functions only - - Threshold: +1.0 LRS (fixed) - -**Warning Policies (informational only):** - -3. **Watch Threshold** - Functions entering the "watch zone" - - Range: `watch_min` to `watch_max` (default: 4.0-6.0) - - Proactive alert before functions become high-risk - -4. **Attention Threshold** - Functions entering the "attention zone" - - Range: `attention_min` to `attention_max` (default: 6.0-9.0) - - Alerts for functions approaching critical complexity - -5. **Rapid Growth** - Functions with high percentage LRS increase - - Threshold: `rapid_growth_percent` (default: 50%) - - Detects sudden complexity spikes - -6. **Suppression Missing Reason** - Suppressions without documentation - - Warns when `// hotspots-ignore:` has no reason - - Encourages documenting why functions are suppressed - -7. 
**Net Repo Regression** - Overall repository complexity increase - - Sum of all function LRS scores increased - - Warning only (allows controlled growth) - -**Example policy output:** - -```json -{ - "policy": { - "failed": [ - { - "id": "critical-introduction", - "severity": "blocking", - "function_id": "src/api.ts::handleRequest", - "message": "Function src/api.ts::handleRequest introduced as Critical" - } - ], - "warnings": [ - { - "id": "watch-threshold", - "severity": "warning", - "function_id": "src/db.ts::query", - "message": "Function src/db.ts::query entered watch threshold range (LRS: 4.5)" - }, - { - "id": "net-repo-regression", - "severity": "warning", - "message": "Repository total LRS increased by 3.20", - "metadata": { - "total_delta": 3.20 - } - } - ] - } -} -``` - -### Configuring Warning Thresholds - -Customize warning ranges in your config file: - -```json -{ - "warnings": { - "watch": { - "min": 5.0, - "max": 7.0 - }, - "attention": { - "min": 7.0, - "max": 10.0 - }, - "rapid_growth_percent": 75.0 - } -} -``` - -To exclude specific functions from policy checks, use [Suppression Comments](/guide/suppression). - ---- - -## HTML Reports - -Generate interactive HTML reports for better visualization: - -```bash -hotspots analyze . --mode snapshot --format html -``` - -**HTML report features:** -- Interactive sorting by any column -- Filter by risk band and driver label -- Search by function name -- Color-coded risk bands and driver badges -- **Action column** in triage table: per-function refactoring recommendation (driver × quadrant) -- **Trend charts** (snapshot mode, requires ≥2 prior snapshots): - - Stacked bar chart: band-count distribution over time (up to 30 snapshots) - - Line charts: activity risk and top-1% concentration over time - - Hover tooltip on band chart with per-band counts -- Responsive design -- Self-contained (no external dependencies) - -**Delta mode HTML:** - -```bash -hotspots analyze . 
--mode delta --format html > delta-report.html -``` - -Shows function changes with: -- Status badges (new/modified/deleted/unchanged) -- Before/after metrics -- Band transitions -- Policy violations highlighted - -**Open in browser:** - -```bash -hotspots analyze . --mode snapshot --format html -open .hotspots/report.html # macOS -xdg-open .hotspots/report.html # Linux -start .hotspots/report.html # Windows -``` - ---- - -## See Also - -- [Metrics & LRS](../reference/metrics.md) - Local Risk Score details diff --git a/docs/index.md b/docs/index.md index 584dcc9..5f6cd3f 100644 --- a/docs/index.md +++ b/docs/index.md @@ -1,73 +1,23 @@ -# Hotspots - -**Find where your engineering attention has the highest expected value.** - -Most of the pain in a codebase comes from a small fraction of it — the same files that generate the most incidents, the slowest reviews, and the hardest bugs to track down. Hotspots finds that fraction before it costs you. - -```bash -hotspots analyze src/ - -LRS File Line Function -12.4 src/api/billing.ts 142 processPlanUpgrade critical - 9.8 src/auth/session.ts 67 validateSession high - 3.2 src/utils/format.ts 12 formatDate low -``` - -Each result is ranked by **Local Risk Score (LRS)** — a weighted combination of cyclomatic complexity, nesting depth, fan-out, and exit paths. High LRS means the function is structurally hard to reason about, test, and safely change. - ---- - -## Get Started - - - ---- - -## What it gives you - -**An objective refactor list.** Stop debating what to clean up. The highest-LRS functions in your codebase are the ones most likely to slow your next feature and hide your next bug. - -**CI enforcement.** Block new critical-complexity functions before they merge. Delta mode shows exactly which functions got worse in a PR — and by how much. - -**Progress you can show.** "We dropped from 31 critical functions to 18 this quarter" is a concrete metric. Hotspots makes that trackable without extra tooling. 
- -**Everything stays local.** Analysis runs on your machine. No source code leaves, no account required, no data sent anywhere. - --- - -## Supported Languages - -TypeScript · JavaScript · Go · Python · Rust · Java · C - -All languages produce the same metrics with consistent semantics. See [Language Support](/reference/language-support). - ---- - -## How scoring works - -Each function receives a **Local Risk Score (LRS)** derived from four structural metrics (CC, ND, FO, NS), then optionally enriched with git-based activity signals to produce an **Activity Risk Score**. Functions are grouped into quadrants (fire / debt / watch / ok) and assigned a **driver label** naming the dominant risk dimension. - -See [Scoring Methodology](/reference/scoring) for the full pipeline — transforms, weights, pattern detection, quadrant logic, and ranking order. - +layout: home + +hero: + name: Hotspots + text: Find the code that's actually causing problems. + tagline: Multi-language complexity analysis — single binary, zero config, results in seconds. + actions: + - theme: brand + text: Get Started + link: /quickstart + - theme: alt + text: CLI Reference + link: /REFERENCE + +features: + - title: LRS Scoring + details: Local Risk Score combines cyclomatic complexity, nesting depth, fan-out, and non-structured exits into one actionable number per function. + - title: Git-enriched triage + details: Snapshot mode adds churn, touch frequency, and call-graph centrality — placing every function in a fire/debt/watch/ok quadrant. + - title: Policy engine + details: Block PRs that introduce critical-risk functions or regress LRS. Works with GitHub Actions, GitLab CI, and any CI that checks exit codes. --- - -## Where this is going - -**Risk Hotspots** — structurally complex, frequently changed code — are shipped. 
Several more categories are in active research: - -- **Review Hotspots** — changes that need senior eyes, not rubber stamps -- **Test Hotspots** — coverage gaps where CI misses real failures -- **Ownership Hotspots** — knowledge silos and review bottlenecks -- **Impact Hotspots** — code with outsized blast radius (auth, billing, schema contracts) - -The goal is a continuous picture of where your engineering attention matters most — before something ships, not after it pages you. - ---- - -## For Contributors - -Start with the [Codebase Guide](/code-architecture/) for the implementation map, then the [Contributing Guide](/contributing/) for dev workflow. diff --git a/docs/integrations/ai-agents.md b/docs/integrations/ai-agents.md deleted file mode 100644 index b416928..0000000 --- a/docs/integrations/ai-agents.md +++ /dev/null @@ -1,245 +0,0 @@ -# AI Agent Examples - -Example AI agent implementations and workflows using Hotspots for automated code analysis. - -## Overview - -Hotspots produces structured JSON output, runs deterministically, and completes in seconds — making it a natural fit for automated code review and refactoring workflows. - -This guide covers practical patterns for agents using Hotspots: CI enforcement, pre-commit hooks, and LLM-assisted refactoring. - ---- - -## Quick Start - -### Claude Code (Recommended) - -Claude Code can invoke Hotspots CLI commands directly in your project: - -```bash -# Analyze current changes -hotspots analyze . --mode delta --format json - -# Full snapshot with agent-optimized output -hotspots analyze . --mode snapshot --all-functions --format json -``` - -Ask Claude Code: *"Run hotspots analyze and show me which functions to refactor first."* - -### Running Agents via CLI - -For standalone scripts and CI agents, use the CLI directly: - -```bash -# Get structured JSON for any agent to parse -hotspots analyze src/ --format json > analysis.json - -# Delta analysis for PR review agents -hotspots analyze . 
--mode delta --policy --format json > delta.json -``` - ---- - -## Common patterns - -### Get critical functions from any script - -```python -import json, subprocess - -result = subprocess.run( - ['hotspots', 'analyze', 'src/', '--format', 'json'], - capture_output=True, text=True -) -data = json.loads(result.stdout) -critical = [f for f in data['functions'] if f['band'] == 'critical'] -``` - -### Pass hotspot context to an LLM - -```python -import anthropic, json, subprocess - -result = subprocess.run( - ['hotspots', 'analyze', 'src/', '--min-lrs', '9.0', '--format', 'json'], - capture_output=True, text=True -) -hotspots = json.loads(result.stdout)['functions'] - -client = anthropic.Anthropic() -response = client.messages.create( - model="claude-sonnet-4-6", - max_tokens=4096, - messages=[{ - "role": "user", - "content": f"These are my critical functions. Suggest specific refactoring strategies:\n\n{json.dumps(hotspots, indent=2)}" - }] -) -print(response.content[0].text) -``` - -### PR delta check - -```python -import json, subprocess, sys - -result = subprocess.run( - ['hotspots', 'analyze', '.', '--mode', 'delta', '--policy', '--format', 'json'], - capture_output=True, text=True -) -delta = json.loads(result.stdout) -violations = delta.get('policy_results', {}).get('failed', []) - -if violations: - for v in violations: - print(f"FAIL: {v['message']}") - sys.exit(1) -``` - ---- - -## Common Workflows - -### Pre-Commit Hook - -Analyze changes before commit: - -```bash -#!/bin/bash -# .git/hooks/pre-commit - -# Analyze staged changes -hotspots analyze . --mode delta --policy --format json > /tmp/hotspots-delta.json - -# Check for violations -violations=$(jq '.policy_results.failed | length' /tmp/hotspots-delta.json) - -if [ "$violations" -gt 0 ]; then - echo "❌ Complexity violations detected!" - jq -r '.policy_results.failed[] | " - \(.message)"' /tmp/hotspots-delta.json - echo "" - echo "Run 'hotspots analyze . 
--mode delta --policy' for details" - exit 1 -fi - -echo "✅ No complexity violations" -``` - -### Continuous Monitoring - -Track complexity in CI: - -```yaml -# .github/workflows/complexity.yml -name: Complexity Tracking - -on: [push, pull_request] - -jobs: - track: - runs-on: ubuntu-latest - steps: - - uses: actions/checkout@v4 - - - uses: Stephen-Collins-tech/hotspots-action@v1 - id: hotspots - - - name: Upload Results - uses: actions/upload-artifact@v4 - with: - name: complexity-report-${{ github.sha }} - path: ${{ steps.hotspots.outputs.json-output }} -``` - -### AI Code Review - -Integrate with AI code review service: - -```python -# ai_code_review.py -def review_with_ai(delta_json): - """Send delta to AI for review""" - violations = delta_json['policy_results']['failed'] - - if not violations: - return "✅ No issues found" - - prompt = f""" -Review these complexity violations and provide guidance: - -{json.dumps(violations, indent=2)} - -For each violation: -1. Explain why it's problematic -2. Suggest specific refactoring -3. Estimate effort (S/M/L) -""" - - # Call AI service (Claude, GPT, etc.) - return ai_service.complete(prompt) -``` - ---- - -## Best Practices - -### 1. Set Appropriate Thresholds - -Don't start too strict - iterate: - -```json -{ - "thresholds": { - "moderate": 5.0, // Start conservative - "high": 8.0, - "critical": 10.0 - } -} -``` - -### 2. Focus on High-Risk Functions - -Prioritize refactoring high-LRS functions: - -```python -# Focus on top 10 highest-risk functions -hotspots = sorted(functions, key=lambda f: f['lrs'], reverse=True)[:10] -``` - -### 3. Measure Improvement - -Track before/after metrics: - -```python -before = analyze_function(source_before) -after = analyze_function(source_after) - -improvement = before['lrs'] - after['lrs'] -print(f"LRS improved by {improvement:.1f} ({improvement/before['lrs']*100:.0f}%)") -``` - -### 4. Automate Workflows - -Use GitHub Actions, GitLab CI, or pre-commit hooks. 
- ---- - -## Related Documentation - -- [CI/CD Guide](../guide/ci-cd.md) - Automate in pipelines -- [Output Formats](../guide/output-formats.md) - JSON schema for parsing -- [CLI Reference](../reference/cli.md) - Command-line usage - ---- - -## Example Repository - -See [examples/ai-agents/](https://github.com/Stephen-Collins-tech/hotspots/tree/main/examples/ai-agents) for: -- Complete agent implementations -- GitHub Action workflows -- Pre-commit hook examples -- AI integration scripts - ---- - -**Build your own AI agents with Hotspots!** 🤖 diff --git a/docs/integrations/ai-integration.md b/docs/integrations/ai-integration.md deleted file mode 100644 index 7a46fbf..0000000 --- a/docs/integrations/ai-integration.md +++ /dev/null @@ -1,255 +0,0 @@ -# AI Integration - -Hotspots is designed with AI-assisted development in mind. Its structured JSON output, deterministic analysis, and fast execution make it a natural fit for AI code review and refactoring workflows. - -## Claude Code / CLI Workflows - -Claude Code can invoke Hotspots CLI commands directly in your project — no setup required. - -### Analyze current changes - -```bash -# Ask Claude Code to run this and explain results -hotspots analyze . --mode delta --format json - -# Or with agent-optimized output (quadrant buckets + action text) -hotspots analyze . --mode snapshot --all-functions --format json -``` - -**Example prompts for Claude Code:** -- *"Run hotspots analyze and show me which functions to refactor first."* -- *"Check if my recent changes increased complexity."* -- *"Find the most complex functions in src/ and suggest refactorings."* - -Claude Code will execute the command, parse the JSON, and provide actionable insights. - ---- - -## Common AI Workflows - -### 1. Pre-Commit Code Review - -Catch complexity regressions before code is committed. - -```bash -# Analyze changes vs parent commit -hotspots analyze . 
--mode delta --policy --format json > delta.json - -# Review delta.json with an AI assistant, then fix and re-analyze -``` - -**AI prompt template:** -``` -Review this complexity delta and identify concerns: - -[paste delta.json content] - -Focus on: -- New critical functions (LRS ≥ 9.0) -- Functions where LRS increased > 1.0 -- Band transitions to higher risk -- Policy violations - -Suggest specific refactorings for each issue. -``` - -### 2. Refactoring Loop - -Iteratively reduce complexity with AI assistance. - -```bash -# 1. Identify high-complexity targets -hotspots analyze . --min-lrs 9.0 --format json > targets.json - -# 2. AI analyzes and suggests refactorings -# 3. Apply refactorings -# 4. Measure improvement -hotspots analyze . --mode delta --format json > improvement.json - -# 5. Repeat until complexity is acceptable -``` - -**Measuring success:** Look for `delta.lrs < 0`, lower `band_transition.to`, and decreased `metrics.cc` / `metrics.nd`. - -### 3. Complexity-Aware Code Generation - -Generate code that meets complexity constraints from the start. - -``` -Generate a TypeScript function that [description]. - -Constraints: -- LRS must be < 6.0 (moderate complexity or lower) -- Prefer multiple small functions over one large function -- Avoid deep nesting (ND ≤ 2) - -After generating, I'll run: -hotspots analyze src/new-feature.ts --format json - -And share the results. Iterate if LRS > 6.0. -``` - -**Iterative example:** -``` -Iteration 1: Monolithic function → LRS 8.5 (high) -Iteration 2: Split into 3 functions → LRS 4.2, 3.8, 2.9 ✅ -``` - -### 4. Automated PR Review (GitHub Actions) - -```yaml -name: AI Complexity Review - -on: pull_request - -jobs: - review: - runs-on: ubuntu-latest - steps: - - uses: actions/checkout@v4 - with: - fetch-depth: 0 - - - name: Analyze complexity - run: hotspots analyze . 
--mode delta --policy --format json > delta.json - - - name: AI Review - run: | - # Send delta.json to AI API and post comment to PR - # See examples/ai-agents/ for reference implementation -``` - ---- - -## Agent Examples - -### Refactoring Assistant (Python) - -```python -import json -import subprocess - -class RefactoringAssistant: - def __init__(self, threshold=8.0): - self.threshold = threshold - - def analyze(self, path): - result = subprocess.run( - ['hotspots', 'analyze', path, '--format', 'json'], - capture_output=True, text=True - ) - data = json.loads(result.stdout) - return [fn for fn in data['functions'] if fn['lrs'] >= self.threshold] - - def suggest_refactorings(self, function): - suggestions = [] - if function['metrics']['cc'] > 10: - suggestions.append('High CC — extract sub-functions to reduce branching') - if function['metrics']['nd'] > 4: - suggestions.append('Deep nesting — use early returns or guard clauses') - if function['metrics']['fo'] > 8: - suggestions.append('High fan-out — consider a facade or coordinator pattern') - return suggestions - -assistant = RefactoringAssistant(threshold=8.0) -targets = assistant.analyze('src/') - -for fn in targets: - print(f"\n{fn['function_id']} (LRS: {fn['lrs']:.1f}, band: {fn['band']})") - for s in assistant.suggest_refactorings(fn): - print(f" - {s}") -``` - -### AI-Guided Refactoring (TypeScript / Claude API) - -```typescript -import Anthropic from '@anthropic-ai/sdk'; -import { execa } from 'execa'; - -const client = new Anthropic(); - -async function suggestRefactoring(filePath: string, functionName: string) { - const { stdout } = await execa('hotspots', ['analyze', filePath, '--format', 'json']); - const output = JSON.parse(stdout); - const fn = output.functions.find((f: any) => f.function_id.endsWith(functionName)); - - if (!fn) return 'Function not found'; - - const response = await client.messages.create({ - model: 'claude-sonnet-4-6', - max_tokens: 2048, - messages: [{ - role: 'user', - content: 
`Refactor this function to reduce complexity: - -Function: ${fn.function_id} -LRS: ${fn.lrs} (${fn.band}) -CC: ${fn.metrics.cc}, ND: ${fn.metrics.nd}, FO: ${fn.metrics.fo}, NS: ${fn.metrics.ns} - -Target: LRS < 6.0. Provide specific refactoring with before/after code.` - }] - }); - - return response.content[0].type === 'text' ? response.content[0].text : ''; -} -``` - -### Pre-Commit Hook - -```bash -#!/bin/bash -# .git/hooks/pre-commit - -hotspots analyze . --mode delta --policy --format json > /tmp/delta.json - -violations=$(jq '.policy_results.failed | length' /tmp/delta.json) - -if [ "$violations" -gt 0 ]; then - echo "Complexity violations detected:" - jq -r '.policy_results.failed[] | " - \(.message)"' /tmp/delta.json - exit 1 -fi - -echo "Complexity check passed" -``` - ---- - -## JSON Output for AI Consumption - -Hotspots produces structured JSON suitable for AI consumption. Key fields: - -```json -{ - "function_id": "src/api.ts::handleRequest", - "file": "src/api.ts", - "line": 88, - "metrics": { - "cc": 15, - "nd": 4, - "fo": 8, - "ns": 3 - }, - "lrs": 11.2, - "band": "critical", - "driver": "high_complexity", - "quadrant": "fire" -} -``` - -**Tips for AI workflows:** -- Use `--mode delta` to reduce payload size (only changed functions) -- Use `--min-lrs 6.0` to focus on high/critical functions -- Use `--all-functions` for the agent-optimized v3 schema with quadrant buckets (`fire`/`debt`/`watch`/`ok`) and per-function `action` text - -See [Output Formats](../guide/output-formats) for complete JSON schema documentation. - ---- - -## Best Practices - -1. **Use deterministic analysis** — Hotspots produces byte-for-byte identical output for identical input. Cache results by file hash. -2. **Prefer delta mode** — Faster and lower AI token usage; only changed functions sent. -3. **Batch analysis** — Analyze entire directory at once rather than file-by-file. -4. **Close the feedback loop** — Re-analyze after refactoring to verify improvement. -5. 
**Include metric context in prompts** — Specify which metrics (CC, ND, FO, NS) are high for targeted suggestions. diff --git a/docs/patterns.md b/docs/patterns.md deleted file mode 100644 index dd2e3b0..0000000 --- a/docs/patterns.md +++ /dev/null @@ -1,397 +0,0 @@ -# Code Pattern Reference - -This document is the single source of truth for the code patterns that Hotspots detects. Each entry defines what the pattern means, which signals trigger it, what the industry calls it, and why it matters for maintenance risk. - -Patterns are **informational labels** — they do not affect the LRS score or risk band. A function can carry multiple patterns simultaneously. - ---- - -## How patterns work - -Patterns are derived from the same signals Hotspots already measures: cyclomatic complexity (CC), nesting depth (ND), fan-out (FO), non-structured exits (NS), lines of code (LOC), and — in snapshot mode — call graph data (fan-in, SCC membership, dependency depth) and git data (churn, touch frequency, recency). - -Two tiers exist based on what data is available: - -| Tier | Available in | Requires | -|---|---|---| -| **Tier 1 — Structural** | All modes | CC, ND, FO, NS, LOC only | -| **Tier 2 — Enriched** | Snapshot mode only | Call graph + git history | - -Tier 2 patterns are a superset: a function can carry both structural and enriched patterns at once. - -Some Tier 2 patterns are **derived** — they are pure conjunctions of other named patterns and reuse those patterns' thresholds directly rather than defining independent thresholds. This is noted in each entry. All other patterns are **primitive**. - ---- - -## Metric definitions - -All thresholds reference the following metrics. Definitions are locked here to ensure consistent implementation and testing. - -**`LOC`** — physical lines of code for the function body, excluding blank lines and comment-only lines. 
- -**`CC`** — cyclomatic complexity (McCabe 1976): number of linearly independent paths, equivalent to the number of binary decision points + 1. - -**`ND`** — maximum nesting depth within the function body: the deepest level of nested control structures (if/else, loops, try/catch, match arms, closures). - -**`FO`** — fan-out: number of distinct functions called directly within the function body, excluding calls to the standard library or external dependencies unless they cross a module boundary. - -**`NS`** — non-structured exits: count of early returns, thrown exceptions, and multi-level breaks or continues within the function body. - -**`fan_in`** — number of distinct functions in the repo that contain a direct call to this function. Computed from the snapshot call graph. External callers (outside the repo) are not counted. - -**`scc_size`** — size of the strongly connected component containing this function in the call graph. A value of 1 means the function is not in a cycle. Requires snapshot mode. - -**`churn_lines`** — cumulative sum of lines added plus lines deleted touching this function across all commits in the git history, or within a configurable window (default: lifetime). Renames are followed. Merge commits are excluded. Whitespace-only changes are excluded by default. - -**`touch_frequency`** — number of distinct commits touching this function. Default window: lifetime. A configurable rolling window (e.g. last 90 days) is a planned extension. - -**`days_since_last_change`** — calendar days between today and the most recent commit that modified at least one non-whitespace line in this function body. - -**`neighbor_churn`** — sum of `churn_lines` for all direct callees (outgoing call graph edges, 1-hop only). Excludes stdlib and external dependencies. If a callee has no git history entry (e.g. it was added and never changed), it contributes 0. - ---- - -## Default thresholds and configuration - -All thresholds listed in this document are **defaults**. 
They are deliberately conservative: it is better to surface a genuine pattern and have a developer dismiss it than to miss a real structural problem. - -Thresholds can be tuned per project via the `patterns` section of `.hotspotsrc.json` (planned — not yet implemented). When implemented, each threshold will support two forms: - -- **Absolute**: `LOC >= 80` — fixed value, language- and repo-size-agnostic -- **Percentile**: `LOC >= p95` — computed within a configurable scope (repo, language, module) — planned extension - -The absolute form is used for all defaults today. Percentile-based thresholds are a planned extension and will not require a schema change when added. - ---- - -## Exclusions and guards - -The following exclusions apply by default to reduce noise. They can be overridden in `.hotspotsrc.json` (planned): - -- **Generated files** — files matching common generated-code patterns (e.g. `*.pb.go`, `*_generated.*`, files containing a `// Code generated` header) are excluded from pattern detection. -- **Vendored directories** — `vendor/`, `node_modules/`, and similar third-party trees are excluded. -- **Test files** — files matching language-standard test patterns (e.g. `*_test.go`, `*.spec.ts`, `test_*.py`) are excluded. This avoids false positives on test helpers and fixtures. -- **Path globs** — arbitrary glob patterns can be added to `.hotspotsrc.json` to exclude generated, legacy, or scaffolding code. - -Note: `middle_man` and `neighbor_risk` are particularly sensitive to entrypoint functions (web handlers, CLI commands, dispatchers) that are expected to have high fan-in and fan-out by design. Until explicit entrypoint exclusion is implemented, these patterns may fire on legitimate dispatch code and should be reviewed with that context in mind. 
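The default exclusions listed in this section can be approximated with a few glob checks. The sketch below is a rough Python illustration, not the Hotspots implementation: `is_excluded` is a hypothetical helper, the glob lists contain only the examples named above, and the generated-header check looks at a single first line as a simplification.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Only the example patterns named in this section; real defaults may be broader.
GENERATED_GLOBS = ["*.pb.go", "*_generated.*"]
TEST_GLOBS = ["*_test.go", "*.spec.ts", "test_*.py"]
VENDORED_DIRS = {"vendor", "node_modules"}


def is_excluded(path: str, first_line: str = "") -> bool:
    """Return True if the file would be skipped by the sketched default rules."""
    p = PurePosixPath(path)
    # Any parent directory that is a vendored tree excludes the file.
    if any(part in VENDORED_DIRS for part in p.parts[:-1]):
        return True
    # Generated-code and test-file globs are matched against the file name.
    if any(fnmatchcase(p.name, g) for g in GENERATED_GLOBS + TEST_GLOBS):
        return True
    # Simplification: only the first line is checked for the generated header.
    return first_line.startswith("// Code generated")


print(is_excluded("api/service.pb.go"))
print(is_excluded("src/api.ts"))
```

Project-specific path globs would extend the same lists; matching on the file name alone keeps the example short, whereas real path globs would normally be matched against the full relative path.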
- ---- - -## Output contract - -Patterns appear in all output modes under a stable schema: - -``` -patterns: string[] -``` - -Detailed trigger information is available with `--explain-patterns`: - -```json -"pattern_details": [ - { - "id": "complex_branching", - "tier": 1, - "kind": "primitive", - "triggered_by": [ - { "metric": "cc", "op": ">=", "value": 12, "threshold": 10 }, - { "metric": "nd", "op": ">=", "value": 5, "threshold": 4 } - ] - } -] -``` - -`pattern_details` is omitted from JSON output when `--explain-patterns` is not set (forward-compatible field). - -In tabular text output, patterns appear as a comma-separated list in a `PATTERNS` column. With `--explain-patterns`, each function row is followed by indented detail lines showing the exact metric values that triggered each pattern: - -``` -14.10 critical src/imports.rs 622 resolve_cargo_workspace_edges complex_branching, deeply_nested - complex_branching: cc=15 (>=10), nd=5 (>=4) - deeply_nested: nd=5 (>=5) -``` - -In the HTML report, patterns appear as colored pill badges inline in the function table. A pattern breakdown panel above the table shows pattern frequencies across the entire repository, sorted by count. See [HTML Output Format](./guide/output-formats.md#html-format) for details. - -Ordering within the patterns list is deterministic: Tier 1 patterns before Tier 2, then alphabetical within each tier. - ---- - -## Tier 1 — Structural Patterns - -These fire from static metrics alone and are available in every output mode. - ---- - -### `complex_branching` - -**Kind:** Primitive -**Industry names:** Complex Method, Arrow Anti-Pattern, Spaghetti Logic -**Source:** McCabe cyclomatic complexity (1976); Fowler *Refactoring* (long conditional chains) - -**Signals:** High CC **and** high ND - -**Definition:** -High cyclomatic complexity (many independent paths) compounded by deep nesting (those paths are hard to visually parse). 
Either dimension alone is tolerable; together they produce functions that are genuinely difficult to reason about because the branching structure must be held in working memory while navigating multiple nesting levels. - -**Why it matters:** -CC directly predicts defect density and test case count. Deep nesting amplifies this by making the logic structure opaque. Functions with both properties consistently show higher bug rates in empirical studies. - -**Default thresholds:** CC ≥ 10 **and** ND ≥ 4 - ---- - -### `deeply_nested` - -**Kind:** Primitive -**Industry names:** Arrow Anti-Pattern, Pyramid of Doom -**Source:** Common style guidance across most languages; prominent in JavaScript community as "callback hell" - -**Signals:** High ND, regardless of CC - -**Definition:** -Excessive nesting depth even in the absence of high branch count. This typically appears as guard clauses that were never inverted, nested callbacks, or deeply nested loops. Readability degrades non-linearly with depth: at ND ≥ 5, the outermost context is genuinely difficult to hold in mind while reading inner code. - -**Why it matters:** -Beyond readability, deep nesting is a sign that early-exit refactoring opportunities have been missed. It correlates with higher defect rates independently of CC because the number of implicit assumptions grows with depth. - -**Default thresholds:** ND ≥ 5 - ---- - -### `exit_heavy` - -**Kind:** Primitive -**Industry names:** Multiple Return Points (contested — guard-clause style rehabilitates small counts) -**Source:** Structured programming tradition - -**Signals:** High NS - -**Definition:** -A function with many non-structured exit points: early returns, thrown exceptions, and break or continue statements that exit multiple levels. A small number of guard clauses at the top of a function is good style and will not trigger this pattern. 
The concern is exits *scattered throughout* the body, which make it hard to reason about the complete set of outcomes and to instrument uniformly (e.g. adding logging on all exit paths). - -What NS counts is language-specific. See the NS metric definition above. If your language's NS metric has additional detail (e.g. whether `break` inside a `match` arm counts), that definition takes precedence here. - -**Why it matters:** -Each exit point is a place where a caller's assumption about post-conditions can be violated. Many scattered exits also increase the number of paths that must be independently tested, compounding the effect of high CC. - -**Default thresholds:** NS ≥ 5 - ---- - -### `god_function` - -**Kind:** Primitive -**Industry names:** God Method, Brain Method, Long Method -**Source:** Fowler *Refactoring* (Long Method); Tornhill *Software Design X-Rays* (Brain Method) - -**Signals:** High LOC **and** high FO - -**Definition:** -A function that does too much *and* orchestrates too many other concerns. The size (LOC) indicates a Single Responsibility violation; the fan-out (FO) shows it is reaching into many other modules to do so. The combination is worse than either alone: a large function that is also deeply entangled is expensive to read, test, and change. This pattern often co-occurs with Feature Envy (Fowler), where a function accesses another module's data more than its own, but the two are not equivalent — `god_function` is defined by size and breadth of coupling, not by data access patterns. - -**Why it matters:** -God functions are the most common root cause of defect clusters. They are hard to unit test (too many paths, too many dependencies), hard to review (too much context), and attract further additions because "it already does everything." 
- -**Default thresholds:** LOC ≥ 60 **and** FO ≥ 10 - ---- - -### `long_function` - -**Kind:** Primitive -**Industry names:** Long Method -**Source:** Fowler *Refactoring* — one of the original code smells - -**Signals:** High LOC alone - -**Definition:** -A physically large function, irrespective of branching or coupling. Length alone is a proxy for Single Responsibility violations: functions that are hard to name concisely are usually doing more than one thing. LOC also directly predicts review time and the likelihood that a reviewer misses a subtle bug. - -Note: `long_function` will frequently co-occur with `god_function`. `god_function` is the stronger signal when both fire, but `long_function` alone is still meaningful — a large function with low fan-out may be a sequential pipeline that is easy to follow but should still be decomposed. - -**Why it matters:** -Short functions are easier to name, test, review, and reuse. Function length is one of the strongest simple predictors of bug probability per function in empirical research. - -**Default thresholds:** LOC ≥ 80 - ---- - -## Tier 2 — Enriched Patterns - -These require snapshot mode (`--mode snapshot`) because they combine structural metrics with call graph topology and git history. - ---- - -### `churn_magnet` - -**Kind:** Primitive -**Industry names:** Hotspot (Tornhill), Change-Prone Complex Method -**Source:** Tornhill *Your Code as a Crime Scene* — the core thesis: complexity × churn = maintenance risk - -**Signals:** High churn_lines **and** high CC - -**Definition:** -The canonical hotspot: a function that is both structurally complex and frequently changed. Complexity means each change is expensive and error-prone; churn means those expensive changes happen often. This is the primary signal for maintenance debt accumulation. 
- -**Why it matters:** -Tornhill's empirical research across many large codebases shows that the intersection of high complexity and high churn predicts defect density far better than either dimension alone. A complex function that never changes is legacy, but manageable. A complex function that changes constantly is actively generating defects. - -**Default thresholds:** churn_lines ≥ 200 **and** CC ≥ 8 - ---- - -### `cyclic_hub` - -**Kind:** Primitive -**Industry names:** Inappropriate Intimacy, Cyclic Dependency, Circular Coupling -**Source:** Fowler *Refactoring* (Inappropriate Intimacy); Martin — Acyclic Dependencies Principle - -**Signals:** scc_size > 1 **and** high fan_in - -**Definition:** -A function caught in a cyclic dependency (its SCC contains more than one node) that is also heavily depended upon from outside the cycle. This is the hardest structural problem to refactor: breaking the cycle requires decoupling functions that mutually depend on each other, and the high fan-in means many external callers are also affected by any interface change. - -**Why it matters:** -Cyclic dependencies prevent independent deployment, make testing harder (circular mocking), and are the primary obstacle to modularisation. When a cyclic function is also a hub, breaking it requires coordinated changes across multiple files. Detecting it early gives teams a chance to break cycles before the fan-in grows further. - -**Default thresholds:** scc_size ≥ 2 **and** fan_in ≥ 6 - ---- - -### `hub_function` - -**Kind:** Primitive -**Industry names:** Knowledge Concentration, Central Module -**Source:** Tornhill *Your Code as a Crime Scene* (knowledge concentration); object-oriented coupling literature (hub coupling) - -**Signals:** High fan_in **and** high CC - -**Definition:** -A function that many callers depend on and that is itself structurally complex. Fan-in measures how much of the codebase relies on this function; CC measures how hard it is to understand and change. 
The combination creates a structural bottleneck: any bug here has wide blast radius, and changing it requires understanding complex logic under pressure from many dependent callers. - -**Why it matters:** -Hub functions with high complexity are the highest-leverage refactoring targets in a codebase. Simplifying them reduces risk for all their callers simultaneously. They are also the most likely locus of defensive code that accumulates over time as callers make conflicting demands. - -**Default thresholds:** fan_in ≥ 10 **and** CC ≥ 8 - ---- - -### `middle_man` - -**Kind:** Primitive -**Industry names:** Middle Man, Dispatcher Anti-Pattern, God Router -**Source:** Fowler *Refactoring* (Middle Man) - -**Signals:** High fan_in **and** high FO **and** low CC - -**Definition:** -A function that many things call, which itself calls many other things, but contains little logic of its own. It is a routing layer that has grown beyond its original purpose. Low CC distinguishes this from `hub_function`: a middle man is structurally simple inside but topologically central. It may be acceptable as a legitimate dispatcher, but at scale it represents unnecessary coupling — all callers are coupled to a routing function they do not actually need. - -See the note in [Exclusions and guards](#exclusions-and-guards) about entrypoint functions, which can trigger this pattern legitimately. - -**Why it matters:** -Middle men add indirection without abstraction. They make the call graph harder to navigate, increase coupling, and become a maintenance problem when routing logic starts to grow. Large middle men are a sign that the dependency inversion principle has been bypassed. - -**Default thresholds:** fan_in ≥ 8 **and** FO ≥ 8 **and** CC ≤ 4 - ---- - -### `neighbor_risk` - -**Kind:** Primitive -**Industry names:** Instability by Proximity, Dependency Contamination -**Source:** New pattern — not in standard literature. 
Justified by Hotspots' unique ability to measure churn in the call graph neighbourhood, a signal no traditional static analysis tool provides. - -**Signals:** High neighbor_churn **and** high FO - -**Definition:** -A function whose own code is stable, but whose dependencies are churning heavily. Even if this function never changes, it is at risk from the instability of the code it calls. A function with many callees, most of which are actively being changed, is one dependency update away from needing to adapt. - -See the note in [Exclusions and guards](#exclusions-and-guards) about entrypoint-style functions that may have high FO by design. - -**Why it matters:** -Standard hotspot analysis focuses on a function's own churn. Hotspots' call graph data allows detection of *indirect* instability — functions that are stable islands in unstable neighbourhoods. These are the functions most likely to appear stable in reviews but break unexpectedly after a sprint of changes in adjacent code. Flagging them allows teams to add integration tests before the instability propagates. - -**Default thresholds:** neighbor_churn ≥ 400 **and** FO ≥ 8 - ---- - -### `shotgun_target` - -**Kind:** Primitive -**Industry names:** Shotgun Surgery target, High-Impact Churn, Ripple Effect Source -**Source:** Fowler *Refactoring* — Shotgun Surgery describes code where a single change requires many scattered edits; the *target* is the function receiving those changes under pressure from many callers - -**Signals:** High fan_in **and** high churn_lines - -**Definition:** -A function that is both heavily depended upon and frequently changed. Each change here has the potential to break any of its callers, and the frequency of change means risk materialises regularly. 
This is distinct from `hub_function` (which focuses on complexity as the amplifier) and `churn_magnet` (which focuses on CC × churn): here the risk is breadth of impact × change frequency, regardless of the function's internal complexity. - -**Why it matters:** -High fan-in with high churn is the signature of a function that has been modified without its interface being stabilised. It suggests the abstraction boundary is wrong — callers are depending on implementation details that keep shifting. Stabilising the interface (or breaking the function apart) is the primary remediation. - -**Default thresholds:** fan_in ≥ 8 **and** churn_lines ≥ 150 - ---- - -### `stale_complex` - -**Kind:** Primitive -**Industry names:** Untouchable Code, Fear Code, Frozen Complexity -**Source:** Tornhill *Software Design X-Rays* — the "fear" metric: complex code that nobody dares change - -**Signals:** High CC **and** high LOC **and** very low touch_frequency (days_since_last_change above threshold) - -**Definition:** -A complex, large function that has not been changed in an extended period — not because it is stable and well-understood, but because it is complex enough to be feared. This is the *inverse* of `churn_magnet`: rather than accumulating churn, it accumulates entropy through avoidance. Teams work around it, adding adapter layers rather than modifying the core logic. - -"Not changed" means no commit has modified at least one non-whitespace line in the function body within the threshold window. Renames are followed. Formatting-only commits (whitespace-only diffs) do not reset the clock. - -**Why it matters:** -Stale complex functions are a hidden liability. They appear stable in metrics but represent a significant risk when they *do* need to change — under urgency, with no recent institutional knowledge. Identifying them proactively allows teams to schedule incremental familiarisation and decomposition before a crisis forces a rushed change. 
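The "not changed" rule for `stale_complex` can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `Touch` record, the function names, and the choice to treat a function with no qualifying history as not stale are all hypothetical, and rename following is assumed to have happened before the touches are collected. The default values are the documented ones for this pattern (CC ≥ 10, LOC ≥ 60, 180 days).

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Touch:
    """One commit that touched the function (hypothetical record)."""
    day: date
    whitespace_only: bool


def days_since_last_change(touches, today):
    """Days since the last commit that changed a non-whitespace line.

    Whitespace-only commits are ignored, so they do not reset the clock.
    Returns None when there is no qualifying commit at all.
    """
    real_days = [t.day for t in touches if not t.whitespace_only]
    if not real_days:
        return None
    return (today - max(real_days)).days


def is_stale_complex(cc, loc, touches, today,
                     min_cc=10, min_loc=60, min_days=180):
    """Apply the documented default thresholds for `stale_complex`."""
    age = days_since_last_change(touches, today)
    # Assumption: no qualifying history means the pattern does not fire.
    if age is None:
        return False
    return cc >= min_cc and loc >= min_loc and age >= min_days


today = date(2024, 7, 1)
touches = [
    Touch(date(2023, 12, 1), whitespace_only=False),  # last real change
    Touch(date(2024, 6, 15), whitespace_only=True),   # formatting only
]
```

In this example the formatting commit from June is skipped, so the age is measured from the December change and a function with CC 12 and 90 lines is still classified as stale.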
- -**Default thresholds:** CC ≥ 10 **and** LOC ≥ 60 **and** days_since_last_change ≥ 180 - ---- - -### `volatile_god` - -**Kind:** Derived from `god_function` && `churn_magnet` -**Industry names:** Volatile God Method, Churning Monolith -**Source:** Composite of Fowler (God Method) + Tornhill (hotspot). Not separately named in the literature but represents the intersection of both seminal concerns. - -**Signals:** All conditions for `god_function` **and** all conditions for `churn_magnet` fire - -**Definition:** -A god function that is also frequently changed: the worst-case combination of structural debt and maintenance pressure. The size and coupling of a god function make every change expensive; the frequency of change means those costs are incurred repeatedly. Because this pattern is derived, its thresholds are exactly those of the two primitives it combines — there are no independent thresholds to keep in sync. - -**Why it matters:** -A god function that is rarely touched is legacy debt — expensive to pay down but not actively causing new problems. A god function that changes regularly is *active* debt: it is generating defects, slowing every sprint that touches it, and training developers to avoid it. `volatile_god` is the highest-priority pattern for targeted refactoring investment. - -**Derived thresholds (inherited):** LOC ≥ 60 **and** FO ≥ 10 **and** churn_lines ≥ 200 **and** CC ≥ 8 - ---- - -## Pattern combinations and escalation - -Some patterns frequently co-occur and together signal a higher-priority concern than either alone. `volatile_god` is already a named derived pattern covering the most common escalation; the combinations below are observed co-occurrences that are not yet named patterns but are worth noting in triage.
- -| Combination | Escalated meaning | -|---|---| -| `hub_function` + `cyclic_hub` | Architecture-level coupling problem; cannot be safely refactored at the function level alone — cycle must be broken first | -| `complex_branching` + `exit_heavy` | Control flow is non-linear in two compounding ways; test coverage is likely incomplete — prioritise coverage before any refactor | -| `middle_man` + `churn_magnet` | A routing layer that keeps absorbing logic — the routing function is growing into the thing it was meant to dispatch away from | -| `stale_complex` + `hub_function` | Feared bottleneck — high blast radius if it needs to change, and it hasn't been touched in a long time; schedule familiarisation proactively | -| `shotgun_target` + `churn_magnet` | Interface instability compounded by internal complexity — the function is changing a lot and breaking callers; interface stabilisation is urgent | - ---- - -## Sources and further reading - -- Martin Fowler — *Refactoring: Improving the Design of Existing Code* (1999, 2nd ed. 2018) -- Adam Tornhill — *Your Code as a Crime Scene* (2015) -- Adam Tornhill — *Software Design X-Rays* (2018) -- Thomas J. McCabe — "A Complexity Measure" (IEEE Transactions on Software Engineering, 1976) -- Robert C. Martin — *Clean Code* (2008) and the Acyclic Dependencies Principle -- Empirical studies on CC and defect density: Gill & Kemerer (1991), Basili et al. (1996) diff --git a/docs/quickstart.md b/docs/quickstart.md new file mode 100644 index 0000000..d5fd56b --- /dev/null +++ b/docs/quickstart.md @@ -0,0 +1,62 @@ +# Quick Start + +Hotspots finds the **functions that are both complex and frequently changed** — the 20% of code causing 80% of bugs and slowdowns. 
+ +## Install + +```bash +brew install Stephen-Collins-tech/tap/hotspots # macOS +npm install -g @stephencollinstech/hotspots # any platform +pip install hotspots-cli # any platform +cargo install hotspots-cli # Rust toolchain +``` + +## Run it + +```bash +hotspots analyze src/ +``` + +Output: +``` +Critical (LRS ≥ 9.0): + processPlanUpgrade src/api/billing.ts:142 LRS 12.4 CC 15 ND 4 FO 8 NS 3 + +High (6.0 ≤ LRS < 9.0): + validateSession src/auth/session.ts:67 LRS 9.8 CC 11 ND 3 FO 7 NS 2 +``` + +**Critical** = refactor now. **High** = refactor next time you touch it. **Low/Moderate** = leave it alone. + +## What the score means + +Each function gets a **Local Risk Score (LRS)** computed from four structural metrics: + +| Metric | Measures | +|---|---| +| **CC** — Cyclomatic Complexity | Independent decision paths | +| **ND** — Nesting Depth | Maximum nesting of control structures | +| **FO** — Fan-Out | Distinct functions called | +| **NS** — Non-Structured Exits | Early returns, throws, breaks | + +`LRS = 1.0×CC + 0.8×ND + 0.6×FO + 0.7×NS` (log-scaled, then summed) + +## Add to CI + +Block PRs that introduce new critical-risk functions: + +```yaml +- uses: actions/checkout@v4 + with: + fetch-depth: 0 +- uses: Stephen-Collins-tech/hotspots-action@v1 + with: + github-token: ${{ secrets.GITHUB_TOKEN }} +``` + +That's it — the action posts PR comments and exits 1 on policy violations. + +## Next steps + +- [Usage Guide](USAGE.md) — snapshot mode, delta diffs, output formats, policy config +- [CLI Reference](REFERENCE.md) — all flags, config schema, JSON schema diff --git a/docs/reference/cli.md b/docs/reference/cli.md deleted file mode 100644 index 0078c49..0000000 --- a/docs/reference/cli.md +++ /dev/null @@ -1,1238 +0,0 @@ -# CLI Reference - -Complete command-line reference for Hotspots. 
- -## Installation - -```bash -# Install from source -cargo install --path hotspots-cli - -# Or use pre-built binary -# Download from GitHub releases -``` - -## Global Options - -All commands support: - -```bash -hotspots --help # Show help -hotspots --version # Show version -``` - ---- - -## Commands - -### `hotspots analyze` - -Analyze source files for complexity metrics. - -**Supported Languages:** TypeScript, JavaScript, Go, Java, Python, Rust - -#### Basic Usage - -```bash -# Analyze a single file -hotspots analyze src/app.ts - -# Analyze a directory (recursive) -hotspots analyze src/ - -# Analyze with JSON output -hotspots analyze src/ --format json -``` - -#### Options - -##### `` -**Required.** Path to source file or directory to analyze. - -```bash -hotspots analyze src/ -hotspots analyze lib/utils.ts -``` - -##### `--format ` -**Optional.** Output format: `text`, `json`, `jsonl`, `html`, or `sarif`. -**Default:** `text` - -```bash -# Human-readable text output (default) -hotspots analyze src/ --format text - -# Machine-readable JSON output -hotspots analyze src/ --format json - -# Streaming JSONL output (one object per line) -hotspots analyze src/ --format jsonl - -# Interactive HTML report (requires --mode) -hotspots analyze src/ --format html --mode snapshot - -# SARIF 2.1.0 for GitHub code scanning (requires --mode snapshot) -hotspots analyze . --mode snapshot --format sarif --output .hotspots/results.sarif - -# Or capture stdout -hotspots analyze . --mode snapshot --format sarif > results.sarif -``` - -**Notes:** -- HTML format requires `--mode snapshot` or `--mode delta`. -- SARIF format requires `--mode snapshot`. Unlike HTML, SARIF has no default output file — without `--output`, SARIF is written to stdout. -- `text` output is color-coded when stdout is a TTY: `critical` rows are red bold, `high` are yellow bold, and `moderate`/`low` are green. 
Colors are automatically suppressed when piped, redirected, or when the `NO_COLOR` environment variable is set (per [no-color.org](https://no-color.org/)). - -##### `--mode ` -**Optional.** Output mode: `snapshot`, `delta`, or `models`. - -```bash -# Snapshot mode: capture current state -hotspots analyze src/ --mode snapshot --format json - -# Delta mode: compare against parent commit -hotspots analyze src/ --mode delta --format json - -# Models mode: rank data models by associated hotspot risk -hotspots analyze src/ --mode models --format text -hotspots analyze src/ --mode models --format json --top 10 -``` - -**Snapshot mode:** -- Captures current complexity state with git metadata -- Persists to `.hotspots/snapshots/` (mainline only) -- Updates `.hotspots/index.json` -- Computes aggregates for output - -**Delta mode:** -- Compares current state vs parent commit -- Supports policy evaluation with `--policy` -- Shows complexity changes (ΔLRS) -- PR mode: compares vs merge-base -- Mainline mode: compares vs direct parent - -**Models mode:** -- Extracts first-party model declarations from supported languages -- Associates functions in the same file or files that directly import the model file -- Ranks models by the sum of their top 5 associated function risk scores -- Supports text and JSON output - -##### `--policy` -**Optional.** Evaluate policies (only valid with `--mode delta`). 
- -```bash -# Run policy checks -hotspots analyze src/ --mode delta --policy --format text - -# Fail CI build on policy violations -hotspots analyze src/ --mode delta --policy --format json || exit 1 -``` - -**Policy Types:** -- **Blocking failures:** Exit code 1 on violations - - No regressions allowed (LRS must not increase) - - No band transitions to higher risk -- **Warnings:** Exit code 0, informational - - Watch threshold (approaching moderate) - - Attention threshold (approaching high) - - Rapid growth (>50% LRS increase) - - Net repository regression - -**Requires:** `--mode delta` - -##### `--top ` -**Optional.** Show only top N functions by LRS. -**Overrides:** Config file `top` value. - -```bash -# Show top 20 highest-complexity functions -hotspots analyze src/ --top 20 - -# No limit (show all) -hotspots analyze src/ -``` - -##### `--min-lrs ` -**Optional.** Filter functions below minimum LRS threshold. -**Overrides:** Config file `min_lrs` value. - -```bash -# Only show functions with LRS ≥ 5.0 -hotspots analyze src/ --min-lrs 5.0 - -# Show everything (no filter) -hotspots analyze src/ --min-lrs 0.0 -``` - -##### `--config ` -**Optional.** Path to configuration file. -**Default:** Auto-discover from project root. - -```bash -# Use specific config file -hotspots analyze src/ --config custom-config.json - -# Use CI-specific config -hotspots analyze src/ --config .hotspots.ci.json -``` - -See [Configuration](../guide/configuration.md) for config file format. - -##### `--output ` -**Optional.** Output file path (for HTML and SARIF formats). -**Default:** `.hotspots/report.html` for HTML; SARIF has no default file and is written to stdout. - -```bash -# Write HTML report to custom location -hotspots analyze src/ --mode snapshot --format html --output reports/complexity.html -``` - -**Only applicable to HTML and SARIF formats.** - -##### `--explain` -**Optional.** Show human-readable per-function risk breakdown.
-**Only valid with:** `--mode snapshot --format text` -**Mutually exclusive with:** `--level` - -```bash -# Show ranked functions with full risk factor explanations -hotspots analyze . --mode snapshot --format text --explain - -# Limit to top 10 functions -hotspots analyze . --mode snapshot --format text --explain --top 10 -``` - -Displays per-function metric contributions (CC, ND, FO, NS), activity signals (churn, -touch count, fan-in, SCC, depth), and a co-change coupling section at the end showing -the top 10 high/moderate source-file pairs. - -**Note:** In snapshot mode with `--format text`, you must specify either `--explain` -or `--level `. - -##### `--explain-patterns` -**Optional.** Populate and emit per-pattern trigger details (`pattern_details`). -**Valid with:** all modes and formats. -**Cannot be used with:** `--mode delta` when a mode is specified. - -```bash -# Basic mode: Tier 1 pattern details in text output -hotspots analyze src/ --explain-patterns - -# Snapshot mode: Tier 1 + Tier 2 details, JSON output -hotspots analyze src/ --mode snapshot --format json --explain-patterns - -# Combined with --explain for full per-function breakdown -hotspots analyze src/ --mode snapshot --format text --explain --explain-patterns -``` - -In **text** output, each function row is followed by indented lines showing the metric values that triggered each pattern: - -``` -14.10 critical src/imports.rs 622 resolve_cargo_workspace_edges complex_branching, god_function - complex_branching: cc=15 (>=10), nd=5 (>=4) - god_function: loc=120 (>=60), fo=14 (>=10) -``` - -In **JSON** output, the `pattern_details` array is populated on each function object (otherwise omitted). - -In **snapshot mode**, Tier 2 enriched patterns (those requiring call graph and git history) are also included in the details. - -##### `--level ` -**Optional.** Switch to a higher-level ranked view instead of per-function output. 
-**Only valid with:** `--mode snapshot --format text` -**Mutually exclusive with:** `--explain` - -| Value | Output | -|----------|----------------------------------------------------------------------| -| `file` | Ranked file risk table (max CC, avg CC, function count, LOC, churn) | -| `module` | Ranked module instability table (afferent, efferent, instability) | - -```bash -# File-level risk view (ranked by composite file_risk_score) -hotspots analyze . --mode snapshot --format text --level file - -# Module (directory) instability view -hotspots analyze . --mode snapshot --format text --level module - -# Limit to top 20 entries -hotspots analyze . --mode snapshot --format text --level file --top 20 -``` - -**Note:** In snapshot mode with `--format text`, you must specify either `--level` -or `--explain`. - -##### `--force` / `-f` -**Optional.** Overwrite an existing snapshot if one already exists for this commit. - -```bash -hotspots analyze . --mode snapshot --force -``` - -Snapshots are normally immutable (identified by commit SHA). Use `--force` to -regenerate a snapshot after a config change or to correct a prior run. - -**Mutually exclusive with:** `--no-persist` - -##### `--no-persist` -**Optional.** Analyze without writing the snapshot to disk. -**Only valid with:** `--mode snapshot` or `--mode delta` -**Mutually exclusive with:** `--force` - -```bash -# Run snapshot analysis without saving to .hotspots/ -hotspots analyze . --mode snapshot --no-persist --format json -``` - -Useful for one-off inspection or CI pipelines where snapshot history is not needed. - -##### `--per-function-touches` -**Optional.** Use `git log -L` to compute per-function touch counts instead of -file-level counts. -**Only valid with:** `--mode snapshot` or `--mode delta` - -```bash -hotspots analyze . --mode snapshot --per-function-touches -``` - -**Warning:** Approximately 50× slower on a cold run (no cache). 
An on-disk touch cache -is written after the first run; subsequent warm runs are significantly faster. Default -touch metrics are file-level (all functions in a file share the same `touch_count_30d`). -Use this flag when precise per-function activity signals are required. - -##### `--no-per-function-touches` -**Optional.** Force file-level touch batching, overriding a `per_function_touches: true` -config file setting. - -```bash -hotspots analyze . --mode snapshot --no-per-function-touches -``` - -##### `--skip-touch-metrics` -**Optional.** Skip all touch metric computation (no git log I/O). Touch counts are -reported as `0`. Recommended for very large repositories (50k+ functions) where git -history traversal dominates analysis time. - -```bash -hotspots analyze . --mode snapshot --skip-touch-metrics -``` - -##### `--callgraph-skip-above` -**Optional.** Skip call graph betweenness centrality computation when the number of -call graph edges exceeds this threshold. Betweenness centrality is O(V·E) and becomes -prohibitively slow on large call graphs. - -```bash -hotspots analyze . --mode snapshot --callgraph-skip-above 50000 -``` - -Fan-in and fan-out are still computed; only betweenness (and derived PageRank) is -skipped. Recommended for repositories with 50k+ functions where the full call graph -would otherwise take many minutes. - -##### `--skip-gate` -**Optional.** Disable the suppression gate P@10 calibration check that runs after every -`analyze --mode snapshot`. Use when you know the gate would fire spuriously (e.g. a -codebase with no conventional fix-commit keywords, or a greenfield repo with no bug history). - -```bash -hotspots analyze . --mode snapshot --skip-gate -``` - -##### `--all-functions` -**Optional.** Output all functions as a flat array instead of the default triage-first -structure (quadrant buckets). -**Only valid with:** `--mode snapshot --format json` - -```bash -hotspots analyze . 
--mode snapshot --format json --all-functions -``` - -Produces schema v2 full snapshot output: a flat `functions` array containing every -function, regardless of quadrant. The default triage-first structure is schema v4 and -groups functions into `fire`, `debt`, `watch`, and `ok` buckets. Use `--all-functions` -when consuming output in tooling that needs the complete snapshot shape. - -See [JSON Schema Reference](../guide/output-formats.md#schema-versions) for schema details. - -##### `--include-models` -**Optional.** Include the model risk map in snapshot JSON and HTML reports. -**Only valid with:** `--mode snapshot --format json` or `--mode snapshot --format html` - -```bash -hotspots analyze . --mode snapshot --format json --include-models -hotspots analyze . --mode snapshot --format html --include-models -``` - -Adds the top model risk concentrations and shared-reference `links` under -`architecture.models` in agent JSON output, `aggregates.models` in `--all-functions` -JSON output, and a Model Risk Map section in HTML reports. - -#### Examples - -**Basic analysis (text output):** -```bash -hotspots analyze src/ -``` - -**JSON output for CI:** -```bash -hotspots analyze src/ --format json --min-lrs 5.0 > analysis.json -``` - -**Snapshot with HTML report:** -```bash -hotspots analyze src/ --mode snapshot --format html -# Opens .hotspots/report.html -``` - -**Delta with policy enforcement:** -```bash -hotspots analyze src/ --mode delta --policy --format text -# Exit code 1 if blocking failures detected -``` - -**PR mode (automatic in CI):** -```bash -# In GitHub Actions with PR context -hotspots analyze src/ --mode delta --policy --format json -# Compares vs merge-base automatically -``` - -**Override config settings:** -```bash -hotspots analyze src/ --config .hotspots.ci.json --min-lrs 6.0 --top 50 -``` - -**File-level risk view:** -```bash -hotspots analyze . --mode snapshot --format text --level file -hotspots analyze . 
--mode snapshot --format text --level file --top 20 -``` - -**Module instability view:** -```bash -hotspots analyze . --mode snapshot --format text --level module -``` - -**Human-readable per-function explanations with co-change section:** -```bash -hotspots analyze . --mode snapshot --format text --explain -``` - -**Snapshot without persisting (read-only inspection):** -```bash -hotspots analyze . --mode snapshot --no-persist --format json -``` - -**Pattern trigger details:** -```bash -# Show what triggered each pattern (text) -hotspots analyze src/ --explain-patterns - -# Pattern details in JSON for tooling -hotspots analyze src/ --mode snapshot --format json --explain-patterns -``` - ---- - -### `hotspots train` - -Fit a local RandomForest ranker from the repo's own fix-commit git history. - -The trained model scores functions based on which structural features (complexity, churn, call graph) correlate with real bugs in your codebase — not a generic heuristic. Once trained, `hotspots analyze` picks up the ranker automatically and uses it to recompute triage quadrants. - -#### Usage - -```bash -# Train with file-level labels (fast) -hotspots train . - -# Train with blame-based function-level labels (more precise, slower) -hotspots train . --blame - -# Custom label window and output path -hotspots train . --blame --label-window 180 --output .hotspots/ranker.json -``` - -#### Arguments - -##### `[PATH]` -**Optional.** Path to repository root. -**Default:** `.` (current directory) - -```bash -hotspots train /path/to/repo --blame -``` - -#### Options - -##### `--blame` -**Optional.** Use blame-based function-level labelling instead of file-level labelling. - -Without `--blame`, every function in a file touched by a fix commit is marked positive. With `--blame`, `git diff-tree` hunk headers are parsed to identify the exact function that owned the changed lines — only that function is marked positive. 
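The difference between the two labelling modes can be sketched as follows. This is an illustrative model only, not the Hotspots implementation: the `Function` record and the `label_functions` helper are hypothetical names, and the changed line ranges are assumed to have been extracted from the `git diff-tree` hunk headers already.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Function:
    """Hypothetical record: a function and its inclusive line span in a file."""
    name: str
    start: int
    end: int


def label_functions(functions, changed_ranges, blame=False):
    """Return the names marked positive for one fix commit touching one file.

    changed_ranges holds inclusive (start, end) line ranges; this sketch
    assumes they were already parsed out of the hunk headers.
    """
    if not blame:
        # File-level labelling: every function in the touched file is positive.
        return {f.name for f in functions}
    # Function-level labelling: only functions overlapping a changed range.
    return {
        f.name
        for f in functions
        for (lo, hi) in changed_ranges
        if f.start <= hi and lo <= f.end
    }


functions = [
    Function("parse", 1, 40),
    Function("validate", 42, 90),
    Function("save", 92, 120),
]
changed = [(50, 55)]  # the fix commit edited lines 50-55

print(sorted(label_functions(functions, changed)))              # all three functions
print(sorted(label_functions(functions, changed, blame=True)))  # only 'validate'
```

With file-level labels a single small fix marks three functions positive; with function-level labels only the one that contains the edited lines is marked.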
- -**Trade-off:** More precise signal but one `git diff-tree` subprocess per fix commit. Recommended for repos with many large files. - -```bash -# File-level labels (default): fast, noisy -hotspots train . - -# Blame-based function labels: slower, more precise -hotspots train . --blame -``` - -##### `--label-window ` -**Optional.** Days of git history to scan for fix-commit labels. -**Default:** `365` - -```bash -# Use 6 months of history -hotspots train . --blame --label-window 180 -``` - -##### `--n-estimators ` -**Optional.** Number of trees in the RandomForest. -**Default:** `200` - -##### `--max-depth ` -**Optional.** Maximum tree depth. -**Default:** `6` - -##### `--output ` -**Optional.** Output path for the trained model JSON. -**Default:** `.hotspots/ranker.json` - -##### `--screen` -**Optional.** Run a pre-flight check before training to verify the repo has enough signal for a useful model. - -`--screen` inspects the repo's git history and snapshot without fitting any model. It checks three things: whether there are enough fix commits in the label window, whether the snapshot has enough labeled functions, and whether the positive/negative label ratio is workable. If any check fails it prints a clear reason and exits without training. - -Use it when you're not sure whether a repo is a good candidate for training, or to diagnose why a previous `hotspots train` produced a poor model. - -```bash -hotspots train . --screen # check only, no model written -hotspots train . --blame --screen # blame-mode label scan, then check -``` - -##### `--eval` -**Optional.** After training, evaluate the model using Precision@K and print a table. - -`hotspots train` always produces a model — it never tells you whether that model is actually better than the default LRS ranking. 
`--eval` answers that question by scoring every function with the freshly trained model, sorting them highest to lowest, and checking how many of the top K functions genuinely appeared in a bug-fix commit (Precision@K). It prints results for K = 10, 20, 50, 100, and 200 alongside the base rate — the fraction you'd get by picking functions at random. - -**How to read the output:** - -``` -P@K evaluation (365-day fix-label window): - K P@K base_rate - 10 0.400 0.084 - 20 0.300 0.084 - 50 0.200 0.084 - 100 0.150 0.084 - 200 0.110 0.084 -``` - -- **P@K well above base_rate** (e.g. P@10 = 0.40 vs base_rate = 0.08) — the model is finding real bugs at the top of the list. Apply it. -- **P@K ≈ base_rate** (e.g. P@10 = 0.09 vs base_rate = 0.08) — the model learned nothing useful from this repo's history. Do not apply it; the default LRS ranking is just as good and won't mislead you. - -**When training is not worth it:** - -- Library/framework repos — bugs are rare, base rate is tiny, signal is noise -- Repos with less than ~1 year of history — not enough fix commits to learn from -- Repos with poor commit hygiene — if fix commits don't use conventional keywords (`fix:`, `bug`, `patch`, `hotfix`), the label scanner misses them -- Mature stable codebases — bugs are subtle and spread across the whole codebase rather than concentrated in hot files; P@K can be zero even when the model looks fine by other measures - -**Note:** `--eval` re-scans git history for fix-commit labels, which adds a few seconds. Without `--eval`, training behavior is unchanged. - -```bash -# Train and immediately check whether it's worth using -hotspots train . --blame --eval -``` - -#### Training signal requirements - -Training returns an error if the snapshot has fewer than 50 functions, or if the fix scan yields fewer than 5 positive or 10 negative labels. If this happens: - -- Try a larger `--label-window` (e.g. `--label-window 730` for 2 years) -- Run `hotspots analyze . 
--mode snapshot --force` to regenerate a fresh snapshot -- Check that your fix commits use conventional keywords (`fix:`, `bug`, `patch`, `hotfix`, `regression`, `defect`) - -#### How it integrates with `hotspots analyze` - -Once `.hotspots/ranker.json` exists, `hotspots analyze` automatically: - -1. Loads the ranker and scores every function -2. Uses RF scores in place of the activity heuristic to determine triage quadrants -3. Promotes high-RF functions from `debt` → `fire` where warranted - -The suppression gate runs on every `analyze` to verify the heuristic ranker is working. If it detects poor P@10, it recommends running `hotspots train`. - -#### Feature set - -The v3 model trains on 8 structural features: `lrs`, `cc`, `nd`, `loc`, `fo`, `fan_in`, `total_churn`, `authors_90d`. Windowed activity signals (`touch_count_30d`, `days_since_last_change`, `activity_risk`) are deliberately excluded to prevent temporal leakage from the label scan window. - -#### Examples - -```bash -# Basic training (fast, file-level labels) -hotspots train . - -# Precise blame-based training -hotspots train . --blame - -# Train and check whether the model is worth applying -hotspots train . --blame --eval - -# Train on a specific repo with 2-year history -hotspots train /path/to/repo --blame --label-window 730 - -# Re-train after new commits land -hotspots train . --blame && hotspots analyze . -``` - ---- - -### `hotspots prune` - -Prune unreachable snapshots to reduce storage. - -#### Usage - -```bash -# Prune unreachable snapshots -hotspots prune --unreachable - -# Dry-run (preview what would be deleted) -hotspots prune --unreachable --dry-run - -# Prune only snapshots older than 30 days -hotspots prune --unreachable --older-than 30 -``` - -#### Options - -##### `--unreachable` -**Required.** Must be explicitly specified to confirm pruning. - -**Safety:** Prevents accidental data loss. - -##### `--older-than ` -**Optional.** Only prune snapshots older than N days. 
- -```bash -# Keep recent history, prune old unreachable snapshots -hotspots prune --unreachable --older-than 90 -``` - -##### `--dry-run` -**Optional.** Preview pruning without deleting. - -```bash -# See what would be pruned -hotspots prune --unreachable --dry-run -``` - -#### Output - -``` -Pruned 15 snapshots - -Pruned commit SHAs: - abc123... - def456... - ... - -Reachable snapshots: 42 -Unreachable snapshots kept (due to age filter): 8 -``` - -#### How Pruning Works - -1. **Find repository root** (searches up for `.git`) -2. **Identify reachable commits** via `refs/heads/*` (local branches) -3. **Mark unreachable snapshots** for deletion -4. **Apply age filter** (if `--older-than` specified) -5. **Delete snapshot files** from `.hotspots/snapshots/` -6. **Update index** (`.hotspots/index.json`) - -**Note:** Only prunes unreachable snapshots. Reachable history is preserved. - ---- - -### `hotspots compact` - -Compact snapshot history to reduce storage. - -#### Usage - -```bash -# Set compaction level -hotspots compact --level 0 -``` - -#### Options - -##### `--level ` -**Required.** Compaction level: `0`, `1`, or `2`. - -**Levels:** -- **Level 0:** Full snapshots (current implementation) -- **Level 1:** Deltas only (planned) -- **Level 2:** Band transitions only (planned) - -**Note:** Levels 1 and 2 are not yet implemented. Command only sets metadata. - -#### Output - -``` -Compaction level set to 0 (was 0) -``` - ---- - -### `hotspots trends` - -Analyze complexity trends from snapshot history. - -#### Usage - -```bash -# Analyze trends (last 10 snapshots, top 5 functions) -hotspots trends . - -# Custom window and top-K -hotspots trends . --window 20 --top 10 - -# JSON output -hotspots trends . --format json > trends.json -``` - -#### Options - -##### `` -**Required.** Path to repository root. - -##### `--format ` -**Optional.** Output format: `text`, `json`, `jsonl`, or `html`. -**Default:** `json` - -```bash -hotspots trends . --format text -hotspots trends . 
--format html -``` - -##### `--window ` -**Optional.** Number of snapshots to analyze. -**Default:** `10` - -```bash -# Analyze last 20 snapshots -hotspots trends . --window 20 -``` - -##### `--top ` -**Optional.** Top K functions for hotspot analysis. -**Default:** `5` - -```bash -# Track top 10 hotspots -hotspots trends . --top 10 -``` - -#### Output (Text Format) - -``` -Trends Analysis -================================================================================ - -Risk Velocities: -Function Velocity Direction First LRS Last LRS ----------------------------------------------------------------------------------------------------- -processOrder 0.50 positive 3.50 5.50 -validateInput -0.20 negative 4.20 3.00 - -Hotspot Stability: -Function Stability Overlap Appearances ----------------------------------------------------------------------------------------- -processPayment stable 0.90 9/10 -handleError emerging 0.60 6/10 - -Refactor Effectiveness: -Function Outcome Improvement Sustained ----------------------------------------------------------------------------------------- -refactoredFunction successful -2.50 5 - -Summary: - Risk velocities: 15 - Hotspots analyzed: 8 - Refactors detected: 3 -``` - -#### Metrics Explained - -**Risk Velocity:** -- LRS change per snapshot -- Positive: increasing complexity -- Negative: decreasing complexity -- Flat: stable - -**Hotspot Stability:** -- Stable: consistently in top-K (>80% overlap) -- Emerging: recently entered top-K (60-80% overlap) -- Volatile: intermittently in top-K (<60% overlap) - -**Refactor Effectiveness:** -- Successful: sustained LRS reduction (>3 commits) -- Partial: temporary improvement (1-2 commits) -- Cosmetic: no sustained improvement - ---- - -### `hotspots config validate` - -Validate configuration file without running analysis. 
- -#### Usage - -```bash -# Validate auto-discovered config -hotspots config validate - -# Validate specific config file -hotspots config validate --path custom-config.json -``` - -#### Options - -##### `--path ` -**Optional.** Path to config file. -**Default:** Auto-discover from current directory. - -#### Output - -**Valid config:** -``` -Config valid: .hotspotsrc.json -``` - -**No config found:** -``` -No config file found. Using defaults. -``` - -**Invalid config:** -``` -Config validation failed: thresholds.moderate (6.0) must be less than thresholds.high (5.0) -``` - -Exit code 1 on validation failure. - ---- - -### `hotspots config show` - -Show resolved configuration (merged defaults + config file). - -#### Usage - -```bash -# Show resolved config -hotspots config show - -# Show specific config file -hotspots config show --path .hotspots.ci.json -``` - -#### Options - -##### `--path ` -**Optional.** Path to config file. -**Default:** Auto-discover from current directory. - -#### Output - -``` -Configuration: - Source: .hotspotsrc.json - -Weights: - cc: 1.0 - nd: 0.8 - fo: 0.6 - ns: 0.7 - -Thresholds: - moderate: 3.0 - high: 6.0 - critical: 9.0 - -Filters: - min_lrs: 3.0 - top: 50 - include: all files - exclude: active (custom patterns) -``` - ---- - -### `hotspots init` - -Print hook templates for CI/CD integration. - -#### Usage - -```bash -# Print pre-commit and shell hook templates -hotspots init --hooks -``` - -#### Options - -##### `--hooks` -**Required to produce output.** Prints two hook templates to stdout: - -1. **Option 1 — pre-commit framework** (`.pre-commit-config.yaml` snippet) -2. **Option 2 — raw shell hook** (standalone `#!/usr/bin/env sh` script for `.git/hooks/pre-push`) - -If `--hooks` is not specified, prints a usage hint and exits successfully. - -#### Setup - -Delta mode requires a baseline snapshot to compare against. Seed one before enabling the hook: - -```bash -# 1. Seed the baseline -hotspots analyze . --mode snapshot - -# 2. 
Print and copy the hook template -hotspots init --hooks - -# 3a. pre-commit framework: append the YAML block to .pre-commit-config.yaml -# then install with: pre-commit install --hook-type pre-push - -# 3b. Raw shell hook: save the sh block (starting with #!/usr/bin/env sh) -# as .git/hooks/pre-push and run: chmod +x .git/hooks/pre-push -``` - -#### Example Output - -``` -# ── SETUP (run once before enabling the hook) ───────────────────── -# Seed the baseline first: -# -# hotspots analyze . --mode snapshot - -# ── Option 1: pre-commit framework ─────────────────────────────────── -# Add the following to .pre-commit-config.yaml: - -repos: - - repo: local - hooks: - - id: hotspots - name: hotspots risk check - language: system - entry: hotspots analyze . --mode delta --policy --format text - pass_filenames: false - stages: [pre-push] - -# ── Option 2: raw shell hook ───────────────────────────────────────── -# Save the lines below (starting with the shebang) as .git/hooks/pre-push -# and run: chmod +x .git/hooks/pre-push - -#!/usr/bin/env sh -set -e -hotspots analyze . --mode delta --policy --format text -``` - -Both hooks run `--mode delta --policy` on push and exit non-zero if blocking policy violations are found. - ---- - -### `hotspots diff` - -Compare analysis snapshots between any two git refs. Both refs must have existing snapshots (created with `hotspots analyze --mode snapshot`). 
- -#### Usage - -```bash -hotspots diff [OPTIONS] -``` - -#### Arguments - -| Argument | Description | -|----------|-------------| -| `` | Base git ref (branch, tag, SHA, or `HEAD~N`) | -| `` | Head git ref (branch, tag, SHA, or `HEAD~N`) | - -#### Options - -| Flag | Description | -|------|-------------| -| `--format ` | Output format: `text` (default), `json`, `jsonl`, `html` | -| `--output ` | Write output to file instead of stdout (HTML default: `.hotspots/delta-report.html`) | -| `--policy` | Evaluate policy rules; exit 1 on blocking failures | -| `--top ` | Limit output to top N changed functions by risk magnitude | -| `--config ` | Path to config file (default: auto-discover) | -| `--auto-analyze` | Analyze missing refs automatically using git worktrees *(not yet implemented)* | - -#### Exit Codes - -| Code | Meaning | -|------|---------| -| `0` | Success | -| `1` | Blocking policy failure (`--policy` only) | -| `3` | One or both snapshots missing | - -#### Examples - -```bash -# Compare current branch against main -hotspots diff main HEAD - -# Compare two release tags -hotspots diff v1.0.0 v2.0.0 - -# Top 10 changed functions with policy check -hotspots diff main HEAD --top 10 --policy - -# JSON output for scripting -hotspots diff main HEAD --format json - -# HTML report written to file -hotspots diff main HEAD --format html --output reports/delta.html -``` - -#### Notes - -- `--top` truncation happens **after** policy evaluation, so violations outside the top N are still detected. -- Annotated tags are peeled to the underlying commit SHA before snapshot lookup. -- Use `hotspots analyze --mode snapshot` to create snapshots for refs that don't have one yet. 
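A wrapper script can branch on the exit codes listed above. The following is a minimal sketch under stated assumptions: `describe_exit` and `run_diff` are hypothetical helper names, and only the flags and exit codes documented in this section are used.

```python
import subprocess

# Exit codes as documented for `hotspots diff` in this section.
EXIT_MEANINGS = {
    0: "success",
    1: "blocking policy failure",
    3: "one or both snapshots missing",
}


def describe_exit(code):
    """Map a `hotspots diff` exit code to a short description."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_diff(base, head, policy=True):
    """Run `hotspots diff` for two refs and return (exit_code, description).

    Requires the `hotspots` binary on PATH; it is not invoked by this
    sketch unless run_diff() is called explicitly.
    """
    cmd = ["hotspots", "diff", base, head, "--format", "json"]
    if policy:
        cmd.append("--policy")
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, describe_exit(result.returncode)


for code in (0, 1, 3, 2):
    print(code, describe_exit(code))
```

Calling `run_diff("main", "HEAD")` in CI and treating code `3` as "seed snapshots first" rather than as a policy failure keeps missing baselines from being mistaken for regressions.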
- ---- - -## Exit Codes - -| Code | Meaning | -|------|---------| -| `0` | Success (or warnings only) | -| `1` | Error or blocking policy failure | -| `3` | Snapshot missing (`hotspots diff` only) | - -**Policy Evaluation:** -- Exit code `0` if only warnings (watch, attention, rapid growth) -- Exit code `1` if blocking failures (regressions, band transitions) - ---- - -## Configuration Priority - -CLI flags override config file values: - -1. **CLI flags** (highest priority) -2. **Config file** (`.hotspotsrc.json`, etc.) -3. **Defaults** (lowest priority) - -```bash -# Config file says min_lrs: 3.0, CLI overrides to 5.0 -hotspots analyze src/ --min-lrs 5.0 -``` - -See [Configuration](../guide/configuration.md) for details. - ---- - -## Environment Variables - -Hotspots respects git environment variables for repository operations: - -- `GIT_DIR` - Override `.git` directory location -- `GIT_WORK_TREE` - Override working directory - -**CI/CD Detection (PR mode):** -- `GITHUB_EVENT_NAME=pull_request` - GitHub Actions PR -- `CI_MERGE_REQUEST_IID` - GitLab MR -- `CIRCLE_PULL_REQUEST` - CircleCI PR -- `TRAVIS_PULL_REQUEST` - Travis CI PR - -When PR context is detected, delta mode compares vs merge-base instead of direct parent. - ---- - -## Common Workflows - -### Local Development - -```bash -# Quick complexity check -hotspots analyze src/ --top 20 - -# Detailed analysis with filtering -hotspots analyze src/ --min-lrs 5.0 --format json | jq . -``` - -### CI/CD Integration - -**GitHub Actions:** -```yaml -- name: Complexity Analysis - run: hotspots analyze src/ --mode delta --policy --format json -``` - -**With config file:** -```yaml -- name: Complexity Analysis - run: hotspots analyze src/ --config .hotspots.ci.json --mode delta --policy -``` - -See [CI Integration](../guide/ci-cd.md) for complete examples. 
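The PR-context detection listed under Environment Variables can be sketched as below. How Hotspots actually inspects each variable is not stated in this document, so the checks are assumptions: `detect_pr_context` and `delta_baseline` are hypothetical names, and the Travis check treats the string `"false"` as "not a PR" because Travis CI sets `TRAVIS_PULL_REQUEST` to `"false"` on non-PR builds.

```python
import os


def detect_pr_context(env=None):
    """Return the CI provider whose PR variable is set, else None.

    Assumption: this only mirrors the variable list in this document;
    it is not taken from the Hotspots source.
    """
    env = os.environ if env is None else env
    if env.get("GITHUB_EVENT_NAME") == "pull_request":
        return "github"
    if env.get("CI_MERGE_REQUEST_IID"):
        return "gitlab"
    if env.get("CIRCLE_PULL_REQUEST"):
        return "circleci"
    # Travis CI reports the literal string "false" on non-PR builds.
    travis = env.get("TRAVIS_PULL_REQUEST", "")
    if travis and travis != "false":
        return "travis"
    return None


def delta_baseline(env=None):
    """Pick the delta comparison target according to the rule stated above."""
    return "merge-base" if detect_pr_context(env) else "direct parent"


print(delta_baseline({"GITHUB_EVENT_NAME": "pull_request"}))  # merge-base
print(delta_baseline({"GITHUB_EVENT_NAME": "push"}))          # direct parent
```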
- -### Snapshot Management - -```bash -# Capture baseline snapshot -hotspots analyze src/ --mode snapshot --format json > baseline.json - -# Prune old snapshots (keep 30 days) -hotspots prune --unreachable --older-than 30 --dry-run -hotspots prune --unreachable --older-than 30 - -# Analyze trends -hotspots trends . --window 20 --format text -``` - -### Debugging - -```bash -# Validate configuration -hotspots config validate - -# Show resolved config (check what's active) -hotspots config show - -# Test with dry-run -hotspots analyze src/ --format json --dry-run # (if supported) -``` - ---- - -## Troubleshooting - -### "Path does not exist" - -**Cause:** Invalid path argument. - -**Fix:** Verify path exists: -```bash -ls -la src/ -hotspots analyze src/ -``` - -### "not in a git repository" - -**Cause:** Snapshot/delta mode requires git repository. - -**Fix:** Initialize git or use basic analysis: -```bash -git init -# OR -hotspots analyze src/ --format json # (no --mode flag) -``` - -### "--policy flag is only valid with --mode delta" - -**Cause:** Using `--policy` without `--mode delta`. - -**Fix:** -```bash -hotspots analyze src/ --mode delta --policy -``` - -### "HTML format requires --mode snapshot or --mode delta" - -**Cause:** Using `--format html` without `--mode`. - -**Fix:** -```bash -hotspots analyze src/ --format html --mode snapshot -``` - -### "SARIF format requires --mode snapshot" - -**Cause:** Using `--format sarif` without `--mode snapshot`. - -**Fix:** -```bash -hotspots analyze . --mode snapshot --format sarif -``` - -### "Config validation failed" - -**Cause:** Invalid configuration file. - -**Fix:** Validate and fix config: -```bash -hotspots config validate -# Read error message, fix config file -cat .hotspotsrc.json | jq . # Check JSON syntax -``` - -### "unreachable flag must be specified" - -**Cause:** Safety check for prune command. 
- -**Fix:** -```bash -hotspots prune --unreachable -``` - -### "text format without --explain is not supported for snapshot mode" - -**Cause:** Using `--mode snapshot --format text` without `--explain` or `--level`. - -**Fix:** Add `--explain`, `--level`, or use JSON format: -```bash -hotspots analyze . --mode snapshot --format text --explain -hotspots analyze . --mode snapshot --format text --level file -hotspots analyze . --mode snapshot --format json -``` - -### "--level is only valid with --mode snapshot" - -**Cause:** Using `--level` without `--mode snapshot --format text`. - -**Fix:** -```bash -hotspots analyze . --mode snapshot --format text --level file -``` - -### "--level and --explain are mutually exclusive" - -**Cause:** Both `--level` and `--explain` flags specified together. - -**Fix:** Use one or the other: -```bash -hotspots analyze . --mode snapshot --format text --level file -# OR -hotspots analyze . --mode snapshot --format text --explain -``` - -### "--no-persist and --force are mutually exclusive" - -**Cause:** Both `--no-persist` and `--force` flags specified together. - -**Fix:** Use one or the other: -```bash -hotspots analyze . --mode snapshot --no-persist # analyze without saving -hotspots analyze . 
--mode snapshot --force # overwrite existing snapshot -``` - ---- - -## Related Documentation - -- [Configuration Guide](../guide/configuration.md) - Config file format -- [CI Integration](../guide/ci-cd.md) - GitHub Actions, GitLab CI -- [Output Formats](../guide/output-formats.md) - JSON schema, HTML reports -- [LRS Specification](./lrs-spec.md) - How LRS is calculated -- [Policy Engine](../guide/usage.md#policy-engine) - Policy rules and enforcement -- [`hotspots train`](#hotspots-train) - Repo-specific ML ranker from fix-commit history diff --git a/docs/reference/json-schema.md b/docs/reference/json-schema.md deleted file mode 100644 index 242375b..0000000 --- a/docs/reference/json-schema.md +++ /dev/null @@ -1,731 +0,0 @@ -# JSON Schema & Output Format - -Hotspots outputs structured JSON that can be consumed by CI/CD pipelines, analysis tools, and AI assistants. This document describes the output format and provides integration examples. - -## Overview - -Hotspots produces JSON output in two modes: - -- **Snapshot Mode** (`hotspots analyze --mode snapshot --format json`): Complete analysis of all functions in the codebase -- **Delta Mode** (`hotspots analyze --mode delta --format json`): Analysis of changed functions since the last commit - -Both modes use the same JSON schema with consistent structure. - -## Schema Files - -JSON Schema definitions are available in the `schemas/` directory: - -- **`hotspots-output.schema.json`**: Complete output schema (main entry point) -- **`function-report.schema.json`**: Individual function analysis -- **`metrics.schema.json`**: Raw complexity metrics (CC, ND, FO, NS) -- **`policy-result.schema.json`**: Policy violation/warning format - -All schemas follow JSON Schema Draft 07 specification. 
- -## Output Structure - -```typescript -{ - schema_version: 2, - commit: { - sha: "abc123...", // Git commit SHA (40 chars) - parents: ["def456..."], // Parent commit SHAs - timestamp: 1234567890, // Unix timestamp - branch: "main" // Current branch (optional) - }, - analysis: { - scope: "full" | "delta", // Analysis mode - tool_version: "1.0.0" // Hotspots version - }, - functions: [ - { - function_id: "/path/to/file.ts::functionName", - file: "/absolute/path/to/file.ts", - line: 42, - metrics: { - cc: 8, // Cyclomatic Complexity - nd: 2, // Nesting Depth - fo: 4, // Fan-Out - ns: 2 // Non-Structured exits - }, - lrs: 7.2, // Logarithmic Risk Score - band: "high", // Risk band: low | moderate | high | critical - suppression_reason: "Legacy code, refactor planned", // Optional - driver: "high_complexity", // Primary risk driver (optional, see Driver Labels) - driver_detail: "cc (P72), nd (P68)" // Near-miss detail for composite (optional) - } - ], - aggregates: { // Present in snapshot mode output - file_risk: [...], // Ranked file risk views (see Aggregates section) - co_change: [...], // Co-change coupling pairs (see Aggregates section) - modules: [...] // Module instability views (see Aggregates section) - }, - policy_results: { // Optional (when --policy used) - failed: [...], // Blocking failures - warnings: [...] // Non-blocking warnings - } -} -``` - -## Metrics Explained - -### CC - Cyclomatic Complexity -Number of linearly independent paths through the code (decision points + 1). Counts `if`, `while`, `for`, `switch`, `||`, `&&`, `?:`, etc. - -**Example:** -```typescript -function simple() { - return 42; // CC = 1 (one path) -} - -function withBranch(x) { - if (x > 0) { // +1 decision point - return x; - } - return -x; // CC = 2 (two paths) -} -``` - -### ND - Nesting Depth -Maximum level of nested control structures. Deeply nested code is harder to understand and maintain. 
- -**Example:** -```typescript -function nested(x) { - if (x > 0) { // depth 1 - if (x < 100) { // depth 2 - if (x % 2 === 0) { // depth 3 (ND = 3) - return x; - } - } - } - return 0; -} -``` - -### FO - Fan-Out -Number of distinct functions or methods called. High fan-out indicates many dependencies. - -**Example:** -```typescript -function highFanOut() { - validate(); // callee 1 - transform(); // callee 2 - save(); // callee 3 - notify(); // callee 4 - // FO = 4 -} -``` - -### NS - Non-Structured Exits -Number of early returns, throws, breaks, and continues. Multiple exit points increase complexity. - -**Example:** -```typescript -function multipleExits(x) { - if (x < 0) return null; // NS +1 - if (x === 0) throw new Error(); // NS +1 - if (x > 100) return x; // NS +1 - return x * 2; // Normal exit (not counted) - // NS = 3 -} -``` - -### LRS - Logarithmic Risk Score -Composite metric combining all raw metrics with logarithmic scaling: - -``` -LRS = ln(CC + 1) + ln(ND + 1) + ln(FO + 1) + ln(NS + 1) -``` - -Higher scores indicate higher complexity and maintenance risk. - -## Risk Bands - -Functions are classified into risk bands based on LRS: - -| Band | LRS Range | Description | -|------------|------------|--------------------------------| -| Low | < 3.0 | Simple, easy to maintain | -| Moderate | 3.0 - 6.0 | Moderate complexity, acceptable | -| High | 6.0 - 9.0 | Complex, consider refactoring | -| Critical | ≥ 9.0 | Very complex, refactor recommended | - -## Driver Labels - -Each function in snapshot output includes an optional `driver` string identifying the primary -source of its risk. This is computed by the enricher after activity risk and call graph metrics -are populated. 
-
-| Label | Condition | Recommended action |
-|---|---|---|
-| `cyclic_dep` | SCC size > 1 (function is in a dependency cycle) | Break the cycle before adding more callers |
-| `high_complexity` | CC above the Pth percentile of the snapshot | Schedule a refactor; extract sub-functions |
-| `high_churn_low_cc` | touch_count above the Pth percentile and CC below the (100-P)th percentile | Add regression tests before next change |
-| `high_fanout_churning` | fan_out above the Pth percentile and touch_count above the 50th percentile | Extract an interface boundary |
-| `deep_nesting` | ND above the Pth percentile of the snapshot | Flatten with early returns or guard clauses |
-| `high_fanin_complex` | fan_in above the Pth percentile and CC above the 50th percentile | Extract and stabilize; wide blast radius |
-| `composite` | None of the above | Monitor complexity trends |
-
-Thresholds are percentile-relative (default P=75, configurable via `driver_threshold_percentile`).
-`cyclic_dep` is the sole absolute check: being in a cycle is binary.
-
-### `driver_detail` — Near-miss context for composite functions
-
-When a function receives the `composite` label, an optional `driver_detail` string lists the
-top dimensions (up to 3) that came closest to firing a specific label, with their percentile
-rank. Example: `"cc (P72), nd (P68)"` means cyclomatic complexity is at the 72nd percentile
-and nesting depth at the 68th, both notable but below the P75 threshold. Only dimensions
-above the 40th percentile are included.
-
-`driver_detail` is omitted from JSON when null (`skip_serializing_if = "Option::is_none"`),
-so it is forward-compatible with parsers that read existing v2 snapshots.
-
-## Aggregates
-
-Snapshot mode output includes an `aggregates` object with three arrays providing higher-level
-views of codebase risk. These are computed from the per-function data at output time.
-
-### `aggregates.file_risk` — File-Level Risk
-
-Each entry covers one source file. Ranked by `file_risk_score` descending.
- -```typescript -{ - file: "src/api.ts", // Relative file path - function_count: 12, // Number of functions in file - loc: 340, // Total lines of code - max_cc: 14, // Highest cyclomatic complexity in file - avg_cc: 6.8, // Mean CC across all functions - critical_count: 2, // Functions in critical band - file_churn: 180, // Lines changed in last 30 days - file_risk_score: 8.3 // Composite score: max_cc×0.4 + avg_cc×0.3 - // + log2(fn_count+1)×0.2 + churn_factor×0.1 -} -``` - -Accessible via `--level file` text output or `aggregates.file_risk` in JSON. - -### `aggregates.co_change` — Co-Change Coupling - -Pairs of files that frequently change together in the same commit. High coupling with -no static dependency indicates hidden implicit coupling — a classic maintenance risk. - -```typescript -{ - file_a: "hotspots-cli/src/main.rs", - file_b: "hotspots-core/src/aggregates.rs", - co_change_count: 14, // Times changed in the same commit - coupling_ratio: 0.78, // co_change_count / min(total_a, total_b) - has_static_dep: false, // true if a direct import exists between the two files - risk: "high" // "high" | "moderate" | "expected" | "low" - // "expected" if has_static_dep (coupling is explained) - // "high" if ratio > 0.5 and no static dep - // "moderate" if ratio > 0.25 and no static dep -} -``` - -Only pairs where both files currently exist are emitted (ghost files from renames are -filtered). Trivially expected pairs (e.g., `foo.rs` + `foo_test.rs`) are also filtered. - -`has_static_dep` uses the same import graph as module instability (D-3). Pairs with a -static dependency are classified as `"expected"` — the co-change is explained by the -import relationship and is lower risk than hidden coupling. - -Default window: 90 days; minimum co-occurrence count: 3. - -### `aggregates.modules` — Module Instability - -Each entry covers one directory. Applies Robert Martin's instability metric at directory -level. `instability = efferent / (afferent + efferent)`. 
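As a quick check of the formula, here is a minimal sketch. The `instability` helper name and the fallback for a module with no couplings are assumptions for illustration; the source does not say how Hotspots handles that case.

```python
def instability(afferent, efferent):
    """Robert Martin's instability: efferent / (afferent + efferent).

    Assumption: a module with no couplings at all is reported as 0.0 here.
    """
    total = afferent + efferent
    if total == 0:
        return 0.0
    return efferent / total


# 8 external modules depend on this one; it depends on 3 others.
print(round(instability(8, 3), 2))  # 0.27
```

These are the same `afferent` and `efferent` values used in the sample entry that follows.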
- -```typescript -{ - module: "hotspots-core/src", // Directory path - file_count: 12, // Number of files - function_count: 409, // Number of functions - avg_complexity: 3.2, // Mean CC of all functions - afferent: 8, // External modules depending on this one - efferent: 3, // External modules this one depends on - instability: 0.27, // efferent / (afferent + efferent) - module_risk: "high" // "high" if instability < 0.3 and avg_complexity > 10 -} -``` - -Instability near 0.0 means everything depends on this module — risky to change. -Instability near 1.0 means this module depends on others but nothing depends on it — safe. - -Accessible via `--level module` text output or `aggregates.modules` in JSON. - -## Delta Aggregates - -Delta mode output (`--mode delta`) includes an `aggregates` object with file-level -regression summaries and co-change coupling changes. - -### `aggregates.co_change_delta` — Co-Change Pair Diff - -Each entry describes a co-change pair that is **new**, **dropped**, or changed risk -relative to the previous snapshot (if prior state is available). When no prior state -exists, all current pairs appear as `"new"`. - -```typescript -{ - file_a: "hotspots-cli/src/main.rs", - file_b: "hotspots-core/src/aggregates.rs", - status: "new", // "new" | "dropped" | "risk_increased" | "risk_decreased" - prev_risk: null, // Previous risk level (absent for "new") - curr_risk: "high", // Current risk level (absent for "dropped") - co_change_count: 14, // Times changed in the same commit - coupling_ratio: 0.78, // co_change_count / min(total_a, total_b) - has_static_dep: false // true if a direct import exists between the two files -} -``` - -In `--policy` text output, only pairs involving files touched in the current delta are -shown. This surfaces "you changed A — did you forget B?" coupling alerts. 
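The status values and the touched-file filter described above can be sketched as follows. This is not the Hotspots implementation: `co_change_delta`, `touching`, and in particular the `RISK_ORDER` severity ranking are assumptions made for illustration, since the document does not define how risk levels are ordered.

```python
# Hypothetical severity ordering, used only by this sketch.
RISK_ORDER = {"low": 0, "expected": 1, "moderate": 2, "high": 3}


def co_change_delta(prev, curr):
    """Diff two {(file_a, file_b): risk} maps into delta entries.

    Pairs whose risk level did not change are omitted.
    """
    entries = []
    for pair in sorted(set(prev) | set(curr)):
        before, after = prev.get(pair), curr.get(pair)
        if before is None:
            status = "new"
        elif after is None:
            status = "dropped"
        elif RISK_ORDER[after] > RISK_ORDER[before]:
            status = "risk_increased"
        elif RISK_ORDER[after] < RISK_ORDER[before]:
            status = "risk_decreased"
        else:
            continue
        entries.append({
            "file_a": pair[0],
            "file_b": pair[1],
            "status": status,
            "prev_risk": before,
            "curr_risk": after,
        })
    return entries


def touching(entries, touched_files):
    """Keep only entries where either file was touched in the current delta."""
    touched = set(touched_files)
    return [e for e in entries if e["file_a"] in touched or e["file_b"] in touched]


prev = {("a.rs", "b.rs"): "moderate", ("c.rs", "d.rs"): "high"}
curr = {("a.rs", "b.rs"): "high", ("e.rs", "f.rs"): "low"}

delta = co_change_delta(prev, curr)
for entry in touching(delta, ["a.rs"]):
    print(entry["file_a"], entry["file_b"], entry["status"])
```

Passing an empty `prev` map reproduces the documented behaviour when no prior state exists: every current pair is reported as `"new"`.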
- -## TypeScript Integration - -### Using @hotspots/types Package - -```bash -npm install @hotspots/types -``` - -```typescript -import type { HotspotsOutput, FunctionReport } from '@hotspots/types'; -import { - filterByRiskBand, - getHighestRiskFunctions, - policyPassed -} from '@hotspots/types'; - -// Parse Hotspots output -const output: HotspotsOutput = JSON.parse( - await fs.readFile('hotspots-output.json', 'utf-8') -); - -// Get high-risk functions -const highRisk = filterByRiskBand(output.functions, 'high'); -const critical = filterByRiskBand(output.functions, 'critical'); - -console.log(`Found ${highRisk.length} high-risk functions`); -console.log(`Found ${critical.length} critical functions`); - -// Get top 10 most complex -const top10 = getHighestRiskFunctions(output.functions, 10); -top10.forEach(func => { - console.log(`${func.function_id} - LRS: ${func.lrs}`); -}); - -// Check policy results -if (output.policy_results && !policyPassed(output.policy_results)) { - console.error('Policy check failed!'); - output.policy_results.failed.forEach(failure => { - console.error(` ${failure.id}: ${failure.message}`); - }); - process.exit(1); -} -``` - -### Manual Schema Validation (TypeScript) - -```bash -npm install ajv ajv-formats -``` - -```typescript -import Ajv from 'ajv'; -import addFormats from 'ajv-formats'; -import * as fs from 'fs'; - -const ajv = new Ajv(); -addFormats(ajv); - -// Load schema -const schema = JSON.parse( - fs.readFileSync('schemas/hotspots-output.schema.json', 'utf-8') -); - -const validate = ajv.compile(schema); - -// Validate output -const output = JSON.parse( - fs.readFileSync('hotspots-output.json', 'utf-8') -); - -if (!validate(output)) { - console.error('Invalid Hotspots output:', validate.errors); - process.exit(1); -} - -console.log('✓ Output is valid'); -``` - -## Python Integration - -### Using jsonschema - -```bash -pip install jsonschema -``` - -```python -import json -from jsonschema import validate, ValidationError - -# Load 
schema -with open('schemas/hotspots-output.schema.json') as f: - schema = json.load(f) - -# Load and validate output -with open('hotspots-output.json') as f: - output = json.load(f) - -try: - validate(instance=output, schema=schema) - print('✓ Output is valid') -except ValidationError as e: - print(f'Invalid output: {e.message}') - exit(1) - -# Analyze results -high_risk = [ - func for func in output['functions'] - if func['band'] in ['high', 'critical'] -] - -print(f'Found {len(high_risk)} high-risk functions') - -# Sort by LRS -top_10 = sorted( - output['functions'], - key=lambda f: f['lrs'], - reverse=True -)[:10] - -for func in top_10: - print(f"{func['function_id']} - LRS: {func['lrs']}") - -# Check policy results -if 'policy_results' in output: - if output['policy_results']['failed']: - print('Policy check failed!') - for failure in output['policy_results']['failed']: - print(f" {failure['id']}: {failure['message']}") - exit(1) -``` - -## Go Integration - -### Using gojsonschema - -```bash -go get github.com/xeipuuv/gojsonschema -``` - -```go -package main - -import ( - "encoding/json" - "fmt" - "io/ioutil" - "os" - "sort" - - "github.com/xeipuuv/gojsonschema" -) - -type HotspotsOutput struct { - SchemaVersion int `json:"schema_version"` - Commit CommitInfo `json:"commit"` - Analysis AnalysisInfo `json:"analysis"` - Functions []FunctionReport `json:"functions"` - PolicyResults *PolicyResults `json:"policy_results,omitempty"` -} - -type CommitInfo struct { - SHA string `json:"sha"` - Parents []string `json:"parents"` - Timestamp int64 `json:"timestamp"` - Branch *string `json:"branch,omitempty"` -} - -type AnalysisInfo struct { - Scope string `json:"scope"` - ToolVersion string `json:"tool_version"` -} - -type FunctionReport struct { - FunctionID string `json:"function_id"` - File string `json:"file"` - Line int `json:"line"` - Metrics Metrics `json:"metrics"` - LRS float64 `json:"lrs"` - Band string `json:"band"` -} - -type Metrics struct { - CC int 
`json:"cc"` - ND int `json:"nd"` - FO int `json:"fo"` - NS int `json:"ns"` -} - -type PolicyResults struct { - Failed []PolicyResult `json:"failed"` - Warnings []PolicyResult `json:"warnings"` -} - -type PolicyResult struct { - ID string `json:"id"` - Severity string `json:"severity"` - Message string `json:"message"` -} - -func main() { - // Validate against schema - schemaLoader := gojsonschema.NewReferenceLoader("file://schemas/hotspots-output.schema.json") - documentLoader := gojsonschema.NewReferenceLoader("file://hotspots-output.json") - - result, err := gojsonschema.Validate(schemaLoader, documentLoader) - if err != nil { - panic(err) - } - - if !result.Valid() { - fmt.Println("Schema validation failed:") - for _, desc := range result.Errors() { - fmt.Printf("- %s\n", desc) - } - os.Exit(1) - } - - // Parse output - data, _ := ioutil.ReadFile("hotspots-output.json") - var output HotspotsOutput - json.Unmarshal(data, &output) - - // Get high-risk functions - var highRisk []FunctionReport - for _, fn := range output.Functions { - if fn.Band == "high" || fn.Band == "critical" { - highRisk = append(highRisk, fn) - } - } - - fmt.Printf("Found %d high-risk functions\n", len(highRisk)) - - // Sort by LRS - sort.Slice(output.Functions, func(i, j int) bool { - return output.Functions[i].LRS > output.Functions[j].LRS - }) - - fmt.Println("\nTop 10 most complex functions:") - for i := 0; i < 10 && i < len(output.Functions); i++ { - fn := output.Functions[i] - fmt.Printf("%s - LRS: %.2f\n", fn.FunctionID, fn.LRS) - } - - // Check policy results - if output.PolicyResults != nil && len(output.PolicyResults.Failed) > 0 { - fmt.Println("\nPolicy check failed!") - for _, failure := range output.PolicyResults.Failed { - fmt.Printf(" %s: %s\n", failure.ID, failure.Message) - } - os.Exit(1) - } -} -``` - -## Rust Integration - -### Using serde and jsonschema - -```toml -[dependencies] -serde = { version = "1.0", features = ["derive"] } -serde_json = "1.0" -jsonschema = "0.17" 
-``` - -```rust -use serde::{Deserialize, Serialize}; -use std::fs; - -#[derive(Debug, Deserialize, Serialize)] -struct HotspotsOutput { - schema_version: u32, - commit: CommitInfo, - analysis: AnalysisInfo, - functions: Vec<FunctionReport>, - #[serde(skip_serializing_if = "Option::is_none")] - policy_results: Option<PolicyResults>, -} - -#[derive(Debug, Deserialize, Serialize)] -struct CommitInfo { - sha: String, - parents: Vec<String>, - timestamp: i64, - #[serde(skip_serializing_if = "Option::is_none")] - branch: Option<String>, -} - -#[derive(Debug, Deserialize, Serialize)] -struct AnalysisInfo { - scope: String, - tool_version: String, -} - -// Clone is derived so the function list can be copied before sorting -#[derive(Debug, Clone, Deserialize, Serialize)] -struct FunctionReport { - function_id: String, - file: String, - line: u32, - metrics: Metrics, - lrs: f64, - band: String, -} - -#[derive(Debug, Clone, Deserialize, Serialize)] -struct Metrics { - cc: u32, - nd: u32, - fo: u32, - ns: u32, -} - -#[derive(Debug, Deserialize, Serialize)] -struct PolicyResults { - failed: Vec<PolicyResult>, - warnings: Vec<PolicyResult>, -} - -#[derive(Debug, Deserialize, Serialize)] -struct PolicyResult { - id: String, - severity: String, - message: String, -} - -fn main() -> Result<(), Box<dyn std::error::Error>> { - // Load and compile schema - let schema_json = fs::read_to_string("schemas/hotspots-output.schema.json")?; - let schema: serde_json::Value = serde_json::from_str(&schema_json)?; - // Compile errors borrow from `schema`, so convert them to an owned String - let compiled = jsonschema::JSONSchema::compile(&schema) - .map_err(|e| e.to_string())?; - - // Load output - let output_json = fs::read_to_string("hotspots-output.json")?; - let output_value: serde_json::Value = serde_json::from_str(&output_json)?; - - // Validate - if let Err(errors) = compiled.validate(&output_value) { - eprintln!("Schema validation failed:"); - for error in errors { - eprintln!(" {}", error); - } - std::process::exit(1); - } - - // Parse into struct - let output: HotspotsOutput = serde_json::from_str(&output_json)?; - - // Get high-risk functions - let high_risk: Vec<_> = output.functions - .iter() - .filter(|f| f.band == "high" || f.band == "critical") - .collect(); - - println!("Found {} 
high-risk functions", high_risk.len()); - - // Sort by LRS - let mut sorted = output.functions.clone(); - sorted.sort_by(|a, b| b.lrs.partial_cmp(&a.lrs).unwrap()); - - println!("\nTop 10 most complex functions:"); - for func in sorted.iter().take(10) { - println!("{} - LRS: {:.2}", func.function_id, func.lrs); - } - - // Check policy results - if let Some(policy) = &output.policy_results { - if !policy.failed.is_empty() { - eprintln!("\nPolicy check failed!"); - for failure in &policy.failed { - eprintln!(" {}: {}", failure.id, failure.message); - } - std::process::exit(1); - } - } - - Ok(()) -} -``` - -## CI/CD Integration Patterns - -### GitHub Actions - -```yaml -- name: Run Hotspots Analysis - run: hotspots analyze --json > hotspots-output.json - -- name: Validate Output - run: | - npm install -g ajv-cli - ajv validate -s schemas/hotspots-output.schema.json -d hotspots-output.json - -- name: Check for High-Risk Functions - run: | - node -e " - const output = require('./hotspots-output.json'); - const highRisk = output.functions.filter(f => - f.band === 'high' || f.band === 'critical' - ); - if (highRisk.length > 0) { - console.error(\`Found \${highRisk.length} high-risk functions\`); - process.exit(1); - } - " -``` - -### GitLab CI - -```yaml -hotspots: - script: - - hotspots analyze --json > hotspots-output.json - - python3 scripts/validate_output.py - artifacts: - reports: - codequality: hotspots-output.json -``` - -## AI Assistant Integration - -AI assistants can use Hotspots output to: - -1. **Code Review**: Identify complex functions that need attention -2. **Refactoring Suggestions**: Target high-LRS functions for simplification -3. **Test Prioritization**: Focus testing on high-complexity areas -4. 
**Documentation**: Generate complexity reports and visualizations - -## Schema Versioning - -The `schema_version` field tracks schema compatibility: - -- **Version 2** (current): Snapshot and delta output — adds `driver`, `driver_detail`, and - enriched `aggregates` (file_risk, co_change, modules) -- **Version 3**: Agent-optimized output (`--all-functions`): triage-first structure with - `fire`/`debt`/`watch`/`ok` quadrant buckets and per-function `action` text -- **Version 1**: Delta output schema (separate constant from snapshot schema) -- Tools should check `schema_version` before consuming output - -## Additional Resources - -- [JSON Schema Specification](https://json-schema.org/) -- [Hotspots GitHub Repository](https://github.com/Stephen-Collins-tech/hotspots) -- [@hotspots/types npm package](https://www.npmjs.com/package/@hotspots/types) -- [Metrics Reference](./metrics.md) diff --git a/docs/reference/language-support.md b/docs/reference/language-support.md deleted file mode 100644 index 60a4d17..0000000 --- a/docs/reference/language-support.md +++ /dev/null @@ -1,71 +0,0 @@ -# Language Support - -Hotspots supports seven languages. All produce the same metrics (CC, ND, FO, NS, LRS) with consistent semantics. - -## Supported languages - -| Language | File extensions | -|----------|----------------| -| TypeScript | `.ts` `.tsx` `.mts` `.cts` | -| JavaScript | `.js` `.jsx` `.mjs` `.cjs` | -| Go | `.go` | -| Python | `.py` | -| Rust | `.rs` | -| Java | `.java` | -| C | `.c` `.h` | - ---- - -## Language notes - -### TypeScript / JavaScript - -JSX and TSX are fully supported. Short-circuit operators (`&&`, `||`) and ternaries count toward CC. Arrow functions, class methods, and standalone functions are all analyzed. - -Arrow functions and function expressions assigned to a variable or property are named after their binding — `const validate = (x) => …` appears in output as `validate`, not as an anonymous function. 
This applies to `.ts`, `.tsx`, `.mts`, and `.cts` files. - -### Go - -Goroutines, defer, select, and channel operations are supported. Each `case` in a `select` counts toward CC. - -### Python - -Async/await, comprehensions, context managers, and `match` statements (Python 3.10+) are supported. Comprehensions with conditionals contribute to CC. - -### Rust - -`match` arms, `?` operator, `unwrap`/`expect`, closures, and `impl` blocks are supported. Each `match` arm counts toward CC. - -### Java - -Java 8+ including lambdas, streams, try-with-resources, and switch expressions (Java 14+). Lambda bodies are analyzed as separate scopes. - -### C - -Standard C with support for all control flow constructs: `if`/`else`, `switch`, `for`, `while`, `do`/`while`, `goto`. Ternary operators and boolean short-circuit operators (`&&`, `||`) count toward CC. Header files are analyzed when function definitions are present. - ---- - -## What counts as a function - -Hotspots analyzes named, callable units of code: - -- Named functions and methods -- Class methods and constructors -- Arrow functions assigned to a named variable or property -- Closures assigned to a named binding - -Anonymous inline functions (callbacks passed directly to `.map()` or similar) are not analyzed as standalone units — their complexity folds into the containing named function's FO count. - ---- - -## Exclusions - -Hotspots automatically skips: - -- Test files (`*.test.*`, `*_test.*`, `*spec*`, `tests/`, `__tests__/`) -- Vendored dependencies (`vendor/`, `node_modules/`, `third_party/`, `external/`, `contrib/`, `deps/`) -- Generated files (detected by heuristic — minified files, protobuf output, etc.) -- Type declaration files (`.d.ts`) - -Override exclusions in `.hotspotsrc.json` with `exclude` patterns. See [Configuration](/guide/configuration). 
diff --git a/docs/reference/limitations.md b/docs/reference/limitations.md deleted file mode 100644 index 69394ef..0000000 --- a/docs/reference/limitations.md +++ /dev/null @@ -1,51 +0,0 @@ -# Known Limitations - -## Language coverage - -### Generator functions (JavaScript/TypeScript) - -Generator functions (`function*`) parse and analyze correctly, but `yield` expressions are not counted as control-flow branches. CC will be slightly underestimated for generators with conditional yield paths. - -### Async/await - -Async functions are analyzed as sequential control flow. Promise chains are not traced as control flow paths. CC accurately reflects the synchronous branching structure of the function body; the implicit async error path is not counted. - -### Type-level complexity - -Type annotations, generics, and overloads are parsed but not factored into metrics. LRS measures structural control-flow complexity only — type complexity does not affect the score. This is intentional. - ---- - -## Analysis scope - -### Cross-function dependencies - -Each function is analyzed in isolation. LRS does not account for the complexity of functions a function calls — that dimension is captured by Fan-Out (FO), which counts the number of distinct callees, not their internal complexity. - -### Module-level code - -Top-level module statements outside of function bodies are not analyzed as a function unit. - ---- - -## Performance - -### Large codebases - -Analysis is single-threaded. Very large repos (100k+ functions) will be slow. Most real-world codebases complete in under 30 seconds. - -### No incremental analysis - -Every run re-analyzes all matched files. Delta mode compares outputs between runs — it does not skip re-analyzing unchanged files. - ---- - -## Output - -### Floating point - -LRS values are computed in full `f64` precision. Text output rounds to 1 decimal place. 
Results are deterministic within a platform; rare floating-point edge cases may produce sub-0.001 differences across architectures. - -### Symlinks - -File paths are not symlink-resolved. Results may vary if the same file is reachable via multiple paths. diff --git a/docs/reference/lrs-spec.md b/docs/reference/lrs-spec.md deleted file mode 100644 index ef618d5..0000000 --- a/docs/reference/lrs-spec.md +++ /dev/null @@ -1,263 +0,0 @@ -# Local Risk Score (LRS) Specification - -## Overview - -The Local Risk Score (LRS) is a composite metric that quantifies the complexity and maintenance risk of individual functions. It combines four fundamental software metrics into a single weighted score. - -## Metrics - -### 1. Cyclomatic Complexity (CC) - -**Definition:** Number of linearly independent paths through a function's control flow. - -**Formula:** `CC = E - N + 2` where: -- `E` = number of edges in the CFG -- `N` = number of nodes in the CFG (excluding entry and exit) - -**Additional Increments:** -- Each boolean short-circuit operator (`&&`, `||`): +1 -- Each switch case: +1 -- Each catch clause: +1 - -**Minimum Value:** 1 (for empty functions) - -### 2. Nesting Depth (ND) - -**Definition:** Maximum depth of nested control structures in the AST. - -**Counted Constructs:** -- `if` statements -- Loops (`for`, `while`, `do-while`, `for-in`, `for-of`) -- `switch` statements -- `try` blocks - -**Not Counted:** -- Lexical scopes (block statements without control flow) -- Object/array literals - -**Range:** 0 to unbounded (capped at 8 for risk calculation) - -### 3. Fan-Out (FO) - -**Definition:** Number of distinct functions called from within the function. - -**Rules:** -- Count each call expression -- For chained calls like `foo().bar().baz()`, count each call: - - `foo` - - `foo().bar` - - `foo().bar().baz` -- Deduplicate by string representation -- Ignore intrinsics and operators -- Self-calls (recursion) are counted - -**Range:** 0 to unbounded - -### 4. 
Non-Structured Exits (NS) - -**Definition:** Number of early exit statements that break structured control flow. - -**Counted:** -- Early `return` statements (excluding final tail return) -- `break` statements -- `continue` statements -- `throw` statements - -**Not Counted:** -- Final `return` statement in a function (tail return) -- Implicit returns in arrow functions when they're the final expression - -**Range:** 0 to unbounded (capped at 6 for risk calculation) - -## Risk Transforms - -Each raw metric is transformed to a risk component using monotonic, bounded functions: - -### R_cc (Risk from Cyclomatic Complexity) - -``` -R_cc = min(log2(CC + 1), 6) -``` - -- Monotonic: increases with CC -- Bounded: maximum value of 6 -- Logarithmic scaling reduces impact of very high CC - -### R_nd (Risk from Nesting Depth) - -``` -R_nd = min(ND, 8) -``` - -- Linear scaling up to depth 8 -- Maximum value of 8 - -### R_fo (Risk from Fan-Out) - -``` -R_fo = min(log2(FO + 1), 6) -``` - -- Monotonic: increases with FO -- Bounded: maximum value of 6 -- Logarithmic scaling reduces impact of very high FO - -### R_ns (Risk from Non-Structured Exits) - -``` -R_ns = min(NS, 6) -``` - -- Linear scaling up to 6 exits -- Maximum value of 6 - -## LRS Calculation - -The Local Risk Score is a weighted sum of the risk components: - -``` -LRS = 1.0 * R_cc + 0.8 * R_nd + 0.6 * R_fo + 0.7 * R_ns -``` - -**Weights:** -- R_cc: 1.0 (highest weight - control flow complexity is primary risk) -- R_nd: 0.8 (high weight - deep nesting is hard to understand) -- R_ns: 0.7 (medium-high weight - non-structured exits complicate reasoning) -- R_fo: 0.6 (medium weight - dependencies add complexity but are somewhat expected) - -## Risk Bands - -Functions are classified into risk bands based on LRS: - -| Band | Range | Interpretation | -|-----------|------------|-----------------------------------------| -| Low | LRS < 3 | Simple, maintainable functions | -| Moderate | 3 ≤ LRS < 6| Moderate complexity, review 
recommended | -| High | 6 ≤ LRS < 9| High complexity, refactor recommended | -| Critical | LRS ≥ 9 | Very high complexity, urgent refactor | - -## Examples - -### Example 1: Simple Function - -```typescript -function simple(x: number): number { - return x * 2; -} -``` - -**Metrics:** -- CC = 1 (base formula) -- ND = 0 -- FO = 0 -- NS = 0 - -**Risk Components:** -- R_cc = min(log2(1 + 1), 6) = min(1.0, 6) = 1.0 -- R_nd = min(0, 8) = 0 -- R_fo = min(log2(0 + 1), 6) = min(0.0, 6) = 0.0 -- R_ns = min(0, 6) = 0 - -**LRS:** 1.0 * 1.0 + 0.8 * 0 + 0.6 * 0.0 + 0.7 * 0 = **1.0** (Low) - -### Example 2: Nested Branching - -```typescript -function nested(x: number, y: number): number { - if (x > 0) { - if (y > 0) { - return x + y; - } else { - return x - y; - } - } else { - return 0; - } -} -``` - -**Metrics:** -- CC = 3 (two if statements) -- ND = 2 (nested if) -- FO = 0 -- NS = 0 (all returns are structured) - -**Risk Components:** -- R_cc = min(log2(3 + 1), 6) = min(2.0, 6) = 2.0 -- R_nd = min(2, 8) = 2 -- R_fo = 0.0 -- R_ns = 0 - -**LRS:** 1.0 * 2.0 + 0.8 * 2 + 0.6 * 0.0 + 0.7 * 0 = **3.6** (Moderate) - -### Example 3: Complex Function - -```typescript -function complex(arr: number[]): number { - let sum = 0; - for (const item of arr) { - if (item < 0) { - break; - } - if (item > 100) { - continue; - } - sum += item; - } - return sum; -} -``` - -**Metrics:** -- CC = 3 (loop + 2 ifs) -- ND = 2 (loop with nested if) -- FO = 0 -- NS = 2 (break + continue) - -**Risk Components:** -- R_cc = min(log2(3 + 1), 6) = 2.0 -- R_nd = min(2, 8) = 2 -- R_fo = 0.0 -- R_ns = min(2, 6) = 2 - -**LRS:** 1.0 * 2.0 + 0.8 * 2 + 0.6 * 0.0 + 0.7 * 2 = **5.0** (Moderate) - -## Properties - -### Determinism - -LRS is deterministic: -- Identical input produces identical LRS -- Formatting, comments, and whitespace do not affect LRS -- Function order in file does not affect LRS - -### Monotonicity - -All risk transforms are monotonic: -- Increasing CC increases R_cc (capped at 6) -- Increasing ND 
increases R_nd (capped at 8) -- Increasing FO increases R_fo (capped at 6) -- Increasing NS increases R_ns (capped at 6) - -### Boundedness - -LRS has a theoretical maximum: -- Maximum R_cc = 6 -- Maximum R_nd = 8 -- Maximum R_fo = 6 -- Maximum R_ns = 6 -- **Maximum LRS** = 1.0 * 6 + 0.8 * 8 + 0.6 * 6 + 0.7 * 6 = **20.2** - -In practice, functions rarely approach this maximum. - -## Precision - -- Internal calculations use full `f64` precision -- Final LRS is not rounded internally -- Text output displays 2 decimal places -- JSON output uses full `f64` precision - -## References - -- Cyclomatic Complexity: McCabe, T. J. (1976). "A Complexity Measure" -- Fan-Out: Yourdon, E. & Constantine, L. L. (1979). "Structured Design" diff --git a/docs/reference/metrics.md b/docs/reference/metrics.md deleted file mode 100644 index c300e2f..0000000 --- a/docs/reference/metrics.md +++ /dev/null @@ -1,64 +0,0 @@ -# Metrics Reference - -Hotspots measures four independent dimensions of structural complexity per function, then combines them into a single risk score. - ---- - -## The four metrics - -### CC — Cyclomatic Complexity - -The number of independent decision paths through a function. Each `if`, `else if`, `for`, `while`, `case`, `catch`, `&&`, `||`, and ternary adds one. - -A function with CC 1 has a single straight-line path. CC 10 means at least 10 paths to test. - -### ND — Nesting Depth - -The maximum depth of nested control flow. Each `if`, loop, `try`, or block that contains another block adds a level. - -Deep nesting is a reliable signal that a function is doing too many things at once. ND 4+ almost always indicates a function that should be split. - -### FO — Fan-Out - -The number of distinct functions this function calls. High fan-out means the function coordinates many dependencies — change any one of them and this function may break. 
- -### NS — Non-Structured Exits - -The count of early returns, breaks, continues, throws, and panics inside the function body (not counting the final return). High NS makes control flow hard to trace and test paths hard to enumerate. - ---- - -## LRS — Local Risk Score - -LRS combines the four metrics using capped transforms and weights: - -``` -LRS = 1.0 × R_cc + 0.8 × R_nd + 0.6 × R_fo + 0.7 × R_ns -``` - -Where each `R_*` component is a capped transform of its raw metric: CC and FO are log-scaled, ND and NS are linear. Logarithmic scaling means the difference between CC 1 and CC 3 is larger than the difference between CC 20 and CC 22 — early growth matters more. - -### Risk bands - -| Band | LRS range | Meaning | -|----------|-----------|---------| -| Critical | ≥ 9.0 | Refactor now. These are your highest-probability bug sources. | -| High | 6.0–8.9 | Refactor the next time you touch this function. | -| Moderate | 3.0–5.9 | Monitor. Block increases in CI. | -| Low | < 3.0 | Not worth the risk of touching without a reason. | - -Thresholds are configurable in `.hotspotsrc.json`. See [Configuration](/guide/configuration). - ---- - -## What LRS is not - -LRS is a structural risk proxy, not a defect predictor. A function can have LRS 12 and never cause a bug (simple domain, no changes planned) or LRS 4 and be the source of a production incident (critical path, subtle invariant). - -LRS tells you where complexity is concentrated. What you do with that information requires judgment about which functions are actively changing, which are on critical paths, and which carry hidden invariants. The [HTML report](/getting-started/quick-start#what-next) and git integration help with that context. - ---- - -## Full technical spec - -See [LRS Specification](/reference/lrs-spec) for the exact formulas, transform definitions, and worked examples. 
diff --git a/docs/reference/scoring-changelog.md b/docs/reference/scoring-changelog.md deleted file mode 100644 index 42d018f..0000000 --- a/docs/reference/scoring-changelog.md +++ /dev/null @@ -1,67 +0,0 @@ -# Scoring Changelog - -A versioned record of every change to the scoring methodology — transforms, weights, thresholds, patterns, and ranking logic. - -Every promotion that touches a formula, weight, threshold, or ranking order gets an entry here, written as part of the same PR that ships the change. - ---- - -## Entry format - -```markdown -### vX.Y.Z — YYYY-MM-DD - -**Changed:** short name of what moved (e.g. "LRS weights", "churn_magnet threshold") -**PR:** hotspots#NNN - -| | Before | After | -|--|--|--| -| field or formula | old value | new value | - -**Notes:** (optional) why, edge cases, what stays the same -``` - -One entry per released version that contains a scoring change. If a release has no scoring changes, omit it. - ---- - -## Changelog - -### v1.24.0 — current baseline - -No scoring changes since this changelog was introduced. The formulas in [Scoring Methodology](/reference/scoring) represent the current baseline. 
- -**Baseline snapshot:** - -| Component | Value | -|-----------|-------| -| LRS weights | cc=1.0, nd=0.8, fo=0.6, ns=0.7 | -| R_cc cap | 6.0 (log2 scale) | -| R_nd cap | 8.0 (linear) | -| R_fo cap | 6.0 (log2 scale) | -| R_ns cap | 6.0 (linear) | -| Band: Low | LRS < 3.0 | -| Band: Moderate | 3.0 ≤ LRS < 6.0 | -| Band: High | 6.0 ≤ LRS < 9.0 | -| Band: Critical | LRS ≥ 9.0 | -| Activity: churn weight | 0.5 | -| Activity: fan-in weight | 0.4 | -| Activity: touch weight | 0.3 | -| Activity: SCC weight | 0.3 | -| Activity: recency weight | 0.2 | -| Activity: neighbor-churn weight | 0.2 | -| Activity: depth weight | 0.1 | -| Driver label percentile | P75 | -| Quadrant active threshold | touch > P50 OR days_since ≤ 30 | -| Pattern: complex_branching | CC ≥ 10 AND ND ≥ 4 | -| Pattern: deeply_nested | ND ≥ 5 | -| Pattern: exit_heavy | NS ≥ 5 | -| Pattern: god_function | LOC ≥ 60 AND FO ≥ 10 | -| Pattern: long_function | LOC ≥ 80 | -| Pattern: churn_magnet | churn ≥ 200 AND CC ≥ 8 | -| Pattern: hub_function | fan-in ≥ 10 AND CC ≥ 8 | -| Pattern: middle_man | fan-in ≥ 8 AND FO ≥ 8 AND CC ≤ 4 | -| Pattern: shotgun_target | fan-in ≥ 8 AND churn ≥ 150 | -| Pattern: stale_complex | CC ≥ 10 AND LOC ≥ 60 AND days ≥ 180 | -| Pattern: neighbor_risk | neighbor_churn ≥ 400 AND FO ≥ 8 | -| Pattern: cyclic_hub | SCC ≥ 2 AND fan-in ≥ 6 | diff --git a/docs/reference/scoring.md b/docs/reference/scoring.md deleted file mode 100644 index d7ebcd0..0000000 --- a/docs/reference/scoring.md +++ /dev/null @@ -1,265 +0,0 @@ -# Scoring Methodology - -🚧 **The scoring methodology is actively evolving.** Weights, thresholds, and formula details will change as research findings are validated and promoted. Check the [Scoring Changelog](/reference/scoring-changelog) for a versioned record of what has changed. 🚧 - -This document describes every step of the pipeline from raw source code to the ranked output Hotspots produces. Each stage builds on the previous one. 
- ---- - -## Pipeline overview - -``` -Source Code - ↓ -Raw Metrics (CC, ND, FO, NS, LOC) - ↓ -Risk Components (log-scaled, bounded transforms) - ↓ -Local Risk Score (LRS) + Risk Band - ↓ -Pattern Classification (Tier 1: structural · Tier 2: enriched) - ↓ -[optional enrichment: call graph, git churn, touch counts] - ↓ -Activity Risk Score (LRS + activity modifiers) - ↓ -Driver Label (primary dimension diagnosis) - ↓ -Quadrant Assignment (2-D: complexity × activity) - ↓ -Ranked Output -``` - -Stages above the enrichment line run on source code alone and are always computed. Stages below require a git repository. - ---- - -## Step 1 — Collect raw metrics - -Hotspots parses each source file and visits every function, extracting four structural measurements: - -**CC — Cyclomatic Complexity** -The number of independent decision paths through the function. Every `if`, `else if`, loop, `case`, `catch`, `&&`, `||`, and ternary adds one path. A function with no branches has CC 1. - -**ND — Nesting Depth** -The maximum depth of nested control structures — how many layers of `if`/loop/`try` are present at the deepest point. - -**FO — Fan-Out** -The number of distinct functions called from within this function. - -**NS — Non-Structured Exits** -The count of early returns, throws, breaks, and continues inside the function body, excluding the final tail return. - -**LOC — Lines of Code** -Physical line count. Used only for pattern detection (see Step 4), not for the risk score itself. - -See [Metrics Reference](/reference/metrics) for exact counting rules per language. 
- ---- - -## Step 2 — Transform to risk components - -Raw metric values are passed through bounded, monotonic transforms before being combined: - -``` -R_cc = min(log2(CC + 1), 6.0) # logarithmic, capped at 6 -R_nd = min(ND, 8.0) # linear, capped at 8 -R_fo = min(log2(FO + 1), 6.0) # logarithmic, capped at 6 -R_ns = min(NS, 6.0) # linear, capped at 6 -``` - -**Logarithmic scaling for CC and FO** gives more weight to early growth than to increases at already-high values — the marginal risk of going from CC 1 to CC 4 is larger than going from CC 40 to CC 44. Fan-out follows the same reasoning. - -**Linear scaling for ND and NS** reflects that each additional nesting level or exit point contributes more uniformly to complexity in practice. - -**Caps** prevent a single extreme metric from dominating the score. Each dimension is bounded independently so the combined score reflects overall structural complexity. - ---- - -## Step 3 — Compute the Local Risk Score (LRS) - -The four risk components are combined into a single score using a weighted sum: - -``` -LRS = 1.0 × R_cc + 0.8 × R_nd + 0.6 × R_fo + 0.7 × R_ns -``` - -**Weight rationale:** -- **CC (1.0)** — highest weight; control-flow complexity is the primary correlate of defect density and testing difficulty. -- **ND (0.8)** — nesting depth captures a dimension of complexity that CC alone can miss; a function can have moderate CC but still be hard to follow due to deep nesting. -- **NS (0.7)** — non-structured exits increase the number of implicit exit conditions and make postconditions harder to reason about. -- **FO (0.6)** — fan-out represents external coupling rather than internal complexity; weighted lower because some degree of fan-out is expected in most functions. - -LRS is always ≥ 1.0. The theoretical maximum is **20.2** (all four components at their caps: 1.0×6 + 0.8×8 + 0.6×6 + 0.7×6). The theoretical minimum for a trivial single-path function with no nesting, calls, or exits is 1.0. 
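
The transforms in Step 2 and the weighted sum in Step 3 can be sketched in a few lines of Python. This is an illustrative restatement of the documented formulas, not the Hotspots implementation; the function name `lrs` is an assumption made for the example.

```python
import math


def lrs(cc: int, nd: int, fo: int, ns: int) -> float:
    """Local Risk Score from raw metrics, following Steps 2 and 3."""
    r_cc = min(math.log2(cc + 1), 6.0)  # logarithmic, capped at 6
    r_nd = min(float(nd), 8.0)          # linear, capped at 8
    r_fo = min(math.log2(fo + 1), 6.0)  # logarithmic, capped at 6
    r_ns = min(float(ns), 6.0)          # linear, capped at 6
    return 1.0 * r_cc + 0.8 * r_nd + 0.6 * r_fo + 0.7 * r_ns


print(lrs(1, 0, 0, 0))                          # 1.0  (trivial function)
print(round(lrs(3, 2, 0, 0), 2))                # 3.6  (two nested ifs)
print(round(lrs(10**6, 100, 10**6, 100), 1))    # 20.2 (every component at its cap)
```

The last call feeds in deliberately extreme inputs to show that the caps keep the score at the theoretical maximum.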
- -**Risk bands:** - -| Band | LRS range | Meaning | -|------|-----------|---------| -| Critical | ≥ 9.0 | High structural risk | -| High | 6.0–8.9 | Elevated structural risk | -| Moderate | 3.0–5.9 | Moderate structural risk | -| Low | < 3.0 | Low structural risk | - -See [LRS Specification](/reference/lrs-spec) for the complete formula derivation, worked examples, and precision notes. - ---- - -## Step 4 — Classify patterns - -Patterns are named labels that identify specific structural combinations. They complement LRS by describing *what kind* of issue a function has, not just its overall score. A function can match multiple patterns simultaneously. - -### Tier 1 — structural (source code only) - -Detected from raw metrics alone; always computed: - -| Pattern | Trigger | Description | -|---------|---------|-------------| -| `complex_branching` | CC ≥ 10 **and** ND ≥ 4 | High branching combined with deep nesting | -| `deeply_nested` | ND ≥ 5 | Maximum nesting depth at or above threshold | -| `exit_heavy` | NS ≥ 5 | High number of non-structured exits | -| `god_function` | LOC ≥ 60 **and** FO ≥ 10 | Long function with high fan-out | -| `long_function` | LOC ≥ 80 | High physical line count | - -### Tier 2 — enriched (call graph + git data) - -Require git history and the call graph; computed only when that data is available: - -| Pattern | Trigger | Description | -|---------|---------|-------------| -| `churn_magnet` | file churn ≥ 200 lines **and** CC ≥ 8 | High complexity combined with high change volume | -| `cyclic_hub` | SCC size ≥ 2 **and** fan-in ≥ 6 | Part of a dependency cycle with many callers | -| `hub_function` | fan-in ≥ 10 **and** CC ≥ 8 | High fan-in with high complexity | -| `middle_man` | fan-in ≥ 8 **and** FO ≥ 8 **and** CC ≤ 4 | High fan-in and fan-out with low internal complexity | -| `neighbor_risk` | neighbor churn ≥ 400 **and** FO ≥ 8 | High fan-out into frequently changing functions | -| `shotgun_target` | fan-in ≥ 8 **and** file churn ≥ 150 
| Many callers in a frequently changed file | -| `stale_complex` | CC ≥ 10 **and** LOC ≥ 60 **and** days since change ≥ 180 | High complexity with no recent changes | - -**Derived pattern:** `volatile_god` fires only when **both** `god_function` and `churn_magnet` are true. - -All thresholds are configurable in `.hotspotsrc.json`. See [Configuration](/guide/configuration). - ---- - -## Step 5 — Compute the Activity Risk Score - -When git history is available, Hotspots extends LRS with activity signals: - -``` -Activity Risk = LRS - + (lines_added + lines_deleted) / 100 × 0.5 - + min(touch_count_30d / 10, 5.0) × 0.3 - + max(0, 5.0 − days_since_change / 7) × 0.2 - + min(fan_in / 5, 10.0) × 0.4 - + (scc_size, if in cycle, else 0) × 0.3 - + min(dependency_depth / 3, 5.0) × 0.1 - + neighbor_churn / 500 × 0.2 -``` - -Each modifier is non-negative, so Activity Risk is always ≥ LRS. When no git data is available, Activity Risk equals LRS. - -| Signal | Weight | What it captures | -|--------|--------|-----------------| -| Churn (lines added + deleted) | 0.5 | Volume of recent change | -| Fan-in (call-graph callers) | 0.4 | Number of functions that depend on this one | -| Touch count (30-day commits) | 0.3 | Frequency of recent modification | -| SCC membership | 0.3 | Presence in a dependency cycle | -| Recency (days since last change) | 0.2 | How recently the function was last modified | -| Neighbor churn | 0.2 | Change volume in called functions | -| Dependency depth | 0.1 | Depth in the call graph from entry points | - ---- - -## Step 6 — Assign driver labels - -Every function gets a single **driver** label identifying which dimension contributes most to its risk. Labels are assigned using population-relative percentile thresholds, computed independently per dimension across all functions in the current scope. 
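The percentile thresholds described here can be illustrated with a small sketch. It assumes a nearest-rank percentile, which the text does not confirm as the method Hotspots uses; the `percentile` helper and the sample values are hypothetical and only show how a per-dimension P50 or P75 cut-off depends on the population in scope.

```rust
/// Nearest-rank percentile of a metric across the population (p in 0..=100).
/// Assumption: the source does not say which percentile method Hotspots uses.
fn percentile(values: &[f64], p: f64) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    // Nearest-rank: the smallest value with at least p% of the population at or below it.
    let rank = ((p / 100.0) * sorted.len() as f64).ceil() as usize;
    let idx = rank.clamp(1, sorted.len()) - 1;
    Some(sorted[idx])
}

fn main() {
    // Hypothetical cyclomatic-complexity values for eight functions in scope.
    let cc = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0];
    let p75 = percentile(&cc, 75.0).unwrap();
    let p50 = percentile(&cc, 50.0).unwrap();
    // A function is "above P75" on this dimension when its value exceeds the threshold.
    let above: Vec<f64> = cc.iter().copied().filter(|v| *v > p75).collect();
    println!("P50 = {p50}, P75 = {p75}, above P75: {above:?}");
}
```

Because each threshold is recomputed from whatever functions are in scope, the same CC value can be above P75 in one repository and below it in another.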
- -The label is assigned by checking dimensions in the following priority order: - -| Label | Condition | Interpretation | -|-------|-----------|----------------| -| `cyclic_dep` | Function is part of a dependency cycle | Risk is primarily structural — a cycle in the call graph | -| `high_complexity` | CC above P75 | Cyclomatic complexity is the dominant dimension | -| `deep_nesting` | ND above P75 | Nesting depth is the dominant dimension | -| `high_fanout_churning` | FO above P75 **and** touches above P50 | High fan-out combined with active change | -| `high_fanin_complex` | Fan-in above P75 **and** CC above P50 | High caller count combined with elevated complexity | -| `high_churn_low_cc` | Touches above P75 **and** CC below P25 | High activity relative to structural complexity | -| `composite` | No single dimension clearly dominates | Multiple dimensions are elevated | - -Because percentiles are codebase-relative, the absolute metric value that triggers a label varies across repos. - ---- - -## Step 7 — Assign quadrants - -Every function is placed in one of four quadrants by combining its risk band with its activity level: - -| | Low activity | High activity | -|--|---|---| -| **High or Critical band** | `debt` | `fire` | -| **Low or Moderate band** | `ok` | `watch` | - -**Activity** is considered high if either of the following is true: -- 30-day touch count is above the population median, **or** -- Function was changed within the last 30 days - -| Quadrant | Signal | Typical action | -|----------|--------|----------------| -| `fire` | High complexity **and** high activity | Prioritize for review or refactoring | -| `debt` | High complexity, low activity | Schedule for future refactoring | -| `watch` | Low complexity, high activity | Monitor for complexity increases | -| `ok` | Low complexity, low activity | No immediate action indicated | - -Note: a high Activity Risk score does not by itself place a function in `fire`. 
Quadrant is determined by band (from LRS) and activity independently. Always check `quadrant` alongside `touches_30d` for context. - ---- - -## Step 8 — Rank output - -**Default ranking (no trained ranker):** - -1. LRS descending -2. File path ascending (tiebreak) -3. Line number ascending (tiebreak) -4. Function name ascending (tiebreak) - -**With `--mode snapshot` triage view:** - -Functions are grouped by quadrant (`fire` → `debt` → `watch` → `ok`), then sorted by Activity Risk descending within each group. - -**With a trained ranker (`hotspots train`):** - -Functions are re-scored using a RandomForest model trained on the repo's bug-fix history and sorted by predicted probability descending. LRS, band, and quadrant remain in the output. - ---- - -## File risk score - -In addition to per-function scoring, Hotspots computes a per-file score for the file-risk view: - -``` -File Risk Score = max_cc × 0.4 - + avg_cc × 0.3 - + log2(function_count + 1) × 0.2 - + min(file_churn / 100, 10.0) × 0.1 -``` - -The score weights the highest-complexity function most heavily, incorporates the average complexity distribution, accounts for file size by function count, and includes recent change volume. Files are ranked descending by this score. - ---- - -## Version history - -Every change to a formula, weight, threshold, or ranking rule is recorded in the [Scoring Changelog](/reference/scoring-changelog). - ---- - -## Coming soon: ranker scoring - -The trained ranker layer is currently in active development. Once complete, the ranker will: - -- Assign a `rank_score` (predicted probability this function appears in a future bug-fix commit) -- Surface functions that are statistically over-represented in past defects, even when LRS is moderate -- Blend structural risk with historical signal rather than treating them as separate steps - -The heuristic pipeline above will remain the default for repos without training data. 
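As a recap of the heuristic pipeline, the Activity Risk formula from Step 5 and the quadrant rule from Step 7 can be sketched together. This is a minimal illustration and not the Hotspots implementation: the `Activity` struct and function names are hypothetical, "in cycle" is assumed to mean an SCC size of at least 2, "changed within the last 30 days" is assumed to mean `days_since_change <= 30`, and the High band is assumed to start at LRS 6.0 as in the risk-band table.

```rust
/// Git and call-graph signals for one function (hypothetical struct; field
/// names follow the Activity Risk formula in Step 5).
struct Activity {
    lines_added: u32,
    lines_deleted: u32,
    touch_count_30d: u32,
    days_since_change: u32,
    fan_in: u32,
    scc_size: u32,
    dependency_depth: u32,
    neighbor_churn: u32,
}

/// LRS plus seven non-negative modifiers; equals LRS when no git data exists.
fn activity_risk(lrs: f64, activity: Option<&Activity>) -> f64 {
    let a = match activity {
        Some(a) => a,
        None => return lrs,
    };
    let churn = (a.lines_added + a.lines_deleted) as f64 / 100.0 * 0.5;
    let touch = (a.touch_count_30d as f64 / 10.0).min(5.0) * 0.3;
    let recency = (5.0 - a.days_since_change as f64 / 7.0).max(0.0) * 0.2;
    let fan_in = (a.fan_in as f64 / 5.0).min(10.0) * 0.4;
    // Assumption: a function is "in a cycle" when its SCC has at least 2 members.
    let scc_term = if a.scc_size >= 2 { a.scc_size as f64 } else { 0.0 };
    let scc = scc_term * 0.3;
    let depth = (a.dependency_depth as f64 / 3.0).min(5.0) * 0.1;
    let neighbor = a.neighbor_churn as f64 / 500.0 * 0.2;
    lrs + churn + touch + recency + fan_in + scc + depth + neighbor
}

/// Quadrant from the LRS band and the activity level, not from Activity Risk.
fn quadrant(lrs: f64, a: &Activity, median_touches_30d: f64) -> &'static str {
    let high_band = lrs >= 6.0; // High or Critical band
    let active =
        (a.touch_count_30d as f64) > median_touches_30d || a.days_since_change <= 30;
    match (high_band, active) {
        (true, true) => "fire",
        (true, false) => "debt",
        (false, true) => "watch",
        (false, false) => "ok",
    }
}

fn main() {
    // Hypothetical function: LRS 6.0, touched 4 times, last changed 14 days ago.
    let a = Activity {
        lines_added: 120,
        lines_deleted: 80,
        touch_count_30d: 4,
        days_since_change: 14,
        fan_in: 10,
        scc_size: 1,
        dependency_depth: 3,
        neighbor_churn: 250,
    };
    println!("activity risk = {:.2}", activity_risk(6.0, Some(&a)));
    println!("quadrant = {}", quadrant(6.0, &a, 1.0));
}
```

Keeping the quadrant decision separate from `activity_risk` mirrors the note in Step 7: a large composite score does not move a function into `fire` unless its band and activity level both qualify.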
diff --git a/docs/roadmap.md b/docs/roadmap.md deleted file mode 100644 index 6f08a04..0000000 --- a/docs/roadmap.md +++ /dev/null @@ -1,906 +0,0 @@ -# Hotspots Roadmap - -**Last Updated:** 2026-03-03 -**Current Version:** v1.2.1 -**Vision:** Universal complexity analysis for modern software development - ---- - -## Table of Contents - -1. [Current State](#current-state) -2. [Strategic Direction](#strategic-direction) -3. [Phase 1: Market Validation (Q1 2026)](#phase-1-market-validation-q1-2026) -4. [Phase 2: Multi-Language Support (Completed)](#phase-2-multi-language-support-completed) -5. [Phase 3: Scale & Enterprise (Q3-Q4 2026)](#phase-3-scale--enterprise-q3-q4-2026) -6. [Multi-Language Technical Analysis](#multi-language-technical-analysis) -7. [Decision Framework](#decision-framework) -8. [Experimental Plan](#experimental-plan) -9. [Success Metrics](#success-metrics) - ---- - -## Current State - -### ✅ Completed (as of 2026-02-04) - -**Core Engine:** -- ✅ LRS (Local Risk Score) calculation for all 6 supported languages -- ✅ CFG (Control Flow Graph) construction -- ✅ Activity Risk scoring (churn, touch, recency, fan_in, SCC, depth, neighbor_churn) -- ✅ Call graph engine (import resolution, PageRank, betweenness centrality, SCC detection) -- ✅ Branch-aware recency: divergence-point comparison instead of HEAD-only -- ✅ Driver labels: `high_complexity`, `deep_nesting`, `high_churn_low_cc`, - `high_fanout_churning`, `high_fanin_complex`, `cyclic_dep`, `composite` -- ✅ `--explain` mode (dimension-specific refactoring guidance + near-miss detail) -- ✅ `driver_detail` field: near-miss percentile context for composite functions -- ✅ Quadrant classification: `fire`, `debt`, `watch`, `ok` (band × activity) -- ✅ Quadrant-aware action text: single `driver_action_for_quadrant` source of truth -- ✅ Per-function touch cache (warm-run speedup via on-disk cache) -- ✅ `--level file` and `--level module` aggregation views -- ✅ JSONL output format -- ✅ Policy engine with 7
built-in policies -- ✅ Suppression comments system -- ✅ HTML report: trend charts (band count, risk line, share line), action column in triage table -- ✅ Agent-optimized JSON (schema v3): triage-first structure with quadrant buckets -- ✅ `--all-functions` flag for agent/tooling consumption -- ✅ Proactive warning system -- ✅ Git delta mode (snapshot + trends) -- ✅ Configuration file support (`hotspots config show` / `hotspots config validate`) - -**Language Support:** -- ✅ TypeScript (.ts, .tsx, .mts, .cts) -- ✅ JavaScript (.js, .jsx, .mjs, .cjs) -- ✅ JSX/TSX (React components) -- ✅ Go (.go) — defer, goroutines, select, type switches -- ✅ Python (.py) — comprehensions, context managers, match statements -- ✅ Rust (.rs) — match, if-let, `?` operator, loop labels -- ✅ Java (.java) — lambdas, try-with-resources, switch expressions - -**CI/CD Integration:** -- ✅ GitHub Action (Task 2.1 - COMPLETED) - - Automatic PR/push detection - - Delta analysis for PRs - - PR comments with violations - - HTML report artifacts - - Job summaries - - Binary caching - -**Infrastructure:** -- ✅ Automated release workflow (multi-platform binaries) -- ✅ Comprehensive test suite -- ✅ Documentation -- ✅ CLAUDE.md coding conventions - -### 📊 Progress - -**Overall:** 8/25 tasks completed (32%) - -**Phase Breakdown:** -- **Phase 1 (Foundations):** 4/7 completed -- **Phase 2 (CI/CD):** 3/4 completed -- **Phase 3 (Governance):** 1/4 completed -- **Phase 4 (Advanced):** 0/10 started - ---- - -## Strategic Direction - -### Vision Statement - -**"Make complexity regressions impossible in CI/CD pipelines"** - -Hotspots should be the standard tool that prevents code complexity from growing unchecked, just like linters prevent formatting issues and type checkers prevent type errors. - -### Core Principles - -1. **Zero Configuration** - Works out of the box -2. **Deterministic** - Byte-for-byte reproducible -3. **Fast** - Completes in <30s for most repos -4. 
**Actionable** - Clear violations, not just metrics -5. **Developer-Friendly** - Integrates seamlessly into existing workflows - -### Strategic Bets - -**Bet #1: GitHub Actions Integration is the Wedge** -- Hypothesis: Teams adopt via GitHub Actions first -- If true: Focus on GitHub Action UX, PR comments, workflow polish -- If false: Pivot to CLI-first, local development focus - -**Bet #2: Multi-Language Expands Addressable Market** *(validated)* -- Original hypothesis: TS/JS coverage may be sufficient -- Outcome: Go, Python, Rust, and Java were added; full 6-language parity now shipped -- Current focus: Deepening features across all languages, not expanding language count - -**Bet #3: Policy Engine Differentiates vs Competitors** -- Hypothesis: Automated regression blocking > manual metric review -- If true: Invest in policy sophistication, customization -- If false: Focus on reporting, visualization, trends - ---- - -## Phase 1: Market Validation (Q1 2026) - -**Timeline:** Feb - Mar 2026 (8 weeks) -**Goal:** Achieve product-market fit with TypeScript/JavaScript users - -### Milestones - -#### M1.1: v1.0.0 Release (Week 1) - -**Deliverables:** -- ✅ Tag v1.0.0 -- ✅ Trigger release workflow (build binaries) -- ✅ Test published GitHub Action -- ✅ Publish to GitHub Marketplace -- ✅ Announcement (Twitter, Reddit, HN) - -**Success Criteria:** -- Binaries available for all platforms -- GitHub Action works in external repos -- Listed on GitHub Marketplace - -#### M1.2: Early Adopter Feedback (Weeks 2-4) - -**Activities:** -1. **Outreach:** - - Share in TypeScript/React communities - - Post on r/typescript, r/reactjs - - Tweet with demo video - - Reach out to 10 target repos - -2. **Documentation:** - - Create "Getting Started" video (5 min) - - Write 3 blog posts: - - "Why LRS > Cyclomatic Complexity" - - "Blocking Complexity Regressions in CI" - - "Case Study: Analyzing [Popular TS Repo]" - -3. 
**Monitoring:** - - Track GitHub stars/forks - - Monitor GitHub Action usage (via releases) - - Collect issues/feedback - - Run user interviews (5-10 users) - -**Success Criteria:** -- 50+ GitHub stars -- 10+ repos using the GitHub Action -- 5+ user interviews completed -- Clear feedback themes identified - -#### M1.3: Iteration Based on Feedback (Weeks 5-8) - -**Focus Areas:** -- Fix top 3 user pain points -- Add most-requested features (if small) -- Improve documentation gaps -- Polish GitHub Action UX - -**Deliverables:** -- v1.1.0 release with improvements -- Updated docs -- Case studies from real users - -**Success Criteria:** -- 100+ GitHub stars -- 25+ repos using the action -- <10% churn rate (repos stop using) -- Net Promoter Score > 30 - -### Decision Point 1: Language Expansion (End of Week 8) - -**Question:** Should we add multi-language support? - -**Collect Data:** -- Survey users: "What languages do you use alongside TypeScript?" -- Analyze user repos: What languages appear in the same repos? -- Count feature requests for specific languages - -**Decision Criteria:** - -| Scenario | Data | Decision | -|----------|------|----------| -| **A: TS/JS is Sufficient** | <20% users need other languages | Continue TS/JS deepening | -| **B: Go is Top Request** | >40% users have Go + TS repos | Proceed to Multi-Language Experiment | -| **C: Mixed Demand** | Multiple languages requested equally | Survey deeper, defer decision | - ---- - -## Phase 2: Multi-Language Support (Completed) - -**Status:** ✅ Complete — Go, Python, Rust, and Java were implemented ahead of the original -Phase 2 timeline. Full language parity across all metrics and features is shipped. -The experiment and decision-point process described below was superseded by implementation. 
- -### Experimental Approach - -**Instead of full implementation, run a controlled experiment:** - -#### Experiment 1: Go Prototype (4 weeks) - -**Hypothesis:** -- Users will adopt Hotspots for Go if it provides value comparable to TS -- Go's simpler control flow makes it a good validation case -- Go support expands addressable market by 30%+ - -**Experiment Design:** - -**Week 1-2: Architecture Refactoring** -- Extract language-agnostic CFG -- Create `LanguageSupport` trait -- Refactor TS/JS to new architecture -- **Validation:** All existing tests pass, no regression - -**Week 3-4: Go Minimal Viable Implementation** -- Integrate tree-sitter-go parser -- Implement basic CFG builder (if, loops, switch) -- Skip: defer, goroutines, select (initially) -- Generate LRS for Go functions -- **Validation:** Analyze 3 popular Go repos, compare to manual review - -**Deliverables:** -- `hotspots analyze --lang go main.go` works -- LRS calculation for basic Go code -- 20+ unit tests -- Analysis of 3 real Go repos - -**Success Criteria:** -- LRS values make sense (validated by Go developers) -- No crashes on real Go code -- <30% variance from manual complexity assessment - -#### Experiment 2: User Validation (2 weeks) - -**Beta Release: v1.2.0-beta (Go support)** - -**Recruit 10 Beta Testers:** -- Criteria: TypeScript + Go polyglot repos -- Provide early access -- Ask to analyze their Go code -- Collect feedback - -**Questions:** -1. Does Go LRS align with your intuition about complex functions? -2. Would you use Hotspots for Go in CI? -3. What Go features are missing/broken? -4. Is Go support worth it vs. TS-only? - -**Success Criteria:** -- 7/10 testers say "LRS is accurate" -- 5/10 testers would use in CI -- <5 critical bugs reported -- No major architectural blockers - -#### Decision Point 2: Full Go Implementation (Week 6) - -**Question:** Should we complete Go support? 
- -| Outcome | Data | Decision | -|---------|------|----------| -| **Strong Signal** | 8/10 testers positive, high demand | Invest 4 more weeks for full Go | -| **Mixed Signal** | 5/10 testers positive, some demand | Release as experimental, iterate | -| **Weak Signal** | <5/10 testers positive, low demand | Shelve Go, focus on TS/JS depth | - -### Post-Experiment: Full Go Implementation (4 weeks) - -**If Decision Point 2 → Strong Signal:** - -**Week 7-8: Complete Go Features** -- Defer handling -- Goroutine spawn (count as fan-out) -- Select statements -- Multiple return values -- Error handling patterns - -**Week 9-10: Testing & Polish** -- 50+ integration tests -- Analyze 10 popular Go repos -- Document Go-specific behavior -- Update GitHub Action - -**Week 11-12: Release & Validation** -- v1.2.0 stable release -- Announce Go support -- Monitor adoption -- Collect feedback - -**Success Criteria:** -- 50+ repos using Go analysis -- <5% error rate on real Go code -- Positive community feedback - ---- - -## Phase 3: Scale & Enterprise (Q3-Q4 2026) - -**Timeline:** Jul - Dec 2026 (24 weeks) -**Goal:** Enterprise-ready features, monetization, ecosystem growth - -### Q3: Enterprise Features (12 weeks) - -**Focus:** Features that large teams need - -#### M3.1: Advanced CI/CD (4 weeks) - -- GitLab CI integration -- Jenkins plugin -- CircleCI orb -- Bitbucket Pipelines support -- Azure DevOps task - -#### M3.2: Team Features (4 weeks) - -- Config inheritance (repo-level, org-level) -- Team-specific policies -- Centralized reporting dashboard -- Trend tracking across repos -- Email/Slack notifications - -#### M3.3: Enterprise Security (4 weeks) - -- SAML/SSO integration -- Audit logging -- Policy enforcement API -- Webhook support -- On-premise deployment option - -### Q4: Ecosystem & Growth (12 weeks) - -#### M3.4: Developer Experience (6 weeks) - -- VS Code extension (inline LRS display) -- IntelliJ plugin -- CLI autocomplete -- Interactive tutorials -- Playground 
(web-based demo) - -#### M3.5: Community & Content (6 weeks) - -- Open source reference implementations -- Complexity best practices guide -- Video tutorial series -- Case studies (5+ companies) -- Conference talks - -### Decision Point 3: Monetization (End of Q3) - -**Question:** What's the business model? - -**Options:** - -**A: Open Core** -- Free: Core analysis, CLI, basic GitHub Action -- Paid: Team features, SSO, dashboard, SLA support -- Price: $99/month per team (5-50 devs) - -**B: Hosted SaaS** -- Free: Open source repos -- Paid: Private repos, $49/month per org -- Enterprise: Custom pricing, on-premise - -**C: Consulting/Support** -- Free: All features -- Revenue: Implementation consulting, training, custom integrations -- Price: $5K-50K per engagement - -**D: Keep Free, GitHub Sponsors** -- Free: Everything -- Donations: GitHub Sponsors, Open Collective -- Sustainability: Grants, company sponsorships - ---- - -## Multi-Language Technical Analysis - -### Language Complexity Assessment - -| Language | Complexity | Dev Time | Risk | ROI | -|----------|-----------|----------|------|-----| -| **Go** | 🟡 Medium | 7 weeks | Medium | ⭐⭐⭐ High | -| **Python** | 🔴 High | 9 weeks | High | ⭐⭐ Medium | -| **Rust** | 🔴🔴 Very High | 14 weeks | Very High | ⭐ Low | -| **Java** | 🟡 Medium | 6 weeks | Medium | ⭐⭐ Medium | -| **C#** | 🟡 Medium | 6 weeks | Medium | ⭐⭐ Medium | - -### Go (Medium Complexity) - -**Parser:** tree-sitter-go or official Go parser via FFI - -**Control Flow Challenges:** -```go -// Error handling pattern (inflates CC) -result, err := function() -if err != nil { - return err -} - -// Defer (implicit finally) -defer cleanup() - -// Select (multi-way branching) -select { -case msg := <-ch1: -case ch2 <- value: -default: -} - -// Goroutines (concurrent execution) -go worker() -``` - -**Metrics Impact:** -- **CC:** `if err != nil` pattern everywhere (2x normal) -- **Fan-out:** Goroutine spawns count? 
-- **NS:** panic, multiple returns, defer -- **ND:** Standard nesting - -**Effort Breakdown:** -- Architecture refactor: 3 weeks -- Go parser + CFG: 3 weeks -- Testing + polish: 1 week -- **Total:** 7 weeks - -**Risk Factors:** -- CGO dependency if using official parser -- Error handling inflates CC (need normalization?) -- Defer semantics subtle - -### Python (High Complexity) - -**Parser:** tree-sitter-python or rustpython-parser - -**Control Flow Challenges:** -```python -# Comprehensions (implicit loops) -result = [x for x in range(10) if x % 2 == 0] - -# Context managers (implicit try/finally) -with open('file') as f: - data = f.read() - -# Generators (multiple exits) -def gen(): - yield 1 - yield 2 - -# Loop else clause -for x in items: - pass -else: - # Runs if no break - pass - -# Multiple exception types -try: - risky() -except (TypeError, ValueError) as e: - handle() -``` - -**Metrics Impact:** -- **CC:** Comprehensions = loops? (design decision) -- **Fan-out:** Comprehensions have implicit calls? -- **NS:** yield, raise, comprehension exits -- **ND:** Comprehensions nest differently - -**Effort Breakdown:** -- Basic Python CFG: 4 weeks -- Comprehensions: 2 weeks -- Context managers: 1 week -- Testing + polish: 2 weeks -- **Total:** 9 weeks (without generators) - -**Risk Factors:** -- Dynamic typing makes fan-out hard -- Comprehension modeling (research problem) -- Generator CFG very complex (defer to later) - -### Rust (Very High Complexity) - -**Parser:** syn crate (official Rust parser) - -**Control Flow Challenges:** -```rust -// Pattern matching (multi-way branching) -match value { - Some(x) if x > 10 => {} - Some(_) => {} - None => {} -} - -// if let / while let -if let Some(value) = option { - // ... -} - -// ? 
operator (implicit return) -let result = func()?; - -// Async/await (state machines) -async fn foo() { - bar().await; -} - -// Loop labels with break values -'outer: loop { - break 'outer 42; -} -``` - -**Metrics Impact:** -- **CC:** Match arms explode CC -- **Fan-out:** Trait methods, closures -- **NS:** ?, unwrap, panic! -- **ND:** Match arms count as nesting? - -**Effort Breakdown:** -- Basic Rust CFG: 5 weeks -- Match, if let: 3 weeks -- ? operator: 1 week -- Testing + polish: 3 weeks -- **Async (later):** 8 weeks -- **Total:** 12 weeks (subset), 20 weeks (full) - -**Risk Factors:** -- Async/await is research-level complex -- Macro expansion changes CFG -- Trait resolution for fan-out - -### Architecture Requirements - -**For any new language, need:** - -1. **Language-agnostic CFG representation** - ```rust - pub trait LanguageSupport { - // `ParseError` is a placeholder: the original type arguments were lost in extraction - fn parse(&self, source: &str) -> Result<ParsedModule, ParseError>; - fn discover_functions(&self, module: &ParsedModule) -> Vec<Function>; - fn build_cfg(&self, function: &Function) -> Cfg; - } - ``` - -2. **Shared metric calculation** - - CFG → metrics should be language-agnostic - - Risk transformation same across languages - -3. **Determinism guarantees** - - Function ordering deterministic - - Output byte-for-byte identical - -4. **Cross-language test suite** - - Same algorithm in different languages - - Validate LRS consistency - ---- - -## Decision Framework - -### When to Add a Language - -**Required Conditions (ALL must be true):** - -1. **User Demand** - - >30% of surveyed users need this language - - OR: 20+ GitHub issues requesting it - - OR: Clear competitor gap (they don't support it) - -2. **Market Size** - - Language in top 10 by usage (TIOBE/Stack Overflow) - - OR: Niche with high willingness to pay - -3. **Technical Feasibility** - - Parser available in Rust ecosystem - - Control flow modeling is tractable - - Estimated effort <8 weeks - -4.
**Strategic Fit** - - Aligns with roadmap priorities - - Team has capacity - - Won't distract from core value prop - -### Language Priority Matrix - -``` -High Demand, Low Complexity → IMPLEMENT SOON (Go) -High Demand, High Complexity → EXPERIMENTAL VALIDATION (Python) -Low Demand, Low Complexity → WAIT FOR SIGNAL (Java, C#) -Low Demand, High Complexity → DEFER INDEFINITELY (Rust async, Haskell) -``` - -### Go/No-Go Decision Template - -For each language candidate: - -**Demand Score (0-10):** -- User requests: ___ / 3 pts -- Market size: ___ / 3 pts -- Competitor gap: ___ / 2 pts -- Strategic importance: ___ / 2 pts - -**Feasibility Score (0-10):** -- Parser quality: ___ / 3 pts -- Control flow complexity: ___ / 4 pts (inverse) -- Team expertise: ___ / 2 pts -- Testing burden: ___ / 1 pt (inverse) - -**Formula:** `Priority = (Demand × 1.5) + Feasibility` - -**Thresholds:** -- **>22:** Implement now -- **15-22:** Experimental validation -- **<15:** Defer - ---- - -## Experimental Plan - -### Experiment Framework - -**For each language being considered:** - -#### Phase 1: Rapid Prototype (2 weeks) - -**Goal:** Prove technical feasibility - -**Deliverables:** -- Parser integration -- Basic CFG for 3 control flow constructs (if, loop, return) -- LRS calculation for 1 real-world function -- 5 unit tests - -**Success Criteria:** -- Parser doesn't crash on real code -- LRS values in expected range -- No obvious architectural blockers - -**Budget:** 2 weeks, 1 developer - -#### Phase 2: User Validation (2 weeks) - -**Goal:** Validate user interest and accuracy - -**Activities:** -1. Recruit 5 beta testers (users who requested this language) -2. Analyze 3-5 repos in their codebases -3. Compare LRS to manual assessment -4. Interview: "Would you use this in CI?" 
- -**Success Criteria:** -- 4/5 testers say LRS is accurate -- 3/5 testers would use in CI -- <3 critical bugs found - -**Budget:** 2 weeks, 1 developer + PM for interviews - -#### Phase 3: Decision Point - -**Go:** Commit to full implementation -- Allocate 6-8 weeks -- Assign dedicated developer -- Set quality bar (95% accuracy, <5% crashes) - -**No-Go:** Shelve and document learnings -- Publish "Why we didn't add [Language]" blog post -- Keep prototype as proof-of-concept -- Revisit in 6 months - -### Experiment Tracking - -**For each experiment, track:** - -| Metric | Target | Actual | Status | -|--------|--------|--------|--------| -| Time to prototype | 2 weeks | - | - | -| Parser crash rate | <1% | - | - | -| LRS accuracy (vs manual) | >80% | - | - | -| User interest (would use) | >60% | - | - | -| Critical bugs found | <5 | - | - | - -**Decision Criteria:** -- If ALL targets met → Full implementation -- If 4/5 targets met → Iterate prototype -- If <4/5 targets met → Shelve - ---- - -## Success Metrics - -### Phase 1: Market Validation - -**Adoption:** -- 100 GitHub stars (Week 8) -- 25 repos using GitHub Action (Week 8) -- 10 paying customers (if monetization starts) - -**Engagement:** -- 20 issues/PRs from community -- 5 blog posts/articles about Hotspots -- 10 user interviews completed - -**Quality:** -- <5% crash rate on real repos -- <10% false positive rate (violations that aren't real) -- Net Promoter Score >30 - -### Phase 2: Multi-Language Experiment - -**Technical:** -- Go prototype completes in 2 weeks -- >80% LRS accuracy on Go code -- <1% parser crash rate - -**User Validation:** -- 5/10 beta testers positive -- 3/10 would use in CI -- <5 critical bugs - -**Decision Quality:** -- Clear go/no-go decision made -- Documented reasoning -- No regrets 3 months later - -### Phase 3: Scale & Enterprise - -**Revenue (if monetized):** -- $10K MRR (Month 12) -- $50K MRR (Month 18) -- 5 enterprise customers - -**Ecosystem:** -- 3 integrations (GitLab,
Jenkins, etc.) -- 2 IDE plugins -- 10 case studies - -**Community:** -- 1,000 GitHub stars -- 100 contributors -- 50 third-party blog posts - ---- - -## Risks & Mitigation - -### Technical Risks - -| Risk | Impact | Probability | Mitigation | -|------|--------|-------------|------------| -| Multi-language breaks determinism | High | Medium | Extensive testing, formal verification | -| LRS not comparable across languages | High | High | Cross-language validation, normalize metrics | -| Parser dependencies break builds | Medium | Medium | Pin versions, vendor if needed | -| CFG modeling too complex | High | Medium | Start with subset, defer edge cases | - -### Market Risks - -| Risk | Impact | Probability | Mitigation | -|------|--------|-------------|------------| -| No demand for multi-language | High | Low | Validate before building | -| Competitors add TS support | Medium | Medium | Move fast, differentiate on UX | -| Users don't trust LRS | High | Low | Publish methodology, case studies | -| GitHub changes Actions API | Medium | Low | Abstract GitHub-specific code | - -### Execution Risks - -| Risk | Impact | Probability | Mitigation | -|------|--------|-------------|------------| -| Scope creep delays v1.0 | High | Medium | Ruthless prioritization, MVP focus | -| Multi-language takes 2x longer | Medium | High | Experimental validation first | -| Team burnout | High | Low | Sustainable pace, celebrate wins | -| Poor documentation blocks adoption | Medium | Medium | Invest in docs, videos, examples | - ---- - -## Open Questions - -### Product Direction - -1. **Is Hotspots a CLI tool or a service?** - - CLI-first: Self-hosted, open source, community-driven - - Service-first: Hosted analysis, dashboard, enterprise features - -2. **Should LRS be comparable across languages?** - - Yes: Massive validation effort, normalize aggressively - - No: Language-specific thresholds, easier to implement - -3. 
**What's the primary use case?** - - PR blocking: Policy enforcement, regression prevention - - Refactoring guide: Identify hotspots, track improvements - - Code review: Show complexity in reviews - -### Technical Architecture - -1. **Trait-based abstraction or per-language modules?** - - Trait-based: Shared CFG builder, complex design - - Per-language: Code duplication, simpler implementation - -2. **How to handle async/await?** - - Model state machine CFG: Very complex - - Ignore for now: Missing important complexity - - Treat as black box: Simple but inaccurate - -3. **Should we support language subsets initially?** - - Yes: Ship faster, iterate based on feedback - - No: Users expect full coverage, partial support is confusing - ---- - -## Next Actions - -### Immediate (This Week) - -- [ ] Merge GitHub Action PR to main -- [ ] Tag v1.0.0 release -- [ ] Test published action in external repo -- [ ] Publish to GitHub Marketplace -- [ ] Write launch announcement - -### Short Term (Next 4 Weeks) - -- [ ] User outreach (10 target repos) -- [ ] Create "Getting Started" video -- [ ] Write 3 blog posts -- [ ] Collect 5+ user interviews -- [ ] Monitor GitHub Action adoption - -### Medium Term (Next 8 Weeks) - -- [ ] Analyze interview feedback -- [ ] Survey users on language needs -- [ ] Make Decision Point 1 (language expansion) -- [ ] If GO: Start architecture refactoring -- [ ] If NO: Plan TS/JS feature deepening - ---- - -## Appendix: Competitor Landscape - -### Complexity Analysis Tools - -**SonarQube** -- Multi-language (30+ languages) -- Hosted + self-hosted -- Enterprise focus -- Heavy, slow, expensive - -**CodeClimate** -- Multi-language (10+ languages) -- Hosted only -- GitHub integration -- $200+/month - -**Codacy** -- Similar to CodeClimate -- Multi-language -- Hosted + self-hosted -- Enterprise focus - -### Hotspots Differentiation - -**vs SonarQube:** -- ✅ Fast (seconds vs minutes) -- ✅ Deterministic (byte-for-byte) -- ✅ Policy-based (not just metrics) -- ❌ 
Fewer languages (for now) - -**vs CodeClimate:** -- ✅ Self-hosted (no data leaves repo) -- ✅ Open source (transparent metrics) -- ✅ GitHub Action (zero config) -- ❌ No hosted dashboard (yet) - -**vs All:** -- ✅ LRS (better than CC alone) -- ✅ Git-aware (delta analysis) -- ✅ Proactive warnings (not just blocking) -- ✅ Suppression with documentation - ---- - -**Last Updated:** 2026-03-03 -**Next Review:** 2026-04-01 -**Maintained By:** Core team - -**Changes to this roadmap require:** -- Data-driven decision making -- User validation -- Team consensus -- Updated success metrics