
Release v0.2.0 — pipeline overhaul + web UI follow-ups #53

Merged
RaghavChamadiya merged 5 commits into main from feat/pipeline-overhaul
Apr 7, 2026
@RaghavChamadiya
Collaborator

Summary

Bumps repowise from 0.1.31 → 0.2.0. Lands the full pipeline overhaul plus the P0+P1 web UI follow-ups in one release.

See CHANGELOG.md for the complete list. Highlights:

Pipeline & ingestion

  • Parallel AST parsing via ProcessPoolExecutor; concurrent graph + git stages
  • RAG-aware doc generation (topological order, dependency summaries injected)
  • AtomicStorageCoordinator (3-store transactions + drift health check)
  • Dynamic import hint extractors (Django, pytest, Node/TS path aliases)
  • Temporal hotspot decay (180-day half-life)
  • SQL PERCENT_RANK() window-function percentiles
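
As an illustration of the 180-day half-life, here is a minimal sketch of exponential decay applied to commit ages. Only the half-life comes from this PR; the function names and the choice to sum decayed weights per file are assumptions, not the repowise implementation.

```python
import math
from datetime import datetime, timedelta, timezone

HALF_LIFE_DAYS = 180.0  # value stated in the release notes


def decay_weight(age_days: float, half_life_days: float = HALF_LIFE_DAYS) -> float:
    """Weight of a commit that is `age_days` old; it halves every half-life."""
    return math.exp(-math.log(2) * max(age_days, 0.0) / half_life_days)


def temporal_hotspot_score(commit_times, now):
    """Sum of decayed weights over a file's commit timestamps (assumed formula)."""
    return sum(
        decay_weight((now - t).total_seconds() / 86400.0) for t in commit_times
    )


now = datetime(2026, 4, 7, tzinfo=timezone.utc)
commits = [now, now - timedelta(days=180)]
score = temporal_hotspot_score(commits, now)  # one fresh commit + one at half weight
```

Under this weighting, a file touched yesterday outranks a file with the same commit count whose activity is a year old.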

Analysis & MCP

  • PR blast radius analyzer (get_risk(changed_files=...))
  • Security pattern scanner (security_findings table)
  • Knowledge map in get_overview (top owners, silos, onboarding targets)
  • Configurable dead-code sensitivity in CLI and get_dead_code MCP tool
  • LLM cost tracking (llm_costs table, repowise costs CLI, live progress)

Web UI (8 new pages/sections, 5 new REST endpoints)

  • Costs page · Blast Radius page · Knowledge Map card · System Health card
  • Trend column on Hotspots · Security panel + No-tests badge on wiki
  • Live cost on generation progress

MCP tool surface unchanged — eight tools, all changes additive/optional.

Test plan

  • pytest tests/ — 757 passed, 9 skipped
  • End-to-end repowise init against test-repos/microdot (107 pages, 132 git rows, 83 cost rows, 4 security findings, drift 0)
  • Manual smoke test of all new web pages against the microdot index
  • After merge: tag v0.2.0 to trigger PyPI publish workflow

…res, cost tracking, PR blast radius

Adds 11 capabilities across the indexing pipeline, persistence layer, MCP
tools, and CLI. MCP tool count is unchanged; new functionality is folded
into existing tools (get_risk, get_overview, get_dead_code).

Pipeline & generation
- ProcessPool-based parsing with sequential fallback; ingestion and git
  stages now run concurrently via asyncio.gather
- RAG-aware doc generation: dependency summaries are pre-fetched from the
  vector store and injected into the file_page prompt; pages generated in
  topological order so leaves are summarized before their dependents
- Dynamic import hint extractors (Django INSTALLED_APPS/ROOT_URLCONF/
  MIDDLEWARE/url include, pytest conftest fixtures, Node package.json
  exports + tsconfig path aliases) wired into GraphBuilder.add_dynamic_edges
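
The topological ordering described above can be sketched with the standard library's `graphlib`. The file names and the `generation_order` helper are hypothetical; the actual generator may order pages differently or handle import cycles in its own way.

```python
from graphlib import TopologicalSorter


def generation_order(deps: dict[str, set[str]]) -> list[str]:
    """Order files so that each one comes after everything it depends on.

    `deps` maps a file to the set of files it imports. TopologicalSorter
    treats those as predecessors, so leaves are emitted first.
    """
    return list(TopologicalSorter(deps).static_order())


deps = {
    "app.py": {"service.py", "utils.py"},
    "service.py": {"utils.py"},
    "utils.py": set(),
}
order = generation_order(deps)
```

With this ordering, the summary for `utils.py` already exists when the prompt for `service.py` is built. Note that `static_order()` raises `graphlib.CycleError` on cyclic input, so a real pipeline would need a strategy for import cycles.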

Persistence
- AtomicStorageCoordinator with async transaction() context manager and
  health_check() spanning SQL, in-memory graph, and vector store
- recompute_git_percentiles now uses a single SQL PERCENT_RANK() window
  function instead of in-memory Python ranking
- New temporal_hotspot_score column on git_metadata, computed via exp
  decay (180-day half-life) and used as the primary percentile sort key
- New llm_costs and security_findings tables; matching ORM models
- vector_store.get_page_summary_by_path() on all three backends
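
For readers unfamiliar with `PERCENT_RANK()`, the following sketch runs the window function against an in-memory SQLite database (SQLite 3.25 or newer is required for window functions). The table and score column names follow this PR; the `file_path` column and the sample rows are assumptions made for illustration, and the real query may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE git_metadata ("
    "file_path TEXT PRIMARY KEY, temporal_hotspot_score REAL)"
)
conn.executemany(
    "INSERT INTO git_metadata VALUES (?, ?)",
    [("a.py", 0.5), ("b.py", 2.0), ("c.py", 1.0), ("d.py", 4.0), ("e.py", 3.0)],
)

# PERCENT_RANK() returns (rank - 1) / (row_count - 1) within the ordered window.
rows = conn.execute(
    """
    SELECT file_path,
           PERCENT_RANK() OVER (ORDER BY temporal_hotspot_score) AS pct
    FROM git_metadata
    ORDER BY pct
    """
).fetchall()
percentiles = dict(rows)
conn.close()
```

The database ranks every row in one pass, which avoids loading all rows into Python just to sort them.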

Cost tracking
- CostTracker with per-call recording, persisted to llm_costs; pricing
  table covers Claude 4.6 family, GPT-4o, and Gemini 1.5/2.5/3.x variants
- Wired into Anthropic, Gemini, OpenAI, and LiteLLM providers
- Live USD column on the indexing progress bar
- New `repowise costs` CLI grouping by operation/model/day
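
A rough sketch of per-call cost recording with grouping by operation or model follows. The class name, the record fields, and especially the prices are placeholders invented for this example; they are not the pricing table or the `CostTracker` API shipped in this PR.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical prices in USD per million tokens: (input, output).
EXAMPLE_PRICES = {"example-model": (1.00, 2.00)}


@dataclass
class SketchCostTracker:
    calls: list = field(default_factory=list)

    def record(self, model: str, operation: str,
               input_tokens: int, output_tokens: int) -> float:
        """Price one LLM call, store it, and return its cost in USD."""
        in_price, out_price = EXAMPLE_PRICES[model]
        cost = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
        self.calls.append(
            {"model": model, "operation": operation, "cost_usd": cost}
        )
        return cost

    def total_by(self, key: str) -> dict:
        """Sum recorded costs grouped by 'model' or 'operation'."""
        totals = defaultdict(float)
        for call in self.calls:
            totals[call[key]] += call["cost_usd"]
        return dict(totals)


tracker = SketchCostTracker()
tracker.record("example-model", "file_page", 1000, 500)
tracker.record("example-model", "file_page", 2000, 0)
by_operation = tracker.total_by("operation")
```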

Analysis
- PRBlastRadiusAnalyzer: transitive ancestor BFS over graph_edges,
  co-change warnings, recommended reviewers by temporal ownership,
  test gaps, 0–10 overall risk score
- SecurityScanner: pattern-based scan for eval/exec/pickle/raw SQL/
  hardcoded secrets/weak hashes; persisted at index time
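
The transitive-ancestor BFS can be sketched as below. It assumes each edge is an `(importer, imported)` pair, so the "ancestors" of a changed file are the files that depend on it; the function name and the sample edges are hypothetical and the analyzer's real data model may differ.

```python
from collections import defaultdict, deque


def transitive_dependents(edges, changed_files):
    """Return every file that directly or transitively imports a changed file.

    The changed files themselves are excluded from the result.
    """
    # Reverse index: imported file -> files that import it.
    importers = defaultdict(set)
    for importer, imported in edges:
        importers[imported].add(importer)

    seen = set(changed_files)
    queue = deque(changed_files)
    while queue:
        current = queue.popleft()
        for parent in importers.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen - set(changed_files)


edges = [
    ("app.py", "service.py"),
    ("service.py", "utils.py"),
    ("cli.py", "app.py"),
    ("other.py", "lib.py"),
]
impacted = transitive_dependents(edges, ["utils.py"])
```

The `seen` set keeps the traversal linear in the number of edges and makes it safe on cyclic import graphs.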

MCP tool extensions
- get_risk(changed_files=[...]) returns blast radius; per-file payload
  now includes test_gap and security_signals
- get_overview returns knowledge_map with top owners, knowledge silos
  (>80% ownership), and onboarding targets
- get_dead_code accepts min_confidence, include_internals,
  include_zombie_packages, no_unreachable, no_unused_exports
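
To make the silo rule concrete, here is a small sketch that flags files where a single author holds more than 80% ownership. The input shape (`{path: {author: share}}`) and the function name are assumptions; only the 80% threshold comes from this PR.

```python
def knowledge_silos(ownership, threshold=0.80):
    """Return {path: author} for files whose top author's share exceeds threshold."""
    silos = {}
    for path, shares in ownership.items():
        if not shares:
            continue  # no ownership data for this file
        author, share = max(shares.items(), key=lambda item: item[1])
        if share > threshold:
            silos[path] = author
    return silos


ownership = {
    "a.py": {"alice": 0.9, "bob": 0.1},
    "b.py": {"alice": 0.5, "bob": 0.5},
    "c.py": {"carol": 0.8, "bob": 0.2},
}
silos = knowledge_silos(ownership)
```

The comparison is strict, so a file at exactly 80% is not flagged in this sketch.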

CLI
- `repowise dead-code` exposes the same sensitivity flags
- `repowise doctor` adds a coordinator drift health check (Check #10)
- `repowise costs` command registered

Tests
- test_models.py: expected table set updated to include llm_costs and
  security_findings; full suite green (757 passed, 9 skipped)
- End-to-end validated against test-repos/microdot: 164 files ingested,
  83 pages generated, 132 git_metadata rows with temporal hotspot score,
  83 cost rows totaling $0.0258, 2 security findings, drift = 0

P0:
- Cost tracking: GET /api/repos/{id}/costs[/summary], /repos/[id]/costs page
  with day/model/operation grouping, Recharts chart, sidebar nav.
- Hotspot table: temporal_hotspot_score in HotspotResponse, ordered by trend,
  new sortable Trend column with flame indicator.
- Test gap: test_gap on GitMetadataResponse, "No tests" badge in wiki sidebar.
- Security findings: GET /api/repos/{id}/security router, SecurityPanel
  client component on wiki page right sidebar.

P1:
- PR blast radius: POST /api/repos/{id}/blast-radius, interactive page with
  risk score gauge, direct/transitive/co-change/reviewer/test-gap tables,
  sidebar + mobile nav entry.
- Knowledge map: extracted compute_knowledge_map() service, REST endpoint,
  3-column overview card (top owners / silos / onboarding targets).
- Coordinator drift: GET /api/repos/{id}/health/coordinator, System Health
  card on settings page with color-coded status.
- Live cost on web progress: SSE stream now sums llm_costs since job start,
  generation-progress component shows running cost with live indicator.
- coordinator._vector_count: fall back to list_page_ids() for LanceDB/Pg
  (was returning -1, surfaced as 0 in System Health → false 100% drift).
- coordinator.health_check(): count graph_nodes from SQL when no in-memory
  GraphBuilder is supplied (the REST route does not have one).
- Blast radius page: hotspot suggestion chips, 'Use top hotspots' prefill,
  Clear button, helper sentence — users no longer need to know paths.
- Bump version 0.1.31 → 0.2.0 across all four package files
- Add CHANGELOG.md with full 0.2.0 entry (11 new capabilities, 8 new
  REST endpoints, 5 new web pages, 3 new migrations)
- Remove temporary 'What's new' section from README; reflect new web UI
  pages (Costs, Blast Radius, Knowledge Map, System Health) in the
  Local dashboard table
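
The coordinator drift fix noted above (a vector count of -1 surfacing as 0 and reading as 100% drift) suggests treating an unknown count as distinct from an empty store. The sketch below shows one way to do that; the drift formula and the function name are assumptions, not the coordinator's actual logic.

```python
from typing import Optional


def drift_ratio(sql_pages: int, vector_pages: int) -> Optional[float]:
    """Relative difference between the two page counts, or None if unknown."""
    if sql_pages < 0 or vector_pages < 0:
        # A negative count is a sentinel for "could not count", not zero pages.
        return None
    if sql_pages == 0:
        return 0.0
    return abs(sql_pages - vector_pages) / sql_pages


healthy = drift_ratio(107, 107)
unknown = drift_ratio(107, -1)
```

Returning `None` lets a caller render "unknown" instead of a misleading percentage when a backend cannot report its size.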
Collaborator

@swati510 left a comment

LGTM

@RaghavChamadiya merged commit 132a3ee into main on Apr 7, 2026
5 checks passed
@RaghavChamadiya deleted the feat/pipeline-overhaul branch on April 7, 2026, 12:25