
perf(flows): load adjacency in-memory for trace_flows#296

Open
0bLoM wants to merge 1 commit into tirth8205:main from 0bLoM:perf/flow-adjacency-cache

Conversation

@0bLoM 0bLoM commented Apr 15, 2026

Summary

  • On ~500k-node / ~3M-edge graphs, trace_flows and compute_criticality ground away for many minutes at 100% CPU because every BFS step and every criticality factor issued per-row SQLite point queries (get_edges_by_source, get_node, get_node_by_id, get_edges_by_target).
  • Add a FlowAdjacency dataclass and GraphStore.load_flow_adjacency(), which streams nodes and CALLS/TESTED_BY edges into memory in two queries.
  • Refactor _trace_single_flow, compute_criticality, trace_flows, and incremental_trace_flows to use in-memory dict/set lookups instead of SQLite round-trips.
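To make the shape of the change concrete, here is a minimal sketch of what FlowAdjacency and load_flow_adjacency might look like. The class and method names come from the PR; the field names, table schema, and column names (nodes, edges, source, target, type, qualified_name) are assumptions for illustration, not the repository's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class FlowAdjacency:
    # Hypothetical fields; the PR does not show the real definition.
    calls_out: dict[str, list[str]] = field(default_factory=dict)  # node id -> CALLS targets
    tested_by: set[str] = field(default_factory=set)               # ids with an incoming TESTED_BY edge
    nodes_by_qn: dict[str, dict] = field(default_factory=dict)     # qualified name -> node row
    nodes_by_id: dict[str, dict] = field(default_factory=dict)     # node id -> node row


def load_flow_adjacency(conn) -> FlowAdjacency:
    """Build the whole adjacency in two streaming SELECTs,
    replacing per-row point queries during traversal."""
    adj = FlowAdjacency()
    # Scan 1: all nodes, indexed two ways for target resolution.
    for node_id, qualified_name, kind in conn.execute(
        "SELECT id, qualified_name, kind FROM nodes"
    ):
        row = {"id": node_id, "qualified_name": qualified_name, "kind": kind}
        adj.nodes_by_id[node_id] = row
        adj.nodes_by_qn[qualified_name] = row
    # Scan 2: only the edge types the flow pass needs.
    for source, target, edge_type in conn.execute(
        "SELECT source, target, type FROM edges WHERE type IN ('CALLS', 'TESTED_BY')"
    ):
        if edge_type == "CALLS":
            adj.calls_out.setdefault(source, []).append(target)
        else:
            adj.tested_by.add(target)
    return adj
```

The trade-off is the usual one: two full table scans up front and O(nodes + edges) memory, in exchange for O(1) lookups on every BFS step afterward.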

Test plan

  • uv run pytest tests/ — 788 passed, 1 skipped, 2 xpassed
  • ruff check on modified files — clean
  • mypy --ignore-missing-imports --no-strict-optional on modified files — clean
  • Validate wall-clock speedup on the large (MekWarLive) graph where the original hang was observed

🤖 Generated with Claude Code

…cality

On large graphs (~500k nodes, ~3M edges) trace_flows and compute_criticality
did tens of millions of per-row SQLite point queries (get_edges_by_source,
get_node, get_node_by_id, get_edges_by_target), causing the build's flow
step to grind for many minutes at 100% CPU with no progress output.

Add FlowAdjacency dataclass and GraphStore.load_flow_adjacency() that
builds the needed adjacency (CALLS out-edges, TESTED_BY incoming set,
nodes-by-qn, nodes-by-id) in two streaming SELECTs. Refactor
_trace_single_flow, compute_criticality, trace_flows, and
incremental_trace_flows to use it instead of per-node/per-edge queries.

BFS target resolution, external-call detection, and test-coverage checks
become dict/set lookups; the whole pass reduces to two table scans plus
in-memory traversal.
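A rough sketch of what the refactored traversal reduces to, assuming the adjacency maps described above. This is illustrative only, not the PR's actual _trace_single_flow: the parameter names and the plain-dict signature are hypothetical, and the real function presumably takes the FlowAdjacency object.

```python
from collections import deque


def trace_single_flow(
    calls_out: dict[str, list[str]],
    nodes_by_qn: dict[str, dict],
    nodes_by_id: dict[str, dict],
    entry_qn: str,
    max_depth: int = 50,
) -> list[str]:
    """BFS over preloaded CALLS adjacency: every step that used to be a
    SQLite point query is now a dict/set lookup."""
    entry = nodes_by_qn.get(entry_qn)  # replaces a get_node() round-trip
    if entry is None:
        return []
    visited = {entry["id"]}
    order = [entry["id"]]
    queue = deque([(entry["id"], 0)])
    while queue:
        node_id, depth = queue.popleft()
        if depth >= max_depth:
            continue
        for target in calls_out.get(node_id, ()):  # replaces get_edges_by_source()
            if target not in nodes_by_id:          # unresolved target: treat as external call
                continue
            if target not in visited:
                visited.add(target)
                order.append(target)
                queue.append((target, depth + 1))
    return order
```

With the adjacency preloaded, the whole flow pass is two table scans plus this kind of pure in-memory loop, which is where the wall-clock win on the ~500k-node graph would come from.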

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
