fix: cross-file call resolution — preserve N-way collision signal#397

Open
jackshiung wants to merge 4 commits into safishamsi:v4 from jackshiung:chillvibe/main

@jackshiung

Summary

Fixes a silent data-loss bug in cross-file call resolution (introduced in #348 / 5c77d9c) where global_label_to_nid: dict[str, str] drops all but one candidate when multiple nodes share the same normalised label.

Problem

When multiple functions/methods across files share a name (e.g. .get() defined in 32 files of a Laravel/CodeIgniter monorepo):

global_label_to_nid: dict[str, str] = {}
for n in all_nodes:
    global_label_to_nid[normalised.lower()] = n["id"]   # N-1 candidates silently dropped

Every call site of .get() is routed to whichever nid happened to be iterated last, producing a pseudo-god-node. Every edge carries ambiguity_degree=0, so the ambiguity signal is lost.
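The overwrite can be reproduced in isolation. The following is a minimal sketch, not the project's code: the node dictionaries, their `label` field, and the ids are hypothetical, and normalisation is reduced to `.lower()`.

```python
# Toy nodes: three files each define a method whose normalised label is ".get()".
# The node shape ("id", "label") and the ids are illustrative assumptions.
all_nodes = [
    {"id": "users_get", "label": ".get()"},
    {"id": "orders_get", "label": ".get()"},
    {"id": "carts_get", "label": ".get()"},
]

global_label_to_nid: dict[str, str] = {}
for n in all_nodes:
    # Each assignment replaces the previous value stored under the same key,
    # so only the last-iterated node id survives.
    global_label_to_nid[n["label"].lower()] = n["id"]

print(global_label_to_nid)  # {'.get()': 'carts_get'}
```

Three candidates go in and one comes out, with nothing in the table to indicate that a collision ever happened.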

Observed on a real 5-subsystem monorepo (997 files):

  • .get(): 32 unique nodes, but only 1 received all 321 INFERRED incoming edges
  • Similar collapse for .all(), .delete(), .create(), .update()

Approach

Three layered commits (pickable individually):

  1. refactor — bucket the lookup: dict[str, list[str]] preserving all candidates, with per-bucket dedup and consumption-site invariant assert.
  2. fix — on multi-candidate resolution, emit one AMBIGUOUS edge per candidate (confidence=AMBIGUOUS, confidence_score=0.2, ambiguity_degree=N). Single-candidate paths remain INFERRED@0.8 unchanged.
  3. feat — add a fan-out cap (max_ambiguity_fanout, default 20, override via kwarg or GRAPHIFY_MAX_AMBIGUITY_FANOUT). Above the cap, emit no edges and record to cross_file_call_stats (propagated into graph.json by build/export/watch).

The cap exists because generic verbs (.get() with 30+ candidates) produce fan-outs that the AST alone cannot disambiguate, so the resulting edges would be semantic noise. The default of 20 is conservative; fleet users can raise it via the env var without a code change.

Evidence

Tests: 441 pass, including the 8 new collision tests; all existing extract/confidence/pipeline/etc. tests are green. No regressions.

Real monorepo validation (TBA-backend, 997 files):

| Metric | Before (v0.4.16) | After (this PR) |
|---|---|---|
| Total edges | 10,214 | 10,529 (+3.1%) |
| INFERRED calls | 2,580 | 1,230 (collisions correctly reclassified) |
| AMBIGUOUS calls | 0 | 1,426 |
| truncated_high_degree | n/a | 1,079 (examples: update, all, get, create, delete) |
| Communities | 474 | 449 (-5.3%, no collapse) |
| Top god nodes | stable | stable |

Without the cap, TBA-backend exploded to 37,166 edges (+264%) — this was the motivation for commit 3.
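The percentages quoted above can be checked directly from the raw counts. This is only an arithmetic sanity check; `pct_change` is a helper introduced here for illustration.

```python
def pct_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100

print(f"{pct_change(10_214, 10_529):+.1f}%")  # total edges with the cap: +3.1%
print(f"{pct_change(10_214, 37_166):+.0f}%")  # total edges without the cap: +264%
print(f"{pct_change(474, 449):+.1f}%")        # communities: -5.3%
```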

Notes

  • ambiguity_degree field is additive — consumers that don't read it are unaffected.
  • cross_file_call_stats is optional and present in graph.json only when cross-file resolution runs.
  • Default cap can be raised for fleet users (monorepo-heavy codebases) via GRAPHIFY_MAX_AMBIGUITY_FANOUT=40 graphify update . without patching.

Testing

pytest tests/test_cross_file_collision.py -v   # 8/8 new
pytest tests/                                   # 441/441

Happy to split into multiple PRs if preferred (each commit stands alone).


🤖 Authored with assistance from Claude Opus 4.6

The cross-file call resolution was using dict[str, str], causing
dict-overwrite: when N nodes shared the same normalised label, only
the last-iterated nid survived in the lookup table, silently dropping
N-1 valid candidates.

This commit changes the lookup table to dict[str, list[str]] with
per-bucket uniqueness guaranteed via a seen-set, and adds
list(dict.fromkeys(...)) dedup at consumption site as a belt-and-braces
invariant. Resolution behaviour is kept equivalent for now — we still
pick candidates[0] as the target — preparing the ground for subsequent
commits to emit proper AMBIGUOUS edges.

No behaviour change in this commit. All existing tests pass.
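A minimal sketch of the bucketed lookup described in this commit message, assuming a hypothetical node shape (`id`, `label`) and reducing normalisation to `.lower()`. The function names are illustrative and not taken from the codebase.

```python
from collections import defaultdict


def build_label_buckets(nodes: list[dict]) -> dict[str, list[str]]:
    """Map each normalised label to every node id that carries it."""
    buckets: dict[str, list[str]] = defaultdict(list)
    seen: dict[str, set[str]] = defaultdict(set)  # per-bucket uniqueness guard
    for n in nodes:
        key = n["label"].lower()
        if n["id"] not in seen[key]:
            seen[key].add(n["id"])
            buckets[key].append(n["id"])
    return dict(buckets)


def candidates_for(buckets: dict[str, list[str]], label: str) -> list[str]:
    """Order-preserving dedup at the consumption site (belt and braces)."""
    return list(dict.fromkeys(buckets.get(label.lower(), [])))


nodes = [
    {"id": "a", "label": ".get()"},
    {"id": "b", "label": ".Get()"},
    {"id": "a", "label": ".get()"},  # duplicate node id, kept only once
    {"id": "c", "label": ".save()"},
]
buckets = build_label_buckets(nodes)
print(candidates_for(buckets, ".get()"))  # ['a', 'b']
```

Because `dict.fromkeys` preserves insertion order, taking `candidates[0]` gives the first-seen node for each label in this sketch, and the remaining candidates stay available for later commits to use.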
…_degree

When cross-file call resolution finds multiple candidates for a callee
(e.g. `.get()` defined in 32 files), emitting a single edge to an
arbitrary winner is indistinguishable from dict-overwrite. This commit
fans out the edge to all candidates, marking each as AMBIGUOUS
(confidence_score=0.2) and recording ambiguity_degree = number of
candidates on each edge.

Single-candidate resolution remains INFERRED at 0.8 (unchanged).
Self-reference is filtered (caller is excluded from its own candidate list).
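The fan-out rule can be sketched as follows. The `confidence`, `confidence_score`, and `ambiguity_degree` fields and their values come from the description above; the `source`, `target`, and `relation` fields, the function name, and the choice to treat a single remaining candidate after self-filtering as INFERRED are assumptions made for illustration.

```python
def resolve_call(caller_id: str, callee_label: str,
                 buckets: dict[str, list[str]]) -> list[dict]:
    """Return the edges for one call site (hypothetical edge shape)."""
    raw = buckets.get(callee_label.lower(), [])
    # Dedup while preserving order, then drop the caller itself.
    candidates = [nid for nid in dict.fromkeys(raw) if nid != caller_id]

    if not candidates:
        return []
    if len(candidates) == 1:
        return [{
            "source": caller_id, "target": candidates[0], "relation": "calls",
            "confidence": "INFERRED", "confidence_score": 0.8,
        }]
    # Multiple candidates: one AMBIGUOUS edge per candidate.
    return [{
        "source": caller_id, "target": target, "relation": "calls",
        "confidence": "AMBIGUOUS", "confidence_score": 0.2,
        "ambiguity_degree": len(candidates),
    } for target in candidates]


buckets = {".get()": ["users_get", "orders_get", "carts_get"], ".save()": ["users_save"]}
edges = resolve_call("orders_get", ".get()", buckets)
print([(e["target"], e["ambiguity_degree"]) for e in edges])
# [('users_get', 2), ('carts_get', 2)]
```

In this sketch `ambiguity_degree` is computed after the self-reference filter, so it always equals the number of edges actually emitted for the call site.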

Real-world impact on a 5-subsystem monorepo (TBA-backend, 997 files):
- INFERRED calls: 2580 → 1230 (collisions correctly reclassified)
- AMBIGUOUS calls: 0 → N (exposes previously-hidden ambiguity)

Note: heavy collisions (CRUD verbs like `.get()` with 30+ candidates)
cause edge explosion. The next commit addresses this with a fan-out cap.

Labels with degree > 20 (typically generic verbs like `.get()`,
`.all()`, `.delete()` in multi-subsystem monorepos) produce N-way
AMBIGUOUS fanouts with no semantic value — the AST cannot
disambiguate them regardless. On TBA-backend this caused edge count
to jump 264% (10K → 37K).

This commit adds a configurable cap:
- extract() gains max_ambiguity_fanout kw-arg (default 20)
- env var GRAPHIFY_MAX_AMBIGUITY_FANOUT overrides
- When len(candidates) > cap → drop fan-out, record to stats
- Stats surface in cross_file_call_stats with:
    - resolved_single / resolved_ambiguous / truncated_high_degree
    - truncated_examples (first 5 dropped labels)
    - max_ambiguity_fanout (effective value)

build.py / export.py / watch.py propagate the stats into graph.json
so downstream tools can see what was truncated.
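A rough sketch of how the cap and the stats record could fit together. The stat keys, the default of 20, and the env var name are taken from the commit message; the helper names are hypothetical. The precedence shown (explicit kwarg, then env var, then default) and the choice to count call sites rather than labels are assumptions that the description does not settle.

```python
import os

DEFAULT_MAX_AMBIGUITY_FANOUT = 20


def effective_cap(max_ambiguity_fanout=None, environ=None) -> int:
    """Resolve the cap. Assumed precedence: kwarg, then env var, then default."""
    env = os.environ if environ is None else environ
    if max_ambiguity_fanout is not None:
        return max_ambiguity_fanout
    raw = env.get("GRAPHIFY_MAX_AMBIGUITY_FANOUT")
    return int(raw) if raw else DEFAULT_MAX_AMBIGUITY_FANOUT


def new_stats(cap: int) -> dict:
    return {
        "resolved_single": 0,
        "resolved_ambiguous": 0,
        "truncated_high_degree": 0,
        "truncated_examples": [],  # first 5 dropped labels
        "max_ambiguity_fanout": cap,
    }


def record(stats: dict, label: str, n_candidates: int) -> bool:
    """Update stats for one call site; return True if edges should be emitted."""
    if n_candidates == 0:
        return False
    if n_candidates > stats["max_ambiguity_fanout"]:
        stats["truncated_high_degree"] += 1
        examples = stats["truncated_examples"]
        if label not in examples and len(examples) < 5:
            examples.append(label)
        return False
    stats["resolved_single" if n_candidates == 1 else "resolved_ambiguous"] += 1
    return True


stats = new_stats(effective_cap(environ={}))
print(record(stats, ".get()", 32), stats["truncated_examples"])  # False ['.get()']
```

Returning a boolean keeps the decision to emit edges separate from the bookkeeping, so the same `stats` dictionary could be serialised into `graph.json` unchanged.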

Real-world impact on TBA-backend:
- Edges: 37166 (+264%) → 10529 (+3.1%)
- truncated_high_degree: 1079 (examples: update, all, get, create, delete)
- All other consumers (cluster, god_nodes, report) behave normally.

Adds 8 tests covering:
- Single candidate → INFERRED 0.8 (unchanged behaviour)
- N candidates → N AMBIGUOUS edges with ambiguity_degree=N
- Self-reference filter correctness
- Default cap (20) drops high-degree fanouts and records to stats
- Cap override via kw-arg
- Cap override via GRAPHIFY_MAX_AMBIGUITY_FANOUT env var
- ambiguity_degree always matches actual fan-out count (invariant)
- Unique targets within each call-site's fan-out

Fixtures (tests/fixtures/collision/*.py) provide minimal Python
programs that exercise each case.
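The last two test cases in the list (degree matches fan-out, unique targets per call site) amount to an invariant over the emitted edges. One way to express that check is sketched below, assuming a hypothetical edge shape in which each edge records its `source`, `target`, and a `callee_label` identifying the call site; the actual tests in `tests/test_cross_file_collision.py` may be structured differently.

```python
from collections import defaultdict


def check_fanout_invariants(edges: list[dict]) -> int:
    """Assert per-call-site invariants; return the number of sites checked."""
    by_site: dict[tuple, list[dict]] = defaultdict(list)
    for e in edges:
        if e["confidence"] == "AMBIGUOUS":
            by_site[(e["source"], e["callee_label"])].append(e)

    for site, group in by_site.items():
        targets = [e["target"] for e in group]
        # Each candidate appears at most once per call site.
        assert len(targets) == len(set(targets)), f"duplicate targets at {site}"
        # The recorded degree equals the number of edges actually emitted.
        for e in group:
            assert e["ambiguity_degree"] == len(group), f"degree mismatch at {site}"
    return len(by_site)


sample = [
    {"source": "main", "callee_label": ".get()", "target": "users_get",
     "confidence": "AMBIGUOUS", "ambiguity_degree": 2},
    {"source": "main", "callee_label": ".get()", "target": "orders_get",
     "confidence": "AMBIGUOUS", "ambiguity_degree": 2},
]
print(check_fanout_invariants(sample))  # 1
```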