fix(incremental): prune edges by source-only to preserve inbound edges from unchanged files #392

Open
tirth8205 wants to merge 1 commit into Egonex-AI:main from tirth8205:fix/incremental-prune-by-source-only

@tirth8205
Contributor

Summary

Phase 2 incremental updates previously pruned existing edges whose source or target referenced a removed node. When file F was modified, F was re-analyzed and its outbound edges were regenerated, but edges into F from an unchanged file U (e.g. U imports F, U calls F, U tests F) were silently dropped, because U is never re-analyzed and therefore never regenerates them. As a result, every incremental run eroded the inbound edges of changed files. In one observed 64-file incremental run, 194 inbound edges would have been lost.

This PR addresses Bug 2 of 3 from #366. Bugs 1 (rename detection) and 3 (importMap recovery) are tracked in separate PRs.

The fix

Old rule: prune edges whose source OR target was a removed node.
New rule: keep an existing edge iff its source node survives the node prune (i.e. prune by source only).
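To make the new rule concrete, here is a minimal sketch of a source-only prune. It is not the code from this PR: the function name `prune_by_source_sketch` and the `filePath` field used to map a node to its owning file are assumptions made for illustration, and the real `prune_existing_graph` helper may identify removed nodes differently.

```python
from copy import deepcopy


def prune_by_source_sketch(existing, changed_files):
    """Return a pruned copy of `existing`; the input graph is not mutated."""
    changed = set(changed_files)
    pruned = deepcopy(existing)

    # Assumption: every node records its owning file under "filePath".
    pruned["nodes"] = [
        n for n in pruned.get("nodes", []) if n.get("filePath") not in changed
    ]
    surviving_ids = {n["id"] for n in pruned["nodes"]}

    # New rule: keep an edge iff its source survives. The target is
    # deliberately not checked, so inbound edges into changed files stay.
    pruned["edges"] = [
        e for e in pruned.get("edges", []) if e["source"] in surviving_ids
    ]
    return pruned


# The fixture described in the issue: F1 imports F2, F3 imports F2, F2 changed.
graph = {
    "projectName": "demo",
    "nodes": [
        {"id": "file:F1", "filePath": "F1"},
        {"id": "file:F2", "filePath": "F2"},
        {"id": "file:F3", "filePath": "F3"},
        {"id": "function:F2:helper", "filePath": "F2"},
    ],
    "edges": [
        {"source": "file:F1", "target": "file:F2", "type": "imports"},
        {"source": "file:F3", "target": "file:F2", "type": "imports"},
        {"source": "file:F2", "target": "file:F3", "type": "imports"},
    ],
}

pruned = prune_by_source_sketch(graph, ["F2"])
print([(e["source"], e["target"]) for e in pruned["edges"]])
# prints [('file:F1', 'file:F2'), ('file:F3', 'file:F2')]
```

Under the old source-or-target rule, all three edges in this fixture would have been dropped, because each one touches `file:F2`.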

Why this is safe

  • Inbound file-level edges (U imports F where U is unchanged, F is changed) survive prune. After merge, the fresh batch re-emits file:F, so the target resolves and the edge is dedup'd by (source, target, type, direction). Result: inbound edge preserved.
  • Inbound function-level edges into a since-deleted function inside F (function:U:caller → function:F:removedFn) briefly survive prune. The merge script's existing dangling-edge sweep (Step 6 in merge-batch-graphs.py) drops them once function:F:removedFn does not reappear in the fresh batch. Result: stale edges still cleaned up.
  • Outbound edges from F are pruned at this step: F's nodes are removed, so each outbound edge loses its source. The fresh F batch re-emits them.
  • Overlapping edges (an outbound edge U → F regenerated by U on a future run) hit the merge's dedup on (source, target, type, direction), which keeps the copy with the heavier weight.

Trade-off: over-keeping is the safe side. The dangling sweep cleans up edges that were kept unnecessarily, whereas an inbound edge from an unchanged source that is wrongly deleted cannot be recovered until the next full re-analysis.
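The safety argument above relies on two merge behaviours: dedup on (source, target, type, direction) and a dangling-edge sweep. The sketch below illustrates both under stated assumptions. It is not the logic in merge-batch-graphs.py: `merge_sketch` is a hypothetical name, the `weight` field and its default of 0 are assumed, and the overlapping U → F edge in the fresh batch is contrived to exercise the dedup.

```python
def merge_sketch(pruned, fresh):
    """Merge a pruned existing graph with a fresh batch (illustrative only)."""
    # Nodes: on an id clash, the fresh batch's version replaces the old one.
    nodes = {}
    for n in pruned["nodes"] + fresh["nodes"]:
        nodes[n["id"]] = n

    # Dedup edges on (source, target, type, direction); keep the heavier copy.
    best = {}
    for e in pruned["edges"] + fresh["edges"]:
        key = (e["source"], e["target"], e["type"], e.get("direction"))
        if key not in best or e.get("weight", 0) > best[key].get("weight", 0):
            best[key] = e

    # Dangling sweep: drop any edge whose endpoint is absent after the merge.
    edges = [
        e for e in best.values()
        if e["source"] in nodes and e["target"] in nodes
    ]
    return {"nodes": list(nodes.values()), "edges": edges}


# Existing graph after a source-only prune of changed file F.
pruned = {
    "nodes": [{"id": "file:U"}, {"id": "function:U:caller"}],
    "edges": [
        {"source": "file:U", "target": "file:F", "type": "imports", "weight": 0.5},
        {"source": "function:U:caller", "target": "function:F:removedFn",
         "type": "calls", "weight": 0.5},
    ],
}
# Fresh batch for F: removedFn no longer exists.
fresh = {
    "nodes": [{"id": "file:F"}, {"id": "function:F:keptFn"}],
    "edges": [
        {"source": "file:U", "target": "file:F", "type": "imports", "weight": 0.9},
    ],
}

merged = merge_sketch(pruned, fresh)
print([(e["source"], e["target"], e["weight"]) for e in merged["edges"]])
# prints [('file:U', 'file:F', 0.9)]
```

The inbound file-level edge survives because `file:F` reappears in the fresh batch, while the edge into `function:F:removedFn` is swept because its target never comes back.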

Changes

  1. understand-anything-plugin/skills/understand/SKILL.md — Phase 2 "Incremental update path", step 2 rewritten with the new rule and a safety-analysis paragraph.
  2. understand-anything-plugin/hooks/auto-update-prompt.md — same rule in the auto-update commit hook.
  3. understand-anything-plugin/skills/understand/merge-batch-graphs.py — new prune_existing_graph(existing, changed_files) helper so the prune is programmatically testable and agents can call it directly. Does not mutate input. Preserves non-node/edge top-level fields (projectName, frameworks, layers, etc.).
  4. tests/skill/understand/test_merge_batch_graphs.py — 8 new regression tests in PruneExistingGraphTests:
    • test_inbound_edges_from_unchanged_sources_survive — the issue's fixture (F1 imports F2, F3 imports F2, only F2 changed → F1→F2 and F3→F2 survive).
    • test_outbound_edges_from_removed_nodes_are_dropped
    • test_function_level_nodes_inside_changed_file_are_removed
    • test_no_changed_files_is_noop
    • test_unrelated_top_level_fields_preserved
    • test_does_not_mutate_input
    • test_merge_dangling_sweep_drops_truly_removed_function_target — end-to-end: inbound edge into a deleted function survives prune, then the dangling sweep drops it after merge.
    • test_merge_preserves_inbound_file_edge_after_prune — end-to-end happy path through prune + merge.

Test plan

  • python3 -m unittest tests.skill.understand.test_merge_batch_graphs -v — 77/77 pass (8 new + 69 existing).
  • pnpm test — 200/200 pass.
  • pnpm lint — clean.
  • pnpm --filter @understand-anything/core test — 670/670 pass.

Refs #366

🤖 Generated with Claude Code

…s from unchanged files (Egonex-AI#366)

Phase 2 incremental previously pruned existing edges whose `source` OR
`target` referenced a removed node. When file F was modified, edges
INTO F from an unchanged file U (e.g. `U imports F`, `U calls F`,
`U tests F`) were silently dropped because U is never re-analyzed and
therefore never regenerates them. In one observed 64-file incremental
run, 194 inbound edges would have been lost.

New rule: keep an existing edge iff its `source` node survives the
prune. Targets may briefly reference removed nodes — the merge script's
existing dangling-edge sweep (Step 6) cleans them up once the fresh
batch lands. Over-keeping is the safe trade-off: dangling sweep handles
edge cases, but a deleted inbound edge from an unchanged source can
never be recovered until the next full re-analysis.

Adds `prune_existing_graph(existing, changed_files)` helper to
`merge-batch-graphs.py` so the rule is programmatically testable, and
8 regression tests covering the fixture from the issue (F1 imports F2,
F3 imports F2, F2 modified — both inbound edges survive) plus the
dangling-sweep safety net for stale function-level targets.

Bug 2 of 3 from Egonex-AI#366; Bugs 1 (rename detection) and 3 (importMap
recovery) tracked in separate PRs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>