fix(incremental): prune edges by source-only to preserve inbound edges from unchanged files #392
Open
tirth8205 wants to merge 1 commit into
…s from unchanged files (Egonex-AI#366)

Phase 2 incremental previously pruned existing edges whose `source` OR `target` referenced a removed node. When file F was modified, edges INTO F from an unchanged file U (e.g. `U imports F`, `U calls F`, `U tests F`) were silently dropped because U is never re-analyzed and therefore never regenerates them. In one observed 64-file incremental run, 194 inbound edges would have been lost.

New rule: keep an existing edge iff its `source` node survives the prune. Targets may briefly reference removed nodes; the merge script's existing dangling-edge sweep (Step 6) cleans them up once the fresh batch lands. Over-keeping is the safe trade-off: dangling sweep handles edge cases, but a deleted inbound edge from an unchanged source can never be recovered until the next full re-analysis.

Adds `prune_existing_graph(existing, changed_files)` helper to `merge-batch-graphs.py` so the rule is programmatically testable, and 8 regression tests covering the fixture from the issue (F1 imports F2, F3 imports F2, F2 modified: both inbound edges survive) plus the dangling-sweep safety net for stale function-level targets.

Bug 2 of 3 from Egonex-AI#366; Bugs 1 (rename detection) and 3 (importMap recovery) tracked in separate PRs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Phase 2 incremental updates previously pruned existing edges whose `source` or `target` referenced a removed node. When file F was modified, F was re-analyzed and regenerated its outbound edges, but edges into F from an unchanged file U (e.g. `U imports F`, `U calls F`, `U tests F`) were silently dropped, because U is never re-analyzed and therefore never regenerates them. Every incremental run silently eroded the inbound edges of changed files. In one observed 64-file incremental run, 194 inbound edges would have been lost.

This PR addresses Bug 2 of 3 from #366. Bugs 1 (rename detection) and 3 (importMap recovery) are tracked in separate PRs.
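To make the failure concrete, here is a minimal sketch of the old either-end prune applied to the fixture from the issue (F1 imports F2, F3 imports F2, only F2 changed). The graph shape is an assumption for illustration: nodes with `id` and `filePath`, edges with `source`, `target` and `type`. The function names are hypothetical and are not taken from `merge-batch-graphs.py`.

```python
# Assumed graph shape (hypothetical): nodes carry "id" and "filePath",
# edges carry "source", "target" and "type".

def removed_node_ids(graph, changed_files):
    """IDs of all nodes that belong to a changed file."""
    return {n["id"] for n in graph["nodes"] if n.get("filePath") in changed_files}


def prune_edges_either_end(graph, changed_files):
    """Old rule: drop an edge if its source OR its target is a removed node."""
    removed = removed_node_ids(graph, changed_files)
    return [
        e for e in graph["edges"]
        if e["source"] not in removed and e["target"] not in removed
    ]


graph = {
    "nodes": [
        {"id": "file:F1", "filePath": "F1"},
        {"id": "file:F2", "filePath": "F2"},
        {"id": "file:F3", "filePath": "F3"},
    ],
    "edges": [
        {"source": "file:F1", "target": "file:F2", "type": "imports"},
        {"source": "file:F3", "target": "file:F2", "type": "imports"},
        {"source": "file:F2", "target": "file:F3", "type": "imports"},
    ],
}

# Only F2 changed. F1 and F3 are never re-analyzed, so nothing will
# re-emit their imports of F2 once the old rule has dropped them.
old_kept = prune_edges_either_end(graph, {"F2"})
```

Under these assumptions `old_kept` is empty: the outbound edge from F2 will come back with the fresh F2 batch, but the two inbound edges are gone until the next full re-analysis.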
The fix
Old rule: prune edges whose `source` OR `target` was a removed node.

New rule: keep an existing edge iff its `source` node survives the node prune (i.e. prune by `source` only).

Why this is safe
- Inbound edges into a changed file (`U imports F`, where U is unchanged and F is changed) survive prune. After merge, the fresh batch re-emits `file:F`, so the target resolves and the edge is dedup'd by `(source, target, type, direction)`. Result: inbound edge preserved.
- Inbound edges into a removed function (`function:U:caller → function:F:removedFn`) briefly survive prune. The merge script's existing dangling-edge sweep (Step 6 in `merge-batch-graphs.py`) drops them once `function:F:removedFn` does not reappear in the fresh batch. Result: stale edges still cleaned up.
- Outbound edges from F are pruned (`source` is gone). The fresh F batch re-emits them.
- Duplicate edges: the merge dedups by `(source, target, type, direction)` and picks the heavier weight.

Trade-off: over-keeping. This is the safe direction: the dangling sweep handles edge cases, whereas a deleted inbound edge from an unchanged source cannot be recovered until the next full re-analysis.
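The safety argument can be sketched end to end as follows. This is an illustration under assumptions, not the code in `merge-batch-graphs.py`: the node field `filePath`, the default weight of 0, and the function names `prune_by_source` and `merge_with_sweep` are all hypothetical. Only the source-only rule, the dangling sweep, and the `(source, target, type, direction)` dedup key come from the description.

```python
import copy


def prune_by_source(existing, changed_files):
    """New rule: remove nodes of changed files, keep edges whose source survives."""
    removed = {n["id"] for n in existing["nodes"] if n.get("filePath") in changed_files}
    pruned = copy.deepcopy(existing)  # leave the caller's graph untouched
    pruned["nodes"] = [n for n in pruned["nodes"] if n["id"] not in removed]
    pruned["edges"] = [e for e in pruned["edges"] if e["source"] not in removed]
    return pruned


def merge_with_sweep(pruned, fresh):
    """Merge a fresh batch, drop dangling edges, dedup keeping the heavier weight."""
    nodes = {n["id"]: n for n in pruned["nodes"]}
    nodes.update({n["id"]: n for n in fresh["nodes"]})
    best = {}
    for e in pruned["edges"] + fresh["edges"]:
        if e["source"] not in nodes or e["target"] not in nodes:
            continue  # dangling-edge sweep: an endpoint did not reappear
        key = (e["source"], e["target"], e["type"], e.get("direction"))
        if key not in best or e.get("weight", 0) > best[key].get("weight", 0):
            best[key] = e
    return {**pruned, "nodes": list(nodes.values()), "edges": list(best.values())}


existing = {
    "nodes": [
        {"id": "file:U", "filePath": "U"},
        {"id": "function:U:caller", "filePath": "U"},
        {"id": "file:F", "filePath": "F"},
        {"id": "function:F:removedFn", "filePath": "F"},
    ],
    "edges": [
        {"source": "file:U", "target": "file:F", "type": "imports"},
        {"source": "function:U:caller", "target": "function:F:removedFn", "type": "calls"},
        {"source": "file:F", "target": "file:U", "type": "imports"},
    ],
}

# Fresh batch for the re-analyzed file F: removedFn no longer exists.
fresh = {
    "nodes": [
        {"id": "file:F", "filePath": "F"},
        {"id": "function:F:newFn", "filePath": "F"},
    ],
    "edges": [
        {"source": "file:F", "target": "file:U", "type": "imports"},
        {"source": "file:F", "target": "function:F:newFn", "type": "contains"},
    ],
}

pruned = prune_by_source(existing, {"F"})
merged = merge_with_sweep(pruned, fresh)
```

In this sketch the inbound `file:U → file:F` edge survives, the stale `function:U:caller → function:F:removedFn` edge outlives the prune but is swept after the merge, and F's outbound edges come back only because the fresh batch re-emits them.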
Changes
- `understand-anything-plugin/skills/understand/SKILL.md`: Phase 2 "Incremental update path", step 2 rewritten with the new rule and a safety-analysis paragraph.
- `understand-anything-plugin/hooks/auto-update-prompt.md`: same rule in the auto-update commit hook.
- `understand-anything-plugin/skills/understand/merge-batch-graphs.py`: new `prune_existing_graph(existing, changed_files)` helper so the prune is programmatically testable and agents can call it directly. Does not mutate input. Preserves non-node/edge top-level fields (`projectName`, `frameworks`, `layers`, etc.).
- `tests/skill/understand/test_merge_batch_graphs.py`: 8 new regression tests in `PruneExistingGraphTests`:
  - `test_inbound_edges_from_unchanged_sources_survive`: the issue's fixture (F1 imports F2, F3 imports F2, only F2 changed → F1→F2 and F3→F2 survive).
  - `test_outbound_edges_from_removed_nodes_are_dropped`
  - `test_function_level_nodes_inside_changed_file_are_removed`
  - `test_no_changed_files_is_noop`
  - `test_unrelated_top_level_fields_preserved`
  - `test_does_not_mutate_input`
  - `test_merge_dangling_sweep_drops_truly_removed_function_target`: end-to-end: inbound edge into a deleted function survives prune, then the dangling sweep drops it after merge.
  - `test_merge_preserves_inbound_file_edge_after_prune`: end-to-end happy path through prune + merge.

Test plan
- `python3 -m unittest tests.skill.understand.test_merge_batch_graphs -v`: 77/77 pass (8 new + 69 existing).
- `pnpm test`: 200/200 pass.
- `pnpm lint`: clean.
- `pnpm --filter @understand-anything/core test`: 670/670 pass.

Refs #366
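For reviewers who want to see what the non-mutation, field-preservation and no-op properties amount to, here is a small stand-in. `prune_existing_graph_sketch` is a hypothetical reimplementation written for this example, not the helper added in this PR, and the `filePath` node field is an assumption.

```python
import copy


def prune_existing_graph_sketch(existing, changed_files):
    """Hypothetical stand-in for the helper: source-only prune on a copy."""
    changed = set(changed_files)
    nodes = existing.get("nodes", [])
    edges = existing.get("edges", [])
    removed = {n["id"] for n in nodes if n.get("filePath") in changed}
    # Copy every top-level field other than nodes/edges unchanged.
    pruned = {
        key: copy.deepcopy(value)
        for key, value in existing.items()
        if key not in ("nodes", "edges")
    }
    pruned["nodes"] = [copy.deepcopy(n) for n in nodes if n["id"] not in removed]
    pruned["edges"] = [copy.deepcopy(e) for e in edges if e["source"] not in removed]
    return pruned


existing = {
    "projectName": "demo",
    "frameworks": ["example-framework"],
    "layers": [],
    "nodes": [
        {"id": "file:F1", "filePath": "F1"},
        {"id": "file:F2", "filePath": "F2"},
    ],
    "edges": [
        {"source": "file:F1", "target": "file:F2", "type": "imports"},
        {"source": "file:F2", "target": "file:F1", "type": "imports"},
    ],
}

snapshot = copy.deepcopy(existing)
pruned = prune_existing_graph_sketch(existing, ["F2"])
unchanged = prune_existing_graph_sketch(existing, [])

# The input is not mutated, metadata is carried over, and an empty
# change set leaves the graph as it was.
assert existing == snapshot
assert pruned["projectName"] == "demo"
assert unchanged == existing
```

Building the result from copies rather than editing `existing` in place is what lets a caller compare the graph before and after the prune.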
🤖 Generated with Claude Code