Summary
Three independent edge-fidelity bugs in the /understand incremental update path. Each silently erodes graph correctness on incremental runs (only stderr hints, exit 0). Discovered while dogfooding understand-anything 2.7.4 on a ~1,150-file Python/FastAPI + React/TS repo; I then verified all three are still present on current main (skills/understand/SKILL.md and skills/understand/merge-batch-graphs.py, around commit 025b884).
Cross-referencing to show these are distinct from known issues: #292 is the batch-existing.json filename-regex drop; #293 (closed) is scan-result.json cleanup; #302 is the tested_by path-convention linker. The three below are separate root causes.
Bug 1 — Renamed/moved files leave orphaned old-path nodes
Root cause. Phase 0 and Phase 2 (incremental) build the changed-file list with:
git diff <lastCommitHash>..HEAD --name-only
With git's default rename detection, a rename is reported as only the new path (a single line), not both. The prune step ("Remove old nodes whose filePath matches any changed file") therefore never sees the old path, so the pre-rename node — and every edge touching it — survives as an orphan pointing at a file that no longer exists.
Repro. Baseline-scan a repo, git mv docs/a.md docs/b.md, commit, run /understand (incremental). The graph keeps a stale document:docs/a.md node (plus its edges) in addition to the new document:docs/b.md.
Observed. 7 docs moved drafts/ → completed/ in one incremental; --name-only listed only the 7 new paths, so the 7 old-path nodes would have persisted unless pruned by hand.
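Fix. Use git diff <base>..HEAD --name-status -M, which reports a rename as a single R line carrying both the old and the new path, or pass --no-renames, which reports the rename as a separate delete and add. Either way, add both old and new paths to the changed/prune set: prune the old node, analyze the new path.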
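A minimal sketch of how the changed/prune set could be built from --name-status output. The helper names (parse_name_status, git_changed_paths) are hypothetical and not taken from the skill; the sketch also assumes paths contain no tabs or other characters that git would quote (use -z and a NUL-aware parser if they might).

```python
import subprocess


def parse_name_status(diff_output: str) -> tuple[set[str], set[str]]:
    """Split `git diff --name-status -M` output into (prune, analyze) path sets.

    prune   : every path whose existing node should be removed from the graph
    analyze : every path that still exists and needs (re-)analysis
    """
    prune: set[str] = set()
    analyze: set[str] = set()
    for line in diff_output.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        kind = fields[0][0]  # status letter; R/C are followed by a similarity score
        if kind in ("R", "C"):
            old_path, new_path = fields[1], fields[2]
            if kind == "R":
                prune.add(old_path)  # the pre-rename node must go
            prune.add(new_path)
            analyze.add(new_path)
        elif kind == "D":
            prune.add(fields[1])  # deleted: prune only
        else:
            prune.add(fields[1])  # A, M, T, ...: prune stale node, re-analyze
            analyze.add(fields[1])
    return prune, analyze


def git_changed_paths(base: str) -> tuple[set[str], set[str]]:
    """Run git and parse the result (hypothetical wrapper, assumes cwd is the repo)."""
    result = subprocess.run(
        ["git", "diff", "--name-status", "-M", f"{base}..HEAD"],
        capture_output=True, text=True, check=True,
    )
    return parse_name_status(result.stdout)
```

With this split, a renamed document contributes its old path to the prune set only and its new path to both sets, which is the behavior the fix asks for.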
Bug 2 — Naive edge prune deletes inbound edges from unchanged files (never regenerated)
Root cause. Phase 2 incremental, step 2 (SKILL.md ~L380):
Remove old edges whose source or target references a removed node
When file F is modified, it's re-analyzed (regenerating its outbound edges). But edges into F from an unchanged file U (e.g. U imports/calls/tests F) are deleted because their target was removed — and U is never re-analyzed, so those edges are never regenerated. Every incremental run silently erodes the inbound edges of changed files.
Observed. In one 64-file incremental, 194 inbound edges would have been lost under the source-OR-target rule.
Fix. Prune by source only: keep an existing edge iff its source node survives. Re-analysis regenerates all outbound edges of changed files, while inbound edges from unchanged sources are preserved. The merge's existing dedup ((source,target,type)) and dangling-edge drop safely absorb any overlap or now-invalid function-level targets. (I ran exactly this rule locally — preserved the 194 edges with zero dangling/duplicate fallout.)
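The two prune rules can be compared on a toy graph. This is only a sketch: edges are assumed to be dicts with source, target and type keys (matching the dedup key quoted above), and merge_edges is a hypothetical stand-in for the merge's existing dedup and dangling-edge drop, not the real implementation.

```python
def prune_source_or_target(edges, removed_ids):
    """Rule as currently described: drop an edge if either endpoint was removed."""
    return [e for e in edges
            if e["source"] not in removed_ids and e["target"] not in removed_ids]


def prune_source_only(edges, removed_ids):
    """Proposed rule: keep an edge iff its source node survives."""
    return [e for e in edges if e["source"] not in removed_ids]


def merge_edges(kept, regenerated, live_node_ids):
    """Stand-in for the merge step: dedup on (source, target, type), drop dangling."""
    seen = set()
    merged = []
    for edge in kept + regenerated:
        key = (edge["source"], edge["target"], edge["type"])
        if key in seen:
            continue  # duplicate of an edge already kept
        if edge["source"] not in live_node_ids or edge["target"] not in live_node_ids:
            continue  # dangling: one endpoint no longer exists
        seen.add(key)
        merged.append(edge)
    return merged


# U is unchanged and imports F; F is modified and re-analyzed.
existing = [
    {"source": "file:U", "target": "file:F", "type": "imports"},
    {"source": "file:F", "target": "file:G", "type": "calls"},
]
removed = {"file:F"}  # F's old node is removed before re-analysis
regenerated = [{"source": "file:F", "target": "file:G", "type": "calls"}]
live = {"file:U", "file:F", "file:G"}

old_result = merge_edges(prune_source_or_target(existing, removed), regenerated, live)
new_result = merge_edges(prune_source_only(existing, removed), regenerated, live)
```

Under the source-or-target rule the U-to-F import disappears and nothing regenerates it; under the source-only rule it survives, while F's outbound edge is replaced by the regenerated one.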
Bug 3 — importMap recovery only matches file:-typed nodes, dropping edges to config/doc/table nodes
Root cause. recover_imports_from_scan() in merge-batch-graphs.py (main, ~L939-966):
file_node_ids = set()
for node in assembled["nodes"]:
    if node.get("type") == "file":  # <-- only type == "file"
        file_node_ids.add(node.get("id", ""))
...
src_id = f"file:{src_path}"  # <-- hardcoded file: prefix
...
tgt_id = f"file:{tgt_path}"  # <-- hardcoded file: prefix
if tgt_id not in file_node_ids:
    skipped_no_tgt_node += 1
    continue
A source file the scanner classifies as config/docs/script/etc. gets a non-file: node — e.g. a settings module literally named config.py becomes config:src/demo/config.py. The recovery never finds it as a source or a target (its synthesized file:… id isn't in file_node_ids), so every import edge into/out of it is permanently dropped — surfaced only as Skipped N importMap target paths with no file: node.
Real example. src/demo/config.py (a config: node holding pydantic Settings) is imported by ~23 files; all 23 imports edges were skipped by recovery. (In my run the LLM assemble-reviewer recovered 2 of them by hand, but the deterministic pass should not have dropped them.)
Fix. Resolve source/target to the actual node id by file path across all file-level node types, not just file: nodes. For example:
FILE_LEVEL = {"file", "config", "document", "service", "table", "schema", "resource", "endpoint", "pipeline"}
path_to_id = {n["filePath"]: n["id"] for n in assembled["nodes"]
              if n.get("type") in FILE_LEVEL and n.get("filePath")}
src_id = path_to_id.get(src_path)  # skip only if truly absent
tgt_id = path_to_id.get(tgt_path)
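To show the path_to_id lookup end to end, here is a self-contained sketch run against a two-node toy graph. The function name recover_import_edges and the import_map shape (source path mapped to a list of target paths) are assumptions made for illustration; the real importMap structure in scan-result.json may differ.

```python
FILE_LEVEL = {"file", "config", "document", "service", "table",
              "schema", "resource", "endpoint", "pipeline"}


def recover_import_edges(nodes, import_map):
    """Resolve import paths to real node ids, whatever their type prefix.

    Returns (edges, skipped), where skipped counts imports whose source or
    target path has no file-level node at all.
    """
    path_to_id = {n["filePath"]: n["id"] for n in nodes
                  if n.get("type") in FILE_LEVEL and n.get("filePath")}
    edges = []
    skipped = 0
    for src_path, target_paths in import_map.items():
        src_id = path_to_id.get(src_path)
        if src_id is None:
            skipped += len(target_paths)  # source truly absent from the graph
            continue
        for tgt_path in target_paths:
            tgt_id = path_to_id.get(tgt_path)
            if tgt_id is None:
                skipped += 1  # target truly absent from the graph
                continue
            edges.append({"source": src_id, "target": tgt_id, "type": "imports"})
    return edges, skipped


nodes = [
    {"id": "file:src/demo/app.py", "type": "file", "filePath": "src/demo/app.py"},
    {"id": "config:src/demo/config.py", "type": "config", "filePath": "src/demo/config.py"},
]
import_map = {"src/demo/app.py": ["src/demo/config.py", "src/demo/missing.py"]}
edges, skipped = recover_import_edges(nodes, import_map)
```

Here the import of config.py resolves to the config: node instead of being skipped, and only the path with no node at all is counted as skipped.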
Environment
- Plugin: understand-anything 2.7.4 (installed); bugs re-verified against current main.
- Windows, Node 24, Python 3.13.
Happy to open a PR for any/all three (the source-only prune in Bug 2 and the prefix-resolution in Bug 3 are small, self-contained changes).