Incremental updates silently lose all unchanged nodes — merge-batch-graphs.py drops batch-existing.json #402

@maizoro87

Summary

On incremental /understand updates, the merged graph silently loses every unchanged node/edge — the result contains only the freshly re-analyzed (changed) files.

Version: 2.7.6 (Claude Code plugin)

Root cause

skills/understand/SKILL.md (Phase 2 — ANALYZE → Incremental update path) instructs:

Write the pruned existing nodes/edges as batch-existing.json in the intermediate directory … Run the same merge script — it will combine batch-existing.json with the fresh batch-*.json files.

But skills/understand/merge-batch-graphs.py discovers batch files with a numeric-only regex:

batch-(\d+)(?:-part-(\d+))?\.json

batch-existing.json does not match \d+, so the merge silently skips it, discarding all surviving (unchanged) nodes and edges. The SKILL.md itself even warns about this regex in the full path (re: fused names), but the incremental instruction violates it.
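
The mismatch can be checked in isolation. This is a minimal sketch that applies the regex quoted above to a few example file names; the use of `fullmatch` and the sample names are assumptions for illustration, not the actual discovery code in merge-batch-graphs.py.

```python
import re

# Pattern as quoted in this issue. How the script applies it (match vs.
# fullmatch, glob first, etc.) is an assumption here.
BATCH_RE = re.compile(r"batch-(\d+)(?:-part-(\d+))?\.json")

# Hypothetical contents of the intermediate directory on an incremental run.
names = [
    "batch-1.json",
    "batch-2-part-1.json",
    "batch-existing.json",
    "batch-900.json",
]

matched = [n for n in names if BATCH_RE.fullmatch(n)]
skipped = [n for n in names if not BATCH_RE.fullmatch(n)]

print("matched:", matched)
print("skipped:", skipped)
```

Under these assumptions only batch-existing.json ends up in `skipped`, which is consistent with the observed behavior: the file is never read, and no error is raised.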

Reproduction

  1. Run /understand (full) on a repo → graph has N nodes.
  2. Commit a change touching a few files.
  3. Run /understand again (incremental).
  4. Observe the node count collapse to only the changed files' nodes; all unchanged nodes are gone.

Evidence

Real run, 2026-06-05, v2.7.6, a ~250-node TS/React project:

  • As written (batch-existing.json): merge produced 94 / 164 (157 surviving nodes lost).
  • Renaming the pruned file to batch-900.json: merge produced the correct 251 / 536.

Impact

Silent data loss on the skill's headline incremental feature — and it's invisible (no error; the graph just shrinks).

Suggested fix (any one)

  1. merge-batch-graphs.py: widen the regex, e.g. batch-(\d+|existing)(?:-part-(\d+))?\.json and treat existing as a normal source.
  2. SKILL.md: instruct the incremental path to write the pruned graph as a high numeric index (e.g. batch-900.json) instead of batch-existing.json.
  3. Have the incremental path write batch-<maxBatchIndex+1>.json.

Option 1 is the most robust (keeps the documented name working).
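
A rough sketch of what option 1 could look like, assuming discovery works on a list of file names. The function name `discover_batches` and the ordering rule (existing first, then numeric index and part) are hypothetical and are not taken from merge-batch-graphs.py.

```python
import re

# Widened pattern from option 1: accepts a numeric index or the literal
# name "existing".
WIDE_RE = re.compile(r"batch-(\d+|existing)(?:-part-(\d+))?\.json")


def discover_batches(names):
    """Return matching batch file names in a stable merge order.

    Hypothetical helper: "existing" is sorted ahead of all numeric
    batches, so freshly analyzed batches are merged after the surviving
    nodes. The real script may need a different order.
    """
    found = []
    for name in names:
        m = WIDE_RE.fullmatch(name)
        if m is None:
            continue  # not a batch file, ignore it
        index, part = m.group(1), m.group(2)
        key = (
            -1 if index == "existing" else int(index),
            int(part) if part else 0,
        )
        found.append((key, name))
    return [name for _, name in sorted(found)]


ordered = discover_batches([
    "batch-2.json",
    "batch-existing.json",
    "batch-1-part-2.json",
    "batch-1-part-1.json",
    "notes.json",
])
print(ordered)
```

With this shape, batch-existing.json is picked up as a normal source and unrelated files are still ignored. Whether the pruned graph should be merged first or last depends on how the merge resolves duplicate node IDs, which this issue does not cover.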
