First: thanks for the fast turnaround on #1634 and #1640 — both verified working on our repo after upgrading to 0.9.6. Plain modules and Struct.new/Data.define constants now get real nodes (63 new Result-struct nodes here), constant-receiver edges land, and our Rails-truth sidecar's file-node fallback counter went 99 → 0 exactly as predicted. Roughly 1,000 edge-pairs our sidecar used to supply are now emitted natively. Great releases.
While re-measuring on 0.9.6 we found two new issues (filing separately; they may share a root cause in parallel extraction). This one is the more impactful of the two.
Symptom
In full-repo runs of graphify update ., a stable subset of Ruby files yields ZERO nodes — not even file nodes. In our repo it is exactly 5 of ~700 app/**/*.rb files, and the set is identical across consecutive runs.
Detection method:
covered = graph["nodes"].map { |n| n["source_file"] }.to_set
Dir["app/**/*.rb"].reject { |f| covered.include?(f) }
# => same 5 files, every run
Each file extracts perfectly in isolation
Copy any of the 5 files alone (or into a small fresh probe corpus) and run graphify update . there: full extraction — class/module node, method nodes, file node all present, stable across reruns.
Failure is batch-context-dependent, not content-dependent
- A 3-file probe corpus containing exactly three of the failing files: 1 extracted fine, 2 produced zero nodes.
- The same 2 files later extracted fine in a different 4-file corpus (original content, unmodified).
- Minimal probes of every construct the files use (
extend self, begin/rescue/retry, namespaced superclass class X < Devise::PasswordsController, def-less AR model body, endless methods) all extract fine.
- Stripping comments from two of the files made them extract in a batch where the originals didn't — but the originals also extract fine in other batches, so comment content is not the trigger either. The only consistent predictor is which other files are being processed alongside.
File shapes for reference (no shared decisive construct): one Devise-inheriting namespaced controller (~30 lines), one ActiveRecord model with a kwargs def self. upsert + begin/rescue/retry (~70 lines, comment-heavy), three extend self modules with long leading comment blocks.
Hypothesis
Parallel/batched extraction drops files at a batch boundary, or a worker exception mid-batch silently swallows the remainder of its batch. Possibly the same root as the community-assignment nondeterminism we're filing alongside this.
Impact
affected/explain are silently blind to and through the dropped files — in our case the dropped set includes hub modules that many services route through, so reverse-dependency queries under-report with no signal that anything is missing.
A cheap mitigation independent of the fix: warn (or count in output) when a scanned source file produces zero nodes.
Environment
- graphifyy 0.9.6 (uv tool install), macOS (Darwin 25.5)
- Rails repo: ~11,800 nodes / ~19,800 links total, ~5,600 Ruby nodes
- Reproduced on every full-repo run since upgrading; drop set byte-stable across runs
First: thanks for the fast turnaround on #1634 and #1640 — both verified working on our repo after upgrading to 0.9.6. Plain modules and
Struct.new/Data.defineconstants now get real nodes (63 newResult-struct nodes here), constant-receiver edges land, and our Rails-truth sidecar's file-node fallback counter went 99 → 0 exactly as predicted. Roughly 1,000 edge-pairs our sidecar used to supply are now emitted natively. Great releases.While re-measuring on 0.9.6 we found two new issues (filing separately; they may share a root cause in parallel extraction). This one is the more impactful of the two.
Symptom
In full-repo runs of
graphify update ., a stable subset of Ruby files yields ZERO nodes — not even file nodes. In our repo it is exactly 5 of ~700app/**/*.rbfiles, and the set is identical across consecutive runs.Detection method:
Each file extracts perfectly in isolation
Copy any of the 5 files alone (or into a small fresh probe corpus) and run
graphify update .there: full extraction — class/module node, method nodes, file node all present, stable across reruns.Failure is batch-context-dependent, not content-dependent
extend self,begin/rescue/retry, namespaced superclassclass X < Devise::PasswordsController, def-less AR model body, endless methods) all extract fine.File shapes for reference (no shared decisive construct): one Devise-inheriting namespaced controller (~30 lines), one ActiveRecord model with a kwargs
def self.upsert +begin/rescue/retry(~70 lines, comment-heavy), threeextend selfmodules with long leading comment blocks.Hypothesis
Parallel/batched extraction drops files at a batch boundary, or a worker exception mid-batch silently swallows the remainder of its batch. Possibly the same root as the community-assignment nondeterminism we're filing alongside this.
Impact
affected/explainare silently blind to and through the dropped files — in our case the dropped set includes hub modules that many services route through, so reverse-dependency queries under-report with no signal that anything is missing.A cheap mitigation independent of the fix: warn (or count in output) when a scanned source file produces zero nodes.
Environment