Bug
_extract_calls in code_review_graph/parser.py gates CALLS edge emission on enclosing_func being set:
    if call_name and enclosing_func:
        caller = self._qualify(enclosing_func, file_path, enclosing_class)
        ...
        edges.append(EdgeInfo(kind="CALLS", source=caller, target=target, ...))
So calls made from module scope — top-level script glue, CLI entrypoints, if __name__ == "__main__" blocks, and Jupyter/Databricks notebook cells — produce zero CALLS edges. Any function invoked only from those contexts is then flagged as dead by find_dead_code (which counts incoming CALLS edges as evidence of liveness).
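To see why a missing edge turns into a false positive, here is a minimal sketch of a liveness check that counts incoming CALLS edges. It is an illustration only: `find_dead`, the tuple edge shape, and the sample names are assumptions, not the project's `find_dead_code` implementation.

```python
def find_dead(functions, calls_edges):
    """Return functions with no incoming CALLS edge (hypothetical helper)."""
    called = {target for _source, target in calls_edges}
    return sorted(name for name in functions if name not in called)


functions = ["extract_data", "load_model"]

# Only the call made inside a function body survives the gate:
# extract_data() calls load_model().
gated_edges = [("extract_data", "load_model")]

# The same graph once the module-scope call to extract_data() is
# attributed to the file it appears in (file name is made up).
fixed_edges = gated_edges + [("apply_script.py", "extract_data")]

dead_before = find_dead(functions, gated_edges)
dead_after = find_dead(functions, fixed_edges)
```

Under these assumptions `extract_data` is reported as dead only while the module-scope edge is missing.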
Notebook impact (severe)
PR #69 added notebook parsing — node extraction and IMPORTS_FROM edges work — but every cell is module-scope by definition, so notebooks emit no CALLS edges at all. This makes the dead-code detector's notebook coverage vacuous: any function called only from notebooks looks orphaned.
Reproducer
Real-world: a Databricks notebook (production inference pipeline) that calls Predict.extract_data_from_sample_ids():
    # Direct parser call; bypasses any CLI/MCP layering
    >>> from pathlib import Path
    >>> from code_review_graph.parser import CodeParser
    >>> nodes, edges = CodeParser().parse_file(Path("ML_wpredict_apply_v1.0.ipynb"))
    >>> [e for e in edges if e.kind == "CALLS"]
    []
    >>> [e for e in edges if e.kind == "IMPORTS_FROM"]
    [<IMPORTS_FROM ... -> logging>, <IMPORTS_FROM ... -> sys>, <IMPORTS_FROM ... -> src.predict>]
The notebook contains predict_obj.extract_data_from_sample_ids(...) and similar calls in cells. Imports resolve correctly; calls are silently dropped.
refactor_tool(mode="dead_code") then flags extract_data_from_sample_ids and extract_data_from_files as dead — they're the entire reason the apply notebook exists.
Same shape reproduces with a plain .py file containing only a top-level helper() invocation, so this isn't notebook-specific — notebooks just suffer worst because they're 100% module-scope.
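The plain `.py` shape can be imitated without the project at all. The sketch below uses the standard-library `ast` module as a stand-in for the real parser and applies the same kind of gate; `GatedCallCollector` and the `dropped` list are hypothetical names introduced for illustration.

```python
import ast

SOURCE = '''
def helper():
    return 1

def wrapper():
    return helper()

helper()
'''


class GatedCallCollector(ast.NodeVisitor):
    """Toy collector: records a call only when an enclosing function is set."""

    def __init__(self):
        self.enclosing_func = None
        self.edges = []    # (caller, callee) pairs that pass the gate
        self.dropped = []  # callees seen at module scope and discarded

    def visit_FunctionDef(self, node):
        # Track the enclosing function while visiting its body.
        previous = self.enclosing_func
        self.enclosing_func = node.name
        self.generic_visit(node)
        self.enclosing_func = previous

    def visit_Call(self, node):
        call_name = node.func.id if isinstance(node.func, ast.Name) else None
        if call_name and self.enclosing_func:
            self.edges.append((self.enclosing_func, call_name))
        elif call_name:
            self.dropped.append(call_name)
        self.generic_visit(node)


collector = GatedCallCollector()
collector.visit(ast.parse(SOURCE))
```

Here the call inside `wrapper` is kept, while the top-level `helper()` call lands in `dropped`, which mirrors what the reproducer above shows for notebook cells.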
Scope of the fix
5 emission sites in parser.py gate on enclosing_func:
| Line | Site | Languages affected |
|------|------|--------------------|
| 1455 | Elixir call path | Elixir |
| 2379 | Generic _extract_calls | Python, JS, TS, others (the main fix) |
| 2415 | JSX component invocation | TSX/JSX (a top-level <App /> render is module-scope) |
| 2700 | Solidity emit statement | Solidity |
| 4002 | R call path | R |
Plus a downstream consideration: detect_entry_points in flows.py treats "is a CALLS target" as "is not a root" — so attributing a script's module-scope calls to the script's own File node would make script-only callees look "called by the script" and hide them from flow analysis. The fix needs to filter File-sourced CALLS in entry-point detection.
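A rough sketch of the entry-point concern, assuming a simplified edge record: the `Edge` dataclass, its `source_kind` field, and `entry_points` are hypothetical and do not reflect the actual data model in `flows.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    kind: str
    source: str
    target: str
    source_kind: str  # "File" or "Function" (hypothetical field)


def entry_points(functions, edges, ignore_file_sourced=True):
    """Functions that are not the target of any counted CALLS edge."""
    called = {
        e.target
        for e in edges
        if e.kind == "CALLS"
        and not (ignore_file_sourced and e.source_kind == "File")
    }
    return sorted(name for name in functions if name not in called)


functions = ["main", "helper"]
edges = [
    # Module-scope call: the script itself invokes main().
    Edge("CALLS", "script.py", "main", "File"),
    # Ordinary call from inside a function body.
    Edge("CALLS", "main", "helper", "Function"),
]

roots_filtered = entry_points(functions, edges)
roots_unfiltered = entry_points(functions, edges, ignore_file_sourced=False)
```

With the filter, `main` remains a root; without it, the File-sourced edge hides `main` and no roots are left in this toy graph.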
Why the existing convention supports the fix
_extract_value_references already attributes references to file_path when enclosing_func is None (parser.py line 2508-ish). CONTAINS edges do the same when there's no enclosing function. The fix just brings CALLS into line with the existing pattern.
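The convention could be expressed roughly as follows. This is a sketch under stated assumptions: `resolve_call_source` is a hypothetical helper, and the `::` separator is invented for illustration, since the real output format of `_qualify` is not shown here.

```python
def resolve_call_source(enclosing_func, file_path, enclosing_class=None):
    """Pick the source node for a CALLS edge (hypothetical helper).

    Falls back to the file path when the call sits at module scope,
    instead of dropping the edge.
    """
    if enclosing_func is None:
        return file_path
    parts = [file_path]
    if enclosing_class:
        parts.append(enclosing_class)
    parts.append(enclosing_func)
    return "::".join(parts)  # separator is an assumption


module_scope_source = resolve_call_source(None, "apply.ipynb")
method_source = resolve_call_source("run", "predict.py", "Predict")
```

The point of the design is that a module-scope call still gets a source node, so the edge exists and downstream consumers can decide how to treat File-sourced calls.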
Not addressed by prior PRs
Confirmed by reviewing dead-code-related history:
- … self/this): narrows what counts as a call; doesn't restore module-scope source
- … enclosing_func (the same bug)
- … Record<string, fn> dispatch maps): value-reference patterns; addressed by _extract_value_references
Fix submitted
PR pending: it links the 5 module-scope CALLS sites and filters File-sourced CALLS in detect_entry_points. End-to-end verification: the edge count on ML_wpredict_apply_v1.0.ipynb goes from 0 to 14 CALLS edges, and find_dead_code no longer flags the notebook-only methods.
318 tests pass (parser, refactor, flows, multilang, notebook).