feat: resolve tsconfig.json / jsconfig.json path aliases in dependency graph #40

@swati510

Problem

The dependency graph builder (graph.py:294-315) treats all non-relative TypeScript/JavaScript imports as external npm packages. This means path aliases configured in tsconfig.json (or jsconfig.json) produce broken graph edges.

# graph.py:294-315 (current logic, abridged)
if language in ("typescript", "javascript"):
    if module_path.startswith("."):
        # resolve relative imports ✅
        ...
    # Everything else → external npm package ❌
    external_key = f"external:{module_path}"

For a Next.js project with "paths": {"@/*": ["./src/*"]}, the statement import { Button } from "@/components/ui/button" is recorded as external:@/components/ui/button, a phantom node disconnected from the real file src/components/ui/button.

Downstream impact

| Feature | Effect |
| --- | --- |
| Dependency graph | Missing internal edges. Files imported only via aliases appear isolated. |
| PageRank / centrality | Scores deflated for heavily-aliased files (e.g. components/, lib/) since incoming edges are lost. |
| Dead code detection | False positives: _detect_unreachable_files() sees in_degree=0 for files that are actually imported via aliases. |
| Change propagation | repowise update can't cascade staleness through alias edges → stale pages aren't regenerated. |
| get_dependency_path() | Returns "no path" between connected files. |
| Generation context (RAG) | ContextAssembler uses graph neighbors for context, so missing edges mean worse LLM-generated docs. |
| Architecture diagrams | Missing edges produce misleading visuals. |

Affected ecosystem

This affects any TS/JS project using path aliases — effectively most modern frontend codebases:

| Framework | Default alias | Config |
| --- | --- | --- |
| Next.js (create-next-app) | @/* → ./src/* | tsconfig.json paths |
| Vite (via vite-tsconfig-paths) | @/, ~/ | tsconfig.json paths |
| Angular CLI | @mylib/* | tsconfig.json paths |
| Nuxt 3 | #imports, #components | .nuxt/tsconfig.json paths |
| CRA / Webpack | @/, src/ | tsconfig.json paths or jsconfig.json |
| Any project with baseUrl | bare specifiers | tsconfig.json baseUrl |

Proposed design

Architecture: TsconfigResolver as a pre-resolution layer

Add a new class TsconfigResolver that GraphBuilder consults before falling back to external-package classification. The resolver is read-only, holds no mutable state after init apart from optional lookup caches, and is shared across all files.

Lifecycle

orchestrator._run_ingestion()
    │
    ├── FileTraverser.traverse()   → discovers tsconfig.json / jsconfig.json paths
    │
    ├── TsconfigResolver(repo_root, config_paths)   ← NEW
    │       └── parses all tsconfigs, resolves extends chains, builds alias maps
    │
    ├── GraphBuilder(tsconfig_resolver=resolver)     ← NEW parameter
    │
    └── graph_builder.build()
            └── _resolve_import()
                    └── resolver.resolve(module_path, importer_path, path_set)   ← NEW call

Where it integrates in _resolve_import()

# graph.py — updated TS/JS resolution
if language in ("typescript", "javascript"):
    if module_path.startswith("."):
        # existing relative resolution (unchanged)
        ...
        return resolved_path

    # NEW: try tsconfig path aliases before external fallback
    if self._tsconfig_resolver is not None:
        resolved = self._tsconfig_resolver.resolve(
            module_path, importer_path, path_set
        )
        if resolved:
            return resolved

    # External npm package (fallback — unchanged)
    external_key = f"external:{module_path}"
    ...

TsconfigResolver internals

1. Config discovery

During traversal (or as a dedicated pre-pass), collect all tsconfig.json and jsconfig.json files. tsconfig.json takes precedence if both exist in the same directory. Store them indexed by directory path.
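The indexing step could look roughly like the sketch below. It assumes the traverser yields repo-relative POSIX paths; the name index_configs and the "" key for the repo root are hypothetical choices for illustration, not existing code.

```python
import posixpath

# Earlier entries win when both files sit in the same directory.
CONFIG_PRIORITY = ("tsconfig.json", "jsconfig.json")


def index_configs(paths):
    """Map each directory to its preferred config file.

    `paths` is any iterable of repo-relative POSIX paths (an assumption about
    what the traverser provides). Files that are not configs are ignored.
    """
    by_dir = {}
    for path in paths:
        directory, name = posixpath.split(path)
        if name not in CONFIG_PRIORITY:
            continue
        current = by_dir.get(directory)
        if current is None or (
            CONFIG_PRIORITY.index(name)
            < CONFIG_PRIORITY.index(posixpath.basename(current))
        ):
            by_dir[directory] = path
    return by_dir


configs = index_configs([
    "jsconfig.json",
    "tsconfig.json",
    "packages/web/jsconfig.json",
    "packages/web/src/app.ts",
])
# configs == {"": "tsconfig.json", "packages/web": "packages/web/jsconfig.json"}
```

The root directory ends up bound to tsconfig.json even though jsconfig.json was seen first, while packages/web keeps its jsconfig.json because nothing outranks it there.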

2. Extends resolution

Follow extends chains to produce a flattened config per tsconfig file:

  • paths in child completely overrides parent (no merge — this is TypeScript's behavior).
  • baseUrl is resolved relative to the config file that defines it, not the root config.
  • Handle extends pointing to node_modules packages (e.g. "extends": "@tsconfig/next/tsconfig.json") — resolve via node_modules lookup from the config's directory.
  • Detect and break circular extends chains.
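A minimal sketch of the flattening rules above, under simplifying assumptions: configs are supplied by a hypothetical load(path) callable that returns an already-parsed dict (real tsconfig files may contain comments and need a tolerant parser), only relative string extends values are followed, and node_modules lookups are left out. The key names base_url, paths, and paths_dir are invented for the example.

```python
import posixpath


def flatten_config(config_path, load, _seen=None):
    """Flatten one config and its relative `extends` chain into a dict.

    `load(path)` is a hypothetical callable returning the parsed config dict
    for a repo-relative POSIX path, or None if the file is unknown.
    """
    seen = _seen if _seen is not None else set()
    if config_path in seen:
        return {}  # circular extends: stop here (a real resolver would log)
    seen.add(config_path)

    raw = load(config_path) or {}
    config_dir = posixpath.dirname(config_path)
    merged = {}

    parent = raw.get("extends")
    if isinstance(parent, str) and parent.startswith("."):
        parent_path = posixpath.normpath(posixpath.join(config_dir, parent))
        if not parent_path.endswith(".json"):
            parent_path += ".json"
        merged.update(flatten_config(parent_path, load, seen))

    options = raw.get("compilerOptions", {})
    if "baseUrl" in options:
        # Resolved against the directory of the file that declares it.
        merged["base_url"] = posixpath.normpath(
            posixpath.join(config_dir, options["baseUrl"])
        )
    if "paths" in options:
        # The child replaces the inherited mapping wholesale; no key merge.
        merged["paths"] = options["paths"]
        merged["paths_dir"] = config_dir
    return merged


files = {
    "tsconfig.base.json": {
        "compilerOptions": {"baseUrl": "src", "paths": {"@lib/*": ["lib/*"]}}
    },
    "packages/web/tsconfig.json": {
        "extends": "../../tsconfig.base.json",
        "compilerOptions": {"paths": {"@/*": ["./src/*"]}},
    },
    "loop/a.json": {"extends": "./b.json"},
    "loop/b.json": {"extends": "./a.json"},
}
flat = flatten_config("packages/web/tsconfig.json", files.get)
```

Here flat keeps the parent's base_url ("src", relative to the base config's directory) but carries only the child's "@/*" entry in paths. Flattening loop/a.json terminates and yields an empty dict because the visited set cuts the cycle.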

3. Per-file config binding

For a given source file, find the applicable tsconfig by walking up the directory tree from the file until hitting a directory with a tsconfig. Cache this mapping (file dir → resolved config). This approximates how TypeScript-aware editors pick the nearest config for an open file; include/exclude filtering is not considered.
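The walk-up with caching might be sketched as follows. The function name nearest_config, the directory-keyed mapping (with "" for the repo root), and the plain-dict cache are all assumptions made for illustration.

```python
import posixpath


def nearest_config(file_path, configs_by_dir, cache):
    """Walk up from the file's directory to the closest indexed config.

    `configs_by_dir` maps a repo-relative directory ("" for the root) to a
    config path; `cache` is a plain dict keyed by directory. Returns None
    when no ancestor directory has a config.
    """
    directory = posixpath.dirname(file_path)
    visited = []
    found = None
    while True:
        if directory in cache:
            found = cache[directory]
            break
        visited.append(directory)
        if directory in configs_by_dir:
            found = configs_by_dir[directory]
            break
        if directory == "":
            break  # reached the repo root without a match
        directory = posixpath.dirname(directory)
    # Remember the answer for every directory passed on the way up.
    for seen_dir in visited:
        cache[seen_dir] = found
    return found


configs_by_dir = {"": "tsconfig.json", "packages/web": "packages/web/tsconfig.json"}
cache = {}
web = nearest_config("packages/web/src/ui/button.tsx", configs_by_dir, cache)
api = nearest_config("packages/api/index.ts", configs_by_dir, cache)
```

Files under packages/web bind to that package's config, while packages/api falls back to the root tsconfig.json. Negative results are cached too, so a repo without any config pays for the walk once per directory.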

4. Pattern matching

For each import specifier:

  1. Exact match patterns first (no * wildcard): "jquery" → ["node_modules/jquery/dist/jquery"]
  2. Wildcard patterns sorted by specificity: longest prefix before * wins, then longest suffix, then declaration order.
  3. For each matching pattern, try candidates left-to-right in the array.
  4. For each candidate, substitute the captured * text, resolve relative to baseUrl (or config dir), then apply file extension resolution.
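Steps 1 to 4 can be sketched as below, stopping just before the baseUrl join and extension probing. The helper names (match_pattern, ordered_patterns, candidates_for) are hypothetical, and paths is assumed to be the flattened "paths" mapping as a plain dict.

```python
def match_pattern(pattern, specifier):
    """Return the text captured by `*`, "" for an exact hit, or None."""
    if "*" not in pattern:
        return "" if pattern == specifier else None
    prefix, suffix = pattern.split("*", 1)
    if (
        len(specifier) >= len(prefix) + len(suffix)
        and specifier.startswith(prefix)
        and specifier.endswith(suffix)
    ):
        return specifier[len(prefix): len(specifier) - len(suffix)]
    return None


def ordered_patterns(paths):
    """Exact patterns first, then longest prefix, then longest suffix.

    sorted() is stable, so remaining ties keep declaration order.
    """
    def key(pattern):
        if "*" not in pattern:
            return (0, 0, 0)
        prefix, suffix = pattern.split("*", 1)
        return (1, -len(prefix), -len(suffix))
    return sorted(paths, key=key)


def candidates_for(specifier, paths):
    """Yield substituted targets in the order they should be tried."""
    for pattern in ordered_patterns(paths):
        captured = match_pattern(pattern, specifier)
        if captured is None:
            continue
        for target in paths[pattern]:
            yield target.replace("*", captured, 1)


paths = {
    "@/*": ["./src/*"],
    "@components/*": ["./src/components/*", "./lib/components/*"],
    "jquery": ["node_modules/jquery/dist/jquery"],
}
button = list(candidates_for("@components/Button", paths))
# button == ["./src/components/Button", "./lib/components/Button"]
```

The length check in match_pattern keeps the prefix and suffix from overlapping on very short specifiers. A specifier such as "react" matches nothing and yields no candidates, which is the signal to continue to the baseUrl and external fallbacks.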

5. File extension resolution

For each resolved candidate path, try (in order):

  • .ts, .tsx, .js, .jsx (direct extension)
  • /index.ts, /index.tsx, /index.js, /index.jsx (directory index)

Check against path_set (the set of all known repo-relative POSIX paths that GraphBuilder already maintains).
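A small sketch of the probing order, assuming candidates arrive as repo-relative POSIX paths (possibly with a leading "./") and that path_set is a plain Python set. The name resolve_file is hypothetical.

```python
import posixpath

EXTENSIONS = (".ts", ".tsx", ".js", ".jsx")


def resolve_file(candidate, path_set):
    """Return the first known path for `candidate`, or None.

    Direct extensions are tried before directory index files, in the order
    listed above.
    """
    base = posixpath.normpath(candidate)  # drops "./" and collapses ".."
    for ext in EXTENSIONS:
        if base + ext in path_set:
            return base + ext
    for ext in EXTENSIONS:
        index_path = posixpath.join(base, "index" + ext)
        if index_path in path_set:
            return index_path
    return None


path_set = {"src/components/ui/button.tsx", "src/lib/index.ts"}
direct = resolve_file("./src/components/ui/button", path_set)
# direct == "src/components/ui/button.tsx"
index = resolve_file("./src/lib", path_set)
# index == "src/lib/index.ts"
```

The sketch follows the list literally, so a specifier that already carries an extension (for example "@/styles.css") would not resolve; whether to add an exact-path probe is left open here.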

6. baseUrl-only resolution (no paths match)

If no paths pattern matches but baseUrl is set, try baseUrl + specifier with extension resolution. This handles projects that use baseUrl: "src" without explicit paths entries:

// tsconfig: { "baseUrl": "src" }
import { api } from "services/api"  // resolves to src/services/api.ts
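On the Python side, the fallback could be sketched like this. It assumes base_url has already been normalized to a repo-relative POSIX directory and reuses the same extension order as step 5; resolve_via_base_url is a hypothetical name.

```python
import posixpath

EXTENSIONS = (".ts", ".tsx", ".js", ".jsx")


def resolve_via_base_url(specifier, base_url, path_set):
    """Try base_url/specifier with the extension and index probes."""
    base = posixpath.normpath(posixpath.join(base_url, specifier))
    probes = [base + ext for ext in EXTENSIONS]
    probes += [posixpath.join(base, "index" + ext) for ext in EXTENSIONS]
    return next((p for p in probes if p in path_set), None)


path_set = {"src/services/api.ts"}
hit = resolve_via_base_url("services/api", "src", path_set)
# hit == "src/services/api.ts"
miss = resolve_via_base_url("react", "src", path_set)
# miss is None, so the import stays external
```

Because the lookup is checked against path_set, genuine npm packages such as "react" still fall through to the external classification unless a file of that name happens to exist under baseUrl.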

Edge cases explicitly handled

| Case | Handling |
| --- | --- |
| Monorepo with per-package tsconfigs | Walk-up discovery finds the nearest tsconfig per file. Each package gets its own alias scope. |
| extends from node_modules | Resolve the package path, read the config, flatten. |
| extends chain overrides paths | Child's paths wins entirely; no merge with parent. |
| Multiple candidates ["./src/*", "./lib/*"] | Try left-to-right, return first match in path_set. |
| baseUrl without paths | Fall through to baseUrl + specifier resolution after paths miss. |
| jsconfig.json | Same as tsconfig; used by JS-only projects. Lower priority than tsconfig in same dir. |
| * wildcard anywhere in pattern | Single * captured and substituted (not just suffix). |
| No tsconfig found | Resolver returns None, falls through to existing external-package logic. Fully backwards compatible. |
| Circular extends | Track visited set, break cycle, log warning. |
| Alias resolves to non-existent file | Try next candidate; if all fail, fall through to baseUrl, then external. |
| rootDirs / moduleSuffixes | Out of scope for v1: rare and complex. Can be added later. |
| #imports (Node.js subpath imports) | Out of scope: requires package.json imports field parsing, a different mechanism than tsconfig paths. |

What does NOT need to change

  • Import extraction (parser.py, typescript.scm) — already captures raw module paths correctly.
  • Dead code analyzer — already filters external: nodes. Fixing alias resolution automatically eliminates false positives since imports resolve to real file paths instead of external: nodes.
  • path_set format — already POSIX-relative-to-repo-root, which is what the resolver will produce.
  • Edge creation logic: build() already handles resolved paths generically.

Performance considerations

  • Tsconfig parsing: One-time cost at graph-build time. Typically 1-5 configs in a monorepo. Negligible.
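  • Per-import resolution: Pattern matching is a sorted-list scan + string prefix check + path_set lookup (O(1) set membership). With only a handful of patterns per config, the added cost over the current immediate external: classification is expected to be negligible; this should be confirmed by profiling on a large repo.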
  • Caching: Map dir → resolved_config to avoid repeated walk-up. Map (config_id, specifier) → resolved_path if profiling shows hot paths.

Key files to modify

| File | Change |
| --- | --- |
| core/ingestion/graph.py | Add tsconfig_resolver parameter to GraphBuilder.__init__(). Call resolver.resolve() in _resolve_import() before external fallback. |
| core/ingestion/tsconfig_resolver.py | New file. TsconfigResolver class: config discovery, extends flattening, pattern matching, file resolution. |
| core/pipeline/orchestrator.py | Discover tsconfig files during traversal, instantiate TsconfigResolver, pass to GraphBuilder. |
| core/ingestion/traverser.py | Optionally collect tsconfig/jsconfig paths during file walk (cheap: just note their paths). |

Test plan

  • Next.js project with @/* alias — edges resolve to real files, not external:@/...
  • baseUrl: "src" without paths — bare specifiers resolve via baseUrl
  • Monorepo with different paths per package — correct scoping per tsconfig
  • extends chain (2-3 levels) — child paths override parent entirely
  • extends from node_modules package — resolved correctly
  • Circular extends — no infinite loop, logs warning
  • Multiple candidates ["./src/*", "./lib/*"] — first match wins
  • Exact (non-wildcard) path mapping — "jquery" → specific file
  • jsconfig.json fallback when no tsconfig present
  • Both tsconfig and jsconfig in same dir — tsconfig wins
  • Alias that resolves to directory → finds index.ts
  • Alias where no candidate exists → falls through to external (backwards compat)
  • No tsconfig in project at all → fully backwards compatible, no behavior change
  • Dead code: file imported only via alias no longer flagged as unreachable
  • get_dependency_path() finds paths through alias-resolved edges
  • PageRank: heavily-aliased utility files show appropriate centrality
  • Pattern specificity: @components/* matched before @/* for @components/Button
