CODETRAVERSE is a cross-language static code analysis tool that extracts structural components and dependencies from code repositories in Haskell, Rescript, Typescript, Rust, golang, python.
Given any supported codebase, CODETRAVERSE outputs:
- Per-file JSON summaries (
fdep/) - Full call and type dependency graphs in GraphML and gpickle formats
This enables rich code visualization, impact analysis, dependency inspection, and more.
- Multi-language Support: Haskell, Rescript, Typescript, Rust, golang, python (easily extensible!)
- Scalable: Handles arbitrarily nested folder structures
- Rich Outputs:
- JSON component files for every source file
repo_function_calls.graphml(visual graph for tools like Gephi, yEd, etc.)repo_function_calls.gpickle(Python-native NetworkX graph)
- Graph Querying: Use
path.pyto:- Find the shortest path between two components in the graph
- List all direct neighbors (incoming and outgoing edges) for a given component
git clone https://github.com/juspay/codetraverse.git
cd codetraverseWe recommend using a virtual environment:
python3 -m venv venv
source venv/bin/activate
# Install all required Python packages
pip install -r requirements.txtpython main.py --ROOT_DIR=/path/to/your/repo --LANGUAGE=golang- Replace
/path/to/your/repowith the root directory of your code repository - Set
--LANGUAGEto one of:haskell,rescript,typescript,rust,golang,python
Outputs will be saved in:
fdep/(JSON component summaries)graph/repo_function_calls.graphmlgraph/repo_function_calls.gpickle
| Argument | Type | Description |
|---|---|---|
--ROOT_DIR |
str | Path to the root of the source code repository (required) |
--LANGUAGE |
str | One of haskell, rescript, typescript, rust, golang, python (required) |
--OUTPUT_BASE (opt) |
str | Output directory for per-file JSONs (default: fdep) |
--GRAPH_DIR (opt) |
str | Output directory for the graphs (default: graph) |
python main.py --ROOT_DIR =/my-go-project --LANGUAGE=golangUse path.py to query the generated graph for relationships between components.
python path.py --GRAPH_PATH=/path/to/repo_function_calls.graphml --COMPONENT=<target_component> --SOURCE=<source_component>- Replace
/path/to/repo_function_calls.graphmlwith the path to your graph file. - Replace
<target_component>with the fully-qualified ID of the target component (e.g.,PgIntegrationApp::make). - Replace
<source_component>with the fully-qualified ID of the source component.
Example:
python path.py --GRAPH_PATH=graph/repo_function_calls.graphml --COMPONENT=PgIntegrationApp::make --SOURCE=PgIntegrationApp::initpython path.py --GRAPH_PATH=/path/to/repo_function_calls.graphml --COMPONENT=<target_component>- Replace
/path/to/repo_function_calls.graphmlwith the path to your graph file. - Replace
<target_component>with the fully-qualified ID of the target component.
Example:
python path.py --GRAPH_PATH=graph/repo_function_calls.graphml --COMPONENT=PgIntegrationApp::make| Argument | Type | Description |
|---|---|---|
--GRAPH_PATH or -g |
str | Path to the saved graph (.gpickle or .graphml) |
--COMPONENT or -c |
str | Target fully-qualified component ID (e.g., PgIntegrationApp::make) |
--SOURCE or -s |
str | (Optional) Source fully-qualified ID. If omitted, lists direct neighbors. |
If a path exists between the source and target components, the output will look like:
Shortest path from 'PgIntegrationApp::init' → 'PgIntegrationApp::make':
PgIntegrationApp::init → PgIntegrationApp::process → PgIntegrationApp::make
If listing direct neighbors, the output will show incoming and outgoing edges:
Nodes with edges INTO 'PgIntegrationApp::make' (2):
PgIntegrationApp::process --[calls]--> PgIntegrationApp::make
PgIntegrationApp::init --[calls]--> PgIntegrationApp::make
Nodes with edges OUT OF 'PgIntegrationApp::make' (1):
PgIntegrationApp::make --[uses_type]--> PgIntegrationApp::ResultType
-
Extract: Recursively walks your codebase, parsing each file, extracting functions, types, variables, dependencies, etc.
-
Component Output: For every file, generates a detailed
.jsonfile in thefdep/directory. -
Unification and Graph Construction: Merges all components into a single global call/type graph using NetworkX.
-
Export: Writes the graph to:
repo_function_calls.graphml(portable to any graph tool)repo_function_calls.gpickle(for Python/NetworkX use)
Per-source-file JSONs, containing extracted component data.
Universal call/type dependency graph in GraphML (XML) format.
NetworkX-serialized Python object for fast downstream analysis.
Q: Does this overwrite my source code?
A: No. All outputs are written to fdep/ and graph/.
Q: Can I use this for closed-source or private code?
A: Yes! All code runs locally.
Q: What if I want to re-run and overwrite outputs?
A: Just rm -rf fdep graph before running again.
Q: What if I want to visualize the graph?
A: Use Gephi, yEd, or any tool that supports GraphML.