Traverse is a comprehensive suite of static analysis tools for Solidity smart contracts. Its core philosophy centers on a fundamental insight: all meaningful interactions in a smart contract system can be represented as a Call Graph. By building a single, highly accurate Call Graph representation of a codebase, Traverse creates a unified data model that powers multiple specialized analysis tools.
This approach ensures consistency across tools, reduces redundancy, and allows each tool to benefit from improvements to the core graph extraction engine. Whether generating visualizations, creating test suites, or analyzing storage patterns, all Traverse tools operate on the same foundational Call Graph structure.
```mermaid
graph TD
    subgraph "CLI Tools Layer"
        sol2cg[sol2cg - Visualizer]
        sol2test[sol2test - Test Gen]
        storage_analyzer[sol-storage-analyzer]
        storage_trace[storage-trace]
        sol2bnd[sol2bnd - Bindings]
    end
    subgraph "Core Libraries"
        graph_core[graph - Call Graph Core]
        codegen[codegen - Test Generation]
        language[language - Tree-sitter]
        solidity[solidity - Solidity Utils]
    end
    subgraph "Support Libraries"
        mermaid_lib[mermaid - Diagram Support]
        logging[logging - Diagnostics]
    end
    sol2cg --> graph_core
    sol2test --> graph_core
    sol2test --> codegen
    storage_analyzer --> graph_core
    storage_trace --> graph_core
    sol2bnd --> graph_core
    codegen --> graph_core
    codegen --> solidity
    graph_core --> language
    graph_core --> solidity
    graph_core --> mermaid_lib
    sol2cg -.-> logging
    sol2test -.-> logging
    storage_analyzer -.-> logging
    storage_trace -.-> logging
    sol2bnd -.-> logging
```
- AST (Abstract Syntax Tree): A tree representation of the syntactic structure of source code, produced by the parser
- Call Graph: A directed graph where nodes represent functions/contracts and edges represent relationships like calls, storage access, or control flow
- Node: A vertex in the Call Graph representing a code element (function, modifier, state variable, etc.)
- Edge: A directed connection between nodes representing a relationship (call, return, storage read/write)
- Span: Source code location information (start and end positions) for AST nodes
- Entry Point: A public or external function that can be called from outside the contract
- Pipeline Step: A modular analysis phase that processes the AST to build or enrich the Call Graph
The Traverse architecture follows a clear data flow pipeline that transforms Solidity source code into actionable insights.
```mermaid
graph LR
    source[Solidity Source Files]
    manifest[Manifest Discovery]
    parser[Tree-sitter Parser]
    ast[Abstract Syntax Tree]
    pipeline[Core Pipeline]
    base_graph[Base Call Graph]
    analyzers[Specialized Analyzers]
    enriched[Enriched Call Graph]
    generators[Output Generators]
    outputs[Visualizations/Tests/Reports]
    source --> manifest
    manifest --> parser
    parser --> ast
    ast --> pipeline
    pipeline --> base_graph
    base_graph --> analyzers
    analyzers --> enriched
    enriched --> generators
    generators --> outputs
```
The parsing stage transforms raw Solidity source code into a structured AST.
- `crates/language`: Provides Tree-sitter grammar bindings for Solidity. This crate encapsulates the low-level parsing infrastructure.
- `crates/graph/src/parser.rs`: High-level parsing utilities that consume Tree-sitter output to build a strongly-typed, project-specific AST.
- `crates/graph/src/manifest.rs`: Discovers and manages Solidity files across a project, handling directory traversal and file filtering.
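The file discovery role described for `manifest.rs` can be illustrated with a minimal sketch that uses only the standard library. The function name `find_solidity_files` and the extension-only filter are assumptions made for illustration; the actual manifest module may apply additional filtering rules.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collects every `.sol` file under `root`.
/// Hypothetical helper: the name and the filter rule are illustrative only.
pub fn find_solidity_files(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    // Explicit work list instead of recursion, so deep trees cannot overflow the stack.
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let path = entry?.path();
            if path.is_dir() {
                pending.push(path);
            } else if path.extension().map_or(false, |ext| ext == "sol") {
                found.push(path);
            }
        }
    }
    // Sort so that later stages see files in a stable order.
    found.sort();
    Ok(found)
}
```

Sorting the result is a design choice of this sketch: it keeps downstream output deterministic regardless of the order in which the operating system lists directory entries.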
Tree-sitter's incremental parsing and error recovery capabilities provide resilience to both syntax errors and Solidity language version differences. The parser can process code written in different Solidity versions (0.4.x through 0.8.x) within the same analysis session, creating error recovery nodes for unrecognized syntax while continuing to parse the remainder of the tree successfully. This allows Traverse to analyze mixed-version codebases and partially malformed contracts rather than failing completely.
The heart of Traverse is the CallGraphGeneratorPipeline (defined in cg.rs), which orchestrates a sequence of analysis steps to build the Call Graph.
- Modularity: Each step implements the `CallGraphGeneratorStep` trait
- Configurability: Steps can be enabled, disabled, or configured independently
- Shared Context: Steps communicate through a shared `CallGraphGeneratorContext`
- `ContractHandling` (`steps/entity.rs`):
  - Scans the AST for structural elements
  - Creates nodes for contracts, interfaces, libraries
  - Identifies functions, constructors, modifiers, and state variables
  - Builds inheritance relationships
  - Populates the initial node structure of the graph
- `CallsHandling` (`steps/channel.rs`):
  - Analyzes function bodies for interactions
  - Identifies direct function calls, library calls, and contract creations
  - Detects storage variable access (reads and writes)
  - Recognizes event emissions and require statements
  - Creates edges between nodes based on these interactions
  - Leverages Expression Analysis for complex call resolution
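The two-pass structure above (nodes first, then edges, with steps sharing one context) can be sketched as follows. This is a simplified illustration, not the real `CallGraphGeneratorPipeline`: the names `Step`, `Context`, `FunctionDecl`, `CreateNodes`, `CreateEdges`, and `Pipeline` are hypothetical, and the input is a flat list of function declarations rather than an AST.

```rust
use std::collections::HashSet;

/// Simplified stand-in for a parsed function: its name and the names it calls.
pub struct FunctionDecl {
    pub name: &'static str,
    pub calls: Vec<&'static str>,
}

/// Shared state that every step reads and extends (hypothetical layout).
#[derive(Default)]
pub struct Context {
    pub nodes: Vec<String>,
    pub edges: Vec<(usize, usize)>, // (caller index, callee index)
}

/// Minimal analogue of a pipeline step.
pub trait Step {
    fn name(&self) -> &'static str;
    fn run(&self, decls: &[FunctionDecl], ctx: &mut Context);
}

/// First pass: one node per declared function.
pub struct CreateNodes;
impl Step for CreateNodes {
    fn name(&self) -> &'static str { "nodes" }
    fn run(&self, decls: &[FunctionDecl], ctx: &mut Context) {
        for decl in decls {
            ctx.nodes.push(decl.name.to_string());
        }
    }
}

/// Second pass: one edge per call whose target is a known node.
pub struct CreateEdges;
impl Step for CreateEdges {
    fn name(&self) -> &'static str { "edges" }
    fn run(&self, decls: &[FunctionDecl], ctx: &mut Context) {
        for decl in decls {
            for callee in &decl.calls {
                let from = ctx.nodes.iter().position(|n| n.as_str() == decl.name);
                let to = ctx.nodes.iter().position(|n| n.as_str() == *callee);
                if let (Some(from), Some(to)) = (from, to) {
                    ctx.edges.push((from, to));
                }
            }
        }
    }
}

/// Runs the enabled steps in insertion order over one shared context.
pub struct Pipeline {
    steps: Vec<Box<dyn Step>>,
    disabled: HashSet<&'static str>,
}

impl Pipeline {
    pub fn new() -> Self {
        Pipeline { steps: Vec::new(), disabled: HashSet::new() }
    }
    pub fn add_step(&mut self, step: Box<dyn Step>) {
        self.steps.push(step);
    }
    pub fn disable(&mut self, name: &'static str) {
        self.disabled.insert(name);
    }
    pub fn run(&self, decls: &[FunctionDecl]) -> Context {
        let mut ctx = Context::default();
        for step in &self.steps {
            if !self.disabled.contains(step.name()) {
                step.run(decls, &mut ctx);
            }
        }
        ctx
    }
}
```

Because the edge pass looks nodes up in the shared context, step order matters: the node pass must run before any pass that creates edges.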
- `interface_resolver.rs` (`BindingRegistry`): Maps interfaces to their implementations, which is crucial for accurate call target resolution
- `builtin.rs`: Provides definitions for built-in Solidity functions, types, and global variables (`msg.sender`, `block.timestamp`, etc.)
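The interface-to-implementation mapping can be pictured as a small lookup table. The sketch below reuses the name `BindingRegistry` for readability only; its fields and the `bind` and `resolve` methods are assumptions, not the API of `interface_resolver.rs`.

```rust
use std::collections::HashMap;

/// Illustrative registry: interface name -> concrete contract name.
#[derive(Default)]
pub struct BindingRegistry {
    bindings: HashMap<String, String>,
}

impl BindingRegistry {
    /// Records that calls through `interface` should target `implementation`.
    pub fn bind(&mut self, interface: &str, implementation: &str) {
        self.bindings
            .insert(interface.to_string(), implementation.to_string());
    }

    /// Returns the bound implementation, or the declared type itself
    /// when no binding is known.
    pub fn resolve<'a>(&'a self, declared_type: &'a str) -> &'a str {
        self.bindings
            .get(declared_type)
            .map(String::as_str)
            .unwrap_or(declared_type)
    }
}
```

Falling back to the declared type keeps call resolution total: an unbound interface still yields a target node, just a less precise one.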
These modules consume the base Call Graph and add specialized analysis layers:
Expression analysis is critical for resolving complex, chained calls like `getFactory().createPair(tokenA, tokenB).initialize()`:
- Recursively breaks down nested expressions
- Tracks types through method chains
- Resolves intermediate return values
- Determines final call targets
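Tracking types through a method chain can be sketched as a left-to-right walk over a table of return types. Everything here is hypothetical: the `TypeTable` type, its methods, and the contract names `Router`, `Factory`, and `Pair` used in the usage note are illustrative, and arguments are ignored entirely.

```rust
use std::collections::HashMap;

/// Maps (owner type, method name) to the method's declared return type.
pub struct TypeTable {
    returns: HashMap<(String, String), String>,
}

impl TypeTable {
    pub fn new() -> Self {
        TypeTable { returns: HashMap::new() }
    }

    pub fn declare(&mut self, owner: &str, method: &str, return_type: &str) {
        self.returns
            .insert((owner.to_string(), method.to_string()), return_type.to_string());
    }

    /// Resolves each link of a call chain to an `Owner.method` target.
    /// Returns `None` as soon as one link cannot be resolved.
    pub fn resolve_chain(&self, start_type: &str, methods: &[&str]) -> Option<Vec<String>> {
        let mut current = start_type.to_string();
        let mut targets = Vec::new();
        for method in methods {
            let key = (current.clone(), method.to_string());
            let return_type = self.returns.get(&key)?;
            targets.push(format!("{}.{}", current, method));
            // The return type of this call is the receiver type of the next one.
            current = return_type.clone();
        }
        Some(targets)
    }
}
```

With `getFactory` declared on `Router` returning `Factory`, `createPair` on `Factory` returning `Pair`, and `initialize` on `Pair`, the chain from the text resolves to the targets `Router.getFactory`, `Factory.createPair`, and `Pair.initialize`.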
Storage access analysis tracks state variable interactions:
- Identifies which functions read from specific storage variables
- Tracks storage writes and their locations
- Adds `StorageRead` and `StorageWrite` edges to the graph
- Aggregates storage access patterns for summary generation
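The aggregation step can be sketched as grouping storage edges by function. The `Access` enum, the `StorageSummary` struct, and the tuple-based edge representation are assumptions for illustration; the real graph stores edges differently.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Kind of storage edge, mirroring StorageRead / StorageWrite.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Access {
    Read,
    Write,
}

/// Variables a single function reads and writes directly.
#[derive(Debug, Default, PartialEq)]
pub struct StorageSummary {
    pub reads: BTreeSet<String>,
    pub writes: BTreeSet<String>,
}

/// Groups (function, variable, access) edges into a per-function summary.
pub fn summarize(edges: &[(&str, &str, Access)]) -> BTreeMap<String, StorageSummary> {
    let mut out: BTreeMap<String, StorageSummary> = BTreeMap::new();
    for (function, variable, access) in edges {
        let entry = out.entry(function.to_string()).or_default();
        match access {
            Access::Read => entry.reads.insert(variable.to_string()),
            Access::Write => entry.writes.insert(variable.to_string()),
        };
    }
    out
}
```

Sets rather than lists are used so that repeated accesses to the same variable collapse into one entry per function.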
Reachability analysis determines code accessibility:
- Identifies all public/external functions as entry points
- Uses graph traversal algorithms (DFS/BFS) to trace reachable code
- Detects dead code (unreachable functions)
- Provides traversal utilities for other analyzers
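The traversal described above can be sketched as a breadth-first search from the entry points, with dead code defined as whatever the search never visits. The adjacency-map representation and the function names `reachable` and `dead_code` are assumptions for illustration.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Breadth-first search over call edges, starting from the entry points.
pub fn reachable<'a>(
    calls: &HashMap<&'a str, Vec<&'a str>>,
    entry_points: &[&'a str],
) -> HashSet<&'a str> {
    let mut seen: HashSet<&str> = entry_points.iter().copied().collect();
    let mut queue: VecDeque<&str> = entry_points.iter().copied().collect();
    while let Some(current) = queue.pop_front() {
        // Functions without an entry in `calls` simply have no outgoing edges.
        for &callee in calls.get(current).into_iter().flatten() {
            if seen.insert(callee) {
                queue.push_back(callee);
            }
        }
    }
    seen
}

/// Every function that the traversal never reached, in declaration order.
pub fn dead_code<'a>(all: &[&'a str], reachable: &HashSet<&'a str>) -> Vec<&'a str> {
    all.iter()
        .copied()
        .filter(|function| !reachable.contains(function))
        .collect()
}
```

Note that a function called only by dead code is itself reported as dead, because the search starts exclusively from entry points.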
The Call Graph is the central unifying data structure that all Traverse tools operate on. Defined in crates/graph/src/cg.rs, it provides a comprehensive representation of all code interactions.
```rust
pub enum NodeType {
    Function,         // Regular functions
    Interface,        // Interface definitions
    Constructor,      // Contract constructors
    Modifier,         // Function modifiers
    Library,          // Library functions
    StorageVariable,  // State variables
    Evm,              // Synthetic node for EVM interactions
    EventListener,    // Synthetic node for event listeners
    RequireCondition, // Require/assert statements
    IfStatement,      // Control flow - if conditions
    ThenBlock,        // Control flow - then branches
    ElseBlock,        // Control flow - else branches
    WhileStatement,   // Control flow - while loops
    WhileBlock,       // Control flow - while body
    ForCondition,     // Control flow - for loops
    ForBlock,         // Control flow - for body
}

pub enum EdgeType {
    Call,                 // Function calls
    Return,               // Return statements
    StorageRead,          // Reading state variables
    StorageWrite,         // Writing state variables
    Require,              // Require checks
    IfConditionBranch,    // Edge to if condition
    ThenBranch,           // True branch of if
    ElseBranch,           // False branch of if
    WhileConditionBranch, // Edge to while condition
    WhileBodyBranch,      // While loop body
    ForConditionBranch,   // Edge to for condition
    ForBodyBranch,        // For loop body
}
```

Each node carries rich metadata:
- Visibility: Public, Private, Internal, External, Default
- Location: Source file span (start and end positions)
- Contract Association: Which contract contains this element
- Parameters: Function parameters with types and names
- Return Types: Declared return types for functions
- Additional Context: Revert messages, condition expressions, etc.
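One way to picture this metadata is as a plain struct attached to each function node. The sketch below is hypothetical: the type and field names (`FunctionNode`, `Parameter`, `span`, and so on) are chosen for illustration and are not taken from `cg.rs`. The entry-point check follows the definition used in this document: a public or external function.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Visibility {
    Public,
    Private,
    Internal,
    External,
    Default,
}

#[derive(Debug, Clone)]
pub struct Parameter {
    pub ty: String,
    pub name: String,
}

/// Hypothetical metadata record for a function node.
#[derive(Debug, Clone)]
pub struct FunctionNode {
    pub name: String,
    pub contract: Option<String>,
    pub visibility: Visibility,
    pub span: (usize, usize), // (start, end) positions in the source file
    pub parameters: Vec<Parameter>,
    pub returns: Vec<String>,
}

impl FunctionNode {
    /// Public and external functions are callable from outside the contract.
    pub fn is_entry_point(&self) -> bool {
        matches!(self.visibility, Visibility::Public | Visibility::External)
    }

    /// Builds a `name(type1,type2)` signature from the parameter types.
    pub fn signature(&self) -> String {
        let types: Vec<&str> = self.parameters.iter().map(|p| p.ty.as_str()).collect();
        format!("{}({})", self.name, types.join(","))
    }
}
```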
The Call Graph structure is general enough to serve a range of consumers:
- Visualization tools traverse nodes and edges directly
- Test generators use function signatures and parameters
- Storage analyzers filter for StorageRead/Write edges
- Security tools can trace call paths and control flow
- Documentation generators access natspec and signatures
The codegen crate is responsible for transforming the abstract Call Graph into concrete, executable Foundry test files. It demonstrates how the Call Graph serves as a foundation for code generation.
The codegen crate bridges the gap between static analysis and practical testing. It:
- Analyzes the Call Graph to identify testable functions
- Generates appropriate test patterns based on function characteristics
- Produces Foundry-compatible Solidity test files
- Handles complex scenarios like access control and state changes
The primary orchestrator that:
- Iterates over Call Graph nodes to find testable functions
- Extracts metadata (names, parameters, visibility modifiers)
- Generates boilerplate test functions with proper setup
- Creates both positive tests (`test_functionName`) and negative tests (`test_revert_functionName`)
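The naming scheme above can be sketched as a small string generator. The `FunctionInfo` struct, the `can_revert` flag, and the stub layout are assumptions for illustration; in particular, emitting a negative test only when the function can revert is a choice of this sketch, not a documented rule of the codegen crate.

```rust
/// Hypothetical summary of a testable function taken from the Call Graph.
pub struct FunctionInfo {
    pub name: String,
    pub can_revert: bool, // assumed flag: the function contains a require/revert
}

/// Derives test function names following the test_ / test_revert_ convention.
pub fn test_names(function: &FunctionInfo) -> Vec<String> {
    let mut names = vec![format!("test_{}", function.name)];
    if function.can_revert {
        names.push(format!("test_revert_{}", function.name));
    }
    names
}

/// Renders an empty Solidity test function for the given test name.
pub fn render_stub(test_name: &str) -> String {
    format!(
        "function {}() public {{\n    // TODO: arrange, act, assert\n}}\n",
        test_name
    )
}
```

The `test` prefix matters because Foundry discovers test functions by that prefix, so generated names can be run without extra configuration.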
Specializes in generating tests for failure conditions:
- Creates tests that expect specific revert conditions
- Handles different revert patterns (require, revert, assert)
- Generates appropriate setup to trigger revert conditions
Advanced test pattern generation for:
- Property-based testing using Foundry's invariant testing
- Generating functions that attempt to break contract invariants
- Creating comprehensive state exploration tests
Focuses on state modifications:
- Generates tests that verify state changes
- Creates before/after assertions
- Handles complex state transitions across multiple calls
Generates tests for access control:
- Tests different caller contexts (owner, user, attacker)
- Verifies role-based access control
- Tests modifier-based restrictions
Handles contract deployment scenarios:
- Generates deployment test boilerplate
- Tests constructor parameters
- Verifies initial state after deployment
Output generators transform the in-memory Call Graph into various useful formats. Each generator serves different use cases while operating on the same underlying data.
The DOT generator provides direct, detailed graph serialization for Graphviz:
Characteristics:
- One-to-one mapping of Call Graph to DOT format
- Preserves all nodes and edges without abstraction
- Ideal for deep, comprehensive analysis
Features:
- Node styling based on type (functions, modifiers, state variables)
- Edge labeling with parameters and return values
- Color coding for visibility levels
- Optional filtering (e.g., `--exclude-isolated-nodes`)
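A one-to-one DOT serialization with optional removal of isolated nodes can be sketched as follows. The function `to_dot` and its slice-based inputs are hypothetical, styling and color coding are omitted, and labels are written without escaping, so this version assumes labels contain no double quotes.

```rust
use std::collections::HashSet;

/// Serializes nodes and labeled edges to Graphviz DOT.
/// `edges` holds (from index, to index, label) triples.
pub fn to_dot(nodes: &[&str], edges: &[(usize, usize, &str)], exclude_isolated: bool) -> String {
    // A node is isolated when it appears in no edge at all.
    let connected: HashSet<usize> = edges
        .iter()
        .flat_map(|&(from, to, _)| [from, to])
        .collect();

    let mut out = String::from("digraph G {\n");
    for (id, label) in nodes.iter().enumerate() {
        if exclude_isolated && !connected.contains(&id) {
            continue;
        }
        out.push_str(&format!("  n{} [label=\"{}\"];\n", id, label));
    }
    for (from, to, label) in edges {
        out.push_str(&format!("  n{} -> n{} [label=\"{}\"];\n", from, to, label));
    }
    out.push_str("}\n");
    out
}
```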
Use Cases:
- Architectural documentation
- Security audits requiring full detail
- Debugging contract interactions
The Mermaid generator creates abstracted, user-friendly sequence diagrams:
Characteristics:
- Intelligently traverses from public entry points
- Focuses on inter-contract communication
- Abstracts internal implementation details
Features:
- Generates sequence diagrams showing execution flow
- Groups interactions by contract boundaries
- Highlights external calls and state changes
- Produces GitHub-compatible Mermaid syntax
Use Cases:
- Documentation for developers
- High-level system overviews
- Communication with non-technical stakeholders
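The abstraction step, keeping inter-contract calls and dropping internal ones, can be sketched as a filter followed by Mermaid text generation. The `Call` struct and `to_sequence_diagram` are hypothetical names, and the input is assumed to be already ordered by execution flow.

```rust
/// One call observed while walking from an entry point (illustrative shape).
pub struct Call {
    pub from_contract: &'static str,
    pub to_contract: &'static str,
    pub function: &'static str,
}

/// Emits a Mermaid sequence diagram that shows only cross-contract calls.
pub fn to_sequence_diagram(calls: &[Call]) -> String {
    let mut out = String::from("sequenceDiagram\n");
    for call in calls {
        // Calls that stay inside one contract are implementation detail.
        if call.from_contract == call.to_contract {
            continue;
        }
        out.push_str(&format!(
            "    {}->>{}: {}()\n",
            call.from_contract, call.to_contract, call.function
        ));
    }
    out
}
```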
Each CLI tool demonstrates a different way of leveraging the Call Graph, from direct visualization to complex transformations.
Pipeline: Source → Parser → Call Graph → DOT/Mermaid Generator
sol2cg is the most straightforward consumer of the Call Graph:
- Builds the graph using the standard pipeline
- Applies minimal transformation
- Outputs via selected generator (DOT or Mermaid)
- Supports configuration options for graph filtering
Pipeline: Source → Parser → Call Graph → codegen Crate → Foundry Test Files
sol2test performs the most complex transformation:
- Builds complete Call Graph of the contract system
- Analyzes function signatures and dependencies
- Invokes codegen crate for test generation
- Produces ready-to-run Foundry test suites
Pipeline: Source → Parser → Call Graph → Storage Access Analyzer → Markdown Table
sol-storage-analyzer is a specialized analysis tool:
- Builds Call Graph with storage access edges
- Runs storage access analysis on all entry points
- Aggregates read/write patterns per function
- Generates markdown tables for documentation
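The final rendering step can be sketched as turning per-function read/write lists into a Markdown table. The `StorageRow` struct, the column headings, and the `-` placeholder for empty cells are assumptions; the tool's actual table layout may differ.

```rust
/// Aggregated storage access for one entry point (illustrative shape).
pub struct StorageRow {
    pub function: &'static str,
    pub reads: Vec<&'static str>,
    pub writes: Vec<&'static str>,
}

/// Joins variable names for one table cell, using "-" when there are none.
fn cell(items: &[&str]) -> String {
    if items.is_empty() {
        "-".to_string()
    } else {
        items.join(", ")
    }
}

/// Renders the rows as a Markdown table with a fixed three-column header.
pub fn to_markdown(rows: &[StorageRow]) -> String {
    let mut out = String::from("| Function | Reads | Writes |\n|---|---|---|\n");
    for row in rows {
        out.push_str(&format!(
            "| {} | {} | {} |\n",
            row.function,
            cell(&row.reads),
            cell(&row.writes)
        ));
    }
    out
}
```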
Pipeline: Source → Parser → Call Graph → Storage Analyzer (2 functions) → Diff Output
storage-trace is a differential analysis tool:
- Builds Call Graph for the entire codebase
- Extracts subgraphs for specified functions
- Compares storage access patterns
- Outputs differences for upgrade safety analysis
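The comparison step can be sketched as a set difference in both directions. The `AccessDiff` struct, the `diff_access` function, and the `kind:variable` string encoding of accesses are assumptions for illustration only.

```rust
use std::collections::BTreeSet;

/// Accesses that appear in exactly one of the two compared functions.
#[derive(Debug, PartialEq)]
pub struct AccessDiff {
    pub only_in_first: Vec<String>,
    pub only_in_second: Vec<String>,
}

/// Compares two sets of storage accesses, each encoded as "kind:variable".
pub fn diff_access(first: &BTreeSet<String>, second: &BTreeSet<String>) -> AccessDiff {
    AccessDiff {
        only_in_first: first.difference(second).cloned().collect(),
        only_in_second: second.difference(first).cloned().collect(),
    }
}
```

For upgrade review, a non-empty `only_in_second` list is the interesting case: it names storage the new function touches that the old one did not.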
Pipeline: Source → Parser → Call Graph with Natspec → Binding Config File
sol2bnd is an interface mapping tool:
- Extracts Natspec documentation from source
- Identifies interfaces and implementations
- Maps relationships using Call Graph structure
- Generates YAML configuration for cross-contract bindings
The Traverse architecture embodies several key principles:
- Single Call Graph structure serves all analysis needs
- Ensures consistency across tools
- Reduces redundancy and maintenance burden
- New analysis steps can be added without modifying the core
- Steps are composable and independently configurable
- Facilitates experimentation and tool evolution
- Rust's type system ensures graph integrity
- Prevents invalid graph states at compile time
- Makes refactoring safer and more predictable
- Tree-sitter provides robust error recovery
- Partial parsing allows analysis of incomplete code
- Tools degrade gracefully with malformed input
- Direct graph manipulation avoids intermediate representations
- Rust's zero-cost abstractions ensure efficiency
- Memory-safe without garbage collection overhead
Current limitations stem from the inherent challenges of static analysis:
- Cannot fully analyze calls through arbitrary interface addresses
- Limited visibility into delegate calls and proxy patterns
- Difficulty tracking storage slots in upgradeable contracts
- Assembly blocks are not fully analyzed
- Raw `delegatecall` and `call` operations have limited tracking
- Bytecode-level operations are outside current scope
- Deep inheritance hierarchies may have resolution edge cases
- Virtual function overrides across multiple levels need careful handling
- Diamond inheritance patterns require special consideration
- External library calls without source code cannot be fully analyzed
- Linked libraries require additional binding configuration
Potential future directions for the Traverse project:
- Vyper Integration: Extend the parser to handle Vyper contracts
- Yul/Assembly: Deeper analysis of inline assembly blocks
- Cross-Language: Analyze interactions between different smart contract languages
- Taint Analysis: Track data flow from untrusted sources
- Symbolic Execution: Integration with formal verification tools
- Gas Optimization: Identify expensive call patterns and suggest optimizations
- Vulnerability Detection: Automated scanning for common vulnerabilities
- Invariant Inference: Automatically derive contract invariants from code
- Attack Path Generation: Find potential exploit sequences
- Language Server Protocol: Real-time analysis in development environments
- CI/CD Integration: Automated analysis in deployment pipelines
- Cloud-Based Analysis: Scalable analysis service for large codebases
- Pattern Recognition: Learn common contract patterns and anti-patterns
- Anomaly Detection: Identify unusual code structures
- Code Quality Metrics: ML-based code quality assessment
The Call Graph serves as the central abstraction powering all Traverse analysis tools. The architecture maintains a clear separation between parsing, graph construction, analysis, and output generation, enabling extensibility while ensuring consistency across all tools.
The pipeline architecture allows for future enhancements without disrupting existing functionality, while Rust provides performance and memory safety guarantees. Traverse's modular design enables adaptation as the Solidity ecosystem evolves.