TraceRender is a from-scratch Chrome JSON trace viewer built with C++17, Dear ImGui (docking branch), SDL3, OpenGL, and SQLite. It runs natively on desktop (Linux, Windows, macOS) and in the browser via WebAssembly/Emscripten. The viewer handles large traces (100MB+) at 60fps through SAX streaming parsing, string interning, spatial indexing, and sub-pixel culling.
┌─────────────────────────────────────────────────────────┐
│ main.cpp Entry point, SDL3/OpenGL init, │
│ event loop, ImGui setup │
├─────────────────────────────────────────────────────────┤
│ App Shell: dockspace layout, panel │
│ orchestration, settings, file loading │
├──────────┬──────────────────────────────┬───────────────┤
│ Platform │ UI Layer │ Tracing │
│ │ │ │
│ SDL3/GL │ TimelineView DetailPanel │ Self-profile │
│ File I/O │ SearchPanel FilterPanel │ Chrome JSON │
│ WASM │ StatsPanel FlameGraph │ output │
│ Memory │ InstancePanel SourcePanel │ │
│ │ Diagnostics Toolbar │ │
│ │ CounterTrack FlowRenderer │ │
│ │ RangeStats KeyBindings │ │
├──────────┴──────────┬───────────────────┴───────────────┤
│ │ │
│ Parser │ Model │
│ SAX JSON handler │ TraceModel, TraceEvent, ViewState │
│ B/E pair matching │ BlockIndex, QueryDb, ColorPalette │
│ │ │
└─────────────────────┴────────────────────────────────────┘
src/
├── main.cpp SDL3/OpenGL init, ImGui setup, main loop
├── app.h / app.cpp App shell: owns all panels, model, view state
├── tracing.h RAII self-profiling tracer (Chrome JSON output)
│
├── model/ Data layer
│ ├── trace_event.h TraceEvent (~40 bytes), Phase enum (17 phases)
│ ├── trace_model.h/cpp Central container: events, string pool, processes,
│ │ threads, counters, flows, indexes, call stack nav
│ ├── block_index.h 256-event block spatial index (O(log N) range queries)
│ ├── color_palette.h 48-color hash-based category/name coloring
│ └── query_db.h/cpp SQLite in-memory DB with async execution
│
├── parser/
│ └── trace_parser.h/cpp SAX streaming JSON parser, B/E matching, indexing
│
├── platform/ Compile-time platform switching
│ ├── platform.h Abstract interface (GL, dialogs, main loop, settings)
│ ├── platform_desktop.cpp SDL3/OpenGL 3.3, native file dialogs, ~/.config persistence
│ ├── platform_wasm.cpp Emscripten/WebGL2, JS interop, browser file I/O
│ ├── file_loader.h Async loading interface (4-phase progress)
│ ├── file_loader_desktop.cpp Background std::thread loader
│ ├── file_loader_wasm.cpp Synchronous loader (no threads)
│ └── memory.h RSS memory query (Linux /proc, macOS task_info)
│
├── ui/ ImGui panels and rendering
│ ├── view_state.h Viewport, selection, filters, layout, key bindings
│ ├── timeline_view.h/cpp Ruler, tracks, event boxes, zoom/pan, hit testing
│ ├── detail_panel.h/cpp Call stack, children breakdown, arguments tree
│ ├── search_panel.h/cpp Text search, name stats, unique-by-name filter
│ ├── filter_panel.h/cpp Process/thread/category visibility toggles
│ ├── stats_panel.h/cpp SQL editor, visual query builder, result tables
│ ├── flame_graph_panel.h/cpp Per-thread icicle charts, flat node pool
│ ├── instance_panel.h/cpp Function instance list with keyboard nav
│ ├── source_panel.h/cpp Source viewer with path remapping
│ ├── diagnostics_panel.h/cpp FPS/memory sparklines, render stats
│ ├── counter_track.h/cpp Step-function graphs with sub-pixel merging
│ ├── flow_renderer.h/cpp Bezier arrows for flow/async events
│ ├── toolbar.h/cpp Menu bar: file open, zoom, memory, settings
│ ├── range_stats.h/cpp Aggregated stats for selected time ranges
│ ├── key_bindings.h/cpp Rebindable shortcuts (17 actions, primary+alt)
│ ├── format_time.h Auto-scaling time display (ns/us/ms/s)
│ ├── export_utils.h CSV/TSV export (RFC 4180)
│ ├── string_utils.h Case-insensitive substring search
│ └── sort_utils.h Three-way comparators for ImGui tables
│
tests/ Google Test unit tests (15 files)
scripts/ Build/test/format wrapper scripts
web/ WebAssembly HTML shell
The fundamental unit. All string data is interned; events store indices, not copies.
| Field | Type | Description |
|---|---|---|
name_idx |
uint32_t |
Index into string pool |
cat_idx |
uint32_t |
Category index into string pool |
ph |
Phase |
Event phase (B/E/X/i/C/s/t/f/M/b/e/n/N/O/D/P/R) |
ts |
double |
Timestamp in microseconds |
dur |
double |
Duration in microseconds |
pid |
uint32_t |
Process ID |
tid |
uint32_t |
Thread ID |
id |
uint64_t |
Flow/async event ID |
args_idx |
uint32_t |
Index into args storage (UINT32_MAX = none) |
depth |
uint8_t |
Nesting depth (computed in build_index) |
is_end_event |
bool |
True for matched 'E' events (filtered after indexing) |
parent_idx |
int32_t |
Parent event index (-1 if root, computed in build_index) |
self_time |
double |
Wall time minus children (computed in build_index) |
Central container owning all trace data:
events_— Flat vector of all TraceEvent objectsstrings_— Interned string pool (names, categories)string_map_— Hash map for deduplication during parsingargs_— Raw JSON strings for event arguments (lazy deserialization)processes_— ProcessInfo hierarchy, each containing ThreadInfo vectorcounter_series_— Counter tracks:(timestamp, value)pairs per seriesflow_groups_— Map from flow ID to event indicescategories_— Unique sorted category indicesname_to_events_— Map from name_idx to sorted event indices
TraceModel
├── events[] Flat array, all events across all threads
├── strings[] Interned string pool
├── args[] JSON argument strings
├── processes[]
│ └── ProcessInfo pid, name, sort_index
│ └── threads[]
│ └── ThreadInfo tid, name, sort_index, max_depth
│ ├── event_indices[] Sorted by (ts, -dur)
│ └── block_index Spatial index for range queries
├── counter_series[] (pid, name) → (timestamp, value) pairs
├── flow_groups{} flow_id → [event_indices]
├── categories[] Unique category indices
└── name_to_events{} name_idx → [event_indices]
Shared mutable state passed to all panels by reference:
- Viewport:
view_start_ts,view_end_ts,trace_start_ts,trace_end_ts - Selection: Selected event index, pending scroll flag
- Range: Start/end timestamps, drag state
- Filters: Hidden pid/tid/category sets
- Search: Query string, result indices, current index
- Layout: Track height, label width, ruler height, counter height, flame bar settings
- Rendering: Flow arrow visibility, selection border color, time unit (ns/us)
- Coordinate helpers:
time_to_x(),x_to_time(),zoom_to_fit(),navigate_to_event()
Uses nlohmann/json SAX handler to process traces without building a DOM tree:
- Read file — 4MB chunked reads with progress reporting
- SAX parse — State machine: TopLevel → InTopObject → InTraceEvents → InEvent → InArgs
- Event extraction — Fields mapped to TraceEvent, strings interned, args reconstructed as JSON
- Special handling:
- Counter events (C): Split into separate CounterSeries per
(pid, arg_key) - Flow events (s/t/f): Grouped by ID
- Metadata (M): Set process/thread names and sort indices
- Duration B/E: Stored as-is, matched later in build_index
- Counter events (C): Split into separate CounterSeries per
All derived data computed once, stored on events/threads:
- Sort — Events per thread sorted by
(ts, -dur)(parents before children) - Match B/E pairs — Stack-based: 'B' pushes, 'E' pops and computes duration
- Per-thread processing:
- Remove matched end events and metadata
- Deduplicate same-name/same-timestamp events
- Compute nesting depth via stack of active spans
- Set
parent_idxfrom stack - Compute
self_time = dur - sum(immediate_children.dur) - Build BlockIndex (256-event blocks with min_ts, max_end_ts, depth_mask)
- Global state — Sort processes/threads, collect categories, build name→events map, compute counter min/max, cache aggregate stats
Events grouped into 256-event blocks for fast visible-range queries:
| Block Field | Purpose |
|---|---|
min_ts |
Earliest event timestamp in block |
max_end_ts |
Latest end timestamp (monotonically propagated) |
local_max_end_ts |
Max end within this block only |
start_idx / count |
Position in event_indices |
depth_mask |
Bitmask of depths 0-31 present |
Query: Binary search for first block where max_end_ts >= range_start, iterate blocks until min_ts > range_end. O(log N) block search + O(K) per visible event.
main_loop_step()
├── SDL_PollEvent() Handle input, quit, file drops
├── ImGui_NewFrame() Begin ImGui frame
├── App::update()
│ ├── FileLoader::poll() Check async load completion
│ ├── Dockspace setup Configure panel layout (first frame)
│ ├── Toolbar::render() Menu bar
│ ├── Settings modal If open
│ ├── Panel rendering If trace loaded:
│ │ ├── TimelineView Ruler, tracks, events, selection
│ │ ├── DetailPanel Call stack, children, args
│ │ ├── SearchPanel Search results table
│ │ ├── FilterPanel Visibility checkboxes
│ │ ├── StatsPanel SQL queries
│ │ ├── InstancePanel Function instances
│ │ ├── SourcePanel Source code
│ │ ├── FlameGraphPanel Icicle charts
│ │ └── DiagnosticsPanel FPS/memory sparklines
│ ├── Welcome screen If no trace
│ ├── Loading overlay If loading (spinner + progress bar)
│ └── Status bar Selected event, load progress
├── ImGui::Render() Process draw commands
├── glClear + RenderDrawData OpenGL rendering
└── SDL_GL_SwapWindow() Present frame
- ImDrawList for direct draw commands (no ImGui widgets)
- Sub-pixel culling: Events < 2px wide merged into per-depth gray bars
- Clip regions: Per-thread clip rects prevent overdraw
- Text culling: Labels only drawn when event box is wide enough
- Instant events: Rendered as diamond shapes
- Counter tracks: Step-function line graphs with sub-pixel point merging
- Flow arrows: Bezier curves from flow start to flow end across track positions
┌──────────────┬────────────────────────────────┐
│ │ │
│ Diagnostics │ Timeline / Source │
│ (18%) │ (center) │
│ │ │
│ ├─────────────────┬──────────────┤
│ │ │ │
│ │ │ Details / │
│ │ │ Filters │
│ │ │ (35%) │
├──────────────┴─────────────────┤ │
│ │ │
│ Search / Stats / Flame Graph │ │
│ (bottom 42%) │ │
│ ┌──────────┤ │
│ │Instances │ │
└─────────────────────┴──────────┴──────────────┘
The platform:: namespace provides a compile-time abstraction:
| Function | Desktop (SDL3/GL 3.3) | WASM (Emscripten/WebGL2) |
|---|---|---|
| Main loop | while (*running) step() |
emscripten_set_main_loop() |
| File open | SDL3 native dialog | HTML <input type="file"> + JS |
| File save | SDL3 save dialog | Browser download via Blob |
| File drop | SDL_EVENT_DROP_FILE | JS interop |
| Settings | ~/.config/trace-render/settings.json |
None |
| Font scale | 1.4x | 1.5x |
| GLSL | #version 330 core |
#version 300 es |
| File loading | Background std::thread |
Synchronous |
| Memory query | /proc/self/statm (Linux) / task_info (macOS) |
N/A |
SQLite in-memory database populated from TraceModel after loading:
-- Schema
events (id, name, category, phase, ts, dur, end_ts, pid, tid, depth)
processes (pid, name)
threads (tid, pid, name)
counters (pid, name, ts, value)- Async execution on background thread with progress tracking
- Visual query builder (SELECT with aggregates, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT)
- Multiple saved query tabs with JSON persistence
- Sortable result tables with CSV/TSV export
The Tracer singleton emits Chrome JSON trace format for profiling the viewer itself:
TRACE_SCOPE(name)/TRACE_FUNCTION()— RAII duration events with file/line/func argsTRACE_SCOPE_ARGS(name, cat, ...)— Duration events with user-supplied args- Counter events for FPS and RSS memory written each frame
- Output trace can be loaded back into the viewer
- Enabled via
--trace output.jsonCLI flag
15 Google Test files covering all non-UI logic:
| Test File | Coverage |
|---|---|
test_parser |
JSON formats, metadata, counters, flows, B/E matching, time units |
test_trace_model |
String interning, hierarchy, depth, parent/sibling nav, self-time, dedup |
test_trace_event |
Phase mapping, end_ts, defaults |
test_block_index |
Range queries, multi-block spans, empty index |
test_query_db |
SQL execution, aggregation, joins, error handling |
test_view_state |
Coordinate conversion, zoom, range selection, bounds clamping |
test_timeline_hit_test |
Category/depth filtering, shortest-duration selection |
test_search_panel |
Name stats, unique-by-name, empty results |
test_flame_graph_panel |
Tree building, aggregation, range/filter scoping, deep nesting |
test_counter_track |
Value lookup, point merging, zoom scenarios |
test_source_panel |
Location extraction, path remapping (Windows/WSL/Jenkins) |
test_format_time |
All units, boundaries, negative values, ruler formatting |
test_export_utils |
CSV/TSV with RFC 4180 quoting |
test_tracing |
Arg serialization, escaping, scope macros, integration |
test_panel_reset |
Panel lifecycle: on_model_changed() resets |
CI runs on Ubuntu (GCC, Clang) and Windows (MSVC) via GitHub Actions.
All fetched via CMake FetchContent (no system packages except OpenGL):
| Dependency | Version | Purpose |
|---|---|---|
| Dear ImGui | docking branch | Immediate-mode UI framework |
| SDL3 | 3.2.8 (static) | Windowing, input, OpenGL/WebGL context |
| nlohmann/json | 3.11.3 | SAX streaming JSON parser |
| SQLite | 3.45.1 (amalgamation) | In-memory SQL query engine |
| Google Test | 1.15.2 | Unit testing (desktop only) |
- Pre-compute in build_index() — Depth, parent indices, self times, categories, name maps all computed once. The render loop does lookups, not scans.
- String interning — All names and categories stored once in a shared pool. Events reference 4-byte indices, not string copies.
- SAX streaming — Never build a DOM tree. Parse events one at a time for bounded memory usage on large traces.
- Lazy deserialization — Event arguments stored as raw JSON strings, only parsed when displayed in the detail panel.
- Spatial indexing — BlockIndex enables O(log N) visible-range queries per thread, with depth bitmasks for filtered queries.
- ImDrawList rendering — Bypass ImGui widget overhead. Direct draw commands for thousands of colored rectangles at 60fps.
- Sub-pixel culling — Events narrower than ~2px merged into per-depth gray bars to prevent overdraw and GPU waste.
- Flat data structures — Flame graph uses index-based node pools (not pointers) for cache-friendly traversal and safe rebuilds.
- Deferred expensive work — Range statistics computed when drag ends, not during. SQL queries run async with progress.
- Platform abstraction — Desktop and WASM share all business logic; only windowing/file I/O differs via compile-time switching.