RollingStone aims to be a high-fidelity simulator of RocksDB's LSM tree behavior, based on the official RocksDB documentation.
- Memtable: In-memory write buffer (64-256 MB typical)
- Immutable Memtables: Queue of frozen memtables waiting for flush
- L0 (Tiered): Overlapping SST files from memtable flushes
- L1-L6 (Leveled): Non-overlapping sorted runs, exponentially growing
L0 → L1 Compaction:
- Triggered when L0 file count ≥ `level0_file_num_compaction_trigger` (default: 4)
- Usually picks all L0 files (they overlap)
- Merges with overlapping L1 files
- Applies reduction factor (0.9 for L0→L1, simulating deduplication)
- Not parallelized by default (RocksDB limitation)
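The trigger and merge behavior above can be sketched in a few lines (the `Level` type and function names are illustrative stand-ins, not the simulator's actual structs):

```go
package main

import "fmt"

// Level holds file sizes in MB; a simplified stand-in for the simulator's
// per-level state.
type Level struct {
	Files []float64
}

// needsL0Compaction mirrors the trigger above: compact once the L0 file
// count reaches level0_file_num_compaction_trigger.
func needsL0Compaction(l0 Level, trigger int) bool {
	return len(l0.Files) >= trigger
}

// compactL0ToL1 merges all L0 files (they overlap) with the L1 files and
// applies the 0.9 reduction factor simulating deduplication.
func compactL0ToL1(l0, l1 Level) float64 {
	total := 0.0
	for _, size := range append(l0.Files, l1.Files...) {
		total += size
	}
	return total * 0.9
}

func main() {
	l0 := Level{Files: []float64{64, 64, 64, 64}}
	l1 := Level{Files: []float64{128}}
	fmt.Println(needsL0Compaction(l0, 4)) // true: 4 files >= trigger of 4
	fmt.Println(compactL0ToL1(l0, l1))    // (256 + 128) * 0.9 = 345.6 MB
}
```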
L1+ → L1+ Compaction:
- Scored by `size / target_size` (or `file_count / trigger` for L0)
- Highest-scoring level compacts first
- Picks subset of files using statistical distributions:
- Source files: Uniform or Exponential distribution
- Overlapping files in next level: Exponential distribution
- Splits output into multiple files based on `target_file_size_base * multiplier^(level-1)`
- Can run in parallel (up to `max_background_jobs`, default: 2)
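The scoring rule and per-level output file sizing translate directly into code (a minimal sketch; function names are illustrative):

```go
package main

import (
	"fmt"
	"math"
)

// compactionScore follows the rule above: size/target for L1+,
// fileCount/trigger for L0.
func compactionScore(level int, sizeMB, targetMB float64, fileCount, trigger int) float64 {
	if level == 0 {
		return float64(fileCount) / float64(trigger)
	}
	return sizeMB / targetMB
}

// targetFileSize computes the per-level output file size:
// target_file_size_base * multiplier^(level-1).
func targetFileSize(baseMB, multiplier float64, level int) float64 {
	return baseMB * math.Pow(multiplier, float64(level-1))
}

func main() {
	fmt.Println(compactionScore(0, 0, 0, 6, 4))     // 1.5: L0 has 6 files against a trigger of 4
	fmt.Println(compactionScore(2, 750, 500, 0, 0)) // 1.5: L2 is 1.5x its target size
	fmt.Println(targetFileSize(64, 2, 3))           // 256: 64 MB * 2^2 for L3 output files
}
```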
Dynamic Thresholds:
- If target level is empty: require 2.0x over target before compacting
- If target level has 1-2 files: require 1.5x over target
- Otherwise: require 1.0x over target (compact when over target)
This prevents premature compaction into empty levels and allows levels to naturally accumulate files.
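The threshold rule fits in one small function (a sketch; the simulator's naming may differ):

```go
package main

import "fmt"

// requiredScore implements the dynamic thresholds above: compacting into an
// empty target level requires 2.0x over target, a target level with 1-2
// files requires 1.5x, and anything else compacts as soon as it is over
// target (1.0x).
func requiredScore(targetLevelFileCount int) float64 {
	switch {
	case targetLevelFileCount == 0:
		return 2.0
	case targetLevelFileCount <= 2:
		return 1.5
	default:
		return 1.0
	}
}

func main() {
	fmt.Println(requiredScore(0)) // 2
	fmt.Println(requiredScore(2)) // 1.5
	fmt.Println(requiredScore(5)) // 1
}
```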
Disk as Shared Resource:
- A single `diskBusyUntil` token tracks when the disk is free
- All flushes and compactions consume from the same bandwidth pool
- Operation duration = `(inputSize + outputSize) / ioThroughputMBps`
I/O Profiles:
- EBS gp3: 500 MB/s, 3ms latency
- NVMe: 3000 MB/s, 0.1ms latency
- HDD: 150 MB/s, 10ms latency
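The profiles above map naturally onto a small config table (an illustrative Go representation; map keys are assumptions):

```go
package main

import "fmt"

// IOProfile holds the two knobs the simulator uses per device.
type IOProfile struct {
	ThroughputMBps float64
	LatencyMs      float64
}

// Values from the profile list above.
var ioProfiles = map[string]IOProfile{
	"ebs-gp3": {ThroughputMBps: 500, LatencyMs: 3},
	"nvme":    {ThroughputMBps: 3000, LatencyMs: 0.1},
	"hdd":     {ThroughputMBps: 150, LatencyMs: 10},
}

func main() {
	p := ioProfiles["nvme"]
	fmt.Println(p.ThroughputMBps, p.LatencyMs) // 3000 0.1
}
```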
Write Stalls:
- When `numImmutableMemtables >= maxWriteBufferNumber`, incoming writes stall
- The WriteEvent is rescheduled with a 100ms delay
- Simulates RocksDB's backpressure mechanism
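The stall decision itself is a one-line check (a sketch with illustrative names; the real handler also applies the write to the memtable):

```go
package main

import "fmt"

// processWrite applies the backpressure rule above: if the immutable
// memtable queue is full, the write is not applied and the WriteEvent is
// rescheduled 100ms later in virtual time.
func processWrite(now float64, numImmutableMemtables, maxWriteBufferNumber int) (applied bool, retryAt float64) {
	if numImmutableMemtables >= maxWriteBufferNumber {
		return false, now + 0.1 // stall: retry in 100ms
	}
	return true, now
}

func main() {
	applied, retryAt := processWrite(5.0, 2, 2)
	fmt.Println(applied, retryAt) // false 5.1
}
```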
CompactionCheckEvent (fires every 1 virtual second):
- Simulates RocksDB's background compaction threads
- Scores all levels
- Schedules compactions up to `maxBackgroundJobs` slots
- Decoupled from flush events (compactions happen independently)
- Don't track actual keys or ranges
- Use statistical distributions for file selection and overlap
- Much faster than tracking every key
- RocksDB can compact L0 files with each other before L1
- Not currently implemented (TODO)
- RocksDB can parallelize L0→L1 using subcompactions
- Currently, L0→L1 uses 1 job slot (faithful to default RocksDB)
- Read amplification = `1 + numImmutableMemtables + L0_file_count + num_levels_with_data`
- No actual read workload or bloom filter modeling (yet)
- WA = `TotalBytesWrittenToDisk / TotalBytesWrittenByUser`: includes flushes and compactions
- RA = `1 + numImmutableMemtables + L0_fileCount + num_non_empty_levels`: represents the worst-case file lookups for a point query
- SA = `TotalSizeOnDisk / LogicalDataSize`: tracks overhead from uncompacted data and obsolete keys
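The three formulas translate directly (struct and field names are illustrative, not the simulator's actual `Metrics` fields):

```go
package main

import "fmt"

// AmpCounters holds the running totals behind the three formulas above.
type AmpCounters struct {
	BytesWrittenToDisk float64 // flushes + compactions
	BytesWrittenByUser float64
	TotalSizeOnDisk    float64
	LogicalDataSize    float64
}

// WriteAmp: total disk writes per byte the user wrote.
func (c AmpCounters) WriteAmp() float64 { return c.BytesWrittenToDisk / c.BytesWrittenByUser }

// SpaceAmp: on-disk footprint per byte of logical data.
func (c AmpCounters) SpaceAmp() float64 { return c.TotalSizeOnDisk / c.LogicalDataSize }

// ReadAmp: worst-case file lookups for a point query.
func ReadAmp(numImmutableMemtables, l0Files, nonEmptyLevels int) int {
	return 1 + numImmutableMemtables + l0Files + nonEmptyLevels
}

func main() {
	c := AmpCounters{BytesWrittenToDisk: 3000, BytesWrittenByUser: 1000,
		TotalSizeOnDisk: 1200, LogicalDataSize: 1000}
	fmt.Println(c.WriteAmp())     // 3
	fmt.Println(c.SpaceAmp())     // 1.2
	fmt.Println(ReadAmp(1, 4, 3)) // 9
}
```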
- Window: 100ms around the current time (`[now-0.05s, now+0.05s]`)
- Calculation: sum the bandwidth of all active I/O operations
- Capping: bandwidths are scaled proportionally if the sum exceeds the physical disk limit
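Proportional capping is a single scaling pass (a sketch):

```go
package main

import "fmt"

// capThroughput scales each active operation's bandwidth proportionally
// when the sum exceeds the physical disk limit, as described above.
func capThroughput(bandwidthsMBps []float64, limitMBps float64) []float64 {
	sum := 0.0
	for _, b := range bandwidthsMBps {
		sum += b
	}
	if sum <= limitMBps {
		return bandwidthsMBps // under the limit: nothing to do
	}
	scale := limitMBps / sum
	capped := make([]float64, len(bandwidthsMBps))
	for i, b := range bandwidthsMBps {
		capped[i] = b * scale
	}
	return capped
}

func main() {
	// Three operations totaling 800 MB/s on a 400 MB/s disk are halved.
	fmt.Println(capThroughput([]float64{300, 300, 200}, 400)) // [150 150 100]
}
```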
App
├── Header (RollingStone title, connection status)
├── SimulationControls
│ ├── Play/Pause/Reset buttons
│ ├── LSM Configuration (collapsible)
│ ├── Workload & Traffic (collapsible)
│ ├── I/O Performance (collapsible)
│ └── Advanced LSM Tuning (collapsible)
├── MetricsDashboard
│ ├── Key metrics (4-card grid)
│ ├── Write throughput (numeric display)
│ └── Active I/O operations (compaction details)
└── LSMTreeVisualization
├── Memtable (green bar)
└── Levels (L0-L6, size bars, compaction indicators)
Each config section shows a summary line when collapsed:
- "LSM Configuration: 7 levels, 64 MB memtable, L0 trigger: 4"
- "Workload: 10.0 MB/s write rate"
- "I/O Performance: EBS gp3 (500 MB/s)"
Click to expand/collapse. Keeps UI manageable.
Static (disabled while running):
- LSM structure, compaction settings, I/O profiles
- Require simulation reset to apply
- Shows "Reset required" badge when changed
Dynamic (adjustable live):
- `writeRateMBps` slider (0-500 MB/s)
- Changes take effect immediately (next WriteEvent)
LSM Presets:
- Small (3 levels) - for debugging
- Default (7 levels) - standard RocksDB
- Large (9 levels) - very large databases
I/O Presets:
- EBS gp3, EBS io2, NVMe, HDD, Custom
- Sets `ioThroughputMBps` and `ioLatencyMs`
Compaction Indicators:
- Yellow pulsing dot + border on level card
- Shows input→output sizes: "Compacting: 500.0 MB → 495.0 MB"
- Multiple compactions shown as comma-separated
Size Bars:
- Logarithmic scale (handles wide size ranges)
- Color gradient: blue (small) → purple (large)
Active I/O:
- Count: "3 active writes"
- Details: "Flush → L0: 64.0 MB", "L1→L2: 500.0 MB → 495.0 MB"
simulator.go:
- `Simulator` struct: holds all simulation state
- `Step()`: processes events until `targetTime` is reached
- `processEvent()`: dispatches to specific event handlers
- `processWrite()`, `processFlush()`, `processCompaction()`, `processCompactionCheck()`
- `tryScheduleCompaction()`: scores levels, picks the highest-priority compaction
lsm.go:
- `LSMTree` struct: MemTable + Levels
- `FlushMemtable()`: creates an L0 file from a frozen memtable
- `CreateSSTFile()`: generates an SST file with an ID
- `calculateCompactionScore()`: score = size/target (or fileCount/trigger for L0)
- `pickLevelToCompact()`: finds the highest-scoring level above the threshold
- `State()`: serializes the LSM tree for the UI (limit 20 files/level)
compactor.go:
- `Compactor` interface: `NeedsCompaction()`, `PickCompaction()`, `ExecuteCompaction()`
- `LeveledCompactor`: statistical file selection (no key tracking)
- `PickCompaction()`: selects source files + overlapping target files using distributions
- `ExecuteCompaction()`: merges files, applies reduction, splits into target-sized files
metrics.go:
- `Metrics` struct: tracks amplifications, throughput, latencies
- `RecordWrite()`, `RecordFlush()`, `RecordCompaction()`
- `StartWrite()`, `CompleteWrite()`: track in-progress I/O for throughput
- `calculateThroughput()`: instantaneous bandwidth calculation (100ms window)
- `CapThroughput()`: enforces physical disk limits
events.go:
- `Event` interface: `Timestamp()`, `Type()`
- `WriteEvent`, `FlushEvent`, `CompactionEvent`, `CompactionCheckEvent`
- Each event stores its start time for duration tracking
event_queue.go:
- Min-heap priority queue (ordered by event timestamp)
- `Push()`, `Pop()`, `Peek()`, `Len()`
store.ts:
- Zustand store (global state)
- WebSocket connection management
- Message dispatcher (`handleMessage()`)
- State: `connectionStatus`, `isRunning`, `currentConfig`, `currentMetrics`, `currentState`, `metricsHistory`
types.ts:
- TypeScript interfaces for all messages and data structures
- `SimulationConfig`, `SimulationMetrics`, `SimulationState`, `WSMessage`
components/:
- Each component is self-contained
- Uses Tailwind for styling, Lucide for icons
- Radix UI for accessible components (Slider, Collapsible)
Backend:
- Edit `simulator/*.go`
- Run: `go build -o /tmp/rollingstone cmd/server/main.go`
- Test: run `/tmp/rollingstone` and open `http://localhost:8080`
Frontend:
- Edit `web/src/**/*.tsx`
- Vite auto-reloads (if running `npm run dev`)
- Or rebuild: `cd web && npm run build` (output to `web/dist/`)
Extreme Workload (Stress Test):
- Write rate: 250-500 MB/s
- I/O: 500 MB/s (EBS)
- Compaction parallelism: 4-6 jobs
- Expect: Write stalls, L1+ compactions, high write amplification
Gentle Workload (Steady State):
- Write rate: 10-50 MB/s
- I/O: 500 MB/s
- Compaction parallelism: 2 jobs
- Expect: Stable L0, periodic compactions, moderate amplification
Backend Logs:
- `[SCORE]`: compaction scoring output (every check)
- `[SCHEDULE]`: when compactions are scheduled
- `[EXECUTE]`: compaction execution details
- View: `tail -f /tmp/server.log`
Frontend:
- Metrics history: `metricsHistory` in the Zustand store
- Connection issues: check the `connectionStatus` state
- Message flow: uncomment `console.log` in `store.ts` (careful: causes memory bloat)
Browser Memory Issues:
- Reduce UI update frequency (currently 500ms)
- Limit metrics history size (currently 500 entries)
- Disable charts (Recharts has memory leaks with frequent updates)
- Cap files per level (currently 20)
Simulation Speed:
- Increase step size (currently 0.1s per step)
- Reduce UI update frequency
- Profile with Go pprof if needed
- Add tooltips to all configuration options
- Backend auto-start/stop frontend (unified binary)
- Implement intra-L0 compactions
- Add read workload simulation
- Subcompactions for L0→L1 parallelization
- `level_compaction_dynamic_level_bytes=true` mode
- Bloom filter modeling (reduce read amplification)
- Chart improvements (fix memory issues, re-enable)
- CLI tool for batch simulations
- Export metrics to CSV/JSON
- Comparison mode (run multiple configs side-by-side)
- Animation speed control (faster/slower simulation playback)
When adding features:
- Maintain fidelity: Check RocksDB docs for correct behavior
- Add debug logging: Use
fmt.Printfwith prefixes like[FEATURE] - Test extreme configs: High write rates, many jobs, stress test
- Update docs: Keep README and this file in sync
- Memory conscious: Profile browser memory if adding UI updates
MIT