
feat: Upgrade rocksdb to v10.9.1 and add tuning profiles. #1977

Merged
thomas-nguy merged 21 commits into crypto-org-chain:main from JayT106:jt/update-rocksdb
Mar 10, 2026

Conversation

@JayT106
Contributor

@JayT106 JayT106 commented Feb 25, 2026

Summary

Provides configurable RocksDB tuning profiles (Validator, RPC, Archive) out-of-the-box to optimize point lookups, lock-free caching, and background async I/O based on the node workload.

  • Defines RocksDBConfig and node_type enum ("", "validator", "rpc", "archive") in app.toml.
  • Enables HyperClockCache, OptimizeForPointLookup, and AsyncIO/AutoReadaheadSize dynamically inside opendb_rocksdb.go.
  • Adds concurrent stress benchmark tests (opendb_rocksdb_stress_bench_test.go) validating the RPS and throughput gains across the node-type profiles.
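Based on the summary above, the new section in app.toml would look roughly like this. Section and key names are taken from the PR description; the comments and the chosen default are illustrative assumptions, not the merged template:

```toml
# Sketch of the rocksdb section in app.toml (illustrative, not the merged template).
[rocksdb]
# Tuning profile for the node's primary workload.
# Valid values: "" (default), "validator", "rpc", "archive"
node_type = "rpc"
```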

Summary by CodeRabbit

  • New Features

    • Added RocksDB configuration and template with node-type tuning profiles (validator, RPC, archive) and integrated it into app config.
  • Enhancements

    • Per-node RocksDB tuning applied (async I/O, auto readahead, point-lookup optimizations, alternative block-cache/hyperclock).
  • Tests

    • Added a RocksDB stress benchmark for large-scale concurrent reads and mixed workloads.
  • Chores

    • Upgraded RocksDB, native bindings, and dependency metadata.

@JayT106 JayT106 requested a review from a team as a code owner February 25, 2026 20:36
@JayT106 JayT106 requested review from calvinaco and randy-cro and removed request for a team February 25, 2026 20:36
@github-actions


@github-actions github-actions bot added the nix label Feb 25, 2026
@coderabbitai
Contributor

coderabbitai bot commented Feb 25, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough


Adds RocksDB tuning/profile support (rocksdb.node_type), propagates tune options through RocksDB initialization APIs, updates call sites and dependencies, adds a concurrent RocksDB stress benchmark, and bumps grocksdb/RocksDB versions and packaging adjustments.
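The node_type-to-tuning mapping described in this walkthrough can be sketched in plain Go. The struct and profile names follow the PR summary and the review diffs below; this is an illustrative sketch of the mapping logic, not the merged implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// RocksDBTuneUpOptions mirrors the tuning flags the PR threads through
// NewRocksdbOptions (field names taken from the review diffs; treat as a sketch).
type RocksDBTuneUpOptions struct {
	EnableOptimizeForPointLookup bool
	EnableAutoReadaheadSize      bool
	EnableHyperClockCache        bool
	EnableAsyncIo                bool
}

// tuneForNodeType maps the rocksdb.node_type config value to tuning flags,
// following the validator/rpc/archive profiles described in the PR.
// Unknown or empty values fall through to the zero-value (no extra tuning).
func tuneForNodeType(nodeType string) RocksDBTuneUpOptions {
	opts := RocksDBTuneUpOptions{}
	switch strings.ToLower(strings.TrimSpace(nodeType)) {
	case "validator":
		opts.EnableOptimizeForPointLookup = true
	case "rpc":
		opts.EnableOptimizeForPointLookup = true
		opts.EnableAutoReadaheadSize = true
		opts.EnableHyperClockCache = true
	case "archive":
		opts.EnableAsyncIo = true
		opts.EnableAutoReadaheadSize = true
		opts.EnableHyperClockCache = true
	}
	return opts
}

func main() {
	for _, nt := range []string{"", "validator", "rpc", "archive"} {
		fmt.Printf("%q -> %+v\n", nt, tuneForNodeType(nt))
	}
}
```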

Changes

  • App config integration (cmd/cronosd/cmd/root.go, cmd/cronosd/config/config.go, cmd/cronosd/config/toml.go): Adds RocksDBConfig with node_type, validation, default and TOML template; integrates rocksdb into CustomAppConfig and the returned template.
  • Call-site updates (cmd/cronosd/cmd/versiondb.go, cmd/cronosd/dbmigrate/migrate_rocksdb.go): Call sites updated to pass an opendb.RocksDBTuneUpOptions{} argument to NewRocksdbOptions / PrepareRocksDBOptions.
  • RocksDB core & API changes (cmd/cronosd/opendb/opendb_rocksdb.go): Introduces RocksDBTuneUpOptions; updates OpenDB, openRocksdb, loadLatestOptions, and NewRocksdbOptions signatures; applies per-node tuning flags (async IO, readahead, point-lookup, hyperclock cache) through options and read configuration.
  • Benchmarking (cmd/cronosd/opendb/opendb_rocksdb_stress_bench_test.go): Adds BenchmarkRocksDBStressConcurrent exercising validator/rpc/archive/default tuning profiles with concurrent read/write/scan workloads.
  • Dependency & packaging (go.mod, gomod2nix.toml, nix/rocksdb.nix, default.nix): Bumps grocksdb and RocksDB versions, updates cronos-store pseudo-versions/hashes, and adds platform-specific packaging/fixups for macOS/Linux.
  • Changelog (CHANGELOG.md): Adds UNRELEASED entry noting the RocksDB upgrade and tuning profiles.

Sequence Diagram(s)

sequenceDiagram
    participant App as Application
    participant Config as Config Parser
    participant OpenDB as opendb.OpenDB
    participant Options as NewRocksdbOptions
    participant Rocks as RocksDB

    App->>Config: load config (includes rocksdb.node_type)
    Config-->>OpenDB: provide AppOptions / RocksDBConfig
    OpenDB->>OpenDB: build RocksDBTuneUpOptions from node_type
    OpenDB->>Options: NewRocksdbOptions(..., tuneUpOpts)
    Options->>Rocks: create grocksdb.Options with tune flags
    OpenDB->>Rocks: openRocksdb(dir, readonly, tuneUpOpts)
    Rocks-->>OpenDB: DB handle
    OpenDB-->>App: return initialized DB

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • crypto-org-chain/cronos#1511: updates RocksDB dependency/versioning and related packaging changes — closely related to the dependency and nix updates in this PR.

Suggested labels

cronos, build

Suggested reviewers

  • randy-cro
  • calvinaco
  • thomas-nguy


🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): docstring coverage is 40.00%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)

  • Description Check (✅ Passed): check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): the PR title "feat: Upgrade rocksdb to v10.9.1 and add tuning profiles." accurately reflects the main changes, upgrading RocksDB and adding Validator/RPC/Archive tuning profiles, and is specific and concise.


@JayT106 JayT106 changed the title from "feat:(db) use rocksdb v10.9.1 and add tuning profiles." to "feat: update rocksdb to v10.9.1 and add tuning profiles." Feb 25, 2026
@socket-security

socket-security bot commented Feb 25, 2026

Contributor

@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cmd/cronosd/opendb/opendb_rocksdb.go (1)

99-150: ⚠️ Potential issue | 🟠 Major

Two independent 3 GB caches are allocated per DB open.

loadLatestOptions creates a grocksdb.Cache (lines 100–105) solely for the LoadLatestOptions call. NewRocksdbOptions then creates a second, separate 3 GB cache (lines 146–150) for bbto.SetBlockCache. The first cache is never shared with the block table factory and is effectively discarded after option loading, meaning every openRocksdb call on an existing DB allocates ~6 GB of cache objects — and unlike LRU, HyperClockCache pre-allocates its hash table at construction time, so both allocations are immediately resident.

The cache should be constructed once in openRocksdb and passed into both helpers.

♻️ Proposed fix
-func openRocksdb(dir string, readonly bool, tuneUpOpts RocksDBTuneUpOptions) (dbm.DB, error) {
-	opts, err := loadLatestOptions(dir, tuneUpOpts.EnableHyperClockCache)
+func newCache(enableHyperClockCache bool) *grocksdb.Cache {
+	if enableHyperClockCache {
+		return grocksdb.NewHyperClockCache(BlockCacheSize, 0)
+	}
+	return grocksdb.NewLRUCache(BlockCacheSize)
+}
+
+func openRocksdb(dir string, readonly bool, tuneUpOpts RocksDBTuneUpOptions) (dbm.DB, error) {
+	cache := newCache(tuneUpOpts.EnableHyperClockCache)
+	opts, err := loadLatestOptions(dir, cache)
 	if err != nil {
 		return nil, err
 	}
-	opts = NewRocksdbOptions(opts, false, tuneUpOpts)
+	opts = NewRocksdbOptions(opts, false, tuneUpOpts, cache)
 	...
 }

-func loadLatestOptions(dir string, enableHyperClockCache bool) (*grocksdb.Options, error) {
-	var cache *grocksdb.Cache
-	if enableHyperClockCache {
-		cache = grocksdb.NewHyperClockCache(BlockCacheSize, 0)
-	} else {
-		cache = grocksdb.NewLRUCache(BlockCacheSize)
-	}
+func loadLatestOptions(dir string, cache *grocksdb.Cache) (*grocksdb.Options, error) {
 	opts, err := grocksdb.LoadLatestOptions(dir, grocksdb.NewDefaultEnv(), true, cache)
 	...
 }

-func NewRocksdbOptions(opts *grocksdb.Options, sstFileWriter bool, tuneUpOpts RocksDBTuneUpOptions) *grocksdb.Options {
+func NewRocksdbOptions(opts *grocksdb.Options, sstFileWriter bool, tuneUpOpts RocksDBTuneUpOptions, cache *grocksdb.Cache) *grocksdb.Options {
 	...
-	if tuneUpOpts.EnableHyperClockCache {
-		bbto.SetBlockCache(grocksdb.NewHyperClockCache(BlockCacheSize, 0))
-	} else {
-		bbto.SetBlockCache(grocksdb.NewLRUCache(BlockCacheSize))
-	}
+	bbto.SetBlockCache(cache)
 	...
 }

Note: NewRocksdbOptions is also called externally (e.g., cmd/cronosd/cmd/versiondb.go and cmd/cronosd/dbmigrate/migrate_rocksdb.go); those callers can construct and pass their own cache, or a nil cache with a fallback inside NewRocksdbOptions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/cronosd/opendb/opendb_rocksdb.go` around lines 99 - 150,
loadLatestOptions currently allocates a grocksdb.Cache for LoadLatestOptions and
NewRocksdbOptions allocates another one, causing duplicated 3GB caches; change
the design so openRocksdb constructs a single grocksdb.Cache and passes it into
both loadLatestOptions and NewRocksdbOptions (add a cache parameter to both
functions), remove the cache allocation inside loadLatestOptions, and in
NewRocksdbOptions accept a nil cache fallback so external callers can pass their
own cache or nil; ensure all call sites (including cmd/cronosd/cmd/versiondb.go
and dbmigrate/migrate_rocksdb.go) are updated to pass the cache or rely on the
nil fallback, and keep behavior same when opts is nil.
🧹 Nitpick comments (4)
cmd/cronosd/opendb/opendb_rocksdb_stress_bench_test.go (2)

127-134: Mixed workload silently discards errors, unlike the reads-only benchmark.

In ConcurrentRandomReads, errors from db.Get are reported via b.Error. In ConcurrentMixed, both db.Get (Line 130) and db.Set (Line 134) errors are silently ignored. For a benchmark meant to validate correctness under load, consider reporting errors consistently.

♻️ Suggested consistency fix
 						if op < 8 {
 							// 80% Read
 							idx := (i * 17) % uint64(numKeys)
-							_, _ = db.Get(keys[idx])
+							val, err := db.Get(keys[idx])
+							if err != nil {
+								b.Error(err)
+							}
+							if val == nil {
+								b.Error("value is nil for read in mixed workload")
+							}
 						} else if op == 8 {
 							// 10% Write
 							idx := (i * 31) % uint64(numKeys)
-							_ = db.Set(keys[idx], values[idx])
+							if err := db.Set(keys[idx], values[idx]); err != nil {
+								b.Error(err)
+							}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/cronosd/opendb/opendb_rocksdb_stress_bench_test.go` around lines 127 -
134, In ConcurrentMixed (the mixed-workload benchmark) the results of db.Get and
db.Set are currently discarded; change the op branches that call
db.Get(keys[idx]) and db.Set(keys[idx], values[idx]) to capture their returned
errors and report them via the benchmark helper (use b.Error or b.Errorf) just
like ConcurrentRandomReads does, e.g., check err := db.Get(...) / err :=
db.Set(...) and call b.Error with a brief contextual message so both read and
write failures are surfaced during the benchmark.

66-70: Minor: crypto/rand is significantly slower than math/rand for benchmark data generation.

Since this data doesn't require cryptographic randomness, using math/rand would be faster for pre-generating the 50k values. This is a one-time cost outside the timed benchmark loop, so the impact is negligible in practice.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/cronosd/opendb/opendb_rocksdb_stress_bench_test.go` around lines 66 - 70,
Replace the use of crypto/rand in the pre-benchmark data generation loop with
math/rand for speed: in the block that fills keys and values (the for i := 0; i
< numKeys; i++ loop that sets keys[i] and values[i]), swap the crypto/rand
import for math/rand, seed it once (e.g. rand.Seed(time.Now().UnixNano()) or a
fixed seed for reproducible tests) and use rand.Read(values[i]) (or a fast
math/rand-based fill) to populate values; this keeps the same behavior but
avoids the slower crypto RNG during setup.
cmd/cronosd/config/config.go (1)

19-23: Add validation for NodeType to catch invalid configuration values early.

Invalid node_type values (e.g., a typo like "validtor") are silently accepted in the config and fall through the switch statement in OpenDB(), resulting in no tuning being applied. Introducing named constants and a validation function will prevent silent misconfigurations from being discovered only during production operation.

♻️ Proposed addition
+const (
+	NodeTypeDefault   = ""
+	NodeTypeValidator = "validator"
+	NodeTypeRPC       = "rpc"
+	NodeTypeArchive   = "archive"
+)
+
 type RocksDBConfig struct {
 	// Defines the tuning profile for RocksDB based on the node's primary workload.
 	// Valid values: "", "validator", "rpc", "archive"
 	NodeType string `mapstructure:"node_type"`
 }
+
+func (c RocksDBConfig) Validate() error {
+	switch c.NodeType {
+	case NodeTypeDefault, NodeTypeValidator, NodeTypeRPC, NodeTypeArchive:
+		return nil
+	default:
+		return fmt.Errorf("invalid rocksdb node_type %q: must be one of \"\", \"validator\", \"rpc\", \"archive\"", c.NodeType)
+	}
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/cronosd/config/config.go` around lines 19 - 23, The RocksDBConfig should
validate NodeType early: add named constants for allowed node types (e.g.,
NodeTypeValidator, NodeTypeRPC, NodeTypeArchive, NodeTypeDefault/Empty) and
implement a validation function/method (e.g., func (c *RocksDBConfig) Validate()
error or ValidateNodeType()) that normalizes (strings.ToLower/TrimSpace) and
returns an error listing allowed values when NodeType is invalid; call this
validator from OpenDB() before the switch that applies tuning so mis-typed
values (like "validtor") produce a clear error instead of silently skipping
tuning.
cmd/cronosd/opendb/opendb_rocksdb.go (1)

28-55: tuneUpOpts parsing runs unconditionally regardless of backendType.

The entire appOpts / profile parse block (lines 31-48) executes even when backendType != dbm.RocksDBBackend, where it has no effect. Moving it inside the if backendType == dbm.RocksDBBackend branch would make intent clearer.

♻️ Proposed refactor
 func OpenDB(appOpts types.AppOptions, home string, backendType dbm.BackendType) (dbm.DB, error) {
 	dataDir := filepath.Join(home, "data")
-	
-	tuneUpOpts := RocksDBTuneUpOptions{}
-	if appOpts != nil {
-		if v := appOpts.Get("rocksdb.node_type"); v != nil {
-			nodeType := cast.ToString(v)
-			switch strings.ToLower(nodeType) {
-			case "validator":
-				tuneUpOpts.EnableOptimizeForPointLookup = true
-			case "rpc":
-				tuneUpOpts.EnableAutoReadaheadSize = true
-				tuneUpOpts.EnableOptimizeForPointLookup = true
-				tuneUpOpts.EnableHyperClockCache = true
-			case "archive":
-				tuneUpOpts.EnableAsyncIo = true
-				tuneUpOpts.EnableAutoReadaheadSize = true
-				tuneUpOpts.EnableHyperClockCache = true
-			}
-		}
-	}
 
 	if backendType == dbm.RocksDBBackend {
+		tuneUpOpts := RocksDBTuneUpOptions{}
+		if appOpts != nil {
+			if v := appOpts.Get("rocksdb.node_type"); v != nil {
+				nodeType := cast.ToString(v)
+				switch strings.ToLower(nodeType) {
+				case "validator":
+					tuneUpOpts.EnableOptimizeForPointLookup = true
+				case "rpc":
+					tuneUpOpts.EnableAutoReadaheadSize = true
+					tuneUpOpts.EnableOptimizeForPointLookup = true
+					tuneUpOpts.EnableHyperClockCache = true
+				case "archive":
+					tuneUpOpts.EnableAsyncIo = true
+					tuneUpOpts.EnableAutoReadaheadSize = true
+					tuneUpOpts.EnableHyperClockCache = true
+				}
+			}
+		}
 		return openRocksdb(filepath.Join(dataDir, "application.db"), false, tuneUpOpts)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/cronosd/opendb/opendb_rocksdb.go` around lines 28 - 55, The parsing of
appOpts into tuneUpOpts currently runs unconditionally in OpenDB even when
backendType != dbm.RocksDBBackend; move the entire appOpts/profile parsing block
so it executes only inside the backendType == dbm.RocksDBBackend branch just
before calling openRocksdb; specifically, declare tuneUpOpts as needed, then
perform the Get("rocksdb.node_type") handling and switches (setting
EnableOptimizeForPointLookup, EnableAutoReadaheadSize, EnableHyperClockCache,
EnableAsyncIo) only when backendType == dbm.RocksDBBackend, and then pass
tuneUpOpts into openRocksdb — leaving the non-RocksDB return dbm.NewDB(...) path
unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@cmd/cronosd/opendb/opendb_rocksdb.go`:
- Around line 99-150: loadLatestOptions currently allocates a grocksdb.Cache for
LoadLatestOptions and NewRocksdbOptions allocates another one, causing
duplicated 3GB caches; change the design so openRocksdb constructs a single
grocksdb.Cache and passes it into both loadLatestOptions and NewRocksdbOptions
(add a cache parameter to both functions), remove the cache allocation inside
loadLatestOptions, and in NewRocksdbOptions accept a nil cache fallback so
external callers can pass their own cache or nil; ensure all call sites
(including cmd/cronosd/cmd/versiondb.go and dbmigrate/migrate_rocksdb.go) are
updated to pass the cache or rely on the nil fallback, and keep behavior same
when opts is nil.

---

Nitpick comments:
In `@cmd/cronosd/config/config.go`:
- Around line 19-23: The RocksDBConfig should validate NodeType early: add named
constants for allowed node types (e.g., NodeTypeValidator, NodeTypeRPC,
NodeTypeArchive, NodeTypeDefault/Empty) and implement a validation
function/method (e.g., func (c *RocksDBConfig) Validate() error or
ValidateNodeType()) that normalizes (strings.ToLower/TrimSpace) and returns an
error listing allowed values when NodeType is invalid; call this validator from
OpenDB() before the switch that applies tuning so mis-typed values (like
"validtor") produce a clear error instead of silently skipping tuning.

In `@cmd/cronosd/opendb/opendb_rocksdb_stress_bench_test.go`:
- Around line 127-134: In ConcurrentMixed (the mixed-workload benchmark) the
results of db.Get and db.Set are currently discarded; change the op branches
that call db.Get(keys[idx]) and db.Set(keys[idx], values[idx]) to capture their
returned errors and report them via the benchmark helper (use b.Error or
b.Errorf) just like ConcurrentRandomReads does, e.g., check err := db.Get(...) /
err := db.Set(...) and call b.Error with a brief contextual message so both read
and write failures are surfaced during the benchmark.
- Around line 66-70: Replace the use of crypto/rand in the pre-benchmark data
generation loop with math/rand for speed: in the block that fills keys and
values (the for i := 0; i < numKeys; i++ loop that sets keys[i] and values[i]),
swap the crypto/rand import for math/rand, seed it once (e.g.
rand.Seed(time.Now().UnixNano()) or a fixed seed for reproducible tests) and use
rand.Read(values[i]) (or a fast math/rand-based fill) to populate values; this
keeps the same behavior but avoids the slower crypto RNG during setup.

In `@cmd/cronosd/opendb/opendb_rocksdb.go`:
- Around line 28-55: The parsing of appOpts into tuneUpOpts currently runs
unconditionally in OpenDB even when backendType != dbm.RocksDBBackend; move the
entire appOpts/profile parsing block so it executes only inside the backendType
== dbm.RocksDBBackend branch just before calling openRocksdb; specifically,
declare tuneUpOpts as needed, then perform the Get("rocksdb.node_type") handling
and switches (setting EnableOptimizeForPointLookup, EnableAutoReadaheadSize,
EnableHyperClockCache, EnableAsyncIo) only when backendType ==
dbm.RocksDBBackend, and then pass tuneUpOpts into openRocksdb — leaving the
non-RocksDB return dbm.NewDB(...) path unchanged.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 21f1c4d and 742f081.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (10)
  • cmd/cronosd/cmd/root.go
  • cmd/cronosd/cmd/versiondb.go
  • cmd/cronosd/config/config.go
  • cmd/cronosd/config/toml.go
  • cmd/cronosd/dbmigrate/migrate_rocksdb.go
  • cmd/cronosd/opendb/opendb_rocksdb.go
  • cmd/cronosd/opendb/opendb_rocksdb_stress_bench_test.go
  • go.mod
  • gomod2nix.toml
  • nix/rocksdb.nix

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@CHANGELOG.md`:
- Line 6: Update the changelog entry that currently reads "*
[`#1977`](https://github.com/crypto-org-chain/cronos/pull/1977) feat: update
rocksdb to v10.9.1 and add tuning profiles." by removing the "feat:"
conventional-commit prefix and converting it to sentence-case to match other
entries (e.g., "* [`#1977`](https://github.com/crypto-org-chain/cronos/pull/1977)
Update rocksdb to v10.9.1 and add tuning profiles."). Ensure the change is
applied to the exact line containing the PR link text.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 742f081 and 525f58a.

📒 Files selected for processing (1)
  • CHANGELOG.md

@JayT106 JayT106 changed the title from "feat: update rocksdb to v10.9.1 and add tuning profiles." to "feat: Upgrade rocksdb to v10.9.1 and add tuning profiles." Feb 25, 2026
@codecov

codecov bot commented Feb 25, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 25.67%. Comparing base (a3c2f70) to head (cd46b28).
⚠️ Report is 146 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1977      +/-   ##
==========================================
+ Coverage   16.87%   25.67%   +8.80%     
==========================================
  Files          72       78       +6     
  Lines        6163     5117    -1046     
==========================================
+ Hits         1040     1314     +274     
+ Misses       5000     3641    -1359     
- Partials      123      162      +39     

see 72 files with indirect coverage changes


@JayT106 JayT106 marked this pull request as draft February 25, 2026 22:42
@JayT106 JayT106 changed the title from "feat: Upgrade rocksdb to v10.9.1 and add tuning profiles." to "feat: Upgrade rocksdb to v10.10.1 and add tuning profiles." Feb 26, 2026
@JayT106 JayT106 marked this pull request as ready for review February 26, 2026 03:41
Contributor

@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cmd/cronosd/opendb/opendb_rocksdb.go (1)

102-116: ⚠️ Potential issue | 🟡 Minor

Cache memory leak in error paths.

When LoadLatestOptions fails (lines 112-115), the cache allocated on lines 104-108 is not released. Since NewHyperClockCache and NewLRUCache allocate C/C++ memory that Go's garbage collector cannot manage, the cache must be explicitly destroyed with cache.Destroy() before returning.

Fix: release cache on error returns
 func loadLatestOptions(dir string, enableHyperClockCache bool) (*grocksdb.Options, error) {
 	var cache *grocksdb.Cache
 	if enableHyperClockCache {
 		cache = grocksdb.NewHyperClockCache(BlockCacheSize, 0)
 	} else {
 		cache = grocksdb.NewLRUCache(BlockCacheSize)
 	}
 	opts, err := grocksdb.LoadLatestOptions(dir, grocksdb.NewDefaultEnv(), true, cache)
 	if err != nil {
 		// not found is not an error
 		if strings.HasPrefix(err.Error(), "NotFound: ") {
+			cache.Destroy()
 			return nil, nil
 		}
+		cache.Destroy()
 		return nil, err
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/cronosd/opendb/opendb_rocksdb.go` around lines 102 - 116, In
loadLatestOptions, the cache created by NewHyperClockCache/NewLRUCache is not
freed when grocksdb.LoadLatestOptions returns an error; ensure you call
cache.Destroy() before any early return on error (including the NotFound branch)
so the C/C++ memory is released — e.g., after err != nil, call cache.Destroy()
then handle the NotFound case or return the error; keep the cache variable and
NewHyperClockCache/NewLRUCache usage as-is but add explicit cache.Destroy() in
each error path.
🧹 Nitpick comments (1)
cmd/cronosd/opendb/opendb_rocksdb_stress_bench_test.go (1)

100-116: Consider using b.Fatal for benchmark-terminating errors.

Using b.Error on lines 109 and 112 allows the benchmark to continue after encountering errors, which could produce misleading results. For benchmark correctness, failing fast is typically preferred.

♻️ Suggested improvement
 				for pb.Next() {
 					i := atomic.AddUint64(&counter, 1)
 					idx := (i * 17) % uint64(numKeys)
 					val, err := db.Get(keys[idx])
 					if err != nil {
-						b.Error(err)
+						b.Fatal(err)
 					}
 					if val == nil {
-						b.Error("value is nil")
+						b.Fatal("value is nil")
 					}
 				}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/cronosd/opendb/opendb_rocksdb_stress_bench_test.go` around lines 100 -
116, In the ConcurrentRandomReads benchmark inside Test file
opendb_rocksdb_stress_bench_test.go replace non-fatal error calls with fatal
ones so the benchmark stops on first failure: change the b.Error calls in the
anonymous b.Run("ConcurrentRandomReads", ...) closure — specifically the error
handling after db.Get (currently calling b.Error(err)) and the nil-value check
(currently calling b.Error("value is nil")) — to b.Fatal with appropriate
messages so the benchmark fails fast when db.Get returns an error or a nil
value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@cmd/cronosd/opendb/opendb_rocksdb.go`:
- Around line 102-116: In loadLatestOptions, the cache created by
NewHyperClockCache/NewLRUCache is not freed when grocksdb.LoadLatestOptions
returns an error; ensure you call cache.Destroy() before any early return on
error (including the NotFound branch) so the C/C++ memory is released — e.g.,
after err != nil, call cache.Destroy() then handle the NotFound case or return
the error; keep the cache variable and NewHyperClockCache/NewLRUCache usage
as-is but add explicit cache.Destroy() in each error path.

---

Nitpick comments:
In `@cmd/cronosd/opendb/opendb_rocksdb_stress_bench_test.go`:
- Around line 100-116: In the ConcurrentRandomReads benchmark inside Test file
opendb_rocksdb_stress_bench_test.go replace non-fatal error calls with fatal
ones so the benchmark stops on first failure: change the b.Error calls in the
anonymous b.Run("ConcurrentRandomReads", ...) closure — specifically the error
handling after db.Get (currently calling b.Error(err)) and the nil-value check
(currently calling b.Error("value is nil")) — to b.Fatal with appropriate
messages so the benchmark fails fast when db.Get returns an error or a nil
value.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 525f58a and 981668a.

📒 Files selected for processing (6)
  • CHANGELOG.md
  • cmd/cronosd/config/config.go
  • cmd/cronosd/opendb/opendb_rocksdb.go
  • cmd/cronosd/opendb/opendb_rocksdb_stress_bench_test.go
  • default.nix
  • nix/rocksdb.nix
🚧 Files skipped from review as they are similar to previous changes (1)
  • CHANGELOG.md

JayT106 and others added 2 commits February 25, 2026 23:14
@randy-cro
Contributor

randy-cro commented Feb 26, 2026

  • should be v10.9.1?
  • also, just merged into master for cronos-store, probably can use commit v0.0.0-20260226040813-703ea3d46c70 for cronos-store?
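For context, pinning a Go dependency to a specific commit uses a pseudo-version like the one suggested above in go.mod. A hedged sketch follows; the module path is illustrative and may not match the real cronos-store import path:

```
require github.com/crypto-org-chain/cronos-store v0.0.0-20260226040813-703ea3d46c70
```

Running `go get <module>@<commit-hash>` generates such a pseudo-version automatically, and `go mod tidy` afterwards refreshes go.sum.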

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
cmd/cronosd/opendb/opendb_rocksdb.go (1)

64-64: Read-only open path currently skips profile tuning.

Line 64 always passes RocksDBTuneUpOptions{}, so read-only workflows won’t inherit validator/rpc/archive tuning. Consider plumbing types.AppOptions into OpenReadOnlyDB and reusing the same profile mapping logic.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/cronosd/opendb/opendb_rocksdb.go` at line 64, OpenReadOnlyDB currently
calls openRocksdb(..., true, RocksDBTuneUpOptions{}) which skips profile-based
tuning; change OpenReadOnlyDB to accept a types.AppOptions parameter (or
otherwise receive the same app options used for the writable path), reuse the
existing profile → RocksDBTuneUpOptions mapping logic used by the writable
OpenDB path to produce the proper RocksDBTuneUpOptions, and pass that options
struct into openRocksdb instead of the empty RocksDBTuneUpOptions{} so read-only
opens get validator/rpc/archive tuning applied.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/cronosd/opendb/opendb_rocksdb.go`:
- Around line 143-145: The OptimizeForPointLookup call on opts is being
clobbered by the later opts.SetBlockBasedTableFactory(bbto); instead, when
tuneUpOpts.EnableOptimizeForPointLookup is true, set the equivalent point-lookup
knobs directly on the BlockBasedTableOptions instance (bbto) before calling
opts.SetBlockBasedTableFactory(bbto) — e.g., enable a hash-based block index and
tune block cache/partition settings on bbto so the point-lookup optimizations
take effect rather than relying on opts.OptimizeForPointLookup which is
overwritten.

---

Nitpick comments:
In `@cmd/cronosd/opendb/opendb_rocksdb.go`:
- Line 64: OpenReadOnlyDB currently calls openRocksdb(..., true,
RocksDBTuneUpOptions{}) which skips profile-based tuning; change OpenReadOnlyDB
to accept a types.AppOptions parameter (or otherwise receive the same app
options used for the writable path), reuse the existing profile →
RocksDBTuneUpOptions mapping logic used by the writable OpenDB path to produce
the proper RocksDBTuneUpOptions, and pass that options struct into openRocksdb
instead of the empty RocksDBTuneUpOptions{} so read-only opens get
validator/rpc/archive tuning applied.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 981668a and 5a7e403.

📒 Files selected for processing (2)
  • CHANGELOG.md
  • cmd/cronosd/opendb/opendb_rocksdb.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • CHANGELOG.md

@JayT106 JayT106 changed the title feat: Upgrade rocksdb to v10.10.1 and add tuning profiles. feat: Upgrade rocksdb to v10.9.1 and add tuning profiles. Feb 26, 2026
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cmd/cronosd/opendb/opendb_rocksdb.go (1)

111-130: ⚠️ Potential issue | 🟠 Major

LoadLatestOptions lifecycle cleanup is missing (C-memory leak risk).

Line 112 allocates LatestOptions and Env, but neither is ever released. In grocksdb, both LatestOptions and Env have explicit Destroy() methods that must be called to clean up underlying C resources. Additionally, the returned Options objects reference memory owned by LatestOptions, so they must be cloned before LatestOptions is destroyed to avoid dangling pointers.

♻️ Proposed fix
 func loadLatestOptions(dir string, cache *grocksdb.Cache) (*grocksdb.Options, error) {
-	opts, err := grocksdb.LoadLatestOptions(dir, grocksdb.NewDefaultEnv(), true, cache)
+	env := grocksdb.NewDefaultEnv()
+	defer env.Destroy()
+
+	latest, err := grocksdb.LoadLatestOptions(dir, env, true, cache)
 	if err != nil {
 		// not found is not an error
 		if strings.HasPrefix(err.Error(), "NotFound: ") {
 			return nil, nil
 		}
 		return nil, err
 	}
+	defer latest.Destroy()
 
-	cfNames := opts.ColumnFamilyNames()
-	cfOpts := opts.ColumnFamilyOpts()
+	cfNames := latest.ColumnFamilyNames()
+	cfOpts := latest.ColumnFamilyOpts()
 
 	for i := 0; i < len(cfNames); i++ {
 		if cfNames[i] == "default" {
-			return &cfOpts[i], nil
+			return cfOpts[i].Clone(), nil
 		}
 	}
 
-	return opts.Options(), nil
+	return latest.Options().Clone(), nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/cronosd/opendb/opendb_rocksdb.go` around lines 111 - 130, The
loadLatestOptions function currently leaks C resources because LatestOptions
(returned by grocksdb.LoadLatestOptions) and the Env created by
grocksdb.NewDefaultEnv() are never destroyed; modify loadLatestOptions to call
Destroy() on the LatestOptions and Env in all code paths (use defer after
successful LoadLatestOptions) and return a cloned go-owned Options object
instead of a pointer into LatestOptions-managed memory (e.g. clone cfOpts[i] via
its Clone()/Copy() method or clone opts.Options() before destroying
LatestOptions); also ensure the "NotFound" early return destroys the Env (and
any created objects) before returning nil.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/cronosd/opendb/opendb_rocksdb_stress_bench_test.go`:
- Line 69: The rand.Read call in opendb_rocksdb_stress_bench_test.go currently
drops its error which can silently corrupt the benchmark setup; capture the
returned error from rand.Read(values[i]) and fail the test/benchmark immediately
(using the test's failure helper, e.g. b.Fatalf or t.Fatalf depending on whether
this is a Benchmark or Test function) with a descriptive message including the
error so setup cannot proceed on failure.
- Around line 127-149: The mixed-workload loop currently ignores errors from
db.Get, db.Set and only partially checks db.Iterator; update the block to
capture and handle all returned errors: check the error return from db.Get and
db.Set and fail the test or record the failure (use the test benchmark handle in
scope, e.g., b.Fatalf/b.Error or t.Fatalf/t.Error) when err != nil; check the
error returned by db.Iterator and fail/record on non-nil error; ensure the
iterator is closed in a defer or after use and handle any iterator method
errors/signals appropriately (references: db.Get, db.Set, db.Iterator, itr,
itr.Close, itr.Next, itr.Valid, itr.Key, itr.Value) so benchmark results don’t
hide operation failures.
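Both patterns asked for above — failing setup as soon as random generation errors, and checking every `Get`/`Set`/iteration error in the mixed workload — can be sketched with a tiny in-memory stand-in (the `memDB` type is a stub, not the real cometbft-db API, and `Keys` stands in for `db.Iterator`):

```go
package main

import (
	"crypto/rand"
	"fmt"
	"sort"
)

// memDB is a tiny in-memory stand-in for the tm-db style interface used in
// the benchmark; its methods return errors so the handling pattern is visible.
type memDB struct{ m map[string][]byte }

func (d *memDB) Set(k, v []byte) error        { d.m[string(k)] = v; return nil }
func (d *memDB) Get(k []byte) ([]byte, error) { return d.m[string(k)], nil }

// Keys stands in for db.Iterator(nil, nil): it can fail and must be checked.
func (d *memDB) Keys() ([]string, error) {
	ks := make([]string, 0, len(d.m))
	for k := range d.m {
		ks = append(ks, k)
	}
	sort.Strings(ks)
	return ks, nil
}

// setupValues fills random value buffers, surfacing rand.Read errors instead
// of dropping them, so a benchmark can b.Fatalf before running on bad data.
func setupValues(n, size int) ([][]byte, error) {
	values := make([][]byte, n)
	for i := range values {
		values[i] = make([]byte, size)
		if _, err := rand.Read(values[i]); err != nil {
			return nil, fmt.Errorf("rand.Read failed for value %d: %w", i, err)
		}
	}
	return values, nil
}

func main() {
	values, err := setupValues(3, 32)
	if err != nil {
		panic(err) // in the benchmark: b.Fatalf("setup: %v", err)
	}

	db := &memDB{m: map[string][]byte{}}
	for i, v := range values {
		k := []byte(fmt.Sprintf("key-%d", i))
		if err := db.Set(k, v); err != nil {
			panic(err) // b.Fatalf("set: %v", err)
		}
		if _, err := db.Get(k); err != nil {
			panic(err) // b.Fatalf("get: %v", err)
		}
	}

	keys, err := db.Keys()
	if err != nil {
		panic(err) // b.Fatalf("iterator: %v", err)
	}
	fmt.Println("scanned", len(keys), "keys")
}
```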

In `@nix/rocksdb.nix`:
- Line 26: Update the pinned RocksDB version string from "10.9.1" to "10.10.1"
(replace version = "10.9.1"; with version = "10.10.1";) and also replace the
existing sha256 value with the correct checksum for the v10.10.1 release so the
fetch succeeds; ensure the sha256 field that accompanies the
fetch/fetchFromGitHub entry is updated to the new hash for v10.10.1.

---

Outside diff comments:
In `@cmd/cronosd/opendb/opendb_rocksdb.go`:
- Around line 111-130: The loadLatestOptions function currently leaks C
resources because LatestOptions (returned by grocksdb.LoadLatestOptions) and the
Env created by grocksdb.NewDefaultEnv() are never destroyed; modify
loadLatestOptions to call Destroy() on the LatestOptions and Env in all code
paths (use defer after successful LoadLatestOptions) and return a cloned
go-owned Options object instead of a pointer into LatestOptions-managed memory
(e.g. clone cfOpts[i] via its Clone()/Copy() method or clone opts.Options()
before destroying LatestOptions); also ensure the "NotFound" early return
destroys the Env (and any created objects) before returning nil.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5a7e403 and 99b78d4.

📒 Files selected for processing (3)
  • cmd/cronosd/opendb/opendb_rocksdb.go
  • cmd/cronosd/opendb/opendb_rocksdb_stress_bench_test.go
  • nix/rocksdb.nix

@JayT106
Contributor Author

JayT106 commented Feb 26, 2026

  • should be v10.9.1?

Yes, just solved the build issue and rolled back to version v10.9.1.

  • also, just merged into master for cronos-store, probably can use commit v0.0.0-20260226040813-703ea3d46c70 for cronos-store?

Updated !

@songgaoye
Contributor

good job, lgtm.

@songgaoye songgaoye self-requested a review March 4, 2026 02:41
@thomas-nguy thomas-nguy added this pull request to the merge queue Mar 10, 2026
Merged via the queue into crypto-org-chain:main with commit 71fc739 Mar 10, 2026
46 checks passed