feat: Upgrade rocksdb to v10.9.1 and add tuning profiles. #1977

thomas-nguy merged 21 commits into crypto-org-chain:main from
Conversation
📝 Walkthrough

Adds RocksDB tuning/profile support (`rocksdb.node_type`), propagates tune options through RocksDB initialization APIs, updates call sites and dependencies, adds a concurrent RocksDB stress benchmark, and bumps grocksdb/RocksDB versions with packaging adjustments.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant App as Application
    participant Config as Config Parser
    participant OpenDB as opendb.OpenDB
    participant Options as NewRocksdbOptions
    participant Rocks as RocksDB
    App->>Config: load config (includes rocksdb.node_type)
    Config-->>OpenDB: provide AppOptions / RocksDBConfig
    OpenDB->>OpenDB: build RocksDBTuneUpOptions from node_type
    OpenDB->>Options: NewRocksdbOptions(..., tuneUpOpts)
    Options->>Rocks: create grocksdb.Options with tune flags
    OpenDB->>Rocks: openRocksdb(dir, readonly, tuneUpOpts)
    Rocks-->>OpenDB: DB handle
    OpenDB-->>App: return initialized DB
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cmd/cronosd/opendb/opendb_rocksdb.go (1)
99-150: ⚠️ Potential issue | 🟠 Major — Two independent 3 GB caches are allocated per DB open.

`loadLatestOptions` creates a `grocksdb.Cache` (lines 100–105) solely for the `LoadLatestOptions` call. `NewRocksdbOptions` then creates a second, separate 3 GB cache (lines 146–150) for `bbto.SetBlockCache`. The first cache is never shared with the block table factory and is effectively discarded after option loading, meaning every `openRocksdb` call on an existing DB allocates ~6 GB of cache objects — and unlike LRU, `HyperClockCache` pre-allocates its hash table at construction time, so both allocations are immediately resident.

The cache should be constructed once in `openRocksdb` and passed into both helpers.

♻️ Proposed fix
```diff
-func openRocksdb(dir string, readonly bool, tuneUpOpts RocksDBTuneUpOptions) (dbm.DB, error) {
-	opts, err := loadLatestOptions(dir, tuneUpOpts.EnableHyperClockCache)
+func newCache(enableHyperClockCache bool) *grocksdb.Cache {
+	if enableHyperClockCache {
+		return grocksdb.NewHyperClockCache(BlockCacheSize, 0)
+	}
+	return grocksdb.NewLRUCache(BlockCacheSize)
+}
+
+func openRocksdb(dir string, readonly bool, tuneUpOpts RocksDBTuneUpOptions) (dbm.DB, error) {
+	cache := newCache(tuneUpOpts.EnableHyperClockCache)
+	opts, err := loadLatestOptions(dir, cache)
 	if err != nil {
 		return nil, err
 	}
-	opts = NewRocksdbOptions(opts, false, tuneUpOpts)
+	opts = NewRocksdbOptions(opts, false, tuneUpOpts, cache)
 	...
 }

-func loadLatestOptions(dir string, enableHyperClockCache bool) (*grocksdb.Options, error) {
-	var cache *grocksdb.Cache
-	if enableHyperClockCache {
-		cache = grocksdb.NewHyperClockCache(BlockCacheSize, 0)
-	} else {
-		cache = grocksdb.NewLRUCache(BlockCacheSize)
-	}
+func loadLatestOptions(dir string, cache *grocksdb.Cache) (*grocksdb.Options, error) {
 	opts, err := grocksdb.LoadLatestOptions(dir, grocksdb.NewDefaultEnv(), true, cache)
 	...
 }

-func NewRocksdbOptions(opts *grocksdb.Options, sstFileWriter bool, tuneUpOpts RocksDBTuneUpOptions) *grocksdb.Options {
+func NewRocksdbOptions(opts *grocksdb.Options, sstFileWriter bool, tuneUpOpts RocksDBTuneUpOptions, cache *grocksdb.Cache) *grocksdb.Options {
 	...
-	if tuneUpOpts.EnableHyperClockCache {
-		bbto.SetBlockCache(grocksdb.NewHyperClockCache(BlockCacheSize, 0))
-	} else {
-		bbto.SetBlockCache(grocksdb.NewLRUCache(BlockCacheSize))
-	}
+	bbto.SetBlockCache(cache)
 	...
 }
```

Note: `NewRocksdbOptions` is also called externally (e.g., `cmd/cronosd/cmd/versiondb.go` and `cmd/cronosd/dbmigrate/migrate_rocksdb.go`); those callers can construct and pass their own cache, or a `nil` cache with a fallback inside `NewRocksdbOptions`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/cronosd/opendb/opendb_rocksdb.go` around lines 99 - 150, loadLatestOptions currently allocates a grocksdb.Cache for LoadLatestOptions and NewRocksdbOptions allocates another one, causing duplicated 3GB caches; change the design so openRocksdb constructs a single grocksdb.Cache and passes it into both loadLatestOptions and NewRocksdbOptions (add a cache parameter to both functions), remove the cache allocation inside loadLatestOptions, and in NewRocksdbOptions accept a nil cache fallback so external callers can pass their own cache or nil; ensure all call sites (including cmd/cronosd/cmd/versiondb.go and dbmigrate/migrate_rocksdb.go) are updated to pass the cache or rely on the nil fallback, and keep behavior same when opts is nil.
🧹 Nitpick comments (4)
cmd/cronosd/opendb/opendb_rocksdb_stress_bench_test.go (2)
127-134: Mixed workload silently discards errors, unlike the reads-only benchmark.

In `ConcurrentRandomReads`, errors from `db.Get` are reported via `b.Error`. In `ConcurrentMixed`, both `db.Get` (line 130) and `db.Set` (line 134) errors are silently ignored. For a benchmark meant to validate correctness under load, consider reporting errors consistently.

♻️ Suggested consistency fix
```diff
 if op < 8 { // 80% Read
 	idx := (i * 17) % uint64(numKeys)
-	_, _ = db.Get(keys[idx])
+	val, err := db.Get(keys[idx])
+	if err != nil {
+		b.Error(err)
+	}
+	if val == nil {
+		b.Error("value is nil for read in mixed workload")
+	}
 } else if op == 8 { // 10% Write
 	idx := (i * 31) % uint64(numKeys)
-	_ = db.Set(keys[idx], values[idx])
+	if err := db.Set(keys[idx], values[idx]); err != nil {
+		b.Error(err)
+	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/cronosd/opendb/opendb_rocksdb_stress_bench_test.go` around lines 127 - 134, In ConcurrentMixed (the mixed-workload benchmark) the results of db.Get and db.Set are currently discarded; change the op branches that call db.Get(keys[idx]) and db.Set(keys[idx], values[idx]) to capture their returned errors and report them via the benchmark helper (use b.Error or b.Errorf) just like ConcurrentRandomReads does, e.g., check err := db.Get(...) / err := db.Set(...) and call b.Error with a brief contextual message so both read and write failures are surfaced during the benchmark.
66-70: Minor: `crypto/rand` is significantly slower than `math/rand` for benchmark data generation.

Since this data doesn't require cryptographic randomness, using `math/rand` would be faster for pre-generating the 50k values. This is a one-time cost outside the timed benchmark loop, so the impact is negligible in practice.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/cronosd/opendb/opendb_rocksdb_stress_bench_test.go` around lines 66 - 70, Replace the use of crypto/rand in the pre-benchmark data generation loop with math/rand for speed: in the block that fills keys and values (the for i := 0; i < numKeys; i++ loop that sets keys[i] and values[i]), swap the crypto/rand import for math/rand, seed it once (e.g. rand.Seed(time.Now().UnixNano()) or a fixed seed for reproducible tests) and use rand.Read(values[i]) (or a fast math/rand-based fill) to populate values; this keeps the same behavior but avoids the slower crypto RNG during setup.

cmd/cronosd/config/config.go (1)
19-23: Add validation for `NodeType` to catch invalid configuration values early.

Invalid `node_type` values (e.g., a typo like `"validtor"`) are silently accepted in the config and fall through the switch statement in `OpenDB()`, resulting in no tuning being applied. Introducing named constants and a validation function will prevent silent misconfigurations from being discovered only during production operation.

♻️ Proposed addition
```diff
+const (
+	NodeTypeDefault   = ""
+	NodeTypeValidator = "validator"
+	NodeTypeRPC       = "rpc"
+	NodeTypeArchive   = "archive"
+)
+
 type RocksDBConfig struct {
 	// Defines the tuning profile for RocksDB based on the node's primary workload.
 	// Valid values: "", "validator", "rpc", "archive"
 	NodeType string `mapstructure:"node_type"`
 }
+
+func (c RocksDBConfig) Validate() error {
+	switch c.NodeType {
+	case NodeTypeDefault, NodeTypeValidator, NodeTypeRPC, NodeTypeArchive:
+		return nil
+	default:
+		return fmt.Errorf("invalid rocksdb node_type %q: must be one of \"\", \"validator\", \"rpc\", \"archive\"", c.NodeType)
+	}
+}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/cronosd/config/config.go` around lines 19 - 23, The RocksDBConfig should validate NodeType early: add named constants for allowed node types (e.g., NodeTypeValidator, NodeTypeRPC, NodeTypeArchive, NodeTypeDefault/Empty) and implement a validation function/method (e.g., func (c *RocksDBConfig) Validate() error or ValidateNodeType()) that normalizes (strings.ToLower/TrimSpace) and returns an error listing allowed values when NodeType is invalid; call this validator from OpenDB() before the switch that applies tuning so mis-typed values (like "validtor") produce a clear error instead of silently skipping tuning.

cmd/cronosd/opendb/opendb_rocksdb.go (1)
28-55: `tuneUpOpts` parsing runs unconditionally regardless of `backendType`.

The entire `appOpts` / profile parse block (lines 31-48) executes even when `backendType != dbm.RocksDBBackend`, where it has no effect. Moving it inside the `if backendType == dbm.RocksDBBackend` branch would make intent clearer.

♻️ Proposed refactor
```diff
 func OpenDB(appOpts types.AppOptions, home string, backendType dbm.BackendType) (dbm.DB, error) {
 	dataDir := filepath.Join(home, "data")
-
-	tuneUpOpts := RocksDBTuneUpOptions{}
-	if appOpts != nil {
-		if v := appOpts.Get("rocksdb.node_type"); v != nil {
-			nodeType := cast.ToString(v)
-			switch strings.ToLower(nodeType) {
-			case "validator":
-				tuneUpOpts.EnableOptimizeForPointLookup = true
-			case "rpc":
-				tuneUpOpts.EnableAutoReadaheadSize = true
-				tuneUpOpts.EnableOptimizeForPointLookup = true
-				tuneUpOpts.EnableHyperClockCache = true
-			case "archive":
-				tuneUpOpts.EnableAsyncIo = true
-				tuneUpOpts.EnableAutoReadaheadSize = true
-				tuneUpOpts.EnableHyperClockCache = true
-			}
-		}
-	}
 	if backendType == dbm.RocksDBBackend {
+		tuneUpOpts := RocksDBTuneUpOptions{}
+		if appOpts != nil {
+			if v := appOpts.Get("rocksdb.node_type"); v != nil {
+				nodeType := cast.ToString(v)
+				switch strings.ToLower(nodeType) {
+				case "validator":
+					tuneUpOpts.EnableOptimizeForPointLookup = true
+				case "rpc":
+					tuneUpOpts.EnableAutoReadaheadSize = true
+					tuneUpOpts.EnableOptimizeForPointLookup = true
+					tuneUpOpts.EnableHyperClockCache = true
+				case "archive":
+					tuneUpOpts.EnableAsyncIo = true
+					tuneUpOpts.EnableAutoReadaheadSize = true
+					tuneUpOpts.EnableHyperClockCache = true
+				}
+			}
+		}
 		return openRocksdb(filepath.Join(dataDir, "application.db"), false, tuneUpOpts)
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/cronosd/opendb/opendb_rocksdb.go` around lines 28 - 55, The parsing of appOpts into tuneUpOpts currently runs unconditionally in OpenDB even when backendType != dbm.RocksDBBackend; move the entire appOpts/profile parsing block so it executes only inside the backendType == dbm.RocksDBBackend branch just before calling openRocksdb; specifically, declare tuneUpOpts as needed, then perform the Get("rocksdb.node_type") handling and switches (setting EnableOptimizeForPointLookup, EnableAutoReadaheadSize, EnableHyperClockCache, EnableAsyncIo) only when backendType == dbm.RocksDBBackend, and then pass tuneUpOpts into openRocksdb — leaving the non-RocksDB return dbm.NewDB(...) path unchanged.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)

- `go.sum` is excluded by `!**/*.sum`
📒 Files selected for processing (10)

- cmd/cronosd/cmd/root.go
- cmd/cronosd/cmd/versiondb.go
- cmd/cronosd/config/config.go
- cmd/cronosd/config/toml.go
- cmd/cronosd/dbmigrate/migrate_rocksdb.go
- cmd/cronosd/opendb/opendb_rocksdb.go
- cmd/cronosd/opendb/opendb_rocksdb_stress_bench_test.go
- go.mod
- gomod2nix.toml
- nix/rocksdb.nix
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CHANGELOG.md`:
- Line 6: Update the changelog entry that currently reads "*
[`#1977`](https://github.com/crypto-org-chain/cronos/pull/1977) feat: update
rocksdb to v10.9.1 and add tuning profiles." by removing the "feat:"
conventional-commit prefix and converting it to sentence-case to match other
entries (e.g., "* [`#1977`](https://github.com/crypto-org-chain/cronos/pull/1977)
Update rocksdb to v10.9.1 and add tuning profiles."). Ensure the change is
applied to the exact line containing the PR link text.
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

```
@@           Coverage Diff            @@
##            main    #1977     +/-  ##
==========================================
+ Coverage  16.87%   25.67%   +8.80%
==========================================
  Files         72       78       +6
  Lines       6163     5117    -1046
==========================================
+ Hits        1040     1314     +274
+ Misses      5000     3641    -1359
- Partials     123      162      +39
```
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cmd/cronosd/opendb/opendb_rocksdb.go (1)
102-116: ⚠️ Potential issue | 🟡 Minor — Cache memory leak in error paths.

When `LoadLatestOptions` fails (lines 112-115), the cache allocated on lines 104-108 is not released. Since `NewHyperClockCache` and `NewLRUCache` allocate C/C++ memory that Go's garbage collector cannot manage, the cache must be explicitly destroyed with `cache.Destroy()` before returning.

Fix: release cache on error returns
```diff
 func loadLatestOptions(dir string, enableHyperClockCache bool) (*grocksdb.Options, error) {
 	var cache *grocksdb.Cache
 	if enableHyperClockCache {
 		cache = grocksdb.NewHyperClockCache(BlockCacheSize, 0)
 	} else {
 		cache = grocksdb.NewLRUCache(BlockCacheSize)
 	}
 	opts, err := grocksdb.LoadLatestOptions(dir, grocksdb.NewDefaultEnv(), true, cache)
 	if err != nil {
 		// not found is not an error
 		if strings.HasPrefix(err.Error(), "NotFound: ") {
+			cache.Destroy()
 			return nil, nil
 		}
+		cache.Destroy()
 		return nil, err
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/cronosd/opendb/opendb_rocksdb.go` around lines 102 - 116, In loadLatestOptions, the cache created by NewHyperClockCache/NewLRUCache is not freed when grocksdb.LoadLatestOptions returns an error; ensure you call cache.Destroy() before any early return on error (including the NotFound branch) so the C/C++ memory is released — e.g., after err != nil, call cache.Destroy() then handle the NotFound case or return the error; keep the cache variable and NewHyperClockCache/NewLRUCache usage as-is but add explicit cache.Destroy() in each error path.
🧹 Nitpick comments (1)
cmd/cronosd/opendb/opendb_rocksdb_stress_bench_test.go (1)
100-116: Consider using `b.Fatal` for benchmark-terminating errors.

Using `b.Error` on lines 109 and 112 allows the benchmark to continue after encountering errors, which could produce misleading results. For benchmark correctness, failing fast is typically preferred.

♻️ Suggested improvement
```diff
 for pb.Next() {
 	i := atomic.AddUint64(&counter, 1)
 	idx := (i * 17) % uint64(numKeys)
 	val, err := db.Get(keys[idx])
 	if err != nil {
-		b.Error(err)
+		b.Fatal(err)
 	}
 	if val == nil {
-		b.Error("value is nil")
+		b.Fatal("value is nil")
 	}
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/cronosd/opendb/opendb_rocksdb_stress_bench_test.go` around lines 100 - 116, In the ConcurrentRandomReads benchmark inside Test file opendb_rocksdb_stress_bench_test.go replace non-fatal error calls with fatal ones so the benchmark stops on first failure: change the b.Error calls in the anonymous b.Run("ConcurrentRandomReads", ...) closure — specifically the error handling after db.Get (currently calling b.Error(err)) and the nil-value check (currently calling b.Error("value is nil")) — to b.Fatal with appropriate messages so the benchmark fails fast when db.Get returns an error or a nil value.
📒 Files selected for processing (6)

- CHANGELOG.md
- cmd/cronosd/config/config.go
- cmd/cronosd/opendb/opendb_rocksdb.go
- cmd/cronosd/opendb/opendb_rocksdb_stress_bench_test.go
- default.nix
- nix/rocksdb.nix
🚧 Files skipped from review as they are similar to previous changes (1)
- CHANGELOG.md
Signed-off-by: JayT106 <JayT106@users.noreply.github.com>
Actionable comments posted: 1
🧹 Nitpick comments (1)
cmd/cronosd/opendb/opendb_rocksdb.go (1)
64-64: Read-only open path currently skips profile tuning.

Line 64 always passes `RocksDBTuneUpOptions{}`, so read-only workflows won't inherit `validator`/`rpc`/`archive` tuning. Consider plumbing `types.AppOptions` into `OpenReadOnlyDB` and reusing the same profile mapping logic.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/cronosd/opendb/opendb_rocksdb.go` at line 64, OpenReadOnlyDB currently calls openRocksdb(..., true, RocksDBTuneUpOptions{}) which skips profile-based tuning; change OpenReadOnlyDB to accept a types.AppOptions parameter (or otherwise receive the same app options used for the writable path), reuse the existing profile → RocksDBTuneUpOptions mapping logic used by the writable OpenDB path to produce the proper RocksDBTuneUpOptions, and pass that options struct into openRocksdb instead of the empty RocksDBTuneUpOptions{} so read-only opens get validator/rpc/archive tuning applied.
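The shared mapping this comment suggests could look like the sketch below — a minimal, self-contained example assuming a hypothetical helper name `tuneUpOptsFromNodeType` and the `RocksDBTuneUpOptions` flags used in this PR; both `OpenDB` and `OpenReadOnlyDB` would call it instead of duplicating the switch:

```go
package main

import (
	"fmt"
	"strings"
)

// RocksDBTuneUpOptions mirrors the tuning flags referenced in this PR.
type RocksDBTuneUpOptions struct {
	EnableOptimizeForPointLookup bool
	EnableAutoReadaheadSize      bool
	EnableHyperClockCache        bool
	EnableAsyncIo                bool
}

// tuneUpOptsFromNodeType is a hypothetical shared helper: it maps the
// rocksdb.node_type profile to tuning flags, matching the switch in OpenDB.
// Unknown values fall through to the zero value (no tuning applied).
func tuneUpOptsFromNodeType(nodeType string) RocksDBTuneUpOptions {
	var o RocksDBTuneUpOptions
	switch strings.ToLower(nodeType) {
	case "validator":
		o.EnableOptimizeForPointLookup = true
	case "rpc":
		o.EnableAutoReadaheadSize = true
		o.EnableOptimizeForPointLookup = true
		o.EnableHyperClockCache = true
	case "archive":
		o.EnableAsyncIo = true
		o.EnableAutoReadaheadSize = true
		o.EnableHyperClockCache = true
	}
	return o
}

func main() {
	// A read-only open path could now reuse the same mapping:
	opts := tuneUpOptsFromNodeType("rpc")
	fmt.Println(opts.EnableHyperClockCache) // true
}
```

With this in place, `OpenReadOnlyDB` only needs the `node_type` string (or the `types.AppOptions` it came from) to produce the same tuning as the writable path.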
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cmd/cronosd/opendb/opendb_rocksdb.go`:
- Around line 143-145: The OptimizeForPointLookup call on opts is being
clobbered by the later opts.SetBlockBasedTableFactory(bbto); instead, when
tuneUpOpts.EnableOptimizeForPointLookup is true, set the equivalent point-lookup
knobs directly on the BlockBasedTableOptions instance (bbto) before calling
opts.SetBlockBasedTableFactory(bbto) — e.g., enable a hash-based block index and
tune block cache/partition settings on bbto so the point-lookup optimizations
take effect rather than relying on opts.OptimizeForPointLookup which is
overwritten.
📒 Files selected for processing (2)

- CHANGELOG.md
- cmd/cronosd/opendb/opendb_rocksdb.go
🚧 Files skipped from review as they are similar to previous changes (1)
- CHANGELOG.md
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cmd/cronosd/opendb/opendb_rocksdb.go (1)
111-130: ⚠️ Potential issue | 🟠 Major — `LoadLatestOptions` lifecycle cleanup is missing (C-memory leak risk).

Line 112 allocates `LatestOptions` and `Env`, but neither is ever released. In `grocksdb`, both `LatestOptions` and `Env` have explicit `Destroy()` methods that must be called to clean up underlying C resources. Additionally, the returned `Options` objects reference memory owned by `LatestOptions`, so they must be cloned before `LatestOptions` is destroyed to avoid dangling pointers.

♻️ Proposed fix
```diff
 func loadLatestOptions(dir string, cache *grocksdb.Cache) (*grocksdb.Options, error) {
-	opts, err := grocksdb.LoadLatestOptions(dir, grocksdb.NewDefaultEnv(), true, cache)
+	env := grocksdb.NewDefaultEnv()
+	defer env.Destroy()
+
+	latest, err := grocksdb.LoadLatestOptions(dir, env, true, cache)
 	if err != nil {
 		// not found is not an error
 		if strings.HasPrefix(err.Error(), "NotFound: ") {
 			return nil, nil
 		}
 		return nil, err
 	}
+	defer latest.Destroy()
-	cfNames := opts.ColumnFamilyNames()
-	cfOpts := opts.ColumnFamilyOpts()
+	cfNames := latest.ColumnFamilyNames()
+	cfOpts := latest.ColumnFamilyOpts()
 	for i := 0; i < len(cfNames); i++ {
 		if cfNames[i] == "default" {
-			return &cfOpts[i], nil
+			return cfOpts[i].Clone(), nil
 		}
 	}
-	return opts.Options(), nil
+	return latest.Options().Clone(), nil
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/cronosd/opendb/opendb_rocksdb.go` around lines 111 - 130, The loadLatestOptions function currently leaks C resources because LatestOptions (returned by grocksdb.LoadLatestOptions) and the Env created by grocksdb.NewDefaultEnv() are never destroyed; modify loadLatestOptions to call Destroy() on the LatestOptions and Env in all code paths (use defer after successful LoadLatestOptions) and return a cloned go-owned Options object instead of a pointer into LatestOptions-managed memory (e.g. clone cfOpts[i] via its Clone()/Copy() method or clone opts.Options() before destroying LatestOptions); also ensure the "NotFound" early return destroys the Env (and any created objects) before returning nil.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cmd/cronosd/opendb/opendb_rocksdb_stress_bench_test.go`:
- Line 69: The rand.Read call in opendb_rocksdb_stress_bench_test.go currently
drops its error which can silently corrupt the benchmark setup; capture the
returned error from rand.Read(values[i]) and fail the test/benchmark immediately
(using the test's failure helper, e.g. b.Fatalf or t.Fatalf depending on whether
this is a Benchmark or Test function) with a descriptive message including the
error so setup cannot proceed on failure.
- Around line 127-149: The mixed-workload loop currently ignores errors from
db.Get, db.Set and only partially checks db.Iterator; update the block to
capture and handle all returned errors: check the error return from db.Get and
db.Set and fail the test or record the failure (use the test benchmark handle in
scope, e.g., b.Fatalf/b.Error or t.Fatalf/t.Error) when err != nil; check the
error returned by db.Iterator and fail/record on non-nil error; ensure the
iterator is closed in a defer or after use and handle any iterator method
errors/signals appropriately (references: db.Get, db.Set, db.Iterator, itr,
itr.Close, itr.Next, itr.Valid, itr.Key, itr.Value) so benchmark results don’t
hide operation failures.
In `@nix/rocksdb.nix`:
- Line 26: Update the pinned RocksDB version string from "10.9.1" to "10.10.1"
(replace version = "10.9.1"; with version = "10.10.1";) and also replace the
existing sha256 value with the correct checksum for the v10.10.1 release so the
fetch succeeds; ensure the sha256 field that accompanies the
fetch/fetchFromGitHub entry is updated to the new hash for v10.10.1.
📒 Files selected for processing (3)

- cmd/cronosd/opendb/opendb_rocksdb.go
- cmd/cronosd/opendb/opendb_rocksdb_stress_bench_test.go
- nix/rocksdb.nix
Yes, just solved the build issue and rolled back to version v10.9.1.
Updated!
good job, lgtm.
Summary
Provides configurable RocksDB tuning profiles (Validator, RPC, Archive) out-of-the-box to optimize point lookups, lock-free caching, and background async I/O based on the node workload.
- Adds `RocksDBConfig` and a `node_type` enum (`""`, `"validator"`, `"rpc"`, `"archive"`) in `app.toml`.
- Applies `HyperClockCache`, `OptimizeForPointLookup`, and `AsyncIO`/`AutoReadaheadSize` dynamically inside `opendb_rocksdb.go`.
- Adds a stress benchmark (`opendb_rocksdb_stress_bench_test.go`) validating the RPS and throughput wins on the various node type profiles.
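Based on the configuration described above, selecting a profile would look roughly like this in `app.toml` — the section/key layout is inferred from the `rocksdb.node_type` key and the `mapstructure:"node_type"` tag, so treat it as a sketch:

```toml
[rocksdb]
# Tuning profile for RocksDB; valid values: "", "validator", "rpc", "archive".
# An empty value keeps the untuned defaults.
node_type = "rpc"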