
feat: add orchard zk #407

Open
QuantumExplorer wants to merge 43 commits into develop from feat/orchardZK

Conversation

@QuantumExplorer (Member) commented Feb 10, 2026

Issue being fixed or feature implemented

Adds Orchard-style zero-knowledge shielded pool support to GroveDB — a new CommitmentTree element type backed by Zcash's Sinsemilla Merkle tree (depth-32, append-only, over the Pallas curve). This is the GroveDB foundation for Dash Platform 3.x shielded transactions.

What was done?

New crate: grovedb-commitment-tree

  • Lightweight frontier-based Sinsemilla Merkle tree wrapping incrementalmerkletree::Frontier<MerkleHashOrchard, 32>
  • CommitmentFrontier: O(1) append and root hash computation, ~1KB constant size regardless of tree size
  • Custom binary serialization (no serde dependency)
  • Re-exports all Orchard types Platform needs: Builder, BundleType, BatchValidator, ProvingKey, VerifyingKey, Proof, SpendingKey, FullViewingKey, key hierarchy, Note,
    Anchor, MerklePath, TransmittedNoteCiphertext, ValueCommitment, redpallas
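
The frontier idea above can be illustrated with a self-contained sketch. This is not the crate's API: `ToyFrontier`, `combine`, and a toy `u64` hash (standing in for Sinsemilla over Pallas) are assumptions for illustration only. It shows why only the rightmost path (position + ommers) is needed for O(1) amortized appends and root computation:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const DEPTH: usize = 32;

/// Toy 8-byte node hash standing in for Sinsemilla.
fn combine(level: u8, left: u64, right: u64) -> u64 {
    let mut h = DefaultHasher::new();
    (level, left, right).hash(&mut h);
    h.finish()
}

/// Minimal frontier: only the rightmost path (position + ommers) is kept,
/// so memory is O(DEPTH) no matter how many leaves have been appended.
struct ToyFrontier {
    position: u64,
    ommers: [Option<u64>; DEPTH],
}

impl ToyFrontier {
    fn new() -> Self {
        ToyFrontier { position: 0, ommers: [None; DEPTH] }
    }

    /// Amortized O(1) append: carry the new leaf upward, merging with a
    /// stored ommer wherever the corresponding position bit is set.
    fn append(&mut self, leaf: u64) {
        let mut carry = leaf;
        let mut pos = self.position;
        for level in 0..DEPTH {
            if pos & 1 == 1 {
                let left = self.ommers[level].take().expect("ommer must exist");
                carry = combine(level as u8, left, carry);
                pos >>= 1;
            } else {
                self.ommers[level] = Some(carry);
                break;
            }
        }
        self.position += 1;
    }

    /// Root over the full depth, padding missing subtrees with empty roots.
    fn root(&self) -> u64 {
        let mut empty = [0u64; DEPTH + 1];
        for level in 0..DEPTH {
            empty[level + 1] = combine(level as u8, empty[level], empty[level]);
        }
        let mut acc = empty[0];
        for level in 0..DEPTH {
            acc = match self.ommers[level] {
                Some(left) => combine(level as u8, left, acc),
                None => combine(level as u8, acc, empty[level]),
            };
        }
        acc
    }
}
```

The real `incrementalmerkletree::Frontier<MerkleHashOrchard, 32>` follows the same shape with 32-byte Pallas-based hashes.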

New Element variant: Element::CommitmentTree

  • Discriminant 11, TreeType::CommitmentTree = 7
  • 4 fields: (Option<Vec<u8>>, [u8; 32], CountValue, Option<ElementFlags>) — root_key, sinsemilla_root, count, flags
  • Behaves like a CountTree for Merk purposes (CountNode, CountedMerkNode, count aggregation) with an additional sinsemilla root field
  • Items stored as cmx (32 bytes) || payload with sequential u64 BE keys
  • Updated across all match sites: element, merk, costs, delete, get, tree_type, batch, query, proof, node-grove converter
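
The item layout described above (sequential u64 BE keys, value = cmx || payload) can be sketched with plain std helpers; `item_key`, `item_value`, and `split_item_value` are hypothetical names for illustration, not functions from the PR:

```rust
/// Sequential big-endian keys sort in insertion order under bytewise
/// comparison, so iteration yields items in append order.
fn item_key(position: u64) -> [u8; 8] {
    position.to_be_bytes()
}

/// Value layout: fixed 32-byte note commitment (cmx) first, then the
/// opaque payload.
fn item_value(cmx: &[u8; 32], payload: &[u8]) -> Vec<u8> {
    let mut v = Vec::with_capacity(32 + payload.len());
    v.extend_from_slice(cmx);
    v.extend_from_slice(payload);
    v
}

/// Split a stored value back into (cmx, payload); None if too short.
fn split_item_value(value: &[u8]) -> Option<(&[u8], &[u8])> {
    if value.len() < 32 {
        return None;
    }
    Some(value.split_at(32))
}
```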

GroveDB operations (grovedb/src/operations/commitment_tree.rs)

  • commitment_tree_insert(path, key, cmx, payload, tx, gv) → ([u8; 32], u64) — append cmx to frontier, insert item into subtree, return new anchor + position
  • commitment_tree_anchor(path, key, tx, gv) → Anchor — read current Sinsemilla root as an Orchard Anchor type
  • Frontier persisted in aux storage via load/modify/save pattern (~1KB per commitment tree)
  • Historical anchors are not managed by GroveDB — Platform stores these in a separate provable tree
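
The load/modify/save aux pattern can be sketched against a stand-in store. Everything here is illustrative: a HashMap replaces GroveDB's aux storage context, and the persisted blob is a naive leaf list purely to show the shape of the pattern (the real serialized frontier stays ~1KB regardless of tree size):

```rust
use std::collections::HashMap;

// Stand-in aux store; the real code uses GroveDB's aux storage context.
type AuxStore = HashMap<&'static str, Vec<u8>>;

// Illustrative key name; the real key is a crate-internal constant.
const COMMITMENT_TREE_DATA_KEY: &str = "ct_frontier";

/// Load the tree state from aux, append one cmx, save it back, and
/// return the position the leaf was assigned.
fn append_via_aux(store: &mut AuxStore, cmx: [u8; 32]) -> u64 {
    // 1. Load (missing key means an empty tree).
    let mut leaves: Vec<[u8; 32]> = store
        .get(COMMITMENT_TREE_DATA_KEY)
        .map(|b| b.chunks_exact(32).map(|c| c.try_into().unwrap()).collect())
        .unwrap_or_default();
    // 2. Modify: next sequential position, then append.
    let position = leaves.len() as u64;
    leaves.push(cmx);
    // 3. Save back to aux.
    store.insert(COMMITMENT_TREE_DATA_KEY, leaves.concat());
    position
}
```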

Batch operations

  • GroveOp::CommitmentTreeInsert { cmx, payload } — batchable insert
  • QualifiedGroveDbOp::commitment_tree_insert_op() constructor
  • Preprocessing converts CommitmentTreeInsert ops → item inserts + ReplaceTreeRootKey before batch execution
  • Updated average/worst case cost estimation
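
The preprocessing step can be sketched as a pure function over a simplified op enum. The `Op` variants and `preprocess` signature below are assumptions mirroring the description, not the actual `GroveOp` API:

```rust
// Illustrative stand-in for the batch op enum.
#[derive(Debug, PartialEq)]
enum Op {
    CommitmentTreeInsert { cmx: [u8; 32], payload: Vec<u8> },
    InsertItem { key: Vec<u8>, value: Vec<u8> },
    ReplaceTreeRootKey { sinsemilla_root: [u8; 32] },
}

/// Expand each CommitmentTreeInsert into a plain item insert at the next
/// sequential position, then emit one root-key replacement carrying the
/// Sinsemilla root after all appends.
fn preprocess(ops: Vec<Op>, mut next_position: u64, root_after: [u8; 32]) -> Vec<Op> {
    let mut out = Vec::new();
    let mut saw_ct_op = false;
    for op in ops {
        match op {
            Op::CommitmentTreeInsert { cmx, payload } => {
                saw_ct_op = true;
                let mut value = cmx.to_vec();
                value.extend_from_slice(&payload);
                out.push(Op::InsertItem {
                    key: next_position.to_be_bytes().to_vec(),
                    value,
                });
                next_position += 1;
            }
            other => out.push(other),
        }
    }
    if saw_ct_op {
        out.push(Op::ReplaceTreeRootKey { sinsemilla_root: root_after });
    }
    out
}
```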

Root hash propagation

  • Sinsemilla root is stored in the Element and authenticated through the Merk hash chain
  • Insert operations propagate changes up through the GroveDB Merk hierarchy via update_tree_item_preserve_flag + propagate_changes_with_transaction
  • The sinsemilla root is preserved during normal Merk propagation (child modifications don't overwrite it)
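
The preservation invariant can be modeled with a toy struct: replacing the subtree root key during propagation must carry the stored Sinsemilla root and count through unchanged. Field names here are assumptions matching the description, not the real `Element` definition:

```rust
#[derive(Clone, Debug, PartialEq)]
struct CommitmentTreeElement {
    root_key: Option<Vec<u8>>,
    sinsemilla_root: [u8; 32],
    count: u64,
}

/// Propagation-style update: only root_key changes; the Sinsemilla root
/// and count are copied from the existing element.
fn replace_root_key(elem: &CommitmentTreeElement, new_root_key: Vec<u8>) -> CommitmentTreeElement {
    CommitmentTreeElement {
        root_key: Some(new_root_key),
        ..elem.clone()
    }
}
```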

Design decisions

  • Frontier, not ShardTree: Only the rightmost path is stored (~1KB), not the full tree. Witness generation is a client/wallet concern, not a consensus node concern (same
    as Zebra's architecture)
  • CountTree behavior: Items inside the commitment tree are queryable, provable, and countable through standard GroveDB mechanisms
  • Historical anchors are Platform's responsibility: Platform stores anchors as Items in a separate GroveDB tree, making them provable, queryable by block height, and
    prunable. GroveDB only tracks the current frontier state.
  • BatchValidator re-exported (not just verify_proof) — Platform MUST use BatchValidator to verify binding + SpendAuth signatures, not just ZK proofs
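
The ~1KB frontier figure is easy to sanity-check. Assuming a layout of a u64 position, the 32-byte rightmost leaf, up to DEPTH-1 ommers of 32 bytes each, and a small presence/length header (this exact layout is an assumption, not the crate's wire format):

```rust
const DEPTH: usize = 32;

/// Back-of-envelope upper bound on the serialized frontier size.
fn max_frontier_bytes() -> usize {
    8                  // u64 position
    + 32               // rightmost leaf
    + (DEPTH - 1) * 32 // ommers, one per level at most
    + 8                // ommer-presence bitmap / length header
}
```

That lands just over 1KB, constant regardless of how many commitments the tree holds.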

How Has This Been Tested?

12 unit tests in grovedb-commitment-tree + 22 integration tests in grovedb/src/tests/commitment_tree_tests.rs:

Element basics (4): insert at root, under normal tree, with flags, serialization roundtrip

Insert operations (4): single insert with cmx+payload verification, multiple sequential inserts, insert with transaction, transaction rollback

Anchor/frontier (3): empty anchor, anchor changes after insert, deterministic anchors across databases

Root hash propagation (2): insert propagates to GroveDB root, nested propagation (parent tree → commitment tree → items)

Count aggregation (1): count reflects number of inserted items

Batch operations (3): batch insert multiple items, batch with transaction, batch-vs-nonbatch consistency (same GroveDB root hash)

Delete (2): delete non-empty commitment tree, delete and recreate

Error handling (2): insert on non-commitment-tree fails, anchor on non-commitment-tree fails

Multi-pool (1): two independent commitment trees with different anchors and counts

Crate-level (12): empty frontier, append changes root, position tracking, deterministic roots, different leaves different roots, serialize empty/roundtrip/many-leaves,
invalid field element, invalid data, root hash size, empty root matches Orchard

cargo test -p grovedb-commitment-tree # 12 tests
cargo test -p grovedb -- commitment_tree # 22 tests
cargo test -p grovedb -- batch # no regressions

Breaking Changes

  • New Element::CommitmentTree variant (discriminant 11): Any exhaustive match on Element will need updating. Serialization format extended — older versions cannot
    deserialize databases containing this element type.
  • New ElementType::CommitmentTree = 11 and TreeType::CommitmentTree = 7: Same compatibility concern for code matching on these enums.
  • New GroveOp::CommitmentTreeInsert (10): Batch operation consumers matching on GroveOp must handle the new variant.
  • New Error::CommitmentTreeError(String) variant on grovedb::Error.
  • New workspace dependency: grovedb-commitment-tree crate added, which pulls in orchard (with circuit feature), incrementalmerkletree, and transitively the Pallas/Vesta
    curve implementations.

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have made corresponding changes to the documentation

For repository code-owners and collaborators only

  • I have assigned this pull request to a milestone

Summary by CodeRabbit

Release Notes

  • New Features

    • Added commitment tree support with Sinsemilla frontier for zero-knowledge protocol integration, including commitment insertion, anchor retrieval, and Merkle witness generation
    • Added batch operation support for commitment tree inserts
    • Introduced client-side and server-side commitment tree implementations via feature flags
  • Infrastructure

    • Extended cost tracking to include Sinsemilla hash operation metrics
    • Expanded workspace to include the new commitment tree module

@coderabbitai (Contributor) bot commented Feb 10, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds a new crate grovedb-commitment-tree (Orchard Sinsemilla commitment tree, shard/KV storage, serializers, in-memory store) and integrates a CommitmentTree element into GroveDB: ops, batching/preprocessing, aux persistence, cost/merk mappings, propagation, deletion cleanup, and extensive tests.

Changes

New crate: commitment tree
  Files: grovedb-commitment-tree/Cargo.toml, grovedb-commitment-tree/src/lib.rs, grovedb-commitment-tree/src/kv_store.rs, grovedb-commitment-tree/src/serialization.rs, grovedb-commitment-tree/src/storage.rs
  Adds new crate implementing CommitmentTree (ShardTree-backed), KvStore trait + MemKvStore, binary serializers for prunable trees/checkpoints, in-memory test store, Orchard/shardtree re-exports, public API, and tests.

Workspace & grovedb manifest
  Files: Cargo.toml, grovedb/Cargo.toml
  Updates workspace members ordering and adds optional grovedb-commitment-tree dependency enabled by the minimal feature.

Element model & helpers
  Files: grovedb-element/src/element/mod.rs, grovedb-element/src/element_type.rs, grovedb-element/src/element/constructor.rs, grovedb-element/src/element/helpers.rs, grovedb-element/src/element/visualize.rs
  Introduces Element::CommitmentTree(Option<Vec<u8>>, Option<ElementFlags>); adds constructors, type mapping, flag helpers, visualization, and detection helpers.

Merk/TreeType & costs
  Files: merk/src/tree_type/mod.rs, merk/src/tree_type/costs.rs, merk/src/element/tree_type.rs, merk/src/element/costs.rs, merk/src/element/get.rs, merk/src/element/delete.rs
  Adds TreeType::CommitmentTree, extends tree-type mappings, cost calculations, get/delete dispatch, and feature-type handling to treat CommitmentTree as a tree-like type.

GroveDB ops & batching
  Files: grovedb/src/batch/mod.rs, grovedb/src/batch/batch_structure.rs, grovedb/src/batch/estimated_costs/average_case_costs.rs, grovedb/src/batch/estimated_costs/worst_case_costs.rs
  Adds GroveOp variants CommitmentTreeAppend and CommitmentTreeCheckpoint, constructors, debug formatting, preprocessing hooks, and maps their cost estimation to ReplaceTreeRootKey-like paths.

Commitment-tree operations
  Files: grovedb/src/operations/commitment_tree.rs, grovedb/src/operations/mod.rs, grovedb/src/operations/get/mod.rs, grovedb/src/operations/get/query.rs, grovedb/src/operations/proof/generate.rs, grovedb/src/operations/proof/verify.rs
  New operations module implementing CT APIs (append, checkpoint, root/witness queries, prepare_spend), aux load/persist, batch preprocessing, and integrates CommitmentTree into query/proof flows (gated by minimal).

Core integration & errors
  Files: grovedb/src/lib.rs, grovedb/src/error.rs
  Extends propagation/verification to handle CommitmentTree elements and adds Error::CommitmentTreeError(String).

Delete cleanup
  Files: grovedb/src/operations/delete/mod.rs
  Deletes commitment-tree auxiliary data when removing a CommitmentTree element to avoid orphaned aux state.

Tests & tooling
  Files: grovedb/src/tests/commitment_tree_tests.rs, grovedb/src/tests/mod.rs, node-grove/src/converter.rs
  Adds extensive integration tests covering CT lifecycle, batch semantics, Orchard witness/inclusion checks, and updates node-grove converter to recognize CommitmentTree.

Sequence Diagram

sequenceDiagram
    participant Client as Client/Batch
    participant GroveDB as GroveDB
    participant Preproc as BatchPreprocessor
    participant Aux as AuxKvStore
    participant Ct as CommitmentTree

    Client->>GroveDB: submit batch (may include CT append/checkpoint)
    activate GroveDB

    GroveDB->>Preproc: preprocess_commitment_tree_ops(batch)
    Preproc-->>GroveDB: expanded ops + aux mutations

    GroveDB->>Aux: load CT aux (path, key)
    Aux-->>GroveDB: serialized CT bytes

    GroveDB->>Ct: CommitmentTree::new(store, max_checkpoints)
    Ct-->>GroveDB: Ct instance

    GroveDB->>Ct: append(leaf) / checkpoint(id)
    Ct->>Ct: ShardTree append/checkpoint -> compute root & position
    Ct-->>GroveDB: (root_hash, position)

    GroveDB->>Aux: persist CT aux (serialized shards & checkpoints)
    Aux-->>GroveDB: persisted

    GroveDB->>GroveDB: emit ReplaceTreeRootKey (propagate root)
    GroveDB-->>Client: return result
    deactivate GroveDB

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Poem

🐇 I hopped through shards and left a sign,
Roots grew true from every tiny spine,
Checkpoints winked as leaves rearranged,
I bounded past bytes where commitments changed,
Hooray—new trees, new hops, and proofs well-ranged!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

  • Title check — ❓ Inconclusive. The title 'feat: add orchard zk' is vague and uses non-descriptive terminology. While it references the feature addition, it does not clearly convey what 'orchard zk' entails or the primary change (introduction of CommitmentTree and Sinsemilla support). Resolution: consider a more specific title such as 'feat: add Orchard commitment tree with Sinsemilla support' or 'feat: introduce CommitmentTree element for shielded pool operations' to better reflect the actual scope of changes.

✅ Passed checks (2 passed)

  • Docstring Coverage — ✅ Passed. Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.
  • Description Check — ✅ Passed. Check skipped — CodeRabbit's high-level summary is enabled.


@coderabbitai bot left a comment

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In `@grovedb-element/src/element_type.rs`:
- Around line 550-565: The test's discriminant coverage is missing the
ProvableCountSumTree variant and the assertion still expects 11 variants; update
the test_cases vector to include the ProvableCountSumTree tuple (matching the
pattern used: an Element::ProvableCountSumTree(...) entry, the corresponding
ElementType::ProvableCountSumTree, and the string "ProvableCountSumTree") and
then change the assert_eq! expected value from 11 to 12 so the test verifies all
enum discriminants are covered.

In `@grovedb/src/error.rs`:
- Around line 164-167: The Error::CommitmentTreeError(String) variant is missing
from the add_context match in the Error enum’s add_context function; add a match
arm for CommitmentTreeError(err) that returns
Error::CommitmentTreeError(format!("{}: {}", context, err)) (or equivalent
ordering you use for other String variants) so context is preserved, consistent
with how MissingReference, InternalError, NotSupported, etc. are handled in
add_context.

In `@grovedb/src/lib.rs`:
- Around line 730-733: The CommitmentTree branch currently maps errors with
.map_err(|e| e.into()) which loses context; update the error mapping where
Element::CommitmentTree is handled (the call to
Element::new_commitment_tree_with_flags(...).insert_subtree(...)) to wrap the
underlying error with contextual information (e.g., use Error::CorruptedData or
similar crate-specific error constructor and include formatted context like the
parent key, key_ref, or grove_version) instead of plain into(); apply the same
pattern to the other occurrences noted (around the block covering lines
~900–915) so every .map_err(|e| e.into()) on insert_subtree or similar
CommitmentTree propagation calls includes a clear descriptive message about what
operation and inputs failed.

In `@grovedb/src/operations/commitment_tree.rs`:
- Around line 224-277: commitment_tree_checkpoint currently loads aux storage
without verifying the target element is present and of type CommitmentTree
(unlike commitment_tree_append); update commitment_tree_checkpoint to first
check the element at the given path+key is present and its element type equals
CommitmentTree (use the same lookup/validation logic as commitment_tree_append),
and return an appropriate Error if missing or of the wrong type before calling
get_immediate_storage_context/loading mem_store and checkpointing (keep
references to commitment_tree_checkpoint, commitment_tree_append,
COMMITMENT_TREE_DATA_KEY, get_immediate_storage_context, and the
Error::CommitmentTreeError path to locate where to add the validation).

In `@grovedb/src/tests/commitment_tree_tests.rs`:
- Around line 14-1827: The tests exercise many state-changing ops but lack proof
round-trips: add at least one test that, after calling commitment_tree_append
(and one that uses apply_batch with commitment_tree_append ops), checkpoints the
tree, obtains a proof via commitment_tree_witness or
commitment_tree_prepare_spend, and verifies the proof by comparing the
reconstructed root/anchor to commitment_tree_root_hash / commitment_tree_anchor
(use ExtractedNoteCommitment::from_bytes to convert leaves and
MerklePath::root(cm) to compare); reference the existing tests' insert/append
setup and the functions commitment_tree_append, apply_batch,
commitment_tree_checkpoint, commitment_tree_witness,
commitment_tree_prepare_spend, commitment_tree_root_hash, and
commitment_tree_anchor to locate where to add these assertions.
🧹 Nitpick comments (4)
grovedb/Cargo.toml (1)

72-83: Adding grovedb-commitment-tree to minimal makes it a default dependency for all users.

The feature chain is default → full → minimal, so grovedb-commitment-tree (with its orchard + Halo2 circuit dependencies) will be pulled in for every default build. This adds significant compile time and binary size. Consider whether this should be a separate, opt-in feature rather than bundled into minimal.

Suggested approach: separate feature gate
 minimal = [
     "grovedb-merk/minimal",
-    "grovedb-commitment-tree",
     "thiserror",
     "tempfile",
     "grovedb-storage/rocksdb_storage",
     "visualize",
     "itertools",
     "integer-encoding",
     "grovedb-costs",
     "intmap",
 ]
+commitment_tree = ["grovedb-commitment-tree", "minimal"]

Then gate the commitment tree operations behind #[cfg(feature = "commitment_tree")] instead of #[cfg(feature = "minimal")].

grovedb/src/batch/estimated_costs/worst_case_costs.rs (1)

129-153: Minor: use the already-imported TreeType alias instead of fully-qualified path.

TreeType is already imported at line 17. Using TreeType::NormalTree instead of grovedb_merk::tree_type::TreeType::NormalTree would be more concise and consistent with the rest of the file.

♻️ Suggested diff
-                    grovedb_merk::tree_type::TreeType::NormalTree,
+                    TreeType::NormalTree,

(Apply to both occurrences at lines 135 and 147.)

grovedb/src/batch/estimated_costs/average_case_costs.rs (1)

132-154: Use TreeType::CommitmentTree instead of hardcoded TreeType::NormalTree in both cost estimation files.

The cost_size for CommitmentTree is identical to NormalTree (both use TREE_COST_SIZE = 3) and they share the same inner node type (NormalNode). However, using TreeType::CommitmentTree is semantically more accurate and provides resilience: if commitment tree cost characteristics change in the future, the code will automatically reflect those changes.

This pattern appears in both average_case_costs.rs (lines 132, 144) and worst_case_costs.rs (lines 129, 142).

grovedb/src/tests/commitment_tree_tests.rs (1)

19-28: Prefer the standard CostResult discard pattern for consistency.
Consider using .unwrap().map_err(|e| e) (or a tiny helper) when discarding costs so test helpers can return Result consistently.

♻️ Example tweak
-    db.insert(
-        EMPTY_PATH,
-        b"commitments",
-        Element::empty_commitment_tree(),
-        None,
-        None,
-        grove_version,
-    )
-    .unwrap()
-    .expect("successful commitment tree insert at root");
+    db.insert(
+        EMPTY_PATH,
+        b"commitments",
+        Element::empty_commitment_tree(),
+        None,
+        None,
+        grove_version,
+    )
+    .unwrap()
+    .map_err(|e| e)
+    .expect("successful commitment tree insert at root");
Based on learnings: "In Rust test code for grovedb, when a function returns a cost-wrapped result (CostResult), using .unwrap().map_err(|e| e) is acceptable to discard cost tracking by unwrapping the inner Result and converting the error type for test helpers."

@coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@grovedb-commitment-tree/src/lib.rs`:
- Around line 164-173: The function root_at_checkpoint_depth currently converts
Ok(None) from self.inner.root_at_checkpoint_depth into an empty root, which
masks missing checkpoints; change it so that when
self.inner.root_at_checkpoint_depth(depth) returns Ok(None) and depth.is_some()
you return an Err signalling the missing checkpoint (e.g.
CommitmentTreeError::CheckpointNotFound(depth.unwrap()) or add a new
CommitmentTreeError variant like CheckpointUnavailable if none exists), while
preserving the existing behavior of returning
MerkleHashOrchard::empty_root(Level::from(NOTE_COMMITMENT_TREE_DEPTH as u8))
only when depth is None; update root_at_checkpoint_depth and adjust the
CommitmentTreeError enum accordingly.

In `@grovedb/src/operations/delete/mod.rs`:
- Around line 643-666: The aux cleanup (delete_aux for COMMITMENT_TREE_DATA_KEY)
is scheduled before the non-empty-tree guard, which can delete commitment-tree
aux data even when allow_deleting_non_empty_trees is false and the element is
not removed; fix by moving the commitment-tree aux deletion to occur only after
the emptiness check and only in the branches that actually proceed with removal
(reuse the already-derived subtree_merk_path_ref / SubtreePath rather than
calling path.derive_owned_with_child(key) earlier) so that delete_aux is
appended to the batch only when element.is_commitment_tree() and the function
will continue past the non-empty-tree guard.
🧹 Nitpick comments (4)
grovedb-commitment-tree/src/kv_store.rs (2)

85-98: Serialization panics on overflow — document the precondition or return Result.

serialize() uses .expect("entry count overflow"), .expect("key length overflow"), etc. The doc comment says "Panics if entry count, key length, or value length exceeds u32::MAX" — this is acknowledged. For commitment tree data this is practically unreachable, but returning a Result would be more defensive.


383-392: last_shard collects all shard entries to access the last one.

prefix_iter collects all entries into a Vec, then last() picks only the final element. For the in-memory MemKvStore with ≤65536 shards this is acceptable, but if the KvStore trait is ever implemented over a larger backing store, this would be wasteful. Consider adding a last_with_prefix method to the trait in the future.

grovedb/src/operations/commitment_tree.rs (1)

235-261: _grove_version is used but has an underscore prefix, which is misleading.

The _grove_version parameter is actually passed to validate_is_commitment_tree (line 258), so it's not unused. The underscore prefix conventionally signals an intentionally unused variable in Rust, which could confuse readers. This same pattern appears in commitment_tree_root_hash, commitment_tree_witness, commitment_tree_current_end_position, commitment_tree_anchor, commitment_tree_orchard_witness, and commitment_tree_prepare_spend.

Rename to `grove_version` (drop the underscore prefix)
     pub fn commitment_tree_checkpoint<'b, B, P>(
         &self,
         path: P,
         key: &[u8],
         checkpoint_id: u64,
         transaction: TransactionArg,
-        _grove_version: &GroveVersion,
+        grove_version: &GroveVersion,
     ) -> CostResult<(), Error>

Apply the same rename in all affected methods.

grovedb/src/tests/commitment_tree_tests.rs (1)

460-465: test_leaf_bytes produces only 31 distinct values — fragile for future test expansion.

The helper uses (index % 31) as u8 + 1 for the level parameter, meaning test_leaf_bytes(0) == test_leaf_bytes(31), test_leaf_bytes(1) == test_leaf_bytes(32), etc. All current test indices produce distinct leaves, but this could silently cause collisions if someone adds tests with higher indices.

Consider encoding more bits of the index into the leaf
 fn test_leaf_bytes(index: u64) -> [u8; 32] {
     use grovedb_commitment_tree::Level;
     let empty = MerkleHashOrchard::empty_leaf();
-    let leaf = MerkleHashOrchard::combine(Level::from((index % 31) as u8 + 1), &empty, &empty);
-    leaf.to_bytes()
+    // Use two levels of combining to get more distinct values
+    let inner = MerkleHashOrchard::combine(Level::from((index % 31) as u8 + 1), &empty, &empty);
+    let leaf = MerkleHashOrchard::combine(Level::from(((index / 31) % 31) as u8 + 1), &empty, &inner);
+    leaf.to_bytes()
 }

@coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@grovedb/src/tests/commitment_tree_tests.rs`:
- Around line 2233-2237: The docstring for
test_commitment_tree_element_type_confusion claims it verifies overwriting a
normal tree with a commitment tree (and vice versa) but the body only checks
that commitment-tree ops fail on a normal tree; either remove that overwrite
claim or add the overwrite scenarios: extend the test to (1) create a normal
tree at a path, assert commitment-tree operations fail, then overwrite that path
with a commitment tree using the same tree-creation API used elsewhere in these
tests, assert commitment-tree operations now succeed and normal-tree-only ops
fail; and (2) repeat the reverse: create a commitment tree, assert commitment
ops work, overwrite with a normal tree, and assert commitment ops now fail—use
the existing helper/creation functions in this module to perform the overwrites
so assertions match other tests.
🧹 Nitpick comments (1)
grovedb/src/tests/commitment_tree_tests.rs (1)

459-465: test_leaf_bytes produces only 31 distinct values — fragile for future tests.

index % 31 means test_leaf_bytes(0) == test_leaf_bytes(31), etc. No current test hits this, but a future test using more than 31 leaves or unlucky index pairs will silently collide, causing confusing failures.

A hash-based approach avoids the limitation:

Suggested improvement
 fn test_leaf_bytes(index: u64) -> [u8; 32] {
     use grovedb_commitment_tree::Level;
-    let empty = MerkleHashOrchard::empty_leaf();
-    let leaf = MerkleHashOrchard::combine(Level::from((index % 31) as u8 + 1), &empty, &empty);
-    leaf.to_bytes()
+    // Build a unique leaf per index by chaining combine operations.
+    // Start from empty_leaf and fold index bytes into the hash so
+    // every index maps to a distinct valid Pallas element.
+    let mut node = MerkleHashOrchard::empty_leaf();
+    for (i, byte) in index.to_le_bytes().iter().enumerate() {
+        let level = Level::from((i as u8 % 31) + 1);
+        // Mix byte value by combining node at different levels
+        for _ in 0..(*byte as u16 + 1) {
+            node = MerkleHashOrchard::combine(level, &node, &MerkleHashOrchard::empty_leaf());
+        }
+    }
+    node.to_bytes()
 }

@coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
grovedb/src/operations/proof/verify.rs (1)

402-410: ⚠️ Potential issue | 🟠 Major

CommitmentTree count extraction missing in trunk-proof flow.
Lines 409 and 528 treat CommitmentTree as a tree for presence proofs, but extract_count_from_element (used by trunk chunk verification) still ignores CommitmentTree, which will cause “target is not a count tree element” errors if trunk/branch proofs are used for commitment trees. If CommitmentTree is intended to behave like CountTree, please include it there.

🔧 Proposed fix
fn extract_count_from_element(element: &Element) -> Option<u64> {
    match element {
        Element::CountTree(_, count, _)
        | Element::CountSumTree(_, count, ..)
        | Element::ProvableCountTree(_, count, _)
        | Element::ProvableCountSumTree(_, count, ..)
+       | Element::CommitmentTree(_, _, count, _) => Some(*count),
        _ => None,
    }
}

Also applies to: 521-529

🤖 Fix all issues with AI agents
In `@grovedb/src/operations/commitment_tree.rs`:
- Around line 124-126: The call to storage_ctx.put_aux(COMMITMENT_TREE_DATA_KEY,
&serialized, None).map_err(|e| e.into()) drops context on failures; replace the
map_err to wrap the error with contextual information (e.g., use the project's
error type like Error::CorruptedData or similar) and include a descriptive
message mentioning COMMITMENT_TREE_DATA_KEY and the operation (writing
commitment tree data) so failures are traceable; apply the same pattern to the
other occurrence around lines 423-427 (the other put_aux call) so both
persistence errors are wrapped with formatted context.
- Around line 103-139: The aux frontier write is currently performed via
storage_ctx.put_aux immediately, which can advance the frontier without the
corresponding subtree StorageBatch updates if a crash occurs; instead,
create/use the same StorageBatch (StorageBatch::new or the existing batch) and
record the aux write into that batch (using the storage layer's "put aux into
batch" API) rather than calling storage_ctx.put_aux directly, remove the
immediate put_aux call that uses storage_ctx, then include the frontier
serialized data (from frontier.serialize()) in the shared batch alongside the
item insert (item_key/item_element) and finally commit both together via
commit_multi_context_batch so the aux frontier and Merk subtree updates are
atomic; adjust surrounding code to drop the immediate storage_ctx use
accordingly and keep load_frontier_from_aux, frontier.append, and
frontier.serialize unchanged.
🧹 Nitpick comments (1)
grovedb/src/debugger.rs (1)

722-727: CommitmentTree loses debugging metadata when mapped to generic Subtree.

The CommitmentTree element's sinsemilla_root and count fields are discarded when converting to grovedbg_types::Element::Subtree. This reduces debugging visibility compared to other tree variants (CountTree, ProvableCountTree, etc.) which preserve their metadata in the debugger output.

Since grovedbg_types::Element lacks a CommitmentTree variant, consider either:

  1. Adding a CommitmentTree variant to grovedbg_types with sinsemilla_root and count fields, or
  2. Adding a code comment explaining this interim mapping.

Comment on lines 103 to 139
        let storage_ctx = self
            .db
            .get_immediate_storage_context(ct_path.clone(), tx.as_ref())
            .unwrap_add_cost(&mut cost);

        let mut frontier =
            cost_return_on_error_no_add!(cost, load_frontier_from_aux(&storage_ctx, &mut cost));

        let position = frontier.tree_size(); // next sequential position

        let new_sinsemilla_root = cost_return_on_error_no_add!(
            cost,
            frontier
                .append(cmx)
                .map_err(|e| Error::CommitmentTreeError(format!("append failed: {}", e)))
        );

        // 4. Save frontier back to aux
        let serialized = frontier.serialize();
        cost_return_on_error!(
            &mut cost,
            storage_ctx
                .put_aux(COMMITMENT_TREE_DATA_KEY, &serialized, None)
                .map_err(|e| e.into())
        );

        #[allow(clippy::drop_non_drop)]
        drop(storage_ctx);

        // 5. Create the item and insert into the subtree
        let item_key = position.to_be_bytes();
        let mut item_value = Vec::with_capacity(32 + payload.len());
        item_value.extend_from_slice(&cmx);
        item_value.extend_from_slice(&payload);
        let item_element = Element::new_item(item_value);

        let batch = StorageBatch::new();

⚠️ Potential issue | 🟠 Major

Make aux frontier writes atomic with subtree updates.
put_aux is executed via an immediate context before the batch is committed. When transaction is None, a failure/crash between the aux write and commit_multi_context_batch can leave the frontier advanced without the corresponding Merk updates. Use a shared StorageBatch for the aux write so both updates commit atomically.

🛠️ Proposed fix (batch the aux write)
-        // 3. Load frontier from aux, append cmx, get new root and position
-        let storage_ctx = self
-            .db
-            .get_immediate_storage_context(ct_path.clone(), tx.as_ref())
-            .unwrap_add_cost(&mut cost);
+        // 3. Load frontier from aux, append cmx, get new root and position
+        // Use a shared batch so aux + merk updates commit atomically
+        let batch = StorageBatch::new();
+        let storage_ctx = self
+            .db
+            .get_transactional_storage_context(ct_path.clone(), Some(&batch), tx.as_ref())
+            .unwrap_add_cost(&mut cost);
@@
-        let batch = StorageBatch::new();
+        // batch already created above
🤖 Prompt for AI Agents
In `@grovedb/src/operations/commitment_tree.rs` around lines 103 - 139, The aux
frontier write is currently performed via storage_ctx.put_aux immediately, which
can advance the frontier without the corresponding subtree StorageBatch updates
if a crash occurs; instead, create/use the same StorageBatch (StorageBatch::new
or the existing batch) and record the aux write into that batch (using the
storage layer's "put aux into batch" API) rather than calling
storage_ctx.put_aux directly, remove the immediate put_aux call that uses
storage_ctx, then include the frontier serialized data (from
frontier.serialize()) in the shared batch alongside the item insert
(item_key/item_element) and finally commit both together via
commit_multi_context_batch so the aux frontier and Merk subtree updates are
atomic; adjust surrounding code to drop the immediate storage_ctx use
accordingly and keep load_frontier_from_aux, frontier.append, and
frontier.serialize unchanged.
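The principle behind the fix can be illustrated with a toy write-batch (a generic sketch of atomic staged writes, not GroveDB's storage API): both the aux write and the item write are staged in one batch, so a crash before commit leaves neither applied.

```rust
use std::collections::HashMap;

/// Toy key-value store with an atomic batch commit (illustrative only).
struct Store {
    data: HashMap<Vec<u8>, Vec<u8>>,
}

struct Batch {
    ops: Vec<(Vec<u8>, Vec<u8>)>,
}

impl Batch {
    fn new() -> Self {
        Batch { ops: Vec::new() }
    }
    // Stage a write; nothing touches the store yet.
    fn put(&mut self, key: &[u8], value: &[u8]) {
        self.ops.push((key.to_vec(), value.to_vec()));
    }
}

impl Store {
    fn commit(&mut self, batch: Batch) {
        // All staged writes land together; a crash before this point
        // leaves the store untouched, so the frontier (aux) and the
        // item (merk) writes can never diverge.
        for (k, v) in batch.ops {
            self.data.insert(k, v);
        }
    }
}

fn main() {
    let mut store = Store { data: HashMap::new() };

    let mut batch = Batch::new();
    batch.put(b"aux/frontier", b"serialized-frontier");
    batch.put(b"data/item/0", b"cmx||payload");

    // Before commit: neither write is visible.
    assert!(store.data.is_empty());

    store.commit(batch);

    // After commit: both writes are visible together.
    assert!(store.data.contains_key(b"aux/frontier".as_slice()));
    assert!(store.data.contains_key(b"data/item/0".as_slice()));
}
```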

Comment on lines 124 to 126
storage_ctx
.put_aux(COMMITMENT_TREE_DATA_KEY, &serialized, None)
.map_err(|e| e.into())
Contributor

⚠️ Potential issue | 🟡 Minor

Wrap aux persistence errors with context.
map_err(|e| e.into()) loses critical context for diagnosing frontier write failures.

🧩 Suggested fix
-            storage_ctx
-                .put_aux(COMMITMENT_TREE_DATA_KEY, &serialized, None)
-                .map_err(|e| e.into())
+            storage_ctx
+                .put_aux(COMMITMENT_TREE_DATA_KEY, &serialized, None)
+                .map_err(|e| {
+                    Error::CorruptedData(format!(
+                        "commitment tree frontier persist failed: {}",
+                        e
+                    ))
+                })
-            storage_ctx
-                .put_aux(COMMITMENT_TREE_DATA_KEY, &serialized, None)
-                .map_err(|e| e.into())
+            storage_ctx
+                .put_aux(COMMITMENT_TREE_DATA_KEY, &serialized, None)
+                .map_err(|e| {
+                    Error::CorruptedData(format!(
+                        "commitment tree frontier persist failed during preprocessing: {}",
+                        e
+                    ))
+                })
As per coding guidelines: "{grovedb,merk,storage,costs}/src/**/*.rs : Wrap errors with context (e.g., .map_err(|e| Error::CorruptedData(format!(\"context: {}\", e))))".

Also applies to: 423-427

🤖 Prompt for AI Agents
In `@grovedb/src/operations/commitment_tree.rs` around lines 124 - 126, The call
to storage_ctx.put_aux(COMMITMENT_TREE_DATA_KEY, &serialized, None).map_err(|e|
e.into()) drops context on failures; replace the map_err to wrap the error with
contextual information (e.g., use the project's error type like
Error::CorruptedData or similar) and include a descriptive message mentioning
COMMITMENT_TREE_DATA_KEY and the operation (writing commitment tree data) so
failures are traceable; apply the same pattern to the other occurrence around
lines 423-427 (the other put_aux call) so both persistence errors are wrapped
with formatted context.

@PastaPastaPasta
Member

Code Review: PR #407 — feat: add orchard zk

Overview

This PR introduces Orchard-style zero-knowledge shielded pool support to GroveDB via a new
CommitmentTree element type. It adds:

  • A new grovedb-commitment-tree crate wrapping incrementalmerkletree::Frontier with Orchard's
    Sinsemilla hashing
  • A new Element::CommitmentTree variant (discriminant 11) that behaves like a CountTree with an
    additional 32-byte sinsemilla root
  • Direct and batch insertion APIs
  • Comprehensive test coverage (12 crate-level + 22 integration tests)

The design is architecturally sound: consensus nodes store only the frontier (~1KB), while
witness generation is a client/wallet concern.


Strengths

  • Clean separation of concerns: Server stores frontier only; client (behind feature flag) wraps ShardTree for witness generation
  • Thorough test coverage: Insert, delete/recreate, batch, transaction rollback, multi-pool, determinism, verify_grovedb corruption detection
  • Batch/non-batch consistency test: Explicitly verifies both APIs produce identical GroveDB root hashes
  • Forward-compatible serialization: Trailing bytes in frontier deserialization are tolerated
  • Cost tracking: New sinsemilla_hash_calls field properly threaded through all cost aggregation paths
  • Delete cleanup: Aux storage for commitment frontier is properly cleaned up on tree deletion

Issues and Suggestions

Critical / Correctness

  1. test_cmx may produce invalid Pallas field elements
    grovedb/src/tests/commitment_tree_tests.rs:2974-2981 — The helper just sets bytes[0] = index and
    clears the top bit. This doesn't guarantee a valid Pallas field element (the modulus is not
    simply 2^255). While most small values will work, edge cases may silently fail. Consider using
    the same approach as the crate-level tests (MerkleHashOrchard::combine on empty_leaf) which
    guarantees valid elements.
  2. ReplaceTreeRootKey validation was relaxed
    grovedb/src/batch/batch_structure.rs:1043-1049 — Previously, ReplaceTreeRootKey returned
    InvalidBatchOperation if submitted directly by users (it's internal-only). Now it returns Ok(())
    alongside CommitmentTreeInsert. This means users can now submit raw ReplaceTreeRootKey ops in a
    batch without error, which breaks the original safety invariant. These should be separate arms:
    GroveOp::CommitmentTreeInsert { .. } => Ok(()), // will be preprocessed
    GroveOp::ReplaceTreeRootKey { .. } => Err(Error::InvalidBatchOperation(
        "replace tree root key is an internal operation only",
    )),
  3. Average-case cost uses wrong tree type parameter
    grovedb/src/batch/estimated_costs/average_case_costs.rs:1069 — Uses in_tree_type while the
    worst-case equivalent at worst_case_costs.rs:1159 uses in_parent_tree_type. One of these is wrong
    — verify which parameter correctly represents the parent tree context.
  4. sinsemilla_root added to ReplaceTreeRootKey and InsertTreeWithRootHash leaks abstraction
    Adding sinsemilla_root: Option<[u8; 32]> to these generic GroveOp variants means every
    non-commitment-tree operation now carries a None field. There are ~20 places where
    sinsemilla_root: None is written. This is a code smell; consider either a dedicated
    ReplaceCommitmentTreeRootKey variant or storing the sinsemilla root in the AggregateData enum
    instead.

Moderate

  1. .. pattern in occupied entry match may hide future fields
    grovedb/src/batch/mod.rs:1926 — The comment says .. preserves sinsemilla_root, but this pattern
    will silently ignore any future fields added to ReplaceTreeRootKey. This is fragile — explicitly
    listing all fields would be safer.
  2. Frontier cost estimation assumes added_bytes: 0
    Both average and worst-case cost estimations assume added_bytes: 0 (only replaced_bytes). For the
    first insert into a new commitment tree, the frontier is being added to aux storage, not
    replacing existing data. This underestimates the initial insert cost.
  3. commitment_tree_insert creates its own StorageBatch per call
    grovedb/src/operations/commitment_tree.rs:2258 — Each non-batch insert creates a new StorageBatch
    and commits it. If called in a loop (as the tests do), this means N separate batch commits
    instead of one. The batch API handles this correctly, but document that the direct API should not
    be used for bulk inserts.
  4. Missing ProvableCountTree and ProvableCountSumTree in has_subtree_at_path
    grovedb/src/operations/get/mod.rs:2675-2677 — This match was missing ProvableCountTree and
    ProvableCountSumTree before this PR. The PR adds them alongside CommitmentTree, which is a bug
    fix but should be called out explicitly as it changes behavior for existing tree types.

Minor / Style

  1. _grove_version prefix in commitment_tree_anchor
    grovedb/src/operations/commitment_tree.rs:2353 — The parameter is named _grove_version (with
    underscore prefix) but is passed to validate_is_commitment_tree. This should just be
    grove_version since it's used.
  2. #[allow(clippy::drop_non_drop)] appears twice in commitment_tree.rs. Consider if the drop
    ordering can be restructured to avoid needing this.
  3. first_seen HashMap uses bool as value type (commitment_tree.rs:2576). A HashSet would be more
    idiomatic.
  4. Many sinsemilla_hash_calls: 0 additions in test files — Every existing cost test had to add
    this field. While unavoidable, this is a consequence of issue 4 above — a cleaner cost struct
    design could have avoided this churn.
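Item 3's suggested swap is mechanical; a minimal sketch (the real key type in commitment_tree.rs may differ):

```rust
use std::collections::HashSet;

fn main() {
    // Before (conceptually): HashMap<Vec<u8>, bool> with insert(key, true).
    // After: a HashSet, whose insert() returns false on a repeat, which
    // doubles as the "first seen?" check.
    let mut first_seen: HashSet<Vec<u8>> = HashSet::new();
    assert!(first_seen.insert(b"path-a".to_vec())); // first time
    assert!(!first_seen.insert(b"path-a".to_vec())); // already seen
}
```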

Security Considerations

  • The frontier serialization correctly validates field elements and handles truncated/invalid
    input
  • The verify_grovedb extension properly detects frontier corruption
  • Multiple CommitmentTreeInsert ops on the same path/key are correctly allowed in batch
    consistency checks (append-only semantics)
  • CommitmentTreeInsert ops cannot be pointed to by references (correctly blocked)

Summary

The PR is well-structured and thoroughly tested for a feature of this scope. The main concerns
are:

  1. Fix the ReplaceTreeRootKey validation regression (issue 2) — this is a real safety hole
  2. Verify the average vs worst case cost parameter mismatch (issue 3)
  3. Consider refactoring sinsemilla_root out of the generic GroveOp variants to reduce abstraction
    leakage (issue 4)

The rest are minor improvements that could be addressed in follow-up PRs.

/// Format: `flag(1) + hash(32) [+ value_len(4 BE) + value_bytes]`
/// - flag 0x00 = internal node (no value)
/// - flag 0x01 = leaf node (has value)
pub fn serialize(&self) -> Result<Vec<u8>, MmrError> {
Contributor

How can we change the serialization format in the future? Don't we need versioning?

}
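On the versioning question above: a common answer is a leading format-version byte readers can dispatch on. A sketch of the described node layout with such a byte prepended (hypothetical code, not this crate's actual serializer):

```rust
// Versioned envelope for the node format described above:
// version(1) + flag(1) + hash(32) [+ value_len(4 BE) + value_bytes].
const FORMAT_VERSION: u8 = 1;

fn serialize_node(hash: &[u8; 32], value: Option<&[u8]>) -> Vec<u8> {
    let mut out = Vec::with_capacity(2 + 32 + value.map_or(0, |v| 4 + v.len()));
    out.push(FORMAT_VERSION); // future formats bump this and branch on read
    match value {
        None => {
            out.push(0x00); // internal node, no value
            out.extend_from_slice(hash);
        }
        Some(v) => {
            out.push(0x01); // leaf node, value follows
            out.extend_from_slice(hash);
            out.extend_from_slice(&(v.len() as u32).to_be_bytes());
            out.extend_from_slice(v);
        }
    }
    out
}

fn main() {
    let internal = serialize_node(&[0xab; 32], None);
    assert_eq!(internal.len(), 1 + 1 + 32);
    assert_eq!(internal[0], FORMAT_VERSION);
    assert_eq!(internal[1], 0x00);

    let leaf = serialize_node(&[0xcd; 32], Some(b"note"));
    assert_eq!(leaf.len(), 1 + 1 + 32 + 4 + 4);
    assert_eq!(&leaf[34..38], &4u32.to_be_bytes());
}
```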

/// Generate a Merkle proof for the given positions.
pub fn gen_proof(
Contributor

Don't we use prove in the existing grovedb codebase for this?

}

/// The internal MMR size (total node count including internal nodes).
pub fn mmr_size(&self) -> u64 {
Contributor

Just size? It's already GroveMmr

Member Author

I think size is a reserved word if I'm not mistaken.


[dependencies]
blake3 = "1"
bincode = { version = "2.0.0-rc.3", features = ["derive"] }
Contributor

Pin this to =2.0.1; 2.0.0-rc.3 is a pre-release.

buffer_hash: [u8; 32],
start: u64,
end: u64,
get_aux: F,
Contributor

The aux store is for non-consensus data, isn't it?


for chunk_idx in first_chunk..=last_chunk {
let key = crate::chunk_key(chunk_idx);
let blob = get_aux(&key)?.ok_or_else(|| {
Contributor

multi get?
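The "multi get?" suggestion amounts to collecting the chunk keys up front and issuing one batched lookup instead of N point reads inside the loop. A toy sketch with a stand-in chunk_key (the real crate::chunk_key and storage multi-get API differ):

```rust
// Stand-in for crate::chunk_key; real key layout is an assumption here.
fn chunk_key(idx: u64) -> Vec<u8> {
    let mut k = b"chunk/".to_vec();
    k.extend_from_slice(&idx.to_be_bytes());
    k
}

fn main() {
    let (first_chunk, last_chunk) = (2u64, 5u64);

    // Collect every key first...
    let keys: Vec<Vec<u8>> = (first_chunk..=last_chunk).map(chunk_key).collect();
    assert_eq!(keys.len(), 4);
    assert!(keys[0].ends_with(&2u64.to_be_bytes()));

    // ...then the whole slice would go to a single batched read
    // (e.g. RocksDB-style multi_get) instead of one get_aux per iteration.
}
```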

let mut buffer_entries = Vec::with_capacity(buffer_count as usize);
for i in 0..buffer_count {
let key = crate::buffer_key(i);
let value = get_aux(&key)?.ok_or_else(|| {
Contributor

multi get?

// 3. Recompute buffer_hash from all buffer entries
let mut computed_buffer_hash = [0u8; 32];
for entry in &self.buffer_entries {
computed_buffer_hash = chain_buffer_hash(&computed_buffer_hash, entry);
Contributor

Not too expensive?

// Non-Merk data trees store data in the data namespace as non-Element
// entries. We cannot iterate them with Element::iterator, so just
// clear the storage directly.
if merk_to_clear.tree_type.uses_non_merk_data_storage() {
Contributor

So Element is now a Merk-only thing?


let storage_ctx = self
.db
.get_immediate_storage_context(subtree_path, tx.as_ref())
Contributor

Why do we use aux storage? We usually store random key-values there that are outside of consensus. Won't there be collisions? Shall we just use another column family for this?
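Short of a dedicated column family, one way to reduce collision risk inside a shared aux namespace is a reserved key prefix. A sketch with a hypothetical prefix and key names (not the PR's actual layout):

```rust
// Hypothetical reserved prefix for commitment-tree aux keys; any other
// aux user that avoids this prefix cannot collide with it.
const COMMITMENT_TREE_AUX_PREFIX: &[u8] = b"__ct__/";

fn ct_aux_key(suffix: &[u8]) -> Vec<u8> {
    let mut key = Vec::with_capacity(COMMITMENT_TREE_AUX_PREFIX.len() + suffix.len());
    key.extend_from_slice(COMMITMENT_TREE_AUX_PREFIX);
    key.extend_from_slice(suffix);
    key
}

fn main() {
    let frontier_key = ct_aux_key(b"frontier");
    assert!(frontier_key.starts_with(COMMITMENT_TREE_AUX_PREFIX));
    // A plain aux key written by other code stays disjoint:
    assert_ne!(frontier_key, b"frontier".to_vec());
}
```

A separate column family gives a harder guarantee (physically disjoint keyspaces) at the cost of more storage plumbing; the prefix is the lighter-weight middle ground.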

QuantumExplorer and others added 14 commits February 20, 2026 02:43
Adds 3 tests demonstrating that reusing checkpoint IDs with ShardTree
checkpoint() silently returns Ok(false), leaving newly appended notes
unreachable by witness_at_checkpoint_depth. This was the root cause of
the "Tree does not contain a root at address" errors in PMT sync.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>