Status: Ready
Group multiple compatible sketches that share a hash layout so the hash is computed once per insert and fanned out to every sketch in the ensemble. Supports both frequency sketches (CountMin, Count) and cardinality sketches (HyperLogLog variants).
use asap_sketchlib::*;
use asap_sketchlib::HashSketchEnsemble;
// Two CMS + one HLL sharing one hash per insert
let mut ensemble = HashSketchEnsemble::<DefaultXxHasher>::new(vec![
CountMin::<Vector2D<i32>, FastPath>::with_dimensions(3, 4096).into(),
CountMin::<Vector2D<i32>, FastPath>::with_dimensions(3, 4096).into(),
HyperLogLog::<ErtlMLE>::default().into(),
]).unwrap();
// Insert — hashes once, updates all 3 sketches
ensemble.insert(&DataInput::U64(42));
// Query frequency (CMS at index 0)
let freq = ensemble.estimate(0, &DataInput::U64(42)).unwrap();
// Query cardinality (HLL at index 2)
let card = ensemble.cardinality(2).unwrap();
// Pre-computed hash path for hot loops
let hash = ensemble.hash_input(&DataInput::U64(42));
ensemble.insert_with_hash(&hash);
let freq = ensemble.estimate_with_hash(0, &hash).unwrap();
Without an ensemble, sharing a hash across sketches requires manual coordination:
let hash = hash_for_matrix_seeded_generic::<MyHasher>(0, rows, cols, &input);
cms_a.fast_insert_with_hash_value(&hash);
cms_b.fast_insert_with_hash_value(&hash);
hll.insert_with_hash(hash.lower_64());
HashSketchEnsemble wraps this pattern into a single structure that manages
the hash configuration, validates dimensional compatibility, and exposes a
uniform insert/query API.
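The hash-once, fan-out idea can be illustrated with a small standalone sketch. This is not the asap_sketchlib implementation: `ToyCounter`, `FanOutEnsemble`, and `hash_once` are hypothetical names, and the standard library's `DefaultHasher` stands in for the crate's hasher.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy stand-in for a sketch with a prehashed insertion path.
/// (Hypothetical type, not part of asap_sketchlib.)
struct ToyCounter {
    buckets: Vec<u64>,
}

impl ToyCounter {
    fn new(width: usize) -> Self {
        Self { buckets: vec![0; width] }
    }

    /// Map the precomputed hash to a bucket and increment it.
    fn insert_with_hash(&mut self, hash: u64) {
        let idx = (hash % self.buckets.len() as u64) as usize;
        self.buckets[idx] += 1;
    }

    /// Read back the bucket that the hash maps to.
    fn estimate_with_hash(&self, hash: u64) -> u64 {
        self.buckets[(hash % self.buckets.len() as u64) as usize]
    }
}

/// Hash a key exactly once with the std hasher (an assumption for this sketch).
fn hash_once<T: Hash>(key: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

/// Toy ensemble: owns the sketches and fans one hash out to all of them.
struct FanOutEnsemble {
    sketches: Vec<ToyCounter>,
}

impl FanOutEnsemble {
    fn insert<T: Hash>(&mut self, key: &T) {
        let hash = hash_once(key); // computed once per insert
        for sketch in &mut self.sketches {
            sketch.insert_with_hash(hash); // no rehashing per sketch
        }
    }
}
```

The caller never touches the hash unless it wants to; the container decides how the hash is produced and which sketches receive it.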
- HashSketchEnsemble<H = DefaultXxHasher>: the ensemble container, generic over the hasher.
- EnsembleSketch: an enum wrapping the sketch variants that can live inside an ensemble (CountMinFast, CountFast, HllErtl, HllRegular, HllHip).
Only sketches with a prehashed insertion path are accepted:
- CountMin<_, FastPath, _>: Count-Min Sketch (fast path)
- Count<_, FastPath, _>: Count Sketch (fast path)
- HyperLogLog<ErtlMLE> / HyperLogLog<Classic> / HyperLogLogHIP
All matrix-backed sketches (CMS / Count) in one ensemble must share the same hash layout (rows × cols dimensions). HLL sketches can coexist with them because they only consume the lower 64 bits of the shared hash.
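As a rough illustration of the compatibility rule, the check could look like the following. The `Layout` enum, the `validate` function, the error message, and the use of a `u128` hash value are all assumptions made for this sketch; the crate's actual types and checks may differ.

```rust
/// Hypothetical description of what each sketch needs from the shared hash.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Layout {
    /// Matrix-backed sketch (CMS / Count style) with a rows x cols table.
    Matrix { rows: usize, cols: usize },
    /// Register-based sketch (HLL style): consumes only the lower 64 bits.
    Lower64,
}

/// Accept a set of layouts only if every matrix layout has the same
/// dimensions. Returns the shared dimensions, or None if there is no
/// matrix-backed sketch at all.
fn validate(layouts: &[Layout]) -> Result<Option<(usize, usize)>, &'static str> {
    let mut shared: Option<(usize, usize)> = None;
    for layout in layouts {
        if let Layout::Matrix { rows, cols } = *layout {
            match shared {
                None => shared = Some((rows, cols)),
                Some(dims) if dims == (rows, cols) => {}
                Some(_) => return Err("matrix sketches must share rows x cols"),
            }
        }
    }
    Ok(shared)
}

/// Lower 64 bits of a wider hash, as an HLL-style consumer would use them.
/// (A 128-bit hash value is an assumption of this sketch.)
fn lower_64(hash: u128) -> u64 {
    hash as u64
}
```

HLL-style entries never constrain the dimensions, which is why they can be mixed freely with matrix-backed sketches.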
fn new(sketches: Vec<EnsembleSketch>) -> Result<Self, &'static str>
fn push(&mut self, sketch: EnsembleSketch) -> Result<(), &'static str>
Sketches are converted into EnsembleSketch via From impls, so you can use .into():
use asap_sketchlib::{
CountMin, Count, FastPath, Vector2D, DefaultXxHasher,
HyperLogLog, ErtlMLE,
};
use asap_sketchlib::sketch_framework::hashlayer::{EnsembleSketch, HashSketchEnsemble};
let ensemble = HashSketchEnsemble::<DefaultXxHasher>::new(vec![
CountMin::<Vector2D<i32>, FastPath>::with_dimensions(3, 4096).into(),
Count::<Vector2D<i32>, FastPath>::with_dimensions(3, 4096).into(),
HyperLogLog::<ErtlMLE>::default().into(),
]).expect("compatible sketches");
fn hash_input(&self, input: &DataInput) -> H::HashType
Computes the shared hash for an input using the ensemble's hash configuration and hasher H.
fn insert(&mut self, val: &DataInput)
fn insert_with_hash(&mut self, hash: &H::HashType)
fn insert_at(&mut self, indices: &[usize], val: &DataInput)
fn insert_at_with_hash(&mut self, indices: &[usize], hash: &H::HashType)
fn bulk_insert(&mut self, values: &[DataInput])
fn bulk_insert_with_hashes(&mut self, hashes: &[H::HashType])
fn bulk_insert_at(&mut self, indices: &[usize], values: &[DataInput])
fn bulk_insert_at_with_hashes(&mut self, indices: &[usize], hashes: &[H::HashType])
insert hashes the input once and fans the hash out to all sketches. The _with_hash variants skip hashing and take a hash precomputed with hash_input. The _at variants target specific sketch indices. The bulk_ variants accept slices for batch processing.
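How the variants relate to one another can be sketched with a toy container in which each "sketch" only counts the hashes it receives. `CountingEnsemble` is a hypothetical type, and the handling of out-of-range indices (silently skipped here) is an assumption, not documented behavior of the crate.

```rust
/// Toy ensemble: slot i counts how many hashes were delivered to sketch i.
/// Hypothetical type for illustration; not the asap_sketchlib implementation.
struct CountingEnsemble {
    received: Vec<usize>,
}

impl CountingEnsemble {
    fn new(n: usize) -> Self {
        Self { received: vec![0; n] }
    }

    /// Fan one precomputed hash out to every sketch.
    fn insert_with_hash(&mut self, _hash: u64) {
        for r in &mut self.received {
            *r += 1;
        }
    }

    /// Deliver one hash only to the listed sketch indices.
    /// Assumption: out-of-range indices are silently skipped.
    fn insert_at_with_hash(&mut self, indices: &[usize], _hash: u64) {
        for &i in indices {
            if let Some(r) = self.received.get_mut(i) {
                *r += 1;
            }
        }
    }

    /// Bulk form: one fan-out per hash in the slice.
    fn bulk_insert_with_hashes(&mut self, hashes: &[u64]) {
        for &h in hashes {
            self.insert_with_hash(h);
        }
    }

    /// Bulk form of the index-targeted insert.
    fn bulk_insert_at_with_hashes(&mut self, indices: &[usize], hashes: &[u64]) {
        for &h in hashes {
            self.insert_at_with_hash(indices, h);
        }
    }
}
```

In this toy, the bulk methods are plain loops over the single-item methods; a real implementation may batch work differently.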
Frequency and cardinality queries are split because CMS/Count and HLL answer fundamentally different questions.
fn estimate(&self, index: usize, val: &DataInput) -> Result<f64, &'static str>
fn estimate_with_hash(&self, index: usize, hash: &H::HashType) -> Result<f64, &'static str>
Returns an error if the index is out of bounds or the sketch is not a frequency sketch.
fn cardinality(&self, index: usize) -> Result<f64, &'static str>
Returns an error if the index is out of bounds or the sketch is not an HLL variant.
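The split between the two query kinds can be modeled with a toy enum, shown below. Exact maps and sets stand in for the probabilistic sketches, and the type names and error strings are hypothetical; only the shape of the return type (`Result<f64, &'static str>`) follows the signatures above.

```rust
use std::collections::{HashMap, HashSet};

/// Toy stand-ins: exact structures instead of probabilistic sketches.
enum ToySketch {
    Frequency(HashMap<u64, u64>),
    Cardinality(HashSet<u64>),
}

struct QueryEnsemble {
    sketches: Vec<ToySketch>,
}

impl QueryEnsemble {
    /// Update every member with the same key.
    fn insert(&mut self, key: u64) {
        for sketch in &mut self.sketches {
            match sketch {
                ToySketch::Frequency(counts) => *counts.entry(key).or_insert(0) += 1,
                ToySketch::Cardinality(seen) => {
                    seen.insert(key);
                }
            }
        }
    }

    /// Frequency query: valid only for frequency members.
    fn estimate(&self, index: usize, key: u64) -> Result<f64, &'static str> {
        match self.sketches.get(index) {
            None => Err("index out of bounds"),
            Some(ToySketch::Frequency(counts)) => {
                Ok(counts.get(&key).copied().unwrap_or(0) as f64)
            }
            Some(ToySketch::Cardinality(_)) => Err("not a frequency sketch"),
        }
    }

    /// Cardinality query: valid only for cardinality members.
    fn cardinality(&self, index: usize) -> Result<f64, &'static str> {
        match self.sketches.get(index) {
            None => Err("index out of bounds"),
            Some(ToySketch::Cardinality(seen)) => Ok(seen.len() as f64),
            Some(ToySketch::Frequency(_)) => Err("not a cardinality sketch"),
        }
    }
}
```

Returning an error for a mismatched query keeps the mistake visible at the call site instead of producing a meaningless number.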
fn len(&self) -> usize
fn is_empty(&self) -> bool
fn get(&self, index: usize) -> Option<&EnsembleSketch>
fn get_mut(&mut self, index: usize) -> Option<&mut EnsembleSketch>
No ensemble-level merge API.
No dedicated serialization API.
- All matrix-backed sketches in one ensemble must have identical dimensions.
- Calling estimate on an HLL sketch returns an error; use cardinality instead (and vice versa).
Core optimization layer; actively used and tested.