A natively hybrid C++ vector similarity search library with CUDA acceleration. Handles both dense and sparse vectors efficiently.
- Dual vector support: Optimized for both dense and sparse vectors
- Distance metrics: L2 (Euclidean), Cosine, and Tanimoto
- GPU acceleration: CUDA kernels for fast distance computation
- CPU fallback: CPU implementations remain available when no GPU is present
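
As a rough illustration of the three metrics listed above, here is a minimal CPU-only sketch for dense `float` vectors. The function names (`dense_dot`, `l2_distance`, `cosine_similarity`, `tanimoto_similarity`) are hypothetical and are not taken from the library's headers. The sketch also assumes the real-valued form of Tanimoto, `a·b / (|a|² + |b|² - a·b)`, and returns 0 for all-zero inputs; the library may use different conventions (for example squared L2, or distances rather than similarities).

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration only; not the library's API.
// Every function assumes a.size() == b.size().

// Dot product of two dense vectors.
inline float dense_dot(const std::vector<float>& a, const std::vector<float>& b) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) sum += a[i] * b[i];
    return sum;
}

// L2 (Euclidean) distance: sqrt(sum_i (a_i - b_i)^2).
inline float l2_distance(const std::vector<float>& a, const std::vector<float>& b) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return std::sqrt(sum);
}

// Cosine similarity: (a . b) / (|a| * |b|).
// Assumption: returns 0 when either vector is all zeros.
inline float cosine_similarity(const std::vector<float>& a, const std::vector<float>& b) {
    const float na = dense_dot(a, a);
    const float nb = dense_dot(b, b);
    if (na == 0.0f || nb == 0.0f) return 0.0f;
    return dense_dot(a, b) / std::sqrt(na * nb);
}

// Tanimoto similarity (real-valued form): (a . b) / (|a|^2 + |b|^2 - a . b).
// Assumption: returns 0 when the denominator is zero (both vectors all zeros).
inline float tanimoto_similarity(const std::vector<float>& a, const std::vector<float>& b) {
    const float ab = dense_dot(a, b);
    const float denom = dense_dot(a, a) + dense_dot(b, b) - ab;
    return denom == 0.0f ? 0.0f : ab / denom;
}
```

A distance can be derived from either similarity as `1 - similarity` if a smaller-is-closer ordering is needed.
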
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release # Add -DWITH_CUDA=ON for GPU support
make -j$(nproc)

cd build && ctest --output-on-failure

sparsity/
├── include/sparsity/              # Public C++ headers
│   ├── index.h                    # Main index interface
│   ├── dispatch.h                 # Index dispatch logic
│   ├── dense.h                    # Dense vector operations
│   ├── sparse.h                   # Sparse vector operations (CSR)
│   ├── ivf.h                      # Inverted File index
│   ├── metrics.h                  # Distance metric declarations
│   ├── cuda_*.h                   # GPU acceleration headers
│   ├── types.h                    # Core data types
│   └── ...                        # Other specialized headers
│
├── src/
│   ├── core/                      # CPU implementations
│   │   ├── index.cpp              # Index construction & search
│   │   ├── ivf.cpp                # IVF index implementation
│   │   ├── sparse.cpp             # Sparse vector handling
│   │   ├── metrics/               # CPU metric implementations
│   │   │   ├── l2.cpp
│   │   │   ├── cosine.cpp
│   │   │   ├── tanimoto.cpp
│   │   │   └── ...
│   │   └── ...
│   │
│   └── cuda/                      # GPU implementations
│       ├── metrics/               # CUDA kernel implementations
│       │   ├── l2.cu
│       │   ├── cosine.cu
│       │   ├── tanimoto.cu
│       │   └── ...
│       ├── sparse_dot.cu          # Sparse vector dot product
│       ├── topk.cu                # Top-K selection kernels
│       └── ...
│
├── benchmarks/                    # Performance evaluation
│   ├── bench_dense.cpp
│   ├── bench_sparse_vs_dense.cpp
│   ├── bench_ivf_vs_brute.cpp
│   ├── bench_gpu.cpp
│   └── ...
│
├── tests/                         # Unit and correctness tests
│
├── CMakeLists.txt                 # Build configuration
└── README.md                      # This file
- Sparse vectors: CSR format with indices, values, and dimensions
- Dense vectors: float arrays
- Sparsity definition: Fraction of zero elements (sparsity = 1 - nnz/dim)