Sparsity

A C++ vector similarity search library with optional CUDA acceleration. It supports dense and sparse vectors natively, with optimized code paths for each.

Features

  • Dual vector support: Optimized for both dense and sparse vectors
  • Distance metrics: L2 (Euclidean), Cosine, and Tanimoto
  • GPU acceleration: CUDA kernels for fast distance computation
  • CPU fallback: CPU implementations are used when no GPU is available
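
As a rough illustration of the three metrics listed above, here is a minimal CPU sketch on dense float vectors. The function names are hypothetical and are not taken from the library's headers. The Tanimoto form shown is the common real-valued generalization, and the zero-vector handling is an assumption; the library's own definitions may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper names for illustration only; not the library's API.

// Dot product of two equal-length dense vectors.
inline float dot(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) sum += a[i] * b[i];
    return sum;
}

// L2 (Euclidean) distance: sqrt(sum_i (a_i - b_i)^2).
inline float l2_distance(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return std::sqrt(sum);
}

// Cosine distance: 1 - (a.b) / (|a| |b|).
// Assumption: returns 1 when either vector has zero norm.
inline float cosine_distance(const std::vector<float>& a, const std::vector<float>& b) {
    const float denom = std::sqrt(dot(a, a)) * std::sqrt(dot(b, b));
    if (denom == 0.0f) return 1.0f;
    return 1.0f - dot(a, b) / denom;
}

// Tanimoto distance: 1 - (a.b) / (|a|^2 + |b|^2 - a.b).
// Assumption: two all-zero vectors are treated as identical (distance 0).
inline float tanimoto_distance(const std::vector<float>& a, const std::vector<float>& b) {
    const float ab = dot(a, b);
    const float denom = dot(a, a) + dot(b, b) - ab;
    if (denom == 0.0f) return 0.0f;
    return 1.0f - ab / denom;
}
```

With these definitions, orthogonal unit vectors such as (1, 0) and (0, 1) are at cosine and Tanimoto distance 1, and any non-zero vector is at distance 0 from itself under all three metrics.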

Quick Start

Build

mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release   # Add -DWITH_CUDA=ON for GPU support
make -j$(nproc)

Test

cd build && ctest --output-on-failure   # Run from the repository root

Project Structure

sparsity/
  ├── include/sparsity/        # Public C++ headers
  │   ├── index.h              # Main index interface
  │   ├── dispatch.h           # Index dispatch logic
  │   ├── dense.h              # Dense vector operations
  │   ├── sparse.h             # Sparse vector operations (CSR)
  │   ├── ivf.h                # Inverted File index
  │   ├── metrics.h            # Distance metric declarations
  │   ├── cuda_*.h             # GPU acceleration headers
  │   ├── types.h              # Core data types
  │   └── ...                  # Other specialized headers
  │
  ├── src/
  │   ├── core/                # CPU implementations
  │   │   ├── index.cpp        # Index construction & search
  │   │   ├── ivf.cpp          # IVF index implementation
  │   │   ├── sparse.cpp       # Sparse vector handling
  │   │   ├── metrics/         # CPU metric implementations
  │   │   │   ├── l2.cpp
  │   │   │   ├── cosine.cpp
  │   │   │   ├── tanimoto.cpp
  │   │   │   └── ...
  │   │   └── ...
  │   │
  │   └── cuda/                # GPU implementations
  │       ├── metrics/         # CUDA kernel implementations
  │       │   ├── l2.cu
  │       │   ├── cosine.cu
  │       │   ├── tanimoto.cu
  │       │   └── ...
  │       ├── sparse_dot.cu    # Sparse vector dot product
  │       ├── topk.cu          # Top-K selection kernels
  │       └── ...
  │
  ├── benchmarks/              # Performance evaluation
  │   ├── bench_dense.cpp
  │   ├── bench_sparse_vs_dense.cpp
  │   ├── bench_ivf_vs_brute.cpp
  │   ├── bench_gpu.cpp
  │   └── ...
  │
  ├── tests/                   # Unit and correctness tests
  │
  ├── CMakeLists.txt           # Build configuration
  └── README.md                # This file

Conventions

  • Sparse vectors: CSR format with indices, values, and dimensions
  • Dense vectors: float arrays
  • Sparsity definition: Fraction of zero elements (sparsity = 1 - nnz/dim)
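
The conventions above can be sketched for a single sparse vector as follows. `SparseVec`, `sparsity`, `to_dense`, and `sparse_dot` are hypothetical names chosen for illustration; the library's actual types are declared in its own headers and may be laid out differently. The sketch assumes indices are sorted in ascending order and that no explicit zeros are stored, so nnz equals the number of stored entries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical single-vector layout: positions and values of the nonzeros,
// plus the full dimensionality. Not the library's real type.
struct SparseVec {
    std::vector<std::uint32_t> indices;  // ascending positions of nonzeros
    std::vector<float> values;           // values[i] belongs to indices[i]
    std::size_t dim;                     // total number of dimensions
};

// sparsity = 1 - nnz / dim, where nnz is the number of stored entries.
inline double sparsity(const SparseVec& v) {
    assert(v.dim > 0 && v.indices.size() == v.values.size());
    return 1.0 - static_cast<double>(v.indices.size()) / static_cast<double>(v.dim);
}

// Expand to the dense convention: a float array of length dim.
inline std::vector<float> to_dense(const SparseVec& v) {
    std::vector<float> out(v.dim, 0.0f);
    for (std::size_t i = 0; i < v.indices.size(); ++i) out[v.indices[i]] = v.values[i];
    return out;
}

// Dot product of two sparse vectors via a two-pointer merge over sorted indices.
inline float sparse_dot(const SparseVec& a, const SparseVec& b) {
    assert(a.dim == b.dim);
    float sum = 0.0f;
    std::size_t i = 0, j = 0;
    while (i < a.indices.size() && j < b.indices.size()) {
        if (a.indices[i] == b.indices[j]) {
            sum += a.values[i] * b.values[j];
            ++i;
            ++j;
        } else if (a.indices[i] < b.indices[j]) {
            ++i;
        } else {
            ++j;
        }
    }
    return sum;
}
```

For example, a vector with dim = 8 and two stored entries has sparsity 1 - 2/8 = 0.75. The merge-based dot product touches only stored entries, which is why sparse storage tends to pay off as sparsity approaches 1.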