Skip to content

afourniernv/CuVs-Bencher

cuvs-bencher

HTTP-based vector benchmarking for deployed search services (Elasticsearch GPU first). Benchmark recall, latency, throughput over REST—no in-process algorithm runs.

Install

uv pip install "cuvs-bencher[elastic]"

From workspace: uv sync --extra elastic

Download dataset

Pulls OpenAI 5M embeddings from object storage and converts to big-ann-bench format (base.*.fbin, queries.fbin, groundtruth*.ibin):

# From repo: install download deps, then run
uv sync --extra download
uv run python scripts/download_openai_5m.py ./data

Output: ./data/ with base.5M.fbin, queries.fbin, groundtruth.5M.neighbors.ibin (~28 GiB).

Run benchmark

cuvs-bench --data-dir ./data --host localhost --port 9200 --num-docs 100000 --json

Python API

Returns a BenchmarkResult with typed attributes, .summary(), and .to_dict().

import asyncio
from pathlib import Path
from cuvs_bencher_core import BenchmarkResult, run_vector_benchmark

result: BenchmarkResult = asyncio.run(run_vector_benchmark(
    backend="elastic",
    data_dir=Path("./data"),
    num_docs=100_000,
    num_search_queries=1000,
    host="localhost",
    port=9200,
))

print(result.summary())           # Indexed 100,000 (1536d) @ 8,123 docs/s; p50=2.10ms; ~950 qps; recall@10=0.9412
result.recall_at_k                 # 0.9412
result.search_latency_p50_ms       # 2.1
result.to_dict()                   # raw dict for JSON

Structure

cuvs-bencher/
├── packages/cuvs_bencher_core/   # telemetry, datasets, metrics, backend protocol
└── packages/cuvs_bencher_elastic/# Elasticsearch backend + cuvs-bench CLI

About

Benchmark vector search APIs—pluggable backends over HTTP/gRPC. Recall@k, latency, qps.

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages