OpenQuanta LLM Context

Universal vector compression library. Rust core, Python bindings.

API: import openquanta as oq

  • oq.compress(vectors, dim, bits=3, algo="turbo_mse") → CompressedData
  • oq.decompress(compressed) → list[float]
  • oq.similarity(query, candidates) → list[float]
  • oq.bench(vectors, dim, bits=3) → dict (passed, recall_at_10, kurtosis, ...)
  • oq.save_oq(compressed, path) / oq.load_oq(path) / oq.inspect_oq(path)
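The calls above can be combined into a short round-trip check. This is a minimal sketch, not documented library usage. It assumes `vectors` is a flat list of floats of length `n * dim` (suggested by `decompress` returning `list[float]`, but not confirmed here) and that the `bench` result is a plain dict with the `passed` and `recall_at_10` keys listed above. `roundtrip_mse` and `bench_summary` are hypothetical helper names. The `oq` module is passed in as an argument so the helpers can be exercised with a stand-in.

```python
def roundtrip_mse(oq, vectors, dim, bits=3):
    """Compress and decompress `vectors`, returning the mean squared error.

    `oq` is the imported openquanta module (or any stand-in exposing the same
    compress/decompress calls). Assumption: `vectors` is a flat list of floats
    whose length is a multiple of `dim`.
    """
    if dim <= 0 or not vectors or len(vectors) % dim != 0:
        raise ValueError("vectors must be a non-empty flat list of n * dim floats")
    compressed = oq.compress(vectors, dim, bits=bits, algo="turbo_mse")
    restored = oq.decompress(compressed)
    if len(restored) != len(vectors):
        raise ValueError("decompressed length does not match the input length")
    # Mean of squared per-coordinate differences.
    return sum((a - b) ** 2 for a, b in zip(vectors, restored)) / len(vectors)


def bench_summary(report):
    """Format the `passed` and `recall_at_10` fields of an oq.bench() result."""
    status = "PASS" if report["passed"] else "FAIL"
    return f"{status} recall@10={report['recall_at_10']:.3f}"
```

With the library installed, usage would look like `import openquanta as oq`, followed by `roundtrip_mse(oq, vectors, dim)` and `bench_summary(oq.bench(vectors, dim, bits=3))`.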

Critical: Two algorithms

  • turbo_mse = DEFAULT. Safe for everything. Use for KV cache, attention, RAG.
  • turbo_prod = OPT-IN. Adds QJL for inner-product estimation in vector search. NEVER for KV cache.
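The rule above can be enforced with a small guard before calling `oq.compress`. This is a sketch only: the use-case labels (`"kv_cache"`, `"vector_search"`) and the helper name `pick_algo` are hypothetical and not part of the library. Only the two algorithm names come from the text above.

```python
# Hypothetical use-case labels; not part of the openquanta API.
PROD_FORBIDDEN_USE_CASES = {"kv_cache"}


def pick_algo(use_case, opt_in_prod=False):
    """Return the `algo` string to pass to oq.compress for a use case.

    turbo_mse is returned unless the caller explicitly opts in to turbo_prod,
    and turbo_prod is refused for KV cache workloads.
    """
    if not opt_in_prod:
        return "turbo_mse"
    if use_case in PROD_FORBIDDEN_USE_CASES:
        raise ValueError(f"turbo_prod must not be used for {use_case}")
    return "turbo_prod"
```

Making `turbo_prod` require an explicit flag mirrors its opt-in status: a caller who passes nothing always gets the default.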

Bits: 2.5, 3 (default), 3.5, 4
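
To make the bit settings concrete, the sketch below validates a `bits` value against the list above and estimates the compressed payload size. It assumes `bits` means average bits per vector coordinate and ignores any header or per-vector metadata the `.oq` format may store. Neither assumption is confirmed by the text above, and the helper names are hypothetical.

```python
import math

# Bit widths listed above; 3 is the default.
SUPPORTED_BITS = (2.5, 3, 3.5, 4)


def check_bits(bits=3):
    """Return `bits` unchanged if it is one of the listed values."""
    if bits not in SUPPORTED_BITS:
        raise ValueError(f"bits must be one of {SUPPORTED_BITS}, got {bits!r}")
    return bits


def estimated_payload_bytes(n_vectors, dim, bits=3):
    """Rough payload size, assuming `bits` per coordinate and no metadata."""
    return math.ceil(n_vectors * dim * check_bits(bits) / 8)
```

Under these assumptions, 1,000 vectors of dimension 128 at 3 bits come to 48,000 bytes, compared with 512,000 bytes stored as float32.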