Releases: ShipItAndPray/turboquant

TurboQuant v1.0.0 — 6x Compression for Vectors, Embeddings, and LLMs

25 Mar 17:06

What's New

Vector Compression Engine (PolarQuant + QJL)

Based on Google's TurboQuant research — 6x compression with near-zero accuracy loss, no training required.
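To give a flavor of the sign-sketch idea behind QJL-style quantization (project with a random Gaussian matrix, keep only the sign bits, and recover angles from bit disagreements), here is a minimal illustrative sketch. This is not TurboQuant's actual implementation; the function name and dimensions are made up for the example.

```python
import numpy as np

def sign_sketch(x, proj):
    """Keep only the sign of each random projection: 1 bit per row of proj.
    Illustrative 1-bit Johnson-Lindenstrauss sketch, not the library's API."""
    return proj @ x >= 0  # boolean array, one bit per projection

rng = np.random.default_rng(0)
d, m = 768, 1024                      # embedding dim, number of sign bits
proj = rng.standard_normal((m, d))    # shared random projection matrix

a = rng.standard_normal(d)
b = a + 0.1 * rng.standard_normal(d)  # a near-duplicate of a

sa, sb = sign_sketch(a, proj), sign_sketch(b, proj)

# The fraction of disagreeing sign bits is an unbiased estimate of
# angle(a, b) / pi, so nearby vectors stay nearby after compression.
est_angle = np.pi * np.mean(sa != sb)
true_angle = np.arccos(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

The point of the sketch: 1,024 bits stand in for 768 float32 values (3,072 bytes), yet the estimated angle tracks the true angle closely, which is why this family of methods needs no training.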

24 Drop-in Adapters

Plug-and-play compression for every major storage system:

  • Caches: Redis, Memcached, Ehcache, Hazelcast
  • Databases: PostgreSQL (pgvector), MySQL, SQLite, MongoDB (Atlas), DynamoDB, Cassandra
  • Vector DBs: Pinecone, Qdrant, ChromaDB, Milvus, Weaviate, FAISS
  • Search: Elasticsearch, OpenSearch
  • Storage: S3, GCS, Azure Blob
  • Embedded: LMDB, RocksDB
  • Streaming: Kafka
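To illustrate what "drop-in adapter" means in practice, here is a minimal sketch of the wrapper pattern: quantize on write, dequantize on read, with the backing store unchanged. The class name and int8 scheme are assumptions for the example (a dict stands in for Redis, pgvector, etc.); they are not the library's actual API. Plain int8 gives ~4x over float32 here, whereas the release's PolarQuant/QJL engine claims 6x.

```python
import numpy as np

class CompressedVectorCache:
    """Hypothetical drop-in wrapper: int8-quantize vectors on set(),
    dequantize on get(). A real adapter would wrap a Redis client,
    a pgvector table, and so on, behind the same two methods."""

    def __init__(self, store=None):
        # Backing store: any mapping-like object; a dict for this sketch.
        self.store = store if store is not None else {}

    def set(self, key, vec):
        vec = np.asarray(vec, dtype=np.float32)
        scale = float(np.abs(vec).max()) / 127.0
        if scale == 0.0:
            scale = 1.0  # avoid division by zero for the all-zero vector
        # Store one float scale plus one int8 per dimension (~4x smaller).
        self.store[key] = (scale, np.round(vec / scale).astype(np.int8))

    def get(self, key):
        scale, q = self.store[key]
        return q.astype(np.float32) * scale
```

The caller keeps using `set`/`get` as before; only the bytes at rest shrink, which is what makes the adapters plug-and-play.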

LLM Quantization CLI

Compress any HuggingFace model to GGUF/GPTQ/AWQ in one command.

GitHub Action

CI/CD pipeline for automated LLM quantization — use in your workflows.
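A workflow using the action might look like the sketch below. The action reference, version tag, and input names are assumptions for illustration only; the release notes do not specify them, so check the action's README for the real interface.

```yaml
# Hypothetical workflow; action path and inputs are assumed, not documented here.
name: quantize-model
on: workflow_dispatch
jobs:
  quantize:
    runs-on: ubuntu-latest
    steps:
      - uses: ShipItAndPray/turboquant@v1.0.0   # assumed action reference
        with:
          model: TinyLlama/TinyLlama-1.1B-Chat-v1.0   # example model
          target: ollama                               # mirrors the --target flag
```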

Features

  • --target ollama|vllm|llamacpp|lmstudio — auto-selects best format
  • --push-to-hub — publish quantized models to HuggingFace
  • --eval — perplexity evaluation after quantization
  • --recommend — hardware-aware format recommendations