Releases: ShipItAndPray/turboquant
TurboQuant v1.0.0 — 6x Compression for Vectors, Embeddings, and LLMs
What's New
Vector Compression Engine (PolarQuant + QJL)
Based on Google's TurboQuant research — 6x compression with near-zero accuracy loss, no training required.
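To make the size/accuracy trade-off concrete, here is a minimal sketch of symmetric int8 scalar quantization of an embedding. This is *not* the PolarQuant/QJL algorithm the engine uses; it is a generic baseline showing how quantization shrinks float32 vectors with small reconstruction error (int8 gives 4x; reaching 6x requires sub-byte codes like those in the TurboQuant paper):

```python
import numpy as np

def quantize_int8(v: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric scalar quantization of a float32 vector to int8."""
    scale = float(np.abs(v).max()) / 127.0
    if scale == 0.0:
        scale = 1.0  # avoid division by zero for an all-zero vector
    q = np.clip(np.round(v / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
v = rng.standard_normal(768).astype(np.float32)  # e.g. one embedding
q, scale = quantize_int8(v)

ratio = v.nbytes / q.nbytes  # 4.0x for int8
err = np.linalg.norm(v - dequantize(q, scale)) / np.linalg.norm(v)
print(f"compression {ratio:.1f}x, relative error {err:.4f}")
```

The relative error for a typical embedding stays well under 1%, which is why quantized vectors remain usable for nearest-neighbor search without retraining.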
24 Drop-in Adapters
Plug-and-play compression for every major storage system:
- Caches: Redis, Memcached, Ehcache, Hazelcast
- Databases: PostgreSQL (pgvector), MySQL, SQLite, MongoDB (Atlas), DynamoDB, Cassandra
- Vector DBs: Pinecone, Qdrant, ChromaDB, Milvus, Weaviate, FAISS
- Search: Elasticsearch, OpenSearch
- Storage: S3, GCS, Azure Blob
- Embedded: LMDB, RocksDB
- Streaming: Kafka
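The adapter pattern behind all of these is the same: quantize on write, dequantize on read, and leave the backend untouched. The class and method names below are hypothetical, not turboquant's actual API; a plain dict stands in for the storage client:

```python
import numpy as np

class QuantizedStore:
    """Illustrative drop-in adapter (hypothetical API, not turboquant's):
    quantizes vectors on put, dequantizes on get; the backend is unchanged."""

    def __init__(self, backend):
        # any mapping-style client works; here a dict stands in for e.g. Redis
        self.backend = backend

    def put(self, key: str, vec: np.ndarray) -> None:
        scale = float(np.abs(vec).max()) / 127.0 or 1.0
        q = np.clip(np.round(vec / scale), -127, 127).astype(np.int8)
        # stored payload is 4x smaller than the original float32 bytes
        self.backend[key] = (q.tobytes(), scale)

    def get(self, key: str) -> np.ndarray:
        raw, scale = self.backend[key]
        return np.frombuffer(raw, dtype=np.int8).astype(np.float32) * scale

store = QuantizedStore(backend={})  # swap in a real client the same way
v = np.random.default_rng(1).standard_normal(384).astype(np.float32)
store.put("doc:1", v)
approx = store.get("doc:1")
```

Callers keep reading and writing vectors by key; only the adapter knows the bytes on the wire are quantized.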
LLM Quantization CLI
Compress any HuggingFace model to GGUF/GPTQ/AWQ in one command.
GitHub Action
Automate LLM quantization in your CI/CD workflows with a reusable action.
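A workflow using the action might look like the sketch below; the action reference and input names are assumptions for illustration, not confirmed by this release:

```yaml
# Hypothetical workflow: the action reference and "with" inputs are
# illustrative only; check the repository for the actual names.
name: quantize
on:
  push:
    tags: ['v*']
jobs:
  quantize:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ShipItAndPray/turboquant@v1   # assumed action reference
        with:
          model: meta-llama/Llama-3-8B      # assumed input name
          target: ollama                     # assumed input name
```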
Features
- `--target ollama|vllm|llamacpp|lmstudio` — auto-selects the best format
- `--push-to-hub` — publish quantized models to HuggingFace
- `--eval` — perplexity evaluation after quantization
- `--recommend` — hardware-aware format recommendations
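Putting the flags together, a one-command run could look like this; the `turboquant` binary name and the positional model argument are assumptions, while the flags are the documented ones above:

```shell
# Hypothetical invocation: binary name and model argument are assumed.
turboquant meta-llama/Llama-3-8B --target ollama --eval --push-to-hub
```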