You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
**Vector-first compression for embeddings, retrieval, and KV-cache workloads.**
8
+
**Vector compression with TurboQuant codecs for embeddings, retrieval, and KV-cache. 10x compression, pure NumPy, no GPU required.**
9
9
10
10
Semafold is a vector-first compression toolkit for AI workloads that compresses embeddings, retrieval representations, and cache-shaped KV tensors with explicit byte accounting, typed encode/decode contracts, and validation evidence. It is designed for teams building AI infrastructure that need measurable storage reduction without losing visibility into distortion, artifact size, or integration boundaries.
11
11
@@ -20,6 +20,17 @@ It gives you:
20
20
- deterministic synthetic validation and benchmarks
-`ratio` is baseline bytes divided by measured artifact bytes
227
-
-`smaller` is percentage reduction in total stored bytes
228
-
- artifact size includes payload, sidecars, and metadata
229
-
- larger tensors amortize sidecar overhead better than very small blocks
230
-
231
-
One honest caveat: very small K/V blocks can still beat dense `float32`, while being only roughly equal to or slightly worse than dense `fp16`. The benchmark report shows that tradeoff explicitly.
0 commit comments