Commit 72ae6b2
quantcpp 0.9.0: KV compression ON by default in Python bindings
BREAKTHROUGH: kv_compress=1 was never broken in quant.h — the v0.8.1
abort was caused by the libc.free() cross-heap bug (fixed in v0.8.2
via quant_free_string), not by the UNIFORM_4B KV path. We isolated
the wrong variable because kv_compress=0 AND skip-free were changed
simultaneously in the v0.8.1 hotfix.
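
The cross-heap problem described above can be illustrated with a minimal ctypes sketch. It is not the quantcpp binding code: the C runtime's `strdup`/`free` pair stands in for the library's allocator and for `quant_free_string` (whose signature is not shown in this commit), and `take_string` is a hypothetical helper. It assumes a POSIX system, where `ctypes.CDLL(None)` exposes the process's C runtime. The point is only that a native string must be released by the deallocator belonging to the library that allocated it.

```python
import ctypes

# Stand-in for the native library (assumption: POSIX, so CDLL(None) works).
# In the real bindings the string would come from the quant library and be
# released with quant_free_string; strdup/free only illustrate the pairing.
lib = ctypes.CDLL(None)

lib.strdup.argtypes = [ctypes.c_char_p]
# Keep the raw address: a c_char_p restype would be converted to bytes
# and the pointer needed for freeing would be lost.
lib.strdup.restype = ctypes.c_void_p
lib.free.argtypes = [ctypes.c_void_p]
lib.free.restype = None


def take_string(ptr, free_fn):
    """Copy a NUL-terminated C string into bytes, then release it with the
    deallocator that matches the allocating library (hypothetical helper)."""
    if not ptr:
        return None
    try:
        return ctypes.string_at(ptr)
    finally:
        free_fn(ptr)


ptr = lib.strdup(b"hello from native code")
text = take_string(ptr, lib.free)
print(text)
```

Passing the deallocator in explicitly keeps the allocate/free pairing visible at the call site, which is the property the v0.8.2 fix restores.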
Verified in standalone C AND Python ctypes: kv_compress=1 (UNIFORM_4B)
works cleanly on SmolLM2-135M with quant_free_string. This is honest
correction #8: "we disabled a working feature because of incorrect
root cause analysis."
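
The misdiagnosis above came from changing two factors in one step. A toy sketch of the full 2x2 grid shows how varying one factor at a time separates them. The `aborts` function is a hypothetical model that simply encodes this commit's corrected conclusion (only the deallocator matters); it does not run the real library.

```python
from itertools import product


def aborts(kv_compress: int, free_with_libc: bool) -> bool:
    # Toy model of the corrected diagnosis: the abort depends only on
    # which deallocator is used, not on the KV compression setting.
    return free_with_libc


# Full 2x2 grid: every combination of the two factors.
results = {
    (kv, libc): aborts(kv, libc)
    for kv, libc in product((0, 1), (False, True))
}


def factor_matters(index: int) -> bool:
    """True if flipping the factor at `index`, with the other held fixed,
    changes the outcome for at least one combination."""
    for key, outcome in results.items():
        flipped = list(key)
        flipped[index] = (not key[index]) if index == 1 else 1 - key[index]
        if results[tuple(flipped)] != outcome:
            return True
    return False


kv_matters = factor_matters(0)
free_matters = factor_matters(1)
print(kv_matters, free_matters)
```

Comparing only (kv_compress=1, libc free) against (kv_compress=0, skip-free), as the v0.8.1 hotfix effectively did, cannot tell the two factors apart; the other two cells of the grid are what separate them.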
Changes:
- kv_compress default restored to 1 (was 0 since v0.8.1)
- kv_compress warning/fallback guard removed
- Version bumped to 0.9.0 (significant change: KV compression is now
the default experience for all pip users)
The headline value proposition now flows through both distribution
channels identically:
CLI: quant model.gguf -k turbo_kv_4b → 7x KV compression
Python: Model("model.gguf") → 4-bit KV compression
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 7b09851 commit 72ae6b2
2 files changed, 3 additions and 23 deletions
File 1 (name and diff content not captured): line 10 replaced.
File 2 (name and diff content not captured): line 22 replaced; line 184 replaced; lines 186-205 removed.