Commit a140d2d

Awneesh authored and committed
Add Zenodo DOI badge and update model count in README
1 parent 05e0ac7 commit a140d2d

1 file changed

Lines changed: 3 additions & 1 deletion

README.md

@@ -1,5 +1,7 @@
 # KVShuttle
 
+[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.18764713.svg)](https://doi.org/10.5281/zenodo.18764713)
+
 Benchmark and decision framework for KV cache transfer compression in disaggregated LLM serving.
 
 KVShuttle evaluates 14+ compression strategies across multiple models and sequence lengths, providing GPU-calibrated timing data and analytical transfer simulation to help practitioners choose the right compression scheme for their bandwidth regime.
@@ -31,7 +33,7 @@ KVShuttle evaluates 14+ compression strategies across multiple models and sequen
 - **Serving framework integration** — reference vLLM adapter (`KVShuttleConnector`) for disaggregated prefill/decode
 - **Analytical transfer simulation** — models sequential and pipelined transfer at configurable bandwidths
 - **Break-even analysis** — identifies the maximum bandwidth at which each strategy is beneficial
-- **Multi-model sweep** — benchmarks across 6 model architectures (Qwen2.5-3B through Llama-3.1-8B)
+- **Multi-model sweep** — benchmarks across 7 model architectures (Qwen2.5-3B through Llama-3.1-70B)
 - **Learned router** — trains a lightweight classifier to select the best compressor per-request
 
 ## Project Structure
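
Note on the "Break-even analysis" bullet in the diff context above: the idea can be illustrated with a minimal sketch. This is not KVShuttle's code. It assumes a purely sequential model (compress, then send, then decompress) with fixed compression and decompression times, and the function names `sequential_time_s` and `break_even_bandwidth` are hypothetical. Under those assumptions, compression helps only while `t_c + S_comp / B + t_d < S_raw / B`, which gives a break-even bandwidth of `B* = (S_raw - S_comp) / (t_c + t_d)`.

```python
def sequential_time_s(payload_bytes, bandwidth_bytes_per_s, overhead_s=0.0):
    """Total time for one transfer: fixed overhead plus wire time (assumed model)."""
    return overhead_s + payload_bytes / bandwidth_bytes_per_s


def break_even_bandwidth(raw_bytes, compressed_bytes, compress_s, decompress_s):
    """Maximum bandwidth (bytes/s) at which compressing still beats sending raw.

    Hypothetical helper: solves t_c + S_comp/B + t_d = S_raw/B for B.
    """
    saved_bytes = raw_bytes - compressed_bytes
    overhead_s = compress_s + decompress_s
    if saved_bytes <= 0:
        return 0.0            # compression never helps if it saves no bytes
    if overhead_s <= 0:
        return float("inf")   # free compression helps at any bandwidth
    return saved_bytes / overhead_s


# Illustrative numbers only, not measured results.
b_star = break_even_bandwidth(1_000_000_000, 500_000_000, 0.25, 0.25)
```

Below `b_star` the compressed path is faster; above it the fixed compression cost outweighs the bytes saved. A pipelined transfer, which the README also mentions, would overlap these stages and is not covered by this sketch.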
