v0.4 — KV cache tiering (hot RAM + cold SSD) by magicnight · Pull Request #26 · magicnight/Mac-MLX

magicnight · 2026-04-18T14:54:43Z

Summary

First of three v0.4.0 engine-parity sub-features (with ModelPool + MCP server to follow as separate PRs).

PromptCacheKey — sha256 hash over (modelID, tokens) with 16-way shard layout
PromptCacheStore actor — hot dict + cold safetensors LRU, Sendable snapshot wrapper
MLXSwiftEngine.generate() wired to check store + save updated cache post-generation — HIT/MISS debug logs under `engine` category
Settings → "KV Cache" section with hot/cold budget steppers + Clear All button
New tests: 5 PromptCacheKey cases + 4 PromptCacheStore cases (3 require Metal; skip gracefully on SPM)
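
To make the key scheme concrete, here is a minimal sketch of a SHA-256 key over (modelID, tokens) with a 16-way shard. It is not the PR's code: the type name `SketchCacheKey`, the byte encoding of the tokens, and the "first hex digit picks the shard" rule are all assumptions made for illustration.

```swift
import CryptoKit
import Foundation

/// Hypothetical stand-in for PromptCacheKey; names and encoding are assumptions.
struct SketchCacheKey: Hashable {
    /// Lowercase SHA-256 digest, 64 hex characters.
    let hex: String

    init(modelID: String, tokens: [Int]) {
        var hasher = SHA256()
        hasher.update(data: Data(modelID.utf8))
        // Separator byte between the model ID and the token bytes (assumed scheme).
        hasher.update(data: Data([0]))
        for token in tokens {
            // Fixed-width little-endian encoding keeps [1, 23] distinct from [12, 3].
            withUnsafeBytes(of: Int64(token).littleEndian) { hasher.update(bufferPointer: $0) }
        }
        hex = hasher.finalize().map { String(format: "%02x", $0) }.joined()
    }

    /// Assumed shard rule: the first hex digit, giving 16 directories "0" through "f".
    var shard: String { String(hex.prefix(1)) }

    /// Builds a path shaped like `root/<shard>/<hash>.safetensors`.
    func fileURL(root: URL) -> URL {
        root.appendingPathComponent(shard).appendingPathComponent(hex + ".safetensors")
    }
}
```

Sharding by a digest prefix spreads files roughly evenly across directories, which keeps any single directory from growing too large.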

MVP scope

Full-prompt hash (new prompt must START with old prompt as prefix to benefit) — vLLM-style block-level chained hashing for longest-common-prefix match is v0.4.0.1
Hot LRU is entry-count-based (8 entries). The MB slider value is persisted for a future byte-accurate budget but is not enforced yet.
Cold tier has no automatic pruning yet; users clear it manually via "Clear All".
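
An entry-count LRU of the kind described for the hot tier can be sketched as follows. This is an illustrative assumption, not the PR's implementation: `CountLRU` is a hypothetical name, and the linear-time recency update is only reasonable because the limit is small (8 entries in the MVP).

```swift
/// Entry-count LRU sketch; hypothetical type, not taken from the PR.
struct CountLRU<Key: Hashable, Value> {
    private var values: [Key: Value] = [:]
    private var order: [Key] = []  // least recently used first
    let capacity: Int

    init(capacity: Int) {
        precondition(capacity > 0, "capacity must be positive")
        self.capacity = capacity
    }

    /// Returns the value and marks the key as most recently used.
    mutating func get(_ key: Key) -> Value? {
        guard let value = values[key] else { return nil }
        touch(key)
        return value
    }

    /// Inserts the value and returns the evicted entry, if any,
    /// so a caller could persist it to a colder tier.
    @discardableResult
    mutating func put(_ key: Key, _ value: Value) -> (key: Key, value: Value)? {
        values[key] = value
        touch(key)
        guard order.count > capacity else { return nil }
        let victim = order.removeFirst()
        guard let evicted = values.removeValue(forKey: victim) else { return nil }
        return (victim, evicted)
    }

    /// Moves the key to the most-recently-used end of the order list.
    private mutating func touch(_ key: Key) {
        order.removeAll { $0 == key }
        order.append(key)
    }
}
```

Counting entries rather than bytes is simple, but it does not bound memory when snapshots differ widely in size.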

Test plan

MacMLXCore: `swift build` + `swift test` (93/93, 3 skipped for metallib)
Xcode app: `xcodebuild -scheme macMLX -configuration Debug build`
Manual: load a small model, send same prompt twice → second turn shows `Prompt cache HIT` in Logs

🤖 Generated with Claude Code

Actor-based two-tier prompt cache. Hot tier is an in-memory LRU dict keyed by PromptCacheKey. Cold tier is safetensors files under `root/<shard>/<hash>.safetensors` round-tripped via mlx-swift-lm's savePromptCache / loadPromptCache. Eviction from hot persists to cold; cold hits promote back into hot.

Introduces PromptCacheSnapshot, an @unchecked Sendable wrapper for [any KVCache], so snapshots can cross the actor isolation boundary (mlx-swift-lm's KVCache has no Sendable conformance).

Tests cover put/get hot hit, hot->cold eviction, cold->hot restore, and miss-returns-nil. The three MLX-dependent tests skip when default.metallib is not in the test bundle (standard SPM test binaries often lack it); the miss-path test runs unconditionally.
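
The two-tier behavior described above (evict hot to cold, promote cold hits back to hot) can be sketched roughly as below. Everything here is a stand-in chosen for illustration: `FakeKVCache` replaces mlx-swift-lm's KVCache, JSON files replace safetensors and the savePromptCache / loadPromptCache calls, string keys replace PromptCacheKey, and `TieredStore` is a hypothetical name.

```swift
import Foundation

/// Stand-in for a non-Sendable cache object (assumption; not the real KVCache).
final class FakeKVCache {
    let bytes: Data
    init(bytes: Data) { self.bytes = bytes }
}

/// @unchecked Sendable wrapper: the compiler trusts that the contents
/// are never mutated concurrently once wrapped.
struct Snapshot: @unchecked Sendable {
    let caches: [FakeKVCache]
}

actor TieredStore {
    private var hot: [String: Snapshot] = [:]
    private var order: [String] = []  // least recently used first
    private let hotLimit: Int
    private let root: URL

    init(root: URL, hotLimit: Int) {
        self.root = root
        self.hotLimit = hotLimit
    }

    /// Stores in the hot tier; entries evicted from hot are written to cold.
    func put(_ key: String, _ snapshot: Snapshot) throws {
        hot[key] = snapshot
        touch(key)
        while order.count > hotLimit {
            let victim = order.removeFirst()
            if let evicted = hot.removeValue(forKey: victim) {
                try writeCold(victim, evicted)
            }
        }
    }

    /// Checks hot first, then cold; a cold hit is promoted back into hot.
    func get(_ key: String) throws -> Snapshot? {
        if let hit = hot[key] {
            touch(key)
            return hit
        }
        guard let restored = try readCold(key) else { return nil }
        try put(key, restored)
        return restored
    }

    func isHot(_ key: String) -> Bool { hot[key] != nil }

    private func touch(_ key: String) {
        order.removeAll { $0 == key }
        order.append(key)
    }

    /// Assumed layout: root/<first character of key>/<key>.json
    private func coldURL(_ key: String) -> URL {
        root.appendingPathComponent(String(key.prefix(1)))
            .appendingPathComponent(key + ".json")
    }

    private func writeCold(_ key: String, _ snapshot: Snapshot) throws {
        let url = coldURL(key)
        try FileManager.default.createDirectory(
            at: url.deletingLastPathComponent(), withIntermediateDirectories: true)
        try JSONEncoder().encode(snapshot.caches.map(\.bytes)).write(to: url)
    }

    private func readCold(_ key: String) throws -> Snapshot? {
        let url = coldURL(key)
        guard FileManager.default.fileExists(atPath: url.path) else { return nil }
        let blobs = try JSONDecoder().decode([Data].self, from: Data(contentsOf: url))
        return Snapshot(caches: blobs.map(FakeKVCache.init(bytes:)))
    }
}
```

In this sketch the cold file is left in place after promotion, so a later eviction simply overwrites it. Whether the PR deletes or keeps cold files on promotion is not stated.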

On each generate call, hash the full input token sequence, look up a prior cache snapshot, and pass it to the token iterator so the shared prefix prefill is skipped. Save the extended snapshot after generation completes so the next turn benefits. MVP keys on exact-prefix match; vLLM-style block hashing with longest-common-prefix matching is v0.4.1+.
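
The lookup-then-save flow can be sketched as a toy engine. It models only the case from the test plan (the same prompt sent twice) and makes several assumptions: `SketchEngine` and `SketchSnapshot` are hypothetical, the store is keyed by the raw token array instead of a hash, the snapshot is saved under the input prompt's key, and the "Prompt cache MISS" string is guessed by analogy with the `Prompt cache HIT` log line.

```swift
/// Opaque stand-in for a KV cache snapshot (hypothetical).
struct SketchSnapshot {
    var tokensCovered: Int
}

/// Toy engine that only tracks how much prefill work a call would need.
struct SketchEngine {
    /// Stand-in for PromptCacheStore, keyed by raw tokens rather than a hash.
    private var store: [[Int]: SketchSnapshot] = [:]
    private(set) var log: [String] = []

    /// Returns how many prompt tokens needed prefill on this call.
    mutating func generate(prompt: [Int], outputLength: Int) -> Int {
        // 1. Look up a prior snapshot for this exact token sequence.
        let prior = store[prompt]
        log.append(prior == nil ? "Prompt cache MISS" : "Prompt cache HIT")

        // 2. On a hit the prompt counts as already prefilled; on a miss
        //    every prompt token must be processed.
        let prefilled = prior == nil ? prompt.count : 0

        // 3. After generation, save a snapshot that covers the prompt plus
        //    the generated tokens so a later call can reuse it.
        store[prompt] = SketchSnapshot(tokensCovered: prompt.count + outputLength)
        return prefilled
    }
}
```

How the engine reconciles a snapshot that already contains generated tokens with a new request is not described in the commit message, so the sketch leaves that step out.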

magicnight added 6 commits April 18, 2026 21:34

docs: v0.4 KV cache tiering MVP plan

6ee4f05

5 tasks: PromptCacheKey, PromptCacheStore (hot+cold LRU), engine wiring, Settings UI, CHANGELOG. MVP uses full-prompt hash; block-level longest-common-prefix matching deferred to v0.4.0.1.

feat(prompt-cache): PromptCacheKey — sha256 identifier with 16-way disk shard

be3f141

feat(kv-cache): Settings UI + budget defaults + Clear All button

94a02dc

docs: v0.4 KV cache tiering changelog entry

2b91d05

magicnight merged commit ace00ba into main Apr 18, 2026
2 checks passed

magicnight deleted the feat/v0.4-kv-cache-tiering branch April 18, 2026 14:59
