Problem
The bundle cache at ~/.mpak/cache/ accumulates without bound and duplicates vendored Python dependencies across every cached bundle. Real-world example from one developer's machine after a few months of use:
- Total cache: 3.7 GB
- _local/: 2.1 GB across 33 entries (local .mcpb installs from mpak install <path>)
- 16 of those entries are 76 MB each and represent the same bundle rebuilt during dev iteration
- Each entry's deps/ dominates: pandas (48 MB), numpy (23 MB), cryptography (22 MB), pydantic, fastmcp, etc., copied per-bundle
Two distinct contributors:
- Local-rebuild churn. prepareLocalBundle() keys cache entries by a hash of the bundle's absolute path (packages/sdk-typescript/src/mpakSDK.ts:280). A versioned filename like dist/mcp-foo-v0.1.0.mcpb → v0.1.1.mcpb changes the path, so each rebuild produces a new cache entry; the old one is never reclaimed. Dominant cause of the 2.1 GB.
- Cross-bundle duplication. Every bundle's deps/ is extracted as its own copy. Pandas/numpy/cryptography exist N times for N bundles, even though the wheels are usually byte-identical. Grows linearly with bundle count.
There is no eviction policy, no size cap, and no GC. The only cleanup path is manual rm -rf.
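The local-rebuild churn can be reproduced in a few lines. This is a minimal sketch, not the SDK's code: pathKey is a hypothetical stand-in for hashBundlePath, and SHA-256 with a 16-character truncation is an assumption, since the issue only states that the key is a hash of the absolute path.

```typescript
import { createHash } from "node:crypto";

// Hypothetical stand-in for hashBundlePath: the key depends only on the
// absolute path string. SHA-256 and the 16-char truncation are assumptions.
function pathKey(absPath: string): string {
  return createHash("sha256").update(absPath).digest("hex").slice(0, 16);
}

// Simulated _local/ directory: cache key -> source path that created it.
const localCache = new Map<string, string>();

function install(absPath: string): void {
  // No eviction step: an entry is only ever added or overwritten in place.
  localCache.set(pathKey(absPath), absPath);
}

install("/work/dist/mcp-foo-v0.1.0.mcpb");
install("/work/dist/mcp-foo-v0.1.1.mcpb"); // rebuild with a bumped version
install("/work/dist/mcp-foo-v0.1.1.mcpb"); // same path again reuses the key

console.log(localCache.size); // 2
```

Each new versioned filename adds a key, and nothing in this flow ever removes one, which matches the growth pattern described above.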
Options considered
| Approach | Fixes churn | Fixes cross-bundle dup | Cost | Notes |
|---|---|---|---|---|
| A. Retention policy + size cap | yes | no | trivial (~50 lines) | Per-localPath keep-latest-N in _local/; global LRU/size cap across cacheHome. Stopgap. |
| B. APFS clonefile / Linux reflinks during extract | yes | yes (partial) | small | Identical file contents share blocks at the FS layer. Needs a content-hash index to detect "seen this file before." macOS-only natively (APFS); Linux needs btrfs/xfs reflinks. |
| C. Shared venv (one env, all bundles) | yes | yes | medium | Disqualified: version conflicts across bundles are exactly why deps are vendored. |
| D. Content-addressable storage + hardlinks | yes | yes | significant rewrite | General, correct, prior art (pnpm, Nix, Bazel). Hardlink edge cases (__pycache__, cross-FS). |
| E. Delegate to uv (lockfile in bundle, uv resolves on install) | yes | yes | medium | Changes bundle contract; requires online install or pre-warmed wheel cache. uv already implements CAS+hardlinks for wheels. |
| F. Symlink farms | yes | yes | medium | Python + symlinks + __pycache__ is a known footgun. |
| G. Overlay/FUSE | yes | yes | high | Way too heavy for a single-machine disk cache. |
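For option B, the content-hash index plus clone-on-extract idea might look roughly like the sketch below. ContentIndex and placeFile are hypothetical names, and the index is kept in memory purely for illustration. The sketch relies on Node's fs.copyFileSync with COPYFILE_FICLONE, which asks for a copy-on-write clone and silently falls back to a regular copy where the filesystem does not support one; whether blocks are actually shared therefore depends on the platform and filesystem.

```typescript
import { createHash } from "node:crypto";
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical index: sha256(content) -> path of the first extracted copy.
type ContentIndex = Map<string, string>;

// Places one extracted file at dest. If identical content was seen before,
// request a copy-on-write clone of the earlier copy instead of writing anew.
function placeFile(index: ContentIndex, content: Buffer, dest: string): "reused" | "written" {
  const digest = createHash("sha256").update(content).digest("hex");
  fs.mkdirSync(path.dirname(dest), { recursive: true });
  const seen = index.get(digest);
  if (seen !== undefined && fs.existsSync(seen)) {
    // Falls back to a plain byte copy if the filesystem cannot clone.
    fs.copyFileSync(seen, dest, fs.constants.COPYFILE_FICLONE);
    return "reused";
  }
  fs.writeFileSync(dest, content);
  index.set(digest, dest);
  return "written";
}

// Demo: two bundles that vendor the same file.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "mpak-clone-"));
const index: ContentIndex = new Map();
const wheelFile = Buffer.from("pretend this is a numpy shared object");
const first = placeFile(index, wheelFile, path.join(root, "bundle-a", "deps", "numpy.so"));
const second = placeFile(index, wheelFile, path.join(root, "bundle-b", "deps", "numpy.so"));
console.log(first, second);
```

Because a clone is an independent file from the application's point of view, this avoids the shared-inode concerns that the table lists for hardlinks, at the cost of only saving space on filesystems that support cloning.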
Recommendation
Phased:
- Ship A first (low-risk, kills the dominant cause):
  - In _local/, retain only the most recent extraction per localPath from .mpak-local-meta.json. Evict older entries.
  - Add an opt-in size cap on cacheHome with LRU eviction across both _local/ and named registry caches.
  - Expose mpak cache prune and mpak cache info.
- Then evaluate B (FS clones) for cross-bundle dedup. Lowest-effort path to dedup that doesn't require restructuring storage. macOS gets it via clonefile(2); Linux via FICLONE ioctl on btrfs/xfs; ext4 falls back to regular copy.
- Reach for D or E only if data justifies it. mpak is a single-machine, single-user disk cache; it does not need the generality CAS protocols provide for multi-host or build-system contexts. Delegating to uv (E) is attractive but changes the bundle contract and deserves its own design discussion.
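The two eviction passes in phase A can be sketched as pure selection functions over a list of entries. The CacheEntry shape, its field names, and both function names are assumptions made for illustration; in particular, how last-use time would be tracked is not specified in this issue.

```typescript
// Hypothetical view of one cache directory; field names are assumptions.
interface CacheEntry {
  dir: string;        // directory name under cacheHome
  localPath?: string; // set for _local/ entries, undefined for registry entries
  sizeBytes: number;
  lastUsedMs: number;
}

// Pass 1: within _local/, keep the `keep` most recent entries per localPath.
function selectLocalEvictions(entries: CacheEntry[], keep = 1): CacheEntry[] {
  const byPath = new Map<string, CacheEntry[]>();
  for (const e of entries) {
    if (e.localPath === undefined) continue; // registry entries are skipped here
    const group = byPath.get(e.localPath) ?? [];
    group.push(e);
    byPath.set(e.localPath, group);
  }
  const evict: CacheEntry[] = [];
  for (const group of byPath.values()) {
    group.sort((a, b) => b.lastUsedMs - a.lastUsedMs); // newest first
    evict.push(...group.slice(keep));
  }
  return evict;
}

// Pass 2: evict least recently used entries until the total fits the cap.
function selectLruEvictions(entries: CacheEntry[], capBytes: number): CacheEntry[] {
  let total = entries.reduce((sum, e) => sum + e.sizeBytes, 0);
  const oldestFirst = [...entries].sort((a, b) => a.lastUsedMs - b.lastUsedMs);
  const evict: CacheEntry[] = [];
  for (const e of oldestFirst) {
    if (total <= capBytes) break;
    evict.push(e);
    total -= e.sizeBytes;
  }
  return evict;
}

const sample: CacheEntry[] = [
  { dir: "_local/a", localPath: "/p/x.mcpb", sizeBytes: 10, lastUsedMs: 1 },
  { dir: "_local/b", localPath: "/p/x.mcpb", sizeBytes: 10, lastUsedMs: 2 },
  { dir: "_local/c", localPath: "/p/y.mcpb", sizeBytes: 10, lastUsedMs: 3 },
  { dir: "registry/r", sizeBytes: 100, lastUsedMs: 0 },
];
console.log(selectLocalEvictions(sample).map((e) => e.dir));
console.log(selectLruEvictions(sample, 25).map((e) => e.dir));
```

Keeping selection separate from deletion makes a dry-run mode for mpak cache prune straightforward. In this sketch grouping is by exact localPath string; whether versioned filenames should be grouped together is left open here.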
Files
packages/sdk-typescript/src/cache.ts — MpakBundleCache, registry-bundle cache layout
packages/sdk-typescript/src/mpakSDK.ts:275 — prepareLocalBundle, _local/<hash> entry creation
packages/sdk-typescript/src/helpers.ts:267 — hashBundlePath (path-keyed, not content-keyed)
Acceptance for phase 1
- _local/ retains at most N entries per localPath (config, default 1)
- mpak cache info reports total size, per-bundle size, oldest entry
- mpak cache prune runs the eviction passes manually
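As a rough illustration of what mpak cache info would need to compute, the sketch below walks a cache directory and reports total size, per-bundle size, and the oldest entry. cacheInfo and dirSize are hypothetical names, each top-level directory is treated as one bundle for simplicity (the real layout nests local entries under _local/<hash>), and "oldest" is taken from the directory's mtime, which is an assumption about how age would be measured.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Recursively sums regular-file sizes under dir (symlinks are not followed).
function dirSize(dir: string): number {
  let total = 0;
  for (const ent of fs.readdirSync(dir, { withFileTypes: true })) {
    const p = path.join(dir, ent.name);
    if (ent.isDirectory()) total += dirSize(p);
    else if (ent.isFile()) total += fs.statSync(p).size;
  }
  return total;
}

interface CacheInfo {
  totalBytes: number;
  perBundle: Record<string, number>;
  oldest: string | null;
}

// Assumption: every top-level directory under cacheHome is one bundle.
function cacheInfo(cacheHome: string): CacheInfo {
  const perBundle: Record<string, number> = {};
  let oldest: string | null = null;
  let oldestMtime = Infinity;
  for (const ent of fs.readdirSync(cacheHome, { withFileTypes: true })) {
    if (!ent.isDirectory()) continue;
    const p = path.join(cacheHome, ent.name);
    perBundle[ent.name] = dirSize(p);
    const mtime = fs.statSync(p).mtimeMs;
    if (mtime < oldestMtime) {
      oldestMtime = mtime;
      oldest = ent.name;
    }
  }
  const totalBytes = Object.values(perBundle).reduce((a, b) => a + b, 0);
  return { totalBytes, perBundle, oldest };
}

// Demo against a throwaway directory rather than a real cache.
const demoHome = fs.mkdtempSync(path.join(os.tmpdir(), "mpak-info-"));
fs.mkdirSync(path.join(demoHome, "bundle-a"));
fs.writeFileSync(path.join(demoHome, "bundle-a", "f.bin"), "abc");
console.log(cacheInfo(demoHome));
```

Summing file sizes this way counts hardlinked or cloned files once per path, so if a later phase introduces block sharing the reported total would overstate actual disk usage.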