Summary
Move the canister filesystem from heap-only (Python dicts) to stable memory, with /tmp as a fast heap-backed exception for scratch data.
Problem
Currently, memfs stores all files in Python dicts (_MEMFS, _MEMFS_DIRS, _MEMFS_MTIMES) in Wasm linear memory (heap). This has several issues:
- 4 GB limit: the heap is capped, leaving no room for large datasets
- Inconsistent persistence model: entity metadata (e.g. Codex) lives in StableBTreeMap (stable memory), but the actual file content lives in heap. If the heap is lost or corrupted while stable memory survives, you get orphaned entity records.
- No atomicity: a Python trap mid-write can leave the dict in an inconsistent state
- Mental model mismatch: users expect open() to write to persistent storage. Heap "feels" volatile even though IC preserves it across upgrades.
Proposed behavior
Route open() based on path prefix:
    # Persistent: stored in stable memory (survives upgrades, reinstalls via backup)
    with open("/data/report.txt", "w") as f:
        f.write("important")

    # Temporary: heap only, fast, for scratch work
    with open("/tmp/cache.txt", "w") as f:
        f.write("throwaway")
| Path | Storage | Speed | Capacity | Survives upgrade |
|---|---|---|---|---|
| /tmp/* | Heap (Python dict) | Fast | 4 GB shared | Yes (IC orthogonal persistence) |
| Everything else | Stable memory (StableBTreeMap) | ~10-100x slower | 400 GB | Yes |
This mirrors a familiar operating-system convention: /tmp is often RAM-backed (e.g. tmpfs on Linux), while everything else lives on disk.
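Prefix routing is only as reliable as the path check behind it, so the path probably needs normalizing before the prefix is compared. The following is a minimal sketch of such a check, not the project's implementation: `is_heap_path` and `TMP_ROOT` are hypothetical names, and treating relative paths as relative to `/` is an assumption.

```python
import posixpath

TMP_ROOT = "/tmp"  # hypothetical constant; the proposal only names the /tmp prefix


def is_heap_path(path: str) -> bool:
    """Return True if `path` should stay in the heap-backed memfs."""
    # Assumption: relative paths are resolved against "/".
    # normpath collapses "." and ".." so "/tmp/../data/x" is not treated as /tmp.
    norm = posixpath.normpath(posixpath.join("/", path))
    # normpath may keep a leading "//", so collapse it to a single slash.
    norm = "/" + norm.lstrip("/")
    # Compare whole path components so "/tmpfoo" does not match "/tmp".
    return norm == TMP_ROOT or norm.startswith(TMP_ROOT + "/")
```

Comparing against `TMP_ROOT + "/"` rather than the bare prefix keeps sibling directories such as `/tmpfoo` on the stable-memory side.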
Implementation
- Add two Rust FFI functions: stable_fs_read(path) -> bytes and stable_fs_write(path, bytes)
  - Uses an existing or new StableBTreeMap instance with path strings as keys
  - Directory listings and mtimes stored as stable entries too
- Patch builtins.open in frozen_stdlib_preamble.py to route based on path prefix:
  - /tmp/* → current heap-based memfs (unchanged)
  - All other paths → stable memory via FFI
- Patch os.listdir, os.remove, os.rename, etc. to match
Impact
- Codex code content becomes truly stable-memory-backed (consistent with entity metadata)
- File storage scales to 400 GB instead of competing for 4 GB heap
- Clear, familiar semantics for developers
- Minimal Rust FFI work — same pattern as existing StableBTreeMap bindings
Trade-offs
- File reads/writes become slower (~10-100x) for non-tmp paths
- Existing canisters would need a migration to move heap files to stable memory on upgrade
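The migration could be a one-off pass over the existing heap dict. The sketch below assumes the heap memfs maps path strings to bytes, which the issue does not state; `migrate_heap_to_stable` is a hypothetical helper, and `stable_write` stands for whatever callable ends up wrapping stable_fs_write.

```python
def migrate_heap_to_stable(memfs, stable_write, tmp_root="/tmp"):
    """Copy every non-/tmp file from the heap dict into stable storage.

    `memfs` is assumed to map path -> bytes. Returns the migrated paths.
    """
    moved = []
    for path in sorted(memfs):
        if path == tmp_root or path.startswith(tmp_root + "/"):
            continue  # scratch files stay on the heap
        stable_write(path, memfs[path])
        moved.append(path)
    # Delete heap copies only after every write has succeeded, so an error
    # part-way through leaves the heap data intact.
    for path in moved:
        del memfs[path]
    return moved
```

It would be called once from whichever post-upgrade hook the runtime provides, before the patched open() starts serving requests.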
Related