Move filesystem to stable memory with /tmp heap fallback #32

@deucalioncodes

Summary

Move the canister filesystem from heap-only (Python dicts) to stable memory, with /tmp as a fast heap-backed exception for scratch data.

Problem

Currently, memfs stores all files in Python dicts (_MEMFS, _MEMFS_DIRS, _MEMFS_MTIMES) in Wasm linear memory (heap). This has several issues:

  • 4 GB limit: the Wasm heap is capped at 4 GB, which leaves no room for large datasets.
  • Inconsistent persistence model: entity metadata (e.g. Codex) lives in StableBTreeMap (stable memory), but the actual file content lives in the heap. If the heap is lost or corrupted while stable memory survives, the entity records are orphaned: they point at file content that no longer exists.
  • No atomicity: a Python trap mid-write can leave the three dicts out of sync with each other.
  • Mental model mismatch: users expect open() to write to persistent storage. Heap storage "feels" volatile, even though the IC preserves it across upgrades.

Proposed behavior

Route open() based on path prefix:

```python
# Persistent: stored in stable memory (survives upgrades; survives reinstalls via backup)
with open("/data/report.txt", "w") as f:
    f.write("important")

# Temporary: heap only, fast, for scratch work
with open("/tmp/cache.txt", "w") as f:
    f.write("throwaway")
```

| Path | Storage | Speed | Capacity | Survives upgrade |
|---|---|---|---|---|
| /tmp/* | Heap (Python dict) | Fast | 4 GB shared | Yes (IC orthogonal persistence) |
| Everything else | Stable memory (StableBTreeMap) | ~10-100x slower | 400 GB | Yes |

This mirrors a common operating-system setup: /tmp is often RAM-backed (tmpfs), while everything else lives on disk.
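The routing rule in the table can be sketched as a small helper. This is only an illustration: the name `storage_backend` and the choice to normalize paths before matching are assumptions, not part of the existing code. Normalizing first keeps a path such as `/tmp/../data/x` from being classified as scratch data.

```python
import posixpath

TMP_PREFIX = "/tmp"  # assumption: only this subtree stays heap-backed


def storage_backend(path: str) -> str:
    """Return "heap" for paths under /tmp and "stable" for everything else."""
    # Treat every path as rooted at "/" and collapse "." and ".." segments,
    # so the prefix check sees the location the path really refers to.
    normalized = posixpath.normpath("/" + path.lstrip("/"))
    if normalized == TMP_PREFIX or normalized.startswith(TMP_PREFIX + "/"):
        return "heap"
    return "stable"
```

Matching on `TMP_PREFIX + "/"` rather than a bare `startswith("/tmp")` means a sibling directory such as `/tmpfoo` is still routed to stable memory.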

Implementation

  1. Add two Rust FFI functions: stable_fs_read(path) -> bytes and stable_fs_write(path, bytes)
    • Uses an existing or new StableBTreeMap instance with path strings as keys
    • Directory listings and mtimes stored as stable entries too
  2. Patch builtins.open in frozen_stdlib_preamble.py to route based on path prefix:
    • /tmp/* → current heap-based memfs (unchanged)
    • All other paths → stable memory via FFI
  3. Patch os.listdir, os.remove, os.rename, etc. to match

Impact

  • Codex code content becomes truly stable-memory-backed (consistent with entity metadata)
  • File storage scales to 400 GB instead of competing for 4 GB heap
  • Clear, familiar semantics for developers
  • Minimal Rust FFI work — same pattern as existing StableBTreeMap bindings

Trade-offs

  • File reads/writes become slower (~10-100x) for non-tmp paths
  • Existing canisters would need a migration to move heap files to stable memory on upgrade
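The migration mentioned above might look roughly like the following. This is a sketch under assumptions: `migrate_heap_files` is a hypothetical helper, the heap files are taken to be a plain path-to-bytes dict like `_MEMFS`, and `stable_write` stands for the proposed `stable_fs_write` FFI function. Where the helper would be called (for example from an upgrade hook) is left open.

```python
def migrate_heap_files(heap_files, stable_write):
    """Copy every non-/tmp file to stable storage, then drop it from the heap.

    Returns the list of migrated paths.
    """
    moved = []
    for path in sorted(heap_files):
        if path == "/tmp" or path.startswith("/tmp/"):
            continue  # scratch files stay in the heap
        stable_write(path, heap_files[path])
        moved.append(path)
    # Delete only after every write has succeeded, so a failure part-way
    # through leaves the heap copies in place.
    for path in moved:
        del heap_files[path]
    return moved
```

Because migrated files are removed from the heap dict, running the helper a second time moves nothing, which makes it safe to call on every upgrade.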

Labels: enhancement (New feature or request)
