Rust-backed file store: eliminate pre_upgrade serialization bottleneck #35

@deucalioncodes

Problem

The current file persistence uses a Python-dict-backed StableBTreeMap (memory_id=255). During pre_upgrade, the entire file store is serialized as JSON+base64 and written to stable memory in one shot. This has two failure modes:

  1. 4GB stable memory limit — base64 overhead means ~3GB max raw file storage
  2. Instruction limit — JSON serialization of many files can exceed the ~5B instruction budget for pre_upgrade, causing the upgrade to trap and leaving the canister in an inconsistent state

Both failures cause data loss, and nothing warns you as the store approaches either limit.
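
The ~3GB figure in item 1 follows from base64's 4:3 expansion. A small Python check of that arithmetic, where `max_raw_bytes` and `encoded_size` are hypothetical helper names and the JSON framing overhead (quotes, keys) is ignored:

```python
import base64

GIB = 1024 ** 3


def encoded_size(raw_len: int) -> int:
    """Length of the padded base64 encoding of raw_len bytes."""
    # Every 3 input bytes (rounded up) become 4 output bytes.
    return 4 * ((raw_len + 2) // 3)


def max_raw_bytes(limit_bytes: int) -> int:
    """Largest raw payload whose padded base64 encoding fits in limit_bytes."""
    return (limit_bytes // 4) * 3


# Sanity check against the real encoder on a small payload.
assert encoded_size(3000) == len(base64.b64encode(bytes(3000)))

print(max_raw_bytes(4 * GIB) / GIB)  # 3.0
```

In practice the usable amount is somewhat lower than this, since the JSON wrapper and any other stable maps share the same space.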

Solution

Replace the Python-dict file store with a Rust-backed StableBTreeMap<String, Vec<u8>> from ic-stable-structures. This stores file data directly in stable memory on every close(), making pre_upgrade a no-op for files.
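
A minimal sketch of the write-on-close behaviour, in Python. The signatures of `file_store_get` / `file_store_set` / `file_store_delete` / `file_store_keys` are not defined yet, so the ones below are assumptions, and `InMemoryFileStore`, `PersistentFile` and `persistent_open` are stand-ins for illustration rather than the existing `_PersistentFile` / `_persistent_open` code:

```python
import io


class InMemoryFileStore:
    """Stand-in for the proposed native API (assumed signatures)."""

    def __init__(self):
        self._data = {}

    def file_store_get(self, path):
        return self._data.get(path)  # None if the file does not exist

    def file_store_set(self, path, data):
        self._data[path] = bytes(data)

    def file_store_delete(self, path):
        return self._data.pop(path, None) is not None

    def file_store_keys(self):
        return sorted(self._data)


class PersistentFile(io.BytesIO):
    """In-memory buffer that is flushed to the store when closed."""

    def __init__(self, store, path, initial=b"", writable=False, append=False):
        self._store = store
        self._path = path
        self._writable = writable
        super().__init__(initial)
        if append:
            self.seek(0, io.SEEK_END)

    def close(self):
        # Write through on close, so nothing is left for pre_upgrade to do.
        if not self.closed and self._writable:
            self._store.file_store_set(self._path, self.getvalue())
        super().close()


def persistent_open(store, path, mode="rb"):
    """Open a file from the store; only binary buffers are modelled here."""
    existing = None if "w" in mode else store.file_store_get(path)
    if existing is None and "r" in mode:
        raise FileNotFoundError(path)
    writable = any(flag in mode for flag in "wa+")
    return PersistentFile(store, path, existing or b"", writable, "a" in mode)


store = InMemoryFileStore()
with persistent_open(store, "/notes.txt", "wb") as f:
    f.write(b"hello")
with persistent_open(store, "/notes.txt", "ab") as f:
    f.write(b" world")
print(store.file_store_get("/notes.txt"))  # b'hello world'
```

The lookup happens in `persistent_open` before the file object is constructed, so a failed open never leaves behind an object whose `close()` could write an empty file to the store.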

Implementation plan

  1. Add a Rust StableBTreeMap<String, Vec<u8>> in the canister template for file storage
  2. Expose native Python APIs via _basilisk_ic: file_store_get, file_store_set, file_store_delete, file_store_keys
  3. Modify _PersistentFile / _persistent_open to use the native API instead of the Python dict
  4. Remove file store serialization from _basilisk_save_stable_maps / _basilisk_load_stable_maps
  5. Backward compatibility: migrate existing Python-dict file store data to Rust-backed store on first upgrade
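
Step 5 could look roughly like the Python sketch below. It assumes the legacy blob is a JSON object mapping path to a base64 string (the exact layout is not specified above), uses a plain dict as a stand-in for the Rust-backed store, and `migrate_legacy_blob` is a hypothetical name:

```python
import base64
import json


def migrate_legacy_blob(blob: str, store: dict) -> int:
    """Copy files from a legacy JSON+base64 blob into the new store.

    Returns the number of files copied. Paths already present in the
    new store are left alone, so running the migration twice is harmless.
    """
    legacy = json.loads(blob) if blob else {}
    migrated = 0
    for path, encoded in legacy.items():
        if path not in store:  # never clobber data written after migration
            store[path] = base64.b64decode(encoded)
            migrated += 1
    return migrated


legacy_blob = json.dumps({"/main.py": base64.b64encode(b"print('hi')").decode()})
new_store = {}
print(migrate_legacy_blob(legacy_blob, new_store))  # 1
print(migrate_legacy_blob(legacy_blob, new_store))  # 0
```

Making the migration idempotent matters here because a trapped upgrade may be retried, and the second run should not overwrite files written through the new path.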

Benefits

  • No serialization during upgrades — eliminates both the 4GB and instruction-limit risks
  • Writes are immediately durable — no data loss if upgrade fails
  • Scales to full stable memory capacity (currently 8GB on IC, growing)
  • Important for the agent-programmable-canister use case (#34, "Support importing user-uploaded .py files on-canister"), where codebases are uploaded to memfs
