lucadealfaro/snapshot-kernel

Snapshot-Kernel

A Snapshotting Kernel for Python

Luca de Alfaro, 2026

snapshot-kernel is a snapshotting kernel, geared mainly toward executing Jupyter Python notebooks. The kernel stores execution states, and executing code from a state generates a new state. This makes it possible to store the state after executing each cell of a notebook, and to return to that state when a cell needs to be re-executed. The kernel is exposed as a REST API implemented with bottle.py, which gives very fast startup and shutdown times.

Kernel Basics

The kernel stores states, where a state is a snapshot of the execution environment, including among others:

  • Timestamp when the state was created
  • Variables
  • Imported modules

States are immutable: an existing state is never modified, but executing code in a state creates a new state.

The kernel stores states in the form of a dictionary mapping state names to their content. The main methods that the kernel implements are:

  • execute(code: str, exec_id: str, state_name: str, new_state_name: Optional[str] = None) -> dict: Executes the given code in the specified state. exec_id is a unique ID for this execution. If new_state_name is provided, the resulting state after execution is stored under that name. Otherwise, it is stored under a new, randomly generated unique name. The return value is a dictionary including at least:

    • output: The output of the execution, if any, as a list in the same format as the output of a Jupyter notebook cell execution.
    • state_name: The name of the state after execution.
    • error: Any error that occurred during execution, if applicable.
  • get_state(state_name: str) -> dict: Retrieves the state associated with the given name. The state should include all variables and imported modules at that point in execution.

  • list_states() -> List[str]: Returns a list of all state names currently stored in the kernel.

  • delete_state(state_name: str): Deletes the state associated with the given name from the kernel.

  • reset(): Resets the kernel by clearing all stored states and returning to an initial empty state.

  • interrupt(exec_id: str): Interrupts the execution associated with the given exec_id. This should stop the execution of the code and return an appropriate response indicating that the execution was interrupted.
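
The methods above can be illustrated with a minimal in-memory sketch. This is not the repository's implementation: the class name SketchKernel, the "initial" state name, and the layout of a state as a dictionary with "timestamp" and "variables" keys are all assumptions made for illustration. Output capture and interruption are left out, and storing a state even when the code raises is a simplification.

```python
import copy
import types
import uuid
from datetime import datetime, timezone


class SketchKernel:
    """Toy kernel: a dictionary mapping state names to snapshots (hypothetical)."""

    def __init__(self):
        self.states = {}
        self.reset()

    def reset(self):
        """Clear all stored states and recreate an initial empty state."""
        self.states = {"initial": self._snapshot({})}

    def _snapshot(self, variables):
        """Build a state record; the field names are assumptions of this sketch."""
        stamp = datetime.now(timezone.utc).isoformat()
        return {"timestamp": stamp, "variables": variables}

    def execute(self, code, exec_id, state_name, new_state_name=None):
        """Run code on a copy of state_name and store the result as a new state."""
        # exec_id would identify the run for interrupt(); it is unused here.
        source = self.states[state_name]["variables"]
        # Copy values so the source state is never modified. Modules cannot
        # be deep-copied, so they are shared between states by reference.
        namespace = {
            key: value if isinstance(value, types.ModuleType) else copy.deepcopy(value)
            for key, value in source.items()
        }
        error = None
        try:
            exec(code, namespace)
        except Exception as exc:
            error = f"{type(exc).__name__}: {exc}"
        namespace.pop("__builtins__", None)  # added by exec(); not user state
        name = new_state_name or uuid.uuid4().hex
        self.states[name] = self._snapshot(namespace)
        return {"output": [], "state_name": name, "error": error}

    def get_state(self, state_name):
        """Return the snapshot stored under state_name."""
        return self.states[state_name]

    def list_states(self):
        """Return the names of all stored states."""
        return list(self.states)

    def delete_state(self, state_name):
        """Remove the state stored under state_name."""
        del self.states[state_name]


kernel = SketchKernel()
kernel.execute("x = 1", "e1", "initial", "s1")
kernel.execute("x = x + 1", "e2", "s1", "s2")
# Re-executing from "s1" starts from x == 1 again, because "s1" was not modified.
rerun = kernel.execute("x = x * 10", "e3", "s1")
```

Because each execution works on a copy, a notebook front end can re-run a cell by executing it again from the state recorded before that cell.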

Implementation Details

The kernel is written in Python. The code follows these conventions:

  • 4-space indentation.
  • Type hints are not required.
  • Docstrings should be included.

Output Format

The output generated by the kernel is in a format that is compatible with Jupyter notebook cell outputs. This means that the output is a list of dictionaries, where each dictionary represents an output item and has at least the following keys:

  • output_type: A string indicating the type of output (e.g., "stream", "display_data", "execute_result", "error", etc.).
  • data: The actual output data, which can be in various formats depending on the output type (e.g., text, HTML, images, etc.).
  • metadata: Any additional metadata associated with the output, such as MIME types, execution count, etc.
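
As a rough illustration of such an output list, the sketch below runs a cell, captures what it prints, and reports the value of a trailing expression. The function name run_and_collect is hypothetical. The sketch follows the nbformat 4 conventions, in which stream items carry name and text fields while execute_result items carry data and metadata; whether the kernel emits exactly these fields is an assumption.

```python
import ast
import contextlib
import io


def run_and_collect(code, namespace):
    """Execute code and return a list of Jupyter-style output items."""
    tree = ast.parse(code)
    last_expr = None
    # If the cell ends with an expression, evaluate it separately so its
    # value can be reported, as notebook front ends do.
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        last_expr = ast.Expression(tree.body.pop().value)
    buffer = io.StringIO()
    value = None
    with contextlib.redirect_stdout(buffer):
        exec(compile(tree, "<cell>", "exec"), namespace)
        if last_expr is not None:
            value = eval(compile(last_expr, "<cell>", "eval"), namespace)
    outputs = []
    if buffer.getvalue():
        outputs.append({
            "output_type": "stream",
            "name": "stdout",
            "text": buffer.getvalue(),
        })
    if value is not None:
        outputs.append({
            "output_type": "execute_result",
            "data": {"text/plain": repr(value)},
            "metadata": {},
            "execution_count": None,
        })
    return outputs


outputs = run_and_collect("print('hi')\n1 + 2", {})
# outputs[0] is the stream item for "hi\n"; outputs[1] holds "3" as text/plain.
```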

In particular, the kernel should also capture:

  • Figures and plots generated by the code, such as those produced by matplotlib, in a format that can be rendered in a Jupyter notebook (e.g., as base64-encoded PNG images).
  • Rich outputs, such as those generated by libraries like pandas (e.g., DataFrames), in a format that can be rendered in a Jupyter notebook (e.g., as HTML tables).

The test cases should include tests for these formats.
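
A test for these formats might look roughly like the sketch below. It avoids third-party dependencies by using a stand-in class, FakeTable, that exposes the _repr_html_ method used by the IPython display protocol (pandas DataFrames provide the same method). The helper names display_data_for and png_output are hypothetical, and the PNG bytes are only the 8-byte file signature rather than a real image; with matplotlib, the bytes would typically come from saving the figure to an in-memory buffer.

```python
import base64


def display_data_for(obj):
    """Build a display_data item, adding text/html when the object offers it."""
    data = {"text/plain": repr(obj)}
    html_method = getattr(obj, "_repr_html_", None)
    if callable(html_method):
        rendered = html_method()
        if rendered is not None:
            data["text/html"] = rendered
    return {"output_type": "display_data", "data": data, "metadata": {}}


def png_output(png_bytes):
    """Wrap raw PNG bytes as a base64-encoded image/png display_data item."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return {"output_type": "display_data", "data": {"image/png": encoded}, "metadata": {}}


class FakeTable:
    """Stand-in for a rich object such as a DataFrame (illustrative only)."""

    def __repr__(self):
        return "FakeTable(1 row)"

    def _repr_html_(self):
        return "<table><tr><td>1</td></tr></table>"


PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file

table_item = display_data_for(FakeTable())
image_item = png_output(PNG_SIGNATURE)

# The kinds of checks a test case could make on captured outputs.
assert table_item["data"]["text/html"].startswith("<table>")
assert base64.b64decode(image_item["data"]["image/png"]) == PNG_SIGNATURE
```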

Communication with the Kernel

Communication with the kernel occurs over a REST API, served over HTTP by bottle.py, with cheroot as the WSGI server to enable multi-threading. Multi-threading is necessary so that long-running executions can be interrupted. It also leaves open the possibility of computing multiple states in parallel, generating multiple output states from the same input state.
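
To show why a second thread is needed for interruption, here is a minimal sketch that runs code in a worker thread and interrupts it from the calling thread. It is one possible technique, not necessarily the one used by this kernel: it relies on the CPython-specific PyThreadState_SetAsyncExc function (reached through ctypes) to raise KeyboardInterrupt inside the worker. The class InterruptibleRunner and its results dictionary are hypothetical. The exception is only delivered while the worker is executing Python bytecode, so a thread blocked in a long C call is not interrupted until that call returns.

```python
import ctypes
import threading
import time


class InterruptibleRunner:
    """Run code in worker threads that can be interrupted by exec_id."""

    def __init__(self):
        self._threads = {}
        self._lock = threading.Lock()
        self.results = {}

    def start(self, exec_id, code):
        """Start executing code in a new thread and return the thread."""
        def worker():
            try:
                exec(code, {})
                self.results[exec_id] = "ok"
            except KeyboardInterrupt:
                self.results[exec_id] = "interrupted"
            finally:
                with self._lock:
                    self._threads.pop(exec_id, None)

        thread = threading.Thread(target=worker, daemon=True)
        with self._lock:
            self._threads[exec_id] = thread
        thread.start()
        return thread

    def interrupt(self, exec_id):
        """Ask CPython to raise KeyboardInterrupt in the thread running exec_id."""
        with self._lock:
            thread = self._threads.get(exec_id)
        if thread is None or thread.ident is None:
            return False
        affected = ctypes.pythonapi.PyThreadState_SetAsyncExc(
            ctypes.c_ulong(thread.ident), ctypes.py_object(KeyboardInterrupt)
        )
        return affected == 1


runner = InterruptibleRunner()
busy = runner.start("e1", "i = 0\nwhile True:\n    i += 1\n")
time.sleep(0.2)           # let the loop start spinning
runner.interrupt("e1")    # issued from the main thread while the worker runs
busy.join(timeout=5)
```

In a server, the request handling an interrupt would play the role of the main thread here, which is why a single-threaded WSGI server would not be sufficient.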

The bottle server should be launched with a command of the form:

python -m bottle --bind <IP_ADDRESS>:8080 --token=<SECRET_TOKEN> main.py

where the final argument (main.py in the command above) names the file containing the implementation of the kernel and the bottle server, and SECRET_TOKEN is a token that must be supplied as a URL parameter in every request for authentication. The server listens for incoming HTTP requests and routes them to the appropriate kernel methods based on the request path and method (e.g., POST for executing code, GET for retrieving states, etc.).
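
The token requirement can be sketched from both sides using only the standard library. The README does not specify the name of the URL parameter or the route paths, so the parameter name token and the /list_states path below are assumptions, as are the helper names build_url and token_is_valid. The comparison uses hmac.compare_digest, a constant-time comparison, which is a design choice of this sketch.

```python
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit


def build_url(base_url, path, token, **params):
    """Client side: build a request URL that carries the token as a parameter."""
    query = urlencode({**params, "token": token})
    return f"{base_url.rstrip('/')}/{path.lstrip('/')}?{query}"


def token_is_valid(url, expected_token):
    """Server side: check the token found in the URL's query string."""
    supplied = parse_qs(urlsplit(url).query).get("token", [""])[0]
    return hmac.compare_digest(supplied.encode(), expected_token.encode())


url = build_url("http://127.0.0.1:8080", "/list_states", "s3cret")
# url == "http://127.0.0.1:8080/list_states?token=s3cret"
```

A server would run a check like token_is_valid before dispatching any request to the kernel, and reject requests whose token is missing or wrong.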
