A benchmark comparing floating-point performance and resource efficiency across different container runtimes (standard Linux, static binary, WebAssembly).
Source code (`src/`):

- `mmb.c`: Benchmark source. Performs matrix multiplication with an integrated validity check. Outputs iterations and throughput in MFLOPS.
- `single_env_bench.py`: A Python orchestrator that handles pod deployment, log parsing, and cgroup resource monitoring for a single execution environment.
- `metaorchestrator.py`: The main benchmarking script. Coordinates the benchmark suite across multiple execution environments. It is configured exclusively by `bench_config.yaml`.
- `bench_config.yaml`: Single source of truth for benchmark configuration - environments, matrix sizes, trial count, duration, warmup, cgroup sample interval, and output subfolder.
- `Dockerfile`: Multi-stage build file defining three targets:
  - `debian`: Standard GCC build (Debian Slim)
  - `static`: Statically linked binary (Scratch)
  - `wasm`: WebAssembly build using the WASI SDK (Scratch)
- `Makefile`: Manages the full project lifecycle - dependency checking, WASM shim setup, image building, running the benchmark suite, and cleanup.
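The MFLOPS figure reported by `mmb.c` follows the standard operation count for a dense N×N matrix multiplication: 2·N³ floating-point operations (one multiply and one add per inner-loop step). A minimal Python sketch of that arithmetic; the exact formula used inside `mmb.c` is an assumption here:

```python
def mflops(n: int, iterations: int, elapsed_s: float) -> float:
    """Throughput for `iterations` multiplications of two n x n matrices.

    A dense matmul performs 2*n^3 floating-point ops (one multiply and
    one add for each of the n^3 inner-loop steps).
    """
    total_flops = 2 * n**3 * iterations
    return total_flops / (elapsed_s * 1e6)  # MFLOPS = 1e6 flops per second

# Example: 25 multiplications of 256x256 matrices in 30 s
print(round(mflops(256, 25, 30.0), 1))  # → 28.0
```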
- Linux with root access (required for `/sys/fs/cgroup`)
- K3s with containerd
- Docker (BuildKit enabled)
- `kubectl` configured for K3s (`KUBECONFIG=/etc/rancher/k3s/k3s.yaml`)
- Python 3 with `pyyaml` (`pip install pyyaml`)
- KWasm Operator (for WASM targets)
Check dependencies:

```sh
make check-deps
```

Install the KWasm operator via Helm:

```sh
helm repo add kwasm http://kwasm.sh/kwasm-operator/
helm install -n kwasm --create-namespace kwasm-operator kwasm/kwasm-operator
```

Annotate the node to trigger shim binary installation (replace `<node-name>` with your node):

```sh
kubectl annotate node <node-name> kwasm.sh/kwasm-node=true
```

Wait for the installer pod to complete, then create the required symlinks:

```sh
make setup-wasm
```

Build the images:

```sh
make image-build
```

```mermaid
flowchart TD
    Root[make image-build]
    subgraph DebianFlow [Debian]
        direction TB
        DebBuild[build standard image]
    end
    subgraph StaticFlow [Static]
        direction TB
        StatBuild[build static binary image]
    end
    subgraph WasmFlow [WASM]
        direction TB
        WasmBuild[build WASM image]
    end
    Import[import to k3s containerd]
    %% Connections
    Root --> DebianFlow
    Root --> StaticFlow
    Root --> WasmFlow
    %% Converge to single import step
    DebBuild --> Import
    StatBuild --> Import
    WasmBuild --> Import
```
Creates `mmb-debian:latest`, `mmb-static:latest`, and `mmb-wasm:latest`, and imports them into the K3s containerd registry.
Cleanup targets:

- `make reset` - removes project images from Docker and K3s
- `make hard-reset` - also removes build tools pulled by Docker during the build

Run the benchmark suite:

```sh
make run
```

Root privileges are required to read from `/sys/fs/cgroup`.
The `metaorchestrator.py` script automates the benchmarking process across environments by executing `single_env_bench.py` for each image listed in `bench_config.yaml`, using the shared arguments defined in that same file.
These are the pre-defined parameters of the benchmark suite:

```yaml
environments:
  - image: mmb-debian:latest
    runtime_class: default
  - image: mmb-static:latest
    runtime_class: default
  - image: mmb-wasm:latest
    runtime_class: wasmtime
  - image: mmb-wasm:latest
    runtime_class: wasmedge
  - image: mmb-wasm:latest
    runtime_class: wasmer
shared_args:
  sizes: [256, 1024]
  trials: 25
  duration: 30
  warmup: true
  interval: 0.5
  namespace: default
results_subfolder_name: test_bench
```
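Conceptually, `metaorchestrator.py` expands this configuration into one `single_env_bench.py` invocation per environment, passing the shared arguments along. A hypothetical sketch of that loop; the flag names of `single_env_bench.py` are assumptions here, not taken from the source:

```python
import subprocess
import sys

def build_cmd(env: dict, shared: dict) -> list[str]:
    """Build one single_env_bench.py invocation.

    The flag names below are hypothetical; the real script may differ.
    """
    return [
        sys.executable, "single_env_bench.py",
        "--image", env["image"],
        "--runtime-class", env["runtime_class"],
        "--sizes", ",".join(str(s) for s in shared["sizes"]),
        "--trials", str(shared["trials"]),
        "--duration", str(shared["duration"]),
        "--interval", str(shared["interval"]),
        "--namespace", shared["namespace"],
    ]

def run_suite(config_path: str = "bench_config.yaml") -> None:
    """Run single_env_bench.py once per configured environment."""
    import yaml  # pip install pyyaml

    with open(config_path) as f:
        cfg = yaml.safe_load(f)
    for env in cfg["environments"]:
        subprocess.run(build_cmd(env, cfg["shared_args"]), check=True)
```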
```mermaid
---
config:
  look: classic
  theme: redux-dark
---
flowchart TB
    subgraph DataSources["Data Sources"]
        PodLogs["mmb.c output<br/>(via kubectl logs)"]
        Cgroups["cgroup filesystem<br/>(/sys/fs/cgroup)"]
    end
    subgraph Host[" "]
        direction BT
        Orchestrator[("Orchestrator")]
        DataSources
    end
    PodLogs -- Iterations, MFLOPS, Validity --> Orchestrator
    Cgroups -- CPU Usage, Memory, Throttling --> Orchestrator
```
The results are saved under `results/<subfolder>/` (where `<subfolder>` is `results_subfolder_name` from the config), with one `.json` file per execution environment. Each entry is keyed as `<size>_w` (with warmup) or `<size>_nw` (without warmup) and contains:
Lifecycle events, recorded as Unix timestamps:

- `start`: Deployment trigger time.
- `running_time`: Pod reached the `Running` state.
- `bench_start`: The moment the C application prints `BENCH_START` (after the optional warmup).
- `end`: The moment the C application prints `BENCH_END`.
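The entry keys in a results file encode matrix size and warmup mode. A minimal helper pair for building and parsing those keys; the function names are hypothetical, while the `<size>_w` / `<size>_nw` format comes from the description above:

```python
def entry_key(size: int, warmup: bool) -> str:
    """Build a results-file key: '<size>_w' (with warmup) or '<size>_nw' (without)."""
    return f"{size}_w" if warmup else f"{size}_nw"

def parse_key(key: str) -> tuple[int, bool]:
    """Inverse of entry_key: recover (size, warmup) from a key."""
    size, _, suffix = key.partition("_")
    return int(size), suffix == "w"

print(entry_key(256, True))  # → 256_w
```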
```mermaid
---
config:
  theme: mc
---
gantt
    title Trial execution flow
    dateFormat X
    axisFormat %
    section Orchestrator
        apply & wait :orch1, 0, 2
        logs streaming :orch3, after orch1, 7s
        cgroup monitor thread active :crit, orch4, after ben2, 4s
        save / cleanup :orch5, after ben4, 1s
    section Pod Status
        ContainerCreating :pod1, 0, 2
        Running :pod2, after pod1, 7s
        Terminating :pod3, after pod2, 1s
    section App (mmb.c)
        init :ben1, after pod1, 1s
        warmup (5s) :ben2, after ben1, 1s
        workload loop :crit, ben3, after ben2, 4s
        validation & exit :ben4, after ben3, 1s
    section Timestamps
        start :milestone, m0, 0, 0s
        running_time :milestone, m1, after pod1, 0s
        bench_start :milestone, m2, after ben2, 0s
        end :milestone, m3, after ben3, 0s
```
Metrics extracted directly from the C application's standard output:

- `iterations`: Total matrix multiplications completed.
- `throughput_mflops`: Speed in MFLOPS.
- `valid`: `true` if the calculation check passed.
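Extracting these values amounts to scanning the pod log between the `BENCH_START` and `BENCH_END` markers. A sketch of such a parser; the `key=value` line format between the markers is an assumption, not the documented output of `mmb.c`:

```python
def parse_bench_output(log: str) -> dict:
    """Extract benchmark results from mmb.c's stdout.

    Assumes a hypothetical `key=value` format between the BENCH_START
    and BENCH_END markers; the real output layout may differ.
    """
    raw = {}
    in_bench = False
    for line in log.splitlines():
        line = line.strip()
        if line == "BENCH_START":
            in_bench = True
        elif line == "BENCH_END":
            in_bench = False
        elif in_bench and "=" in line:
            key, _, value = line.partition("=")
            raw[key.strip()] = value.strip()
    return {
        "iterations": int(raw.get("iterations", 0)),
        "throughput_mflops": float(raw.get("throughput_mflops", 0.0)),
        "valid": raw.get("valid") == "true",
    }
```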
A list of raw snapshots captured from `/sys/fs/cgroup` at the defined `--interval`. Each sample object contains:

- `timestamp`: Time of the snapshot.
- `usage_usec`: Total CPU time consumed in microseconds.
- `user_usec`: CPU time spent in user space.
- `system_usec`: CPU time spent in kernel space.
- `nr_throttled`: Cumulative count of throttle events.
- `mem_bytes`: Total memory usage (RSS + cache).
- `rss_bytes`: Resident Set Size.
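Collecting one such sample from cgroup v2 amounts to reading `cpu.stat`, `memory.current`, and `memory.stat` under the pod's cgroup directory. A sketch under those assumptions; the `cpu.stat` field names are real cgroup v2 keys, but discovering the pod's cgroup path is omitted here:

```python
import time
from pathlib import Path

def parse_flat_keyed(text: str) -> dict[str, int]:
    """Parse cgroup v2 'flat keyed' files such as cpu.stat or memory.stat."""
    out = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if value:
            out[key] = int(value)
    return out

def sample_cgroup(cgroup_dir: str) -> dict:
    """Take one snapshot of a pod's cgroup v2 directory.

    `cgroup_dir` is the pod's directory under /sys/fs/cgroup; resolving
    it from the pod UID is out of scope for this sketch.
    """
    base = Path(cgroup_dir)
    cpu = parse_flat_keyed((base / "cpu.stat").read_text())
    mem_stat = parse_flat_keyed((base / "memory.stat").read_text())
    return {
        "timestamp": time.time(),
        "usage_usec": cpu["usage_usec"],
        "user_usec": cpu["user_usec"],
        "system_usec": cpu["system_usec"],
        "nr_throttled": cpu.get("nr_throttled", 0),
        "mem_bytes": int((base / "memory.current").read_text()),
        "rss_bytes": mem_stat.get("anon", 0),  # 'anon' approximates RSS in cgroup v2
    }
```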
Derived from processing the raw cgroup samples:

- `cold_start_time`: Startup latency (`running_time` - `start`).
- `avg_cpu_cores`: Average CPU usage normalized to cores (e.g., 1.0 = one full core).
- `peak_mem_bytes`: The highest memory usage observed during the trial.
- `avg_mem_bytes`: The average memory usage throughout the execution.
- `throttled_events`: The total number of times the CPU was throttled by the cgroup controller.
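Deriving these summary values from the raw samples is straightforward arithmetic over the first and last snapshots. A sketch assuming at least two samples per trial; the function name is hypothetical:

```python
def summarize(samples: list[dict], start: float, running_time: float) -> dict:
    """Reduce a list of raw cgroup samples to the derived metrics.

    avg_cpu_cores divides the CPU-time delta (usec) by the wall-clock
    delta (seconds * 1e6), so 1.0 means one fully busy core.
    """
    first, last = samples[0], samples[-1]
    wall_usec = (last["timestamp"] - first["timestamp"]) * 1e6
    return {
        "cold_start_time": running_time - start,
        "avg_cpu_cores": (last["usage_usec"] - first["usage_usec"]) / wall_usec,
        "peak_mem_bytes": max(s["mem_bytes"] for s in samples),
        "avg_mem_bytes": sum(s["mem_bytes"] for s in samples) / len(samples),
        "throttled_events": last["nr_throttled"] - first["nr_throttled"],
    }
```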