1,010 changes: 1,010 additions & 0 deletions Cargo.lock

Large diffs are not rendered by default.

16 changes: 16 additions & 0 deletions Cargo.toml
@@ -14,3 +14,19 @@ serde_json = "1"
clap = { version = "4", features = ["derive"] }
thiserror = "2"
libc = "0.2"
base64 = "0.22"

# containerd mode-2 (feature "containerd"): drive containerd's gRPC API in
# process. Optional so the default build stays lean and needs no protoc.
containerd-client = { version = "0.9", optional = true }
tonic = { version = "0.14", default-features = false, features = ["codegen", "channel"], optional = true }
prost-types = { version = "0.14", optional = true }
tokio = { version = "1", features = ["rt", "rt-multi-thread", "macros", "net", "time", "signal"], optional = true }
sha2 = { version = "0.10", optional = true }
hex = { version = "0.4", optional = true }
anyhow = { version = "1", optional = true }

[features]
default = []
# Compile the in-process containerd backend (pulls tonic/tokio; needs host protoc).
containerd = ["dep:containerd-client", "dep:tonic", "dep:prost-types", "dep:tokio", "dep:sha2", "dep:hex", "dep:anyhow"]
5 changes: 5 additions & 0 deletions examples/inception/.gitignore
@@ -0,0 +1,5 @@
# Fetched/generated by setup.sh, not committed.
tools/
Orchfile.run
Orchfile.stress
state/
13 changes: 13 additions & 0 deletions examples/inception/Orchfile
@@ -0,0 +1,13 @@
# Inception: orchd boots a Linux VM with its own apple runtime (osx mode), sizes
# it from this spec, mounts the containerd toolchain in, and runs the nested
# containerd test inside it. See README.md.
#
# setup.sh rewrites __TOOLS__ to the absolute path of ./tools and writes the
# runnable copy as Orchfile.run.

SERVICE ctd
FROM docker.io/library/debian:bookworm-slim
MEMORY 3G
CPUS 3
VOLUME __TOOLS__:/opt/tools
CMD /bin/sh /opt/tools/run-test.sh
74 changes: 74 additions & 0 deletions examples/inception/README.md
@@ -0,0 +1,74 @@
# Inception: orchd running containerd, inside a container orchd booted

A composability test, and the most demanding one in this repo. It exercises
**both** of orchd's runtimes at once, nested:

```
macOS host
└─ orchd --runtime apple (osx mode)      <- boots a Linux microVM, no daemon
   └─ Debian VM (sized + mounted from the Orchfile spec)
      └─ orchd --runtime containerd      <- drives containerd inside the VM
         └─ containerd -> a container (alpine)
```

orchd orchestrates orchd, two runtimes deep, in a box orchd created. If the
spec isn't honored end to end, this doesn't run: the outer VM needs the
**memory** and **cpus** from the Orchfile, and the containerd toolchain is
**mounted** in as a volume (not baked into an image, which would never fit the
in-RAM initramfs). So this doubles as the proof that orchd-osx honors the full
service spec (memory / cpus / volumes).
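One way to see whether the sizing took effect is to ask the guest what it has. The following is a minimal sketch, not part of the example's scripts: `mem_total_mib` and `cpus_online` are hypothetical helper names, and the reported memory will usually sit a little below the requested `MEMORY` because the kernel reserves some for itself.

```sh
#!/bin/sh
# Sketch (hypothetical helpers, not in the repo): report the memory and CPU
# count a Linux guest actually sees, for comparison with MEMORY / CPUS.

# Print MemTotal in MiB from a meminfo-format file (defaults to /proc/meminfo).
mem_total_mib() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# Print the number of online CPUs.
cpus_online() {
    getconf _NPROCESSORS_ONLN
}

# Only meaningful on Linux, where /proc/meminfo exists.
if [ -r /proc/meminfo ]; then
    echo "memory: $(mem_total_mib) MiB, cpus: $(cpus_online)"
fi
```

Run inside the Debian VM, the printed values can be compared by eye against the `MEMORY 3G` and `CPUS 3` lines in the Orchfile.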

## What's here

| file | role |
|------|------|
| `Orchfile` | the **outer** unit: boot a Debian VM, sized, with the toolchain mounted, running the driver |
| `run-test.sh` | runs **inside** the VM: starts containerd, then has the inner orchd drive it |
| `inner-Orchfile` | the **inner** workload the containerd runtime runs (an alpine container) |
| `stress.sh` / `inner-stress-Orchfile` | the stress variant: repeated grow/fell cycles over several containers, leak-checked against containerd's own state |
| `setup.sh` | stages `tools/` (builds the Linux orchd, fetches containerd + runc) and writes runnable `Orchfile.run` / `Orchfile.stress` |

`tools/` (containerd + runc + the Linux orchd) is fetched/built by
`setup.sh`, not committed.
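The runnable Orchfiles differ from the committed template only in the volume path: `setup.sh` replaces the `__TOOLS__` placeholder with the absolute path of `tools/` using `sed`. A minimal sketch of that substitution on a throwaway template follows; `render_orchfile`, the temp directory, and `/abs/path/tools` are illustrative assumptions, not names from the repo.

```sh
#!/bin/sh
# Sketch: render an Orchfile template by replacing the __TOOLS__ placeholder
# with an absolute path, using the same sed expression setup.sh applies.
# render_orchfile is a hypothetical helper used only for this illustration.
render_orchfile() {
    # $1 = template path, $2 = absolute tools directory
    sed "s|__TOOLS__|$2|" "$1"
}

tmp="$(mktemp -d)"
printf 'SERVICE ctd\nVOLUME __TOOLS__:/opt/tools\n' > "$tmp/Orchfile"
render_orchfile "$tmp/Orchfile" "/abs/path/tools" > "$tmp/Orchfile.run"
cat "$tmp/Orchfile.run"
rm -rf "$tmp"
```

Because `|` is the `sed` delimiter here, a tools path that itself contains `|` or `&` would need escaping first.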

## Run it

```sh
cd examples/inception
./setup.sh # builds the linux orchd, fetches containerd + runc, stages tools/
ORCHD_APPLE_MODE=osx \
  orchd --orchfile Orchfile.run --runtime apple --platform orchdi \
  --state-dir ./state grow
# watch the nested test:
tail -f ./state/logs/orch.ctd.log
```

You should see, from inside the VM: containerd come up, then
`orchd --runtime containerd grow` pull and run the inner alpine container, and
containerd's own `ctr tasks ls` report it RUNNING.

## Stress it

Same VM, but the driver runs repeated grow/fell cycles over several containers
and asserts containerd is left with zero leaked tasks/containers/snapshots
after each teardown:

```sh
ORCHD_APPLE_MODE=osx \
  orchd --orchfile Orchfile.stress --runtime apple --platform orchdi \
  --state-dir ./state grow
tail -f ./state/logs/orch.ctd.log # ends with RESULT: PASS
```

Note: a graceful stop sends SIGTERM and waits out a grace period before SIGKILL,
so each `fell` takes several seconds to settle when the container's PID 1
ignores SIGTERM (the same behavior as `docker stop`'s default). The cycle test
sleeps 16 seconds after each `fell` to wait that out before leak-checking.
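`stress.sh` handles the grace period with a fixed `sleep 16`. A bounded poll is one alternative, sketched below as an illustration rather than as what the script does. `wait_for_zero` is a hypothetical helper that re-runs a counting command until it prints `0` or the attempts run out, and `fake_count` is a stand-in for a real check such as the `ctr -n orch tasks ls | grep -c RUNNING` count the script uses.

```sh
#!/bin/sh
# wait_for_zero ATTEMPTS CMD...: run CMD up to ATTEMPTS times, one second
# apart, until it prints "0". Returns 0 once that happens, 1 if it never does.
# Hypothetical helper for illustration; not part of stress.sh.
wait_for_zero() {
    attempts="$1"; shift
    tries=0
    while [ "$tries" -lt "$attempts" ]; do
        [ "$("$@")" = "0" ] && return 0
        sleep 1
        tries=$((tries + 1))
    done
    return 1
}

# Stand-in for a real count: reports 2 on the first call, 0 afterwards.
state="$(mktemp)"; echo 2 > "$state"
fake_count() { cat "$state"; echo 0 > "$state"; }

if wait_for_zero 5 fake_count; then echo "settled"; else echo "timed out"; fi
rm -f "$state"
```

The poll returns as soon as teardown finishes and still fails after a bounded wait, at the cost of a little more code than a single `sleep`.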

## Requirements

- macOS on Apple silicon, with the orchd-osx runtime built and signed
  (`just build-osx`) and the kernel fetched (`just kernel`).
- For `setup.sh`: `just`, `curl`, and the GitHub CLI (`gh`), plus the
  `just build-linux` prerequisites (rustup, cargo-zigbuild, protoc).
- **~3 GiB of free RAM.** The Orchfile asks for a 3 GiB, 3-CPU VM; containerd
  plus a nested container needs the room. On an 8 GiB machine, close other VMs
  and memory-heavy apps first (`colima stop`, `container system stop`, browsers),
  or the VM start fails with `BootFailed` because the host cannot spare the RAM.
5 changes: 5 additions & 0 deletions examples/inception/inner-Orchfile
@@ -0,0 +1,5 @@
# Inner workload: what orchd-inside-the-VM drives through containerd.
# orchd's containerd runtime pulls and runs it over containerd's gRPC API.
SERVICE web
FROM docker.io/library/alpine:latest
CMD sleep 300
12 changes: 12 additions & 0 deletions examples/inception/inner-stress-Orchfile
@@ -0,0 +1,12 @@
# Stress workload: several containers (mixed images) for the cycle test.
SERVICE a
FROM docker.io/library/alpine:latest
CMD sleep 600

SERVICE b
FROM docker.io/library/alpine:latest
CMD sleep 600

SERVICE c
FROM docker.io/library/busybox:latest
CMD sleep 600
40 changes: 40 additions & 0 deletions examples/inception/run-test.sh
@@ -0,0 +1,40 @@
#!/bin/sh
# Runs INSIDE the orchd-osx VM (Debian). Starts containerd from the mounted
# toolchain, then has our Linux orchd drive it via the in-process containerd
# backend (the container runs in the host netns, so no CNI/iptables). Verbose so
# the detached supervisor's logfile tells the whole story.
set -u
log(){ echo "[inception] $*"; }
export PATH=/opt/tools/bin:$PATH

# containerd pulls images itself; debian-slim ships no CA bundle, so give it one
# (system path + the env Go reads) for registry TLS.
mkdir -p /etc/ssl/certs
cp /opt/tools/ca-bundle.crt /etc/ssl/certs/ca-certificates.crt
export SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt

log "STAGE 0: $(uname -m); cgroup=$(stat -fc %T /sys/fs/cgroup 2>/dev/null)"

log "STAGE 1: start containerd"
mkdir -p /run/containerd /var/lib/containerd
containerd >/var/log/containerd.log 2>&1 &
for i in $(seq 1 20); do ctr version >/dev/null 2>&1 && break; sleep 1; done
if ! ctr version >/dev/null 2>&1; then
  log "containerd did NOT come up:"; tail -25 /var/log/containerd.log; exit 1
fi
log "containerd up: $(ctr --version 2>/dev/null)"

log "STAGE 2: orchd drives containerd via its gRPC API"
mkdir -p /run/orchd
orchd --orchfile /opt/tools/inner-Orchfile --runtime containerd --platform orchdi --state-dir /run/orchd grow
log "orchd grow rc=$?"
sleep 8
log "--- orchd survey (what orchd supervises) ---"
orchd --platform orchdi --state-dir /run/orchd survey
log "--- containerd's own view (ctr), proving the task is real ---"
for n in $(ctr namespaces ls -q 2>/dev/null); do
  log "namespace=$n"; ctr -n "$n" tasks ls 2>&1; ctr -n "$n" containers ls 2>&1
done
log "--- supervisor log ---"; tail -20 /run/orchd/logs/*.log 2>/dev/null
log "=== DONE ==="
sleep 5
59 changes: 59 additions & 0 deletions examples/inception/setup.sh
@@ -0,0 +1,59 @@
#!/usr/bin/env bash
# Stage the inception example: build the static Linux orchd, fetch the container
# runtime (containerd + runc), and lay out tools/ exactly as the Orchfile mounts
# it. Idempotent.
set -euo pipefail

here="$(cd "$(dirname "$0")" && pwd)"
repo="$(cd "$here/../.." && pwd)"
tools="$here/tools"

mkdir -p "$tools/bin"

echo "==> building static aarch64-linux orchd"
( cd "$repo" && just build-linux >/dev/null )
cp "$repo/target/aarch64-unknown-linux-musl/release/orchd" "$tools/bin/orchd"

echo "==> fetching containerd + runc (the runtime)"
if [ ! -e "$tools/bin/containerd" ]; then
  cver="$(gh api repos/containerd/containerd/releases/latest --jq '.tag_name' | sed 's/^v//')"
  echo " containerd ${cver}"
  curl -fsSL "https://github.com/containerd/containerd/releases/download/v${cver}/containerd-${cver}-linux-arm64.tar.gz" \
    | tar -xz -C "$tools" # -> bin/containerd, bin/ctr, bin/containerd-shim-runc-v2
fi
if [ ! -e "$tools/bin/runc" ]; then
  rurl="$(gh api repos/opencontainers/runc/releases/latest \
    --jq '.assets[] | select(.name=="runc.arm64") | .browser_download_url')"
  echo " $rurl"
  curl -fsSL "$rurl" -o "$tools/bin/runc"
  chmod +x "$tools/bin/runc"
fi

echo "==> copying the in-VM driver + inner workloads into tools/"
cp "$here/run-test.sh" "$tools/run-test.sh"
cp "$here/inner-Orchfile" "$tools/inner-Orchfile"
cp "$here/stress.sh" "$tools/stress.sh"
cp "$here/inner-stress-Orchfile" "$tools/inner-stress-Orchfile"

echo "==> staging a CA bundle (debian-slim has none; containerd needs it for registry TLS)"
if [ -f /etc/ssl/cert.pem ]; then
  cp /etc/ssl/cert.pem "$tools/ca-bundle.crt"
else
  curl -fsSL https://curl.se/ca/cacert.pem -o "$tools/ca-bundle.crt"
fi

echo "==> writing runnable Orchfiles (absolute volume path)"
sed "s|__TOOLS__|$tools|" "$here/Orchfile" > "$here/Orchfile.run"
# stress variant: same VM, CMD runs the grow/fell cycle test.
sed "s|__TOOLS__|$tools|; s|/opt/tools/run-test.sh|/opt/tools/stress.sh|" \
"$here/Orchfile" > "$here/Orchfile.stress"

cat <<EOF

staged -> $tools ($(du -sh "$tools" | cut -f1))

Run it:
ORCHD_APPLE_MODE=osx orchd --orchfile "$here/Orchfile.run" \\
--runtime apple --platform orchdi --state-dir "$here/state" grow
tail -f "$here/state/logs/orch.ctd.log"
EOF
45 changes: 45 additions & 0 deletions examples/inception/stress.sh
@@ -0,0 +1,45 @@
#!/bin/sh
# Stress the containerd runtime: repeated grow/fell cycles over multiple
# containers, leak-checked against containerd's own state — but only AFTER
# teardown has actually finished (containerd-run processes gone), so we measure
# the settled state, not mid-grace.
set -u
log(){ echo "[stress] $*"; }
export PATH=/opt/tools/bin:$PATH
mkdir -p /etc/ssl/certs && cp /opt/tools/ca-bundle.crt /etc/ssl/certs/ca-certificates.crt
export SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt
NS=orch

log "start containerd"
mkdir -p /run/containerd /var/lib/containerd
containerd >/var/log/containerd.log 2>&1 &
for i in $(seq 1 20); do ctr version >/dev/null 2>&1 && break; sleep 1; done
ctr version >/dev/null 2>&1 || { log "containerd FAILED"; exit 1; }
log "containerd up"

running(){ ctr -n $NS tasks ls 2>/dev/null | grep -c RUNNING; }
containers(){ ctr -n $NS containers ls -q 2>/dev/null | grep -c . ; }
leaked_snaps(){ ctr -n $NS snapshots ls 2>/dev/null | grep -cE "^orch-[abc] "; }
crun_alive(){ ps -eo args 2>/dev/null | grep -q "[c]ontainerd-run"; }

FAIL=0
for cycle in 1 2; do
  log "===== CYCLE $cycle ====="
  rm -rf /run/orchd; mkdir -p /run/orchd
  orchd --orchfile /opt/tools/inner-stress-Orchfile --runtime containerd --platform orchdi --state-dir /run/orchd grow >/dev/null 2>&1
  sleep 8
  r=$(running); log "grow -> RUNNING=$r (expect 3)"
  [ "$r" = "3" ] || FAIL=1

  orchd --platform orchdi --state-dir /run/orchd fell >/dev/null 2>&1
  log "fell issued; waiting 16s for teardown grace to settle..."
  sleep 16
  rt=$(running); ct=$(containers); sn=$(leaked_snaps)
  log "settled -> running=$rt containers=$ct snaps=$sn (expect 0/0/0)"
  [ "$rt" = "0" ] && [ "$ct" = "0" ] && [ "$sn" = "0" ] || FAIL=1
done

log "===== FINAL ====="
log "tasks:"; ctr -n $NS tasks ls 2>&1
log "RESULT: $([ $FAIL = 0 ] && echo PASS || echo FAIL)"
sleep 3
77 changes: 77 additions & 0 deletions examples/inception/stress2.sh
@@ -0,0 +1,77 @@
#!/bin/sh
# Extended stress for the containerd runtime + orchdi supervisor:
#   TEST 1 fan-out: 6 containers up, clean teardown
#   TEST 2 oneshot: a container that exits is not restarted and is cleaned up
#   TEST 3 crash/restart: RESTART on-failure actually restarts, and fell stops it
#   TEST 4 spec alignment: VOLUME, ENV, and MEMORY (cgroup memory.max) are honored
# One VM boot; leak-checked against containerd's own state.
set -u
log(){ echo "[stress2] $*"; }
export PATH=/opt/tools/bin:$PATH
mkdir -p /etc/ssl/certs && cp /opt/tools/ca-bundle.crt /etc/ssl/certs/ca-certificates.crt
export SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt
NS=orch
running(){ ctr -n $NS tasks ls 2>/dev/null | grep -c RUNNING; }
ctrs(){ ctr -n $NS containers ls -q 2>/dev/null | grep -c . ; }
sd(){ orchd --platform orchdi --state-dir /run/orchd "$@"; }
reset(){ rm -rf /run/orchd; mkdir -p /run/orchd; }

mkdir -p /run/containerd /var/lib/containerd
containerd >/var/log/containerd.log 2>&1 &
for i in $(seq 1 20); do ctr version >/dev/null 2>&1 && break; sleep 1; done
ctr version >/dev/null 2>&1 || { log "containerd FAILED"; exit 1; }
log "containerd up"
FAIL=0

log "===== TEST 1: fan-out (6 containers) ====="
reset
# Start from an empty Orchfile, then append one SERVICE block per container.
: > /run/fan.Orchfile
for n in a b c d e f; do
  printf 'SERVICE %s\nFROM docker.io/library/alpine:latest\nCMD sleep 600\n' "$n" >> /run/fan.Orchfile
done
sd --orchfile /run/fan.Orchfile --runtime containerd grow >/dev/null 2>&1
sleep 16
r=$(running); log "running=$r (expect 6)"; [ "$r" = "6" ] || FAIL=1
sd fell >/dev/null 2>&1; log "fell; settling 16s..."; sleep 16
r=$(running); c=$(ctrs); log "after fell running=$r containers=$c (expect 0/0)"; { [ "$r" = "0" ] && [ "$c" = "0" ]; } || FAIL=1

log "===== TEST 2: oneshot (exits, must NOT restart) ====="
reset
printf 'SERVICE once\nFROM docker.io/library/alpine:latest\nCMD true\nONESHOT true\n' > /run/once.Orchfile
sd --orchfile /run/once.Orchfile --runtime containerd grow >/dev/null 2>&1
sleep 10
r=$(running); c=$(ctrs); log "running=$r containers=$c (expect 0/0 — ran once and cleaned up)"; { [ "$r" = "0" ] && [ "$c" = "0" ]; } || FAIL=1
log "survey (oneshot should not be running):"; sd survey
sd fell >/dev/null 2>&1; sleep 3

log "===== TEST 3: crash/restart (RESTART on-failure) ====="
reset
printf 'SERVICE crash\nFROM docker.io/library/alpine:latest\nCMD false\nRESTART on-failure\nRESTART_DELAY 2s\n' > /run/crash.Orchfile
sd --orchfile /run/crash.Orchfile --runtime containerd grow >/dev/null 2>&1
sleep 20
restarts=$(grep -ch "restart #" /run/orchd/logs/*.log 2>/dev/null | head -1)
log "restarts observed in ~20s: ${restarts:-0} (expect >= 2)"; [ "${restarts:-0}" -ge 2 ] || FAIL=1
before=$(grep -ch "restart #" /run/orchd/logs/*.log 2>/dev/null | head -1)
sd fell >/dev/null 2>&1; log "fell; checking the restart loop stops..."; sleep 10
after=$(grep -ch "restart #" /run/orchd/logs/*.log 2>/dev/null | head -1)
c=$(ctrs); log "after fell: restarts froze (${before:-0} -> ${after:-0}), containers=$c (expect frozen, 0)"
{ [ "${before:-0}" = "${after:-0}" ] && [ "$c" = "0" ]; } || FAIL=1

log "===== TEST 4: spec alignment (volume + env + memory cgroup honored) ====="
reset
mkdir -p /run/vol; echo "VOLUME-OK" > /run/vol/marker
printf 'SERVICE sa\nFROM docker.io/library/alpine:latest\nCMD sleep 600\nMEMORY 64M\nENV FOO=bar\nVOLUME /run/vol:/mnt\n' > /run/sa.Orchfile
sd --orchfile /run/sa.Orchfile --runtime containerd grow >/dev/null 2>&1
sleep 8
r=$(running); log "running=$r (container with MEMORY+VOLUME+ENV; expect 1)"; [ "$r" = "1" ] || FAIL=1
vol=$(ctr -n $NS tasks exec --exec-id v orch-sa cat /mnt/marker 2>/dev/null | tr -d '\r')
log "volume /mnt/marker = '$vol' (expect VOLUME-OK)"; [ "$vol" = "VOLUME-OK" ] || FAIL=1
e=$(ctr -n $NS tasks exec --exec-id e orch-sa printenv FOO 2>/dev/null | tr -d '\r')
log "env FOO = '$e' (expect bar)"; [ "$e" = "bar" ] || FAIL=1
pid=$(ctr -n $NS tasks ls 2>/dev/null | grep orch-sa | awk '{print $2}')
cg=$(awk -F: '{print $3}' /proc/"$pid"/cgroup 2>/dev/null)
mem=$(cat /sys/fs/cgroup"$cg"/memory.max 2>/dev/null)
log "cgroup memory.max = '$mem' (expect 67108864 = 64M)"; [ "$mem" = "67108864" ] || FAIL=1
sd fell >/dev/null 2>&1; sleep 14

log "RESULT: $([ $FAIL = 0 ] && echo PASS || echo FAIL)"
sleep 3
16 changes: 16 additions & 0 deletions justfile
@@ -11,6 +11,7 @@ prefix := env_var_or_default("PREFIX", "/usr/local")
kernel := env_var_or_default("HOME", "") / ".orch/osx/kernel/vmlinux"
kata_ver := "3.31.0"
opt := "ReleaseSafe"
linux_tgt := "aarch64-unknown-linux-musl"

# List recipes.
default:
@@ -26,6 +27,21 @@ build:
build-orchd:
    cargo build --release

# Cross-compile a static aarch64-linux orchd with the containerd backend: drops
# into a Linux VM and drives containerd in process. Feeds examples/inception
# (orchd-osx boots a Linux guest that runs this orchd against containerd). On a
# Linux host, build natively with `--features containerd`; this recipe
# cross-builds from macOS. Needs rustup (`brew install rustup`), cargo-zigbuild
# (`cargo install cargo-zigbuild`), and protoc (`brew install protobuf`).
build-linux:
    #!/usr/bin/env bash
    set -euo pipefail
    export PATH="/opt/homebrew/opt/rustup/bin:$PATH"
    export PROTOC="$(command -v protoc)"
    rustup target add {{linux_tgt}} >/dev/null 2>&1 || true
    cargo zigbuild --target {{linux_tgt}} --release --features containerd
    echo "linux orchd -> target/{{linux_tgt}}/release/orchd ($(file -b target/{{linux_tgt}}/release/orchd | cut -d, -f1-2))"

[macos]
build-apple:
    cd orchd-apple && zig build -Doptimize={{opt}}