Big Smooth: bump idle timeout 30 min → 24 h (was masquerading as 'crashes unprompted') #86
Open
brentrager wants to merge 2 commits into
Root-caused the "Big Smooth crashes unprompted" symptom: it wasn't
crashing — it was gracefully shutting down on a 30-minute idle
timer (server.rs:600+). Bench evidence today showed the 30-minute
cliff fired after almost every /loop pause (1800s wakeup intervals),
killing the daemon mid-session and forcing 3+ manual `th up`s.
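The interaction between the timer and the loop pause can be sketched as follows. This is a minimal illustration, not the code in server.rs: `IdleTracker` and its methods are hypothetical names, and the `>=` comparison is an assumption about how the limit is checked.

```rust
use std::time::{Duration, Instant};

/// Hypothetical idle tracker; not the actual type in server.rs.
struct IdleTracker {
    last_activity: Instant,
    /// `None` means the idle shutdown is disabled.
    timeout: Option<Duration>,
}

impl IdleTracker {
    fn new(now: Instant, timeout: Option<Duration>) -> Self {
        Self { last_activity: now, timeout }
    }

    /// Record activity, resetting the idle clock.
    fn touch(&mut self, now: Instant) {
        self.last_activity = now;
    }

    /// True once the daemon has been idle for at least `timeout`
    /// (assumed comparison; the real check may differ).
    fn should_shut_down(&self, now: Instant) -> bool {
        match self.timeout {
            Some(limit) => now.duration_since(self.last_activity) >= limit,
            None => false,
        }
    }
}

fn main() {
    let start = Instant::now();
    let loop_pause = Duration::from_secs(1800); // one /loop wakeup interval
    let wake = start + loop_pause;

    let mut old = IdleTracker::new(start, Some(Duration::from_secs(30 * 60)));
    let new = IdleTracker::new(start, Some(Duration::from_secs(24 * 60 * 60)));

    // After a single 1800s pause the 30-minute limit has been reached...
    println!("30 min limit fires: {}", old.should_shut_down(wake));
    // ...while the 24 h limit has not.
    println!("24 h limit fires: {}", new.should_shut_down(wake));

    // Any activity resets the clock, so the old limit only bites on pauses.
    old.touch(wake);
    println!("30 min limit after activity: {}", old.should_shut_down(wake));
}
```

With a wakeup interval equal to the old limit, every pause lands on the cliff; the 24 h limit leaves a single work session far below it.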
Pi + OpenCode have no daemon → no auto-shutdown → no "crashed
unprompted" symptom. Smooth's daemon model meant every loop pause
was implicitly a kill. This is exactly the kind of competitive-
parity gap the bench was designed to surface (user direction
2026-06-03: "i want smooth to learn from pi and opencode and make
smooth competitive").
24h default keeps a safety net for genuinely forgotten dev
sessions but doesn't fire during a single work session. The
existing `SMOOTH_BIGSMOOTH_IDLE_TIMEOUT_SECS` env override still
works (set to 0 to disable, or to a smaller value to opt back in
to aggressive timeouts). Caveat: the variable must be set in the
daemon's own process, which in sandboxed mode is the safehouse VM
(not the host shell).
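The override semantics described above could look roughly like this. It is a sketch under assumptions, not the daemon's actual parsing code: `idle_timeout_from` and `DEFAULT_IDLE_TIMEOUT_SECS` are hypothetical names, and falling back to the default on an unset or unparsable value is assumed behavior.

```rust
use std::time::Duration;

/// Default from this change: 24 hours, in seconds (hypothetical constant name).
const DEFAULT_IDLE_TIMEOUT_SECS: u64 = 24 * 60 * 60;

/// Hypothetical helper: turn the raw env value into an optional timeout.
/// Assumed: unset or unparsable falls back to the default; "0" disables.
fn idle_timeout_from(raw: Option<&str>) -> Option<Duration> {
    let secs = raw
        .and_then(|v| v.trim().parse::<u64>().ok())
        .unwrap_or(DEFAULT_IDLE_TIMEOUT_SECS);
    if secs == 0 {
        None
    } else {
        Some(Duration::from_secs(secs))
    }
}

fn main() {
    // This reads the environment of the process it runs in. In sandboxed
    // mode that is the safehouse VM, so a value exported only in the host
    // shell would not be visible here.
    let raw = std::env::var("SMOOTH_BIGSMOOTH_IDLE_TIMEOUT_SECS").ok();
    match idle_timeout_from(raw.as_deref()) {
        Some(t) => println!("idle timeout: {}s", t.as_secs()),
        None => println!("idle timeout disabled"),
    }
}
```

Returning `Option<Duration>` keeps "disabled" distinct from any numeric value, so callers cannot mistake 0 for "shut down immediately".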
Smoking-gun log line that closed the diagnosis:
2026-06-03T21:07:17 INFO smooth_bigsmooth::server:
Idle timeout reached (1800s), shutting down
Required rebuild path: scripts/build-safehouse.sh + cp the new
binary to ~/.smooth/runner-bin/safehouse, then th down + th up.
The shadow-bin mechanism (smooth-cli/src/main.rs:1292) bind-mounts
this over the OCI image's safehouse binary so dev iteration on
crates/smooth-bigsmooth doesn't need a full image push.
Documents the existing `th up direct` mode that we'd been overlooking:
boots in ~0.3s on the host instead of the ~30s safehouse-microVM
startup. That's pi/opencode-parity boot time (they're both ~3s).
Bench evidence from today:
  smooth-direct    : 0.850 aggregate, ~0.3s boot
  smooth-sandboxed : 0.789 aggregate, ~30s boot (with variance)
  pi               : 1.000 aggregate, ~3s boot
  opencode         : >=0.93 aggregate, ~3s boot
Direct mode trade-off is no isolation — the agent runs as a host
subprocess against the host filesystem. Fine for dev machines + CI
runners you own + bench harnesses. Sandboxed remains the default
for untrusted dispatch.
Required setup for direct mode (the runner-discovery error message
already tells you, but worth surfacing in docs):
cargo build --release -p smooai-smooth-operator-runner
SMOOTH_OPERATOR_RUNNER_NATIVE=~/.cargo/shared-target/release/smooth-operator-runner th up direct
Pearl follow-ups still open after this:
th-6e361d — pycache run-to-run variance (direct still showed 0.500
on disk-bloat in one run; smooth's nondeterminism isn't
purely a sandbox artifact)
th-e74aa6 — runner-discovery UX paper-cut (separate from this work)
Closes pearl th-1b9b3e.
What was happening
Big Smooth wasn't crashing; it was gracefully exiting on a 30-minute idle timer (server.rs:600+). The bench's /loop 1800s wakeup intervals tripped it every iteration, killing the daemon mid-session.
Smoking gun in the safehouse exec.log:
2026-06-03T21:07:17 INFO smooth_bigsmooth::server: Idle timeout reached (1800s), shutting down
Why this is a pi/opencode-parity fix
Pi + OpenCode have no daemon → no auto-shutdown. Smooth's daemon model meant every loop pause was implicitly a kill. User direction 2026-06-03: "make smooth competitive with [pi + opencode]."
Fix
24h default. Env override (SMOOTH_BIGSMOOTH_IDLE_TIMEOUT_SECS) still works (set to 0 to disable). Caveat: env must be set in the daemon process. In sandboxed mode that's the safehouse VM, not the host shell, so the env-var path needs separate plumbing if anyone wants to tighten the timeout below default in a bench scenario.
Test plan
scripts/build-safehouse.sh; copied to ~/.smooth/runner-bin/safehouse
th down && th up && th status reports healthy