systems python execution
Active contributors: Duy
The Python Execution System is the only component in the Tack Timetable repository that is ever allowed to execute solver code generated by the LLM Coder stage. It is implemented as a hardened sandbox host (python/code_executor.py) that:
- Receives generated Python source and `input.json` via a throw-away job directory.
- Performs an early `py_compile` syntax gate.
- Dispatches execution through `sandbox/run.py` to a strongly isolated runtime (Docker recommended; bubblewrap fallback; gated "none" mode for development only).
- Enforces a strict `result.json` contract plus the single `SOLUTION_FOUND` marker rule.
- Captures structured status (`optimal | feasible | infeasible | timeout | crashed`), artifacts under `.ai_results/`, and truncated stdout/stderr.
- Returns a typed `ExecutionResult` envelope understood by the TypeScript agent.
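The early syntax gate can be sketched with the standard library alone. This is a minimal illustration, not the code in `python/code_executor.py`; the function name `syntax_gate` is hypothetical.

```python
import py_compile
from pathlib import Path
from typing import Optional


def syntax_gate(source_path: Path) -> Optional[str]:
    """Return None when the file compiles, otherwise a short error message.

    Hypothetical helper: only the use of py_compile comes from the docs above.
    """
    try:
        # doraise=True raises PyCompileError instead of printing to stderr.
        py_compile.compile(str(source_path), doraise=True)
    except py_compile.PyCompileError as exc:
        return exc.msg
    return None
```

Rejecting a file at this point avoids paying sandbox start-up cost for code that could never run.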
Non-negotiable security rule: LLM-generated solver code is never executed directly on the host machine. The bridge (python-bridge.ts) explicitly refuses local execution and always routes through the host. The host always delegates to the sandbox dispatcher.
See also:
- overview/architecture.md (security model and layer diagram)
- systems/ai-pipeline/coder.md
- systems/ai-pipeline/index.md
Only the files and directories relevant to secure Python execution are listed (paths are repo-root relative):
- `python/`
  - `code_executor.py`: the single secure host entry point (`run_user_code`, `daemon`, `main`, py_compile gate, `result.json` parsing, status mapping, artifact rotation).
  - `templates/solver_skeleton.py`: authoritative CP-SAT solver skeleton that generated code must extend (synced at build time to `public/templates/`).
  - `validator_engine.py`: 46 constraint checkers (used by the deterministic validator after execution).
- `sandbox/`
  - `run.py`: dispatcher (`run_sandboxed`) that chooses the isolation technology based on the `TT_SANDBOX_MODE` env variable or auto-detection (`docker` > `bwrap` > error; `none` only if `TT_SANDBOX_ALLOW_UNSAFE=1`).
  - `executor.py`: Docker sandbox implementation (`run_in_sandbox`, `ensure_image_built`; strict hardening: `--network=none`, read-only root, tmpfs, non-root user, capability drops, CPU/memory limits, workspace-only mount).
  - `bubblewrap_executor.py`: lightweight Linux namespace sandbox (`run_with_bubblewrap`: new mount/PID namespaces, seccomp, bind only workspace + Python site-packages).
  - `Dockerfile`: minimal Python 3.11-slim + ortools image used by the Docker sandbox.
  - `README.md`: historical context and operational guidance for sandbox choices.
- `electron/`
  - `main.mjs`: persistent daemon lifecycle (`spawnDaemon`, `ensureDaemon`, `runWithDaemon`), per-call fallback (`spawnPerCall`), job directory creation with `input.json`, and the `python:executeCode` IPC handler.
  - `preload.ts`: contextBridge exposure of `window.electron.python.executeCode`.
- `src/features/timetable/ai/`
  - `python-bridge.ts`: high-level transport selector used by the Local Agent (`executeGeneratedCode`). Chooses Electron IPC (when preload exposes it) or the HTTP POST fallback.
- `src/app/api/ai/python-execute/`
  - `route.ts`: Next.js server route (web / standalone distribution). Creates a unique job temp dir, spawns `python3 python/code_executor.py`, enforces path safety on `resultPath`, supports partial results on timeout, and performs a tree kill on timeout.
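The dispatch rule described for `sandbox/run.py` (explicit `TT_SANDBOX_MODE`, else `docker` > `bwrap` > error, with `none` gated behind `TT_SANDBOX_ALLOW_UNSAFE=1`) can be sketched as follows. The function bodies are assumptions for illustration: the real `_auto_mode` may probe more than binary presence (for example, whether the Docker daemon is reachable), and `select_sandbox_mode` is a hypothetical name.

```python
import os
import shutil
from typing import Callable, Mapping, Optional

Which = Callable[[str], Optional[str]]


def _auto_mode(which: Which) -> str:
    """Prefer Docker, then bubblewrap; refuse to run unsandboxed."""
    if which("docker"):
        return "docker"
    if which("bwrap"):
        return "bwrap"
    raise RuntimeError("no sandbox technology available (docker or bwrap)")


def select_sandbox_mode(
    env: Mapping[str, str] = os.environ,
    which: Which = shutil.which,
) -> str:
    """Pick the sandbox mode from the environment, falling back to auto-detect."""
    requested = env.get("TT_SANDBOX_MODE", "").strip().lower()
    mode = requested or _auto_mode(which)
    if mode not in ("docker", "bwrap", "none"):
        raise ValueError(f"unknown TT_SANDBOX_MODE: {mode!r}")
    # The unsafe raw-subprocess mode needs a second, explicit opt-in.
    if mode == "none" and env.get("TT_SANDBOX_ALLOW_UNSAFE") != "1":
        raise RuntimeError("mode 'none' requires TT_SANDBOX_ALLOW_UNSAFE=1")
    return mode
```

Passing `env` and `which` as parameters keeps the selection logic testable without Docker or bubblewrap installed.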
| Abstraction | Primary File(s) | Responsibility |
|---|---|---|
| `run_user_code` | `python/code_executor.py` | Create temp workspace, copy or create `input.json`, write `solver_generated.py`, run `py_compile`, invoke `sandbox/run.py:run_sandboxed`, enforce single `SOLUTION_FOUND` marker, parse `result.json`, map raw status, write capped `.ai_results/` artifact, return structured envelope (`phase`, `ok`, `status`, `durationMs`, `resultSummary`, `errorDigest`, `stdout`, `stderr`, optional `resultPath`/`resultData`). |
| `daemon` | `python/code_executor.py` | Persistent newline-JSON worker mode (read jobs from stdin, write results to stdout). Used by Electron desktop for low-latency repeated solves within one session. Respects per-job `solverWorkers` override. |
| `run_sandboxed` | `sandbox/run.py` | Auto-detect (`_auto_mode`) or env-selected dispatch (`TT_SANDBOX_MODE`) to Docker, bwrap, or (unsafe) raw subprocess. Always annotates result with `sandbox` field. |
| `run_in_sandbox` | `sandbox/executor.py` | Build (if needed) and run the `timetable-sandbox:latest` image with maximum hardening. Enforces workspace-only visibility, no network, resource limits, non-root execution. |
| `run_with_bubblewrap` | `sandbox/bubblewrap_executor.py` | Execute via `bwrap` with new mount/PID namespaces, seccomp filter, minimal bind mounts (workspace + Python packages only). Faster startup than Docker; Linux-only. |
| `executeGeneratedCode` | `src/features/timetable/ai/python-bridge.ts` | Public API called by the 6-stage Local Agent. Detects Electron vs web context and routes accordingly. Never executes Python itself. |
| `python:executeCode` IPC | `electron/main.mjs` + `electron/preload.ts` | Desktop transport surface. Manages the long-lived daemon worker or falls back to per-call spawn. Writes `input.json` to a temp job dir before invoking the binary. |
| `POST /api/ai/python-execute` | `src/app/api/ai/python-execute/route.ts` | Web/standalone transport. Creates isolated job temp dir, spawns the executor with strict env and timeout handling, validates `resultPath` to prevent traversal, supports best-effort partial result on timeout. |
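The newline-JSON worker mode listed for `daemon` follows a common pattern: one JSON object per input line, one JSON object per output line. The sketch below shows that pattern only; `serve_jobs` and the `handle` callback are hypothetical names, and the error envelope reuses the `ok`, `status`, and `errorDigest` fields from the table above.

```python
import json
from typing import Callable, TextIO


def serve_jobs(stdin: TextIO, stdout: TextIO, handle: Callable[[dict], dict]) -> None:
    """Read one JSON job per line and answer with one JSON result per line."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue  # tolerate blank keep-alive lines
        try:
            job = json.loads(line)
            if not isinstance(job, dict):
                raise ValueError("job must be a JSON object")
            result = handle(job)
        except Exception as exc:  # a bad job must not kill the worker
            result = {"ok": False, "status": "crashed", "errorDigest": str(exc)}
        # json.dumps never emits raw newlines, so each result stays on one line.
        stdout.write(json.dumps(result) + "\n")
        stdout.flush()  # the parent process is waiting on this line
```

In a real worker this would be called with `sys.stdin` and `sys.stdout`; flushing after every result matters because the parent process blocks until the line arrives.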
Two distinct transports feed the same hardened host:
sequenceDiagram
participant Agent as Local Agent (browser / renderer)
participant Bridge as python-bridge.ts
participant Electron as Electron main process
participant HTTP as Next.js API route
participant Host as python/code_executor.py
participant Disp as sandbox/run.py (dispatcher)
participant Docker as Docker sandbox
participant Bwrap as bubblewrap sandbox
Agent->>Bridge: executeGeneratedCode(code, input, timeoutMs, solverWorkers?)
alt Electron desktop (preload exposes IPC)
Bridge->>Electron: ipcRenderer.invoke('python:executeCode', ...)
Electron->>Electron: runWithDaemon(...) or spawnPerCall(...)
Electron->>Host: (persistent daemon) JSON line over stdin<br/>or fresh spawn of code_executor binary with job dir
else Web / standalone / no IPC
Bridge->>HTTP: fetch('/api/ai/python-execute', { method: 'POST', body: {code, input, timeoutMs, solverWorkers} })
HTTP->>Host: spawn('python3', ['python/code_executor.py', timeout], { cwd: jobDir, env: ... })
end
Host->>Host: write solver_generated.py + input.json (from job dir or cwd)
Host->>Host: python -m py_compile solver_generated.py (early syntax gate)
Host->>Disp: run_sandboxed(file_path, timeout, workspace_dir)
Disp->>Disp: mode = env.TT_SANDBOX_MODE or _auto_mode()
alt docker
Disp->>Docker: docker run --rm --network=none --read-only ... -v workspace:/sandbox_workspace ... python solver_generated.py
else bwrap
Disp->>Bwrap: bwrap --ro-bind /usr ... --bind workspace /workspace ... python solver_generated.py
else none (only if TT_SANDBOX_ALLOW_UNSAFE=1)
Disp->>Disp: raw subprocess (dev only, intentionally painful)
end
Docker-->>Disp: stdout, stderr, return_code, (timeout handling)
Disp-->>Host: {stdout, stderr, return_code, ...}
Host->>Host: if result.json exists → parse + validate marker count + map status
Host-->>Electron/HTTP: JSON {phase, ok, status, durationMs, resultData?, errorDigest, stdout(trunc), stderr(trunc), ...}
Electron/HTTP-->>Bridge: same envelope (plus resultData when available)
Bridge-->>Agent: ExecutionResult
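The host step "parse + validate marker count + map status" from the diagram can be illustrated roughly as below. This is a sketch under stated assumptions: the raw status strings in `STATUS_MAP`, the rule that only `optimal` and `feasible` count as `ok`, and the names `evaluate_run` and `_failure` are all hypothetical; only the envelope field names and the five status values come from this page.

```python
import json
from pathlib import Path
from typing import Any, Dict

MARKER = "SOLUTION_FOUND"
# Assumed raw solver statuses; the real mapping in run_user_code may differ.
STATUS_MAP = {"OPTIMAL": "optimal", "FEASIBLE": "feasible", "INFEASIBLE": "infeasible"}


def _failure(status: str, digest: str) -> Dict[str, Any]:
    return {"ok": False, "status": status, "errorDigest": digest}


def evaluate_run(workspace: Path, stdout: str, timed_out: bool = False) -> Dict[str, Any]:
    """Turn one sandbox run into a small status envelope."""
    if stdout.count(MARKER) > 1:
        return _failure("crashed", "SOLUTION_FOUND printed more than once")

    result_file = workspace / "result.json"
    if not result_file.exists():
        return _failure("timeout" if timed_out else "crashed", "result.json missing")

    try:
        data = json.loads(result_file.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return _failure("crashed", f"result.json is not valid JSON: {exc.msg}")
    if not isinstance(data, dict) or "status" not in data:
        return _failure("crashed", "result.json lacks a status field")

    status = STATUS_MAP.get(str(data["status"]).upper(), "crashed")
    if timed_out:
        status = "timeout"  # keep the partial result, but report the timeout
    return {
        "ok": status in ("optimal", "feasible"),
        "status": status,
        "errorDigest": None,
        "resultData": data,
    }
```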
Key runtime behaviors:
- The generated solver is expected to read `input.json` and write `result.json` with at minimum `{ "status": "...", "schedule": [...] }`.
- The host enforces that `SOLUTION_FOUND` appears at most once in stdout.
- On timeout, the host (and the web route) attempt best-effort partial result extraction.
- Artifacts are rotated (max 50) under `.ai_results/`.
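Capped artifact rotation of the kind described in the last bullet is typically a "delete the oldest beyond N" pass. The sketch below assumes artifacts are `*.json` files ordered by modification time; the file pattern, the ordering rule, and the name `rotate_artifacts` are assumptions, and only the cap of 50 and the `.ai_results/` directory come from this page.

```python
from pathlib import Path
from typing import List

MAX_ARTIFACTS = 50  # cap stated above


def rotate_artifacts(artifact_dir: Path, keep: int = MAX_ARTIFACTS) -> List[Path]:
    """Delete the oldest artifacts so that at most `keep` remain.

    Returns the removed paths, oldest first.
    """
    files = sorted(
        (p for p in artifact_dir.glob("*.json") if p.is_file()),
        key=lambda p: (p.stat().st_mtime, p.name),  # name breaks mtime ties
    )
    excess = files[: max(len(files) - keep, 0)]
    for path in excess:
        path.unlink()
    return excess
```

A call such as `rotate_artifacts(Path(".ai_results"))` after each write would keep the directory bounded.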
- Local Agent pipeline: `src/features/timetable/ai/local-agent.ts` (and the Coder/Repair stages) calls `executeGeneratedCode`. Failures surface as a typed `ExecutionResult` with `status` and `errorDigest`. See systems/ai-pipeline/coder.md.
- Architecture: The security model ("AI never runs its own code on the host") and the five-layer diagram are described in overview/architecture.md.
- Full AI pipeline context: The 6-stage loop (Translator → Planner → Coder → Sandbox execution → Deterministic Validator → Repair) is documented in systems/ai-pipeline/index.md.
- Solver skeleton & validation: Generated code must conform to `python/templates/solver_skeleton.py`. Post-execution validation uses both `validator_engine.py` (Python) and `deterministic-validator.ts` (TypeScript) + CP-SAT round-trip.
- Build / packaging: The Electron builder bundles the PyInstaller `code_executor` binary (from `python-dist/`) plus a Python source fallback. The same `code_executor.py` source is used by the web server route.
- Add or change sandbox technology: Extend the dispatch table in `sandbox/run.py`. Implement a new executor following the existing result shape contract. Update the auto-detect logic and `sandbox/README.md`.
- Modify the result contract or status mapping: Edit `run_user_code` (and the TypeScript `ExecutionResult` / `types.ts` definitions). All callers, tests, and prompt expectations must be updated.
- Adjust timeouts, parallelism, or resource limits: `EXECUTOR_TIMEOUT_SECONDS` (env/argv), `SOLVER_WORKERS`, Docker `--memory` / `--cpus`, or bwrap limits. Exposed to the UI via bridge options.
- Harden or evolve the Electron daemon: Changes to `spawnDaemon`, `runWithDaemon`, or the IPC handler in `electron/main.mjs`. Must preserve the fallback to per-call spawn.
- Web route safety changes: Any modification to job directory handling, `resultPath` validation, tree-kill logic, or partial-result-on-timeout behavior in `src/app/api/ai/python-execute/route.ts` must preserve the documented isolation and traversal-prevention properties.
- Prompt or skeleton changes that affect generated code shape: Coordinate with the Coder stage (systems/ai-pipeline/coder.md) and run the prompt validation suite (`npm run test:prompt`).
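The traversal-prevention property mentioned for `resultPath` comes down to a containment check: resolve the candidate path and confirm it still sits under the job directory. The real route is TypeScript; the Python sketch below only illustrates the idea, and `is_inside` is a hypothetical helper name.

```python
from pathlib import Path


def is_inside(base_dir: Path, candidate: Path) -> bool:
    """True only if `candidate`, once resolved, lies within `base_dir`.

    Relative candidates are interpreted against base_dir; absolute ones
    are checked as given. resolve() collapses ".." segments and symlinks.
    """
    base = base_dir.resolve()
    target = (base / candidate).resolve()
    return target == base or base in target.parents
```

For example, `is_inside(job_dir, Path("result.json"))` holds, while `Path("../escape.json")` or an absolute path outside the job directory is rejected. Comparing resolved paths rather than raw strings avoids prefix tricks such as `/tmp/job` versus `/tmp/job-evil`.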
All code references below use full repo-root paths.
| Path | Approx. LOC | Role |
|---|---|---|
| `python/code_executor.py` | ~350 | The sole secure host that executes LLM-generated solver code. Implements `run_user_code`, daemon mode, py_compile gate, `result.json` contract, artifacting, and status mapping. |
| `sandbox/run.py` | ~130 | Sandbox dispatcher. Selects Docker, bubblewrap, or (unsafe) raw mode. Auto-detects when `TT_SANDBOX_MODE` is unset. |
| `sandbox/executor.py` | ~300 | Docker sandbox implementation. Builds `timetable-sandbox:latest` on demand. Enforces `--network=none`, read-only root, non-root user, capability drops, resource limits, workspace-only mount. |
| `sandbox/bubblewrap_executor.py` | ~150 | Lightweight Linux bwrap sandbox. New mount + PID namespaces + seccomp. Binds only workspace and Python package roots. |
| `sandbox/Dockerfile` | ~40 | Minimal hardened image (`python:3.11-slim` + ortools) used by the Docker sandbox path. |
| `sandbox/README.md` | ~200 | Historical motivation and operational guidance for the sandboxing strategy. |
| `electron/main.mjs` | ~280 | Electron main process. Manages the persistent `code_executor --daemon` worker, job directory creation, the IPC handler for `python:executeCode`, and the per-call fallback spawn. |
| `electron/preload.ts` | ~10 | Exposes the Python execution IPC surface to the renderer via contextBridge. |
| `src/features/timetable/ai/python-bridge.ts` | ~60 | High-level bridge used by the Local Agent. Chooses between Electron IPC and the HTTP fallback. |
| `src/app/api/ai/python-execute/route.ts` | ~200 | Next.js API route used by web/standalone distributions. Creates isolated job dirs, spawns the executor, enforces path safety, and supports partial results on timeout. |
No other files in the repository are permitted to execute or spawn interpreters for LLM-generated solver code. All execution funnels through the files listed above.
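For orientation, the Docker hardening listed for `sandbox/executor.py` maps onto standard `docker run` flags roughly as follows. This is a sketch, not the project's actual command line: the memory and CPU values, the `1000:1000` user, the `/tmp` tmpfs target, and the helper name `build_docker_argv` are assumptions, while the image name and the `/sandbox_workspace` mount point are taken from this page.

```python
from pathlib import Path
from typing import List


def build_docker_argv(
    workspace: Path,
    script_name: str = "solver_generated.py",
    memory: str = "512m",   # assumed limit
    cpus: str = "1.0",      # assumed limit
) -> List[str]:
    """Assemble a hardened `docker run` argument list for one solver job."""
    return [
        "docker", "run", "--rm",
        "--network=none",            # no network access
        "--read-only",               # read-only root filesystem
        "--tmpfs", "/tmp",           # scratch space that vanishes with the container
        "--user", "1000:1000",       # non-root user (assumed uid:gid)
        "--cap-drop=ALL",            # drop every Linux capability
        "--memory", memory,
        "--cpus", cpus,
        "-v", f"{workspace.resolve()}:/sandbox_workspace",  # the only host mount
        "-w", "/sandbox_workspace",
        "timetable-sandbox:latest",
        "python", script_name,
    ]
```

Building the command as a list and passing it to `subprocess.run` without `shell=True` keeps the workspace path from being interpreted by a shell.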