Duy edited this page May 31, 2026 · 1 revision

Python Execution System

Active contributors: Duy

Purpose

The Python Execution System is the only component in the Tack Timetable repository that is ever allowed to execute solver code generated by the LLM Coder stage. It is implemented as a hardened sandbox host (python/code_executor.py) that:

  • Receives generated Python source + input.json via a throw-away job directory.
  • Performs an early py_compile syntax gate.
  • Dispatches execution through sandbox/run.py to a strongly isolated runtime (Docker recommended; bubblewrap fallback; gated "none" mode for development only).
  • Enforces a strict result.json contract and requires that the SOLUTION_FOUND marker appear at most once in stdout.
  • Captures structured status (optimal | feasible | infeasible | timeout | crashed), artifacts under .ai_results/, and truncated stdout/stderr.
  • Returns a typed ExecutionResult envelope understood by the TypeScript agent.
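
The early syntax gate in the list above can be sketched with the standard-library py_compile module. This is a minimal illustration, not the code in python/code_executor.py: the function name syntax_gate and the return shape are hypothetical, and only the solver_generated.py file name comes from this page.

```python
import py_compile
import tempfile
from pathlib import Path


def syntax_gate(source: str, workspace: Path) -> tuple[bool, str]:
    """Byte-compile generated source without executing it.

    Returns (ok, error_text). The name and return shape are illustrative.
    """
    target = workspace / "solver_generated.py"
    target.write_text(source, encoding="utf-8")
    try:
        # doraise=True raises PyCompileError instead of printing to stderr;
        # cfile keeps the compiled artifact inside the throw-away workspace.
        py_compile.compile(
            str(target),
            cfile=str(workspace / "solver_generated.pyc"),
            doraise=True,
        )
    except py_compile.PyCompileError as exc:
        return False, exc.msg
    return True, ""


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(syntax_gate("print('ok')\n", Path(tmp)))
        print(syntax_gate("def broken(:\n", Path(tmp)))
```

Compiling only parses the file, so a syntactically invalid solver is rejected before any sandbox is started.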

Non-negotiable security rule: LLM-generated solver code is never executed directly on the host machine. The bridge (python-bridge.ts) explicitly refuses local execution and always routes through the sandbox host (python/code_executor.py), which in turn always delegates to the sandbox dispatcher (sandbox/run.py).

Directory layout

Only the files and directories relevant to secure Python execution are listed (paths are repo-root relative):

  • python/
    • code_executor.py — the single secure host entry point (run_user_code, daemon, main, py_compile gate, result.json parsing, status mapping, artifact rotation).
    • templates/solver_skeleton.py — authoritative CP-SAT solver skeleton that generated code must extend (synced at build time to public/templates/).
    • validator_engine.py — 46 constraint checkers (used by the deterministic validator after execution).
  • sandbox/
    • run.py — dispatcher (run_sandboxed) that chooses the isolation technology based on TT_SANDBOX_MODE env or auto-detect (docker > bwrap > error; none only if TT_SANDBOX_ALLOW_UNSAFE=1).
    • executor.py — Docker sandbox implementation (run_in_sandbox, ensure_image_built, strict hardening: --network=none, read-only root, tmpfs, non-root user, capability drops, CPU/memory limits, workspace-only mount).
    • bubblewrap_executor.py — lightweight Linux namespace sandbox (run_with_bubblewrap: new mount/PID namespaces, seccomp, bind only workspace + Python site-packages).
    • Dockerfile — minimal Python 3.11-slim + ortools image used by the Docker sandbox.
    • README.md — historical context and operational guidance for sandbox choices.
  • electron/
    • main.mjs — persistent daemon lifecycle (spawnDaemon, ensureDaemon, runWithDaemon), per-call fallback (spawnPerCall), job directory creation with input.json, python:executeCode IPC handler.
    • preload.ts — contextBridge exposure of window.electron.python.executeCode.
  • src/features/timetable/ai/
    • python-bridge.ts — high-level transport selector used by the Local Agent (executeGeneratedCode). Chooses Electron IPC (when preload exposes it) or HTTP POST fallback.
  • src/app/api/ai/python-execute/
    • route.ts — Next.js server route (web / standalone distribution). Creates a unique job temp dir, spawns python3 python/code_executor.py, enforces path safety on resultPath, supports partial result on timeout, performs tree kill on timeout.
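
The selection order described for sandbox/run.py (explicit TT_SANDBOX_MODE, otherwise docker > bwrap > error, with "none" gated behind TT_SANDBOX_ALLOW_UNSAFE=1) could look roughly like the sketch below. The function name select_sandbox_mode and the use of shutil.which for detection are assumptions; the real _auto_mode may probe the runtimes differently.

```python
import os
import shutil
from typing import Callable, Mapping, Optional

VALID_MODES = {"docker", "bwrap", "none"}


def select_sandbox_mode(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Pick an isolation mode: env override first, then auto-detect."""
    mode = env.get("TT_SANDBOX_MODE", "").strip().lower()
    if not mode:
        # Auto-detect (assumed to mean "binary is on PATH"): docker > bwrap > error.
        if which("docker"):
            mode = "docker"
        elif which("bwrap"):
            mode = "bwrap"
        else:
            raise RuntimeError("no sandbox runtime found (need docker or bwrap)")
    if mode not in VALID_MODES:
        raise ValueError(f"unknown TT_SANDBOX_MODE: {mode!r}")
    # The unsafe raw-subprocess mode must be opted into explicitly.
    if mode == "none" and env.get("TT_SANDBOX_ALLOW_UNSAFE") != "1":
        raise PermissionError("mode 'none' requires TT_SANDBOX_ALLOW_UNSAFE=1")
    return mode
```

Passing env and which as parameters keeps the policy testable without Docker or bubblewrap installed.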

Key abstractions

| Abstraction | Primary file(s) | Responsibility |
| --- | --- | --- |
| run_user_code | python/code_executor.py | Create temp workspace, copy or create input.json, write solver_generated.py, run py_compile, invoke sandbox/run.py:run_sandboxed, enforce single SOLUTION_FOUND marker, parse result.json, map raw status, write capped .ai_results/ artifact, return structured envelope (phase, ok, status, durationMs, resultSummary, errorDigest, stdout, stderr, optional resultPath/resultData). |
| daemon | python/code_executor.py | Persistent newline-JSON worker mode (read jobs from stdin, write results to stdout). Used by Electron desktop for low-latency repeated solves within one session. Respects per-job solverWorkers override. |
| run_sandboxed | sandbox/run.py | Auto-detect (_auto_mode) or env-selected dispatch (TT_SANDBOX_MODE) to Docker, bwrap, or (unsafe) raw subprocess. Always annotates result with sandbox field. |
| run_in_sandbox | sandbox/executor.py | Build (if needed) and run the timetable-sandbox:latest image with maximum hardening. Enforces workspace-only visibility, no network, resource limits, non-root execution. |
| run_with_bubblewrap | sandbox/bubblewrap_executor.py | Execute via bwrap with new mount/PID namespaces, seccomp filter, minimal bind mounts (workspace + Python packages only). Faster startup than Docker; Linux-only. |
| executeGeneratedCode | src/features/timetable/ai/python-bridge.ts | Public API called by the 6-stage Local Agent. Detects Electron vs web context and routes accordingly. Never executes Python itself. |
| python:executeCode IPC | electron/main.mjs + preload.ts | Desktop transport surface. Manages the long-lived daemon worker or falls back to per-call spawn. Writes input.json to a temp job dir before invoking the binary. |
| POST /api/ai/python-execute | src/app/api/ai/python-execute/route.ts | Web/standalone transport. Creates isolated job temp dir, spawns the executor with strict env and timeout handling, validates resultPath to prevent traversal, supports best-effort partial result on timeout. |
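
The daemon row describes a newline-delimited JSON protocol over stdin/stdout. A minimal sketch of such a loop follows, assuming one JSON object per line in each direction; serve_jobs, the handle_job callback, and the shape of the error reply are hypothetical stand-ins for whatever code_executor.py actually does.

```python
import json
from typing import Callable, TextIO


def serve_jobs(
    stdin: TextIO,
    stdout: TextIO,
    handle_job: Callable[[dict], dict],
) -> None:
    """Read one JSON job per line and answer with one JSON result per line."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue  # tolerate blank keep-alive lines
        try:
            job = json.loads(line)
            result = handle_job(job)
        except Exception as exc:
            # A bad job must not kill the long-lived worker (assumed policy).
            result = {"ok": False, "status": "crashed", "errorDigest": str(exc)}
        # json.dumps emits no raw newlines by default, so each reply is one line.
        stdout.write(json.dumps(result) + "\n")
        stdout.flush()  # the parent process waits on this line
```

In a real worker, handle_job would wrap run_user_code and the streams would be sys.stdin and sys.stdout; flushing after every reply matters because the parent reads line by line.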

How it works

Two distinct transports feed the same hardened host:

sequenceDiagram
    participant Agent as Local Agent (browser / renderer)
    participant Bridge as python-bridge.ts
    participant Electron as Electron main process
    participant HTTP as Next.js API route
    participant Host as python/code_executor.py
    participant Disp as sandbox/run.py (dispatcher)
    participant Docker as Docker sandbox
    participant Bwrap as bubblewrap sandbox

    Agent->>Bridge: executeGeneratedCode(code, input, timeoutMs, solverWorkers?)

    alt Electron desktop (preload exposes IPC)
        Bridge->>Electron: ipcRenderer.invoke('python:executeCode', ...)
        Electron->>Electron: runWithDaemon(...) or spawnPerCall(...)
        Electron->>Host: (persistent daemon) JSON line over stdin<br/>or fresh spawn of code_executor binary with job dir
    else Web / standalone / no IPC
        Bridge->>HTTP: fetch('/api/ai/python-execute', { method: 'POST', body: {code, input, timeoutMs, solverWorkers} })
        HTTP->>Host: spawn('python3', ['python/code_executor.py', timeout], { cwd: jobDir, env: ... })
    end

    Host->>Host: write solver_generated.py + input.json (from job dir or cwd)
    Host->>Host: python -m py_compile solver_generated.py (early syntax gate)
    Host->>Disp: run_sandboxed(file_path, timeout, workspace_dir)
    Disp->>Disp: mode = env.TT_SANDBOX_MODE or _auto_mode()
    alt docker
        Disp->>Docker: docker run --rm --network=none --read-only ... -v workspace:/sandbox_workspace ... python solver_generated.py
    else bwrap
        Disp->>Bwrap: bwrap --ro-bind /usr ... --bind workspace /workspace ... python solver_generated.py
    else none (only if TT_SANDBOX_ALLOW_UNSAFE=1)
        Disp->>Disp: raw subprocess (dev only, intentionally painful)
    end
    Docker-->>Disp: stdout, stderr, return_code, (timeout handling)
    Disp-->>Host: {stdout, stderr, return_code, ...}
    Host->>Host: if result.json exists → parse + validate marker count + map status
    Host-->>Electron/HTTP: JSON {phase, ok, status, durationMs, resultData?, errorDigest, stdout(trunc), stderr(trunc), ...}
    Electron/HTTP-->>Bridge: same envelope (plus resultData when available)
    Bridge-->>Agent: ExecutionResult
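
For the docker branch of the diagram, the hardening properties named on this page (no network, read-only root, tmpfs, non-root user, capability drops, CPU/memory limits, workspace-only mount) map onto standard docker run flags. The sketch below only builds the argument list; the concrete values (user 65534, 1g, 2 CPUs, /tmp as the tmpfs target) and the function name are assumptions, not the settings in sandbox/executor.py.

```python
from pathlib import Path
from typing import List

IMAGE = "timetable-sandbox:latest"  # image name as given on this page


def build_docker_argv(
    workspace: Path,
    script_name: str = "solver_generated.py",
    memory: str = "1g",   # assumed limit
    cpus: str = "2",      # assumed limit
) -> List[str]:
    """Return a hardened `docker run` argument list for one solver job."""
    return [
        "docker", "run", "--rm",
        "--network=none",            # no network access
        "--read-only",               # read-only root filesystem
        "--tmpfs", "/tmp",           # scratch space that is not persisted
        "--user", "65534:65534",     # non-root (nobody); assumed uid/gid
        "--cap-drop=ALL",            # drop every Linux capability
        f"--memory={memory}",
        f"--cpus={cpus}",
        # Only the job workspace is visible inside the container.
        "-v", f"{workspace.resolve()}:/sandbox_workspace",
        "-w", "/sandbox_workspace",
        IMAGE,
        "python", script_name,
    ]
```

The resulting list can be passed to subprocess.run with a timeout; building it as a list rather than a shell string avoids quoting problems with workspace paths.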

Key runtime behaviors:

  • The generated solver is expected to read input.json and write result.json with at minimum { "status": "...", "schedule": [...] }.
  • The host enforces that SOLUTION_FOUND appears at most once in stdout.
  • On timeout the host (and web route) attempt best-effort partial result extraction.
  • Artifacts are rotated (max 50) under .ai_results/.
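
The marker rule, result.json contract, and timeout behavior above can be combined into one classification step. The following is a sketch under stated assumptions: classify_run and load_result are hypothetical names, the exact status mapping in run_user_code may differ, and treating "infeasible" as ok=False is a guess.

```python
import json
from pathlib import Path
from typing import Optional

MARKER = "SOLUTION_FOUND"
SOLVER_STATUSES = {"optimal", "feasible", "infeasible"}


def load_result(workspace: Path) -> Optional[dict]:
    """Return the parsed result.json, or None if it is missing or malformed."""
    try:
        text = (workspace / "result.json").read_text(encoding="utf-8")
        data = json.loads(text)
    except (OSError, ValueError):
        return None
    return data if isinstance(data, dict) else None


def classify_run(workspace: Path, stdout: str, return_code: int, timed_out: bool) -> dict:
    """Map one sandbox run onto a small status envelope."""
    data = load_result(workspace)
    if timed_out:
        # Best effort: return whatever partial result the solver wrote.
        return {"ok": False, "status": "timeout", "resultData": data}
    if stdout.count(MARKER) > 1:
        return {"ok": False, "status": "crashed",
                "errorDigest": f"{MARKER} appeared more than once in stdout"}
    if return_code != 0 or data is None:
        return {"ok": False, "status": "crashed",
                "errorDigest": "non-zero exit or missing/invalid result.json"}
    status = str(data.get("status", "")).lower()
    if status not in SOLVER_STATUSES or "schedule" not in data:
        return {"ok": False, "status": "crashed",
                "errorDigest": "result.json violates the status/schedule contract"}
    # Assumption: an infeasible model is a valid answer but not a success.
    return {"ok": status != "infeasible", "status": status, "resultData": data}
```

Checking the timeout first means a partially written result.json is still surfaced to the caller instead of being discarded as a crash.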

Integration points

  • Local Agent pipeline: src/features/timetable/ai/local-agent.ts (and the Coder/Repair stages) calls executeGeneratedCode. Failures surface as a typed ExecutionResult with status and errorDigest. See systems/ai-pipeline/coder.md.
  • Architecture — The security model ("AI never runs its own code on the host") and the five-layer diagram are described in overview/architecture.md.
  • Full AI pipeline context — The 6-stage loop (Translator → Planner → Coder → Sandbox execution → Deterministic Validator → Repair) is documented in systems/ai-pipeline/index.md.
  • Solver skeleton & validation — Generated code must conform to python/templates/solver_skeleton.py. Post-execution validation uses both validator_engine.py (Python) and deterministic-validator.ts (TypeScript) + CP-SAT round-trip.
  • Build / packaging — The Electron builder bundles the PyInstaller code_executor binary (from python-dist/) plus Python source fallback. The same code_executor.py source is used by the web server route.

Entry points for modification

  • Add or change sandbox technology — Extend the dispatch table in sandbox/run.py. Implement a new executor following the existing result shape contract. Update auto-detect logic and sandbox/README.md.
  • Modify the result contract or status mapping — Edit run_user_code (and the TypeScript ExecutionResult / types.ts definitions). All callers, tests, and prompt expectations must be updated.
  • Adjust timeouts, parallelism, or resource limits: EXECUTOR_TIMEOUT_SECONDS (env/argv), SOLVER_WORKERS, Docker --memory/--cpus, or bwrap limits. These are exposed to the UI via bridge options.
  • Harden or evolve the Electron daemon — Changes to spawnDaemon, runWithDaemon, or the IPC handler in electron/main.mjs. Must preserve the fallback to per-call spawn.
  • Web route safety changes — Any modification to job directory handling, resultPath validation, tree-kill logic, or partial-result-on-timeout behavior in src/app/api/ai/python-execute/route.ts must preserve the documented isolation and traversal-prevention properties.
  • Prompt or skeleton changes that affect generated code shape — Coordinate with the Coder stage (systems/ai-pipeline/coder.md) and run the prompt validation suite (npm run test:prompt).
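
The traversal-prevention property mentioned for resultPath amounts to checking that a resolved path stays inside the job directory. The web route itself is TypeScript; the Python sketch below only illustrates the idea, and is_result_path_safe is a hypothetical helper, not a function from the repository.

```python
from pathlib import Path


def is_result_path_safe(job_dir: Path, result_path: str) -> bool:
    """True only if result_path resolves to a location strictly inside job_dir."""
    base = job_dir.resolve()
    # Joining with an absolute result_path discards base, so absolute
    # paths outside the job directory are rejected by the check below.
    target = (base / result_path).resolve()
    # resolve() follows symlinks and collapses "..", so escapes via either
    # route end up outside base and fail the ancestry test.
    return base in target.parents
```

Any change to the route's validation should keep an equivalent invariant: the comparison has to happen on resolved paths, not on the raw string supplied by the caller.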

Key source files

All code references below use full repo-root paths.

| Path | Approx. LOC | Role |
| --- | --- | --- |
| python/code_executor.py | ~350 | The sole secure host that executes LLM-generated solver code. Implements run_user_code, daemon mode, py_compile gate, result.json contract, artifacting, and status mapping. |
| sandbox/run.py | ~130 | Sandbox dispatcher. Selects Docker, bubblewrap, or (unsafe) raw mode. Auto-detects when TT_SANDBOX_MODE is unset. |
| sandbox/executor.py | ~300 | Docker sandbox implementation. Builds timetable-sandbox:latest on demand. Enforces --network=none, read-only root, non-root user, capability drops, resource limits, workspace-only mount. |
| sandbox/bubblewrap_executor.py | ~150 | Lightweight Linux bwrap sandbox. New mount + PID namespaces + seccomp. Binds only workspace and Python package roots. |
| sandbox/Dockerfile | ~40 | Minimal hardened image (python:3.11-slim + ortools) used by the Docker sandbox path. |
| sandbox/README.md | ~200 | Historical motivation and operational guidance for the sandboxing strategy. |
| electron/main.mjs | ~280 | Electron main process. Manages persistent code_executor --daemon worker, job directory creation, IPC handler for python:executeCode, and per-call fallback spawn. |
| electron/preload.ts | ~10 | Exposes the Python execution IPC surface to the renderer via contextBridge. |
| src/features/timetable/ai/python-bridge.ts | ~60 | High-level bridge used by the Local Agent. Chooses between Electron IPC and HTTP fallback. |
| src/app/api/ai/python-execute/route.ts | ~200 | Next.js API route used by web/standalone distributions. Creates isolated job dirs, spawns the executor, enforces path safety, and supports partial results on timeout. |

No other files in the repository are permitted to execute or spawn interpreters for LLM-generated solver code. All execution funnels through the files listed above.
