
Python Execution System

Active contributors: Duy

Purpose

The Python Execution System is the only component in the Tack Timetable repository that is ever allowed to execute solver code generated by the LLM Coder stage. It is implemented as a hardened sandbox host (python/code_executor.py) that:

- Receives generated Python source and input.json via a throwaway job directory.
- Performs an early py_compile syntax gate.
- Dispatches execution through sandbox/run.py to a strongly isolated runtime (Docker recommended; bubblewrap fallback; gated "none" mode for development only).
- Enforces a strict result.json contract and allows at most one SOLUTION_FOUND marker in stdout.
- Captures structured status (optimal | feasible | infeasible | timeout | crashed), artifacts under .ai_results/, and truncated stdout/stderr.
- Returns a typed ExecutionResult envelope understood by the TypeScript agent.
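
The early syntax gate listed above can be sketched with the standard library's py_compile module. This is a minimal illustration, not the code in python/code_executor.py: the function name `syntax_gate` and its return shape are assumptions made for this example.

```python
import py_compile
import tempfile
from pathlib import Path


def syntax_gate(source_path: Path) -> tuple[bool, str]:
    """Return (ok, error_digest) for a source file without executing it.

    Hypothetical helper: the real host may report errors differently.
    """
    try:
        # doraise=True raises PyCompileError instead of printing to stderr.
        # cfile points the bytecode at a scratch directory so nothing is
        # written next to the generated source.
        with tempfile.TemporaryDirectory() as scratch:
            py_compile.compile(
                str(source_path),
                cfile=str(Path(scratch) / "gate.pyc"),
                doraise=True,
            )
        return True, ""
    except py_compile.PyCompileError as exc:
        return False, exc.msg.strip()
```

Compiling only parses the file, so a solver with a syntax error is rejected before any sandbox is started, and the error text can feed an errorDigest-style field.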

Non-negotiable security rule: LLM-generated solver code is never executed directly on the host machine. The bridge (python-bridge.ts) explicitly refuses local execution and always routes through the host. The host always delegates to the sandbox dispatcher.

Directory layout

Only the files and directories relevant to secure Python execution are listed (paths are repo-root relative):

python/
- code_executor.py — the single secure host entry point (run_user_code, daemon, main, py_compile gate, result.json parsing, status mapping, artifact rotation).
- templates/solver_skeleton.py — authoritative CP-SAT solver skeleton that generated code must extend (synced at build time to public/templates/).
- validator_engine.py — 46 constraint checkers (used by the deterministic validator after execution).
sandbox/
- run.py — dispatcher (run_sandboxed) that chooses the isolation technology based on TT_SANDBOX_MODE env or auto-detect (docker > bwrap > error; none only if TT_SANDBOX_ALLOW_UNSAFE=1).
- executor.py — Docker sandbox implementation (run_in_sandbox, ensure_image_built, strict hardening: --network=none, read-only root, tmpfs, non-root user, capability drops, CPU/memory limits, workspace-only mount).
- bubblewrap_executor.py — lightweight Linux namespace sandbox (run_with_bubblewrap: new mount/PID namespaces, seccomp, bind only workspace + Python site-packages).
- Dockerfile — minimal Python 3.11-slim + ortools image used by the Docker sandbox.
- README.md — historical context and operational guidance for sandbox choices.
electron/
- main.mjs — persistent daemon lifecycle (spawnDaemon, ensureDaemon, runWithDaemon), per-call fallback (spawnPerCall), job directory creation with input.json, python:executeCode IPC handler.
- preload.ts — contextBridge exposure of window.electron.python.executeCode.
src/features/timetable/ai/
- python-bridge.ts — high-level transport selector used by the Local Agent (executeGeneratedCode). Chooses Electron IPC (when preload exposes it) or HTTP POST fallback.
src/app/api/ai/python-execute/
- route.ts — Next.js server route (web / standalone distribution). Creates a unique job temp dir, spawns python3 python/code_executor.py, enforces path safety on resultPath, supports partial result on timeout, performs tree kill on timeout.
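
The selection order described for sandbox/run.py (docker > bwrap > error, with none gated behind TT_SANDBOX_ALLOW_UNSAFE=1) can be sketched as follows. The function name `choose_sandbox_mode`, the use of a PATH lookup for detection, and the rule that none must be requested explicitly are assumptions for illustration; only the two environment variable names and the priority order come from this page.

```python
import os
import shutil
from typing import Callable, Mapping, Optional


def choose_sandbox_mode(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Pick 'docker', 'bwrap', or 'none' (hypothetical helper)."""
    requested = env.get("TT_SANDBOX_MODE", "").strip().lower()
    unsafe_ok = env.get("TT_SANDBOX_ALLOW_UNSAFE") == "1"

    if requested:
        if requested not in {"docker", "bwrap", "none"}:
            raise ValueError(f"unknown TT_SANDBOX_MODE: {requested!r}")
        # Assumption: the unsafe mode is never chosen implicitly.
        if requested == "none" and not unsafe_ok:
            raise RuntimeError("mode 'none' requires TT_SANDBOX_ALLOW_UNSAFE=1")
        return requested

    # Auto-detect: prefer Docker, fall back to bubblewrap, otherwise fail.
    if which("docker"):
        return "docker"
    if which("bwrap"):
        return "bwrap"
    raise RuntimeError("no sandbox runtime found; refusing to run unsandboxed")
```

Passing `env` and `which` as parameters keeps the sketch testable without Docker or bubblewrap installed.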

Key abstractions

| Abstraction | Primary File(s) | Responsibility |
| --- | --- | --- |
| `run_user_code` | `python/code_executor.py` | Create temp workspace, copy or create `input.json`, write `solver_generated.py`, run `py_compile`, invoke `sandbox/run.py:run_sandboxed`, enforce single `SOLUTION_FOUND` marker, parse `result.json`, map raw status, write capped `.ai_results/` artifact, return structured envelope (`phase`, `ok`, `status`, `durationMs`, `resultSummary`, `errorDigest`, `stdout`, `stderr`, optional `resultPath`/`resultData`). |
| `daemon` | `python/code_executor.py` | Persistent newline-JSON worker mode (read jobs from stdin, write results to stdout). Used by Electron desktop for low-latency repeated solves within one session. Respects per-job `solverWorkers` override. |
| `run_sandboxed` | `sandbox/run.py` | Auto-detect (`_auto_mode`) or env-selected dispatch (`TT_SANDBOX_MODE`) to Docker, bwrap, or (unsafe) raw subprocess. Always annotates result with `sandbox` field. |
| `run_in_sandbox` | `sandbox/executor.py` | Build (if needed) and run the `timetable-sandbox:latest` image with maximum hardening. Enforces workspace-only visibility, no network, resource limits, non-root execution. |
| `run_with_bubblewrap` | `sandbox/bubblewrap_executor.py` | Execute via `bwrap` with new mount/PID namespaces, seccomp filter, minimal bind mounts (workspace + Python packages only). Faster startup than Docker; Linux-only. |
| `executeGeneratedCode` | `src/features/timetable/ai/python-bridge.ts` | Public API called by the 6-stage Local Agent. Detects Electron vs web context and routes accordingly. Never executes Python itself. |
| `python:executeCode` IPC | `electron/main.mjs` + `preload.ts` | Desktop transport surface. Manages the long-lived daemon worker or falls back to per-call spawn. Writes `input.json` to a temp job dir before invoking the binary. |
| `POST /api/ai/python-execute` | `src/app/api/ai/python-execute/route.ts` | Web/standalone transport. Creates isolated job temp dir, spawns the executor with strict env and timeout handling, validates `resultPath` to prevent traversal, supports best-effort partial result on timeout. |
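
The newline-JSON worker mode listed for `daemon` can be sketched as a loop that reads one job per line and writes one result per line. The function name `serve_jobs`, the `handle` callback, and the shape of the error envelope are assumptions for illustration; the actual message schema in python/code_executor.py is not documented on this page.

```python
import io
import json
from typing import Callable, TextIO


def serve_jobs(inp: TextIO, out: TextIO, handle: Callable[[dict], dict]) -> None:
    """Read one JSON job per line from inp, write one JSON result per line to out."""
    for line in inp:
        line = line.strip()
        if not line:
            continue  # tolerate blank keep-alive lines
        try:
            job = json.loads(line)
            result = handle(job)
        except Exception as exc:
            # A malformed or failing job must not take the worker down.
            result = {"ok": False, "status": "crashed", "errorDigest": str(exc)}
        # json.dumps never emits raw newlines, so each result stays on one line.
        out.write(json.dumps(result) + "\n")
        out.flush()  # the parent process is blocked waiting for this line


# Usage with in-memory streams; a real worker would pass sys.stdin and sys.stdout.
demo_in = io.StringIO('{"solverWorkers": 2}\n')
demo_out = io.StringIO()
serve_jobs(demo_in, demo_out, lambda job: {"ok": True, "workers": job["solverWorkers"]})
```

Flushing after every result matters because the parent reads line by line; a buffered result would look like a hung worker.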

How it works

Two distinct transports feed the same hardened host:

sequenceDiagram
    participant Agent as Local Agent (browser / renderer)
    participant Bridge as python-bridge.ts
    participant Electron as Electron main process
    participant HTTP as Next.js API route
    participant Host as python/code_executor.py
    participant Disp as sandbox/run.py (dispatcher)
    participant Docker as Docker sandbox
    participant Bwrap as bubblewrap sandbox

    Agent->>Bridge: executeGeneratedCode(code, input, timeoutMs, solverWorkers?)

    alt Electron desktop (preload exposes IPC)
        Bridge->>Electron: ipcRenderer.invoke('python:executeCode', ...)
        Electron->>Electron: runWithDaemon(...) or spawnPerCall(...)
        Electron->>Host: (persistent daemon) JSON line over stdin<br/>or fresh spawn of code_executor binary with job dir
    else Web / standalone / no IPC
        Bridge->>HTTP: fetch('/api/ai/python-execute', { method: 'POST', body: {code, input, timeoutMs, solverWorkers} })
        HTTP->>Host: spawn('python3', ['python/code_executor.py', timeout], { cwd: jobDir, env: ... })
    end

    Host->>Host: write solver_generated.py + input.json (from job dir or cwd)
    Host->>Host: python -m py_compile solver_generated.py (early syntax gate)
    Host->>Disp: run_sandboxed(file_path, timeout, workspace_dir)
    Disp->>Disp: mode = env.TT_SANDBOX_MODE or _auto_mode()
    alt docker
        Disp->>Docker: docker run --rm --network=none --read-only ... -v workspace:/sandbox_workspace ... python solver_generated.py
    else bwrap
        Disp->>Bwrap: bwrap --ro-bind /usr ... --bind workspace /workspace ... python solver_generated.py
    else none (only if TT_SANDBOX_ALLOW_UNSAFE=1)
        Disp->>Disp: raw subprocess (dev only, intentionally painful)
    end
    Docker-->>Disp: stdout, stderr, return_code, (timeout handling)
    Disp-->>Host: {stdout, stderr, return_code, ...}
    Host->>Host: if result.json exists → parse + validate marker count + map status
    Host-->>Electron/HTTP: JSON {phase, ok, status, durationMs, resultData?, errorDigest, stdout(trunc), stderr(trunc), ...}
    Electron/HTTP-->>Bridge: same envelope (plus resultData when available)
    Bridge-->>Agent: ExecutionResult
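
The Docker branch of the diagram can be sketched as an argument list built in Python. The flags `--rm`, `--network=none`, `--read-only`, the `/sandbox_workspace` mount point, and the `timetable-sandbox:latest` image name come from this page; the tmpfs path, user ID, memory and CPU values, and the function name `build_docker_argv` are assumptions chosen for illustration and may differ from sandbox/executor.py.

```python
from pathlib import Path


def build_docker_argv(
    workspace: Path,
    script_name: str = "solver_generated.py",
    memory: str = "1g",  # assumed limit, not taken from the repository
    cpus: str = "2",     # assumed limit, not taken from the repository
) -> list[str]:
    """Build a hardened `docker run` command line (hypothetical helper)."""
    mount = f"{workspace.resolve()}:/sandbox_workspace"
    return [
        "docker", "run", "--rm",
        "--network=none",                        # no network access
        "--read-only",                           # read-only root filesystem
        "--tmpfs", "/tmp",                       # scratch space (assumed path)
        "--cap-drop=ALL",                        # drop all Linux capabilities
        "--security-opt", "no-new-privileges",
        "--user", "1000:1000",                   # non-root (assumed UID:GID)
        f"--memory={memory}",
        f"--cpus={cpus}",
        "-v", mount,                             # the only host path visible
        "-w", "/sandbox_workspace",
        "timetable-sandbox:latest",
        "python", script_name,
    ]
```

Building the command as a list and passing it to `subprocess.run` without `shell=True` keeps the workspace path from being interpreted by a shell.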

Key runtime behaviors:

- The generated solver is expected to read input.json and write result.json with at minimum { "status": "...", "schedule": [...] }.
- The host enforces that SOLUTION_FOUND appears at most once in stdout.
- On timeout, both the host and the web route attempt best-effort partial result extraction.
- Artifacts are rotated (max 50) under .ai_results/.
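
The marker rule, the result.json contract, and the timeout behavior above can be combined into one post-run check. This is a sketch under several assumptions: the function names, the exact error messages, the choice to report contract violations as crashed, and treating only optimal and feasible as ok are all illustrative and not confirmed by this page.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional

MARKER = "SOLUTION_FOUND"
# Assumption: the solver reports one of these; the host sets timeout/crashed.
SOLVER_STATUSES = {"optimal", "feasible", "infeasible"}


def _load_result(workspace: Path) -> Optional[dict]:
    """Return the parsed result.json object, or None if missing or invalid."""
    try:
        data = json.loads((workspace / "result.json").read_text(encoding="utf-8"))
    except (OSError, ValueError):  # missing file, bad encoding, bad JSON
        return None
    return data if isinstance(data, dict) else None


def _crashed(reason: str) -> dict:
    return {"ok": False, "status": "crashed", "errorDigest": reason}


def evaluate_run(workspace: Path, stdout: str, return_code: int, timed_out: bool) -> dict:
    """Map a finished sandbox run to a status envelope (hypothetical helper)."""
    data = _load_result(workspace)
    if timed_out:
        envelope = {"ok": False, "status": "timeout", "errorDigest": "solver timed out"}
        if data is not None:
            envelope["resultData"] = data  # best-effort partial result
        return envelope
    if stdout.count(MARKER) > 1:
        return _crashed(f"{MARKER} printed more than once")
    if return_code != 0:
        return _crashed(f"solver exited with code {return_code}")
    if data is None:
        return _crashed("result.json missing or not a JSON object")
    status = str(data.get("status", "")).lower()
    if status not in SOLVER_STATUSES or not isinstance(data.get("schedule"), list):
        return _crashed("result.json violates the status/schedule contract")
    return {"ok": status in {"optimal", "feasible"}, "status": status,
            "resultData": data, "errorDigest": ""}


with tempfile.TemporaryDirectory() as tmp:
    ws = Path(tmp)
    (ws / "result.json").write_text(json.dumps({"status": "OPTIMAL", "schedule": []}))
    print(evaluate_run(ws, "SOLUTION_FOUND\n", 0, False)["status"])  # prints optimal
```

Checking the timeout case first lets a partial result.json survive even when the process was killed before printing its marker.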

Integration points

Local Agent pipeline — src/features/timetable/ai/local-agent.ts (and the Coder/Repair stages) calls executeGeneratedCode. Failures surface as typed ExecutionResult with status and errorDigest. See systems/ai-pipeline/coder.md.
Architecture — The security model ("AI never runs its own code on the host") and the five-layer diagram are described in overview/architecture.md.
Full AI pipeline context — The 6-stage loop (Translator → Planner → Coder → Sandbox execution → Deterministic Validator → Repair) is documented in systems/ai-pipeline/index.md.
Solver skeleton & validation — Generated code must conform to python/templates/solver_skeleton.py. Post-execution validation uses both validator_engine.py (Python) and deterministic-validator.ts (TypeScript) + CP-SAT round-trip.
Build / packaging — The Electron builder bundles the PyInstaller code_executor binary (from python-dist/) plus Python source fallback. The same code_executor.py source is used by the web server route.

Entry points for modification

Add or change sandbox technology — Extend the dispatch table in sandbox/run.py. Implement a new executor following the existing result shape contract. Update auto-detect logic and sandbox/README.md.
Modify the result contract or status mapping — Edit run_user_code (and the TypeScript ExecutionResult / types.ts definitions). All callers, tests, and prompt expectations must be updated.
Adjust timeouts, parallelism, or resource limits — EXECUTOR_TIMEOUT_SECONDS (env/argv), SOLVER_WORKERS, Docker --memory/--cpus, or bwrap limits. Exposed to the UI via bridge options.
Harden or evolve the Electron daemon — Changes to spawnDaemon, runWithDaemon, or the IPC handler in electron/main.mjs. Must preserve the fallback to per-call spawn.
Web route safety changes — Any modification to job directory handling, resultPath validation, tree-kill logic, or partial-result-on-timeout behavior in src/app/api/ai/python-execute/route.ts must preserve the documented isolation and traversal-prevention properties.
Prompt or skeleton changes that affect generated code shape — Coordinate with the Coder stage (systems/ai-pipeline/coder.md) and run the prompt validation suite (npm run test:prompt).
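
The traversal-prevention property mentioned for resultPath can be illustrated with a containment check. The actual route is TypeScript; the Python sketch below only shows the idea, and the function name `is_inside` and the choice of base directory are assumptions.

```python
from pathlib import Path


def is_inside(base_dir: Path, candidate: str) -> bool:
    """Return True only if candidate resolves to a path within base_dir.

    Hypothetical helper: resolves '..' segments and symlinks before comparing,
    so both relative escapes and absolute paths are rejected.
    """
    base = base_dir.resolve()
    # Joining with an absolute candidate discards base, which the check below catches.
    target = (base / candidate).resolve()
    return target.is_relative_to(base)  # requires Python 3.9+
```

Comparing resolved paths rather than raw strings avoids the classic mistake of accepting `/srv/jobs/abc-evil` because it starts with `/srv/jobs/abc`.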

Key source files

All code references below use full repo-root paths.

| Path | Approx. LOC | Role |
| --- | --- | --- |
| `python/code_executor.py` | ~350 | The sole secure host that executes LLM-generated solver code. Implements `run_user_code`, `daemon` mode, py_compile gate, `result.json` contract, artifacting, and status mapping. |
| `sandbox/run.py` | ~130 | Sandbox dispatcher. Selects Docker, bubblewrap, or (unsafe) raw mode. Auto-detects when `TT_SANDBOX_MODE` is unset. |
| `sandbox/executor.py` | ~300 | Docker sandbox implementation. Builds `timetable-sandbox:latest` on demand. Enforces `--network=none`, read-only root, non-root user, capability drops, resource limits, workspace-only mount. |
| `sandbox/bubblewrap_executor.py` | ~150 | Lightweight Linux bwrap sandbox. New mount + PID namespaces + seccomp. Binds only workspace and Python package roots. |
| `sandbox/Dockerfile` | ~40 | Minimal hardened image (python:3.11-slim + ortools) used by the Docker sandbox path. |
| `sandbox/README.md` | ~200 | Historical motivation and operational guidance for the sandboxing strategy. |
| `electron/main.mjs` | ~280 | Electron main process. Manages persistent `code_executor --daemon` worker, job directory creation, IPC handler for `python:executeCode`, and per-call fallback spawn. |
| `electron/preload.ts` | ~10 | Exposes the Python execution IPC surface to the renderer via contextBridge. |
| `src/features/timetable/ai/python-bridge.ts` | ~60 | High-level bridge used by the Local Agent. Chooses between Electron IPC and HTTP fallback. |
| `src/app/api/ai/python-execute/route.ts` | ~200 | Next.js API route used by web/standalone distributions. Creates isolated job dirs, spawns the executor, enforces path safety, and supports partial results on timeout. |

No other files in the repository are permitted to execute or spawn interpreters for LLM-generated solver code. All execution funnels through the files listed above.
