Code Sandbox
CortexPrism executes code in isolated environments to protect the host system from potentially harmful or buggy code generated by LLMs.

docker run --rm \
--network=none \
--memory=256m \
--cpus=0.5 \
--pids-limit=64 \
--security-opt=no-new-privileges \
<image> <interpreter> /tmp/code.<ext>

- No network access: `--network=none`
- Resource limits: 256MB memory, 0.5 CPU, 64 PIDs max
- No privilege escalation: `--security-opt=no-new-privileges`
- Ephemeral: container destroyed immediately after execution (`--rm`)
- No host mounts: no filesystem access to the host machine
- Timeout: 30 seconds
- Max output: 64KB (configurable via `maxOutputBytes`)
When Docker is not available (docker info fails), CortexPrism falls back to direct subprocess execution. This provides less isolation but retains policy gating through the security validator.
When gVisor is installed, runInDocker passes --runtime=runsc for kernel-level syscall filtering. getAvailableRuntime() auto-detects gVisor availability (cached result) and prefers it over plain Docker.
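As a rough illustration of how the flags above fit together, the following sketch assembles the `docker run` argument list and adds `--runtime=runsc` when gVisor is preferred. The `DockerRunOptions` type and `buildDockerArgs` function are hypothetical names chosen for this example and are not taken from the CortexPrism source.

```typescript
// Hypothetical options type; field names are illustrative only.
interface DockerRunOptions {
  image: string;        // e.g. "python:3.12-alpine"
  interpreter: string;  // e.g. "python"
  ext: string;          // extension of the code file, e.g. "py"
  useGvisor?: boolean;  // add --runtime=runsc when gVisor is available
}

// Build the argument list for `docker run` using the limits documented above.
function buildDockerArgs(opts: DockerRunOptions): string[] {
  const args: string[] = ["run", "--rm"];
  if (opts.useGvisor) {
    args.push("--runtime=runsc");
  }
  args.push(
    "--network=none",
    "--memory=256m",
    "--cpus=0.5",
    "--pids-limit=64",
    "--security-opt=no-new-privileges",
    opts.image,
    opts.interpreter,
    `/tmp/code.${opts.ext}`,
  );
  return args;
}
```

Returning an argument array rather than a single command string lets the caller pass it straight to a process-spawning API without shell quoting.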
| Language | Docker Image |
|---|---|
| Python | python:3.12-alpine |
| JavaScript | node:22-alpine |
| TypeScript | denoland/deno:alpine |
| Bash | alpine:3.20 |
| Ruby | ruby:3.3-alpine |
| Go | golang:1.22-alpine |
| Rust | rust:1.78-alpine |
When code execution fails, CortexPrism can automatically fix and retry:
runInSandbox(code)
→ exit != 0?
→ LLM: "Fix this error: <stderr>\n\nCode:\n<code>"
→ extract code from LLM response
→ runInSandbox(fixedCode)
→ repeat up to maxRounds (default 4)
Enable with --fix flag on cortex run or configure per session.
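The loop above could be sketched roughly as follows. This is a minimal illustration, not the contents of `autofix.ts`: the `RunResult`, `Runner`, and `Fixer` types, the `extractCode` helper, and the `runWithAutoFix` name are all assumptions, and the sandbox runner and LLM client are passed in as plain functions.

```typescript
interface RunResult { exitCode: number; stdout: string; stderr: string; }
type Runner = (code: string) => Promise<RunResult>;   // stands in for runInSandbox
type Fixer = (prompt: string) => Promise<string>;     // stands in for the LLM call

// Three backticks, built at runtime so this sample does not contain a literal fence.
const FENCE = "`".repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + "[^\\n]*\\n([\\s\\S]*?)" + FENCE);

// Take the first fenced block from the LLM reply, or the whole reply if none is found.
function extractCode(reply: string): string {
  const match = reply.match(FENCED_BLOCK);
  return (match ? match[1] : reply).trim();
}

async function runWithAutoFix(
  code: string,
  run: Runner,
  askLlm: Fixer,
  maxRounds = 4,
): Promise<{ result: RunResult; code: string; rounds: number }> {
  let current = code;
  let result = await run(current);
  let rounds = 0;
  // Retry only while the run fails and the fix budget is not exhausted.
  while (result.exitCode !== 0 && rounds < maxRounds) {
    const prompt = `Fix this error: ${result.stderr}\n\nCode:\n${current}`;
    current = extractCode(await askLlm(prompt));
    result = await run(current);
    rounds++;
  }
  return { result, code: current, rounds };
}
```

With `maxRounds = 4`, the code runs at most five times: the initial attempt plus four fix attempts.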
cortex sandbox run script.py # Docker sandbox
cortex sandbox run script.py --no-sandbox # Subprocess mode
cortex sandbox run script.py --fix # Auto-fix on failure
cortex sandbox run script.py --fix --max-fix 6 # Up to 6 fix attempts
cortex sandbox run script.py --sandbox-debug # Enable debug logging

The `code_exec` tool lets agents execute code in the sandbox. The tool description explicitly warns that:
- The sandbox has NO access to host files or workspace
- No package managers are available in the sandbox
- Use file tools for all file operations
Configurable via ~/.cortex/config.json:
{
"sandbox": {
"runtime": "docker",
"languages": ["python", "javascript", "typescript", "bash", "ruby", "go", "rust"],
"timeout": 30000,
"memoryLimit": "512m",
"outputLimit": 102400
}
}

`runtime`: `docker` | `gvisor` (kernel-level syscall filtering via `runsc`) | `subprocess`.
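For readers consuming this configuration from code, a typed loader might look like the sketch below. The `SandboxConfig` interface and `parseSandboxConfig` function are hypothetical. The units in the comments are assumptions inferred from the values shown (a `timeout` of 30000 matching the 30-second limit suggests milliseconds, and `outputLimit` is presumably bytes).

```typescript
type SandboxRuntime = "docker" | "gvisor" | "subprocess";

interface SandboxConfig {
  runtime: SandboxRuntime;
  languages: string[];
  timeout: number;      // assumed to be milliseconds
  memoryLimit: string;  // Docker-style size string, e.g. "512m"
  outputLimit: number;  // assumed to be bytes
}

const RUNTIMES: string[] = ["docker", "gvisor", "subprocess"];

// Parse the JSON text of the config file and check the "sandbox" section.
function parseSandboxConfig(json: string): SandboxConfig {
  const raw = JSON.parse(json);
  const s = raw && raw.sandbox;
  if (!s || typeof s !== "object") {
    throw new Error('missing "sandbox" section');
  }
  if (RUNTIMES.indexOf(s.runtime) === -1) {
    throw new Error(`unsupported runtime: ${String(s.runtime)}`);
  }
  if (!Array.isArray(s.languages)) {
    throw new Error("languages must be an array");
  }
  if (typeof s.timeout !== "number" || s.timeout <= 0) {
    throw new Error("timeout must be a positive number");
  }
  if (typeof s.memoryLimit !== "string") {
    throw new Error("memoryLimit must be a string");
  }
  if (typeof s.outputLimit !== "number" || s.outputLimit <= 0) {
    throw new Error("outputLimit must be a positive number");
  }
  return {
    runtime: s.runtime,
    languages: s.languages.map(String),
    timeout: s.timeout,
    memoryLimit: s.memoryLimit,
    outputLimit: s.outputLimit,
  };
}
```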
| Method | Path | Description |
|---|---|---|
| `POST` | `/api/code/exec` | Execute code in sandbox |
| `GET` | `/api/sandbox/config` | Sandbox configuration (runtime, Docker/gVisor availability, timeout/memory limits, supported languages) |
| `PUT` | `/api/sandbox/config` | Update sandbox config |
| `GET` | `/api/sandbox/backends` | Available backends (docker, gvisor, e2b, daytona) with API-key-based availability |
| `GET` | `/api/sandbox/debug` | Debug status |
| `PUT` | `/api/sandbox/debug` | Toggle sandbox debug logging |
Full environment capture and replay system:
| Method | Path | Description |
|---|---|---|
| `POST` | `/api/sandbox/snapshots` | Capture environment snapshot (env vars, dependencies, git state, sandbox config) to JSON + DB |
| `GET` | `/api/sandbox/snapshots` | List snapshots with optional session filter and sensitive-key masking |
| `GET` | `/api/sandbox/snapshots/:id` | Single snapshot detail with masked env values |
| `POST` | `/api/sandbox/snapshots/:id/replicate` | Replicate snapshot to target workspace (writes commented `.cortex-env-replication.sh`) |
| `GET` | `/api/sandbox/snapshots/compare?id1=&id2=` | Diff two snapshots (env vars + dependencies) |
| `DELETE` | `/api/sandbox/snapshots/:id` | Delete snapshot (file + DB row) |
Security: Env key validation (/^[A-Za-z_][A-Za-z0-9_]*$/), value length limit (1024 chars), sensitive env value masking for keys matching API_KEY|TOKEN|SECRET|PASSWORD|AUTH|CREDENTIAL|PRIVATE_KEY|ACCESS_KEY patterns. Shell-injection-safe env var replication with fully escaped $, backtick, !, \.
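The key validation, length limit, and masking rules above can be sketched as follows. The regular expressions are taken from the text; the function names, the case-insensitive match on sensitive keys, and the `***` placeholder are assumptions made for illustration.

```typescript
const ENV_KEY_RE = /^[A-Za-z_][A-Za-z0-9_]*$/;
// Case-insensitive matching is an assumption; the source only lists the patterns.
const SENSITIVE_RE = /API_KEY|TOKEN|SECRET|PASSWORD|AUTH|CREDENTIAL|PRIVATE_KEY|ACCESS_KEY/i;
const MAX_VALUE_LENGTH = 1024;

// Accept an entry only if the key is a valid identifier and the value is short enough.
function isValidEnvEntry(key: string, value: string): boolean {
  return ENV_KEY_RE.test(key) && value.length <= MAX_VALUE_LENGTH;
}

// Return a copy of the environment with sensitive values replaced by a placeholder.
function maskEnv(env: Record<string, string>): Record<string, string> {
  const masked: Record<string, string> = {};
  for (const key of Object.keys(env)) {
    masked[key] = SENSITIVE_RE.test(key) ? "***" : env[key];
  }
  return masked;
}
```

Rejecting keys that fail `ENV_KEY_RE` before they reach a generated shell script removes a class of injection vectors up front, since a key can then contain only letters, digits, and underscores.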
Point-in-time workspace state capture:
| Method | Path | Description |
|---|---|---|
| `POST` | `/api/workspace/snapshots` | Capture file tree with SHA-256 hashes, git state, memory context, tool state |
| `GET` | `/api/workspace/snapshots` | List snapshots with session filter |
| `GET` | `/api/workspace/snapshots/:id` | Single snapshot with full file tree |
| `POST` | `/api/workspace/snapshots/:id/restore` | Write restore manifest (`.cortex-ws-restore.json`) |
| `GET` | `/api/workspace/snapshots/diff?id1=&id2=` | Diff file trees (added/removed/modified) |
| `DELETE` | `/api/workspace/snapshots/:id` | Delete snapshot |
Files >10 MB skipped with skipped:too-large:<size> placeholder hash. Excludes .git, node_modules, __pycache__, .DS_Store from scans.
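To show how a file-tree diff over hashed snapshots can work, the sketch below compares two trees and reports added, removed, and modified paths. It assumes each snapshot's file tree can be represented as a flat map from path to hash string; the `FileTree` and `TreeDiff` types and the `diffTrees` function are hypothetical and may not match the stored snapshot format.

```typescript
// Path -> SHA-256 hex digest (or a placeholder such as "skipped:too-large:<size>").
type FileTree = Record<string, string>;

interface TreeDiff {
  added: string[];
  removed: string[];
  modified: string[];
}

function has(tree: FileTree, path: string): boolean {
  return Object.prototype.hasOwnProperty.call(tree, path);
}

// Compare two snapshots by path, using the stored hash to detect content changes.
function diffTrees(before: FileTree, after: FileTree): TreeDiff {
  const added: string[] = [];
  const removed: string[] = [];
  const modified: string[] = [];
  for (const path of Object.keys(after)) {
    if (!has(before, path)) added.push(path);
    else if (before[path] !== after[path]) modified.push(path);
  }
  for (const path of Object.keys(before)) {
    if (!has(after, path)) removed.push(path);
  }
  return { added: added.sort(), removed: removed.sort(), modified: modified.sort() };
}
```

Under this scheme a file skipped for size is flagged as modified only when its placeholder string changes, so edits that leave the recorded size unchanged would go undetected.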
Serialize environment config into versioned manifests:
| Method | Path | Description |
|---|---|---|
| `POST` | `/api/sandbox/dev-env/generate` | Auto-detect language, dependencies, setup commands; generate `DevEnvManifest` |
| `GET` | `/api/sandbox/dev-env/manifest?workspacePath=` | Load existing `cortex-devenv.json` |
| `PUT` | `/api/sandbox/dev-env/manifest` | Save/update manifest with validation |
| `GET` | `/api/sandbox/dev-env/list` | List all stored manifests |
Auto-detection of JavaScript (npm/yarn/pnpm/bun), Python (pip), Rust (cargo), Go, Ruby (bundler). Unique default names via SHA-256 hash of workspace path to prevent collisions.
Reproduce issues as sandbox test runs:
| Method | Path | Description |
|---|---|---|
| `POST` | `/api/sandbox/bug-repro` | Create bug repro run from issue title, description, language, and code |
| `GET` | `/api/sandbox/bug-repro` | List runs with optional status/session filters |
| `GET` | `/api/sandbox/bug-repro/:id` | Single run detail with result |
| `POST` | `/api/sandbox/bug-repro/:id/run` | Execute repro in sandbox (docker/subprocess) |
| `DELETE` | `/api/sandbox/bug-repro/:id` | Delete run |
Status lifecycle: queued → running → passed | failed | error. Error handling wraps runInSandbox with try/catch for runtime failures.
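The status lifecycle and try/catch wrapping might be sketched like this. It is an illustration only: the `executeRepro` function, the `ReproOutcome` shape, and the mapping of exit code 0 to `passed` and non-zero to `failed` are assumptions, with the sandbox runner injected as a function.

```typescript
type ReproStatus = "queued" | "running" | "passed" | "failed" | "error";

interface RunResult { exitCode: number; stdout: string; stderr: string; }

interface ReproOutcome {
  status: ReproStatus;
  result?: RunResult;
  error?: string;
}

// Run a repro and map the outcome to a terminal status.
// onStatus lets the caller persist the intermediate "running" state.
async function executeRepro(
  code: string,
  runInSandbox: (code: string) => Promise<RunResult>,
  onStatus: (status: ReproStatus) => void = () => {},
): Promise<ReproOutcome> {
  onStatus("running");
  try {
    const result = await runInSandbox(code);
    // Assumed mapping: a zero exit code counts as "passed", anything else as "failed".
    const status: ReproStatus = result.exitCode === 0 ? "passed" : "failed";
    onStatus(status);
    return { status, result };
  } catch (err) {
    // Runtime failures (for example, the sandbox itself failing to start) become "error".
    onStatus("error");
    return { status: "error", error: err instanceof Error ? err.message : String(err) };
  }
}
```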
src/sandbox/
├── executor.ts # Core execution engine (Docker/subprocess/gVisor/e2b/daytona)
├── agent-sandbox.ts # Docker CLI args builder for agent workspace sandboxes
├── autofix.ts # LLM auto-debug loop (up to 4 fix rounds)
├── replication.ts # Environment snapshots: capture, replicate, compare
├── workspace-snapshot.ts # Workspace snapshots: file tree, git state, restore manifest
├── dev-env-code.ts # Dev env manifests: generate, validate, save, load
├── bug-repro.ts # Bug reproduction: create, execute, list results
├── git-capture.ts # Shared git state capture (branch, HEAD, porcelain status)
├── dependency-detect.ts # Shared dependency detection (JS/Python/Rust/Go/Ruby)
├── environment.ts # Sandbox environment provisioning with language auto-detection
├── logger.ts # Namespaced debug logging (toggleable via env var/CLI/API/WebUI)
├── snapshot-types.ts # TypeScript interfaces for all snapshot/manifest/run types
└── mod.ts # Barrel export
- Security: Sandbox isolation in the security model
- Built-in Tools: The `code_exec` tool and sandbox tools
CortexPrism — Open-source AI agent operating system · Discord · Apache 2.0 License · Built with Deno 2.x + TypeScript