Security

Security Model

CortexPrism uses a Parallax security model with a three-layer LLM-based access control system for protecting sensitive data from unauthorized agent access. v0.53.0 added multi-user authentication, API token management, and instance federation security.

Policy settings

Architecture

Agent → Tool Intent → Policy Validator → (Sensitive?) → LLM Supervisor → Human Approval → Executor
                            │
                    [regex allow/deny rules]
                    [capability level (CPL)]
                    [optional human approval]

The system has two complementary security paths:

Policy validation — all tool calls are evaluated against regex rules before execution
LLM supervisor — sensitive data access triggers a 3-layer review (classification → LLM review → human approval)

Policy Validator

Every tool call an agent makes is evaluated before execution:

The agent emits a tool intent (e.g. shell("rm -rf /tmp/cache"))
The validator evaluates the intent against all active policy rules
The intent is either approved, denied, or held for human approval

Default Deny Rules

Seeded on first cortex db migrate:

Pattern	Blocks
`rm\s+-rf\s+/`	Recursive delete from root
`:\{.*\}`	Fork bomb patterns
`dd\s+if=.*of=/dev/`	Direct disk writes
`chmod\s+777\s+/`	World-write on filesystem root

Managing Rules

cortex policy list
cortex policy add "curl.*evil\.com" --kind shell --effect deny --reason "Blocked domain"
cortex policy check shell "rm -rf /etc"
cortex policy remove pol_abc123

Rules are evaluated by priority (ASC order). The first matching rule wins. If no rule matches, the default is to allow.

LLM Security Supervisor

For sensitive data access (memory search, database queries, browser screenshots, etc.), CortexPrism implements a 3-layer access control system:

Layer 1: Data Classification
  - Classify data as SECRET, SENSITIVE, NORMAL, or PUBLIC
  - Pattern-based detection (passwords, API keys, PII, SSNs, credit cards)
  - PUBLIC/NORMAL → allow immediately

Layer 2: LLM Supervisor
  - Fast model review (Gemini 2.0 Flash, GPT-4o Mini)
  - Decision caching per session (1-hour TTL) to reduce costs
  - Confidence scoring (0.0-1.0)
  - High confidence → auto-allow; low confidence → escalate to human

Layer 3: Human Approval
  - CLI: Color-coded interactive prompt with reasoning
  - Web UI: Modal dialog with sample data preview
  - Temporary grants (1-hour TTL per session + tool)
  - Timeout after 60s → auto-deny

Configuration

In ~/.cortex/config.json:

{
  "supervisor": {
    "provider": "google",
    "model": "gemini-2.0-flash",
    "cacheTTL": 3600
  }
}

See Security Supervisor for the full architecture guide.

Multi-User Authentication (v0.53.0+)

CortexPrism supports multi-user collaboration with:

PBKDF2 password hashing (200,000 iterations, SHA-256, per-user random salt) for user credentials
SHA-256 hashed API tokens stored in the user_tokens table — used for persistent CLI auth via Authorization: Bearer header
Team-based authorization — agents, vault entries, memory, and services are scoped to users and teams
Instance federation — pairing tokens enable trust between Cortex instances; federation_peers table tracks established relationships
Resource sharing — resource_shares table with ownership validation; users can share agents, sessions, and workspace with other users
Authorization guards — requireInstanceAdmin(), requireTeamAdmin(), requireTeamMember(), requireResourceOwner() enforce coarse permission checks on all API routes

AES-256-GCM Vault

API keys and credentials are stored encrypted using:

AES-256-GCM symmetric encryption
PBKDF2 key derivation (200,000 iterations, SHA-256, per-installation random salt)
Passphrase supplied via CORTEX_VAULT_KEY env var (never stored on disk)

export CORTEX_VAULT_KEY="your-passphrase"
cortex vault store "openai-key" --service openai
cortex vault get "openai-key"

Once vaulted, credentials are removed from config.json plain text. Provider API keys in config are encrypted with AES-256-GCM before writing to disk. Access logging is fire-and-forget to prevent failures from blocking credential retrieval. Usage limits, expiration, and allowed-agent enforcement are checked before decryption.

Cortex Lens (Audit Log)

Append-only audit log in lens.db tracking 35+ event types:

Every LLM call (provider, model, token counts, cost)
Every tool call (name, arguments, result, policy decision)
Every policy evaluation (rule matched, effect, reason)
Every security supervisor decision (classification, LLM review, human approval)
Session start/end events
MQM predictions, observations, and weight updates
Node events (connected, disconnected, heartbeat, directives)

Visible in the Web UI under the Activity tab and queryable via GET /api/lens/recent.

CPL (Capability Level)

YAML-based policy files defining capability boundaries:

version: 1
description: "Custom security policy"
rules:
  - kind: shell
    effect: deny
    pattern: "rm\\s+-rf\\s+/"
    reason: "Prevent recursive root delete"
    priority: 1
  - kind: domain
    effect: allow
    pattern: "api\\.github\\.com"
    reason: "Allow GitHub API access"
    priority: 10

Loaded via cortex policy or auto-loaded from .cortex/policy.yml. The CPL YAML editor in the Web UI supports live import with validation.

Code Sandbox Isolation

Code execution runs in ephemeral Docker/gVisor containers with:

No network access by default
CPU and memory resource limits
No host filesystem mounts
Container destroyed immediately after execution
gVisor (--runtime=runsc) for kernel-level syscall filtering when available

Subprocess fallback available for systems without Docker (less isolation, retains policy gating).

No Telemetry

CortexPrism collects no telemetry. No usage data, prompts, or credentials are ever sent to external servers. Data stays on your machine.

Known Limitations

Policy validator operates on intent strings — best-effort filter, not OS-level sandboxing
LLM prompt injection through untrusted content is a risk — review tool approvals carefully
Subprocess code execution has no container isolation — use Docker/gVisor for untrusted code
LLM supervisor adds latency (~200-500ms) and token costs per sensitive data access

Reporting Vulnerabilities

See the Security Policy for the responsible disclosure process.

LLM Vulnerability Scanner (#136)

POST /api/security/scan analyzes prompts and outputs for:

Prompt injection — system impersonation, instruction override, format injection
Data leaks — passwords, secrets, tokens, API keys in output
Destructive commands — DROP TABLE, rm -rf, shutdown
Unsafe patterns — curl-pipe-bash
Code injection — eval with user input
XSS vectors — innerHTML, dangerouslySetInnerHTML
SQL injection — concatenated SQL with request parameters

Returns findings with severity levels (critical, high, medium) and an overall risk score.

Credential Hygiene Monitor (#142)

GET /api/security/hygiene checks the vault for:

Duplicate credential names
Namespace conventions (suggested for api_key types)
Total count warnings (>50 credentials)

Returns a 0-100 hygiene score and categorized issues.

Zero-Trust Policy Generator (#274)

GET /api/security/policies/generate-allowlist generates path/domain allow-lists from enabled allow policy rules, suitable for ingress/egress firewall configuration.

Bulk Approval (#254)

POST /api/security/approvals/bulk accepts multiple request IDs with a single approve/deny action.

Sandbox Security Extensions

See Code Sandbox for:

Environment Replication Debugger (#79) — capture and replay development environments with sensitive-key masking
Workspace Context Snapshot (#240) — point-in-time workspace state capture with SHA-256 hashing
Dev Environment as Code (#232) — serialize environment config into versioned manifests
Bug Reproduction Studio (#230) — reproduce issues as sandbox test runs
Remote Sandbox Backends (#257) — E2B and Daytona cloud sandbox support
Path traversal prevention on all sandbox endpoints

Dynamic Tool Permission Grant (#62, v0.43.0, updated v0.49.0)

Per-task tool permission evaluation replacing static allow/deny. Implementation in packages/gate/src/security/.

Four decisions: granted, granted_with_guardrails, denied, requires_approval
Risk profiles for 13 tool categories with default guardrails
Temporary grants with session-scoped caching
Lens audit logging for every grant decision

Tool Approval Workflow Engine (#254, v0.43.0)

Structured approval pipeline for high-risk tool executions:

Auto-approval for low-risk operations with configurable threshold
Webhook-based channel notifications with approve/deny URLs
5-minute timeout with automatic expiration
Unified with Dynamic Grant into the "Agentic Tool Governance" stack

Data Loss Prevention Guard (#137, v0.43.0)

21 built-in scanners for comprehensive sensitive data detection:

AWS access/secret keys, GitHub tokens/PATs, OpenAI/Anthropic/Google API keys
JWTs, private keys (RSA/EC/DSA), PEM certificates
Database connection strings, Slack/Discord tokens
Credit cards, SSNs, emails, IPs, password fields
Three action levels: monitor, redact, block
Fire-and-forget lens audit logging

AI Guardrails & Content Safety (#179, v0.43.0)

Pluggable content safety middleware with 5 built-in classifiers:

prompt_injection — 10 detection patterns (ignore-previous-instructions, jailbreak DAN/STAN, system override)
pii_leakage, harmful_code, excessive_length (>100K chars), shell_injection
registerClassifier()/unregisterClassifier() API for custom classifiers
Returns GuardrailResult with pass/block/warn per check

Session Isolation Boundary (#139, v0.43.0)

Multi-tenant data isolation between Cortex sessions:

Three modes: strict (no cross-project), permissive (path-only), shared (no restrictions)
Path-based isolation, environment variable filtering, cross-session memory access control
Network access gating per mode, violation recording with lens audit trail (1K ring buffer)

Supply Chain Integrity (#138, v0.43.0)

Full plugin verification pipeline:

SHA-256 hash check against known-good hashes per package@version
Blocked hash list, digital signature verification
Author reputation scoring (0–100), blocked/allowed author lists
Malware pattern scanning (6 default patterns)
WASM binary scanning — parses WASM section headers to detect suspicious imports (wasi_snapshot_preview1.proc_exit, args_get, environ_get, sock_*), unknown env imports, excessive memory requests (> 4 GB), and WASM version mismatches. Replaces text-based pattern matching for .wasm files.
Configurable SupplyChainPolicy with blockSuspicious mode
verifyPluginIntegrity() runs before every plugin install

Dependency Supply Chain Guardian (#272, v0.43.0)

Continuous dependency monitoring across 6 ecosystems (npm, PyPI, Maven, Go, Cargo, NuGet):

CVE database with severity-scored vulnerability records
Version range matching for affected-versions parsing
Risk score calculation and blocked license enforcement
Auto-generated remediation suggestions with safe version bump recommendations
Periodic checks every 6 hours

v0.50.0 Security Hardening (#301)

Comprehensive security policy audit across all 6 layers of built-in protections, resolving 18 issues:

SSRF protection wired into shell command validation — the existing SSRF module (resolveAndCheck() with private IP/DNS blocking) was never called from the validator. Shell commands containing URLs now undergo SSRF checks, blocking cloud metadata endpoints, loopback addresses, and RFC 1918 private IPs.
Session isolation enforced at tool-call boundary — isPathAllowed() from the isolation module was registered but never consulted by the validator. File tool path arguments are now checked against registered session boundaries, preventing cross-session file access.
16 new default deny rules seeded — 5 shell rules (mkfs, /proc/sys/ writes, iptables/ufw, crontab, git push), 7 path rules (/etc/shadow, /root/.ssh/, .gnupg/, .env, id_rsa, sshd_config, sudoers), 3 domain rules (AWS/GCP metadata endpoints, loopback), 1 computer action rule. All use INSERT OR IGNORE to avoid overwriting user customizations.
4 existing shell regex patterns hardened — rm -rf catches -r -f, --recursive --force, and -fr variants; fork bomb pattern matches actual :(){ :|: & };: syntax; dd catches bare device names; chmod 777 catches -R 777 and non-root paths.
Policy table CHECK constraint widened (migration 042) — table originally only allowed ('tool', 'shell', 'domain', 'capability') kinds, but the validator also checks 'path' and 'computer' kinds which could never be inserted via SQL.
12 chrome_ and codegraph tool risk profiles added* — chrome_execute_js, chrome_http_auth, and chrome_network_rules set to high with confirmation required; 7 additional chrome tools profiled at medium; codegraph tools profiled. Previously all fell through to a blanket medium.
CORTEX_VAULT_KEY removed from safe environment variables — the vault encryption key was listed as accessible from any session, creating a path for agents to exfiltrate the master key.
Guardrail shell injection patterns narrowed — backtick and $() patterns matched empty content and blocked legitimate code examples. Changed to {1,200} quantifier requiring 1+ characters.
Data classification default relaxed from 'sensitive' to 'normal' — the security-first default classified all non-empty content as sensitive, triggering excessive supervisor LLM calls. Only content matching explicit SENSITIVE_PATTERNS or SECRET_PATTERNS is now elevated.

Implementation in packages/gate/src/security/validator.ts, packages/gate/src/security/isolation.ts, packages/gate/src/security/classification.ts, packages/gate/src/security/guardrails.ts, packages/gate/src/security/dynamic-grant.ts, and packages/core/src/db/migrations/042_policy_review.sql.

CortexPrism — Open-source AI agent operating system · Discord · Apache 2.0 License · Built with Deno 2.x + TypeScript

Pattern	Blocks
`rm\s+-rf\s+/`	Recursive delete from root
`:\(\)\{.*\}`	Fork bomb patterns
`dd\s+if=.*of=/dev/`	Direct disk writes
`chmod\s+777\s+/`	World-write on filesystem root

Uh oh!

Uh oh!

Security

Security Model

Architecture

Policy Validator

Default Deny Rules

Managing Rules

LLM Security Supervisor

Configuration

Multi-User Authentication (v0.53.0+)

AES-256-GCM Vault

Cortex Lens (Audit Log)

CPL (Capability Level)

Code Sandbox Isolation

No Telemetry

Known Limitations

Reporting Vulnerabilities

LLM Vulnerability Scanner (#136)

Credential Hygiene Monitor (#142)

Zero-Trust Policy Generator (#274)

Bulk Approval (#254)

Sandbox Security Extensions

Dynamic Tool Permission Grant (#62, v0.43.0, updated v0.49.0)

Tool Approval Workflow Engine (#254, v0.43.0)

Data Loss Prevention Guard (#137, v0.43.0)

AI Guardrails & Content Safety (#179, v0.43.0)

Session Isolation Boundary (#139, v0.43.0)

Supply Chain Integrity (#138, v0.43.0)

Dependency Supply Chain Guardian (#272, v0.43.0)

v0.50.0 Security Hardening (#301)

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!