Configuration

scarecr0w12 edited this page Jun 25, 2026 · 7 revisions

CortexPrism stores its configuration in ~/.cortex/config.json (created by cortex setup).

Full Config Reference

{
  "version": 1,
  "defaultProvider": "anthropic",
  "providers": {
    "anthropic": { "kind": "anthropic", "model": "claude-sonnet-4-5", "apiKey": "sk-ant-...", "reasoningEffort": "medium" },
    "openai": { "kind": "openai", "model": "gpt-4o", "apiKey": "sk-...", "baseUrl": "https://api.openai.com/v1" },
    "google": { "kind": "google", "model": "gemini-2.0-flash", "apiKey": "..." },
    "ollama": { "kind": "ollama", "model": "llama3.2", "baseUrl": "http://localhost:11434", "numCtx": 8192, "keepAlive": "5m" }
  },
  "agent": {
    "name": "Cortex",
    "maxTurns": 50,
    "streamOutput": true
  },
  "router": {
    "enabled": false,
    "strategy": "cascade",
    "confidenceThreshold": 0.7
  },
  "modelSelection": {
    "enabled": false,
    "mode": "balanced",
    "observeThreshold": 50,
    "enforceConfidence": 0.85,
    "suggestConfidence": 0.65,
    "quartermasterProvider": "google",
    "quartermasterModel": "gemini-2.0-flash",
    "autoModelPool": [
      { "provider": "anthropic", "model": "claude-3-5-sonnet-20241022", "enabled": true },
      { "provider": "openai", "model": "gpt-4o", "enabled": true }
    ]
  },
  "supervisor": {
    "provider": "google",
    "model": "gemini-2.0-flash",
    "cacheTTL": 3600
  },
  "computerUse": {
    "enabled": false,
    "display": { "width": 1920, "height": 1080 },
    "runtime": "native",
    "screenshot": { "format": "png", "quality": 90 },
    "timeout": { "action": 5000, "display": 10000 },
    "approval": { "requireApproval": true }
  },
  "sandbox": {
    "timeoutMs": 30000,
    "maxOutputBytes": 65536,
    "scrollAmount": 3,
    "dockerImages": {}
  },
  "voice": {
    "enabled": false,
    "provider": "openai",
    "defaultVoice": "alloy",
    "speed": 1.0,
    "autoTTS": false
  },
  "update": {
    "channel": "stable",
    "checkOnStartup": true,
    "autoUpdate": false,
    "checkIntervalHours": 24,
    "githubToken": null,
    "gpgKeyPath": null
  },
  "pluginUpdate": {
    "checkOnStartup": true,
    "autoUpdate": false,
    "checkIntervalHours": 24
  },
  "logging": {
    "level": "error",
    "fileEnabled": true,
    "fileMaxBytes": 10485760,
    "fileMaxFiles": 5
  },
  "webAuth": {
    "requireAuth": true
  }
}
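
As a minimal sketch of how a client might read this file, the Python below loads the JSON and looks up the entry named by defaultProvider. The helper names load_config and default_provider are hypothetical, and the error handling is an assumption rather than CortexPrism's actual behavior.

```python
import json
from pathlib import Path

# Default location stated at the top of this page.
DEFAULT_CONFIG_PATH = Path.home() / ".cortex" / "config.json"


def load_config(path=DEFAULT_CONFIG_PATH):
    """Read and parse the config file (hypothetical helper)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def default_provider(config):
    """Return (name, settings) for the provider named by defaultProvider."""
    name = config["defaultProvider"]
    providers = config.get("providers", {})
    if name not in providers:
        raise ValueError(f"defaultProvider {name!r} has no entry under providers")
    return name, providers[name]
```

Checking that defaultProvider resolves to a real entry catches the most common hand-editing mistake before any request is made.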

Provider Configurations

Anthropic

{ "kind": "anthropic", "model": "claude-sonnet-4-5", "apiKey": "sk-ant-...", "reasoningEffort": "medium" }

reasoningEffort: low (1024 tokens) | medium (4096) | high (16384) — maps to thinking.budget_tokens
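
The mapping above can be sketched in Python as follows. The budgets are the ones listed on this page; the function name thinking_params is hypothetical, and the shape of the returned field assumes Anthropic's extended-thinking request format.

```python
# Token budgets as listed on this page.
REASONING_BUDGETS = {"low": 1024, "medium": 4096, "high": 16384}


def thinking_params(reasoning_effort):
    """Translate a reasoningEffort value into a `thinking` request field.

    Hypothetical helper: returns an empty dict when no effort is configured.
    """
    if reasoning_effort is None:
        return {}
    if reasoning_effort not in REASONING_BUDGETS:
        raise ValueError(f"unknown reasoningEffort: {reasoning_effort!r}")
    budget = REASONING_BUDGETS[reasoning_effort]
    return {"thinking": {"type": "enabled", "budget_tokens": budget}}
```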

OpenAI

{ "kind": "openai", "model": "gpt-4o", "apiKey": "sk-...", "baseUrl": "https://api.openai.com/v1", "reasoningEffort": "medium" }

o-series models use max_completion_tokens instead of max_tokens and omit temperature/top_p.
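
A rough sketch of that request-shaping rule is shown below. The function name and the regular expression used to recognise o-series model names are assumptions made for illustration; the actual detection logic in CortexPrism is not documented here.

```python
import re

# Assumption: o-series models have names such as "o1", "o3-mini", "o4-mini".
_O_SERIES = re.compile(r"^o\d")


def build_openai_body(model, messages, max_tokens, temperature=None, top_p=None):
    """Build a chat-completions body, adapting the fields for o-series models."""
    body = {"model": model, "messages": messages}
    if _O_SERIES.match(model):
        # o-series: different token-limit field, no sampling parameters.
        body["max_completion_tokens"] = max_tokens
        return body
    body["max_tokens"] = max_tokens
    if temperature is not None:
        body["temperature"] = temperature
    if top_p is not None:
        body["top_p"] = top_p
    return body
```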

Google Gemini

{ "kind": "google", "model": "gemini-2.0-flash", "apiKey": "...", "reasoningEffort": "high" }

reasoningEffort maps to thinkingConfig.thinkingBudget. Supports native vision via inlineData.

Ollama (Local)

{ "kind": "ollama", "model": "llama3.2", "baseUrl": "http://localhost:11434", "numCtx": 8192, "keepAlive": "5m", "numThread": 4 }

AWS Bedrock

{ "kind": "bedrock", "model": "us.amazon.nova-pro-v1:0", "region": "us-east-1", "secretKey": "..." }

Uses Converse API with inferenceConfig for maxTokens, temperature, and topP.

OpenRouter

{ "kind": "openrouter", "model": "openai/gpt-4o", "apiKey": "...", "httpReferer": "https://myapp.com", "xTitle": "My App" }

Injects HTTP-Referer and X-Title headers for OpenRouter rankings.
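
The header injection might look like the sketch below. The function name is hypothetical, and the Bearer authorization header is an assumption based on OpenRouter's OpenAI-compatible API; only the HTTP-Referer and X-Title headers come from this page.

```python
def openrouter_headers(provider_cfg):
    """Build request headers from an openrouter provider entry (illustrative)."""
    headers = {
        "Authorization": f"Bearer {provider_cfg['apiKey']}",
        "Content-Type": "application/json",
    }
    # Both attribution headers are optional, so add them only when configured.
    if provider_cfg.get("httpReferer"):
        headers["HTTP-Referer"] = provider_cfg["httpReferer"]
    if provider_cfg.get("xTitle"):
        headers["X-Title"] = provider_cfg["xTitle"]
    return headers
```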

Perplexity

{ "kind": "perplexity", "model": "sonar-pro", "apiKey": "...", "searchRecencyFilter": "week", "returnCitations": true }

searchRecencyFilter: month | week | day | hour. returnCitations and returnImages are forwarded in the request.

Together AI / Fireworks / Novita

{ "kind": "together", "model": "meta-llama/Llama-3.3-70B-Instruct-Turbo", "apiKey": "...", "repetitionPenalty": 1.1 }

repetitionPenalty maps to the repetition_penalty body field.

LiteLLM

{ "kind": "litellm", "model": "gpt-4o", "apiKey": "...", "dropParams": true }

dropParams maps to drop_params, which silently ignores unsupported parameters.

Venice AI

{ "kind": "venice", "model": "default", "apiKey": "...", "includeVeniceSystemPrompt": true }

includeVeniceSystemPrompt maps to venice_parameters.include_venice_system_prompt.

DeepInfra

{ "kind": "deepinfra", "model": "meta-llama/Llama-3.3-70B-Instruct", "apiKey": "..." }

Serverless GPU inference. Supports repetitionPenalty.

Hyperbolic

{ "kind": "hyperbolic", "model": "deepseek-ai/DeepSeek-V3", "apiKey": "..." }

Advertised as 80% cheaper than traditional cloud providers.

MiniMax

{ "kind": "minimax", "model": "minimax-m3", "apiKey": "..." }

MiniMax M3: 80.5% SWE-bench Verified at $0.30/$1.20 per 1M tokens.

Zhipu (GLM)

{ "kind": "zhipu", "model": "glm-4-flash", "apiKey": "..." }

Free tier available (glm-4-flash).

Replicate

{ "kind": "replicate", "model": "meta/meta-llama-3-70b-instruct", "apiKey": "..." }

Uses predictions-based REST API with polling (non-streaming) and SSE (streaming).

Cloudflare Workers AI

{ "kind": "cloudflare", "model": "@cf/meta/llama-3-8b-instruct", "apiKey": "...", "accountId": "..." }

Edge inference via Cloudflare's AI platform. Requires both an API token and an Account ID. accountId maps to the account_id path parameter.
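
To illustrate how accountId ends up in the request path, here is a small sketch that assumes Cloudflare's public Workers AI REST endpoint. The function name is hypothetical, and CortexPrism may construct the URL differently.

```python
from urllib.parse import quote


def workers_ai_url(provider_cfg):
    """Build the run URL for a cloudflare provider entry (illustrative)."""
    # The account ID is a path segment, so escape it fully.
    account_id = quote(provider_cfg["accountId"], safe="")
    # Model names such as "@cf/meta/llama-3-8b-instruct" keep "@" and "/".
    model = quote(provider_cfg["model"], safe="@/")
    return (
        "https://api.cloudflare.com/client/v4/accounts/"
        f"{account_id}/ai/run/{model}"
    )
```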

LM Studio

{ "kind": "lmstudio", "model": "local-model", "baseUrl": "http://localhost:1234/v1", "numCtx": 4096, "keepAlive": "10m" }

Uses OpenAI-compatible path with num_ctx and keep_alive forwarding.

Environment Variables

| Variable | Purpose |
| --- | --- |
| CORTEX_DATA_DIR | Override data directory (default: ~/.cortex/data/) |
| CORTEX_CONFIG_DIR | Override config directory (default: ~/.cortex/) |
| CORTEX_VAULT_KEY | Vault decryption passphrase |
| CORTEX_LOG_LEVEL | Override log level (trace, debug, info, warn, error, silent) |
| GITHUB_TOKEN | GitHub personal access token |
| GH_TOKEN | Alternative GitHub token |
| OPENAI_API_KEY | OpenAI API key (alternative to config) |
| CORTEX_API_URL | Server URL for CLI login/API commands (default: http://localhost:11434) |
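
One way to picture how these variables override the file-based settings is the sketch below. The function name, the returned key names, and the precedence of GITHUB_TOKEN over GH_TOKEN are assumptions; only the variable names and defaults come from the list above.

```python
import os
from pathlib import Path

LOG_LEVELS = ("trace", "debug", "info", "warn", "error", "silent")


def resolve_settings(config, env=None):
    """Merge environment overrides with values from config.json (illustrative)."""
    env = os.environ if env is None else env
    home = Path.home()
    # Environment wins over the file; the file wins over the built-in default.
    level = env.get("CORTEX_LOG_LEVEL") or config.get("logging", {}).get("level", "error")
    if level not in LOG_LEVELS:
        raise ValueError(f"unsupported log level: {level!r}")
    return {
        "configDir": Path(env.get("CORTEX_CONFIG_DIR") or home / ".cortex"),
        "dataDir": Path(env.get("CORTEX_DATA_DIR") or home / ".cortex" / "data"),
        "logLevel": level,
        # Assumed precedence: GITHUB_TOKEN, then GH_TOKEN, then update.githubToken.
        "githubToken": env.get("GITHUB_TOKEN")
        or env.get("GH_TOKEN")
        or config.get("update", {}).get("githubToken"),
        "apiUrl": env.get("CORTEX_API_URL") or "http://localhost:11434",
    }
```

Passing env explicitly keeps the function easy to test without touching the real process environment.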

Multi-User Auth Config (v0.53.0+)

Multi-user authentication is enabled when the users table has at least one user. On first run, an auto-admin is created from the vault password. API tokens provide persistent CLI authentication.

{
  "webAuth": {
    "requireAuth": true
  }
}

| Field | Default | Description |
| --- | --- | --- |
| webAuth.requireAuth | true | Require login to access the web UI |

User, team, token, and federation management is handled through the REST API and CLI (cortex users, cortex teams, cortex login). See CLI Reference and Security.

Model Router Config

Cascade Router

Tries the cheapest provider first and escalates to the next entry in cascade when the response confidence falls below confidenceThreshold.

{
  "router": {
    "enabled": true,
    "strategy": "cascade",
    "confidenceThreshold": 0.7,
    "cascade": [
      { "provider": "ollama", "model": "llama3.2:3b" },
      { "provider": "anthropic", "model": "claude-sonnet-4-5" }
    ]
  }
}
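
The cascade behavior can be sketched as a simple loop. This is an illustration under stated assumptions, not the router's actual implementation: run_cascade and the call_model callback are hypothetical names, and how CortexPrism derives a confidence score is not described on this page.

```python
def run_cascade(cascade, confidence_threshold, call_model):
    """Try each cascade entry in order until one is confident enough.

    call_model(provider, model) is a hypothetical callback returning
    (answer, confidence) with confidence in the range 0.0 to 1.0.
    """
    if not cascade:
        raise ValueError("cascade must list at least one provider")
    result = None
    for step in cascade:
        answer, confidence = call_model(step["provider"], step["model"])
        result = {
            "provider": step["provider"],
            "model": step["model"],
            "answer": answer,
            "confidence": confidence,
        }
        if confidence >= confidence_threshold:
            break  # confident enough, no need to escalate
    # If no step met the threshold, the last (strongest) answer is returned.
    return result
```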

Threshold Router

Routes based on prompt complexity scoring.

{
  "router": {
    "enabled": true,
    "strategy": "threshold",
    "confidenceThreshold": 0.5,
    "threshold": {
      "strongProvider": "anthropic",
      "strongModel": "claude-sonnet-4-5",
      "weakProvider": "ollama",
      "weakModel": "llama3.2:3b",
      "scorer": "heuristic"
    }
  }
}
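
A threshold router of this kind could be sketched as below. The scoring function is a toy stand-in: this page does not describe what the "heuristic" scorer measures, so heuristic_score and route_by_threshold are hypothetical, as is the rule that scores at or above confidenceThreshold go to the strong model.

```python
def heuristic_score(prompt):
    """Toy complexity score in [0, 1] based only on word count (assumption)."""
    return min(1.0, len(prompt.split()) / 100)


def route_by_threshold(prompt, router_cfg, scorer=heuristic_score):
    """Pick the strong or weak model for a prompt (illustrative)."""
    t = router_cfg["threshold"]
    score = scorer(prompt)
    if score >= router_cfg["confidenceThreshold"]:
        return {"provider": t["strongProvider"], "model": t["strongModel"], "score": score}
    return {"provider": t["weakProvider"], "model": t["weakModel"], "score": score}
```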

Model Selection (MQM) Config

{
  "modelSelection": {
    "enabled": false,
    "mode": "balanced",
    "observeThreshold": 50,
    "enforceConfidence": 0.85,
    "suggestConfidence": 0.65,
    "costBudget": 1.0,
    "qualityThreshold": 0.7,
    "allowedProviders": ["anthropic", "openai", "ollama"],
    "quartermasterProvider": "google",
    "quartermasterModel": "gemini-2.0-flash",
    "autoModelPool": [
      { "provider": "anthropic", "model": "claude-sonnet-4-5", "enabled": true },
      { "provider": "openai", "model": "gpt-4o", "enabled": true }
    ]
  }
}

| Field | Default | Description |
| --- | --- | --- |
| mode | "balanced" | Strategy: conservative, balanced, or aggressive |
| observeThreshold | 50 | LLM calls before MQM activates predictions |
| enforceConfidence | 0.85 | Threshold for enforce mode (override model) |
| suggestConfidence | 0.65 | Threshold for suggest mode (hint in prompt) |
| autoModelPool | [] | Available models for Auto mode, each {provider, model, enabled} (v0.46+) |
| quartermasterProvider | | Dedicated provider for MQM analysis |
| quartermasterModel | | Dedicated model for MQM analysis |
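
How the three thresholds might interact is sketched below. The function name, the returned labels, and the exact ordering of the checks are assumptions made to illustrate the field descriptions; the real MQM decision logic may differ.

```python
def mqm_action(calls_observed, confidence, cfg):
    """Decide what MQM does for one request (illustrative).

    Returns one of "off", "observe", "enforce", "suggest", or "none".
    """
    if not cfg.get("enabled", False):
        return "off"
    if calls_observed < cfg.get("observeThreshold", 50):
        return "observe"  # still collecting data, no predictions yet
    if confidence >= cfg.get("enforceConfidence", 0.85):
        return "enforce"  # override the model
    if confidence >= cfg.get("suggestConfidence", 0.65):
        return "suggest"  # add a hint to the prompt
    return "none"
```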

Security Supervisor Config

{
  "supervisor": {
    "provider": "google",
    "model": "gemini-2.0-flash",
    "cacheTTL": 3600
  }
}

Logging Config

{
  "logging": {
    "level": "error",
    "fileMaxBytes": 10485760,
    "fileMaxFiles": 5,
    "otlp": {
      "endpoint": "http://localhost:4318"
    },
    "langfuse": {
      "publicKey": "pk-lf-...",
      "secretKey": "sk-lf-..."
    }
  }
}
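
The file-rotation fields behave much like a size-based rotating log handler. The sketch below uses Python's standard library purely as an analogy: it assumes fileMaxFiles counts rotated backups, and the numeric values chosen for trace and silent are illustrative because Python has no such built-in levels.

```python
import logging
from logging.handlers import RotatingFileHandler

# Level names come from CORTEX_LOG_LEVEL; the numeric mapping is an assumption.
LEVELS = {
    "trace": 5,
    "debug": logging.DEBUG,
    "info": logging.INFO,
    "warn": logging.WARNING,
    "error": logging.ERROR,
    "silent": logging.CRITICAL + 10,
}


def build_file_handler(logging_cfg, path):
    """Create a rotating file handler from a `logging` config section."""
    handler = RotatingFileHandler(
        path,
        maxBytes=logging_cfg.get("fileMaxBytes", 10485760),  # 10 MiB
        backupCount=logging_cfg.get("fileMaxFiles", 5),
        delay=True,  # do not create the file until the first record is written
    )
    handler.setLevel(LEVELS[logging_cfg.get("level", "error")])
    return handler
```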

Computer Use Config

{
  "computerUse": {
    "enabled": false,
    "display": { "width": 1920, "height": 1080 },
    "runtime": "native",
    "screenshot": { "format": "png", "quality": 90 },
    "timeout": { "action": 5000, "display": 10000 },
    "approval": { "requireApproval": true, "autoApproveReadOnly": false },
    "docker": { "image": "cortexprism/computer-use:latest" }
  }
}

Sandbox Config

{
  "sandbox": {
    "timeoutMs": 30000,
    "maxOutputBytes": 65536,
    "scrollAmount": 3,
    "dockerImages": {}
  }
}

`runtime`: `docker` | `gvisor` (kernel-level syscall filtering via `runsc`) | `subprocess`.
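
As a rough illustration of what timeoutMs and maxOutputBytes imply for the subprocess runtime, the sketch below runs a command with a time limit and truncates captured output. The function name and the returned keys are hypothetical, and the real sandbox almost certainly applies further isolation that is not shown here.

```python
import subprocess


def run_limited(argv, sandbox_cfg):
    """Run a command under the sandbox time and output limits (illustrative)."""
    timeout_s = sandbox_cfg.get("timeoutMs", 30000) / 1000
    limit = sandbox_cfg.get("maxOutputBytes", 65536)
    try:
        proc = subprocess.run(argv, capture_output=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return {"timedOut": True, "exitCode": None, "output": b"", "truncated": False}
    out = proc.stdout
    return {
        "timedOut": False,
        "exitCode": proc.returncode,
        "output": out[:limit],  # keep at most maxOutputBytes
        "truncated": len(out) > limit,
    }
```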

## See Also

- [LLM Providers](LLM-Providers) — All 30 supported providers
- [Model Routing](Model-Routing) — Router strategies
- [Model Quartermaster](Model-Quartermaster) — MQM learning system
- [Security](Security) — Security model
- [Observability](Observability) — Logging, metrics, and tracing