AI COMMS supports 18 AI providers. You activate one with AI_PROVIDER and set its API key — that's it.
## Switching providers

Change one line in `.env`:

```
AI_PROVIDER=anthropic
```

Restart the agent. Done. The provider router loads the right adapter automatically.
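A minimal sketch of what the router does, assuming a registry keyed by the `AI_PROVIDER` values documented below (the adapter names here are placeholders, not AI COMMS's real classes):

```python
import os

# Hypothetical registry: AI_PROVIDER value -> adapter.
# The string values stand in for real adapter classes.
ADAPTERS = {
    "openai": "OpenAIAdapter",
    "anthropic": "AnthropicAdapter",
    "google": "GoogleAdapter",
    "ollama": "OllamaAdapter",
}

def load_adapter(env=None):
    """Return the adapter named by AI_PROVIDER, defaulting to openai."""
    env = os.environ if env is None else env
    name = env.get("AI_PROVIDER", "openai").strip().lower()
    if name not in ADAPTERS:
        raise ValueError(f"unknown AI_PROVIDER: {name!r}")
    return ADAPTERS[name]
```

Changing `AI_PROVIDER` just changes which registry entry is looked up on startup, which is why no other configuration needs to move.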
## OpenAI

The default. GPT-4o, GPT-4, GPT-3.5 Turbo, and any OpenAI model.

```
AI_PROVIDER=openai
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-4o # optional, defaults to gpt-4o
```

Get a key: platform.openai.com/api-keys
## Anthropic

Claude Sonnet 4, Claude Haiku, Claude Opus.

```
AI_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_MODEL=claude-sonnet-4-20250514 # optional
```

Get a key: console.anthropic.com
## Google

Gemini 2.0 Flash, Gemini Pro, and other Gemini models.

```
AI_PROVIDER=google
GOOGLE_API_KEY=AIza...
GOOGLE_MODEL=gemini-2.0-flash # optional
```

Get a key: aistudio.google.com/apikey
## Mistral

Mistral Large, Mistral Medium, Mistral Small.

```
AI_PROVIDER=mistral
MISTRAL_API_KEY=...
MISTRAL_MODEL=mistral-large-latest # optional
```

Get a key: console.mistral.ai
## Cohere

Command R+, Command R, and other Cohere models.

```
AI_PROVIDER=cohere
COHERE_API_KEY=...
COHERE_MODEL=command-r-plus # optional
```

Get a key: dashboard.cohere.com/api-keys
## Groq

Blazing-fast inference for LLaMA, Mixtral, and Gemma models.

```
AI_PROVIDER=groq
GROQ_API_KEY=gsk_...
GROQ_MODEL=llama-3.3-70b-versatile # optional
```

Get a key: console.groq.com/keys
## DeepSeek

DeepSeek Chat and DeepSeek Coder models.

```
AI_PROVIDER=deepseek
DEEPSEEK_API_KEY=...
DEEPSEEK_MODEL=deepseek-chat # optional
```

Get a key: platform.deepseek.com
## xAI

Grok 2 and other Grok models from xAI.

```
AI_PROVIDER=xai
XAI_API_KEY=xai-...
XAI_MODEL=grok-2-latest # optional
```

Get a key: console.x.ai
## Perplexity

Sonar Pro and Sonar models: AI with built-in web search.

```
AI_PROVIDER=perplexity
PERPLEXITY_API_KEY=pplx-...
PERPLEXITY_MODEL=sonar-pro # optional
```

Get a key: perplexity.ai/settings/api
## Together AI

Open-source models (LLaMA, Mistral, etc.) hosted by Together.

```
AI_PROVIDER=together
TOGETHER_API_KEY=...
TOGETHER_MODEL=meta-llama/Llama-3-70b-chat-hf # optional
```

Get a key: api.together.xyz/settings/api-keys
## Fireworks AI

Fast inference for open-source models.

```
AI_PROVIDER=fireworks
FIREWORKS_API_KEY=fw_...
FIREWORKS_MODEL=accounts/fireworks/models/llama-v3p1-70b-instruct # optional
```

Get a key: fireworks.ai/account/api-keys
## NVIDIA NIM

NVIDIA-hosted inference: Nemotron, LLaMA, Gemma, and more via NVIDIA's NIM platform.

```
AI_PROVIDER=nvidia-nim
NVIDIA_API_KEY=nvapi-...
NVIDIA_NIM_MODEL=nvidia/nemotron-3-super-120b-a12b # optional
NVIDIA_NIM_BASE_URL=https://integrate.api.nvidia.com/v1 # optional
NVIDIA_NIM_MAX_TOKENS=4096 # optional
```

Get a key: build.nvidia.com
Self-hosted NIM: if you run NIM containers locally, point NVIDIA_NIM_BASE_URL at your server:

```
NVIDIA_NIM_BASE_URL=http://your-nim-server:8000/v1
```

## OpenClaw

Connects to a running OpenClaw Gateway instance and routes to whatever model you've configured on your OpenClaw instance.
```
AI_PROVIDER=openclaw
OPENCLAW_BASE_URL=http://localhost:18789 # your OpenClaw Gateway
OPENCLAW_AUTH_TOKEN=your-secret # optional auth token
OPENCLAW_SESSION=main # optional session ID
OPENCLAW_MODEL=default # optional model override
```

The provider tries two endpoints:

- `POST /api/chat`: OpenClaw's native API
- `POST /v1/chat/completions`: OpenAI-compatible fallback
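The endpoint fallback behaves roughly like this sketch. It is illustrative, not OpenClaw's actual client code; `post` is an injected callable standing in for the real HTTP client so the routing order is easy to see:

```python
def openclaw_chat(base_url, payload, post):
    """Try OpenClaw's native endpoint first, then the OpenAI-compatible one.

    `post(url, payload)` should return (status_code, body); it is injected
    so the fallback logic can be exercised without a live gateway.
    """
    last = None
    for path in ("/api/chat", "/v1/chat/completions"):
        status, body = post(base_url.rstrip("/") + path, payload)
        if status == 200:
            return body
        last = (path, status)
    raise RuntimeError(f"both OpenClaw endpoints failed, last: {last}")
```

If the native API is unavailable, the request transparently falls through to the OpenAI-compatible route.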
## Ollama

Run models locally with Ollama: no API key, no cloud, fully private.

```
AI_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434 # optional, this is the default
OLLAMA_MODEL=llama3 # any model you've pulled
```

Install Ollama: ollama.com
Pull a model first:

```
ollama pull llama3
```

The timeout for Ollama is set to 2 minutes (vs. 60 seconds for cloud providers) to account for local model loading time.
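The timeout rule can be expressed as a tiny helper. A sketch under the numbers stated above; the constant names are illustrative, not AI COMMS internals:

```python
# Local models may need to be loaded into memory on first use, so Ollama
# gets a longer request budget than cloud providers.
CLOUD_TIMEOUT_S = 60
LOCAL_TIMEOUT_S = 120

def request_timeout(provider: str) -> int:
    """Seconds to wait for a completion before giving up."""
    return LOCAL_TIMEOUT_S if provider == "ollama" else CLOUD_TIMEOUT_S
```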
## Codex

OpenAI's code-optimized model (o4-mini). Uses the same API key as OpenAI.

```
AI_PROVIDER=codex
CODEX_API_KEY=sk-... # or falls back to OPENAI_API_KEY
CODEX_MODEL=o4-mini # optional
```

## GitHub Copilot

Use GPT-4o and other models via GitHub's inference endpoint.
```
AI_PROVIDER=copilot
COPILOT_TOKEN=ghp_...
COPILOT_MODEL=gpt-4o # optional
COPILOT_BASE_URL=https://models.github.ai/inference # optional
```

Get a token: github.com/settings/tokens (needs the copilot scope)
## Claude Code

Anthropic's agentic coding model with extended thinking for complex reasoning.

```
AI_PROVIDER=claude-code
CLAUDE_CODE_API_KEY=sk-ant-... # or falls back to ANTHROPIC_API_KEY
CLAUDE_CODE_MODEL=claude-sonnet-4-20250514 # optional
CLAUDE_CODE_MAX_TOKENS=16384 # optional
CLAUDE_CODE_THINKING_BUDGET=10000 # optional, tokens for extended thinking
```

The thinking budget controls how many tokens Claude spends on internal reasoning before responding. Set it to 0 to disable extended thinking.
## Claude Cowork

Anthropic's collaborative agent model, designed for multi-agent teamwork and task delegation.

```
AI_PROVIDER=claude-cowork
CLAUDE_COWORK_API_KEY=sk-ant-... # or falls back to ANTHROPIC_API_KEY
CLAUDE_COWORK_MODEL=claude-sonnet-4-20250514 # optional
CLAUDE_COWORK_MAX_TOKENS=8192 # optional
CLAUDE_COWORK_THINKING_BUDGET=8000 # optional
```

## Failover

Set a fallback chain so that if your primary provider fails, the next one takes over automatically:
```
AI_PROVIDER=openai
AI_FALLBACK_PROVIDERS=anthropic,google,groq
```

Failover order: OpenAI → Anthropic → Google → Groq.
Each provider in the chain is tried once. If all fail, the error is returned to the user.
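The chain behaves roughly like this sketch, where `call` stands in for the real adapter invocation (names are illustrative, not AI COMMS internals):

```python
def complete_with_failover(prompt, providers, call):
    """Try each provider once, in order; return the first success.

    `call(provider, prompt)` raises on failure. If every provider in the
    chain fails, the collected errors are surfaced to the caller.
    """
    errors = []
    for provider in providers:
        try:
            return call(provider, prompt)
        except Exception as exc:  # any provider error triggers failover
            errors.append(f"{provider}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Note that each provider gets exactly one attempt; there is no retry loop within the chain.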
## Rate limits

Optional: cap requests per minute to stay within provider quotas.

```
RATE_LIMIT_OPENAI_RPM=60
RATE_LIMIT_ANTHROPIC_RPM=40
RATE_LIMIT_GOOGLE_RPM=60
```

If a rate limit is hit, the failover chain kicks in automatically.
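One common way to enforce a per-provider RPM cap is a sliding window over request timestamps. This is a sketch, not AI COMMS's actual limiter; the clock is injected so the window logic is easy to test:

```python
from collections import deque

class RpmLimiter:
    """Sliding-window requests-per-minute limiter, one instance per provider."""

    def __init__(self, rpm, clock):
        self.rpm = rpm
        self.clock = clock      # callable returning a time in seconds
        self.stamps = deque()   # timestamps of requests in the last minute

    def allow(self):
        now = self.clock()
        # Drop requests that have aged out of the 60-second window.
        while self.stamps and now - self.stamps[0] >= 60:
            self.stamps.popleft()
        if len(self.stamps) < self.rpm:
            self.stamps.append(now)
            return True
        return False            # over quota: caller fails over instead
```

A `False` result here is what would trigger the failover chain rather than an error back to the user.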
## Choosing a provider

| Use case | Recommended |
|---|---|
| General purpose | `openai` (GPT-4o) or `anthropic` (Claude) |
| Fastest responses | `groq` |
| Best reasoning | `anthropic` (Claude Code with thinking) |
| Cheapest | `ollama` (free, local) or `deepseek` |
| Code tasks | `codex` or `claude-code` |
| Built-in web search | `perplexity` |
| Privacy (no cloud) | `ollama` |
| NVIDIA GPUs | `nvidia-nim` |
| Self-hosted gateway | `openclaw` |
| Multi-agent collaboration | `claude-cowork` |
Next: Security configuration →