This guide will help you get started with cagent and learn how to use its
powerful multi-agent system to accomplish various tasks.
cagent is a powerful, customizable multi-agent system that orchestrates AI
agents with specialized capabilities and tools. It features:
- 🏗️ Multi-tenant architecture with client isolation and session management
- 🔧 Rich tool ecosystem via Model Context Protocol (MCP) integration
- 🤖 Hierarchical agent system with intelligent task delegation
- 🌐 Multiple interfaces including CLI, TUI and API server
- 📦 Agent distribution via Docker registry integration
- 🔒 Security-first design with proper client scoping and resource isolation
- ⚡ Event-driven streaming for real-time interactions
- 🧠 Multi-model support (OpenAI, Anthropic, Gemini, Docker Model Runner (DMR))
After spending the last year or more building AI agents of various kinds, using a variety of software solutions and frameworks, we kept asking ourselves the same questions:
- How can we make building and running useful agentic systems less of a hassle?
- Most agents we build end up using many of the same building blocks. Can we reuse most of those building blocks and have declarative configurations for new agents?
- How can we package and share agents with each other as simply as possible, without all the headaches?
We think we're getting somewhere as we build out the primitives of cagent. So, in keeping with our love for open-source software, we decided to share it and build it in the open, so that the community at large can make use of our work and contribute to the future of the project.
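Before diving into the CLI, here is a minimal sketch of what a cagent configuration file looks like (the agent and model names are illustrative; the full schema is described below):

```yaml
agents:
  root:                      # Every config defines a root agent
    model: claude            # References a key under models:
    description: A friendly, general-purpose assistant
    instruction: You are a helpful assistant. Answer concisely.

models:
  claude:
    provider: anthropic      # An Anthropic API key must be set in the environment
    model: claude-sonnet-4-0
```

Save it as `config.yaml` and start a chat with `cagent run config.yaml`.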
cagent provides multiple interfaces and deployment modes:
```shell
# Terminal UI (TUI)
$ cagent run config.yaml
$ cagent run config.yaml -a agent_name   # Run a specific agent
$ cagent run config.yaml --debug         # Enable debug logging
$ cagent run config.yaml --yolo          # Auto-accept all tool calls
$ cagent run config.yaml "First message" # Start the conversation with a first message
$ cagent run config.yaml -c df           # Run with a named command from the YAML

# Model override examples
$ cagent run config.yaml --model anthropic/claude-sonnet-4-0                               # Override all agents to use Claude
$ cagent run config.yaml --model "agent1=openai/gpt-4o"                                    # Override a specific agent
$ cagent run config.yaml --model "agent1=openai/gpt-4o,agent2=anthropic/claude-sonnet-4-0" # Multiple overrides

# One-off run without the TUI
$ cagent exec config.yaml                 # Run the agent once, with default instructions
$ cagent exec config.yaml "First message" # Run the agent once with instructions
$ cagent exec config.yaml --yolo          # Run the agent once and auto-accept all tool calls

# API server (HTTP REST API)
$ cagent api config.yaml
$ cagent api config.yaml --listen :8080

# Other commands
$ cagent new                                              # Initialize a new project
$ cagent new --model openai/gpt-5-mini --max-tokens 32000 # Override max tokens during generation
$ cagent eval config.yaml                                 # Run evaluations
$ cagent pull docker.io/user/agent                        # Pull an agent from a registry
$ cagent push docker.io/user/agent                        # Push an agent to a registry
```

During CLI sessions, you can use special commands:
| Command | Description |
|---|---|
| `/exit` | Exit the program |
| `/reset` | Clear conversation history |
| `/eval` | Save current conversation for evaluation |
| `/compact` | Compact conversation to lower context usage |
| Property | Type | Description | Required |
|---|---|---|---|
| `name` | string | Agent identifier | ✓ |
| `model` | string | Model reference | ✓ |
| `description` | string | Agent purpose | ✓ |
| `instruction` | string | Detailed behavior instructions | ✓ |
| `sub_agents` | array | List of sub-agent names | ✗ |
| `toolsets` | array | Available tools | ✗ |
| `add_date` | boolean | Add current date to context | ✗ |
| `add_environment_info` | boolean | Add information about the environment (working dir, OS, git...) | ✗ |
| `max_iterations` | int | How many times the agent can loop when using tools | ✗ |
| `commands` | object/array | Named prompts for quick-start commands (used with `--command`) | ✗ |
```yaml
agents:
  agent_name:
    model: string                 # Model reference
    description: string           # Agent purpose
    instruction: string           # Detailed behavior instructions
    toolsets: []                  # Available tools (optional)
    sub_agents: []                # Sub-agent names (optional)
    add_date: boolean             # Add current date to context (optional)
    add_environment_info: boolean # Add information about the environment (working dir, OS, git...) (optional)
    max_iterations: int           # How many times this agent can loop when calling tools (optional, default = unlimited)
    commands:                     # Either a mapping or a list of singleton maps
      df: "check how much free space i have on my disk"
      ls: "list the files in the current directory"
```

- Use `--command` (or `-c`) to send a predefined prompt from the agent config as the first message.
- Example YAML forms supported:
```yaml
commands:
  df: "check how much free space i have on my disk"
  ls: "list the files in the current directory"
```

```yaml
commands:
  - df: "check how much free space i have on my disk"
  - ls: "list the files in the current directory"
```

Run:
```shell
cagent run ./agent.yaml -c df
cagent run ./agent.yaml --command ls
```

| Property | Type | Description | Required |
|---|---|---|---|
| `provider` | string | Provider: `openai`, `anthropic`, `google`, `dmr` | ✓ |
| `model` | string | Model name (e.g., `gpt-4o`, `claude-sonnet-4-0`, `gemini-2.5-flash`) | ✓ |
| `temperature` | float | Randomness (0.0-1.0) | ✗ |
| `max_tokens` | integer | Response length limit | ✗ |
| `top_p` | float | Nucleus sampling (0.0-1.0) | ✗ |
| `frequency_penalty` | float | Repetition penalty (0.0-2.0) | ✗ |
| `presence_penalty` | float | Topic repetition penalty (0.0-2.0) | ✗ |
| `base_url` | string | Custom API endpoint | ✗ |
| `thinking_budget` | string/int | Reasoning effort (OpenAI: effort string; Anthropic/Google: token budget int) | ✗ |
```yaml
models:
  model_name:
    provider: string                # Provider: openai, anthropic, google, dmr
    model: string                   # Model name: gpt-4o, claude-3-5-sonnet-latest, gemini-2.5-flash, qwen3:4B, ...
    temperature: float              # Randomness (0.0-1.0)
    max_tokens: integer             # Response length limit
    top_p: float                    # Nucleus sampling (0.0-1.0)
    frequency_penalty: float        # Repetition penalty (0.0-2.0)
    presence_penalty: float         # Topic repetition penalty (0.0-2.0)
    parallel_tool_calls: boolean    # Allow the model to call tools in parallel
    thinking_budget: string|integer # OpenAI: effort level string; Anthropic/Google: integer token budget
```

Determine how much the model should think by setting `thinking_budget`:

- OpenAI: use one of the effort levels `minimal`, `low`, `medium`, or `high`
- Anthropic: set an integer token budget. The range is 1024–32768, and the value must be strictly less than `max_tokens`
- Google (Gemini): set an integer token budget. `0` disables thinking and `-1` enables dynamic thinking (the model decides). Most models accept 0–24576 tokens; Gemini 2.5 Pro accepts 128–32768 tokens and cannot disable thinking
Examples (OpenAI):

```yaml
models:
  openai:
    provider: openai
    model: gpt-5-mini
    thinking_budget: low

agents:
  root:
    model: openai
    instruction: you are a helpful assistant
```

Examples (Anthropic):
```yaml
models:
  claude:
    provider: anthropic
    model: claude-sonnet-4-5-20250929
    thinking_budget: 1024

agents:
  root:
    model: claude
    instruction: you are a helpful assistant that doesn't think very much
```

Examples (Google):
```yaml
models:
  gemini-no-thinking:
    provider: google
    model: gemini-2.5-flash
    thinking_budget: 0 # Disable thinking
  gemini-dynamic:
    provider: google
    model: gemini-2.5-flash
    thinking_budget: -1 # Dynamic thinking (model decides)
  gemini-fixed:
    provider: google
    model: gemini-2.5-flash
    thinking_budget: 8192 # Fixed token budget

agents:
  root:
    model: gemini-fixed
    instruction: you are a helpful assistant
```

Anthropic's interleaved thinking feature uses the Beta Messages API to provide tool calling during model reasoning. You can control this behavior with the `interleaved_thinking` provider option:
```yaml
models:
  claude:
    provider: anthropic
    model: claude-sonnet-4-5-20250929
    thinking_budget: 8192 # Optional: defaults to 16384 when interleaved thinking is enabled
    provider_opts:
      interleaved_thinking: true # Enable interleaved thinking (default: false)
```

Notes:
- OpenAI: if an invalid effort value is set, the request fails with a clear error
- Anthropic: values < 1024 or ≥ `max_tokens` are ignored (a warning is logged). When `interleaved_thinking` is enabled, cagent uses Anthropic's Beta Messages API with a default thinking budget of 16384 tokens if none is specified
- Google:
  - Most models support values between -1 and 24576 tokens. Set to `0` to disable, `-1` for dynamic thinking
  - Gemini 2.5 Pro: supports 128–32768 tokens. Thinking cannot be disabled (minimum 128)
  - Gemini 2.5 Flash-Lite: supports 512–24576 tokens. Set to `0` to disable, `-1` for dynamic thinking
- For unsupported providers, `thinking_budget` has no effect
- Debug logs include the applied effort (e.g., "OpenAI request using thinking_budget", "Gemini request using thinking_budget")
See `examples/thinking_budget.yaml` for a complete runnable demo.

⚠️ NOTE ⚠️
More model names can be found here
```yaml
# OpenAI
models:
  openai:
    provider: openai
    model: gpt-5-mini
```

```yaml
# Anthropic
models:
  claude:
    provider: anthropic
    model: claude-sonnet-4-0
```

```yaml
# Gemini
models:
  gemini:
    provider: google
    model: gemini-2.5-flash
```

```yaml
# Docker Model Runner (DMR)
models:
  qwen:
    provider: dmr
    model: ai/qwen3
```

If `base_url` is omitted, cagent will use `http://localhost:12434/engines/llama.cpp/v1` by default.
You can pass DMR runtime (e.g. llama.cpp) options using `provider_opts.runtime_flags`:

```yaml
models:
  model_name:
    provider: dmr
    provider_opts:
      runtime_flags: [] # Flags passed to the inference runtime
```
The context length is taken from `max_tokens` at the model level:

```yaml
models:
  local-qwen:
    provider: dmr
    model: ai/qwen3
    max_tokens: 8192
    # base_url: omitted -> auto-discovery via Docker Model plugin
    provider_opts:
      runtime_flags: ["--ngl=33", "--top-p=0.9"]
```

`runtime_flags` also accepts a single string with comma or space separation:
```yaml
models:
  local-qwen:
    provider: dmr
    model: ai/qwen3
    max_tokens: 8192
    provider_opts:
      runtime_flags: "--ngl=33 --top-p=0.9"
```

Explicit `base_url` example with a multiline `runtime_flags` string:
```yaml
models:
  local-qwen:
    provider: dmr
    model: ai/qwen3
    base_url: http://127.0.0.1:12434/engines/llama.cpp/v1
    provider_opts:
      runtime_flags: |
        --ngl=33
        --top-p=0.9
```

Requirements and notes:

- The Docker Model plugin must be available for auto-configure/auto-discovery
  - Verify with: `docker model status --json`
- Configuration is best-effort; failures fall back to the default base URL
- `provider_opts` currently apply to the `dmr` and `anthropic` providers
- `runtime_flags` are passed after `--` to the inference runtime (e.g., llama.cpp)
Parameter mapping and precedence (DMR):

- `ModelConfig` fields are translated into engine-specific runtime flags. For example, with the `llama.cpp` backend:
  - `temperature` → `--temp`
  - `top_p` → `--top-p`
  - `frequency_penalty` → `--frequency-penalty`
  - `presence_penalty` → `--presence-penalty`
  - ...
- `provider_opts.runtime_flags` always takes priority over derived flags on conflict. When a conflict is detected, cagent logs a warning indicating the overridden flag. `max_tokens` is the only exception for now.
Examples:

```yaml
models:
  local-qwen:
    provider: dmr
    model: ai/qwen3
    temperature: 0.5 # derives --temp 0.5
    top_p: 0.9       # derives --top-p 0.9
    max_tokens: 8192 # sets --context-size=8192
    provider_opts:
      runtime_flags: ["--temp", "0.7", "--threads", "8"] # overrides the derived --temp, sets --threads
```

```yaml
models:
  local-qwen:
    provider: dmr
    model: ai/qwen3
    provider_opts:
      runtime_flags: "--ngl=33 --repeat-penalty=1.2" # a string is accepted as well
```

Troubleshooting:

- Plugin not found: cagent will log a debug message and use the default base URL
- Endpoint empty in status: ensure the Model Runner is running, or set `base_url` manually
- Flag parsing: if using a single string, quote it properly in YAML; you can also use a list
"Alloy models" essentially means using more than one model in the same chat context: not all at the same time, but "randomly" throughout the conversation, to take advantage of the strong points of each model.
More information on the idea can be found here.
To have an agent use an alloy model, define more than one model in the `model` field, separated by commas.
Example:

```yaml
agents:
  root:
    model: anthropic/claude-sonnet-4-0,openai/gpt-5-mini
    ...
```

Common MCP tools include:
- Filesystem: Read/write files
- Shell: Execute shell commands
- Database: Query databases
- Web: Make HTTP requests
- Git: Version control operations
- Browser: Web browsing and automation
- Code: Programming language specific tools
- API: REST API integration tools
Local (stdio) MCP server:

```yaml
toolsets:
  - type: mcp       # Model Context Protocol
    command: string # Command to execute
    args: []        # Command arguments
    tools: []       # Optional: list of specific tools to enable
    env: []         # Environment variables for this tool
    env_file: []    # Environment variable files
```

Example:

```yaml
toolsets:
  - type: mcp
    command: rust-mcp-filesystem
    args: ["--allow-write", "."]
    tools: ["read_file", "write_file"] # Optional: specific tools only
    env:
      - "RUST_LOG=debug"
```

Remote (SSE) MCP server:
```yaml
toolsets:
  - type: mcp # Model Context Protocol
    remote:
      url: string            # Base URL to connect to
      transport_type: string # Type of MCP transport (sse or streamable)
      headers:
        key: value           # HTTP headers, mainly used for auth
    tools: []                # Optional: list of specific tools to enable
```

Example:

```yaml
toolsets:
  - type: mcp
    remote:
      url: "https://mcp-server.example.com"
      transport_type: "sse"
      headers:
        Authorization: "Bearer your-token-here"
    tools: ["search_web", "fetch_url"]
```

We recommend running containerized MCP tools, for security and resource isolation.
Under the hood, cagent will run them with the Docker MCP Gateway,
so that all the tools in the Docker MCP Catalog can be accessed through a single endpoint.
In this example, let's configure duckduckgo to give our agents the ability to search the web:

```yaml
toolsets:
  - type: mcp
    ref: docker:duckduckgo
```

Example installation of local tools with cargo or npm:

```shell
# Install the Rust-based MCP filesystem tool
cargo install rust-mcp-filesystem

# Install other popular MCP tools
npm install -g @modelcontextprotocol/server-filesystem
npm install -g @modelcontextprotocol/server-git
npm install -g @modelcontextprotocol/server-web
```

cagent includes a series of built-in tools that can greatly enhance the capabilities of your agents without the need to configure any external MCP tools.
Configuration example:

```yaml
toolsets:
  - type: filesystem # Grants the agent filesystem access
  - type: think      # Enables the think tool
  - type: todo       # Enables the todo list tool
    shared: boolean  # Should the todo list be shared between agents (optional)
  - type: memory     # Allows the agent to store memories in a local sqlite db
    path: ./mem.db   # Path to the sqlite database for memory storage (optional)
```

Let's go into a bit more detail about the built-in tools that agents can use.
The think tool allows agents to reason through problems step by step:

```yaml
agents:
  root:
    # ... other config
    toolsets:
      - type: think
```

The todo tool helps agents manage task lists:
```yaml
agents:
  root:
    # ... other config
    toolsets:
      - type: todo
```

The memory tool provides persistent storage:
```yaml
agents:
  root:
    # ... other config
    toolsets:
      - type: memory
        path: "./agent_memory.db"
```

All agents automatically have access to the task transfer tool, which allows them to delegate tasks to other agents:

```
transfer_task(agent="developer", task="Create a login form", expected_output="HTML and CSS code")
```
A complete development team with specialized roles:

```yaml
agents:
  root:
    model: claude
    description: Technical lead coordinating development
    instruction: |
      You are a technical lead managing a development team.
      Coordinate tasks between developers and ensure quality.
    sub_agents: [developer, reviewer, tester]
  developer:
    model: claude
    description: Expert software developer
    instruction: |
      You are an expert developer. Write clean, efficient code
      and follow best practices.
    toolsets:
      - type: filesystem
      - type: shell
      - type: think
  reviewer:
    model: gpt4
    description: Code review specialist
    instruction: |
      You are a code review expert. Focus on code quality,
      security, and maintainability.
    toolsets:
      - type: filesystem
  tester:
    model: gpt4
    description: Quality assurance engineer
    instruction: |
      You are a QA engineer. Write tests and ensure
      software quality.
    toolsets:
      - type: shell
      - type: todo

models:
  gpt4:
    provider: openai
    model: gpt-4o
  claude:
    provider: anthropic
    model: claude-sonnet-4-0
    max_tokens: 64000
```

A research-focused agent with web access:
```yaml
agents:
  root:
    model: claude
    description: Research assistant with web access
    instruction: |
      You are a research assistant. Help users find information,
      analyze data, and provide insights.
    toolsets:
      - type: mcp
        command: mcp-web-search
        args: ["--provider", "duckduckgo"]
      - type: todo
      - type: memory
        path: "./research_memory.db"

models:
  claude:
    provider: anthropic
    model: claude-sonnet-4-0
    max_tokens: 64000
```

cagent supports distributing agents via, and running them from, Docker registries:
```shell
# Pull an agent from a registry
./bin/cagent pull docker.io/username/my-agent:latest

# Push your agent to a registry
./bin/cagent push docker.io/username/my-agent:latest

# Run an agent directly from an image reference
./bin/cagent run docker.io/username/my-agent:latest
```

Agent references:

- File agents: `my-agent.yaml` (relative path)
- Store agents: `docker.io/username/my-agent:latest` (full Docker reference)
Agent not responding:
- Check API keys are set correctly
- Verify model name matches provider
- Check network connectivity
Tool errors:
- Ensure MCP tools are installed and accessible
- Check file permissions for filesystem tools
- Verify tool arguments and command paths
- Test MCP tools independently before integration
- Check tool lifecycle (start/stop) in debug logs
Configuration errors:
- Validate YAML syntax
- Check all referenced agents exist
- Ensure all models are defined
- Verify toolset configurations
- Check agent hierarchy (sub_agents references)
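As a concrete check for the last two points: every entry in `sub_agents` must name an agent defined in the same file, and each agent's `model` must match a key under `models:`. A minimal self-consistent sketch (the agent and model names are illustrative):

```yaml
models:
  claude:
    provider: anthropic
    model: claude-sonnet-4-0

agents:
  root:
    model: claude          # must match a key under models:
    description: Coordinator agent
    instruction: Delegate research tasks to the helper.
    sub_agents: [helper]   # each entry must be an agent defined in this file
  helper:
    model: claude
    description: Helper agent
    instruction: Carry out tasks assigned by the root agent.
```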
Session and connectivity issues:
- Verify port availability for MCP server modes
- Test MCP endpoint accessibility (curl test)
- Verify client isolation in multi-tenant scenarios
- Check session timeouts and limits
Performance issues:
- Monitor memory usage with multiple concurrent sessions
- Check for tool resource leaks
- Verify proper session cleanup
- Monitor streaming response performance
Enable debug logging for detailed information:

```shell
# CLI mode
./bin/cagent run config.yaml --debug
```

Check logs for:
- API call errors and rate limiting
- Tool execution failures and timeouts
- Configuration validation issues
- Network connectivity problems
- MCP protocol handshake issues
- Session creation and cleanup events
- Client isolation boundary violations
```shell
# Test Docker registry connectivity
docker pull docker.io/username/agent:latest

# Verify agent content
./bin/cagent pull docker.io/username/agent:latest
```

Implement persistent memory across sessions:
```yaml
agents:
  researcher:
    model: claude
    instruction: |
      You are a research assistant with persistent memory.
      Remember important findings and reference previous research.
    toolsets:
      - type: memory
        path: ./research_memory.db
```

```yaml
models:
  # Precise model for analysis
  claude_precise:
    provider: anthropic
    model: claude-sonnet-4-0
    temperature: 0.2
  gpt4:
    provider: openai
    model: gpt-4o
    temperature: 0.1
  # Creative model for content generation
  gpt4_creative:
    provider: openai
    model: gpt-4o
    temperature: 0.8

agents:
  analyst:
    model: claude_precise
    description: Fast analysis and reasoning
  coder:
    model: gpt4
    description: Precise, not very creative developer
  writer:
    model: gpt4_creative
    description: Creative content generation
```

This guide should help you get started with cagent and build powerful multi-agent systems.