A production-ready Go library for building tool-using, code-executing agents across frontier, open, and CLI-native model providers. MCP support is built in, but it is only one part of the runtime.
- Build one agent runtime instead of separate code paths for MCP tools, code execution, and provider switching
- Mix API-native models and CLI-native coding agents like Claude Code, Gemini CLI, and Codex-style providers
- Add production features such as summarization, large-output offloading, structured output, parallel tools, tracing, and caching
- Reuse the same runtime from Go applications and from the Node.js SDK
MCPAgent is a general-purpose Go agent runtime. It gives you one agent abstraction that can:
- Use MCP tools across multiple servers and protocols (HTTP, SSE, stdio)
- Run in multiple execution modes with SimpleAgent, tool search, and code execution
- Connect to coding-agent CLIs such as Claude Code, Gemini CLI, and Codex-style providers
- Route across model ecosystems including OpenAI, Anthropic, OpenRouter, Bedrock, Vertex, Azure, MiniMax, and open-model gateways
- Execute tools efficiently with optional parallel tool calls, caching, and dynamic tool discovery
- Stay productive in long sessions with context summarization and large-output offloading
- Return structured results using fixed conversion or tool-based structured output
- Support production workflows with observability, custom tools, session reuse, and a Node.js SDK
If you only need MCP, the library does that well. If you need a broader agent runtime that can mix MCP, code execution, provider routing, coding agents, and structured workflows, that is the larger value of the project.
If you are evaluating the project for the first time, these are the best first examples:
- basic/ - Smallest working MCP-backed agent example
- workflow_model_routing/ - Compare Kimi, MiniMax M2.7, and GLM-5.1 on the same MCP-backed workflow task
- basic_claude_code/ - Coding-agent CLI flow through the MCP bridge
- basic_gemini_cli/ - Fast Gemini CLI bridge example
- basic_gemini_cli_fallback_claude_code/ - Gemini CLI with Claude Code fallback
- multi-turn/ - Conversation history and cumulative token tracking
- nodejs-sdk/ - JavaScript/TypeScript SDK examples over gRPC
If you want a broader multi-tool demo, use multi-mcp-server/ after the basics.
# Add to your go.mod
go get github.com/manishiitg/mcpagent
# Or use replace directive for local development
replace github.com/manishiitg/mcpagent => ../mcpagent
package main
import (
"context"
"fmt"
"os"
"time"
mcpagent "github.com/manishiitg/mcpagent/agent"
"github.com/manishiitg/mcpagent/llm"
)
func main() {
openAIKey := os.Getenv("OPENAI_API_KEY")
if openAIKey == "" {
panic("OPENAI_API_KEY is required")
}
llmModel, err := llm.InitializeLLM(llm.Config{
Provider: llm.ProviderOpenAI,
ModelID: "gpt-4o",
APIKeys: &llm.ProviderAPIKeys{
OpenAI: &openAIKey,
},
})
if err != nil {
panic(err)
}
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
defer cancel()
agent, err := mcpagent.NewAgent(
ctx,
llmModel,
"mcp_servers.json",
mcpagent.WithMode(mcpagent.SimpleAgent),
)
if err != nil {
panic(err)
}
response, err := agent.Ask(ctx, "What tools are available?")
if err != nil {
panic(err)
}
fmt.Println(response)
}
See examples/ for complete working examples:
- basic/ - Basic agent setup with single MCP server
- workflow_model_routing/ - Run the same MCP-backed prompt against Kimi, MiniMax M2.7, or GLM-5.1
- basic_claude_code/ - Basic Claude Code setup using the MCP bridge layer (defaults to claude-haiku-4-5)
- basic_gemini_cli/ - Basic Gemini CLI setup using the MCP bridge layer (defaults to flash-lite)
- basic_gemini_cli_fallback_claude_code/ - Gemini CLI primary with Claude Code fallback (supports FORCE_FALLBACK=1)
- basic_codex_cli/ - Basic Codex CLI setup using the MCP bridge layer (defaults to gpt-5.3-codex-spark)
- multi-turn/ - Multi-turn conversations with history
- multi-mcp-server/ - Connect to multiple MCP servers
- browser-automation/ - Browser automation with Playwright
- structured_output/ - Structured output examples
- custom_tools/ - Register and use custom tools
- code_execution/ - Code execution mode examples
  - simple/ - Basic code execution (no folder guards)
  - browser-automation/ - Code execution with browser automation
  - multi-mcp-server/ - Code execution with tool filtering
  - custom_tools/ - Custom tools in code execution mode
- tool_search/ - Tool search mode for dynamic tool discovery
- nodejs-sdk/ - Node.js SDK examples (see below)
The official Node.js/TypeScript SDK provides a simple interface for building MCP agents in JavaScript/TypeScript applications. The SDK communicates with the Go server via gRPC over Unix sockets for low-latency, bidirectional streaming, and it can route through API providers as well as CLI-native providers like gemini-cli.
npm install @mcpagent/node
import { MCPAgent } from '@mcpagent/node';
const agent = new MCPAgent({
serverOptions: {
mcpConfigPath: './mcp_servers.json',
logLevel: 'info',
},
});
// Initialize with your LLM provider
await agent.initialize({
provider: 'gemini-cli',
modelId: 'flash-lite',
});
// Ask a question
const response = await agent.ask('What tools do you have available?');
console.log(response.response);
// Streaming responses
for await (const event of agent.askStream('Explain quantum computing')) {
if (event.type === 'chunk') {
process.stdout.write(event.text);
} else if (event.type === 'final' && event.response) {
console.log(event.response);
}
}
// Cleanup
await agent.destroy();
Register JavaScript/TypeScript handlers that the LLM can call:
import { MCPAgent } from '@mcpagent/node';
const agent = new MCPAgent({
serverOptions: { mcpConfigPath: './mcp_servers.json' },
});
// Register a calculator tool
agent.registerTool(
'calculate',
'Perform a mathematical calculation',
{
type: 'object',
properties: {
expression: { type: 'string', description: 'Math expression to evaluate' },
},
required: ['expression'],
},
async (args) => {
// Demo only: evaluating model-supplied expressions with Function() is unsafe outside a sandbox
const result = Function(`"use strict"; return (${args.expression})`)();
return String(result);
},
{ timeoutMs: 5000 }
);
await agent.initialize({
provider: 'vertex',
modelId: 'gemini-3-flash-preview',
});
// The LLM can now use your custom tool
const response = await agent.ask('What is 15 * 7 + 23?');
// Output: 15 * 7 + 23 = 128
The Node.js SDK uses a gRPC bidirectional streaming architecture:
Node.js SDK ◄──────────────────────────────────► Go Server
Single bidirectional gRPC stream
- Client sends: questions, tool results
- Server sends: text chunks, tool calls, events, final response
Benefits:
- Real-time streaming: Token-by-token responses via gRPC stream
- Inline tool callbacks: Custom tools execute in the same connection (no separate callback server)
- Low latency: Unix domain sockets for IPC
- Type-safe: Protocol Buffers for all messages
See examples/nodejs-sdk/ for complete examples:
- basic.ts - Basic agent setup and queries
- custom-tools.ts - Register and use custom tools
- multi-turn.ts - Multi-turn conversations
- Gemini CLI support - The SDK now supports provider: 'gemini-cli' for CLI-native Gemini runs from Node.js
For full SDK documentation, see sdk-node/README.md.
SimpleAgent is the default mode, in which the LLM invokes tools directly through native tool calling:
agent, err := mcpagent.NewAgent(
ctx,
llmModel,
"config.json",
mcpagent.WithMode(mcpagent.SimpleAgent),
)
Enable dynamic tool discovery for large tool catalogs. The LLM starts with only search_tools and discovers tools on demand:
agent, err := mcpagent.NewAgent(
ctx,
llmModel,
"config.json",
mcpagent.WithToolSearchMode(true),
// Optional: pre-discover frequently used tools
mcpagent.WithPreDiscoveredTools([]string{"get_weather", "send_message"}),
)
How it works:
- LLM sees only search_tools initially
- LLM calls search_tools(query: "weather") to find relevant tools
- Discovered tools (get_weather, weather_forecast) become available
- LLM can now use discovered tools
Features:
- Regex pattern matching for flexible search
- Fuzzy search fallback when no exact matches found
- Pre-discovered tools option for frequently used tools
- Works with any LLM provider
See docs/tool_search_mode.md for details.
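The search behaviour listed above (regex matching with a fuzzy fallback) can be illustrated with a minimal, self-contained sketch. The function names `searchTools` and `isSubsequence` are hypothetical, and the subsequence-based fuzzy match is an assumption chosen for brevity; the library's actual matching strategy may differ.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// searchTools returns tool names matching query as a case-insensitive regular
// expression. If the pattern is invalid or matches nothing, it falls back to a
// simple fuzzy check (hypothetical; not the library's implementation).
func searchTools(catalog []string, query string) []string {
	var matches []string
	if re, err := regexp.Compile("(?i)" + query); err == nil {
		for _, name := range catalog {
			if re.MatchString(name) {
				matches = append(matches, name)
			}
		}
	}
	if len(matches) > 0 {
		return matches
	}
	for _, name := range catalog {
		if isSubsequence(strings.ToLower(query), strings.ToLower(name)) {
			matches = append(matches, name)
		}
	}
	return matches
}

// isSubsequence reports whether every rune of needle appears in haystack in order.
func isSubsequence(needle, haystack string) bool {
	n := []rune(needle)
	i := 0
	for _, r := range haystack {
		if i < len(n) && n[i] == r {
			i++
		}
	}
	return i == len(n)
}

func main() {
	catalog := []string{"get_weather", "weather_forecast", "send_message"}
	fmt.Println(searchTools(catalog, "weather")) // [get_weather weather_forecast]
	fmt.Println(searchTools(catalog, "sndmsg"))  // [send_message] (fuzzy fallback)
}
```

Trying the regex first keeps precise queries precise; the fallback only runs when the regex finds nothing.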
Code execution mode lets the LLM execute code in any language (Python, bash, curl, Go, etc.) instead of emitting JSON tool calls. The LLM discovers MCP tool endpoints via an OpenAPI spec and writes code that makes HTTP requests to them:
// Generate API token for bearer auth
apiToken := executor.GenerateAPIToken()
// Start HTTP server with per-tool endpoints and auth
handlers := executor.NewExecutorHandlers(configPath, logger)
mux := http.NewServeMux()
mux.HandleFunc("/api/mcp/execute", handlers.HandleMCPExecute)
mux.HandleFunc("/api/custom/execute", handlers.HandleCustomExecute)
// Per-tool wildcard endpoints (used by OpenAPI spec)
mux.HandleFunc("/tools/mcp/", func(w http.ResponseWriter, r *http.Request) {
	// Route /tools/mcp/{server}/{tool} to handler
	path := strings.TrimPrefix(r.URL.Path, "/tools/mcp/")
	parts := strings.SplitN(path, "/", 2)
	// Reject paths missing either segment instead of panicking on parts[1]
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		http.NotFound(w, r)
		return
	}
	handlers.HandlePerToolMCPRequest(w, r, parts[0], parts[1])
})
authedHandler := executor.AuthMiddleware(apiToken)(mux)
server := &http.Server{Addr: "127.0.0.1:8000", Handler: authedHandler}
go server.ListenAndServe()
defer server.Shutdown(ctx)
// Create agent with code execution mode
agent, err := mcpagent.NewAgent(
ctx,
llmModel,
"config.json",
mcpagent.WithCodeExecutionMode(true),
mcpagent.WithAPIConfig("http://127.0.0.1:8000", apiToken),
)
The LLM calls get_api_spec(server_name) to discover per-tool HTTP endpoints, then uses execute_shell_command to write and run code that calls those endpoints. Custom tools (workspace tools, shell execution) remain as direct tool calls.
Note: Code execution mode requires an HTTP server with bearer token auth running (configurable via WithAPIConfig()).
Smart routing (deprecated) dynamically filters tools based on conversation context to reduce token usage:
agent, err := mcpagent.NewAgent(
ctx,
llmModel,
"config.json",
mcpagent.WithSmartRouting(true), // DEPRECATED
mcpagent.WithSmartRoutingThresholds(20, 3), // DEPRECATED
)
Context offloading is a context engineering strategy that automatically saves large tool outputs to the filesystem instead of keeping them in the LLM's context window. This implements the "offload context" pattern, one of three primary context engineering approaches used in production agents like Manus.
Why Context Offloading?
As agents execute tasks, tool call results accumulate in the context window. Research from Chroma and Anthropic shows that as context windows fill, LLM performance degrades due to attention budget depletion. Context offloading prevents this by:
- Saving tokens: Only file path + preview (~200 chars) instead of full content (potentially 50k+ chars)
- Preventing context overflow: Large outputs don't consume context window space
- Maintaining performance: LLM attention budget isn't depleted by large payloads
- Enabling efficient exploration: Agent can access data incrementally as needed
How It Works:
agent, err := mcpagent.NewAgent(
ctx, llmModel, "config.json",
mcpagent.WithContextOffloading(true),
mcpagent.WithLargeOutputThreshold(10000), // tokens (default)
)
When tool outputs exceed the threshold:
- External Storage: Full content is saved to tool_output_folder/{session-id}/ with unique filenames
- Compact Reference: LLM receives file path + preview (first 50% of threshold) instead of full content
- On-Demand Access: Agent uses virtual tools to access data incrementally:
  - read_large_output - Read specific character ranges
  - search_large_output - Search for patterns using ripgrep
  - query_large_output - Execute jq queries on JSON files
Example Token Savings:
Without Context Offloading:
- Tool Output: 50,000 characters (~12,500 tokens)
- Sent to LLM: 50,000 chars (~12,500 tokens)
- Result: Context window overflow, attention budget depletion
With Context Offloading:
- Tool Output: 50,000 characters (~12,500 tokens)
- Saved to filesystem: 50,000 chars
- Sent to LLM: ~200 chars (file path + preview) (~50 tokens)
- Result: 99.6% token reduction, no context overflow
Note: The threshold is measured in tokens (using tiktoken encoding), not characters.
A threshold of 10,000 tokens corresponds to roughly 40,000 characters (assuming ~4 chars per token).
Related Patterns:
This implementation follows the context engineering strategies outlined in Manus's approach:
- Offload Context: Store tool results externally, access on-demand ✅ Implemented
- Reduce Context: Compact stale results, summarize when needed ⏳ Pending
- Isolate Context: Use sub-agents for discrete tasks (multi-agent support)
Similar patterns are used in Claude Code, LangChain, and other production agent systems.
Pending: Dynamic Context Reduction
Currently, context offloading only applies to large tool outputs when they're first generated. A future enhancement will implement dynamic context reduction to compact stale tool results as the context window fills, even if they weren't initially large.
What's Pending:
- Compact Stale Results
  - Concept: Replace older tool results with compact references (e.g., file paths) as context fills
  - Behavior: Keep recent tool results in full to guide the agent's next decision, while older results are replaced with references
  - Implementation: Automatically detect when tool results become "stale" (based on age, relevance, or context usage) and replace them with compact references
  - Scope: This would apply to ALL tool results (not just large ones), dynamically compacting them when they become "stale"
  - Reference: Similar to Anthropic's context editing feature
  - Example: A 2000-token tool result from 10 turns ago becomes:
    "Tool: search_docs returned results (saved to: tool_output_folder/session-123/search_20250101_120000.json)"
- Summarize When Needed
  - Concept: Once compaction reaches diminishing returns, apply schema-based summarization to the full trajectory
  - Behavior: Generate consistent summary objects using full tool results, further reducing context while preserving essential information
  - Implementation: When compaction alone isn't enough to manage context size, apply structured summarization with predefined schemas for different tool result types
  - Scope: Summarize the entire conversation trajectory when individual compaction is insufficient
  - Example: Instead of keeping 20 tool calls with full results, create a structured summary:
    { "tool_calls_summary": [ {"tool": "search", "count": 5, "key_findings": ["..."], "files": ["..."]}, {"tool": "read_file", "count": 3, "files_read": ["..."]} ] }
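Because stale-result compaction is still pending, the following is only a sketch of how the idea could look, using age as the sole staleness signal. The `toolResult` type, `compactStale`, and the `maxAge` parameter are all hypothetical and do not exist in the library.

```go
package main

import "fmt"

// toolResult is a hypothetical record of one tool call in the conversation.
type toolResult struct {
	Turn    int    // turn at which the result was produced
	Tool    string // tool name
	Content string // full result text kept in context
	Path    string // where the full result was saved, if anywhere
}

// compactStale returns a copy of results in which entries older than maxAge
// turns (and already saved to disk) are replaced by a compact reference.
// Recent results are left untouched.
func compactStale(results []toolResult, currentTurn, maxAge int) []toolResult {
	out := make([]toolResult, len(results))
	for i, r := range results {
		if currentTurn-r.Turn > maxAge && r.Path != "" {
			r.Content = fmt.Sprintf("Tool: %s returned results (saved to: %s)", r.Tool, r.Path)
		}
		out[i] = r
	}
	return out
}

func main() {
	results := []toolResult{
		{Turn: 1, Tool: "search_docs", Content: "...long result...", Path: "tool_output_folder/session-123/search.json"},
		{Turn: 11, Tool: "read_file", Content: "recent result", Path: "tool_output_folder/session-123/read.txt"},
	}
	for _, r := range compactStale(results, 12, 5) {
		fmt.Println(r.Content)
	}
}
```

A fuller version would also weigh relevance and context usage, as the list above describes; age alone is the simplest signal to demonstrate.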
Current Behavior vs. Future Enhancement:
Current (Context Offloading):
- Large output (>10k tokens) → Offloaded immediately
- Small output (<10k tokens) → Stays in context forever
- Result: Context can still fill up with many small tool results
Future (Context Reduction):
- Large output (>10k tokens) → Offloaded immediately ✅
- Small output (<10k tokens) → Stays in context initially
- As context fills → Small outputs become "stale" → Compacted to references
- When compaction insufficient → Summarize trajectory
- Result: Context window stays manageable throughout long conversations
This enhancement would complete the "Reduce Context" strategy from Manus's context engineering approach, working alongside context offloading to maintain optimal context window usage.
See the Context Offloading example for a complete demonstration.
Context summarization automatically summarizes conversation history when token usage exceeds a threshold, so long-running conversations stay within the context window:
agent, err := mcpagent.NewAgent(
ctx,
llmModel,
"config.json",
// Enable context summarization
mcpagent.WithContextSummarization(true),
// Trigger when token usage reaches 70% of context window
mcpagent.WithSummarizeOnTokenThreshold(true, 0.7),
// Keep last 8 messages intact
mcpagent.WithSummaryKeepLastMessages(8),
)
The agent monitors token usage and automatically replaces older messages with a concise LLM-generated summary when the threshold is reached, while preserving recent messages and tool call integrity. This enables "infinite" conversation depth within fixed context windows.
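The trigger-and-compact logic described here can be sketched as follows. The `message` type, `shouldSummarize`, and `compactHistory` are hypothetical, the summarizer is a stub standing in for an LLM call, and the sketch deliberately omits the tool-call integrity handling that the real agent performs.

```go
package main

import "fmt"

// message is a hypothetical, simplified conversation entry.
type message struct {
	Role    string
	Content string
}

// shouldSummarize reports whether token usage has reached the given fraction
// of the context window (for example 0.7 for 70%).
func shouldSummarize(usedTokens, contextWindow int, threshold float64) bool {
	return float64(usedTokens) >= threshold*float64(contextWindow)
}

// compactHistory replaces everything except the last keepLast messages with a
// single summary message produced by summarize.
func compactHistory(history []message, keepLast int, summarize func([]message) string) []message {
	if keepLast < 0 {
		keepLast = 0
	}
	if len(history) <= keepLast {
		return history
	}
	cut := len(history) - keepLast
	out := make([]message, 0, keepLast+1)
	out = append(out, message{Role: "system", Content: summarize(history[:cut])})
	return append(out, history[cut:]...)
}

func main() {
	history := make([]message, 12)
	for i := range history {
		history[i] = message{Role: "user", Content: fmt.Sprintf("message %d", i)}
	}
	stub := func(old []message) string {
		return fmt.Sprintf("summary of %d earlier messages", len(old))
	}
	if shouldSummarize(80000, 100000, 0.7) {
		history = compactHistory(history, 8, stub)
	}
	fmt.Println(len(history))       // 9
	fmt.Println(history[0].Content) // summary of 4 earlier messages
}
```

Keeping the most recent messages verbatim preserves the detail the model needs for its next step, while the summary stands in for everything older.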
Intelligent caching reduces connection times by 60-85%:
// Caching is enabled by default
// Configure via environment variables:
// MCP_CACHE_DIR=/path/to/cache
// MCP_CACHE_TTL_MINUTES=10080 (7 days)
Get structured data from LLM responses in two ways:
Fixed Conversion Model (2 LLM calls - reliable):
type Person struct {
Name string `json:"name"`
Age int `json:"age"`
Email string `json:"email"`
}
person, err := mcpagent.AskStructured(
agent,
ctx,
"Create a person profile for John Doe, age 30, email john@example.com",
Person{},
schemaString,
)
Tool-Based Model (1 LLM call - faster):
result, err := mcpagent.AskWithHistoryStructuredViaTool[Order](
agent,
ctx,
messages,
"submit_order",
"Submit an order with items",
orderSchema,
)
if err != nil {
	panic(err)
}
if result.HasStructuredOutput {
	order := result.StructuredResult
	// Use structured order data
}
See examples/structured_output/ for complete examples.
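The tool-based model relies on the LLM emitting its answer as the arguments of a tool call, which the caller then decodes into a typed value. A minimal sketch of that decoding step, using only `encoding/json` and generics, might look like this. `decodeToolArgs` and the `Order` fields are hypothetical and are not part of the library's API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Order is a hypothetical schema for the structured result.
type Order struct {
	Item     string `json:"item"`
	Quantity int    `json:"quantity"`
}

// decodeToolArgs parses the raw JSON arguments of a tool call into T and
// rejects fields that the target type does not declare.
func decodeToolArgs[T any](rawArgs string) (T, error) {
	var out T
	dec := json.NewDecoder(strings.NewReader(rawArgs))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&out); err != nil {
		var zero T
		return zero, err
	}
	return out, nil
}

func main() {
	order, err := decodeToolArgs[Order](`{"item": "widget", "quantity": 3}`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", order) // {Item:widget Quantity:3}
}
```

Rejecting unknown fields is a design choice for the sketch: it surfaces schema drift early instead of silently dropping data.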
Register your own tools that work alongside MCP server tools. Custom tools work in both standard mode and code execution mode:
Standard Mode (direct tool calls):
// Define tool parameters (JSON schema)
params := map[string]interface{}{
"type": "object",
"properties": map[string]interface{}{
"operation": map[string]interface{}{
"type": "string",
"enum": []string{"add", "subtract", "multiply", "divide"},
},
"a": map[string]interface{}{"type": "number"},
"b": map[string]interface{}{"type": "number"},
},
"required": []string{"operation", "a", "b"},
}
// Register the tool
err := agent.RegisterCustomTool(
"calculator",
"Performs mathematical operations",
params,
calculatorFunction,
"utility", // category (required)
)
// Tool execution function
func calculatorFunction(ctx context.Context, args map[string]interface{}) (string, error) {
	// Extract and validate arguments; checked assertions avoid a panic on bad input
	operation, okOp := args["operation"].(string)
	a, okA := args["a"].(float64)
	b, okB := args["b"].(float64)
	if !okOp || !okA || !okB {
		return "", fmt.Errorf("invalid arguments: %v", args)
	}
	// Perform calculation
	var result float64
	switch operation {
	case "add":
		result = a + b
	case "subtract":
		result = a - b
	// ...
	default:
		return "", fmt.Errorf("unsupported operation %q", operation)
	}
	return fmt.Sprintf("Result: %.2f", result), nil
}
Code Execution Mode (direct tool calls + HTTP API):
// In code execution mode, custom tools are:
// 1. Exposed as direct LLM tool calls (e.g., execute_shell_command, workspace tools)
// 2. MCP server tools are accessed via HTTP API endpoints (discovered via get_api_spec)
// 3. Custom tools can also be accessed via /api/custom/execute endpoint
// Register custom tool (same API)
err := agent.RegisterCustomTool(
"get_weather",
"Gets weather data for a location",
weatherParams,
weatherFunction,
"data", // category
)
// LLM can call custom tools directly as tool calls,
// or use get_api_spec to discover HTTP endpoints for MCP tools
See examples/custom_tools/ for standard mode examples and examples/code_execution/custom_tools/ for code execution mode examples.
When the LLM returns multiple tool calls in a single response, they can be executed concurrently using goroutines (fork-join pattern) instead of sequentially:
agent, err := mcpagent.NewAgent(
ctx, llmModel, "config.json",
mcpagent.WithParallelToolExecution(true),
)
How it works:
- LLM returns N tool calls in one response
- All tool calls are prepared sequentially (argument parsing, client resolution)
- Tool calls execute concurrently via goroutines
- Results are collected in deterministic order matching the original tool call order
Observability: ToolCallStartEvent includes an IsParallel field (true when the tool call is part of a parallel batch, false for sequential execution) so event listeners and tracers can distinguish between parallel and sequential tool calls.
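The fork-join steps above can be sketched with a `sync.WaitGroup` and an index-addressed result slice. The `toolCall` type and `runTool` function are hypothetical placeholders; a real agent would dispatch each call to an MCP client instead.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// toolCall is a hypothetical stand-in for one tool call returned by the LLM.
type toolCall struct {
	Name string
	Arg  string
}

// runTool is a placeholder executor used only for this sketch.
func runTool(c toolCall) string {
	return c.Name + ":" + strings.ToUpper(c.Arg)
}

// executeParallel runs every call in its own goroutine (fork) and waits for
// all of them (join). Each goroutine writes only to its own slice index, so
// results keep the original call order without extra locking.
func executeParallel(calls []toolCall) []string {
	results := make([]string, len(calls))
	var wg sync.WaitGroup
	for i, c := range calls {
		wg.Add(1)
		go func(i int, c toolCall) {
			defer wg.Done()
			results[i] = runTool(c)
		}(i, c)
	}
	wg.Wait()
	return results
}

func main() {
	calls := []toolCall{{"get_weather", "paris"}, {"send_message", "hello"}}
	for _, r := range executeParallel(calls) {
		fmt.Println(r)
	}
	// Prints:
	// get_weather:PARIS
	// send_message:HELLO
}
```

Writing into a pre-sized slice by index is what makes the output order deterministic regardless of which goroutine finishes first.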
Built-in tracing with Langfuse support:
tracer := observability.NewLangfuseTracer(...)
agent, err := mcpagent.NewAgent(
ctx,
llmModel,
"config.json",
mcpagent.WithTracer(tracer),
mcpagent.WithTraceID("trace-id"),
mcpagent.WithLogger(logger),
)
Comprehensive documentation is available in the docs/ directory:
- OAuth Authentication - OAuth 2.0 authentication for MCP servers
- Code Execution Agent - Execute code in any language via OpenAPI spec
- Tool Search Mode - Dynamic tool discovery for large tool catalogs
- Tool-Use Agent - Standard tool calling mode
- Context Summarization - Automatic history summarization
- Smart Routing (DEPRECATED) - Dynamic tool filtering
- Context Offloading - Offload large tool outputs to filesystem (offload context pattern)
  - Implements the "offload context" strategy from Manus's context engineering approach
  - Prevents context window overflow and reduces token costs
  - Enables efficient on-demand data access via virtual tools
- MCP Cache System - Server metadata caching
- Folder Guard - Fine-grained file access control
- LLM Resilience - Error handling and fallbacks
- Event System - Event architecture
- Parallel Tool Execution - Concurrent tool call execution
- Token Tracking - Usage monitoring
Complete working examples are available in the examples/ directory:
- basic/ - Simple agent setup with a single MCP server
- multi-turn/ - Multi-turn conversations with conversation history
- context_summarization/ - Automatic context summarization
- basic_claude_code/ - Claude Code provider with bridge-backed MCP access
  - Uses ProviderClaudeCode with the mcpbridge flow
  - Starts a local executor API automatically for bridge-backed tool access
  - Defaults to the faster claude-haiku-4-5 model
- basic_gemini_cli/ - Gemini CLI provider with bridge-backed MCP access
  - Uses ProviderGeminiCLI with the mcpbridge flow
  - Starts a local executor API automatically for bridge-backed tool access
  - Defaults to the faster flash-lite model
- basic_gemini_cli_fallback_claude_code/ - Gemini CLI primary with Claude Code fallback
  - Uses ProviderGeminiCLI as primary and ProviderClaudeCode as cross-provider fallback
  - Demonstrates mcpagent.WithCrossProviderFallback(...)
  - Supports FORCE_FALLBACK=1 to intentionally fail Gemini and verify the Claude Code handoff
- basic_codex_cli/ - Codex CLI provider with bridge-backed MCP access
  - Uses ProviderCodexCLI with the mcpbridge flow
  - Starts a local executor API automatically for bridge-backed tool access
  - Defaults to the faster gpt-5.3-codex-spark model
- multi-mcp-server/ - Connect to multiple MCP servers simultaneously
- browser-automation/ - Browser automation using Playwright MCP server
- structured_output/fixed/ - Fixed conversion model for structured output
  - Uses AskStructured() method
  - 2 LLM calls (text response + JSON conversion)
  - More reliable, works with complex schemas
- structured_output/tool/ - Tool-based model for structured output
  - Uses AskWithHistoryStructuredViaTool() method
  - 1 LLM call (tool call during conversation)
  - Faster and more cost-effective
- custom_tools/ - Register and use custom tools
  - Register multiple custom tools with different categories
  - Tools work alongside MCP server tools
  - Examples: calculator, text formatter, weather simulator, text counter
- offload_context/ - Context offloading example
  - Demonstrates automatic offloading of large tool outputs to filesystem
  - Shows how tool results are stored externally and accessed on-demand
  - Uses virtual tools (read_large_output, search_large_output, query_large_output) for efficient data exploration
  - Example: Search operations that produce large results, automatically offloaded and accessed incrementally
- tool_search/ - Tool search mode for dynamic tool discovery
  - LLM starts with only search_tools virtual tool
  - Demonstrates searching for tools using regex patterns
  - Shows how discovered tools become available dynamically
  - Uses Vertex AI with Gemini 3 Flash
  - Example: Search for documentation tools and use them to get library information
- code_execution/simple/ - Basic code execution mode
  - LLM discovers MCP tools via OpenAPI spec (get_api_spec)
  - Writes and executes code in any language (Python, bash, curl, etc.)
  - Bearer token auth secures API endpoints
  - Per-tool HTTP endpoints for MCP tool access
  - No folder guards (simplest example)
  - HTTP server with auth required
- code_execution/browser-automation/ - Code execution with browser automation
  - Combines code execution mode with Playwright MCP server
  - Complex multi-step browser automation tasks
  - Example: IPO analysis with web scraping and data collection
- code_execution/multi-mcp-server/ - Code execution with tool filtering
  - Demonstrates tool filtering in code execution mode
  - Uses WithSelectedTools() and WithSelectedServers() to filter available tools
  - Example: Selective tool access across multiple MCP servers
- code_execution/custom_tools/ - Custom tools in code execution mode
  - Register custom tools that work in code execution mode
  - Custom tools exposed as direct LLM tool calls
  - MCP tools accessed via HTTP API with bearer auth
  - Example: Weather tool accessible alongside MCP tools
Examples include:
- Complete working code
- MCP server configuration
- Setup instructions in code, local files, or companion docs
Some example directories include dedicated README.md files, while others are intentionally lightweight and are meant to be read directly from the example source.
Create a JSON file with your MCP servers:
{
"mcpServers": {
"filesystem": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem", "./demo"]
},
"memory": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-memory"]
}
}
}
The agent supports extensive configuration via functional options:
agent, err := mcpagent.NewAgent(
ctx, llmModel, "config.json",
// Observability (optional)
mcpagent.WithTracer(tracer),
mcpagent.WithTraceID(traceID),
mcpagent.WithLogger(logger),
// Agent mode
mcpagent.WithMode(mcpagent.SimpleAgent),
// Conversation settings
mcpagent.WithMaxTurns(30),
mcpagent.WithTemperature(0.7),
mcpagent.WithToolChoice("auto"),
// Code execution
mcpagent.WithCodeExecutionMode(true),
// Tool search mode (dynamic tool discovery)
mcpagent.WithToolSearchMode(true),
mcpagent.WithPreDiscoveredTools([]string{"tool1", "tool2"}),
// Smart routing (DEPRECATED)
mcpagent.WithSmartRouting(true), // DEPRECATED
mcpagent.WithSmartRoutingThresholds(20, 3), // DEPRECATED
// Parallel tool execution (concurrent goroutines for multiple tool calls)
mcpagent.WithParallelToolExecution(true),
// Context offloading (offload large tool outputs to filesystem)
mcpagent.WithContextOffloading(true),
mcpagent.WithLargeOutputThreshold(10000),
// Context summarization
mcpagent.WithContextSummarization(true),
mcpagent.WithSummarizeOnTokenThreshold(true, 0.7),
// Custom tools
mcpagent.WithCustomTools(customTools),
// Tool selection
mcpagent.WithSelectedTools([]string{"server1:tool1", "server2:*"}),
mcpagent.WithSelectedServers([]string{"server1", "server2"}),
// Custom tool registration (after agent creation)
// agent.RegisterCustomTool(name, description, params, execFunc, category)
)
// Folder guard paths are set on the created agent instance
agent.SetFolderGuardPaths(allowedRead, allowedWrite)
The package includes comprehensive testing utilities:
# Run all tests
cd cmd/testing
go test ./...
# Run specific test
go run testing.go agent-mcp --log-file logs/test.log
go run testing.go code-exec --log-file logs/test.log
go run testing.go smart-routing --log-file logs/test.log
go run testing.go parallel-tool-exec --provider vertex --model gemini-3-flash-preview
See cmd/testing/README.md for details.
mcpagent/
├── agent/                      # Core agent implementation
│   ├── agent.go                # Main Agent struct and NewAgent()
│   ├── conversation.go         # Conversation loop and tool execution
│   ├── connection.go           # MCP server connection management
│   └── ...
├── grpcserver/                 # gRPC server (for SDK communication)
│   ├── server.go               # gRPC server setup
│   ├── service.go              # AgentService implementation
│   ├── stream_handler.go       # Bidirectional stream handling
│   └── pb/                     # Generated protobuf code
├── mcpclient/                  # MCP client implementations
│   ├── client.go               # Client interface and implementations
│   ├── stdio_manager.go        # stdio protocol
│   ├── sse_manager.go          # SSE protocol
│   └── http_manager.go         # HTTP protocol
├── mcpcache/                   # Caching system
│   ├── manager.go              # Cache manager
│   └── openapi/                # OpenAPI spec generation for code execution mode
├── llm/                        # LLM provider integration
│   ├── providers.go            # Provider implementations
│   └── types.go                # LLM types
├── events/                     # Event system
│   ├── data.go                 # Event data structures
│   └── types.go                # Event types
├── logger/                     # Logging
│   └── v2/                     # Logger v2 interface
├── observability/              # Tracing and observability
│   ├── tracer.go               # Tracer interface
│   └── langfuse_tracer.go      # Langfuse implementation
├── executor/                   # Tool execution handlers
├── sdk-node/                   # Node.js/TypeScript SDK
│   ├── src/                    # SDK source code
│   │   ├── agent.ts            # MCPAgent class
│   │   ├── grpc-client.ts      # gRPC client
│   │   └── stream-handler.ts   # Stream management
│   └── README.md               # SDK documentation
├── proto/                      # Protocol Buffer definitions
│   └── agent.proto             # gRPC service definitions
├── examples/                   # Example applications
└── docs/                       # Documentation
- OpenAI: GPT-4.1, GPT-4o, reasoning models, and compatible tool-calling models
- Anthropic: Claude models through direct provider integration
- OpenRouter: Access to open and frontier models behind a unified API
- AWS Bedrock: Claude, Llama, Mistral, and other Bedrock-served models
- Google Vertex AI: Gemini and related Vertex-hosted models
- Azure: Azure-hosted OpenAI and related model deployments
- Claude Code / Gemini CLI / Codex-style CLI providers: Coding-agent integrations through provider abstractions
- MiniMax: MiniMax chat and coding-plan providers
- Custom Providers: Extensible provider interface
MCP remains an important integration layer in the runtime, with support for:
- stdio: Standard input/output (most common)
- SSE: Server-Sent Events
- HTTP: REST API
Contributions are welcome! Please see the Documentation Writing Guide for standards.
This project is licensed under the MIT License - see the LICENSE file for details.
- MCP Protocol: Built on the Model Context Protocol
- multi-llm-provider-go: LLM provider abstraction layer
- mcp-go: MCP protocol implementation
- Context Engineering: Context offloading implementation inspired by Manus's context engineering strategies
Made with ❤️ for the AI community