5 changes: 3 additions & 2 deletions docs/ai-triage.en.md
@@ -61,6 +61,7 @@ JSON-RPC 2.0 over HTTP at `POST /mcp`. Implements `initialize`, `tools/list`, `t
|---------|----------|---------|-------------|
| `AGENT_MCP_ENABLED` | `--mcp-enabled` | `false` | Enable the MCP server |
| `AGENT_MCP_BIND` | `--mcp-bind` | `127.0.0.1:9103` | Listen address (host:port) |
+| `AGENT_MCP_TOKEN` | `--mcp-token` | (none) | Bearer token required on every request. Mandatory for non-loopback bind. |

### Tools

@@ -80,8 +81,8 @@ Tools whose dependency isn't configured (eBPF off, port-scan off, GPU absent, et

### Security model

-- **Localhost-only by default.** The bind defaults to `127.0.0.1:9103`. Override to `0.0.0.0:9103` only behind a network policy (firewall, mesh policy, security group).
-- **No built-in auth.** The trust boundary is the bind address. If you expose MCP off the loopback, you must put auth in front of it (mTLS at the proxy, IP allow-list, or wrap it in a sidecar).
+- **Localhost-only by default.** The bind defaults to `127.0.0.1:9103`. When loopback (`127.0.0.0/8`, `::1`, `localhost`), the trust boundary is the bind address itself.
+- **Token mandatory off-loopback.** If `AGENT_MCP_BIND` is changed to a non-loopback address, `AGENT_MCP_TOKEN` must be set or `serve_mcp` refuses to start. When set, every request must carry `Authorization: Bearer <token>`; comparison is constant-time. Setting the token on a loopback bind is allowed and stacks as defence in depth. A network-policy layer in front (firewall, mesh policy, security group) remains a separate, encouraged control.
- **Read-mostly tools.** `allocate_ports` is stateless (caller must bind immediately) and `agent_check_update --force` only triggers a manifest poll — neither mutates fleet state.
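
As an illustration only (this sketch is not part of the diff), the startup rule in the security model can be read as a small decision table. The `StartDecision` type and `decide_start` function are hypothetical names introduced here; the empty-token handling assumes the `filter(|s| !s.is_empty())` step that `main.rs` applies before the token reaches the server.

```rust
/// Hypothetical outcome type, introduced only for this illustration.
#[derive(Debug, PartialEq)]
enum StartDecision {
    /// The server binds; `auth_required` says whether requests need a Bearer token.
    Serve { auth_required: bool },
    /// The server does not bind (the rest of the agent keeps running).
    Refuse,
}

/// Sketch of the documented rule: a non-loopback bind needs a non-empty token.
/// An empty token is treated the same as no token at all.
fn decide_start(bind_is_loopback: bool, token: Option<&str>) -> StartDecision {
    let token = token.filter(|t| !t.is_empty());
    match (bind_is_loopback, token) {
        // Off-loopback without a token: refuse rather than expose the tools.
        (false, None) => StartDecision::Refuse,
        // Every other combination serves; a configured token is always enforced.
        (_, t) => StartDecision::Serve { auth_required: t.is_some() },
    }
}
```

Under these assumptions, `decide_start(true, None)` serves without auth, `decide_start(false, None)` refuses, and a token on a loopback bind is still enforced, which matches the "defence in depth" wording above.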

### Example: call a tool
5 changes: 3 additions & 2 deletions docs/ai-triage.zh.md
@@ -61,6 +61,7 @@ MCP 服务器和 triage 端点之间**不直接通信**。运维人员(或后续
|---------|---------|--------|------|
| `AGENT_MCP_ENABLED` | `--mcp-enabled` | `false` | 启用 MCP 服务器 |
| `AGENT_MCP_BIND` | `--mcp-bind` | `127.0.0.1:9103` | 监听地址(host:port) |
+| `AGENT_MCP_TOKEN` | `--mcp-token` | (无) | 每个请求都必须携带的 Bearer token。非 loopback 绑定时必须设置。 |

### 工具列表

@@ -80,8 +81,8 @@ MCP 服务器和 triage 端点之间**不直接通信**。运维人员(或后续

### 安全模型

-- **默认仅监听 localhost。** 默认绑定 `127.0.0.1:9103`。改成 `0.0.0.0:9103` 必须配合网络策略(防火墙、mesh 策略、安全组)
-- **没有内置认证。** 信任边界就是绑定地址本身。如果把 MCP 暴露到 loopback 之外,必须自己加认证(代理层 mTLS、IP 白名单、或包一层 sidecar)
+- **默认仅监听 localhost。** 默认绑定 `127.0.0.1:9103`。当绑定为 loopback(`127.0.0.0/8`、`::1`、`localhost`)时,信任边界就是绑定地址本身
+- **非 loopback 时强制 token。** 如果把 `AGENT_MCP_BIND` 改成非 loopback 地址,必须同时设置 `AGENT_MCP_TOKEN`,否则 `serve_mcp` 拒绝启动。一旦配置了 token,每个请求都必须携带 `Authorization: Bearer <token>`;比较使用恒定时间算法。loopback 上同样可以配置 token 作为纵深防御。前置网络策略(防火墙、mesh 策略、安全组)仍然是另一道独立的、推荐的防线
- **以读为主的工具集。** `allocate_ports` 是无状态的(调用方必须立即 bind),`agent_check_update --force` 只触发一次 manifest 轮询 —— 两者都不修改集群状态。

### 示例:调用一个工具
11 changes: 9 additions & 2 deletions sigma-agent/CLAUDE.md
@@ -57,6 +57,7 @@ sigma-agent/
| `AGENT_EBPF_TRAFFIC_MAX_ENTRIES` | `--ebpf-traffic-max-entries` | `8192` | BPF map max entries (unique PIDs) |
| `AGENT_MCP_ENABLED` | `--mcp-enabled` | `false` | Enable MCP (LLM tool) server |
| `AGENT_MCP_BIND` | `--mcp-bind` | `127.0.0.1:9103` | MCP listen address (host:port) |
+| `AGENT_MCP_TOKEN` | `--mcp-token` | (none) | Bearer token required on every MCP request. Mandatory for non-loopback bind. |

## IP Discovery

@@ -491,8 +492,14 @@ external LLM can call. This is the agent half of Sigma's AI surface; the LLM "br
Idle cost ≈ a listening socket.
- **No new background loops.** All tools read from `Arc<RwLock<...>>` snapshots maintained by
existing subsystems.
-- **Localhost-only by default.** `AGENT_MCP_BIND` defaults to `127.0.0.1:9103`. The whole
-  surface is gated by the bind address — there is no per-request auth.
+- **Localhost-only by default.** `AGENT_MCP_BIND` defaults to `127.0.0.1:9103`. When the
+  bind is loopback (`127.0.0.0/8`, `::1`, `localhost`) the surface is gated by the bind
+  address alone.
+- **Token mandatory off-loopback.** If `AGENT_MCP_BIND` is changed to a non-loopback
+  address, `AGENT_MCP_TOKEN` must be set or `serve_mcp` refuses to start (loud `error!`
+  log, agent keeps running with MCP disabled). When set, every request must carry
+  `Authorization: Bearer <token>`; the comparison is constant-time. Setting the token on a
+  loopback bind is allowed and stacks for defence in depth.
- **Read-mostly.** `allocate_ports` is stateless (caller must bind immediately) and
`agent_check_update --force` only triggers a manifest poll. No tool mutates fleet state.

19 changes: 18 additions & 1 deletion sigma-agent/README.md
@@ -33,6 +33,7 @@ Config via environment variables or CLI flags (flags override env):
| `AGENT_SSH_PORT` | `--ssh-port` | `22` | SSH port to report |
| `AGENT_MCP_ENABLED` | `--mcp-enabled` | `false` | Enable MCP (LLM tool) server |
| `AGENT_MCP_BIND` | `--mcp-bind` | `127.0.0.1:9103` | MCP listen address (host:port) |
+| `AGENT_MCP_TOKEN` | `--mcp-token` | (none) | Bearer token required on every MCP request. Mandatory when `--mcp-bind` is non-loopback. |

## Usage

@@ -269,7 +270,23 @@ When `--mcp-enabled` is set, the agent runs a [Model Context Protocol](https://m

**Design contract — keep the agent lean.** The MCP server is intentionally light: no LLM, no persistent state, no extra background loops. Each tool wraps data already collected by `port_scan`, `ebpf_traffic`, or `xds`, or proxies a single call to `sigma-api`. Idle resource cost is effectively a listening socket; per-call CPU is bounded by the underlying capability. This keeps the agent within its budget (<1% CPU, <50MB RSS) on 1 vCPU VPS instances. The "AI brain" lives in `sigma-api`, not here.

-**Security default — localhost-only.** Binds to `127.0.0.1:9103` by default. Override to `0.0.0.0:9103` (or another address) only behind a network policy.
+**Security defaults.** Binds to `127.0.0.1:9103` by default — loopback-only, no auth required. If you change `--mcp-bind` to a non-loopback address (`0.0.0.0`, a specific public IP, etc.), you **must** also set `--mcp-token` / `AGENT_MCP_TOKEN`. Without it the MCP server refuses to start (the rest of the agent keeps running) — the operator gets an `error!` log and chooses to set the token or revert the bind.
+
+When a token is configured, every request must carry `Authorization: Bearer <token>`; tokens are compared in constant time. Setting the token on a loopback bind is allowed and stacks as defence in depth.
+
+```bash
+# Off-loopback example — both env vars required:
+AGENT_MCP_ENABLED=true \
+AGENT_MCP_BIND=0.0.0.0:9103 \
+AGENT_MCP_TOKEN=$(openssl rand -hex 32) \
+./sigma-agent
+
+# Client must present the token:
+curl -s -X POST http://<vps>:9103/mcp \
+  -H "Authorization: Bearer $AGENT_MCP_TOKEN" \
+  -H 'Content-Type: application/json' \
+  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}'
+```
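
For readers who want to see the per-request check in isolation, here is a minimal, framework-free sketch (illustration only, not part of the diff). It assumes the header value has already been read as a string; `is_authorized` is a hypothetical helper name, while `constant_time_eq` follows the shape of the function this change adds to `mcp.rs`.

```rust
/// Byte comparison that does not stop at the first differing byte.
/// Lengths are compared up front; the token length is not treated as secret.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff: u8 = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y; // accumulate differences instead of returning early
    }
    diff == 0
}

/// Hypothetical helper: `header` is the raw `Authorization` value, if any;
/// `expected` is the configured token, if any.
fn is_authorized(header: Option<&str>, expected: Option<&str>) -> bool {
    match expected {
        // No token configured (loopback default): every request passes.
        None => true,
        Some(expected) => match header.and_then(|h| h.strip_prefix("Bearer ")) {
            Some(presented) => constant_time_eq(presented.as_bytes(), expected.as_bytes()),
            // Header missing, or a different scheme or casing.
            None => false,
        },
    }
}
```

One consequence of matching the prefix with `strip_prefix("Bearer ")` is that the scheme is compared case-sensitively, so a client sending `bearer <token>` would be rejected in this sketch; clients should send the header exactly as the `curl` example does.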

### Tools

5 changes: 5 additions & 0 deletions sigma-agent/src/config.rs
@@ -122,6 +122,11 @@ pub struct Config {
     /// MCP server bind address (host:port). Defaults to 127.0.0.1 — override to 0.0.0.0 only with a network policy.
     #[arg(long, env = "AGENT_MCP_BIND", default_value = "127.0.0.1:9103")]
     pub mcp_bind: String,
+
+    /// Bearer token required for MCP requests. Mandatory when `mcp_bind` is non-loopback;
+    /// optional (and ignored if empty) when loopback. Callers must send `Authorization: Bearer <token>`.
+    #[arg(long, env = "AGENT_MCP_TOKEN")]
+    pub mcp_token: Option<String>,
 }

 impl Config {
4 changes: 4 additions & 0 deletions sigma-agent/src/main.rs
@@ -302,6 +302,10 @@ async fn main() -> Result<()> {
             } else {
                 None
             },
+            auth_token: config
+                .mcp_token
+                .clone()
+                .filter(|s| !s.is_empty()),
         });
         let bind = config.mcp_bind.clone();
         info!(bind = %bind, "MCP server enabled");
130 changes: 125 additions & 5 deletions sigma-agent/src/mcp.rs
@@ -34,6 +34,7 @@
 use std::sync::Arc;

 use axum::extract::State;
+use axum::http::{HeaderMap, StatusCode};
 use axum::routing::post;
 use axum::{Json, Router};
 use serde::{Deserialize, Serialize};
@@ -131,6 +132,10 @@ pub struct McpState {
     pub update_info: Option<SharedUpdateInfo>,
     /// Manifest URL — needed so `agent_check_update` can run a forced poll.
     pub update_manifest_url: Option<String>,
+    /// Bearer token. When `Some`, every `/mcp` request must include
+    /// `Authorization: Bearer <token>`. Compared in constant time.
+    /// `serve_mcp` refuses to start a non-loopback bind unless this is set.
+    pub auth_token: Option<String>,
 }

 // ---------- Tool schemas (returned by tools/list) ----------
@@ -690,11 +695,28 @@ async fn tool_query_syn_flood_candidates(

 async fn mcp_handler(
     State(state): State<Arc<McpState>>,
+    headers: HeaderMap,
     body: axum::body::Bytes,
-) -> Json<JsonRpcResponse> {
+) -> Result<Json<JsonRpcResponse>, StatusCode> {
+    // Auth: if a token is configured, every request must carry it.
+    // No `WWW-Authenticate` header — this is RPC, not a browser-facing API.
+    if let Some(ref expected) = state.auth_token {
+        let presented = headers
+            .get("authorization")
+            .and_then(|v| v.to_str().ok())
+            .and_then(|s| s.strip_prefix("Bearer "));
+        let ok = match presented {
+            Some(token) => constant_time_eq(token.as_bytes(), expected.as_bytes()),
+            None => false,
+        };
+        if !ok {
+            return Err(StatusCode::UNAUTHORIZED);
+        }
+    }
+
     let req: JsonRpcRequest = match serde_json::from_slice(&body) {
         Ok(r) => r,
-        Err(e) => return Json(err(None, ERR_PARSE, format!("parse error: {}", e))),
+        Err(e) => return Ok(Json(err(None, ERR_PARSE, format!("parse error: {}", e)))),
     };

     let id = req.id.clone();
@@ -726,16 +748,67 @@ async fn mcp_handler(
         ),
     };

-    Json(response)
+    Ok(Json(response))
 }

+/// Length-stable byte comparison. The token length itself is not secret,
+/// so an early `len()` mismatch is fine — what we don't want is short-circuit
+/// content comparison leaking a per-byte timing oracle.
+fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
+    if a.len() != b.len() {
+        return false;
+    }
+    let mut diff: u8 = 0;
+    for (x, y) in a.iter().zip(b.iter()) {
+        diff |= x ^ y;
+    }
+    diff == 0
+}
+
+/// True when `bind` resolves to a loopback interface (no remote network
+/// exposure). Used by `serve_mcp` to decide whether `auth_token` is
+/// mandatory. Recognises:
+///   - IPv4 in `127.0.0.0/8` (anything `std::net::Ipv4Addr::is_loopback`)
+///   - IPv6 `::1` (and any other `Ipv6Addr::is_loopback`)
+///   - The literal hostname `localhost`
+/// Anything else — including `0.0.0.0`, `::`, a public IP, or a DNS name —
+/// is treated as non-loopback and therefore requires a token.
+fn is_loopback_bind(bind: &str) -> bool {
+    // host:port. IPv6 binds wrap the host in brackets, e.g. `[::1]:9103`.
+    let host = match bind.rsplit_once(':') {
+        Some((h, _)) => h.trim_start_matches('[').trim_end_matches(']'),
+        None => bind,
+    };
+    if host.eq_ignore_ascii_case("localhost") {
+        return true;
+    }
+    if let Ok(ip) = host.parse::<std::net::IpAddr>() {
+        return ip.is_loopback();
+    }
+    false
+}
+
 #[allow(dead_code)] // referenced in error reporting via ERR_INVALID_PARAMS
 const _: i32 = ERR_INVALID_PARAMS;

 pub async fn serve_mcp(bind: String, state: Arc<McpState>) {
+    // Foot-gun guard: a non-loopback bind without a token would expose every
+    // tool — port allocation, route enumeration, forced update polls — to
+    // anyone who can reach the agent. Refuse to start instead of binding.
+    // The agent itself keeps running; the operator gets a loud error and
+    // chooses to set the token or revert the bind.
+    if !is_loopback_bind(&bind) && state.auth_token.is_none() {
+        error!(
+            bind = %bind,
+            "MCP server refusing to start: non-loopback bind requires AGENT_MCP_TOKEN. \
+             Set the token, or move the bind back to 127.0.0.1 / ::1 / localhost."
+        );
+        return;
+    }
+
     let app = Router::new()
         .route("/mcp", post(mcp_handler))
-        .with_state(state);
+        .with_state(state.clone());

     let listener = match TcpListener::bind(&bind).await {
         Ok(l) => l,
@@ -745,9 +818,56 @@ pub async fn serve_mcp(bind: String, state: Arc<McpState>) {
         }
     };

-    info!(bind = %bind, "MCP server listening on /mcp (JSON-RPC 2.0, MCP protocol)");
+    info!(
+        bind = %bind,
+        loopback = is_loopback_bind(&bind),
+        auth = state.auth_token.is_some(),
+        "MCP server listening on /mcp (JSON-RPC 2.0, MCP protocol)"
+    );

     if let Err(e) = axum::serve(listener, app).await {
         error!(error = %e, "MCP server error");
     }
 }
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn loopback_recognises_ipv4_127() {
+        assert!(is_loopback_bind("127.0.0.1:9103"));
+        assert!(is_loopback_bind("127.0.0.2:9103"));
+        assert!(is_loopback_bind("127.255.255.254:1"));
+    }
+
+    #[test]
+    fn loopback_recognises_ipv6() {
+        assert!(is_loopback_bind("[::1]:9103"));
+    }
+
+    #[test]
+    fn loopback_recognises_localhost() {
+        assert!(is_loopback_bind("localhost:9103"));
+        assert!(is_loopback_bind("LOCALHOST:9103"));
+    }
+
+    #[test]
+    fn loopback_rejects_wildcards_and_externals() {
+        assert!(!is_loopback_bind("0.0.0.0:9103"));
+        assert!(!is_loopback_bind("[::]:9103"));
+        assert!(!is_loopback_bind("192.0.2.5:9103"));
+        assert!(!is_loopback_bind("example.com:9103"));
+        // Unparseable garbage is treated as non-loopback (fail-closed).
+        assert!(!is_loopback_bind("not-a-bind"));
+    }
+
+    #[test]
+    fn constant_time_eq_matches_string_eq() {
+        assert!(constant_time_eq(b"", b""));
+        assert!(constant_time_eq(b"secret", b"secret"));
+        assert!(!constant_time_eq(b"secret", b"secrey"));
+        assert!(!constant_time_eq(b"secret", b"secre"));
+        assert!(!constant_time_eq(b"secret", b"SECRET"));
+    }
+}
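
A few inputs the tests above do not cover may be worth a look. The sketch below (illustration only, not part of the diff) restates the host-splitting rule with a hypothetical `host_of` helper, so the behaviour for binds without a port, or for IPv6 literals without brackets, can be checked directly. It assumes the same "split at the last colon, then strip brackets" logic as `is_loopback_bind`.

```rust
use std::net::IpAddr;

/// Hypothetical helper: extract the host part of a `host:port` bind by
/// splitting at the last `:` and stripping surrounding IPv6 brackets.
fn host_of(bind: &str) -> &str {
    match bind.rsplit_once(':') {
        Some((h, _)) => h.trim_start_matches('[').trim_end_matches(']'),
        None => bind,
    }
}

/// Same classification rule as in the diff, expressed via `host_of`.
fn is_loopback_bind(bind: &str) -> bool {
    let host = host_of(bind);
    host.eq_ignore_ascii_case("localhost")
        || host.parse::<IpAddr>().map(|ip| ip.is_loopback()).unwrap_or(false)
}
```

With this rule an unbracketed `::1` splits into the host `:` and is classified as non-loopback, so it would require a token. That is the fail-closed direction: a malformed bind can only make the check stricter, never looser.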