Skip to content

Add trust controls for MCP server-provided tool metadata #359

@dgenio

Description

@dgenio

Summary

Introduce a configurable metadata policy in MCPToolAdapter for server-provided tool names and descriptions: length limits, control-character checks, optional normalization, and an explicit opt-in for passing remote descriptions through to downstream consumers verbatim.

Why this matters

Tool descriptions wrapped from MCP servers travel further than they first appear: they become ChainWeaver Tool.description values, can be re-exported to LLM clients via FlowServer or the export adapters, and may be rendered into the offline proposer prompts (render_tool_catalogue). Treating remote metadata as untrusted input — with conservative defaults and a clear trust boundary — strengthens every downstream surface at once. This is a defensive-hardening improvement aligned with current ecosystem guidance on handling third-party tool metadata.

Current evidence

  • chainweaver/mcp/adapter.py (~line 242): description=(mcp_tool.description or f"MCP tool '{remote_name}'.") — remote descriptions are adopted verbatim.
  • chainweaver/mcp/adapter.py (~lines 201–204): server prefixing (server_prefix) is optional with a default of "", so multi-server setups can produce colliding or look-alike tool names unless callers opt in.
  • chainweaver/_offline_llm.py (~lines 101–113): render_tool_catalogue() interpolates tool names and descriptions into LLM prompts without normalization (see related adversarial-coverage issue).
  • docs/security.md: documents executor guarantees but does not yet describe a trust model for imported tool metadata.

Proposed implementation

  1. Add a MetadataPolicy (Pydantic model) accepted by MCPToolAdapter:
    • max_description_length (default e.g. 2000, truncate with marker),
    • strip_control_chars: bool = True,
    • normalize_whitespace: bool = True,
    • description_mode: "server" | "placeholder"placeholder replaces remote descriptions with a neutral generated one ("MCP tool '<name>' from server '<server>'."), making verbatim adoption an explicit choice.
  2. Validate tool names against a conservative pattern (e.g., ^[A-Za-z0-9._-]+$ after prefixing); reject or sanitize names that fail, with a typed error naming the server and tool.
  3. Encourage prefixing: emit a documented warning when discover_tools is called against a session with server_prefix="" and more than one adapter is active, or make the prefix required via policy.
  4. Record the original raw metadata alongside the sanitized values (e.g., tool.metadata["mcp_raw_description"]) so nothing is lost for auditing.
  5. Document the trust boundary in docs/security.md: which fields come from the server, what the defaults do, and how to opt into verbatim mode.

Example prompt, schema, or interface

adapter = MCPToolAdapter(
    session,
    server_prefix="search",
    metadata_policy=MetadataPolicy(
        max_description_length=1000,
        description_mode="server",   # explicit opt-in to remote text
    ),
)

Acceptance criteria

  • Default policy strips control characters, normalizes whitespace, and enforces a length cap on imported descriptions.
  • Tool names failing the validation pattern are rejected (or sanitized, per policy) with a typed ChainWeaverError subclass.
  • Raw server metadata remains available for audit alongside sanitized values.
  • The trust model is documented in docs/security.md.
  • All four validation commands pass.

Test and evaluation plan

  • Unit tests: long descriptions, embedded control characters, unusual whitespace, names with separators/quotes — each against both policy modes.
  • Round-trip test: sanitized tools re-exported through FlowServer and export/ adapters carry the sanitized text.
  • Regression test confirming default behavior for well-formed metadata is unchanged apart from documented normalization.

Migration notes

Mostly opt-in. If conservative defaults (length cap, control-character stripping) are enabled by default, note the normalization in the changelog; provide MetadataPolicy.permissive() to restore prior verbatim behavior.

Risks and tradeoffs

  • Over-aggressive sanitization can degrade legitimately rich descriptions that LLM clients rely on for tool selection; length caps should be generous and truncation visible.
  • Name validation could break existing setups using exotic names — the sanitize-with-mapping option mitigates this.
  • Sanitization is one layer, not a complete defense; documentation should be clear about residual risk so users keep human review in their promotion workflow.

Suggested labels

security, llm

Metadata

Metadata

Assignees

No one assigned

    Labels

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions