Ingest MCP ToolAnnotations into ToolSafetyContract in MCPToolAdapter #371

@dgenio

Summary

Close the round-trip gap between MCP tool annotations and ChainWeaver's safety vocabulary: MCPToolAdapter should map incoming ToolAnnotations (readOnlyHint, destructiveHint, idempotentHint) onto a ToolSafetyContract for each wrapped tool, mirroring the export direction that FlowServer already implements.

Why this matters

ChainWeaver's safety contracts drive real behavior: merge_safety() derives flow-level contracts (#125), FlowServer governance filters on side-effect levels (#294), and the trace-mining pipeline scores candidates by safety. Today an MCP tool arrives with safety=None ("unknown, not safe") even when its server declares annotations — so declared-read-only remote tools are treated identically to completely unannotated ones, and flows built from them inherit unnecessarily pessimistic (or absent) contracts. Honoring the annotations makes imported tools first-class citizens of the existing governance machinery, with conservative handling of the trust question.

Current evidence

  • chainweaver/mcp/server.py (~lines 469–487): _tool_annotations() maps ToolSafetyContract → ToolAnnotations (readOnlyHint=safety.read_only, destructiveHint=side_effects is DESTRUCTIVE, idempotentHint=safety.idempotent). The outbound direction exists and defines the mapping precedent.
  • chainweaver/mcp/adapter.py (~lines 201–250): wrapped tools are constructed without consulting mcp_tool.annotations; no safety= is derived.
  • chainweaver/contracts.py: ToolSafetyContract semantics state None means unknown — so today every MCP tool is "unknown".
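
For reference, the outbound mapping described in the first bullet can be sketched roughly as below. This is not the code in server.py; SafetyStandIn, the trimmed SideEffectLevel enum, and outbound_annotations() are hypothetical stand-ins that only mirror the three field mappings quoted above.

```python
from dataclasses import dataclass
from enum import Enum


class SideEffectLevel(Enum):
    # Stand-in enum: only the levels named in this issue.
    NONE = "none"
    READ = "read"
    EXTERNAL = "external"
    DESTRUCTIVE = "destructive"


@dataclass(frozen=True)
class SafetyStandIn:
    # Stand-in for ToolSafetyContract with just the fields the mapping reads.
    read_only: bool
    idempotent: bool
    side_effects: SideEffectLevel


def outbound_annotations(safety: SafetyStandIn) -> dict[str, bool]:
    """Hypothetical helper: contract -> MCP annotation hints, per the bullet above."""
    return {
        "readOnlyHint": safety.read_only,
        "destructiveHint": safety.side_effects is SideEffectLevel.DESTRUCTIVE,
        "idempotentHint": safety.idempotent,
    }
```

The inverse direction proposed in this issue has to invert these three assignments without claiming more than the hints say.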

Proposed implementation

  1. Define the inverse mapping in the adapter (conservative by construction):
    • readOnlyHint=True → side_effects=READ (not NONE, since a remote call observed the world), read_only=True.
    • destructiveHint=True → side_effects=DESTRUCTIVE.
    • Neither hint → side_effects=EXTERNAL as the conservative default for remote calls, or leave safety=None; pick one and document it.
    • idempotentHint → idempotent; derive safe_to_retry/cacheable conservatively (e.g., cacheable only for read-only + idempotent).
  2. Make trust explicit: MCPToolAdapter(..., annotation_trust="trust" | "ignore" | "cap"):
    • trust: apply the mapping as declared,
    • ignore: current behavior (safety=None),
    • cap (suggested default): apply declared annotations but never below a conservative floor (e.g., a declared-read-only tool still gets READ, never NONE; an unannotated tool gets EXTERNAL).
  3. Set determinism_level conservatively (NONE or PARTIAL) for remote tools regardless of annotations; remote behavior is not attestable from hints.
  4. Record the annotation source on the tool metadata so reviewers can see the contract was server-declared rather than author-declared.
  5. Document the trust model in docs/security.md and the adapter docstring; note the interaction with FlowServer's outbound mapping (a flow of imported tools re-exported over MCP now carries meaningful annotations end-to-end).
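
A minimal sketch of steps 1, 2 and 4, to make the mapping and trust modes concrete. Everything here is a stand-in: Hints, SafetySketch, the four-level SideEffectLevel, and derive_safety() are hypothetical names, not ChainWeaver's API. Three choices are assumptions of this sketch rather than decisions made in the issue: a destructive hint wins over a conflicting read-only hint, "trust" leaves a tool without side-effect hints as unknown (None), and "cap" falls back to EXTERNAL.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Literal, Optional


class SideEffectLevel(IntEnum):
    # Stand-in enum: only the levels named in this issue, ordered by severity.
    NONE = 0
    READ = 1
    EXTERNAL = 2
    DESTRUCTIVE = 3


@dataclass(frozen=True)
class Hints:
    # Stand-in for MCP ToolAnnotations; None means the server omitted the hint.
    read_only: Optional[bool] = None
    destructive: Optional[bool] = None
    idempotent: Optional[bool] = None


@dataclass(frozen=True)
class SafetySketch:
    # Stand-in for ToolSafetyContract, limited to fields this issue mentions.
    side_effects: SideEffectLevel
    read_only: bool
    idempotent: bool
    cacheable: bool
    source: str  # provenance marker (step 4)


Trust = Literal["trust", "ignore", "cap"]


def derive_safety(hints: Optional[Hints], trust: Trust = "cap") -> Optional[SafetySketch]:
    """Map server-declared hints to a contract under the given trust mode."""
    if trust == "ignore":
        return None  # current behavior: safety stays unknown
    h = hints or Hints()
    if h.destructive:
        level = SideEffectLevel.DESTRUCTIVE  # assumption: wins over read_only
    elif h.read_only:
        level = SideEffectLevel.READ  # never NONE for a remote call
    elif trust == "cap":
        level = SideEffectLevel.EXTERNAL  # conservative floor
    else:
        return None  # assumption: "trust" with no side-effect hint stays unknown
    read_only = level is SideEffectLevel.READ
    idempotent = bool(h.idempotent)
    return SafetySketch(
        side_effects=level,
        read_only=read_only,
        idempotent=idempotent,
        cacheable=read_only and idempotent,  # cacheable only if both hold
        source="server-declared" if hints is not None else "adapter-default",
    )
```

Step 3 (determinism_level) is left out because it does not depend on the hints. A real implementation would also need to decide how safe_to_retry is derived.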

Example prompt, schema, or interface

adapter = MCPToolAdapter(session, annotation_trust="cap")
tools = await adapter.discover_tools()
assert tools[0].safety.side_effects is SideEffectLevel.READ   # readOnlyHint honored
assert tools[0].safety.determinism_level is DeterminismLevel.NONE

Acceptance criteria

  • Each annotation_trust mode produces the documented contract for tools with read-only, destructive, idempotent, and absent annotations.
  • Derived contracts participate in merge_safety() and FlowServer governance exactly like author-declared ones.
  • The provenance of the contract (server-declared) is inspectable.
  • Default behavior choice is documented with rationale; all four validation commands pass.

Test and evaluation plan

  • Unit tests: mapping matrix (each hint combination × each trust mode).
  • Round-trip test: import annotated tools → build flow → Tool.from_flow derived safety → FlowServer re-export → annotations preserved or conservatively widened, never narrowed.
  • Governance integration test: a destructive-annotated imported tool is filtered by default FlowServer exposure.
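
The "never narrowed" condition in the round-trip test could be checked with a small helper along these lines. This is only a sketch: narrows() is a hypothetical name, annotations are modeled as plain dicts keyed by the MCP hint names, and a missing hint is treated as "not declared".

```python
def narrows(imported: dict, exported: dict) -> bool:
    """Return True if the exported hints claim more safety than the imported ones."""
    # Claiming read-only on export when the source did not declare it.
    gained_read_only = (
        exported.get("readOnlyHint") is True
        and imported.get("readOnlyHint") is not True
    )
    # Dropping a destructive declaration on export.
    lost_destructive = (
        imported.get("destructiveHint") is True
        and exported.get("destructiveHint") is not True
    )
    # Claiming idempotency on export when the source did not declare it.
    gained_idempotent = (
        exported.get("idempotentHint") is True
        and imported.get("idempotentHint") is not True
    )
    return gained_read_only or lost_destructive or gained_idempotent
```

A round-trip test would then assert `not narrows(imported, exported)` for every tool that passes through the adapter, a flow, and FlowServer re-export.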

Migration notes

Depends on the chosen default: ignore is fully non-breaking; cap changes wrapped tools from safety=None to a populated conservative contract, which can change FlowServer filtering and mining scores for existing setups — if chosen, call it out in the changelog with the one-line revert (annotation_trust="ignore").

Risks and tradeoffs

  • Annotations are self-declared by servers; the cap mode and explicit provenance keep that trust decision visible rather than implicit.
  • Conservative defaults (EXTERNAL for unannotated tools) may exclude tools from default FlowServer exposure that previously slipped through as "unknown" — arguably the correct direction, but a behavior change to weigh.

Suggested labels

security, architecture, agents
