Cyber-Warrior-Network/trust-gate-mcp

Trust Gate MCP — hybrid Ed25519 + ML-DSA-65 receipts for every AI agent decision

trust-gate-mcp

The first production MCP server with NIST FIPS 204 post-quantum receipts.

No Receipt. No Trust.

Python 3.10+ · License: Apache 2.0 · Tests · OWASP ASI · Stress Test · Bandit · MCP Compatible · NIST FIPS 204: ML-DSA-65 · EU AI Act Article 50: ready · CNSA 2.0: ready · CVE-2026-25253: protected

Every MCP tool call your AI agent makes is now policy-gated, hybrid-signed, and forensically provable — offline, forever.
If a tool was poisoned, rug-pulled, hijacked, or quantum-broken in 2033 — you have cryptographic proof today.


The Problem (in one sentence)

Your AI agent's decisions are retained for compliance audits that run 7–10 years; your current signatures survive maybe seven.

  • February 2026: CVE-2026-25253 (OpenClaw) exposed 135,000 AI agent gateway instances to remote code execution with zero authentication. 824 malicious tools found active on the same day.
  • October 2024: Salt Typhoon exfiltrated encrypted telecom data from 80+ countries — the clearest public example of "Harvest Now, Decrypt Later" in action.
  • August 2026: EU AI Act Article 50 transparency obligations take effect. AI-generated content must be marked in a machine-readable, detectable way; digital signatures are one mechanism for doing so.
  • 2030–2035: Consensus quantum timeline. Every Ed25519-signed receipt retained past this window is forgeable by a quantum adversary holding the corresponding public key.

trust-gate-mcp is the first MCP server built for this timeline.


What You Get

flowchart LR
    A[AI Agent] -->|"gate_decision()"| B[trust-gate-mcp]
    B -->|sanitize<br/>Class A + Class B| C[Trust Gate<br/>policy engine]
    C -->|allow| D[Hybrid sign<br/>Ed25519 + ML-DSA-65]
    D -->|TrustAtom receipt<br/>+ embedded PQ pk| E[Verifiable offline<br/>forever]
    C -->|deny / timeout / WAF| F[allow: false<br/>fail closed]

    classDef primary fill:#38BDF8,stroke:#38BDF8,color:#0A0F1C
    classDef quantum fill:#C084FC,stroke:#C084FC,color:#0A0F1C
    classDef safe fill:#34D399,stroke:#34D399,color:#0A0F1C
    classDef danger fill:#F87171,stroke:#F87171,color:#0A0F1C

    class D quantum
    class E safe
    class F danger

See the animated architecture diagram for the full three-layer defense.


Quickstart

Claude Desktop — add to claude_desktop_config.json:

{
  "mcpServers": {
    "trust-gate": {
      "command": "uvx",
      "args": ["trust-gate-mcp"],
      "env": {
        "TRUST_GATE_API_KEY": "your-api-key"
      }
    }
  }
}

Claude Code — one command:

claude mcp add trust-gate -- uvx trust-gate-mcp

Any MCP-compatible agent:

pip install trust-gate-mcp
TRUST_GATE_API_KEY=your-key trust-gate-mcp

The 4 Tools

gate_decision — the core gate

Call this before any action that deploys code, writes data, sends messages, or makes commitments on behalf of users.

{
  "action": "DEPLOY",
  "agent_id": "ci-agent-alpha",
  "resource_id": "prod-cluster-01",
  "env": "PRODUCTION",
  "human_approved": true
}

Returns: allow/deny decision, risk score, and a hybrid-signed TrustAtom receipt if the action is approved.
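
A caller still has to act on the decision. The following is a minimal sketch of one way to do that, assuming the response is a dict with the `allow`, `risk_score`, and `receipt` fields described in this README; `run_if_allowed` is a hypothetical helper, not part of the package.

```python
from typing import Any, Callable, Dict


def run_if_allowed(decision: Dict[str, Any], action: Callable[[], Any]) -> Any:
    """Run `action` only when the gate allowed it and issued a receipt.

    Hypothetical helper: the field names follow this README, the
    function itself is illustrative.
    """
    # Anything other than an explicit True is treated as a denial.
    if decision.get("allow") is not True:
        raise PermissionError(f"denied (risk_score={decision.get('risk_score')})")
    # An allow without a receipt cannot be proven later, so refuse it too.
    if not decision.get("receipt"):
        raise PermissionError("allowed but no receipt was issued")
    return action()
```

Raising on a missing receipt keeps the "No Receipt. No Trust." rule enforceable on the caller's side as well.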

verify_receipt — independent verification

Mathematically verify any receipt. Two paths:

  • Offline ML-DSA-65 — provide ml_dsa_signature + ml_dsa_public_key (both embedded in every hybrid receipt). No backend call. No server trust. Verifiable years from now, even against a quantum adversary.
  • Online Ed25519 — provide evidence_hash + signature. Verified against the backend. Backward-compatible with every receipt.

See the offline verification flow.
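
The choice between the two paths can be sketched as a small dispatcher. This is an illustration only: the `ml_dsa_signature_b64` and `ml_dsa_public_key_b64` keys follow the receipt fields used in the offline example in this README, while the `signature` key and the returned labels are assumptions.

```python
from typing import Any, Dict


def choose_verification_path(receipt: Dict[str, Any]) -> str:
    """Pick a verification path from the fields a receipt carries."""
    if not receipt.get("evidence_hash"):
        raise ValueError("receipt has no evidence_hash")
    # Prefer the offline post-quantum path when both PQ fields are present.
    if receipt.get("ml_dsa_signature_b64") and receipt.get("ml_dsa_public_key_b64"):
        return "offline-ml-dsa-65"
    # Fall back to the online Ed25519 path ("signature" is an assumed key name).
    if receipt.get("signature"):
        return "online-ed25519"
    raise ValueError("receipt carries no usable signature")
```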

check_policy — pre-flight check

Evaluate whether an action would be allowed without creating a receipt or recording state. Zero side effects.

health — connectivity + quantum capability

Returns backend status, MCP server version, and the ML-DSA-65 public key (base64) for offline verification.


Quantum-Resistant Receipts

Hybrid signing flow

Every receipt is hybrid-signed: Ed25519 (classical) and ML-DSA-65 (NIST FIPS 204, security level 3) over the same evidence hash, simultaneously.

evidence_hash = SHA-256(canonical_payload)
ed25519_sig   = Ed25519.sign(private_key, evidence_hash)    ← classical
ml_dsa_sig    = ML_DSA_65.sign(secret_key, evidence_hash)   ← post-quantum

Both signatures and the ML-DSA-65 public key are embedded in every receipt. Three properties:

  1. Today — Ed25519 verification works everywhere, as always.
  2. Post-quantum era — ML-DSA-65 verification holds even if Ed25519 is broken by a quantum computer.
  3. Offline — an auditor can verify the ML-DSA-65 signature years from now with zero network access — just the receipt.
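
The first step of the flow, hashing a canonical payload, can be sketched with the standard library. The canonical form used here (JSON with sorted keys and compact separators) is an assumption for illustration; this README does not specify the project's actual canonicalization.

```python
import hashlib
import json


def evidence_hash(payload: dict) -> str:
    """Return the SHA-256 hex digest of an assumed canonical JSON encoding."""
    # Sorted keys and fixed separators make the bytes independent of
    # dict insertion order, so equal payloads always hash equally.
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()
```

Both signatures would then be computed over this one digest, which is why changing any field of the payload invalidates both at once.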

Verify offline in 15 lines

from dilithium_py.ml_dsa import ML_DSA_65
import base64

# Fields from any hybrid TrustAtom receipt (`receipt` is the receipt JSON parsed into a dict):
evidence_hash   = receipt["evidence_hash"]
ml_dsa_sig      = receipt["ml_dsa_signature_b64"]
ml_dsa_pub_key  = receipt["ml_dsa_public_key_b64"]

valid = ML_DSA_65.verify(
    base64.b64decode(ml_dsa_pub_key),
    evidence_hash.encode("utf-8"),
    base64.b64decode(ml_dsa_sig),
)
# => True — no network, no server trust, quantum-resistant

Or via the MCP tool:

await verify_receipt(
    evidence_hash=receipt["evidence_hash"],
    signature="",                                          # left empty for the offline-PQ path
    ml_dsa_signature=receipt["ml_dsa_signature_b64"],
    ml_dsa_public_key=receipt["ml_dsa_public_key_b64"],
)
# => { "valid": true, "quantum_verified": true, "offline": true,
#      "signature_alg": "ML-DSA-65" }

ML-DSA-65 is the NIST-standardized form of CRYSTALS-Dilithium at security category 3 (~AES-192 equivalent). Standardized as NIST FIPS 204 on August 14, 2024.


Security Model

Three independent defense layers. An attack must defeat all three.

Three-layer architecture

Layer 1 — Input Sanitization

Every string argument is cleaned before any conditional logic or API call.

| Threat | Pattern count | Defense | Reference |
|---|---|---|---|
| Invisible Unicode (tool poisoning) | unlimited | _INVISIBLE_UNICODE_RE strips RTL, zero-width, BOM, C0/C1 | Invariant Labs, Apr 2025 |
| Class A: policy override | 13 | _INJECTION_RE defangs injection phrases | CVE-2025-54794 |
| Class B: agentic diffusion / exfil relay | 4 | _INJECTION_RE defangs exfiltrate, leak/steal/dump + target, base64 encode-and-relay, embed-in-response | Willison Lethal Trifecta (Jun 2025) · OWASP LLM Top 10 |
| Oversized input | per-field | _MAX_LEN with FIPS 204-aware sizing | DoS prevention |
| Env spoofing | | _validate_env allowlist coerces to SANDBOX | CVE-2025-64106 |

See the Class B defense flow. Full threat model in docs/THREAT_MODEL.md.
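
Two of these defenses are simple enough to sketch. The code below is an illustration, not the project's implementation: the character ranges are a plausible subset of what `_INVISIBLE_UNICODE_RE` might cover, and the allowlist contains only the two environment names that appear in this README.

```python
import re

# Assumed subset: zero-width and bidi controls, BOM, and C0/C1 controls.
INVISIBLE_RE = re.compile(
    r"[\u200b-\u200f\u202a-\u202e\u2060-\u2064\u2066-\u2069\ufeff\x00-\x1f\x7f-\x9f]"
)

# Assumed allowlist; the real set of environments is not documented here.
ALLOWED_ENVS = frozenset({"SANDBOX", "PRODUCTION"})


def strip_invisible(value: str) -> str:
    """Remove invisible and control characters from a string argument."""
    return INVISIBLE_RE.sub("", value)


def validate_env(env: str) -> str:
    """Return `env` only on an exact allowlist match, else coerce to SANDBOX."""
    return env if env in ALLOWED_ENVS else "SANDBOX"
```

Checking the raw value against the allowlist, rather than a cleaned copy, means a spoofed value such as `PRODU\u200bCTION` falls back to `SANDBOX` instead of being repaired into `PRODUCTION`.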

Layer 2 — Policy Evaluation

The backend evaluates declarative policy rules. Unregistered agents receive an explicit deny — never a default allow. Decisions include a continuous risk score (0.0–1.0).

Layer 3 — Hybrid Signing + Fail-Closed

On allow: hybrid sign (Ed25519 + ML-DSA-65). On anything else (backend down, timeout, WAF 403, 429, exception): allow: false, risk_score: 1.0, receipt: null. An outage makes your system more conservative, not less.


Fail-Closed Guarantee

| Condition | Result |
|---|---|
| Backend online, policy DENY | allow: false (normal evaluation) |
| Backend online, policy ALLOW | allow: true + hybrid-signed receipt |
| Backend timeout / offline | allow: false, max risk (fail CLOSED) |
| WAF block | allow: false, blocked_by_waf (clean JSON, no HTML) |
| Rate limited | allow: false, rate_limited (clean JSON) |
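
The table reduces to one rule: any outcome other than a well-formed allow is a denial. A minimal sketch of that wrapper follows, assuming the backend call is passed in as a function; `gate_fail_closed` and the `error` field are hypothetical names used for illustration.

```python
from typing import Any, Callable, Dict

FAIL_CLOSED: Dict[str, Any] = {"allow": False, "risk_score": 1.0, "receipt": None}


def gate_fail_closed(call_backend: Callable[[], Any]) -> Dict[str, Any]:
    """Call the policy backend and deny on any failure or malformed reply."""
    try:
        decision = call_backend()
    except Exception as exc:  # timeout, connection error, HTTP error, ...
        return {**FAIL_CLOSED, "error": type(exc).__name__}
    # A non-dict reply (for example a WAF HTML page) is treated as a denial.
    if not isinstance(decision, dict):
        return {**FAIL_CLOSED, "error": "malformed_response"}
    # Only an explicit True counts as an allow; denials never carry a receipt.
    if decision.get("allow") is not True:
        return {**decision, "allow": False, "receipt": None}
    return decision
```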

Benchmarks

Measured on a commodity Windows 11 laptop, Python 3.12.10, pure-Python dilithium-py 1.4.0 (a C backend such as liboqs reduces ML-DSA times by 10–20×).

| Operation | Time | Budget |
|---|---|---|
| Receipt mint (hybrid, includes keygen) | 71.3 ms | ≤ 100 ms |
| Receipt mint (hybrid, reuse keys) | ~45 ms | ≤ 100 ms |
| Ed25519 sign | 21 µs | ≤ 5 ms |
| Ed25519 verify | 1.73 ms | ≤ 5 ms |
| ML-DSA-65 sign | 36.9 ms | ≤ 100 ms |
| ML-DSA-65 verify (offline) | 6.93 ms | ≤ 20 ms |
| End-to-end gate_decision (incl. backend) | ~180 ms | ≤ 250 ms |

Signature and key sizes (ML-DSA-65 is FIPS 204 security category 3):

  • Ed25519 signature: 64 bytes
  • ML-DSA-65 signature: 3,309 bytes
  • ML-DSA-65 public key: 1,952 bytes (embedded in every receipt)

Reproducible with python demo_quantum_proof.py.
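
The sizes above translate into a fixed per-receipt overhead. The arithmetic below is a sketch that assumes all three values are stored as standard padded base64, as the `_b64` receipt field names suggest; any JSON framing around them is not counted.

```python
def b64_len(n_bytes: int) -> int:
    """Length of standard padded base64 for n_bytes of input."""
    return 4 * ((n_bytes + 2) // 3)


SIZES = {
    "ed25519_signature": 64,
    "ml_dsa_65_signature": 3309,
    "ml_dsa_65_public_key": 1952,
}

raw_total = sum(SIZES.values())                      # 5325 bytes
b64_total = sum(b64_len(n) for n in SIZES.values())  # 88 + 4412 + 2604 = 7104 chars
```

Under these assumptions the cryptographic material adds roughly 7 KB of text to each receipt, almost all of it from the post-quantum signature and key.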


Security Evaluation

The three test suites (unit, OWASP ASI, adversarial stress) run on every PR via GitHub Actions across Python 3.10–3.13 on Linux, macOS, and Windows.

| Suite | Checks | Result |
|---|---|---|
| Unit tests (pytest) | 47 | PASS |
| OWASP ASI01–06 eval | 51 | PASS |
| Adversarial stress tests (CVE-mapped) | 19 | PASS |
| Bandit static analysis | 528 LOC | 0 issues |
| pip-audit (direct dependencies) | 3 deps | 0 CVEs |

Reproduce locally:

pytest tests/
python -X utf8 security_audit/owasp_asi_eval.py
python -X utf8 stress_test.py
python -m bandit -r src/ --severity-level high
python -m pip_audit --strict

CVE Coverage

| # | CVE / Disclosure | Attack class | Severity | Defense |
|---|---|---|---|---|
| 1 | CVE-2026-25253 (OpenClaw) | WebSocket CORS bypass → RCE on 135K instances | High (CVSS 8.8) | Fail-closed + input sanitization |
| 2 | Invariant Labs Tool Poisoning | Invisible Unicode exfiltrates SSH keys | Critical | Unicode stripping |
| 3 | CVE-2025-54794 | Indirect prompt injection via retrieved content | High | Class A defang |
| 4 | CVE-2025-68664 LangGrinch | Structured output injection → env var theft | Critical (CVSS 9.3) | Class A defang |
| 5 | CVE-2025-6514 mcp-remote RCE | OS command injection via handshake | Critical (CVSS 9.6) | Parameterized API calls |
| 6 | CVE-2025-64106 (Cursor) | Agent env spoofing | High (CVSS 8.8) | Env allowlist |
| 7 | CVE-2025-6515 Prompt Hijacking | Session token takeover | High | Receipt-based session binding |
| 8 | Invariant Labs Rug Pull | Post-approval tool description mutation | Critical | Receipt-based tool hash |
| 9 | CVE-2025-5276 SSRF | SSRF via MCP URL fetch | High | Policy-gated allowlist |
| 10 | OWASP MCP Top 10: MCP02, MCP10 | Excessive scope / context over-sharing | High | Per-action receipt scoping |
| 11 | Willison Lethal Trifecta (Jun 2025) | Agentic diffusion exfiltration | High | Class B defang |
| 12 | OWASP LLM04 | Insecure plugin data exfiltration | High | Class B defang |
| 13 | Trend Micro Agentic AI Exfil (2024) | Base64-encoded exfil channel | High | Class B defang |
| 14 | Palo Alto Unit 42 Agentic AI Attack Framework (May 2025) | Tool-chain privilege escalation | High | Class B defang + fail-closed |
| 15 | Salt Typhoon HNDL (Oct 2024) | Harvest-now-decrypt-later against receipts | Nation-state | ML-DSA-65 hybrid signing |

See docs/THREAT_MODEL.md for the full matrix. Run python stress_test.py to verify defenses against CVE-mapped inputs.


Compliance

One deployment satisfies multiple overlapping mandates:

| Framework | Requirement | How receipts satisfy it |
|---|---|---|
| NIST FIPS 204 (August 14, 2024) | ML-DSA digital signatures | Native: every receipt signed with ML-DSA-65 security level 3 |
| NSA CNSA 2.0 (2027 acquisition, 2030–2031 full migration) | Post-quantum digital signatures on NSS | CNSA 2.0 specifies ML-DSA-87 (category 5); receipts use the same FIPS 204 algorithm family at ML-DSA-65 (category 3) |
| EU AI Act Article 50 (effective August 2, 2026) | Digital signatures, watermarks, metadata on AI output | Every receipt: signed + timestamped + machine-readable + offline-verifiable |
| NIST AI RMF | GOVERN 4.1, MEASURE 2.5 (audit trail) | Cryptographic receipt chain |
| NIST SP 800-218 SSDF | PS.3.1, PW.5.1 (signed provenance) | Hybrid-signed per action |
| SOC 2 | CC7.2, CC7.3 (tamper-evident logging) | Third-party offline verifiable |
| NIST SP 800-53 | AU-10 (non-repudiation), SC-12, SC-13 | Hybrid signatures + embedded public key |

Why This Is Different from mcp-scan

mcp-scan by Invariant Labs scans your MCP config for known poisoning patterns at install time. Use it.

trust-gate-mcp generates cryptographic proof at runtime — every call, every time:

mcp-scan       --> scan before you install  --> "is this tool safe to run?"
trust-gate-mcp --> receipt every time it runs --> "did this tool do what it claimed?"

One detects. One proves. Both belong in your MCP stack.


Environment Variables

| Variable | Default | Description |
|---|---|---|
| TRUST_GATE_URL | https://cwn-trust-gate.onrender.com | Trust Gate API base URL |
| TRUST_GATE_API_KEY | (empty) | Bearer token; get one at cwn-trust-gate.onrender.com |
| TRUST_GATE_TENANT | default | Tenant ID for multi-tenant deployments |
| TRUST_GATE_TIMEOUT | 30 | HTTP timeout in seconds |
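
For reference, the defaults in this table can be read with a few lines of standard-library Python. This is a sketch, not the package's own settings code: the `Settings` class and `load_settings` function are hypothetical, and stripping a trailing slash from the URL is an assumed form of normalization.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    url: str
    api_key: str
    tenant: str
    timeout: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from environment variables, using the documented defaults."""
    return Settings(
        # Drop a trailing slash so joined paths do not contain "//".
        url=env.get("TRUST_GATE_URL", "https://cwn-trust-gate.onrender.com").rstrip("/"),
        api_key=env.get("TRUST_GATE_API_KEY", ""),
        tenant=env.get("TRUST_GATE_TENANT", "default"),
        timeout=float(env.get("TRUST_GATE_TIMEOUT", "30")),
    )
```

Passing the environment as a mapping makes the loader easy to test without touching the real process environment.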

Development

git clone https://github.com/Cyber-Warrior-Network/trust-gate-mcp
cd trust-gate-mcp
pip install -e ".[dev]"

# Run the full test suite (47 tests, ~2s)
pytest

# Run CVE-mapped stress tests (3 iterations, 19 checks)
python -X utf8 stress_test.py

# Run OWASP ASI01-06 evaluation (51 checks)
python -X utf8 security_audit/owasp_asi_eval.py

# Reproduce the quantum demo (hybrid mint + offline verify + tamper + benchmark)
python demo_quantum_proof.py

See CONTRIBUTING.md for the full contributor workflow.

Test Coverage

| Test Class | Tests | Covers |
|---|---|---|
| TestConfig | 3 | Settings, environment loading |
| TestClientHeaders | 3 | Auth headers, URL normalization |
| TestClientMethods | 7 | HTTP construction per tool, WAF/429 clean errors |
| TestGateDecisionTool | 4 | Success, error, WAF 403, all parameters |
| TestVerifyReceiptTool | 3 | Signature verification, invalid JSON, errors |
| TestCheckPolicyTool | 2 | Policy result, error handling |
| TestHealthTool | 2 | Success, unreachable |
| TestSanitizeFunction | 8 | RTL override, zero-width, Class A injection, oversized, env spoofing |
| TestAgenticDiffusionDefense | 5 | Class B exfiltration patterns (MANDATORY since Apr 2026) |
| TestMlDsaFieldSizeLimits | 4 | FIPS 204 field size sanity (MANDATORY) |
| TestGateDecisionSanitization | 5 | End-to-end sanitization, empty field deny |
| TestCheckPolicySanitization | 1 | Empty fields return deny |

Total: 47 tests. 51 OWASP ASI checks. 19 stress tests. Every PR re-runs all three.


FAQ

How do I protect my MCP servers from tool poisoning? Add trust-gate-mcp to your MCP config. Every tool call is policy-evaluated and receives a hybrid-signed receipt. Any deviation from declared behavior is logged and denied on subsequent calls.

Is my setup vulnerable to CVE-2026-25253 (OpenClaw)? If you run OpenClaw without patching to v2026.2.25, yes. trust-gate-mcp adds a receipt-based detection layer even on unpatched systems — unauthorized routing produces a policy violation that surfaces in your audit log.

What happens if the Trust Gate backend goes down? The server fails closed — every gate_decision returns allow: false. No action proceeds while the policy engine is unreachable. An outage makes your system more conservative, not less.

Does it work with Claude Desktop, Cursor, and Windsurf? Yes. It wraps the MCP configuration at the transport layer — compatible with any standard MCP client.

Is this open source? Yes. Apache-2.0 license. The MCP client and all tests are fully open. The hosted policy engine is a separate service — you can also point TRUST_GATE_URL at your own deployment.

Are receipts quantum-resistant? Yes. Every receipt is hybrid-signed with Ed25519 (classical) and ML-DSA-65 (post-quantum, NIST FIPS 204 / CRYSTALS-Dilithium security level 3). The ML-DSA-65 public key is embedded directly in the receipt, enabling fully offline verification with no backend dependency. If a quantum computer breaks Ed25519 in the future, the ML-DSA-65 signature remains valid.

Can I verify a receipt without network access? Yes — for hybrid receipts. Pass the ml_dsa_signature and ml_dsa_public_key fields from the receipt to verify_receipt. Verification runs entirely on your machine using the embedded public key. No backend call, no Trust Gate account, no internet required.

Does this meet the EU AI Act? Article 50 transparency obligations (effective August 2, 2026) require AI-generated content to be marked in a machine-readable, detectable way, and digital signatures are one mechanism for that. Every TrustAtom receipt is digitally signed, timestamped, machine-readable, and verifiable offline by any third party. See the compliance table.

What's the performance impact? End-to-end gate_decision: ~180 ms including the backend call. Offline ML-DSA-65 verification: ~7 ms. See Benchmarks.


Documentation

  • WHITEPAPER.md — full technical paper with measured results, related work, reproducibility appendix
  • ARCHITECTURE.md — three-layer defense, data flow, performance budget, extension points
  • THREAT_MODEL.md — attack classes, defenses, full CVE coverage matrix
  • SECURITY.md — responsible disclosure (72h ack, 30d fix, 90d public)
  • CONTRIBUTING.md — contributor workflow with the 47/51/19 test invariants
  • CHANGELOG.md — release notes


Responsible Disclosure

Found a vulnerability? Open a private GitHub Security Advisory or email security@cyberwarriornetwork.com. We follow a 90-day responsible disclosure window. See SECURITY.md.


License

Apache-2.0 — Cyber Warrior Network

No Receipt. No Trust.