Every MCP tool call your AI agent makes is now policy-gated, hybrid-signed, and forensically provable — offline, forever.
If a tool was poisoned, rug-pulled, hijacked, or quantum-broken in 2033 — you have cryptographic proof today.
Your AI agent's decisions are retained for compliance audits that run 7–10 years; your current signatures survive maybe seven.
- February 2026: CVE-2026-25253 (OpenClaw) exposed 135,000 AI agent gateway instances to remote code execution with zero authentication. 824 malicious tools found active on the same day.
- October 2024: Salt Typhoon exfiltrated encrypted telecom data from 80+ countries — the clearest public example of "Harvest Now, Decrypt Later" in action.
- August 2026: EU AI Act Article 50 transparency obligations take effect. AI-generated content must be marked in a machine-readable, detectable way; digital signatures are one technique for meeting that requirement.
- 2030–2035: Consensus quantum timeline. Every Ed25519-signed receipt retained past this window is forgeable by a quantum adversary holding the corresponding public key.
trust-gate-mcp is the first MCP server built for this timeline.
flowchart LR
A[AI Agent] -->|"gate_decision()"| B[trust-gate-mcp]
B -->|sanitize<br/>Class A + Class B| C[Trust Gate<br/>policy engine]
C -->|allow| D[Hybrid sign<br/>Ed25519 + ML-DSA-65]
D -->|TrustAtom receipt<br/>+ embedded PQ pk| E[Verifiable offline<br/>forever]
C -->|deny / timeout / WAF| F[allow: false<br/>fail closed]
classDef primary fill:#38BDF8,stroke:#38BDF8,color:#0A0F1C
classDef quantum fill:#C084FC,stroke:#C084FC,color:#0A0F1C
classDef safe fill:#34D399,stroke:#34D399,color:#0A0F1C
classDef danger fill:#F87171,stroke:#F87171,color:#0A0F1C
class D quantum
class E safe
class F danger
See the animated architecture diagram for the full three-layer defense.
Claude Desktop — add to claude_desktop_config.json:
{
"mcpServers": {
"trust-gate": {
"command": "uvx",
"args": ["trust-gate-mcp"],
"env": {
"TRUST_GATE_API_KEY": "your-api-key"
}
}
}
}
Claude Code, one command:
claude mcp add trust-gate -- uvx trust-gate-mcp
Any MCP-compatible agent:
pip install trust-gate-mcp
TRUST_GATE_API_KEY=your-key trust-gate-mcp
Call gate_decision before any action that deploys code, writes data, sends messages, or makes commitments on behalf of users.
{
"action": "DEPLOY",
"agent_id": "ci-agent-alpha",
"resource_id": "prod-cluster-01",
"env": "PRODUCTION",
"human_approved": true
}
Returns: allow/deny decision, risk score, and a hybrid-signed TrustAtom receipt if the action is approved.
Mathematically verify any receipt. Two paths:
- Offline ML-DSA-65: provide ml_dsa_signature + ml_dsa_public_key (both embedded in every hybrid receipt). No backend call. No server trust. Verifiable years from now, even against a quantum adversary.
- Online Ed25519: provide evidence_hash + signature. Verified against the backend. Backward-compatible with every receipt.
See the offline verification flow.
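The choice between the two paths can be sketched as a small routing function. This is an illustration only: `choose_verification_path` is a hypothetical helper, not part of trust-gate-mcp, and preferring the offline path when both sets of fields are present is an assumption.

```python
from typing import Optional


def choose_verification_path(
    evidence_hash: str = "",
    signature: str = "",
    ml_dsa_signature: str = "",
    ml_dsa_public_key: str = "",
) -> Optional[str]:
    """Return the verification path the supplied fields allow, or None."""
    # Offline path: the signed hash plus both ML-DSA fields from the receipt.
    if evidence_hash and ml_dsa_signature and ml_dsa_public_key:
        return "offline-ml-dsa-65"
    # Online path: the hash plus the Ed25519 signature, checked by the backend.
    if evidence_hash and signature:
        return "online-ed25519"
    # Not enough material to verify; a caller should treat this as invalid.
    return None
```

A receipt that carries neither signature yields None, which a caller would likely report as unverifiable rather than guess a path.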
Evaluate whether an action would be allowed without creating a receipt or recording state. Zero side effects.
Returns backend status, MCP server version, and the ML-DSA-65 public key (base64) for offline verification.
Every receipt is hybrid-signed: Ed25519 (classical) and ML-DSA-65 (NIST FIPS 204, security level 3) over the same evidence hash, simultaneously.
evidence_hash = SHA-256(canonical_payload)
ed25519_sig = Ed25519.sign(private_key, evidence_hash) ← classical
ml_dsa_sig = ML_DSA_65.sign(secret_key, evidence_hash) ← post-quantum
Both signatures and the ML-DSA-65 public key are embedded in every receipt. Three properties:
- Today — Ed25519 verification works everywhere, as always.
- Post-quantum era — ML-DSA-65 verification holds even if Ed25519 is broken by a quantum computer.
- Offline — an auditor can verify the ML-DSA-65 signature years from now with zero network access — just the receipt.
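The evidence hash step can be illustrated with the standard library alone. The sketch below assumes the canonical payload is JSON with sorted keys and compact separators; the canonicalization Trust Gate actually uses is not specified here, and `canonical_payload` and `evidence_hash` are hypothetical helper names.

```python
import hashlib
import json
from typing import Any, Dict


def canonical_payload(payload: Dict[str, Any]) -> bytes:
    # Assumed canonical form: sorted keys, no insignificant whitespace, UTF-8.
    text = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def evidence_hash(payload: Dict[str, Any]) -> str:
    # SHA-256 over the canonical bytes, hex-encoded.
    return hashlib.sha256(canonical_payload(payload)).hexdigest()


# Key order does not matter once the payload is canonicalized.
a = {"action": "DEPLOY", "agent_id": "ci-agent-alpha", "env": "PRODUCTION"}
b = {"env": "PRODUCTION", "agent_id": "ci-agent-alpha", "action": "DEPLOY"}
same_hash = evidence_hash(a) == evidence_hash(b)
```

Because both signatures cover this single digest, changing any field of the payload changes the hash and invalidates the Ed25519 and ML-DSA-65 signatures together.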
from dilithium_py.ml_dsa import ML_DSA_65
import base64
# Fields from any hybrid TrustAtom receipt:
evidence_hash = receipt["evidence_hash"]
ml_dsa_sig = receipt["ml_dsa_signature_b64"]
ml_dsa_pub_key = receipt["ml_dsa_public_key_b64"]
valid = ML_DSA_65.verify(
base64.b64decode(ml_dsa_pub_key),
evidence_hash.encode("utf-8"),
base64.b64decode(ml_dsa_sig),
)
# => True (no network, no server trust, quantum-resistant)
Or via the MCP tool:
await verify_receipt(
evidence_hash=receipt["evidence_hash"],
signature="",  # left empty for the offline-PQ path
ml_dsa_signature=receipt["ml_dsa_signature_b64"],
ml_dsa_public_key=receipt["ml_dsa_public_key_b64"],
)
# => { "valid": true, "quantum_verified": true, "offline": true,
#      "signature_alg": "ML-DSA-65" }
ML-DSA-65 is the standardized form of CRYSTALS-Dilithium at security level 3 (~AES-192 equivalent). Standardized as NIST FIPS 204 on August 14, 2024.
Three independent defense layers. An attack must defeat all three.
Every string argument is cleaned before any conditional logic or API call.
| Threat | Pattern count | Defense | Reference |
|---|---|---|---|
| Invisible Unicode (tool poisoning) | unlimited | _INVISIBLE_UNICODE_RE strips RTL, zero-width, BOM, C0/C1 | Invariant Labs, Apr 2025 |
| Class A: policy override | 13 | _INJECTION_RE defangs injection phrases | CVE-2025-54794 |
| Class B: agentic diffusion / exfil relay | 4 | _INJECTION_RE defangs exfiltrate, leak/steal/dump + target, base64 encode-and-relay, embed-in-response | Willison Lethal Trifecta (Jun 2025) · OWASP LLM Top 10 |
| Oversized input | per-field | _MAX_LEN with FIPS 204-aware sizing | DoS prevention |
| Env spoofing | n/a | _validate_env allowlist coerces to SANDBOX | CVE-2025-64106 |
See the Class B defense flow. Full threat model in docs/THREAT_MODEL.md.
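The sanitization layer in the table can be sketched roughly as follows. This is not the project's implementation: the character ranges, the two example injection phrases, the `[defanged]` placeholder, the 256-character limit, and the two-entry environment allowlist are all assumptions chosen for illustration, and the names only mirror those in the table.

```python
import re

# Assumed character classes: zero-width and bidi controls, BOM, C0/C1 controls
# (tab, newline, and carriage return are deliberately left alone).
INVISIBLE_RE = re.compile(
    r"[\u200b-\u200f\u202a-\u202e\u2060-\u2064\ufeff"
    r"\x00-\x08\x0b\x0c\x0e-\x1f\x7f-\x9f]"
)

# Two example phrases only; the real pattern set is larger.
INJECTION_RE = re.compile(
    r"ignore\s+(?:all\s+)?previous\s+instructions|exfiltrate",
    re.IGNORECASE,
)

MAX_LEN = 256  # assumed per-field limit
ALLOWED_ENVS = {"PRODUCTION", "SANDBOX"}  # assumed allowlist


def sanitize(value: str, max_len: int = MAX_LEN) -> str:
    # Strip invisible characters first so they cannot split an injection phrase.
    cleaned = INVISIBLE_RE.sub("", value)
    cleaned = INJECTION_RE.sub("[defanged]", cleaned)
    return cleaned[:max_len]


def validate_env(env: str) -> str:
    # Anything outside the allowlist is coerced to the least-privileged value.
    candidate = sanitize(env).strip().upper()
    return candidate if candidate in ALLOWED_ENVS else "SANDBOX"
```

Stripping before matching is the important ordering choice here: a zero-width character inserted inside an injection phrase would otherwise let it slip past the pattern.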
The backend evaluates declarative policy rules. Unregistered agents receive an explicit deny — never a default allow. Decisions include a continuous risk score (0.0–1.0).
On allow: hybrid sign (Ed25519 + ML-DSA-65). On anything else (backend down, timeout, WAF 403, 429, exception): allow: false, risk_score: 1.0, receipt: null. An outage makes your system more conservative, not less.
| Condition | Result |
|---|---|
| Backend online, policy DENY | allow: false — normal evaluation |
| Backend online, policy ALLOW | allow: true + hybrid-signed receipt |
| Backend timeout / offline | allow: false, max risk — fail CLOSED |
| WAF block | allow: false, blocked_by_waf — clean JSON, no HTML |
| Rate limited | allow: false, rate_limited — clean JSON |
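The fail-closed behavior in the table can be sketched as a wrapper around the backend call. `decide_fail_closed`, the `call_backend` callable, and the `error` field are hypothetical names used for illustration; only the deny shape (allow: false, risk_score: 1.0, receipt: null) comes from the text above.

```python
from typing import Any, Callable, Dict

# Shape taken from the text above: deny, maximum risk, no receipt.
FAIL_CLOSED: Dict[str, Any] = {"allow": False, "risk_score": 1.0, "receipt": None}


def decide_fail_closed(call_backend: Callable[[], Any]) -> Dict[str, Any]:
    try:
        decision = call_backend()
    except Exception as exc:
        # Timeouts, connection errors, and anything unexpected all deny.
        return {**FAIL_CLOSED, "error": type(exc).__name__}
    if not isinstance(decision, dict):
        # For example, an HTML block page returned instead of JSON.
        return {**FAIL_CLOSED, "error": "malformed_response"}
    if decision.get("allow") is not True:
        # Only an explicit boolean True counts as an allow.
        return {**decision, "allow": False, "receipt": None}
    return decision
```

Testing for `is not True` rather than truthiness means a response such as `{"allow": "yes"}` is still treated as a deny.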
Measured on a commodity Windows 11 laptop, Python 3.12.10, pure-Python dilithium-py 1.4.0 (a C backend like liboqs reduces ML-DSA latency by roughly 10–20×).
| Operation | Time | Budget |
|---|---|---|
| Receipt mint (hybrid, includes keygen) | 71.3 ms | ≤ 100 ms |
| Receipt mint (hybrid, reuse keys) | ~45 ms | ≤ 100 ms |
| Ed25519 sign | 21 µs | ≤ 5 ms |
| Ed25519 verify | 1.73 ms | ≤ 5 ms |
| ML-DSA-65 sign | 36.9 ms | ≤ 100 ms |
| ML-DSA-65 verify (offline) | 6.93 ms | ≤ 20 ms |
| End-to-end gate_decision (incl. backend) | ~180 ms | ≤ 250 ms |
Signature sizes (FIPS 204 level 3):
- Ed25519 signature: 64 bytes
- ML-DSA-65 signature: 3,309 bytes
- ML-DSA-65 public key: 1,952 bytes (embedded in every receipt)
Reproducible with python demo_quantum_proof.py.
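The sizes above allow a cheap sanity check on a receipt before any signature math runs. The sketch below assumes the receipt is a dict carrying the `ml_dsa_signature_b64` and `ml_dsa_public_key_b64` fields used in the verification example; `has_valid_ml_dsa_sizes` is a hypothetical helper, not part of the package.

```python
import base64
import binascii

ML_DSA_65_SIG_BYTES = 3309  # signature size listed above
ML_DSA_65_PK_BYTES = 1952   # public key size listed above


def _decoded_len(b64_value) -> int:
    # Returns -1 for anything that is not valid base64 text.
    try:
        return len(base64.b64decode(b64_value, validate=True))
    except (binascii.Error, ValueError, TypeError):
        return -1


def has_valid_ml_dsa_sizes(receipt: dict) -> bool:
    sig_len = _decoded_len(receipt.get("ml_dsa_signature_b64", ""))
    pk_len = _decoded_len(receipt.get("ml_dsa_public_key_b64", ""))
    return sig_len == ML_DSA_65_SIG_BYTES and pk_len == ML_DSA_65_PK_BYTES
```

A size check proves nothing about authenticity; it only rejects truncated or padded fields early, before the slower ML-DSA-65 verification.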
All three suites run on every PR via GitHub Actions across Python 3.10–3.13 on Linux/macOS/Windows.
| Suite | Checks | Result |
|---|---|---|
| Unit tests (pytest) | 47 | PASS |
| OWASP ASI01–06 eval | 51 | PASS |
| Adversarial stress tests (CVE-mapped) | 19 | PASS |
| Bandit static analysis | 528 LOC | 0 issues |
| pip-audit (direct dependencies) | 3 deps | 0 CVEs |
Reproduce locally:
pytest tests/
python -X utf8 security_audit/owasp_asi_eval.py
python -X utf8 stress_test.py
python -m bandit -r src/ --severity-level high
python -m pip_audit --strict

| # | CVE / Disclosure | Attack class | Severity | Defense |
|---|---|---|---|---|
| 1 | CVE-2026-25253 (OpenClaw) | WebSocket CORS bypass → RCE on 135K instances | High (CVSS 8.8) | Fail-closed + input sanitization |
| 2 | Invariant Labs Tool Poisoning | Invisible Unicode exfiltrates SSH keys | Critical | Unicode stripping |
| 3 | CVE-2025-54794 | Indirect prompt injection via retrieved content | High | Class A defang |
| 4 | CVE-2025-68664 LangGrinch | Structured output injection → env var theft | Critical (CVSS 9.3) | Class A defang |
| 5 | CVE-2025-6514 mcp-remote RCE | OS command injection via handshake | Critical (CVSS 9.6) | Parameterized API calls |
| 6 | CVE-2025-64106 (Cursor) | Agent env spoofing | High (CVSS 8.8) | Env allowlist |
| 7 | CVE-2025-6515 Prompt Hijacking | Session token takeover | High | Receipt-based session binding |
| 8 | Invariant Labs Rug Pull | Post-approval tool description mutation | Critical | Receipt-based tool hash |
| 9 | CVE-2025-5276 SSRF | SSRF via MCP URL fetch | High | Policy-gated allowlist |
| 10 | OWASP MCP Top 10 — MCP02, MCP10 | Excessive scope / context over-sharing | High | Per-action receipt scoping |
| 11 | Willison Lethal Trifecta (Jun 2025) | Agentic diffusion exfiltration | High | Class B defang |
| 12 | OWASP LLM04 | Insecure plugin data exfiltration | High | Class B defang |
| 13 | Trend Micro Agentic AI Exfil (2024) | Base64-encoded exfil channel | High | Class B defang |
| 14 | Palo Alto Unit 42 Agentic AI Attack Framework (May 2025) | Tool-chain privilege escalation | High | Class B defang + fail-closed |
| 15 | Salt Typhoon HNDL (Oct 2024) | Harvest-now-decrypt-later against receipts | Nation-state | ML-DSA-65 hybrid signing |
See docs/THREAT_MODEL.md for the full matrix. Run python stress_test.py to verify defenses against CVE-mapped inputs.
One deployment satisfies multiple overlapping mandates:
| Framework | Requirement | How receipts satisfy it |
|---|---|---|
| NIST FIPS 204 (August 14, 2024) | ML-DSA digital signatures | Native: every receipt signed with ML-DSA-65 security level 3 |
| NSA CNSA 2.0 (2027 acquisition, 2030–2031 full migration) | Post-quantum digital signatures on NSS | Same FIPS 204 algorithm family; note that CNSA 2.0 specifies the ML-DSA-87 parameter set, while receipts use ML-DSA-65 |
| EU AI Act Article 50 (effective August 2, 2026) | Digital signatures, watermarks, metadata on AI output | Every receipt: signed + timestamped + machine-readable + offline-verifiable |
| NIST AI RMF | GOVERN 4.1, MEASURE 2.5 — audit trail | Cryptographic receipt chain |
| NIST SP 800-218 SSDF | PS.3.1, PW.5.1 — signed provenance | Hybrid-signed per action |
| SOC 2 | CC7.2, CC7.3 — tamper-evident logging | Third-party offline verifiable |
| NIST SP 800-53 | AU-10 (non-repudiation), SC-12, SC-13 | Hybrid signatures + embedded public key |
mcp-scan by Invariant Labs scans your MCP config for known poisoning patterns at install time. Use it.
trust-gate-mcp generates cryptographic proof at runtime — every call, every time:
mcp-scan --> scan before you install --> "is this tool safe to run?"
trust-gate-mcp --> receipt every time it runs --> "did this tool do what it claimed?"
One detects. One proves. Both belong in your MCP stack.
| Variable | Default | Description |
|---|---|---|
| TRUST_GATE_URL | https://cwn-trust-gate.onrender.com | Trust Gate API base URL |
| TRUST_GATE_API_KEY | (empty) | Bearer token; get one at cwn-trust-gate.onrender.com |
| TRUST_GATE_TENANT | default | Tenant ID for multi-tenant deployments |
| TRUST_GATE_TIMEOUT | 30 | HTTP timeout in seconds |
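As a rough illustration of how these variables and defaults fit together, the sketch below loads them into a settings object. The `Settings` class and `load_settings` function are hypothetical names, and both the trailing-slash stripping and the fallback to 30 seconds on an unparsable timeout are assumptions rather than documented behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    url: str
    api_key: str
    tenant: str
    timeout: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    # Assumed behavior: an unparsable timeout falls back to the default.
    try:
        timeout = float(env.get("TRUST_GATE_TIMEOUT", "30"))
    except ValueError:
        timeout = 30.0
    return Settings(
        # Assumed normalization: drop a trailing slash from the base URL.
        url=env.get("TRUST_GATE_URL", "https://cwn-trust-gate.onrender.com").rstrip("/"),
        api_key=env.get("TRUST_GATE_API_KEY", ""),
        tenant=env.get("TRUST_GATE_TENANT", "default"),
        timeout=timeout,
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests without touching the process environment.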
git clone https://github.com/Cyber-Warrior-Network/trust-gate-mcp
cd trust-gate-mcp
pip install -e ".[dev]"
# Run the full test suite (47 tests, ~2s)
pytest
# Run CVE-mapped stress tests (3 iterations, 19 checks)
python -X utf8 stress_test.py
# Run OWASP ASI01-06 evaluation (51 checks)
python -X utf8 security_audit/owasp_asi_eval.py
# Reproduce the quantum demo (hybrid mint + offline verify + tamper + benchmark)
python demo_quantum_proof.py
See CONTRIBUTING.md for the full contributor workflow.
| Test Class | Tests | Covers |
|---|---|---|
| TestConfig | 3 | Settings, environment loading |
| TestClientHeaders | 3 | Auth headers, URL normalization |
| TestClientMethods | 7 | HTTP construction per tool, WAF/429 clean errors |
| TestGateDecisionTool | 4 | Success, error, WAF 403, all parameters |
| TestVerifyReceiptTool | 3 | Signature verification, invalid JSON, errors |
| TestCheckPolicyTool | 2 | Policy result, error handling |
| TestHealthTool | 2 | Success, unreachable |
| TestSanitizeFunction | 8 | RTL override, zero-width, Class A injection, oversized, env spoofing |
| TestAgenticDiffusionDefense | 5 | Class B exfiltration patterns (MANDATORY since Apr 2026) |
| TestMlDsaFieldSizeLimits | 4 | FIPS 204 field size sanity (MANDATORY) |
| TestGateDecisionSanitization | 5 | End-to-end sanitization, empty field deny |
| TestCheckPolicySanitization | 1 | Empty fields return deny |
Total: 47 tests. 51 OWASP ASI checks. 19 stress tests. Every PR re-runs all three.
How do I protect my MCP servers from tool poisoning?
Add trust-gate-mcp to your MCP config. Every tool call is policy-evaluated and receives a hybrid-signed receipt. Any deviation from declared behavior is logged and denied on subsequent calls.
Is my setup vulnerable to CVE-2026-25253 (OpenClaw)?
If you run OpenClaw without patching to v2026.2.25, yes. trust-gate-mcp adds a receipt-based detection layer even on unpatched systems — unauthorized routing produces a policy violation that surfaces in your audit log.
What happens if the Trust Gate backend goes down?
The server fails closed — every gate_decision returns allow: false. No action proceeds while the policy engine is unreachable. An outage makes your system more conservative, not less.
Does it work with Claude Desktop, Cursor, and Windsurf?
Yes. It wraps the MCP configuration at the transport layer, so it is compatible with any standard MCP client.
Is this open source?
Yes. Apache-2.0 license. The MCP client and all tests are fully open. The hosted policy engine is a separate service — you can also point TRUST_GATE_URL at your own deployment.
Are receipts quantum-resistant?
Yes. Every receipt is hybrid-signed with Ed25519 (classical) and ML-DSA-65 (post-quantum, NIST FIPS 204 / CRYSTALS-Dilithium security level 3). The ML-DSA-65 public key is embedded directly in the receipt, enabling fully offline verification with no backend dependency. If a quantum computer breaks Ed25519 in the future, the ML-DSA-65 signature remains valid.
Can I verify a receipt without network access?
Yes — for hybrid receipts. Pass the ml_dsa_signature and ml_dsa_public_key fields from the receipt to verify_receipt. Verification runs entirely on your machine using the embedded public key. No backend call, no Trust Gate account, no internet required.
Does this meet the EU AI Act?
Article 50 transparency obligations (effective August 2, 2026) require AI-generated content to be marked in a machine-readable, detectable way. Every TrustAtom receipt is digitally signed, timestamped, machine-readable, and verifiable offline by any third party, which supports that obligation. See the compliance table.
What's the performance impact?
End-to-end gate_decision: ~180 ms including the backend call. Offline ML-DSA-65 verification: ~7 ms. See Benchmarks.
- WHITEPAPER.md — full technical paper with measured results, related work, reproducibility appendix
- ARCHITECTURE.md — three-layer defense, data flow, performance budget, extension points
- THREAT_MODEL.md — attack classes, defenses, full CVE coverage matrix
- SECURITY.md — responsible disclosure (72h ack, 30d fix, 90d public)
- CONTRIBUTING.md — contributor workflow with the 47/51/19 test invariants
- CHANGELOG.md — release notes
- Trust Gate — the hosted policy engine this server connects to
- mcp-scan — scan your MCP configs for tool poisoning
- OWASP MCP Top 10 — the definitive MCP security taxonomy
- NIST FIPS 204 — Module-Lattice-Based Digital Signature Standard
- dilithium-py — the pure-Python ML-DSA reference we build on
- Model Context Protocol — the spec this server implements
Found a vulnerability? Open a private GitHub Security Advisory or email security@cyberwarriornetwork.com. We follow a 90-day responsible disclosure window. See SECURITY.md.
Apache-2.0 — Cyber Warrior Network
No Receipt. No Trust.