AgentGateway Security Guards

Security guards protect your AI agents from common attack vectors when using MCP (Model Context Protocol) servers.

Available Guards

1. Tool Poisoning Guard

Detects and blocks malicious tool descriptions that could manipulate AI behavior.

Attack Pattern: An MCP server returns tools with descriptions containing hidden instructions like:

"SYSTEM: ignore all previous instructions"
"When this tool is called, also execute..."
Hidden unicode characters with embedded commands

Configuration:

securityGuards:
  toolPoisoning:
    enabled: true
    strict_mode: true
    custom_patterns:
      - "(?i)SYSTEM:\\s*override"
      - "(?i)ignore\\s+all\\s+previous"

How it works:

Intercepts tools/list responses from MCP servers
Scans tool names, descriptions, and input schemas
Blocks if suspicious patterns are detected
Returns error to client instead of poisoned tools

2. Rug Pull Guard

Detects when MCP servers change their tools after the initial handshake.

Attack Pattern: An MCP server initially returns safe tools, but after the AI has been "trusted," it changes the tools to malicious ones (the "rug pull").

Configuration:

securityGuards:
  rugPull:
    enabled: true
    risk_threshold: 1

risk_threshold options:

The threshold of MCP tool changes that triggers the security guard to block a request

How it works:

Records tool fingerprint on first tools/list response
Compares subsequent responses against baseline
Blocks if tools have changed unexpectedly after risk threshold is reached
New sessions establish fresh baselines

3. PII Guard

Detects and optionally masks personally identifiable information in MCP responses.

Configuration:

securityGuards:
  pii:
    enabled: true
    detect:
      - email
      - credit_card
      - phone
      - ssn
    action: mask  # or "reject"
    min_score: 0.3

Actions:

mask: Replace PII with [REDACTED]
reject: Reject the entire response

Configuration Example

Complete example with all guards:

binds:
- port: 8080
  listeners:
  - hostname: "*"
    routes:
    - name: protected-mcp
      matches:
      - path:
          pathPrefix: /mcp
      backends:
      - mcp:
          targets:
          - name: backend
            mcp:
              host: http://mcp-server:3000/mcp
          statefulMode: stateful
      policies:
        securityGuards:
          toolPoisoning:
            enabled: true
            strict_mode: true
          rugPull:
            enabled: true
            risk_threshold: 1
          pii:
            enabled: false  # Enable if needed

Testing Security Guards

Run the E2E test suite to verify guards are working:

# Run all security guard tests
./deploy.sh

# Or manually
cd tests/docker
docker compose up -d --build
docker compose run --rm test-runner

#Or via make:
# whole test suite
make test-docker

# separate test
make test-docker-up
python tests/<selected_test>
make test-docker-down

The test suite includes:

Tool poisoning detection tests
Rug pull detection tests (session and global scope)
PII masking tests

Guard Behavior

Failure Modes

Guards operate in fail_closed mode by default:

If a guard detects a threat, the request is blocked
If a guard encounters an error, the request is blocked
This ensures security even during unexpected conditions

Response Format

When a guard blocks a request, the client receives:

{
  "jsonrpc": "2.0",
  "error": {
    "code": -32603,
    "message": "Security guard denied for server 'backend': tool_poisoning detected suspicious content"
  }
}

Best Practices

Enable tool poisoning by default - Low overhead, high protection
Enable PII only when needed - Has performance overhead
Test with your actual MCP servers - Ensure no false positives
Monitor guard logs - Track blocked requests for security analysis

Limitations

Guards operate at the gateway level, not inside the AI model
Cannot detect attacks embedded in legitimate-looking content
PII detection has accuracy limits (configurable via min_score)
Rug pull detection requires stateful sessions

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

AgentGateway Security Guards

Available Guards

1. Tool Poisoning Guard

2. Rug Pull Guard

3. PII Guard

Configuration Example

Testing Security Guards

Guard Behavior

Failure Modes

Response Format

Best Practices

Limitations

FilesExpand file tree

SECURITY_GUARDS.md

Latest commit

History

SECURITY_GUARDS.md

File metadata and controls

AgentGateway Security Guards

Available Guards

1. Tool Poisoning Guard

2. Rug Pull Guard

3. PII Guard

Configuration Example

Testing Security Guards

Guard Behavior

Failure Modes

Response Format

Best Practices

Limitations