InjectGuard

A layered defense system that protects AI agents from indirect prompt injection attacks hidden in external content — files, URLs, API responses, and tool outputs.

InjectGuard sits between your agent and the outside world. It scans every piece of external content through an ensemble of detection engines before it ever reaches your LLM.

External Content ──> InjectGuard ──> Safe Content ──> Your Agent
   (files, URLs,       (scan,            (clean,
    API responses,      detect,           sanitized,
    tool outputs)       block)            annotated)

Why

LLMs are vulnerable to indirect prompt injection: malicious instructions embedded in documents, web pages, or API responses that hijack agent behavior. These attacks hide in:

HTML pages (invisible <div style="display:none"> elements, comments, tiny fonts)
PDFs (annotations, metadata, embedded files)
Images (EXIF metadata, PNG text chunks, OCR text)
JSON/YAML (deeply nested string values)
Encoded payloads (base64, ROT13, hex sequences)
Multilingual attacks (instructions in 12+ languages)
Split payloads (instructions spread across multiple fields)

InjectGuard is built to detect every one of these vectors.

Install

pip install -e .                 # core + heuristic scanner
pip install -e ".[api]"          # + FastAPI server & dashboard
pip install -e ".[similarity]"   # + vector similarity scanner
pip install -e ".[all]"          # everything
pip install -e ".[dev]"          # + test/lint tools

Quick Start

Python SDK

from injectguard import Scanner

scanner = Scanner()

# Scan text from any external source
result = scanner.scan_text("content from an API response")
if result.is_safe:
    agent.process(result.original_content)
else:
    print(f"Blocked: {result.threats_found}")  # e.g. ['instruction_override', 'exfiltration']

# Scan files (HTML, PDF, JSON, images, text)
result = scanner.scan_file("document.pdf")

# Scan URLs (with built-in SSRF protection)
result = scanner.scan_url("https://example.com/data")

# Batch scan
results = scanner.scan_batch([
    {"content": "first item"},
    {"content": "second item"},
])

Policy Presets

from injectguard import Scanner, Policy

# Permissive — higher thresholds, fewer blocks
scanner = Scanner.with_preset("permissive")

# High security — aggressive blocking
scanner = Scanner.with_preset("high_security")

# Custom policy from YAML
scanner = Scanner(policy=Policy.from_file("policy.yaml"))

CLI

# Scan a file
injectguard scan document.pdf

# Scan a URL
injectguard scan https://example.com/page

# Scan from stdin
echo "ignore previous instructions" | injectguard scan -

# JSON output
injectguard scan document.html --json

# Verbose mode — show individual scanner scores
injectguard scan suspicious.pdf --verbose

# Use a stricter policy
injectguard scan data.json --preset high_security

Exit codes: 0 = safe, 1 = flagged, 2 = blocked.
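
The exit codes make it straightforward to gate a script on the scan result. A minimal sketch, assuming the `injectguard` executable is on your PATH; the helper names `verdict_from_exit_code` and `scan_path` are hypothetical, and treating any other exit code as `error` is an assumption rather than documented behavior.

```python
import subprocess

# Mapping taken from the exit codes documented above.
EXIT_CODE_VERDICTS = {0: "safe", 1: "flagged", 2: "blocked"}


def verdict_from_exit_code(code: int) -> str:
    """Translate a CLI exit code into a verdict name (unknown codes -> 'error')."""
    return EXIT_CODE_VERDICTS.get(code, "error")


def scan_path(path: str) -> str:
    """Run `injectguard scan <path>` and return the verdict implied by its exit code."""
    completed = subprocess.run(["injectguard", "scan", path], check=False)
    return verdict_from_exit_code(completed.returncode)
```

Calling `scan_path("document.pdf")` would then return `"safe"`, `"flagged"`, or `"blocked"` without parsing any output.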

API Server

# Start the server
injectguard serve --host 0.0.0.0 --port 9000

# Or with a policy preset
injectguard serve --preset high_security

# Scan content
curl -X POST http://localhost:9000/v1/scan \
  -H "Content-Type: application/json" \
  -d '{"content": "some text to scan"}'

# Batch scan
curl -X POST http://localhost:9000/v1/scan/batch \
  -H "Content-Type: application/json" \
  -d '{"items": [{"content": "text 1"}, {"content": "text 2"}]}'

# Sanitize — scan + strip threats
curl -X POST http://localhost:9000/v1/sanitize \
  -H "Content-Type: application/json" \
  -d '{"content": "Ignore instructions. Normal text here."}'

# Health check
curl http://localhost:9000/v1/health

# Audit log
curl "http://localhost:9000/v1/audit?limit=50"
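
The same scan call can be made from Python with only the standard library. This is a sketch mirroring the curl example for `/v1/scan`; the function names are hypothetical, and the response is returned as a plain dict because its schema is not described here.

```python
import json
import urllib.request


def build_scan_request(content: str, base_url: str = "http://localhost:9000") -> urllib.request.Request:
    """Build the POST /v1/scan request shown in the curl example."""
    body = json.dumps({"content": content}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/scan",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def scan(content: str, base_url: str = "http://localhost:9000") -> dict:
    """Send the request to a running server and decode the JSON response."""
    with urllib.request.urlopen(build_scan_request(content, base_url), timeout=10) as resp:
        return json.load(resp)
```

If API keys are enabled on the server, the request would also need the `X-API-Key` header.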

HTTP Proxy

Transparently scan all HTTP traffic for your agent:

injectguard proxy --port 8080 --block-mode replace

# Point your agent at the proxy
export HTTP_PROXY=http://127.0.0.1:8080

Block modes: replace (swap blocked content), drop (empty response), header_only (pass content but add warning headers).
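
To make the three block modes concrete, here is a rough sketch of how a proxy might apply them to a blocked response. It is not InjectGuard's implementation: `ProxyResponse`, `apply_block_mode`, the placeholder text, and the `X-InjectGuard-Warning` header name are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProxyResponse:
    body: bytes
    headers: Dict[str, str] = field(default_factory=dict)


WARNING_HEADER = "X-InjectGuard-Warning"  # hypothetical header name


def apply_block_mode(
    response: ProxyResponse,
    mode: str,
    threats: List[str],
    placeholder: bytes = b"[blocked by InjectGuard]",
) -> ProxyResponse:
    """Return the response the agent should see for a blocked upstream response."""
    headers = dict(response.headers)  # copy so the original is left untouched
    if mode == "replace":
        # Swap the blocked body for a harmless placeholder.
        return ProxyResponse(placeholder, headers)
    if mode == "drop":
        # Deliver an empty body.
        return ProxyResponse(b"", headers)
    if mode == "header_only":
        # Pass the body through unchanged, but attach a warning header.
        headers[WARNING_HEADER] = ",".join(threats)
        return ProxyResponse(response.body, headers)
    raise ValueError(f"unknown block mode: {mode!r}")
```

Note that `header_only` still delivers the suspicious content, so it only helps if the agent or its framework actually inspects response headers.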

Detection Pipeline

InjectGuard uses a layered ensemble approach. Each scanner votes with a confidence score, and a weighted aggregator produces the final verdict.

Scanner	What it catches	Speed
Heuristic	~70 regex patterns across 7 attack categories: instruction override, fake context, exfiltration, social engineering, encoding evasion, structural attacks, manipulation. Also detects invisible characters and homoglyph substitution.	<1ms
Advanced Heuristic	Base64/ROT13/hex encoded payloads, multilingual attacks (12 languages), split payload detection	~1ms
ML Classifier	DeBERTa-v3-base fine-tuned for prompt injection detection (ONNX inference)	~50ms
Vector Similarity	ChromaDB + sentence-transformers matching against known attack patterns. Self-hardening: confirmed attacks are added to the vector store.	~20ms
LLM Judge	Multi-backend (Ollama, Claude, OpenAI) structured analysis for subtle attacks that evade pattern matching	~1-5s

The pipeline supports early exit: if any scanner returns a score above 0.95, the pipeline stops there and skips the remaining, slower scanners.
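
The ensemble and early-exit behavior can be sketched as follows. The exact aggregation formula is not specified here, so this example assumes a weighted mean of scanner scores; `run_pipeline` and the scanner tuples are hypothetical names, not InjectGuard's API.

```python
from typing import Callable, Dict, List, Tuple

# (name, weight, scan function returning a confidence score in [0, 1])
ScannerSpec = Tuple[str, float, Callable[[str], float]]


def run_pipeline(
    content: str,
    scanners: List[ScannerSpec],
    early_exit: float = 0.95,
) -> Tuple[float, Dict[str, float]]:
    """Run scanners in order (fastest first) and return (final score, per-scanner scores)."""
    scores: Dict[str, float] = {}
    weighted_sum = 0.0
    weight_total = 0.0
    for name, weight, scan in scanners:
        score = scan(content)
        scores[name] = score
        if score > early_exit:
            # High-confidence hit: stop here and skip the slower scanners.
            return score, scores
        weighted_sum += weight * score
        weight_total += weight
    # Assumption: the aggregate is the weighted mean of the individual scores.
    final = weighted_sum / weight_total if weight_total else 0.0
    return final, scores
```

Ordering the list from cheapest to most expensive scanner is what makes the early exit worthwhile: a sub-millisecond regex hit can avoid a multi-second LLM call.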

Verdicts

Verdict	Meaning
`safe`	No threats detected
`flagged`	Suspicious content, review recommended
`blocked`	High-confidence threat, content should not reach the agent
`sanitized`	Threats were found and stripped; sanitized content is available
`error`	Scanner error

Content Parsing

InjectGuard doesn't just scan visible text. Its paranoid extraction mode pulls content from places attackers hide payloads:

HTML: CSS-hidden elements (display:none, visibility:hidden, opacity:0, off-screen positioning, zero font-size, same-color text), comments, <script>/<noscript>, hidden inputs, aria-hidden, data attributes
PDF: Page text, annotations, metadata fields, embedded file detection
Images: PNG text chunks (tEXt, iTXt), EXIF metadata, OCR via Tesseract
JSON/YAML: Recursive string extraction with key path tracking
URLs: Full page fetch with SSRF protection, optional Playwright JS rendering for SPAs
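
As an illustration of recursive string extraction with key path tracking, the sketch below walks a parsed JSON or YAML document and yields every string value together with the path that leads to it. The function name and the `$.key[index]` path notation are assumptions made for this example.

```python
from typing import Any, Iterator, Tuple


def extract_strings(node: Any, path: str = "$") -> Iterator[Tuple[str, str]]:
    """Yield (key_path, string_value) for every string nested inside node."""
    if isinstance(node, str):
        yield path, node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from extract_strings(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from extract_strings(value, f"{path}[{index}]")
    # Numbers, booleans, and None carry no text, so they are skipped.
```

Keeping the path alongside each string lets a report point at the exact field that held a payload, however deeply it was nested.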

Sanitization

When content is flagged, InjectGuard can clean it instead of blocking:

Strip invisible characters — zero-width spaces, directional overrides, BOM markers
Neutralize delimiters — <system> becomes [TAG:system], preventing role injection
Remove hidden HTML — strips display:none elements, comments, scripts
Annotate suspicious content — wraps threats in [SUSPICIOUS:instruction_override] markers
Truncate — enforces content length limits
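
Three of these steps (stripping invisible characters, neutralizing delimiters, and truncating) can be sketched in a few lines. This is an illustration only: the character set, the list of role tag names, and the `[TAG:/system]` form for closing tags are assumptions, and the function names are hypothetical.

```python
import re

# Zero-width spaces/joiners, word joiner, directional embeddings/overrides/isolates, BOM.
INVISIBLE = re.compile("[\u200b\u200c\u200d\u2060\u202a-\u202e\u2066-\u2069\ufeff]")

# Assumed set of role-like tags; a real list would likely be longer.
ROLE_TAG = re.compile(r"<\s*(/?)\s*(system|assistant|user)\s*>", re.IGNORECASE)


def strip_invisible(text: str) -> str:
    """Remove characters that render as nothing but can hide or reorder text."""
    return INVISIBLE.sub("", text)


def neutralize_delimiters(text: str) -> str:
    """Rewrite <system> as [TAG:system] so it no longer looks like a role marker."""
    return ROLE_TAG.sub(lambda m: f"[TAG:{m.group(1)}{m.group(2).lower()}]", text)


def sanitize(text: str, max_length: int = 500_000) -> str:
    """Apply both transformations, then enforce the length limit."""
    return neutralize_delimiters(strip_invisible(text))[:max_length]
```

Stripping invisible characters first matters: otherwise a zero-width space inside `<sys​tem>` could keep the tag from matching.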

Framework Integrations

LangChain

from injectguard.integrations.langchain import ShieldedWebLoader, shield_tool

# Scan-on-load for web content
loader = ShieldedWebLoader("https://example.com")
docs = loader.load()  # raises if blocked

# Wrap any tool
@shield_tool()
def my_tool(query: str) -> str:
    return external_api.call(query)

CrewAI

from injectguard.integrations.crewai import ShieldedTool, shield_crew_tools

# Wrap a single tool
safe_tool = ShieldedTool(original_tool)

# Wrap all tools for a crew
safe_tools = shield_crew_tools([tool1, tool2, tool3])

MCP (Model Context Protocol)

from injectguard.integrations.mcp import MCPToolWrapper

wrapper = MCPToolWrapper()

@wrapper.wrap
async def fetch_data(url: str) -> str:
    return await http_client.get(url)

Dashboard

The built-in web dashboard provides real-time monitoring at /dashboard:

Total scans, safe/flagged/blocked counts
Average latency and block rate
Recent scan table with verdict, score, threats, and timing
Auto-refreshes every 30 seconds via HTMX

injectguard serve  # dashboard available at http://localhost:9000/dashboard

Alerting

Get notified when threats are detected:

# Environment variables
export INJECTGUARD_SLACK_WEBHOOK=https://hooks.slack.com/services/...
export INJECTGUARD_WEBHOOK_URL=https://your-app.com/webhook
export INJECTGUARD_ALERT_ON_FLAGGED=true

from injectguard.alerting import AlertManager, SlackAlert, WebhookAlert

manager = AlertManager.from_env()
# or configure manually
manager = AlertManager(alert_on_blocked=True, alert_on_flagged=True)
manager.add_channel(SlackAlert(webhook_url="..."))
manager.add_channel(WebhookAlert(url="..."))

# After each scan
manager.check_and_alert(result)

Authentication & Rate Limiting

# Require API keys (comma-separated)
export INJECTGUARD_API_KEYS=key1,key2,key3

# Rate limiting
export INJECTGUARD_RATE_LIMIT=100        # requests per window
export INJECTGUARD_RATE_WINDOW=60        # window in seconds
export INJECTGUARD_RATE_LIMIT_ENABLED=true

curl -H "X-API-Key: key1" http://localhost:9000/v1/scan ...

When no API keys are configured, authentication is disabled (open access).
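
The sketch below shows one way these settings could be interpreted: comma-separated keys, open access when none are configured, and a per-client request cap per window. Whether InjectGuard uses a fixed or sliding window is not stated, so the fixed-window limiter here is an assumption, and all names are hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Set, Tuple


def parse_api_keys(raw: Optional[str]) -> Set[str]:
    """Split a comma-separated INJECTGUARD_API_KEYS value into a set of keys."""
    return {key.strip() for key in (raw or "").split(",") if key.strip()}


def is_authorized(presented: Optional[str], keys: Set[str]) -> bool:
    """With no keys configured, authentication is disabled and every request passes."""
    return not keys or presented in keys


class FixedWindowLimiter:
    """Allow at most `limit` requests per client in each `window`-second window."""

    def __init__(self, limit: int = 100, window: float = 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock
        self._state: Dict[str, Tuple[int, int]] = {}  # client -> (window id, count)

    def allow(self, client: str) -> bool:
        window_id = int(self.clock() // self.window)
        seen_window, count = self._state.get(client, (window_id, 0))
        if seen_window != window_id:
            count = 0  # a new window has started, so reset the counter
        if count >= self.limit:
            self._state[client] = (window_id, count)
            return False
        self._state[client] = (window_id, count + 1)
        return True
```

A production check would typically compare keys in constant time (for example with `hmac.compare_digest`); the set lookup above is kept simple for readability.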

Deployment

Docker

cd docker
docker compose up -d

Kubernetes (Helm)

helm install injectguard ./helm/injectguard \
  --set apiKeys="key1,key2" \
  --set ingress.enabled=true \
  --set "ingress.hosts[0].host=injectguard.example.com"

The Helm chart includes: Deployment, Service, HPA (autoscaling), Ingress, PVC (persistent audit storage), and Secrets management.

Policy Configuration

Create a policy.yaml to customize thresholds, scanner weights, domain rules, and more:

thresholds:
  block: 0.85
  flag: 0.60
  sanitize: 0.70

scanner_weights:
  heuristic: 1.0
  ml_classifier: 1.2
  similarity: 0.8
  llm_judge: 1.5

content_rules:
  max_content_length: 500000
  strip_invisible_chars: true
  check_homoglyphs: true

domain_policy:
  blocked_domains:
    - "evil.com"
  allowed_domains: []

enable_ml_classifier: true
enable_similarity: false
enable_llm_judge: false
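
To show how the `block` and `flag` thresholds might translate an aggregate score into a verdict, here is a small sketch using the values from the example policy. It is an assumption that scores at or above a threshold trigger it; the `sanitize` threshold is left out because its interaction with the other two is not described here. `verdict_for` is a hypothetical helper.

```python
# Values copied from the example policy above.
THRESHOLDS = {"block": 0.85, "flag": 0.60}


def verdict_for(score: float, thresholds: dict = THRESHOLDS) -> str:
    """Map an aggregate score in [0, 1] to a verdict name."""
    if score >= thresholds["block"]:
        return "blocked"
    if score >= thresholds["flag"]:
        return "flagged"
    return "safe"
```

Under these assumptions, lowering `flag` sends more borderline content to review, while lowering `block` stops more content outright.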

Benchmarks

# Accuracy benchmark (35 malicious + 25 benign samples)
injectguard benchmark --verbose

# Latency profiling
python benchmarks/bench_latency.py

Heuristic-only results (no ML model):

Accuracy: 88.3%
Precision: 96.7%
Recall: 82.9%
Latency: 0.15ms (100B) to 71ms (50KB)
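
With 35 malicious and 25 benign samples, the reported percentages correspond to 29 true positives, 6 false negatives, 1 false positive, and 24 true negatives. These counts are inferred from the figures above rather than taken from benchmark output; the snippet checks the arithmetic.

```python
def metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, precision, and recall as percentages rounded to one decimal."""
    return {
        "accuracy": round(100 * (tp + tn) / (tp + fp + tn + fn), 1),
        "precision": round(100 * tp / (tp + fp), 1),
        "recall": round(100 * tp / (tp + fn), 1),
    }


# Inferred counts: 29 + 6 = 35 malicious samples, 1 + 24 = 25 benign samples.
heuristic_only = metrics(tp=29, fp=1, tn=24, fn=6)
print(heuristic_only)  # {'accuracy': 88.3, 'precision': 96.7, 'recall': 82.9}
```

In other words, the heuristic-only configuration missed 6 of the 35 attacks and raised one false alarm on the 25 benign samples.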

Architecture

src/injectguard/
  client.py              # Scanner SDK (main entry point)
  models.py              # Verdict, ScanResult, ScannerResult
  policy.py              # Policy configuration & presets
  cli.py                 # CLI (scan, serve, proxy, benchmark)
  alerting.py            # Slack, webhook, console alerts
  proxy.py               # HTTP scanning proxy
  core/
    pipeline.py          # Scanner orchestration & ensemble scoring
  scanners/
    heuristic.py         # Regex pattern scanner (~70 patterns)
    advanced_heuristics.py  # Encoding, multilingual, split payloads
    ml_classifier.py     # DeBERTa ONNX classifier
    similarity.py        # ChromaDB vector similarity
    llm_judge.py         # LLM-based analysis
  parsers/
    html.py              # Paranoid HTML parser
    pdf.py               # PDF text + annotation extractor
    image.py             # PNG chunks, EXIF, OCR
    json_parser.py       # Recursive JSON string extractor
    text.py              # Plain text / markdown
  fetchers/
    url.py               # URL fetcher with SSRF protection
    file.py              # Local file fetcher
    playwright.py        # JS-rendered page fetcher
  sanitizer/
    engine.py            # Content sanitizer
  integrations/
    langchain.py         # LangChain loaders & tool wrapper
    crewai.py            # CrewAI tool wrapper
    mcp.py               # MCP middleware & tool wrapper
  api/
    server.py            # FastAPI app factory
    routes.py            # REST API endpoints
    middleware.py        # Auth & rate limiting
    state.py             # App state management
  audit/
    store.py             # SQLite audit log
    logger.py            # Structured logging
  dashboard/
    routes.py            # HTMX dashboard routes
    templates/           # Dashboard HTML templates

License

MIT
