Configuration
Rana Faraz edited this page Jun 23, 2026 · 1 revision
GuardrAIl is configured through a Policy object (a pydantic data model), environment variables (GUARDRAIL_* prefix), and an optional .env file. All settings have offline-safe defaults.
Load a preset by name:
from guardrail import Guard
g = Guard.from_policy("default")  # or "ferpa", "coppa", "gdpr"

| Preset | Injection action | Input PII action | Output PII action | Toxicity action |
|---|---|---|---|---|
| default | block | redact | redact | flag |
| ferpa | block | redact | redact | flag |
| coppa | block (lower threshold) | block | redact | block |
| gdpr | block | redact | redact | flag |
Override any field:
from guardrail import Guard, Policy
g = Guard(Policy(
name="custom",
injection_threshold=0.3, # lower = more aggressive detection
detect_pii=True,
detect_toxicity=False, # disable toxicity guard
toxicity_action="block", # override action
pii_action="redact",
))

All env vars use the GUARDRAIL_ prefix. They override the preset's defaults.

| Env var | Default | Description |
|---|---|---|
| GUARDRAIL_POLICY | default | Policy preset to load (default, ferpa, coppa, gdpr) |
| GUARDRAIL_PII_BACKEND | regex | PII detection backend (regex, presidio) |
| GUARDRAIL_TOXICITY_BACKEND | lexical | Toxicity detection backend (lexical, detoxify) |
| GUARDRAIL_INJECTION_THRESHOLD | 0.5 | Injection score threshold for blocking (0–1) |
| GUARDRAIL_TOXICITY_THRESHOLD | 0.5 | Toxicity score threshold (0–1) |
| GUARDRAIL_DETECT_PII | true | Enable/disable PII guard |
| GUARDRAIL_DETECT_TOXICITY | true | Enable/disable toxicity guard |
| GUARDRAIL_DETECT_INJECTION | true | Enable/disable injection guard |
| GUARDRAIL_PII_ACTION | redact | Action on PII (redact, block, flag) |
| GUARDRAIL_TOXICITY_ACTION | flag | Action on toxicity (block, flag) |
| GUARDRAIL_INJECTION_ACTION | block | Action on injection (block, flag) |
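The override behaviour described above can be pictured with a small stdlib-only sketch. It is not GuardrAIl's loader: `load_overrides`, `effective_settings`, and the boolean parsing rules are hypothetical names and assumptions made for illustration, and the real library may rely on pydantic's own settings machinery instead.

```python
import os


def _parse_bool(raw: str) -> bool:
    # Assumption: these spellings count as "true"; the library may accept others.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


# Hypothetical mapping from env var suffix to a parser for its value.
PARSERS = {
    "POLICY": str,
    "PII_BACKEND": str,
    "TOXICITY_BACKEND": str,
    "INJECTION_THRESHOLD": float,
    "TOXICITY_THRESHOLD": float,
    "DETECT_PII": _parse_bool,
    "DETECT_TOXICITY": _parse_bool,
    "DETECT_INJECTION": _parse_bool,
    "PII_ACTION": str,
    "TOXICITY_ACTION": str,
    "INJECTION_ACTION": str,
}


def load_overrides(environ=None, prefix="GUARDRAIL_"):
    """Return {field_name: parsed_value} for each recognised variable that is set."""
    environ = os.environ if environ is None else environ
    overrides = {}
    for suffix, parse in PARSERS.items():
        raw = environ.get(prefix + suffix)
        if raw is not None:
            overrides[suffix.lower()] = parse(raw)
    return overrides


def effective_settings(preset_defaults, environ=None):
    """Merge so that environment values win over the preset's defaults."""
    return {**preset_defaults, **load_overrides(environ)}
```

Passing a plain dict as `environ` keeps the sketch easy to test without touching the real process environment.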
# Policy preset
GUARDRAIL_POLICY=default
# Backends (offline by default; switch for better coverage)
# GUARDRAIL_PII_BACKEND=presidio # requires pip install -e ".[presidio]"
# GUARDRAIL_TOXICITY_BACKEND=detoxify # requires pip install -e ".[toxicity]"
# Thresholds (lower = more aggressive)
# GUARDRAIL_INJECTION_THRESHOLD=0.5
# GUARDRAIL_TOXICITY_THRESHOLD=0.5
# Guard enable/disable
# GUARDRAIL_DETECT_PII=true
# GUARDRAIL_DETECT_TOXICITY=true
# GUARDRAIL_DETECT_INJECTION=true
# Actions
# GUARDRAIL_PII_ACTION=redact
# GUARDRAIL_TOXICITY_ACTION=flag
# GUARDRAIL_INJECTION_ACTION=block

| Backend | Install | Description |
|---|---|---|
| regex (default) | included | Fast regex patterns: email, phone, SSN, credit card, IP address |
| presidio | pip install -e ".[presidio]" | Microsoft Presidio NER: adds PERSON, LOCATION, ORG detection |
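To give a feel for what a regex-style backend does, here is a minimal sketch that redacts three of the entity types listed in the table. The patterns, the `redact` helper, and the `[LABEL]` placeholder format are illustrative assumptions, not the patterns or output format GuardrAIl ships with.

```python
import re

# Illustrative patterns only; NOT the ones bundled with GuardrAIl.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "IP_ADDRESS": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def redact(text: str) -> str:
    """Replace every match with a placeholder such as [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

Patterns like these are fast and need no model download, but they cannot recognise names, places, or organisations, which is the gap the presidio backend is described as filling.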
pip install -e ".[presidio]"
# .env:
# GUARDRAIL_PII_BACKEND=presidio

| Backend | Install | Description |
|---|---|---|
| lexical (default) | included | Word-list matching with lexical scoring, F1 = 0.94 offline |
| detoxify | pip install -e ".[toxicity]" | Detoxify transformer classifier (Unitary) |
pip install -e ".[toxicity]"
# .env:
# GUARDRAIL_TOXICITY_BACKEND=detoxify

injection_threshold controls sensitivity: lower values catch more injections but raise the false-positive rate. The default (0.5) is tuned for the bundled eval set.
The COPPA preset uses a lower threshold (~0.34) to prioritise recall over precision: a missed injection in a children's app is a more severe failure than a false block.
To tune for your deployment:
- Collect representative benign and adversarial inputs
- Run guardrail check-input "..." on each and observe risk_score
- Set GUARDRAIL_INJECTION_THRESHOLD just above the highest benign risk_score
- Verify with guardrail redteam that catch rate remains acceptable
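The arithmetic behind these steps can be sketched as follows, assuming the risk scores have already been collected by hand. `pick_threshold`, `catch_rate`, the 0.01 margin, and the sample scores are all hypothetical; the sketch also assumes an input is blocked when its score is greater than or equal to the threshold, which the source does not state.

```python
def pick_threshold(benign_scores, margin=0.01):
    """Threshold just above the highest benign risk score, capped at 1.0."""
    return min(1.0, max(benign_scores) + margin)


def catch_rate(adversarial_scores, threshold):
    """Fraction of adversarial inputs whose score reaches the threshold."""
    caught = sum(1 for score in adversarial_scores if score >= threshold)
    return caught / len(adversarial_scores)


# Made-up scores standing in for values read off `guardrail check-input`.
benign = [0.05, 0.12, 0.31, 0.22]
adversarial = [0.28, 0.45, 0.67, 0.91, 0.38]

threshold = pick_threshold(benign)          # just above 0.31
rate = catch_rate(adversarial, threshold)   # 4 of the 5 adversarial samples
print(f"GUARDRAIL_INJECTION_THRESHOLD={threshold:.2f} catch rate={rate:.0%}")
```

If the resulting catch rate is too low, the benign and adversarial score ranges overlap, and no single threshold will separate them cleanly; that is the point at which a false-positive budget has to be chosen explicitly.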
from fastapi import FastAPI
from guardrail import Guard
from guardrail.middleware import GuardrailMiddleware
app = FastAPI()
app.add_middleware(
GuardrailMiddleware,
guard=Guard.from_policy("coppa"),
# Fields in POST body that contain the user prompt:
input_fields=["prompt", "input", "message"], # default
)

Blocked requests receive:
{
"detail": "Input blocked by guardrail",
"violations": [{"guard": "injection", "severity": "high", "score": 0.5}]
}

guardrail check-input "text to screen" [--policy PRESET]
guardrail redteam [--policy PRESET]
guardrail check-output "text to screen" [--policy PRESET]
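On the client side, the blocked-request body returned by the middleware can be turned into a short log line. This is a minimal sketch that relies only on the field names visible in the sample response (detail, violations, guard, severity, score); `summarize_block` is a hypothetical helper, and no HTTP status code is assumed because the source does not give one.

```python
import json


def summarize_block(body: str) -> list[str]:
    """Return one human-readable line per violation in a blocked response body."""
    payload = json.loads(body)
    lines = []
    for violation in payload.get("violations", []):
        guard = violation.get("guard", "unknown")
        severity = violation.get("severity", "n/a")
        score = violation.get("score")
        lines.append(f"{guard} ({severity}, score={score})")
    return lines


sample = """
{
  "detail": "Input blocked by guardrail",
  "violations": [{"guard": "injection", "severity": "high", "score": 0.5}]
}
"""
print(summarize_block(sample))
```

Using `.get` with fallbacks keeps the helper tolerant of responses that omit a field, since the exact schema is only known from the single example above.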