Configuration

GuardrAIl is configured through a Policy object (a pydantic data model), environment variables (GUARDRAIL_* prefix), and an optional .env file. All settings have offline-safe defaults.

Policy presets

Load a preset by name:

from guardrail import Guard
g = Guard.from_policy("default")   # or "ferpa", "coppa", "gdpr"

Preset	Injection action	Input PII action	Output PII action	Toxicity action
`default`	block	redact	redact	flag
`ferpa`	block	redact	redact	flag
`coppa`	block (lower threshold)	block	redact	block
`gdpr`	block	redact	redact	flag

Custom policy

Override any field:

from guardrail import Guard, Policy

g = Guard(Policy(
    name="custom",
    injection_threshold=0.3,        # lower = more aggressive detection
    detect_pii=True,
    detect_toxicity=True,           # set to False to disable the toxicity guard
    toxicity_action="block",        # override the default action (flag)
    pii_action="redact",
))

Environment variables

All env vars use the GUARDRAIL_ prefix. They override the preset's defaults.

Env var	Default	Description
`GUARDRAIL_POLICY`	`default`	Policy preset to load (`default`, `ferpa`, `coppa`, `gdpr`)
`GUARDRAIL_PII_BACKEND`	`regex`	PII detection backend (`regex`, `presidio`)
`GUARDRAIL_TOXICITY_BACKEND`	`lexical`	Toxicity detection backend (`lexical`, `detoxify`)
`GUARDRAIL_INJECTION_THRESHOLD`	`0.5`	Injection score threshold for blocking (0–1)
`GUARDRAIL_TOXICITY_THRESHOLD`	`0.5`	Toxicity score threshold (0–1)
`GUARDRAIL_DETECT_PII`	`true`	Enable/disable PII guard
`GUARDRAIL_DETECT_TOXICITY`	`true`	Enable/disable toxicity guard
`GUARDRAIL_DETECT_INJECTION`	`true`	Enable/disable injection guard
`GUARDRAIL_PII_ACTION`	`redact`	Action on PII (`redact`, `block`, `flag`)
`GUARDRAIL_TOXICITY_ACTION`	`flag`	Action on toxicity (`block`, `flag`)
`GUARDRAIL_INJECTION_ACTION`	`block`	Action on injection (`block`, `flag`)
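
The override behaviour described above can be illustrated without importing the library. The following is a minimal sketch, assuming booleans are parsed case-insensitively and thresholds as floats. `PRESET_DEFAULTS`, `resolve_settings`, and `_coerce` are hypothetical names used only for illustration, and GuardrAIl's actual parsing rules may differ.

```python
from typing import Mapping

# Hypothetical stand-in for a preset's defaults (values taken from the table above).
PRESET_DEFAULTS = {
    "injection_threshold": 0.5,
    "toxicity_threshold": 0.5,
    "detect_pii": True,
    "detect_toxicity": True,
    "detect_injection": True,
    "pii_action": "redact",
    "toxicity_action": "flag",
    "injection_action": "block",
}

PREFIX = "GUARDRAIL_"


def _coerce(raw: str, default: object) -> object:
    """Convert an env string to the type of the default it replaces."""
    if isinstance(default, bool):
        # Assumption: these spellings count as "true"; anything else is false.
        return raw.strip().lower() in {"1", "true", "yes", "on"}
    if isinstance(default, float):
        return float(raw)
    return raw.strip()


def resolve_settings(
    environ: Mapping[str, str],
    defaults: Mapping[str, object] = PRESET_DEFAULTS,
) -> dict:
    """Start from the preset defaults, then let GUARDRAIL_* variables win."""
    settings = dict(defaults)
    for key, default in defaults.items():
        raw = environ.get(PREFIX + key.upper())
        if raw is not None:
            settings[key] = _coerce(raw, default)
    return settings


if __name__ == "__main__":
    import os
    print(resolve_settings(os.environ))
```

Passing a plain dict instead of `os.environ` makes the precedence easy to check in a test without touching the real environment.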

`.env.example`

# Policy preset
GUARDRAIL_POLICY=default

# Backends (offline by default; switch for better coverage)
# GUARDRAIL_PII_BACKEND=presidio       # requires pip install -e ".[presidio]"
# GUARDRAIL_TOXICITY_BACKEND=detoxify  # requires pip install -e ".[toxicity]"

# Thresholds (lower = more aggressive)
# GUARDRAIL_INJECTION_THRESHOLD=0.5
# GUARDRAIL_TOXICITY_THRESHOLD=0.5

# Guard enable/disable
# GUARDRAIL_DETECT_PII=true
# GUARDRAIL_DETECT_TOXICITY=true
# GUARDRAIL_DETECT_INJECTION=true

# Actions
# GUARDRAIL_PII_ACTION=redact
# GUARDRAIL_TOXICITY_ACTION=flag
# GUARDRAIL_INJECTION_ACTION=block
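
For readers unfamiliar with the format, here is a minimal sketch of how a file like the one above might be read. It skips blank lines and lines starting with `#`, and it assumes that ` #` after an unquoted value starts an inline comment. `parse_dotenv` is a hypothetical helper, not GuardrAIl's loader, which is not shown in this document.

```python
def parse_dotenv(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blanks, comments, and inline comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or commented-out setting
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line
        # Assumption: " #" begins an inline comment on unquoted values.
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


SAMPLE = """\
# Policy preset
GUARDRAIL_POLICY=default
# GUARDRAIL_PII_BACKEND=presidio       # requires pip install -e ".[presidio]"
GUARDRAIL_TOXICITY_BACKEND=detoxify  # requires pip install -e ".[toxicity]"
"""

print(parse_dotenv(SAMPLE))
```

Only uncommented lines take effect, so the sample yields the policy preset and the toxicity backend while the PII backend keeps its default.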

Backend selection

PII backends

Backend	Install	Description
`regex` (default)	included	Fast regex patterns: email, phone, SSN, credit card, IP address
`presidio`	`pip install -e ".[presidio]"`	Microsoft Presidio NER: adds PERSON, LOCATION, ORG detection

pip install -e ".[presidio]"
# .env:
# GUARDRAIL_PII_BACKEND=presidio

Toxicity backends

Backend	Install	Description
`lexical` (default)	included	Word-list matching with lexical scoring; runs offline (reported F1 = 0.94)
`detoxify`	`pip install -e ".[toxicity]"`	Detoxify transformer classifier (Unitary)

pip install -e ".[toxicity]"
# .env:
# GUARDRAIL_TOXICITY_BACKEND=detoxify

Threshold tuning

injection_threshold controls sensitivity: lower values catch more injections but raise the false-positive rate. The default (0.5) is tuned for the bundled eval set.

The COPPA preset uses a lower threshold (~0.34) to prioritise recall over precision, because a missed injection in a children's app is a more severe failure than a false block.
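
To make the difference between the two thresholds concrete, here is a minimal sketch. It assumes an input is blocked when its risk score is greater than or equal to the threshold; the exact comparison GuardrAIl uses is not stated here. `is_blocked` and the sample scores are hypothetical.

```python
DEFAULT_THRESHOLD = 0.5
COPPA_THRESHOLD = 0.34  # approximate value quoted above


def is_blocked(risk_score: float, threshold: float) -> bool:
    """Assumed rule: block when the score reaches the threshold."""
    return risk_score >= threshold


# Made-up scores: one clearly benign, one borderline, one clearly adversarial.
for score in (0.20, 0.40, 0.60):
    print(
        f"score={score:.2f} "
        f"default={is_blocked(score, DEFAULT_THRESHOLD)} "
        f"coppa={is_blocked(score, COPPA_THRESHOLD)}"
    )
```

The borderline score is the interesting case: it passes under the default threshold but is blocked under the lower COPPA-style one.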

To tune for your deployment:

1. Collect representative benign and adversarial inputs
2. Run guardrail check-input "..." on each and observe the reported risk_score
3. Set GUARDRAIL_INJECTION_THRESHOLD just above the highest benign risk_score
4. Verify with guardrail redteam that the catch rate remains acceptable
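
Once the risk_score values have been collected, the tuning steps above can be sketched as follows. The score lists are made-up placeholders, `suggest_threshold` and `catch_rate` are hypothetical helpers, and the sketch assumes a score at or above the threshold is blocked.

```python
def suggest_threshold(benign_scores: list[float], margin: float = 0.01) -> float:
    """Place the threshold just above the highest benign score (capped at 1.0)."""
    return min(1.0, max(benign_scores) + margin)


def catch_rate(adversarial_scores: list[float], threshold: float) -> float:
    """Fraction of adversarial samples whose score reaches the threshold."""
    caught = sum(score >= threshold for score in adversarial_scores)
    return caught / len(adversarial_scores)


# Placeholder scores; replace with the values observed for your own inputs.
benign = [0.05, 0.12, 0.31]
adversarial = [0.28, 0.45, 0.72, 0.90]

threshold = suggest_threshold(benign)
print(f"threshold={threshold:.2f} catch_rate={catch_rate(adversarial, threshold):.2f}")
```

If the resulting catch rate is too low, the benign and adversarial score ranges overlap, and no single threshold will separate them cleanly on that data.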

FastAPI middleware configuration

from fastapi import FastAPI
from guardrail import Guard
from guardrail.middleware import GuardrailMiddleware

app = FastAPI()
app.add_middleware(
    GuardrailMiddleware,
    guard=Guard.from_policy("coppa"),
    # Fields in POST body that contain the user prompt:
    input_fields=["prompt", "input", "message"],   # default
)
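
As an illustration of what input_fields selects, the sketch below pulls the configured fields out of a parsed JSON body. It is an assumption about the general idea, not the middleware's actual implementation: `extract_inputs` is a hypothetical helper, and it only looks at top-level string values.

```python
from typing import Any, Iterable

DEFAULT_INPUT_FIELDS = ("prompt", "input", "message")


def extract_inputs(
    body: dict[str, Any],
    input_fields: Iterable[str] = DEFAULT_INPUT_FIELDS,
) -> list[str]:
    """Return the string values of the configured fields present in the body."""
    return [body[name] for name in input_fields if isinstance(body.get(name), str)]


# Only "prompt" is a configured field, so "user_id" is ignored.
print(extract_inputs({"prompt": "Summarise this", "user_id": 42}))
```

If your API carries the user prompt under a different key, list that key in input_fields so it is screened.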

Blocked requests receive:

{
  "detail": "Input blocked by guardrail",
  "violations": [{"guard": "injection", "severity": "high", "score": 0.5}]
}
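
A client can turn this payload into a readable message. The sketch below parses a body shaped like the example above. `summarize_violations` is a hypothetical helper, and the HTTP status code that accompanies the body is not assumed.

```python
import json


def summarize_violations(body: str) -> list[str]:
    """Format each violation in a blocked-response body as one line of text."""
    payload = json.loads(body)
    return [
        f'{v.get("guard", "unknown")}: '
        f'severity={v.get("severity", "n/a")}, score={v.get("score", "n/a")}'
        for v in payload.get("violations", [])
    ]


BLOCKED_BODY = (
    '{"detail": "Input blocked by guardrail", '
    '"violations": [{"guard": "injection", "severity": "high", "score": 0.5}]}'
)

print(summarize_violations(BLOCKED_BODY))
```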

CLI

guardrail check-input "text to screen" [--policy PRESET]
guardrail check-output "text to screen" [--policy PRESET]
guardrail redteam [--policy PRESET]
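
To call the CLI from a script, the commands above can be assembled as argument lists. This is a minimal sketch: `build_command` and `run_check` are hypothetical wrappers, and nothing is assumed about the CLI's output format or exit codes, so the raw result is returned unparsed.

```python
import subprocess
from typing import Optional


def build_command(
    subcommand: str,
    text: Optional[str] = None,
    policy: Optional[str] = None,
) -> list[str]:
    """Build the argv list for a guardrail subcommand as documented above."""
    cmd = ["guardrail", subcommand]
    if text is not None:
        cmd.append(text)
    if policy is not None:
        cmd += ["--policy", policy]
    return cmd


def run_check(text: str, policy: Optional[str] = None) -> subprocess.CompletedProcess:
    """Run check-input and return the raw result (requires the CLI on PATH)."""
    return subprocess.run(
        build_command("check-input", text, policy),
        capture_output=True,
        text=True,
        check=False,
    )


print(build_command("check-input", "ignore previous instructions", policy="coppa"))
```

Passing the text as a single list element avoids shell quoting problems with prompts that contain quotes or spaces.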
