-
Notifications
You must be signed in to change notification settings - Fork 0
Development
- Python 3.10, 3.11, or 3.12
- Git
- (Optional) Docker for containerised gate runs
git clone https://github.com/ranafaraz/GuardrAIl.git
cd GuardrAIl
# Create and activate a virtual environment
python -m venv .venv
# Windows:
.venv\Scripts\activate
# macOS/Linux:
source .venv/bin/activate
# Install offline/dev stack (what CI uses — no downloads, no API keys)
pip install -e ".[dev]"
# Optional extras
pip install -e ".[api]" # FastAPI middleware
pip install -e ".[presidio]" # Microsoft Presidio PII NER
pip install -e ".[toxicity]" # Detoxify transformer toxicity# Full offline test suite (45 tests, no downloads)
pytest -q
# One file
pytest tests/test_guard.py -q
# Lint
ruff check .
ruff check . --fix # auto-fix import ordering etc.
# Regenerate eval report
python -m guardrail.evals.harness
# CI quality gate (exits 1 on metric regression)
python -m guardrail.evals.gate
# Red-team battery
guardrail redteam --policy default
guardrail redteam --policy coppa
# Docker gate run
docker build -t guardrail . && docker run --rm guardrailguardrail/
config.py Policy + presets (FERPA/COPPA/GDPR) + env Settings
guard.py Guard orchestrator (check_input, check_output)
types.py Violation / GuardResult / Action vocabulary
text_utils.py Shared tokenization and lexical scoring
input_guards/
injection.py Prompt-injection / jailbreak detection (rules)
pii.py PII redaction (regex | presidio)
output_guards/
pii_leak.py Output PII-leak detection and redaction
toxicity.py Toxicity detection (lexical | detoxify)
schema.py JSON-schema validation
middleware/
fastapi.py ASGI middleware + FastAPI dependency
evals/
data/ Labelled sets (injection, pii, toxicity, refusal)
metrics.py Precision / recall / F1 / accuracy computation
redteam.py 15-case adversarial battery
harness.py Runs all evals, writes RESULTS.md
gate.py CI quality gate (metric floors)
docs/
ARCHITECTURE.md Full architecture and pipeline documentation
DECISIONS.md Design decision log
demo.gif Terminal capture placeholder
examples/
fastapi_app.py Runnable FastAPI example with middleware
tests/ pytest suite (45 offline tests)
All guards are pure functions or simple classes returning values from guardrail/types.py. A guard has one job: detect and return a Violation | None (or (text, [Violation]) for redaction guards). It does not decide what to do — that is the Policy's job.
Example: adding a new input guard
- Create
guardrail/input_guards/my_guard.py:
from guardrail.types import Violation, Action
def detect_my_threat(text: str, threshold: float = 0.5) -> Violation | None:
score = _compute_score(text) # your detection logic
if score >= threshold:
return Violation(
guard="my_guard",
category="my_category",
severity="high",
score=score,
action=Action.block,
)
return None- Wire it into
Guard.check_input()inguardrail/guard.py:
from guardrail.input_guards.my_guard import detect_my_threat
def check_input(self, text: str) -> GuardResult:
# existing injection check ...
my_result = detect_my_threat(text, self.policy.my_threshold)
if my_result and self.policy.my_action == "block":
return GuardResult(blocked=True, violations=[my_result])
# ...- Add the threshold and action to
Policyinguardrail/config.py:
class Policy(BaseModel):
my_threshold: float = 0.5
my_action: Action = Action.block-
Add a
GUARDRAIL_MY_THRESHOLDenv var toSettingsand document it in the Configuration wiki page. -
Add labelled test cases to
guardrail/evals/data/and update the harness to call your new metric. -
Run
pytest -qandpython -m guardrail.evals.gate— both must pass.
Presets live in guardrail/config.py as a dict of Policy instances:
POLICY_PRESETS: dict[str, Policy] = {
"default": Policy(name="default", ...),
"ferpa": Policy(name="ferpa", ...),
# Add your preset here:
"hipaa": Policy(
name="hipaa",
detect_pii=True,
pii_action=Action.redact,
injection_threshold=0.4,
detect_toxicity=False,
),
}Then it is available as Guard.from_policy("hipaa").
GitHub Actions runs on every push and pull request:
-
Lint —
ruff check . -
Tests —
pytest -qon Python 3.10, 3.11, 3.12 with offline backends -
Eval gate —
python -m guardrail.evals.gatewith offline backends
All three steps must pass. The eval gate catches safety-metric regressions before merge.
Any import of fastapi, presidio_analyzer, or detoxify must be inside a function body (not at module level), so import guardrail works on the base install without those packages:
# DO NOT do this at module level:
# from presidio_analyzer import AnalyzerEngine # breaks base install
def _get_presidio():
try:
from presidio_analyzer import AnalyzerEngine
return AnalyzerEngine()
except ImportError:
return None # fall back to regex- The venv interpreter is at
.venv\Scripts\python.exe - Windows console is cp1252 — do not
print()non-ASCII characters from scripts; write UTF-8 to files or setPYTHONIOENCODING=utf-8