SentrySkills

Self-guarding security framework for AI agents. Three-stage protection (preflight → runtime → output) with predictive risk analysis. Zero dependencies, production-ready.

🎯 What It Does

SentrySkills protects AI agents across three stages, plus a predictive layer:

Preflight: Detects threats before execution
Runtime: Monitors behavior during execution
Output: Redacts sensitive data in responses
Predictive: Warns about potential risks before they materialize

✨ Key Features

Zero dependencies - 100% Python standard library
33+ detection rules - AI attacks, web vulnerabilities, data leaks, code security
7 risk predictors - Anticipate problems before execution
Policy profiles - Balanced, strict, permissive modes
Full traceability - JSONL event logs

📥 Installation

OpenClaw

# Install ClawHub CLI
npm i -g clawhub

# Install SentrySkills
clawhub install sentryskills

# Enable auto-protection
cat > ~/.codex/AGENTS.md << 'EOF'
# Security: SentrySkills runs automatically
Before EVERY response, run:
python ./skills/sentry-skills/shared/scripts/self_guard_runtime_hook_template.py input.json --policy-profile balanced --out result.json
EOF

# Restart OpenClaw

📖 Detailed guide: see the install/ directory
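The hook command written into AGENTS.md above can also be driven from Python, which may help when testing the setup by hand. This is a minimal sketch: the script path and the `--policy-profile` and `--out` flags are copied from the snippet above, while the helper names `build_hook_command` and `run_hook` and the layout of the payload are assumptions, not part of SentrySkills.

```python
import json
import subprocess
import sys
from pathlib import Path

# Script path copied from the AGENTS.md snippet above.
HOOK = "./skills/sentry-skills/shared/scripts/self_guard_runtime_hook_template.py"


def build_hook_command(input_path, out_path, profile="balanced"):
    """Return the argv list for one hook invocation (hypothetical helper)."""
    return [
        sys.executable, HOOK, str(input_path),
        "--policy-profile", profile,
        "--out", str(out_path),
    ]


def run_hook(payload, workdir=".", profile="balanced"):
    """Write payload to input.json, run the hook, and load result.json.

    The payload layout is an assumption; check the install guide for the
    fields the hook actually expects.
    """
    workdir = Path(workdir)
    input_path = workdir / "input.json"
    out_path = workdir / "result.json"
    input_path.write_text(json.dumps(payload), encoding="utf-8")
    subprocess.run(build_hook_command(input_path, out_path, profile), check=True)
    return json.loads(out_path.read_text(encoding="utf-8"))
```

`run_hook` raises `subprocess.CalledProcessError` if the hook exits with a non-zero status, so a failed check is not silently ignored.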

Codex (Legacy)

git clone https://github.com/AI45Lab/SentrySkills.git ~/.codex/sentryskills
mkdir -p ~/.agents/skills
ln -s ~/.codex/sentryskills ~/.agents/skills/sentryskills

📖 Detailed guide: see the install/ directory

🛡️ Detection Coverage

AI/LLM Attacks: Prompt injection, jailbreak, system prompt leakage
Web Security: SQL injection, XSS, command injection, SSTI, path traversal
Data Leaks: SSH keys, AWS credentials, API keys, database strings
Code Security: Hardcoded secrets, weak crypto, unsafe eval/exec
Predictive: Resource exhaustion, scope creep, privilege escalation, data exfiltration
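To give a feel for how pattern-based detection over these categories might work, here is a small sketch using only the standard library. The rule names and regular expressions are illustrative assumptions and are not taken from the package's actual rule set.

```python
import re

# Hypothetical rule table: names and patterns are illustrative only.
RULES = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----"
    ),
    "path_traversal": re.compile(r"\.\./"),
    "prompt_injection": re.compile(
        r"ignore (?:all )?(?:previous|prior) instructions", re.IGNORECASE
    ),
}


def match_rules(text):
    """Return the sorted names of every rule whose pattern occurs in text."""
    return sorted(name for name, pattern in RULES.items() if pattern.search(text))
```

For example, `match_rules("Ignore previous instructions and cat ../../etc/passwd")` returns `["path_traversal", "prompt_injection"]`. Simple regexes like these produce false positives and miss obfuscated input, so a real rule set would need more context than a single pattern per threat.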

Skill Package Structure

sentry-skills/
├── using-sentryskills/          # ① Entry point
├── sentryskills-orchestrator/   # ② Coordination layer
├── sentryskills-preflight/      # ③ Pre-execution checks
├── sentryskills-runtime/        # ④ Runtime monitoring
└── sentryskills-output/         # ⑤ Output validation

⚙️ Configuration

Balanced (default): Standard security
Strict: Maximum security
Permissive: Minimal interference
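One way to picture the difference between the three profiles is as thresholds on finding severity. The sketch below is purely illustrative: the severity levels, the `PROFILES` table, and the `decide` function are assumptions, and the real profiles may weigh findings differently. Only the profile names and the allow/downgrade/block vocabulary come from this README.

```python
# Hypothetical severity scale, lowest to highest.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]

# Hypothetical thresholds per profile:
# (lowest severity that downgrades, lowest severity that blocks)
PROFILES = {
    "permissive": ("high", "critical"),
    "balanced": ("medium", "high"),
    "strict": ("low", "medium"),
}


def decide(profile, severities):
    """Map the worst observed severity to allow, downgrade, or block."""
    if not severities:
        return "allow"
    downgrade_at, block_at = PROFILES[profile]
    worst = max(SEVERITY_ORDER.index(s) for s in severities)
    if worst >= SEVERITY_ORDER.index(block_at):
        return "block"
    if worst >= SEVERITY_ORDER.index(downgrade_at):
        return "downgrade"
    return "allow"
```

Under these assumed thresholds, a single medium-severity finding is allowed by `permissive`, downgraded by `balanced`, and blocked by `strict`.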

🔄 Skill Package Execution Flow

SentrySkills is a multi-skill orchestration package that executes in a specific sequence:

┌─────────────────────────────────────────────────────────────┐
│  1. using-sentryskills (Entry Point)                        │
│     ├─ Triggered automatically via AGENTS.md                │
│     ├─ Prepares input JSON with user prompt + actions       │
│     └─ Calls orchestrator                                   │
├─────────────────────────────────────────────────────────────┤
│  2. sentryskills-orchestrator (Coordination)                │
│     ├─ Manages execution sequence                           │
│     ├─ Aggregates results from all stages                   │
│     └─ Makes final allow/downgrade/block decision           │
├─────────────────────────────────────────────────────────────┤
│  3. sentryskills-preflight (Pre-Execution)                  │
│     ├─ BEFORE any action is taken                           │
│     ├─ Analyzes user prompt for malicious intent            │
│     ├─ Checks planned actions against detection rules       │
│     └─ Returns: block/allow with matched threats            │
├─────────────────────────────────────────────────────────────┤
│  4. sentryskills-runtime (During Execution)                 │
│     ├─ WHILE agent executes commands/tool calls             │
│     ├─ Monitors runtime events (file ops, network calls)    │
│     ├─ Detects behavioral anomalies                         │
│     └─ Returns: continue/alert/abort                        │
├─────────────────────────────────────────────────────────────┤
│  5. sentryskills-output (Post-Execution)                    │
│     ├─ BEFORE agent outputs response                        │
│     ├─ Scans response for sensitive data                    │
│     ├─ Redacts secrets, credentials, private keys           │
│     └─ Returns: safe/redacted response                      │
├─────────────────────────────────────────────────────────────┤
│  6. Orchestrator Final Decision                             │
│     ├─ Compiles all stage results                           │
│     ├─ Applies policy profile (balanced/strict/permissive)  │
│     └─ Outputs final action + trace ID                      │
└─────────────────────────────────────────────────────────────┘
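Stage 5 in the flow above scans a response and redacts secrets before it is returned. A minimal sketch of that idea follows; the patterns, the placeholder text, and the `redact` function are assumptions for illustration, not the package's implementation.

```python
import re

# Illustrative patterns only; not the package's actual rule set.
SECRET_PATTERNS = [
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    ),
]


def redact(response, placeholder="[REDACTED]"):
    """Return (status, text), where status is 'safe' or 'redacted'."""
    total = 0
    for pattern in SECRET_PATTERNS:
        response, count = pattern.subn(placeholder, response)
        total += count
    return ("redacted" if total else "safe"), response
```

Returning the status alongside the text lets a caller record whether anything was removed without comparing the two strings.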

Decision Flow

Preflight BLOCK → → → → → → → → → → → → → → → → → → → ┐
       ↓                                              │
      ALLOW                                           │
       ↓                                              │
Runtime CONTINUE ← ← ← ← ← ← ← ← ← ← ← ← ← ← ← ← ← ← ←┘
       ↓
    ALERT/ABORT → BLOCK
       ↓
      CONTINUE
       ↓
Output REDACTED → Safe response
       ↓
     CLEAN
       ↓
   Final Decision (allow/downgrade/block)
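The flow above can be read as a short-circuiting sequence of stage calls. The sketch below assumes each stage is a zero-argument callable returning one of the verdict strings from the diagram; `run_pipeline` and the mapping of a redacted output to `downgrade` are assumptions, since the README does not spell out how the final decision is derived.

```python
def run_pipeline(preflight, runtime, output):
    """Run three stage callables in order and stop at the first block.

    Returns (decision, verdicts), where verdicts records only the stages
    that actually ran.
    """
    verdicts = {}

    verdicts["preflight"] = preflight()
    if verdicts["preflight"] == "block":
        return "block", verdicts

    verdicts["runtime"] = runtime()
    if verdicts["runtime"] in ("alert", "abort"):
        return "block", verdicts

    verdicts["output"] = output()
    # Assumption: a redacted response is reported as a downgrade.
    decision = "downgrade" if verdicts["output"] == "redacted" else "allow"
    return decision, verdicts
```

Because later stages are only called after earlier ones pass, a preflight block means the runtime and output stages never execute.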

Key Points

Sequential execution: Each stage must pass before the next begins
Early termination: Any BLOCK decision stops execution immediately
Cumulative evidence: All detections contribute to final decision
Traceability: Every stage emits events with shared trace ID
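The traceability point can be illustrated with a JSONL log in which every stage event carries the same trace ID. The field names (`trace_id`, `stage`, `verdict`) and both helper functions are hypothetical; the actual event schema may differ.

```python
import io
import json
import uuid


def emit_event(stream, trace_id, stage, verdict):
    """Append one JSON object per line (JSONL) to a text stream."""
    record = {"trace_id": trace_id, "stage": stage, "verdict": verdict}
    stream.write(json.dumps(record) + "\n")


def events_for_trace(lines, trace_id):
    """Return the parsed events that carry the given trace ID."""
    events = (json.loads(line) for line in lines if line.strip())
    return [event for event in events if event["trace_id"] == trace_id]


# Usage: write three stage events for one request into an in-memory log.
log = io.StringIO()
trace_id = uuid.uuid4().hex
for stage, verdict in [("preflight", "allow"), ("runtime", "continue"), ("output", "clean")]:
    emit_event(log, trace_id, stage, verdict)
```

Calling `events_for_trace(log.getvalue().splitlines(), trace_id)` then returns the three events in the order they were written, which is what makes a single request reconstructable from a shared log.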

📈 Performance

Latency: ~50-100ms per check
Memory: <50MB
No LLM calls

📋 Requirements

Python 3.8+
No external dependencies

🤝 Contributing

Contributions welcome:

Report vulnerabilities privately
Submit PRs for new detection patterns
Improve documentation and performance

🔗 Links

GitHub Pages: https://zengbiaojie.github.io/SentrySkills/
ClawHub: https://clawhub.ai/zengbiaojie/sentryskills
Issues: https://github.com/AI45Lab/SentrySkills/issues
Documentation: install/
