Human-in-the-loop execution for LLM agents
Updated Jan 11, 2026 - Python
🛡️ Safe AI Agents through Action Classifier
Guardrails for LLMs: detect and block hallucinated tool calls to improve safety and reliability.
Open Threat Classification (OTC) — 10 threat patterns for AI agent skills, MCP servers, and plugins. CC-BY-4.0.
Runtime network egress control for Python. One function call to restrict which hosts your code can connect to.
🛡️ Open-source safety guardrail for AI agent tool calls. <2ms, zero dependencies.
The missing safety layer for AI Agents. Adaptive High-Friction Guardrails (Time-locks, Biometrics) for critical operations to prevent catastrophic errors.
Runtime detector for reward hacking and misalignment in LLM agents (89.7% F1 on 5,391 trajectories).
A runtime authorization layer for LLM tool calls: policy, approval, audit logs.
ETHICS.md — A statement of ethical principles for AI agents. Drop it in your repo root.
Safety-first agentic toolkit: 10 packages for collapse detection, governance, and reproducible runs.
🛡️ A curated list of tools, frameworks, standards, and resources for AI agent governance, safety, and compliance
A2A version of Agent Action Guard: Safe AI Agents through Action Classifier
An open-source engineering blueprint for defining and designing the core capabilities, boundaries, and ethics of any AI agent.
MCP server for intent security pre-flight checks for autonomous AI agents
Energy-based legality-gating SDK for AI reasoning. Predicts, repairs, and audits collapse before it happens; reduces hallucinations and provides numeric audit logs.
Runtime-agnostic hook harness that catches unverifiable prompts, enforces failure-mode templates, and gates task completion on passing tests.
Canonical texts and implementation primitives for the Safe Superintelligence Framework (v1.2.1): Constitution, Minimum Rescue Protocol, system prompt, decision matrix.
MCP server for reversibility intelligence — check if actions can be undone
MCP server for situational awareness — holidays, business hours, platform status