Architecture

Goal

Provide a control layer for AI agents in production environments.

The system focuses on concerns that sit above the model call. A run request moves through the following steps:

A tenant-scoped user submits a run request.
The system resolves agent, runtime policy, MCP server, and requestor scope.
The policy engine evaluates budget, tool call count, MCP risk tier, and approval rules.
The run is approved, blocked, or routed to approval.
A tool trace is stored for observability.
Runtime state is written to Redis or an in-memory fallback for fast operator access.
Blocked runs create incident records.
Existing runs can be replayed to validate a stricter policy or server choice.
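
The evaluation and replay steps above can be sketched as a pure function over a run and a policy. This is a minimal illustration, not the project's actual policy engine: the names RunRequest, RuntimePolicy, Decision, and evaluate, along with every field and threshold, are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(str, Enum):
    APPROVED = "approved"
    BLOCKED = "blocked"
    NEEDS_APPROVAL = "needs_approval"


@dataclass(frozen=True)
class RuntimePolicy:
    # Hypothetical policy fields; the real schema may differ.
    max_budget_usd: float
    max_tool_calls: int
    max_risk_tier: int        # highest MCP risk tier allowed at all
    approval_risk_tier: int   # tiers at or above this need human approval


@dataclass(frozen=True)
class RunRequest:
    # Hypothetical request fields, resolved before evaluation.
    tenant_id: str
    agent_id: str
    estimated_cost_usd: float
    tool_call_count: int
    mcp_risk_tier: int


def evaluate(run: RunRequest, policy: RuntimePolicy) -> tuple[Decision, list[str]]:
    """Return a decision plus the reasons that led to it."""
    reasons = []
    if run.estimated_cost_usd > policy.max_budget_usd:
        reasons.append("budget exceeded")
    if run.tool_call_count > policy.max_tool_calls:
        reasons.append("tool call limit exceeded")
    if run.mcp_risk_tier > policy.max_risk_tier:
        reasons.append("MCP risk tier not allowed")
    if reasons:
        # Any hard limit violation blocks the run.
        return Decision.BLOCKED, reasons
    if run.mcp_risk_tier >= policy.approval_risk_tier:
        return Decision.NEEDS_APPROVAL, ["risk tier requires approval"]
    return Decision.APPROVED, []


# Replay: evaluate the same stored run against a stricter policy.
run = RunRequest("tenant-a", "agent-1", estimated_cost_usd=2.50,
                 tool_call_count=4, mcp_risk_tier=2)
default_policy = RuntimePolicy(max_budget_usd=5.0, max_tool_calls=10,
                               max_risk_tier=3, approval_risk_tier=3)
strict_policy = RuntimePolicy(max_budget_usd=1.0, max_tool_calls=10,
                              max_risk_tier=3, approval_risk_tier=2)

original_decision, _ = evaluate(run, default_policy)
replayed_decision, replay_reasons = evaluate(run, strict_policy)
```

Keeping the evaluation free of side effects is what makes replay cheap in this sketch: the stored run is passed through the same function with a different policy, and a blocked result can then be turned into an incident record by the caller.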

DATABASE_URL-based repository with SQLAlchemy, allowing local SQLite and production PostgreSQL setups.
Runtime state store abstraction with Redis support and an in-memory fallback.
Replay endpoint for validating how a prior run would behave under a different policy path.
Health reporting that exposes both the database backend and the runtime state backend.
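
One way the state store abstraction and its fallback could look is sketched below. The class names, the run: key prefix, and the build_state_store and health helpers are assumptions for illustration, and the Redis path assumes the redis-py client is installed. If Redis is not configured or not reachable, the sketch falls back to an in-process dictionary.

```python
import json
from typing import Optional, Protocol


class StateStore(Protocol):
    backend: str

    def put(self, run_id: str, state: dict) -> None: ...
    def get(self, run_id: str) -> Optional[dict]: ...


class MemoryStateStore:
    """In-process fallback; state is lost when the process exits."""
    backend = "memory"

    def __init__(self) -> None:
        self._data: dict = {}

    def put(self, run_id: str, state: dict) -> None:
        self._data[run_id] = json.dumps(state)

    def get(self, run_id: str) -> Optional[dict]:
        raw = self._data.get(run_id)
        return json.loads(raw) if raw is not None else None


class RedisStateStore:
    """Stores each run's state as a JSON string under a hypothetical 'run:' prefix."""
    backend = "redis"

    def __init__(self, client) -> None:
        self._client = client

    def put(self, run_id: str, state: dict) -> None:
        self._client.set(f"run:{run_id}", json.dumps(state))

    def get(self, run_id: str) -> Optional[dict]:
        raw = self._client.get(f"run:{run_id}")
        return json.loads(raw) if raw is not None else None


def build_state_store(redis_url: Optional[str]) -> StateStore:
    """Prefer Redis when a URL is given and reachable, else use memory."""
    if not redis_url:
        return MemoryStateStore()
    try:
        import redis  # redis-py; imported lazily so the fallback works without it

        client = redis.Redis.from_url(redis_url)
        client.ping()  # fail fast if the server is unreachable
        return RedisStateStore(client)
    except Exception:
        return MemoryStateStore()


def health(store: StateStore) -> dict:
    """Report which state backend is active (the database check is omitted here)."""
    return {"state_backend": store.backend}


store = build_state_store(None)
store.put("run-1", {"status": "blocked"})
```

Exposing the active backend name in the health payload lets an operator see when the service has silently dropped to the memory fallback, which matters because that state does not survive a restart and is not shared between processes.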