Open framework for reliability, observability, and governance of AI agents in real-world systems.
ARF is built to make AI systems more reliable by combining monitoring, policy enforcement, memory, and automated healing.
- Observes agent behavior and system health
- Detects anomalies and failure patterns
- Applies governance and policy controls
- Supports adaptive healing and recovery
- Stores incidents, outcomes, and learnings for future decisions
- Runtime — execution, monitoring, and orchestration
- Memory — incident history, retrieval, and learning
- Governance — policy engine and decision logic
- Healing — recovery actions and self-correction
- agentic-reliability-framework — core framework
- arf-api — control plane and services
- arf-spec — architecture and protocol specification
- arf-frontend — interface and observability layer
AI systems should not fail silently. They should be observable, governable, and able to recover.