FastAPI incident console for reviewing AI failures with timeline evidence, replay previews, and mitigation planning.
Most AI demos stop at generation. Real systems need incident review after things go wrong:
- prompt injection attempts
- hallucinated SQL or tool calls
- policy bypass attempts
- grounding failures
- unsafe replay loops
This project models the review layer that comes after an incident has already happened. It gives teams a place to inspect evidence, understand the runtime timeline, and plan a safe replay:
- incident queue with severity, impact, and service ownership
- timeline events across retrieval, reasoning, policy, and response stages
- evidence records for prompt fragments, tool requests, and invalid SQL drafts
- replay preview modes for safe re-runs
- mitigation tasks with owner and priority
- static dashboard for portfolio and demo screenshots
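The incident and timeline objects above can be modeled roughly as below. This is an illustrative sketch only; the field names and dataclass shapes are assumptions, not the project's actual schema:

```python
from dataclasses import dataclass, field


@dataclass
class TimelineEvent:
    # Stage is one of the runtime phases named above:
    # retrieval, reasoning, policy, or response.
    stage: str
    summary: str


@dataclass
class Incident:
    # Hypothetical fields mirroring the queue columns:
    # severity, impact, and service ownership.
    incident_id: str
    title: str
    severity: str
    impact: str
    owning_service: str
    timeline: list[TimelineEvent] = field(default_factory=list)


# Example record in the shape of a seeded incident (illustrative values).
demo = Incident(
    incident_id="inc_1001",
    title="prompt injection attempt",
    severity="high",
    impact="policy bypass risk",
    owning_service="agent-runtime-control-tower",
    timeline=[TimelineEvent("retrieval", "untrusted document entered context")],
)
```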
Endpoints:

```
GET /health
GET /teams
GET /playbooks
GET /incidents
GET /incidents/{incident_id}
GET /incidents/{incident_id}/timeline
GET /incidents/{incident_id}/evidence
GET /incidents/{incident_id}/mitigations
GET /incidents/{incident_id}/replay-preview
```
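Behind a route like `GET /incidents/{incident_id}`, the handler logic reduces to a lookup over the seeded store. The sketch below is a minimal stand-in with an assumed in-memory dict; the real service's store and error handling may differ:

```python
# Hypothetical in-memory store standing in for the seeded incident data.
INCIDENTS = {
    "inc_1001": {"title": "prompt injection attempt", "severity": "high"},
    "inc_1002": {"title": "hallucinated SQL draft", "severity": "medium"},
}


def get_incident(incident_id: str) -> dict:
    """Return the incident record, or a 404-style payload when it is unknown.

    In a FastAPI route this branch would instead
    `raise HTTPException(status_code=404, ...)`.
    """
    incident = INCIDENTS.get(incident_id)
    if incident is None:
        return {"error": "incident not found", "status": 404}
    return {"incident_id": incident_id, **incident}
```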
Run locally:

```
python -m pip install -e .
python -m uvicorn llm_incident_review_console.main:app --reload
```

Open:

- http://127.0.0.1:8000/dashboard
- http://127.0.0.1:8000/incidents
Seeded incidents:

- `inc_1001`: prompt injection attempt in `agent-runtime-control-tower`
- `inc_1002`: hallucinated SQL draft in `danex-rag-service`
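A replay preview amounts to planning a re-run without executing anything live. The sketch below shows one way that could look; the mode names (`dry_run`, `redacted`) and redaction rule are assumptions for illustration, not the project's actual behavior:

```python
def replay_preview(incident: dict, mode: str = "dry_run") -> dict:
    """Build a preview of a replay without calling any live tools.

    Assumed modes: 'dry_run' (no side effects) and
    'redacted' (evidence masked before display).
    """
    if mode not in {"dry_run", "redacted"}:
        raise ValueError(f"unknown replay mode: {mode}")
    evidence = incident.get("evidence", [])
    if mode == "redacted":
        evidence = ["[redacted]" for _ in evidence]
    return {
        "incident_id": incident["incident_id"],
        "mode": mode,
        "would_execute_tools": False,  # previews never hit live tools
        "evidence": evidence,
    }
```

The point of the preview step is that a reviewer sees exactly what a replay would do before anything is re-run.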
What you can inspect immediately:

- dashboard proof: `output/playwright/screen-01-dashboard.png`
- health proof: `output/playwright/screen-02-health-proof.png`
- operations proof: `output/playwright/screen-03-ops-proof.png`
- product framing: `output/playwright/screen-04-product-proof.png`
- architecture notes: `docs/ARCHITECTURE.md`
- case study: `docs/CASE_STUDY.md`
What this demonstrates:

- the backend can model post-incident AI forensics, not only request-time inference
- seeded evidence is grouped into timeline, mitigation, and replay surfaces that a reviewer can inspect quickly
- the project has a live dashboard, Docker packaging, CI, and screenshot-ready artifacts instead of README-only claims
This is not another chatbot demo. It is incident forensics for AI systems: the layer between "something went wrong" and "we understand why it went wrong and how to replay safely."