Deployment Checklist

Pre-production checklist for VERONICA control-plane deployments. Complete all items before routing live traffic.

1. Infrastructure

Docker and Docker Compose installed on the target host
Ports 8000 (API), 5432 (PostgreSQL), 9090 (Prometheus), 3000 (Grafana), 9464 (metrics) are available
docker compose up -d (run from deploy/) completes without errors
GET /health returns {"status": "ok"} with the expected version
PostgreSQL data directory is on a persistent volume (not the container ephemeral layer)
Redis deployed if distributed budget enforcement across processes is required (VERONICA_REDIS_URL set and reachable from the API container)
Host has sufficient memory for PostgreSQL + Prometheus retention (minimum 2 GB recommended)
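
The port and health items above can be checked with a short script. The following is a minimal sketch, not part of VERONICA itself: the helper names are hypothetical, the API is assumed to listen on http://127.0.0.1:8000, and the version is assumed to appear under a "version" key in the /health response.

```python
import json
import socket
from urllib.request import urlopen


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing accepts connections on host:port.

    Intended to be run before `docker compose up -d`.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 only when a listener accepted the connection.
        return sock.connect_ex((host, port)) != 0


def health_ok(body: str, expected_version: str) -> bool:
    """Check a /health response body for status "ok" and the expected version.

    Assumption: the version is reported under a "version" key.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    return payload.get("status") == "ok" and payload.get("version") == expected_version


def fetch_health(base_url: str = "http://127.0.0.1:8000", timeout: float = 5.0) -> str:
    """Fetch the raw /health body (base_url is an assumption)."""
    with urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Before starting the stack, port_is_free can be called for each of 8000, 5432, 9090, 3000, and 9464; once the stack is up, health_ok(fetch_health(), "<expected version>") covers the health item.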

Metrics extra installed (pip install veronica-cp[metrics])
GET http://127.0.0.1:9464/metrics returns Prometheus-formatted output
Prometheus is scraping port 9464 at /metrics -- verify in the Prometheus Targets UI (/targets)
Grafana dashboard loads and shows data (open http://127.0.0.1:3000)
Alerting rules configured for: cost ceiling breaches, HALT events, API error rate
Log retention policy confirmed (Docker logging driver or external collector configured)
step_denied metric baseline established before go-live (should be near zero initially)
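
To record the step_denied baseline, the text returned by the metrics endpoint can be parsed directly. This is a minimal sketch of a parser for the Prometheus text exposition format. It assumes the metric is exported under the exact name step_denied (the real name may carry a prefix or a _total suffix), and sum_metric is a hypothetical helper.

```python
def sum_metric(exposition: str, name: str) -> float:
    """Sum all samples of `name` across label sets in Prometheus text format."""
    total = 0.0
    for raw in exposition.splitlines():
        line = raw.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if "{" in line:
            # Labelled sample: name{label="value",...} value [timestamp]
            metric, _, rest = line.partition("{")
            value_part = rest[rest.rfind("}") + 1:]
        else:
            # Unlabelled sample: name value [timestamp]
            parts = line.split(None, 1)
            metric = parts[0]
            value_part = parts[1] if len(parts) > 1 else ""
        if metric.strip() != name:
            continue
        fields = value_part.split()
        if fields:
            total += float(fields[0])
    return total
```

Feeding it the body of GET http://127.0.0.1:9464/metrics shortly before go-live gives a number to compare later readings against.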

At least one policy defined via PUT /policies/{chain_id} before routing traffic
Initial ceiling_usd set to at least 3x observed p95 spend (conservative start)
on_exceed set to degrade for interactive agents; halt for batch/autonomous agents
step_limit configured where unbounded agent loops are a risk
Policy simulation run against representative historical traffic (if available)
GET /policies returns all expected chain policies with correct versions
Policy version conflict behavior tested: confirm 409 Conflict on stale current_version
Gradual rollout plan documented: simulation mode -> degrade -> halt
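
The 3x p95 rule for the initial ceiling_usd can be worked out from a sample of observed per-chain spend. The sketch below uses the nearest-rank definition of p95. The helper names and the shape of the policy body are assumptions for illustration; only the field names ceiling_usd, on_exceed, and step_limit come from this checklist.

```python
from typing import Optional, Sequence


def p95_nearest_rank(spend_usd: Sequence[float]) -> float:
    """Nearest-rank 95th percentile of observed spend values."""
    if not spend_usd:
        raise ValueError("need at least one spend observation")
    ordered = sorted(spend_usd)
    # ceil(0.95 * n) computed with integers to avoid float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def initial_ceiling_usd(spend_usd: Sequence[float], multiplier: float = 3.0) -> float:
    """Conservative starting ceiling: multiplier x observed p95 spend."""
    return round(multiplier * p95_nearest_rank(spend_usd), 2)


def build_policy(spend_usd: Sequence[float], on_exceed: str,
                 step_limit: Optional[int] = None) -> dict:
    """Assemble a policy body (hypothetical shape) for PUT /policies/{chain_id}."""
    if on_exceed not in ("degrade", "halt"):
        raise ValueError("on_exceed must be 'degrade' or 'halt'")
    policy = {"ceiling_usd": initial_ceiling_usd(spend_usd), "on_exceed": on_exceed}
    if step_limit is not None:
        policy["step_limit"] = step_limit
    return policy
```

For example, with twenty observations of 1.00 to 20.00 USD, the nearest-rank p95 is 19.00 and the starting ceiling is 57.00.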

veronica-core kernel connected to the control-plane API
ShieldPipeline and BudgetEnforcer initialized with the expected chain IDs
Event flow verified: a test LLM call produces an event visible in Grafana
HALT path tested end-to-end: trigger a ceiling breach and confirm the agent stops
DEGRADE path tested if on_exceed: degrade is in use
Redis budget synchronization tested under concurrent load (if Redis is configured)
Adapter compatibility confirmed for all LLM providers in use (OpenAI, Anthropic, etc.)

PostgreSQL backup schedule configured (daily minimum for production)
Backup includes policy table and event store
Restore procedure tested on a non-production host at least once
Policy export via GET /policies scripted and stored alongside infrastructure backups
Recovery time objective (RTO) documented and acceptable to stakeholders
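
The scripted policy export could look like the following. It is a minimal sketch: the base URL, file naming scheme, and helper names are assumptions, and the response of GET /policies is only assumed to be JSON.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from urllib.request import urlopen


def export_filename(now: datetime) -> str:
    """Timestamped file name (UTC) so exports sort chronologically."""
    return now.astimezone(timezone.utc).strftime("policies-%Y%m%dT%H%M%SZ.json")


def serialize_policies(raw: bytes) -> str:
    """Re-serialize the response with sorted keys so successive exports diff cleanly."""
    return json.dumps(json.loads(raw), sort_keys=True, indent=2) + "\n"


def export_policies(backup_dir: str, base_url: str = "http://127.0.0.1:8000") -> Path:
    """Fetch GET /policies and write it into backup_dir (base_url is an assumption)."""
    with urlopen(f"{base_url}/policies", timeout=10) as resp:
        raw = resp.read()
    target = Path(backup_dir) / export_filename(datetime.now(timezone.utc))
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(serialize_policies(raw), encoding="utf-8")
    return target
```

Calling export_policies from the same scheduled job that runs the PostgreSQL backup keeps the two artifacts together.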

Smoke test: create a policy, make a test LLM call, verify spend recorded in Grafana
On-call rotation established for HALT/DEGRADE alerts
Escalation path documented: who to contact if the API is unreachable
Rollback plan documented: steps to revert to previous policy versions
Design partner contact confirmed for first 48 hours post-launch monitoring
VERONICA_DEBUG confirmed absent from production environment one final time
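
The final VERONICA_DEBUG check can be automated as a pre-launch gate. This is a small sketch with a hypothetical helper; it treats the variable as present whenever it is set, regardless of its value, and it must run inside the API container's environment to be meaningful.

```python
import os
from typing import Mapping, Optional


def debug_flag_present(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if VERONICA_DEBUG is set at all, whatever its value."""
    source = os.environ if env is None else env
    return "VERONICA_DEBUG" in source
```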