
Phase 3: Admin Console, Governance, and Observability#1

Merged
caioopra merged 52 commits into main from develop on Apr 16, 2026

@caioopra

Owner

Summary

Phase 3 ships the admin console, security governance, LLM cost tracking, and production observability. 49 commits across 5 slices (A–E), all individually reviewed by code-reviewer + security-reviewer agents before merging to develop.

Slice A — Role Infrastructure

  • users.role column + AdminUser extractor (DB-read per request, not JWT)
  • Per-email login rate limit (10/15min) with timing equalization
  • Frontend role gating: authStore.role, AdminRoute, AuthLoader

Slice B — Audit Log + Step-Up Auth

  • audit_log table with payload CHECK constraint (no secrets)
  • emit_audit helper + auth event logging (login success/fail/rate-limited, refresh)
  • Action-scoped confirm tokens (5 min TTL, cross-admin replay blocked)
  • Paginated GET /api/admin/audit with composite cursor

Slice C — LLM Cost/Usage + Runtime Config

  • pricing.rs (Gemini/Claude models), context.rs (28K token truncation)
  • Token persistence per message + llm_usage_daily rollup
  • Budget enforcement ($5/mo default, per-user semaphore for TOCTOU prevention)
  • Kill-switch (chat_enabled), SettingsCache (60s TTL, double-checked locking)
  • Admin endpoints: settings, metrics, users, rate-limit overrides

Slice D — Admin Console Frontend

  • AdminShell + sidebar + nested routes
  • Dashboard with MetricCards (React Query, 30s polling)
  • Providers, Users, Audit pages with StepUpModal for sensitive ops
  • KillSwitchToggle, RateLimitDialog

Slice E — Observability Wiring

  • LOG_FORMAT=json toggle in main.rs
  • Span enrichment: input/output tokens, cost, model on chat.turn/chat.round
  • Tool-call SSE arg allowlist (14 tools)
  • docs/runbook.md with Axiom drain setup + 8 APL dashboard queries

Stats

  • Backend: 485 tests, 0 clippy warnings
  • Frontend: 267 tests, clean build
  • Migrations: 005–008 (all additive, zero-downtime)

Test plan

  • All backend tests pass (cargo test)
  • All frontend tests pass (npm test -- --run)
  • Clippy clean (cargo clippy --all-targets -- -D warnings)
  • Frontend build clean (npm run build)
  • Migrations apply cleanly on fresh DB

🤖 Generated with Claude Code

caioopra and others added 30 commits April 15, 2026 18:22
Add `role: String` to `CurrentUser`, update `load_user` to SELECT the
role column introduced by migration 005, and fix all query_as sites
(register RETURNING, login SELECT) to include the new column so that
the sqlx FromRow mapping stays consistent.

Also add `AppError::Forbidden` (HTTP 403) to the error enum and update
settings_tests to carry the new `login_rate_limit` AppState field.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New `AdminUser` extractor in `middleware/admin.rs` wraps `CurrentUser`
and returns HTTP 403 `{"error":"forbidden"}` if `role != "admin"`.
All failure paths (bad token, user not found, wrong role) return the
same body to prevent information leakage.

Expose `AdminUser` via `middleware/mod.rs`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add `routes/admin.rs` with a single `GET /api/admin/dashboard` endpoint
that returns `{"ok":true,"admin_email":"..."}`.  Mount the router under
`/api/admin` in both router constructors behind `auth_middleware` +
`AdminUser` extractor (defence-in-depth: 401 for missing token, 403 for
non-admin users).

Integration tests in `tests/admin_route_tests.rs` cover: 401 (no token),
401 (tampered token), 403 (regular user), 200 (admin), and verify the
forbidden body is uniform across users.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add `EmailRateLimitState` to `middleware/rate_limit.rs`: a sliding-window
DashMap keyed on normalized email (lowercase + trim), 10 attempts per
15-minute window.  Design: every attempt (success or failure) increments
the bucket; a successful login calls `clear()` to reset it.  An attacker
can therefore get in only if one of the first 10 guesses is correct, and
the counter restarts after a successful login.  This is better than
counting only failures.

Add `login_rate_limit: EmailRateLimitState` to `AppState` and wire it
into both router constructors.  The `POST /api/auth/login` handler now:
  1. Normalizes the email.
  2. Checks the bucket BEFORE any DB lookup — returns 429 with
     `Retry-After` header and `{"error":"rate_limited","retry_after_seconds":N}`
     regardless of whether the email exists.
  3. Clears the bucket on successful login.

Integration tests in `tests/auth_rate_limit_tests.rs` cover: 11th
attempt returns 429, Retry-After header present, bucket reset on success,
independent buckets per email, email-existence not leaked, and correct
password is still rejected when bucket is full (proving the fast path).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
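The sliding-window behaviour described in the commit above can be sketched with the standard library alone. This is a minimal illustration, not the PR's `EmailRateLimitState`: it uses a plain `HashMap` behind `&mut self` instead of a DashMap, takes the current time as a parameter so it can be tested, and the type and method names (`EmailRateLimiter`, `check`) are hypothetical.

```rust
use std::collections::{HashMap, VecDeque};
use std::time::{Duration, Instant};

/// Sliding-window limiter keyed on normalized email (illustrative only).
pub struct EmailRateLimiter {
    max_attempts: usize,
    window: Duration,
    buckets: HashMap<String, VecDeque<Instant>>,
}

impl EmailRateLimiter {
    pub fn new(max_attempts: usize, window: Duration) -> Self {
        Self { max_attempts, window, buckets: HashMap::new() }
    }

    /// Lowercase + trim, as the commit message describes.
    pub fn normalize(email: &str) -> String {
        email.trim().to_lowercase()
    }

    /// Records an attempt at `now`. Returns `Ok(())` when allowed, or
    /// `Err(retry_after)` when the bucket is already full.
    pub fn check(&mut self, email: &str, now: Instant) -> Result<(), Duration> {
        let (max, window) = (self.max_attempts, self.window);
        let bucket = self.buckets.entry(Self::normalize(email)).or_default();
        // Drop attempts that have aged out of the window.
        while let Some(&oldest) = bucket.front() {
            if now.duration_since(oldest) >= window {
                bucket.pop_front();
            } else {
                break;
            }
        }
        if bucket.len() >= max {
            // Time until the oldest remaining attempt leaves the window.
            let retry_after = match bucket.front() {
                Some(&oldest) => window - now.duration_since(oldest),
                None => window,
            };
            return Err(retry_after);
        }
        bucket.push_back(now);
        Ok(())
    }

    /// Resets the bucket, e.g. after a successful login.
    pub fn clear(&mut self, email: &str) {
        self.buckets.remove(&Self::normalize(email));
    }
}
```

With `EmailRateLimiter::new(10, Duration::from_secs(900))`, the 11th `check` for the same normalized email inside the window returns `Err`, which a handler could map to a 429 with a `Retry-After` header.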
Add `AppError::Forbidden` (HTTP 403, body `{"error":"forbidden"}`) to
support the AdminUser extractor and future admin-gated handlers that
need to distinguish Forbidden from Unauthorized.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- EmailRateLimitState: port opportunistic sweep (atomic counter + retain)
  from RateLimitState<Uuid> to prevent unbounded map growth under
  fabricated-email enumeration attacks; add sweep_empty() helper and
  email_sweep_removes_inactive_entries unit test

- Login timing side-channel: introduce DUMMY_HASH (OnceLock<String>) so
  that the "email not found" path calls verify_password against a static
  argon2 hash, equalizing latency with the wrong-password path; add
  login_timing_equalized_for_unknown_email integration test

- AppError::Forbidden: wire AdminUser rejection paths through
  AppError::Forbidden.into_response() instead of an inline tuple;
  remove #[allow(dead_code)]; extend forbidden_response_shape test to
  assert the JSON body is {"error":"forbidden"}

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Captures the workflow rules (feature branches, per-feature reviews,
--no-ff merges into develop), what's shipped in Slice A so far
(migration 005 + backend role infra), and a copy-paste-ready
dispatch prompt for Feature 2 (frontend role gating) so the next
session can resume without re-reading the plan.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add `role` to `MeResponse` in both `routes/auth.rs` and `routes/me.rs`,
and update the SQL RETURNING clause in `me.rs` to include the column.
Regenerate `.sqlx/` metadata accordingly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…shboard

- authStore: add `role: null` to initial state, `loadMe` action (calls
  GET /api/auth/me and stores role), `role` in partialize, `useIsAdmin` hook
- AdminRoute: wraps routes with auth + role checks; null role shows loading
  state, non-admin redirects to /, unauthenticated redirects to /login
- AdminDashboard: placeholder page for Slice D
- App.jsx: register /admin/dashboard route under AdminRoute
- MSW handlers: GET /api/auth/me now returns `role`; add setMockUserRole()
  and reset role in resetMockState()
- Tests: 4 AdminRoute tests + 2 authStore tests (loadMe role, persistence)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add AuthLoader component to App.jsx that calls loadMe whenever a token
is present but role is null, covering page reload with persisted token and
fresh login. Consolidate AdminRoute to a single useAuthStore import deriving
isAuthenticated from !!token. Add role field to PUT /api/me/planner-context
MSW handler so the mock response matches the real backend shape.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Creates an append-only audit_log table for admin-initiated operations
(role changes, impersonation, deletions) with actor/impersonation FK
columns, INET ip, JSONB payload, and composite indexes on
(actor_id, created_at DESC) and (action, created_at DESC).
Updates docs/schema.md with full column reference and diagram.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add a standalone audit_log_created_at index on (created_at DESC) so the
90-day retention DELETE can use an index range scan.  The existing composite
indexes on (actor_id, created_at) and (action, created_at) are not usable
for a bare created_at predicate.

Also add a payload safety comment in the migration header and in schema.md
documenting that emit_audit must strip sensitive keys (password, token,
secret, api_key, etc.) before insert.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…log payload

Security reviewer finding: payload JSONB had no DB-level guard against
accidental secret storage. Adds a denylist CHECK that rejects rows
containing password, password_hash, token, refresh_token, secret, or
api_key as top-level keys.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Introduces backend/src/middleware/audit.rs with emit_audit(), the single
code path for inserting rows into audit_log.  Strips forbidden top-level
payload keys (password, password_hash, token, refresh_token, secret,
api_key) before the INSERT as defense-in-depth alongside the DB CHECK
constraint.  Accepts Option<Uuid> for actor_id to support unauthenticated
events (e.g. login failures for unknown emails).  Re-exported from
middleware::mod.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
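As a rough illustration of the key-stripping step described above, the sketch below removes the six forbidden top-level keys from a payload before it would be inserted. It assumes a `BTreeMap<String, String>` as a stand-in for the JSONB payload, and `strip_forbidden_keys` is a hypothetical name rather than the helper in `audit.rs`.

```rust
use std::collections::BTreeMap;

/// Top-level keys the commit message lists as forbidden in audit payloads.
pub const FORBIDDEN_KEYS: [&str; 6] = [
    "password",
    "password_hash",
    "token",
    "refresh_token",
    "secret",
    "api_key",
];

/// Removes forbidden top-level keys in place and returns the names that
/// were stripped (in key order), so a caller could log that it happened.
pub fn strip_forbidden_keys(payload: &mut BTreeMap<String, String>) -> Vec<String> {
    let mut removed = Vec::new();
    payload.retain(|key, _value| {
        let forbidden = FORBIDDEN_KEYS.contains(&key.as_str());
        if forbidden {
            removed.push(key.clone());
        }
        !forbidden
    });
    removed
}
```

Like the behaviour the commit describes, this only inspects top-level keys; nested objects would need a recursive walk.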
Adds TokenKind::Confirm variant and ConfirmClaims struct to jwt.rs, with
encode_confirm_token() (5-minute TTL) and decode_confirm_token() that
validates the action claim and returns Forbidden on mismatch.

Adds POST /api/admin/confirm: re-verifies the admin's password, mints a
confirm token for the requested action, emits an admin.confirm audit row,
and returns {confirm_token}.  Also adds validate_confirm_token() helper
that reads x-confirm-token from headers for use by future mutation handlers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
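The checks a confirm token has to pass can be sketched without any JWT machinery. The snippet below covers only claim validation (expiry, action match, and the same-admin check that the PR summary refers to as cross-admin replay protection). Signing and decoding are left out, `u64` stands in for the user UUID, and the names `issue_claims` and `check_claims` are hypothetical.

```rust
/// Claims carried by a confirm token (field layout is an assumption).
#[derive(Debug, Clone, PartialEq)]
pub struct ConfirmClaims {
    pub sub: u64,       // admin user id; the real code presumably uses a UUID
    pub action: String, // the single action this token authorizes
    pub exp: u64,       // expiry, seconds since the Unix epoch
}

#[derive(Debug, PartialEq)]
pub enum ConfirmError {
    Expired,
    Forbidden,
}

/// 5-minute TTL, as stated in the commit message.
pub const CONFIRM_TTL_SECS: u64 = 5 * 60;

pub fn issue_claims(sub: u64, action: &str, now: u64) -> ConfirmClaims {
    ConfirmClaims { sub, action: action.to_string(), exp: now + CONFIRM_TTL_SECS }
}

/// Accepts the claims only if they are unexpired, were minted for the
/// calling admin, and name exactly the action being performed.
pub fn check_claims(
    claims: &ConfirmClaims,
    expected_user: u64,
    expected_action: &str,
    now: u64,
) -> Result<(), ConfirmError> {
    if now >= claims.exp {
        return Err(ConfirmError::Expired);
    }
    if claims.sub != expected_user || claims.action != expected_action {
        return Err(ConfirmError::Forbidden);
    }
    Ok(())
}
```

A mutation handler would call something like `check_claims` with its own canonical action string, so a token minted for one operation cannot be replayed against another.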
Instruments the login handler to emit auth.login.success on successful
login and auth.login.fail on wrong password or unknown email (but NOT on
rate-limit rejection).  IP is extracted from the X-Forwarded-For header
when present.  The refresh handler emits auth.token.refresh after issuing
new tokens.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ion tests

GET /api/admin/audit supports cursor-based pagination (before UUID),
optional action-prefix filtering, and configurable page size (default 50,
max 100).  Requires AdminUser extractor.

Adds backend/tests/audit_tests.rs with 13 integration tests covering all
Slice B deliverables: confirm endpoint auth/password flows, confirm token
action mismatch, login success/failure audit rows, token refresh audit,
audit list endpoint, and action-prefix filtering.  Also adds 6 unit tests
in jwt.rs for confirm token roundtrip, wrong action, expiry, and wrong
secret.  Regenerates .sqlx/ offline metadata.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…on, sub check

- Fix 1: validate action param in audit LIKE filter to [a-zA-Z0-9._-];
  reject anything containing '%', '\', or other non-allowed chars (422).
- Fix 2: replace single-column created_at cursor with composite
  (created_at, id) tuple comparison to prevent silent row skips when
  two rows share the same microsecond timestamp.
- Fix 3: add confirm_rate_limit (5 attempts / 5-min window, keyed on
  admin user_id) to AppState and enforce it at the start of the confirm
  handler.
- Fix 4: validate the action field in the confirm request body — max 128
  bytes, allowed chars [a-zA-Z0-9._-]; return 422 on violation.
- Fix 5: add expected_user_id param to validate_confirm_token; reject
  with 403 if claims.sub != expected_user_id (prevents cross-admin
  token reuse).
- Fix 6: fire-and-forget audit row (auth.login.rate_limited) before
  returning 429 on rate-limited login so credential-stuffing attempts
  are visible in the audit log.
- Add unit and integration tests covering all six fixes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
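Fix 2 above is easiest to see on an in-memory list. The sketch below pages through rows newest-first using a `(created_at, id)` tuple cursor, so two rows sharing a timestamp are both returned. It is an illustration only: the real endpoint does this in SQL, and `AuditRow` and `page_before` are hypothetical names with simplified field types.

```rust
/// Minimal stand-in for an audit_log row.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct AuditRow {
    pub created_at_micros: i64,
    pub id: u128,
}

/// Returns up to `limit` rows strictly older than `cursor`, newest first.
/// The cursor is the (created_at, id) pair of the last row already seen.
pub fn page_before(
    rows: &[AuditRow],
    cursor: Option<(i64, u128)>,
    limit: usize,
) -> Vec<AuditRow> {
    let mut sorted: Vec<AuditRow> = rows.to_vec();
    // Descending by (created_at, id); the id breaks timestamp ties.
    sorted.sort_by(|a, b| (b.created_at_micros, b.id).cmp(&(a.created_at_micros, a.id)));
    sorted
        .into_iter()
        .filter(|r| match cursor {
            Some(c) => (r.created_at_micros, r.id) < c,
            None => true,
        })
        .take(limit)
        .collect()
}
```

With a cursor on `created_at` alone, a page boundary falling between two rows that share a timestamp would skip the second one; the tuple comparison avoids that.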
…tables

Creates llm_usage_daily for per-user/provider/model daily token rollups
(composite PK enables idempotent upsert) and app_settings as a flat
key-value runtime config store, seeded with default provider, model,
budget, and feature-flag values. Schema doc updated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a one-to-one user_rate_limits table that stores optional per-user
overrides for daily LLM token and request limits, with nullable columns
so each dimension can be controlled independently. Falls back to global
app_settings values when no override row is present. Schema doc updated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…allowlist

- Define reusable set_updated_at() plpgsql function in migration 007
- Add BEFORE UPDATE triggers on app_settings and user_rate_limits so
  updated_at stays current without requiring explicit application-level
  SET in every UPDATE statement
- Add CHECK (char_length(value) <= 1024) on app_settings.value to
  prevent oversized writes
- Add key allowlist CHECK on app_settings.key constraining inserts to
  the six known setting names
- Apply all changes to the live DB directly (ALTERs + CREATE FUNCTION/TRIGGER)
  and fix _sqlx_migrations checksums to match the updated files
- Update docs/schema.md to document triggers and constraints

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Hard-coded ModelPrice table for gemini/claude models with
estimate_cost_usd() helper. Unit tests cover known models,
zero tokens, and unknown provider fallback to 0.0.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
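A price table of this kind might look roughly like the sketch below. The model names and per-million-token prices are placeholders invented for the example, not real provider pricing and not the values in `pricing.rs`. Only the function name `estimate_cost_usd` and the fall-back to `0.0` for unknown models come from the commit message.

```rust
/// Price per one million tokens, in USD (placeholder values only).
#[derive(Debug, Clone, Copy)]
pub struct ModelPrice {
    pub input_per_mtok_usd: f64,
    pub output_per_mtok_usd: f64,
}

/// Hypothetical lookup; the model names here are made up for illustration.
pub fn lookup_price(provider: &str, model: &str) -> Option<ModelPrice> {
    match (provider, model) {
        ("gemini", "example-gemini-model") => Some(ModelPrice {
            input_per_mtok_usd: 1.0,
            output_per_mtok_usd: 2.0,
        }),
        ("claude", "example-claude-model") => Some(ModelPrice {
            input_per_mtok_usd: 3.0,
            output_per_mtok_usd: 15.0,
        }),
        _ => None,
    }
}

/// Estimated cost of one call; unknown provider/model pairs cost 0.0.
pub fn estimate_cost_usd(
    provider: &str,
    model: &str,
    input_tokens: u64,
    output_tokens: u64,
) -> f64 {
    match lookup_price(provider, model) {
        Some(p) => {
            (input_tokens as f64 * p.input_per_mtok_usd
                + output_tokens as f64 * p.output_per_mtok_usd)
                / 1_000_000.0
        }
        None => 0.0,
    }
}
```

Returning `0.0` for unknown models keeps the chat path from failing on a missing price, at the cost of under-reporting spend until the table is updated.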
truncate_to_budget() trims Vec<Message> to a token budget using a
char/4 heuristic, always preserving the system message and last
user message. Appends a truncation notice when messages are dropped.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
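One way the truncation described above could work is sketched here. It follows the stated rules (char/4 estimate, system message and last user message always kept, a notice appended when anything is dropped). The `Role` and `Message` types, the policy of keeping the most recent contiguous history, and the wording and role of the notice are assumptions, not details taken from `context.rs`.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Role {
    System,
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: Role,
    pub content: String,
}

/// Rough token estimate: one token per four characters.
pub fn estimate_tokens(text: &str) -> usize {
    text.chars().count() / 4
}

pub fn truncate_to_budget(messages: Vec<Message>, max_tokens: usize) -> Vec<Message> {
    let costs: Vec<usize> = messages.iter().map(|m| estimate_tokens(&m.content)).collect();
    if costs.iter().sum::<usize>() <= max_tokens {
        return messages; // already within budget
    }
    // Pin every system message and the last user message.
    let last_user = messages.iter().rposition(|m| m.role == Role::User);
    let mut keep: Vec<bool> = messages
        .iter()
        .enumerate()
        .map(|(i, m)| m.role == Role::System || Some(i) == last_user)
        .collect();
    let mut used: usize = costs.iter().zip(&keep).filter(|(_, k)| **k).map(|(c, _)| *c).sum();
    // Walk newest to oldest, keeping history until one message does not fit.
    for i in (0..messages.len()).rev() {
        if keep[i] {
            continue;
        }
        if used + costs[i] > max_tokens {
            break;
        }
        keep[i] = true;
        used += costs[i];
    }
    let dropped = keep.iter().filter(|k| !**k).count();
    let mut out: Vec<Message> = messages
        .into_iter()
        .zip(keep)
        .filter(|(_, k)| *k)
        .map(|(m, _)| m)
        .collect();
    if dropped > 0 {
        out.push(Message {
            role: Role::System,
            content: format!("[{} earlier message(s) omitted to fit the context budget]", dropped),
        });
    }
    out
}
```

Because pinned messages are never dropped, the result can still exceed the budget when the system prompt and last user message are large on their own. The sketch accepts that rather than cutting message text.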
…check and kill-switch

- INSERT now binds input_tokens, output_tokens, model to messages table
- Budget check queries llm_usage_daily monthly sum before each turn;
  returns 429 BudgetExceeded when over limit
- Kill-switch reads chat_enabled from SettingsCache; returns 503 when false
- Usage rollup upserts into llm_usage_daily after each turn using
  estimate_cost_usd from the pricing table
- done SSE event includes budget_warning flag
- AppError gains BudgetExceeded and ServiceUnavailable variants
- Integration tests: token persistence, kill-switch 503, budget 429,
  budget_warning field

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
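The budget decision made before each turn reduces to a small pure function, sketched below. The names `BudgetStatus` and `budget_status` are hypothetical, and treating a spend exactly equal to the budget as exceeded is an assumption. Elsewhere in this PR the default budget is given as $5/month and a test uses an 80% warning threshold.

```rust
#[derive(Debug, PartialEq)]
pub enum BudgetStatus {
    /// Below the warning threshold.
    WithinBudget,
    /// At or above the warning threshold; the turn proceeds and the
    /// `done` event could carry `budget_warning = true`.
    Warning,
    /// At or above the budget; the handler would answer 429.
    Exceeded,
}

/// `warn_pct` is a percentage of the budget, e.g. 80.0.
pub fn budget_status(month_spend_usd: f64, budget_usd: f64, warn_pct: f64) -> BudgetStatus {
    if month_spend_usd >= budget_usd {
        BudgetStatus::Exceeded
    } else if month_spend_usd >= budget_usd * warn_pct / 100.0 {
        BudgetStatus::Warning
    } else {
        BudgetStatus::WithinBudget
    }
}
```

For example, $4.01 spent against a $5.00 budget with an 80% threshold sits above the $4.00 warning line and below the limit, so it yields `Warning`.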
Arc<RwLock<HashMap>> backed cache for app_settings rows. Refreshes
from DB when stale (>60s). invalidate() forces next read to reload.
Added to AppState and initialized with SettingsCache::new() in all
router constructors. Updated settings_tests.rs to include the new
field.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
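A TTL cache with this shape can be sketched using only the standard library. The version below is not the PR's `SettingsCache`: it uses `std::sync::RwLock` rather than an async lock, takes the loader as a closure instead of querying a database, and keeps the map and refresh timestamp under one lock with a re-check after taking the write lock, which is the double-checked locking the PR summary mentions.

```rust
use std::collections::HashMap;
use std::sync::RwLock;
use std::time::{Duration, Instant};

struct CacheInner {
    values: HashMap<String, String>,
    refreshed_at: Option<Instant>, // None means "stale, reload on next read"
}

pub struct SettingsCache {
    inner: RwLock<CacheInner>,
    ttl: Duration,
}

impl SettingsCache {
    pub fn new(ttl: Duration) -> Self {
        let inner = CacheInner { values: HashMap::new(), refreshed_at: None };
        Self { inner: RwLock::new(inner), ttl }
    }

    fn is_fresh(inner: &CacheInner, ttl: Duration) -> bool {
        inner.refreshed_at.map_or(false, |t| t.elapsed() < ttl)
    }

    /// Returns the cached value, calling `load` first if the cache is stale.
    pub fn get<F>(&self, key: &str, load: F) -> Option<String>
    where
        F: FnOnce() -> HashMap<String, String>,
    {
        {
            // Fast path: shared lock only.
            let inner = self.inner.read().unwrap();
            if Self::is_fresh(&inner, self.ttl) {
                return inner.values.get(key).cloned();
            }
        }
        let mut inner = self.inner.write().unwrap();
        // Re-check: another caller may have refreshed while we waited.
        if !Self::is_fresh(&inner, self.ttl) {
            inner.values = load();
            inner.refreshed_at = Some(Instant::now());
        }
        inner.values.get(key).cloned()
    }

    /// Forces the next read to reload.
    pub fn invalidate(&self) {
        self.inner.write().unwrap().refreshed_at = None;
    }
}
```

With a 60-second TTL, repeated `get` calls within the window hit the fast path, and an admin settings update would call `invalidate()` so the change is visible on the next read.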
caioopra and others added 22 commits April 16, 2026 10:41
…ints

New routes:
  GET  /api/admin/settings         — list all app_settings rows
  POST /api/admin/settings         — update a setting (confirm token req for
                                     provider/model keys)
  GET  /api/admin/metrics/usage    — LLM usage grouped by day/provider/model
                                     (days param, default 30, max 90)
  GET  /api/admin/users            — user list without passwords
  POST /api/admin/users/:id/rate-limit — upsert per-user rate limit override

All endpoints gate on AdminUser. settings_update invalidates SettingsCache
and emits audit events. Integration tests cover seed values, update, 403 for
non-admin, unknown key 422, metrics after a chat turn, and rate-limit upsert.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ncation, sensitive keys

- Fix 1: Consolidate SettingsCache into a single CacheInner struct under one
  tokio::sync::RwLock, eliminating the split-lock race between inner map and
  refreshed_at. Implements double-checked locking so only the first writer
  in a thundering-herd scenario performs the DB refresh.

- Fix 2: Add per-user chat_semaphores (DashMap<Uuid, Arc<Semaphore>>) to
  AppState. The chat handler acquires the 1-permit semaphore before the
  budget check and holds it through the usage upsert, closing the TOCTOU
  window where two concurrent requests could both pass the budget guard.

- Fix 3: Import truncate_to_budget and DEFAULT_MAX_TOKENS from ai::context
  and apply truncation at the start of every tool-use round in chat.rs so
  the context window is always capped before each provider call.

- Fix 4: Add chat_enabled, budget_monthly_usd, and budget_warn_pct to
  SENSITIVE_SETTING_KEYS in admin.rs, requiring step-up confirm tokens for
  all high-impact runtime settings.

- Fix 5: Simplify AppError::BudgetExceeded to a unit variant and remove
  monthly_spend/budget fields from the 429 response body to avoid leaking
  financial data in HTTP error payloads.

- Fix 6: Remove dead build_settings_cache function (was gated behind
  #[allow(dead_code)] and never called).

Updated tests: chat_token_usage_tests.rs (no financial fields in 429 body),
admin_settings_tests.rs (all keys now sensitive, supply confirm token),
settings_tests.rs (add chat_semaphores to AppState constructors).
470 tests passing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
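The TOCTOU fix in Fix 2 comes down to holding a per-user lock across both the budget check and the usage write. The sketch below shows that idea with standard-library types only: a `Mutex<()>` per user stands in for the 1-permit semaphore, in-memory maps stand in for the DashMap and the `llm_usage_daily` table, and `BudgetGuard` and `try_turn` are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

pub struct BudgetGuard {
    budget_cents: u64,
    spend: Mutex<HashMap<u64, u64>>,            // stand-in for the usage table
    locks: Mutex<HashMap<u64, Arc<Mutex<()>>>>, // stand-in for per-user semaphores
}

impl BudgetGuard {
    pub fn new(budget_cents: u64) -> Self {
        Self {
            budget_cents,
            spend: Mutex::new(HashMap::new()),
            locks: Mutex::new(HashMap::new()),
        }
    }

    fn user_lock(&self, user: u64) -> Arc<Mutex<()>> {
        self.locks.lock().unwrap().entry(user).or_default().clone()
    }

    pub fn spent(&self, user: u64) -> u64 {
        self.spend.lock().unwrap().get(&user).copied().unwrap_or(0)
    }

    /// Runs one "turn". The per-user lock is held from the budget check
    /// through the usage write, so two concurrent turns for the same user
    /// cannot both pass the check against the same stale total.
    pub fn try_turn(&self, user: u64, cost_cents: u64) -> bool {
        let lock = self.user_lock(user);
        let _permit = lock.lock().unwrap();
        if self.spent(user) >= self.budget_cents {
            return false; // over budget: reject before doing any work
        }
        // ... the provider call would happen here, still under the lock ...
        *self.spend.lock().unwrap().entry(user).or_insert(0) += cost_cents;
        true
    }
}
```

Serializing turns per user costs some concurrency for a single account, but different users never contend because each has its own lock.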
…M call lifecycle

The OwnedSemaphorePermit is now moved into the async_stream closure so it
is held for the entire duration of the LLM call and usage upsert, not just
until the handler returns the SSE response.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Exports getDashboard, getSettings, updateSetting, getUsageMetrics,
getUsers, getAuditLog, getConfirmToken, and setUserRateLimit using
the shared axios apiClient with auth token injection.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Introduces AdminShell (persistent left sidebar + Outlet) and
AdminSidebar (NavLink-based nav for Dashboard/Providers/Users/Audit).
Rewires /admin to a nested route tree inside AdminRoute, removes the
old flat /admin/dashboard route, and adds MSW handlers for all admin
endpoints. Tests cover shell rendering, nav links, and active-link
accent styling.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replaces the placeholder AdminDashboard with a real page that fetches
usage metrics, settings, and users via React Query (30s refetch) and
renders four MetricCards: Total Users, Monthly Cost, Active Provider,
and Chat Status. Adds MetricCard component with raised surface styling
and accent border. Deletes the old pages/AdminDashboard.jsx placeholder.
Tests cover card rendering, loading state, and all four metric values
against MSW mock data.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…en tests

- AdminDashboard: destructure isError from each useQuery call; render red
  alert banner when any query fails while keeping partial data visible
- AdminDashboard: remove unused getDashboard import (was never used)
- AdminShell: import useEffect + useLocation; close mobile sidebar on
  every pathname change via useEffect([location.pathname])
- AdminShell.test: strengthen toggle test — assert second AdminSidebar nav
  mounts in the DOM after hamburger click (two navigation landmarks)
- AdminDashboard.test: add error-state test using non-admin role to trigger
  403s from MSW, asserting the role="alert" banner appears

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
StepUpModal handles password re-entry before sensitive admin operations
by calling getConfirmToken and forwarding the token via onSuccess.
KillSwitchToggle wraps the chat_enabled setting toggle with StepUp flow.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AdminProviders shows LLM settings (default provider, model names) with a
form that requires StepUp confirmation to save; also embeds KillSwitchToggle
for the chat_enabled flag. AdminUsers renders a styled table with role badges
and a per-user rate-limit dialog backed by setUserRateLimit. AdminAudit lists
audit events in a timeline layout with action-prefix filtering and cursor-based
load-more pagination. All three routes are wired in App.jsx.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…g, step-up auth on rate limits

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the single tracing_subscriber::fmt() call with a match on
LOG_FORMAT env var that selects JSON or text output. Logs an info
event immediately after init to confirm the active format.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
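The selection logic can be separated from subscriber setup and tested on its own, as in the sketch below. It covers only the parsing of `LOG_FORMAT`. The `LogFormat` enum and both function names are hypothetical, and trimming and case-insensitive matching are assumptions. The real `main.rs` would then build the JSON or text subscriber in each match arm.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum LogFormat {
    Json,
    Text,
}

/// Maps the raw env value to a format; anything other than "json"
/// (including an unset variable) falls back to text output.
pub fn parse_log_format(raw: Option<&str>) -> LogFormat {
    match raw.map(|s| s.trim().to_ascii_lowercase()).as_deref() {
        Some("json") => LogFormat::Json,
        _ => LogFormat::Text,
    }
}

/// Reads LOG_FORMAT from the process environment.
pub fn log_format_from_env() -> LogFormat {
    parse_log_format(std::env::var("LOG_FORMAT").ok().as_deref())
}
```

Defaulting to text keeps local development output readable, while a deployment sets `LOG_FORMAT=json` for the log drain.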
chat.turn now carries input_tokens, output_tokens, estimated_cost_usd,
and model — all recorded at stream end from total_usage. chat.round
gets input_tokens and output_tokens recorded per-round from round_usage.
turn_span is cloned and moved into the stream closure so recording works
across the async boundary.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add LOG_FORMAT=text default and a commented AXIOM_TOKEN placeholder
for Fly.io log drain configuration.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…emove phantom entry

- Remove phantom `set_block_labels` entry (tool not in executor dispatch table)
- Add `list_rules`, `create_rule`, `update_rule`, `delete_rule` entries with
  correct field names from their `*Args` structs in schemas.rs
- Fix `undo_last_action` allowlist from `&["routine_id"]` to `&[]` since
  `UndoLastActionArgs` is an empty struct with no parameters
- Add `"note"` and `"label_names"` to `create_block` allowlist (non-sensitive,
  useful for frontend progress display)
- Add `"label_names"` to `update_block` allowlist
- Update and extend unit tests to cover all new/changed entries

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
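An argument allowlist of this kind can be sketched as a per-tool list of field names plus a filter applied before arguments are echoed over SSE. The tool names below come from the commit message, but the lists are deliberately partial (only the fields the commit names), the fall-through for unknown tools is an assumption, and `allowed_args` and `filter_args` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Field names that may be forwarded to the client for each tool.
/// Partial and illustrative: the real table covers more tools and fields.
pub fn allowed_args(tool: &str) -> &'static [&'static str] {
    match tool {
        "create_block" => &["note", "label_names"],
        "update_block" => &["label_names"],
        "undo_last_action" => &[], // takes no parameters
        _ => &[],                  // unknown tools expose nothing
    }
}

/// Keeps only allowlisted arguments for the given tool.
pub fn filter_args(
    tool: &str,
    args: &BTreeMap<String, String>,
) -> BTreeMap<String, String> {
    let allow = allowed_args(tool);
    args.iter()
        .filter(|(key, _)| allow.contains(&key.as_str()))
        .map(|(key, value)| (key.clone(), value.clone()))
        .collect()
}
```

Denying by default means a newly added tool leaks no arguments to the frontend until someone adds an explicit entry for it.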
… to audit

- POST /api/admin/users/:id/rate-limit now requires x-confirm-token with
  action "admin.user.rate_limit" (validate_confirm_token).
- Frontend AdminUsers.jsx updated to request the same canonical action string.
- Extract IP (x-forwarded-for) and user-agent in both settings_update and
  users_set_rate_limit; pass them to emit_audit for forensic traceability.
- Add extract_client_info() helper in admin.rs.
- Add dedicated register_rate_limit (EmailRateLimitState, 5 req / 15 min)
  to AppState so registration spam is throttled independently from login.
  POST /api/auth/register now returns 429 + Retry-After when the bucket
  is full, with a fire-and-forget audit row (auth.register.rate_limited).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- chat_done_event_includes_budget_warning_true_when_near_limit: seeds
  $4.01 spend (above 80% of $5.00 budget) and asserts the done SSE event
  carries budget_warning=true.
- settings_update_writes_audit_log_row: updates budget_warn_pct and queries
  audit_log to assert an "admin.settings.update" row was written with the
  correct target_type and target_id.
- users_set_rate_limit_requires_confirm_token: asserts 403 when no
  x-confirm-token is supplied (regression guard for Fix 1).
- Update users_set_rate_limit_succeeds / _upserts / _404 to obtain a
  confirm token before each call (required by the new handler validation).
- settings_tests.rs: add register_rate_limit field to inline AppState
  constructions to match the updated AppState struct.
- Regenerate backend/.sqlx/ offline metadata for the new sqlx::query! in
  settings_update_writes_audit_log_row.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@caioopra caioopra merged commit 3262bf5 into main Apr 16, 2026
2 checks passed

1 participant