From f46be94f457304c0a58c3668e5c3c3775684c0d3 Mon Sep 17 00:00:00 2001 From: Claude Date: Fri, 22 May 2026 21:47:09 +0000 Subject: [PATCH] Rewrite README as coherent product doc MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Drops the iterative phase-by-phase framing (relevant during the refactor, confusing now that everything's merged) and replaces it with Features → Architecture → Quick start → Per-service runbook → Deploy → Configuration → Project layout. Adds an ASCII architecture diagram and a dedicated section per microservice with run commands; fixes the layout tree which was missing services/, deploy/, core/events, core/threat, core/observability, and .github/workflows. https://claude.ai/code/session_01THsbGHdqjcvJeWUwrzZtp8 --- README.md | 475 ++++++++++++++++++++++++++++++++---------------------- 1 file changed, 283 insertions(+), 192 deletions(-) diff --git a/README.md b/README.md index d4cdf5b..5a55fb2 100644 --- a/README.md +++ b/README.md @@ -1,79 +1,177 @@ # segmentation-copilot -An AI Security Analyst agent that turns Cisco SD-Access SG-ACL hit logs into a TrustSec contract matrix proposal — designed to help networks switch the matrix default rule to `deny-ip` without breaking legitimate flows. +An AI Security Analyst for **Cisco TrustSec** networks. It watches SG-ACL hit +logs, proposes the explicit `permit` / `deny` contracts that should exist for +legitimate traffic, and lets a real human review every change before it +touches the live matrix — a safe path to flipping the matrix default to +`deny-ip`. + +The agent runs both **on demand** (operator asks for a one-off analysis) and +**autonomously** (a scheduler scans for unknown flows; a daemon reacts to +threat-flagged destinations in real time). Operators can drive it from the +Streamlit UI, a CLI, any MCP-aware chat client (Claude Code, Claude Desktop, +LibreChat, …), or a WebEx bot — all backed by the same FastAPI service. 
+
+---
+
+## Features
+
+- **Interactive analysis.** Upload syslog, pick a time window, get a proposed
+  TrustSec contract matrix as markdown / CSV.
+- **Proactive autonomy.** A cron-driven scheduler scans the syslog backlog
+  on a configurable interval, finds flows not covered by the current
+  baseline, and proposes a `permit` or `deny` rule for each one.
+- **Reactive autonomy.** A real-time syslog tail runs every destination IP
+  through a pluggable threat-intel layer (AbuseIPDB / OTX / VirusTotal /
+  Talos) and proposes a deny rule the moment a malicious flow is seen.
+- **Human-gated approvals.** Every proposed rule lands as a WebEx adaptive
+  card (or in the UI / CLI / MCP); nothing changes the live matrix until
+  an operator clicks Approve. Approval creates an immutable
+  `matrix_version`; rollback is a pointer flip.
+- **Multi-client.** Streamlit UI, `scopilot` CLI, MCP server (stdio + HTTP),
+  WebEx bot, and REST API, all of which reach the same agent.
+- **Production-ready.** Microservices, Docker / Kubernetes deployment,
+  Postgres + Redis, JSON logs, Prometheus metrics, NetworkPolicies, CI/CD.
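The "immutable `matrix_version`; rollback is a pointer flip" idea from the feature list can be pictured with a small in-memory sketch. This is only an illustration under stated assumptions: `MatrixVersion`, `MatrixHistory`, and the rule tuples are hypothetical names, not the repository's ORM models, and the real service persists versions in the database rather than in a dict.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MatrixVersion:
    """One immutable snapshot of the approved matrix (hypothetical shape)."""
    id: int
    parent_id: Optional[int]
    rules: frozenset  # e.g. {("Employees", "Servers", "permit tcp dst eq 443")}


class MatrixHistory:
    """Append-only store of versions plus a single movable 'live' pointer."""

    def __init__(self) -> None:
        self._versions: dict[int, MatrixVersion] = {}
        self.live_id: Optional[int] = None

    def approve(self, new_rule: tuple) -> MatrixVersion:
        # Approval never mutates an existing version; it appends a child
        # whose parent_id points at the version that was live until now.
        parent = self._versions.get(self.live_id)
        base = parent.rules if parent else frozenset()
        version = MatrixVersion(
            id=len(self._versions) + 1,
            parent_id=parent.id if parent else None,
            rules=base | {new_rule},
        )
        self._versions[version.id] = version
        self.live_id = version.id
        return version

    def rollback(self) -> Optional[MatrixVersion]:
        # Rollback only moves the pointer to the parent; nothing is deleted.
        current = self._versions.get(self.live_id)
        if current is None:
            return None
        self.live_id = current.parent_id
        return self._versions.get(self.live_id)
```

Approving two rules and then calling `rollback()` leaves both versions stored but makes the first one live again, which is why a rollback in this model is cheap and auditable.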
+ +--- + +## Architecture -## What it does +``` + External clients + ┌──────────┬─────────┬───────────┬──────────┬───────────┐ + │ CLI │ Claude │ LibreChat │ Streamlit│ WebEx │ + │ scopilot │ UI │ │ UI │ Bot │ + └─────┬────┴────┬────┴─────┬─────┴────┬─────┴─────┬─────┘ + │ REST │ MCP │ MCP/SSE │ REST │ webhooks + ▼ ▼ ▼ ▼ ▼ + ┌──────────────────┐ ┌──────────────────┐ ┌────────────┐ + │ api/ FastAPI │ │ mcp-server/ │ │ webex-bot/ │ + │ OIDC-ready auth │ │ stdio + HTTP │ │ HMAC verify│ + └────────┬─────────┘ └────────┬─────────┘ └─────┬──────┘ + └─────────────┬───────┴──────────────────┘ + ▼ + ┌────────────────────────┐ + │ core/ (shared library) │ + │ services · repos │ + │ events · threat intel │ + │ sources (stream + win)│ + └────┬──────────┬────────┘ + │ │ + ┌───────────────┘ └───────────────┐ + ▼ ▼ +┌──────────────┐ Redis Streams ┌────────────────┐ +│ worker/ │◄──── events.* ───────►│ threat-daemon/ │ +│ scheduler │ consumer groups │ asyncssh tail │ +│ + consumer │ │ + intel lookup│ +└──────┬───────┘ └────────┬───────┘ + │ │ + └────────────────┬───────────────────────┘ + ▼ + ┌────────────────┐ + │ PostgreSQL │ (Redis: cache + Streams) + │ via Alembic │ + └────────────────┘ +``` + +Everything underneath the clients is a separate, horizontally scalable +microservice; everything in `core/` is the shared library each microservice +imports. -1. Asks for a syslog source (local file or SSH to a syslog collector), an analysis window, and an SGT/DGT id→name dictionary. -2. Pulls and parses `%RBM-6-SGACLHIT` syslog entries from the configured source. -3. Aggregates the raw events into unique flow tuples (`sgt`, `dgt`, `protocol`, `src_port`, `dst_port`). -4. Uses Claude to classify each flow as **business_relevant**, **default**, **business_irrelevant**, or **harmful**. -5. Groups classified flows into one **contract** per (Source SGT, Destination SGT) pair, with one or more ACEs each. -6. Renders the matrix as a markdown table and persists runs to SQLite. 
+--- -The matrix-wide default rule remains `deny-ip` — the agent emits only the explicit permits (and selective denies for visibility into Business Irrelevant / Harmful flows). +## Quick start -## Install +### Run everything with Docker Compose ```bash -pip install -e ".[dev]" +export SCOPILOT_ANTHROPIC__API_KEY=sk-ant-... +docker compose -f deploy/docker-compose.yml up -d +# api on :8000, mcp-http on :8002, worker + scheduler in the background ``` -## Run the stack +Optional services live behind Compose profiles: + +```bash +docker compose -f deploy/docker-compose.yml --profile ui --profile webex up -d +``` -Phase 2 splits the app in two: a FastAPI service holds the agent + DB, -and Streamlit / the CLI talk to it over HTTP. +### Local dev without Docker ```bash -# Start the API +pip install -e ".[dev,api,worker,webex,mcp,cli,sources,ui]" export SCOPILOT_ANTHROPIC__API_KEY=sk-ant-... -export SCOPILOT_API__REQUIRE_AUTH=false # dev only -uvicorn services.api.main:app --reload +export SCOPILOT_REDIS__URL=memory:// # in-memory bus, no Redis needed +export SCOPILOT_API__REQUIRE_AUTH=false # dev only -# Streamlit UI (in another terminal) -streamlit run app.py -# or use the CLI -scopilot --help -scopilot health -scopilot sgt set 100 Employees -scopilot run start tests/fixtures/sample.log +# Apply migrations (SQLite by default — set SCOPILOT_DB__URL for Postgres) +alembic upgrade head + +# Then start the bits you need: +uvicorn services.api.main:app --reload # REST + /metrics +python -m services.worker.main --role worker # consumes events.flow.unknown +python -m services.worker.main --role scheduler # cron tick +streamlit run app.py # UI +scopilot --help # CLI ``` -When `SCOPILOT_API__REQUIRE_AUTH=true` (prod default), set -`SCOPILOT_API__API_KEYS=[""]` and pass it via -`Authorization: Bearer ` (or `SCOPILOT_API_TOKEN=` for the CLI). +--- -### Run the scheduler + worker (proactive autonomy) +## Services + +Each service is a separate process / container. 
They all import from the same +`core/` library, so a bug fix in the agent's pipeline lands everywhere at once. + +### `api/` — FastAPI REST + `/metrics` + +The control plane. Exposes runs, ingest, classify, matrix, SGT dictionary, +proposals, healthz/readyz, and Prometheus `/metrics`. Bearer-token auth via +`SCOPILOT_API__API_KEYS`; the dependency surface is JWKS-ready for a future +OIDC verifier swap. Streamlit, the CLI, and the WebEx bot all talk through +this service. ```bash -# Required for production; for dev, set SCOPILOT_REDIS__URL=memory:// to -# run everything single-process against the in-memory bus. -export SCOPILOT_REDIS__URL=redis://localhost:6379/0 +uvicorn services.api.main:app --host 0.0.0.0 --port 8000 +``` -# One process drives the cron + Redis leader election. -python -m services.worker.main --role scheduler +### `worker/` — Scheduler + flow-unknown consumer + +Two roles in one binary: + +- **`--role scheduler`** runs a cron loop. Redis-leader-elected so multiple + replicas are safe. At every tick it loads new `flow_events` since the + per-tenant cursor, diffs them against the latest approved + `matrix_version`, and publishes `events.flow.unknown` for every + uncovered tuple. +- **`--role worker`** consumes `events.flow.unknown` in a consumer group, + classifies the flow via Claude (honouring a 7-day classification cache + so re-seen flows don't pay LLM cost), and turns the verdict into a rule + proposal — `deny` for harmful / business_irrelevant, `permit` otherwise. + Storm-collapse keeps a misconfigured source from drowning operators. -# One or more processes consume events.flow.unknown. +```bash +python -m services.worker.main --role scheduler python -m services.worker.main --role worker ``` -When the scheduler detects a flow not covered by the latest approved -matrix it publishes `events.flow.unknown`; the worker picks it up, -classifies via Claude, and creates a rule proposal that lands in WebEx -(or any other notifier sink). 
Operators approve/reject via the existing -Phase-3 flow. +### `threat_daemon/` — Real-time SSH tail + threat intel -### Run the threat daemon (reactive autonomy, optional) +Tails `/var/log/network/syslog` over SSH (`asyncssh` with `tail -F`, +exponential-backoff reconnect, heartbeat markers, log-rotation transparent). +For every destination IP it consults the pluggable threat-intel layer; on a +malicious verdict it publishes `events.flow.unknown` with `trigger="threat"` +and the full verdict trail. From there it joins the same worker pipeline as +the scheduler — same classification, same proposal flow, same approval loop. -Tails the syslog stream in real time and on a malicious destination IP -publishes `events.flow.unknown` with `trigger="threat"` — the same -worker pipeline classifies the flow and posts a proposal. Requires at -least one threat-intel provider configured. +Providers (mix and match — at least one required): + +- **AbuseIPDB** (primary; free 1000/day) +- **AlienVault OTX** +- **VirusTotal** (slow free-tier rate-limit; lookups cached aggressively) +- **Cisco Talos** (opt-in, best-effort, documented as brittle) ```bash export SCOPILOT_THREAT__ABUSEIPDB_API_KEY= -# optional: SCOPILOT_THREAT__OTX_API_KEY, SCOPILOT_THREAT__VIRUSTOTAL_API_KEY - python -m services.threat_daemon.main \ --host syslog.example.com \ --username collector \ @@ -81,181 +179,174 @@ python -m services.threat_daemon.main \ --log-path /var/log/network/syslog ``` -### Run the WebEx bot (optional) +### `webex_bot/` — Approval loop in Cisco WebEx + +Webhook receiver with HMAC-SHA1 verification, adaptive-card builder, and a +thin WebEx HTTP client. Every proposal posts an Approve / Reject card to the +configured operators' room; button clicks drive the proposal state machine +back through `core` (which on approve creates a new `matrix_version`). +Inline `approve ` / `reject ` / `list pending` commands work too. 
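As a rough illustration of the HMAC-SHA1 check mentioned above: WebEx signs each webhook delivery by placing the hex HMAC-SHA1 of the raw request body, keyed with the webhook secret, in the `X-Spark-Signature` header. The function name below is hypothetical and the sketch is not the bot's actual code; it only shows the comparison a receiver would need to make.

```python
import hashlib
import hmac


def verify_webex_signature(secret: str, raw_body: bytes, signature_header: str) -> bool:
    """Return True if the header equals hex(HMAC-SHA1(secret, raw_body))."""
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha1).hexdigest()
    received = signature_header.strip().lower()
    # compare_digest runs in constant time, so a mismatch does not leak
    # how many leading characters of the signature were correct.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

The check has to run against the raw bytes of the request body, before any JSON parsing, because re-serialising the payload can change whitespace and invalidate the signature.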
```bash export SCOPILOT_WEBEX__BOT_ACCESS_TOKEN= export SCOPILOT_WEBEX__WEBHOOK_SECRET= export SCOPILOT_WEBEX__OPERATORS_ROOM_ID= uvicorn services.webex_bot.main:app --port 8001 +# Point your WebEx bot webhook at https:///webhooks/webex ``` -Point your WebEx bot's webhook at `https:///webhooks/webex`. -With these env vars set, every proposal created via the API automatically -posts an adaptive card to the operators' room; clicking Approve / Reject -drives the proposal through the state machine and, on approval, updates -the live matrix. +### `mcp_server/` — MCP server (stdio + HTTP) -## Run the tests +Exposes 14 tools (run lifecycle, SGT dictionary, proposals, threat intel) so +any MCP-aware client — Claude Code, Claude Desktop, LibreChat, custom — can +drive the agent. Two transports on one shared registry: ```bash -pytest +# stdio: Claude Code / Claude Desktop +python -m services.mcp_server.stdio + +# streamable HTTP: LibreChat or any remote MCP client +uvicorn services.mcp_server.http:app --port 8002 ``` -## Project layout +`set_sgt_name` is gated behind `--allow-dictionary-edit`; the rest of the +tool surface is read or proposal-write. + +### `cli/` — `scopilot` Typer CLI + +A thin client of the REST API. Useful for scripting and as the reference +for what the API actually exposes. 
+```bash +export SCOPILOT_API_BASE=http://localhost:8000 +export SCOPILOT_API_TOKEN= # only if auth is on +scopilot health +scopilot sgt set 100 Employees +scopilot run start tests/fixtures/sample.log +scopilot proposal list +scopilot proposal approve ``` -app.py # Streamlit entry point (legacy; refactored to API client in Phase 2) -alembic/ # async SQLAlchemy migrations -src/segmentation_copilot/ - config.py # Pydantic Settings — SCOPILOT_* env vars - parser.py # %RBM-6-SGACLHIT regex parser - aggregator.py # Group events into unique flow tuples - sgt.py # SGT dict load + lookup with missing-name registry - classify.py # Claude-based flow categorisation - contracts.py # Build contracts + render markdown matrix - db.py # legacy sync SQLite; replaced by core/ in Phase 1 - agent.py # Security Analyst system prompt + tool registry - tools.py # legacy AgentState pipeline; replaced by core/services in Phase 1 - core/ - db.py # async SQLAlchemy 2.0 engine + session factory - models/ # ORM (orm.py) + Pydantic domain models (domain.py) - repositories/ # async repos: runs, events, classifications, contracts, sgt, proposals, matrix - services/ # orchestration: ingestion, classification, matrix, baseline - sources/ - base.py # LogSource abstract base - local.py # Local file backend - ssh.py # Paramiko-based SSH backend -tests/ # pytest suite + fixtures -data/ # SQLite db + uploads (gitignored) + +### `app.py` — Streamlit UI + +Pure HTTP client of the API. Configure `SCOPILOT_API_BASE` and optionally a +bearer token, then drive runs / approvals from the browser. + +```bash +SCOPILOT_API_BASE=http://localhost:8000 streamlit run app.py ``` -## Production-readiness roadmap - -The plan in `/root/.claude/plans/i-would-like-to-sparkling-owl.md` decomposes the -project into six phases. **Phase 1 (this PR)** lands the foundation: - -- Centralized Pydantic `Settings` (`config.py`) consuming `SCOPILOT_*` env vars. 
-- Async SQLAlchemy 2.0 + Alembic — supports SQLite (dev) and Postgres (prod). -- ORM and Pydantic domain models with `tenant_id` on every tenant-scoped table. -- Repository layer (`core/repositories/`) with idempotent upserts and the - optimistic-lock proposal `decide()` SQL. -- Service layer (`core/services/`) wrapping the existing pure-function pipeline - (`parser`, `aggregator`, `classify`, `contracts`) with persistence and a - recent-flow classification cache. -- Schema includes `proposals`, `proposal_audit`, `matrix_versions`, - `threat_lookups`, `audit_events` so Phases 3 and 5 can land additively. - -**Phase 6 (this PR)** adds: - -- `services/mcp_server/` — MCP server exposing 14 tools (runs, SGT, - proposals, threat intel). Two transports on one shared registry: - stdio (`python -m services.mcp_server.stdio`) for Claude Code / - Desktop, and streamable HTTP for LibreChat / remote clients. - `set_sgt_name` is gated by `--allow-dictionary-edit`. -- `deploy/Dockerfile` — multi-stage; one image serves every role - (api / worker / scheduler / mcp / threat-daemon / webex-bot / ui). - Non-root, read-only-rootfs friendly. -- `deploy/docker-compose.yml` — full stack (postgres + redis + every - service) behind Compose profiles for the optional ones (webex, - threat, ui). -- `deploy/k8s/base/` — kustomize base with Deployments + Services + - HPAs + PodDisruptionBudget for the scheduler + Ingress + an example - NetworkPolicy stack. Migration runs as a pre-install Job / - argocd-sync-wave -10. -- `core/observability/` — JSON structured logs + Prometheus metrics - (counters for flow_unknown / classifications / proposals / - threat_lookups). API exposes `/metrics`. -- `.github/workflows/ci.yml` — ruff lint, Alembic migration on a real - Postgres service container, `pytest` against the full matrix, - Docker build + push to GHCR on `main`, Trivy scan. 
- -**Phase 5** added: - -- `core/threat/` — pluggable threat-intelligence layer with a - `ThreatIntelClient` Protocol and four implementations: - - **AbuseIPDB** (primary; free tier 1000/day) - - **AlienVault OTX** (pulse-count scoring) - - **VirusTotal** (engine-stats normalisation) - - **Talos** (opt-in best-effort; documented as brittle, never the - only signal) -- `ThreatAggregator` runs them in parallel with a per-call timeout and - applies the decision policy (score ≥ threshold OR two-provider - category consensus). Per-provider failures don't poison the decision. -- Redis caching (6h clean / 24h malicious / 1h 404) backed by a - `threat_lookups` audit table. -- `core/sources/streaming.py` — `StreamingLogSource` Protocol + - `InMemoryStreamingSource` for tests/dev. -- `core/sources/streaming_ssh.py` — production `asyncssh`-based tail - with `tail -F`, exponential-backoff reconnect, heartbeat markers, and - resilience against log rotation. -- `services/threat_daemon/` — ties everything together: tail → parse → - IP lookup → on malicious verdict, publish `events.flow.unknown` with - `trigger="threat"` and the full verdict trail attached. Reuses the - Phase-4 worker pipeline downstream — same classification, same - proposal flow, same WebEx approval. - -**Phase 4** added: - -- `core/events/` — `EventBus` Protocol with two implementations: - Redis Streams for production (consumer groups, at-least-once, idempotency - dedup via `SET NX`) and an in-memory bus that's contract-equivalent - for tests / single-process dev. Picked automatically from - `SCOPILOT_REDIS__URL`. -- `services/worker/` — proactive autonomy service: - - **Scheduler** (`--role scheduler`) — Redis-leader-elected; only the - elected replica fires the cron, so multiple instances are safe. - At every tick, loads new `flow_events` since the last cursor, - diffs against the latest approved `matrix_version`, and publishes - `events.flow.unknown` for every uncovered tuple. 
- - **Worker** (`--role worker`) — consumes `events.flow.unknown`, - classifies via Claude (honouring the 7-day classification cache so - re-seen flows don't pay LLM cost), and turns the verdict into a - rule proposal (deny for harmful / business_irrelevant; permit - otherwise). The existing storm-collapse logic keeps a misconfigured - syslog source from drowning operators. -- Per-tenant scan cursor in Redis (28-day TTL); on Redis loss the worst - outcome is re-classifying the last few days — absorbed by the cache. -- Bus idempotency keys make scan ticks safe to retry. - -**Phase 3** added: - -- `core/services/proposal.py` — full proposal state machine - (`pending → notified → {approved | rejected | expired}`, - `approved → applied | failed`) with **idempotency** (same shape returns - the existing row) and **storm collapse** (multiple proposals for the - same `(src_sgt, dst_sgt)` merge into one). -- On approval, the service creates an immutable `matrix_version` whose - `parent_id` chains back to the previous baseline — rollback is a pointer - flip. -- `core/services/notifier.py` — pluggable sink fan-out so future channels - (Slack, Teams, email) drop in without touching call sites. -- `services/webex_bot/` — FastAPI service with HMAC-SHA1 webhook - verification, an adaptive-card builder, and a WebEx HTTP client. Drives - the approve/reject loop end-to-end and handles operator races gracefully. -- API `POST /v1/proposals` now fans out to the notifier via - `BackgroundTasks`; `POST /v1/proposals/{id}/decision` goes through the - state machine. +--- ## Deploy +### Docker Compose (single host) + ```bash -# Local: full stack on docker-compose docker compose -f deploy/docker-compose.yml up -d -docker compose -f deploy/docker-compose.yml --profile ui --profile webex up -d +``` + +Brings up Postgres + Redis + migrate + api + worker + scheduler + mcp-http. +Optional services behind profiles: `ui`, `webex`, `threat`. 
+ +### Kubernetes (kustomize) -# Kubernetes +```bash kubectl apply -k deploy/k8s/base ``` -Both targets need at least `SCOPILOT_ANTHROPIC__API_KEY`; threat-intel -and WebEx-bot keys are optional. In production, source secrets via -External Secrets Operator rather than the example `Secret`. +`deploy/k8s/base/` ships Deployments + Services + HPAs + a leader-friendly +`PodDisruptionBudget` for the scheduler + `cert-manager`-aware Ingress for +`api`, `mcp`, `ui`, and `webex` hosts. Postgres + Redis are single-replica +StatefulSets — for production, point `SCOPILOT_DB__URL` at a managed +instance and remove the StatefulSet. An example `NetworkPolicy` stack +(default-deny + per-service allows) lives in `networkpolicy.yaml`. + +Pods run non-root (uid 10001), `readOnlyRootFilesystem`, all capabilities +dropped, `seccomp: RuntimeDefault`. Namespace enforces Pod Security +Admission `restricted`. Replace `secret-example.yaml` with an +`ExternalSecret` backed by Vault / cloud SM in production. -## Apply migrations +### Migrations ```bash alembic upgrade head ``` -Override the database URL via `SCOPILOT_DB__URL` (see `.env.example`). +Driven by `SCOPILOT_DB__URL` (async) and `SCOPILOT_DB__SYNC_URL` (sync; +auto-derived if unset). The K8s base runs migrations as a pre-install Job. 
+
+---
+
+## Configuration
+
+Every setting is an environment variable prefixed with `SCOPILOT_`; nested
+settings use `__` as the delimiter:
+
+| Group | Examples |
+|-------|----------|
+| Core | `SCOPILOT_ENVIRONMENT`, `SCOPILOT_LOG_LEVEL`, `SCOPILOT_LOG_FORMAT`, `SCOPILOT_DEFAULT_TENANT_ID` |
+| Database | `SCOPILOT_DB__URL`, `SCOPILOT_DB__SYNC_URL`, `SCOPILOT_DB__ECHO` |
+| Redis / event bus | `SCOPILOT_REDIS__URL` (set to `memory://` for single-process dev) |
+| Anthropic | `SCOPILOT_ANTHROPIC__API_KEY`, `SCOPILOT_ANTHROPIC__MODEL` |
+| API | `SCOPILOT_API__REQUIRE_AUTH`, `SCOPILOT_API__API_KEYS` (JSON list) |
+| Scheduler | `SCOPILOT_SCHED__SCAN_INTERVAL_MINUTES`, `SCOPILOT_SCHED__CLASSIFICATION_CACHE_DAYS` |
+| Threat intel | `SCOPILOT_THREAT__ABUSEIPDB_API_KEY`, `SCOPILOT_THREAT__OTX_API_KEY`, `SCOPILOT_THREAT__VIRUSTOTAL_API_KEY`, `SCOPILOT_THREAT__TALOS_ENABLED` |
+| WebEx | `SCOPILOT_WEBEX__BOT_ACCESS_TOKEN`, `SCOPILOT_WEBEX__WEBHOOK_SECRET`, `SCOPILOT_WEBEX__OPERATORS_ROOM_ID` |
+
+See `.env.example` for the full surface with defaults.
+
+---
+
+## Tests
+
+```bash
+pip install -e ".[dev,api,worker,webex,mcp,cli,sources]"
+pytest
+```
+
+The suite is hermetic: it makes no live Anthropic / WebEx / threat-feed
+calls. CI also runs Alembic migrations against a real Postgres service
+container.
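The `SCOPILOT_` prefix and `__` nesting convention from the Configuration section can be pictured with a stdlib-only sketch. The project itself reads these variables through Pydantic Settings (see `config.py`); the `load_settings` helper below is a hypothetical stand-in that only illustrates how variable names map to nested groups, and the database URL in the usage note is an illustrative value, not a documented default.

```python
def load_settings(environ: dict, prefix: str = "SCOPILOT_", delimiter: str = "__") -> dict:
    """Fold PREFIX + GROUP__KEY variables into a nested dict with lowercase keys.

    Assumes no variable is both a scalar and a group name
    (e.g. SCOPILOT_DB alongside SCOPILOT_DB__URL).
    """
    settings: dict = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue  # unrelated variables such as HOME are ignored
        path = name[len(prefix):].lower().split(delimiter)
        node = settings
        for part in path[:-1]:
            node = node.setdefault(part, {})  # descend, creating groups as needed
        node[path[-1]] = value
    return settings
```

Called with `{"SCOPILOT_LOG_LEVEL": "INFO", "SCOPILOT_DB__URL": "sqlite+aiosqlite:///data/dev.db"}`, the helper yields a top-level `log_level` key and a `db` group containing `url`, mirroring the Core and Database rows of the table.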
+
+---
+
+## Project layout
+
+```
+app.py                       # Streamlit UI (pure HTTP client of api/)
+alembic/                     # async SQLAlchemy migrations
+deploy/
+  Dockerfile                 # multi-stage; one image serves every role
+  docker-compose.yml         # full stack with optional profiles
+  k8s/base/                  # kustomize Deployments + Services + HPAs + Ingress
+.github/workflows/ci.yml     # ruff + pytest + Postgres + Docker + Trivy
+services/
+  api/                       FastAPI REST + /metrics
+  cli/                       scopilot Typer client
+  worker/                    scheduler + flow-unknown consumer (leader-elected)
+  threat_daemon/             real-time syslog tail + threat-intel lookup
+  webex_bot/                 adaptive-card approval loop
+  mcp_server/                stdio + HTTP MCP server
+src/segmentation_copilot/
+  config.py                  # Pydantic Settings (SCOPILOT_* env vars)
+  parser.py                  # %RBM-6-SGACLHIT regex parser
+  aggregator.py              # group events into unique flow tuples
+  sgt.py                     # SGT id→name dictionary + missing-id registry
+  classify.py                # Claude-driven flow classification
+  contracts.py               # build contracts + render markdown matrix
+  sources/
+    local.py / ssh.py        # fetch-by-window sources
+    streaming.py             # StreamingLogSource Protocol
+    streaming_ssh.py         # asyncssh tail with reconnect + heartbeat
+  core/
+    db.py                    # async SQLAlchemy 2.0 engine + session
+    models/                  # ORM (orm.py) + Pydantic domain models
+    repositories/            # async repos (runs, events, classifications,
+                             # contracts, sgt, proposals, matrix)
+    services/                # ingestion / classification / matrix / baseline
+                             # proposal state machine / notifier fan-out
+    events/                  # EventBus Protocol + Redis Streams + InMemory
+    threat/                  # ThreatIntelClient Protocol + 4 providers + aggregator
+    observability/           # JSON logs + Prometheus counters
+tests/                       # pytest suite + fixtures (79 tests)
+data/                        # SQLite db + uploads (gitignored)
+```