From f46be94f457304c0a58c3668e5c3c3775684c0d3 Mon Sep 17 00:00:00 2001
From: Claude <noreply@anthropic.com>
Date: Fri, 22 May 2026 21:47:09 +0000
Subject: [PATCH] Rewrite README as coherent product doc
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Drops the iterative phase-by-phase framing (relevant during the
refactor, confusing now that everything's merged) and replaces it with
Features → Architecture → Quick start → Per-service runbook → Deploy →
Configuration → Project layout. Adds an ASCII architecture diagram and
a dedicated section per microservice with run commands; fixes the layout
tree which was missing services/, deploy/, core/events, core/threat,
core/observability, and .github/workflows.

https://claude.ai/code/session_01THsbGHdqjcvJeWUwrzZtp8
---
 README.md | 475 ++++++++++++++++++++++++++++++++----------------------
 1 file changed, 283 insertions(+), 192 deletions(-)

diff --git a/README.md b/README.md
index d4cdf5b..5a55fb2 100644
--- a/README.md
+++ b/README.md
@@ -1,79 +1,177 @@
 # segmentation-copilot
 
-An AI Security Analyst agent that turns Cisco SD-Access SG-ACL hit logs into a TrustSec contract matrix proposal — designed to help networks switch the matrix default rule to `deny-ip` without breaking legitimate flows.
+An AI Security Analyst for **Cisco TrustSec** networks. It watches SG-ACL hit
+logs, proposes the explicit `permit` / `deny` contracts the matrix needs, and
+has a human operator review every change before it touches the live matrix.
+The goal is a safe path to flipping the matrix default to `deny-ip` without
+breaking legitimate flows.
+
+The agent runs both **on demand** (operator asks for a one-off analysis) and
+**autonomously** (a scheduler scans for unknown flows; a daemon reacts to
+threat-flagged destinations in real time). Operators can drive it from the
+Streamlit UI, a CLI, any MCP-aware chat client (Claude Code, Claude Desktop,
+LibreChat, …), or a WebEx bot — all backed by the same FastAPI service.
+
+---
+
+## Features
+
+- **Interactive analysis.** Upload syslog, pick a time window, get a proposed
+  TrustSec contract matrix as markdown / CSV.
+- **Proactive autonomy.** A cron-driven scheduler scans the syslog backlog
+  on a configurable interval, finds flows not covered by the current
+  baseline, and proposes the rules that would let them through.
+- **Reactive autonomy.** A real-time syslog tail runs every destination IP
+  through a pluggable threat-intel layer (AbuseIPDB / OTX / VirusTotal /
+  Talos) and proposes a deny rule the moment a malicious flow is seen.
+- **Human-gated approvals.** Every proposed rule lands as a WebEx adaptive
+  card (or in the UI / CLI / MCP); nothing changes the live matrix until
+  an operator clicks Approve. Approval creates an immutable
+  `matrix_version`; rollback is a pointer flip.
+- **Multi-client.** Streamlit UI, `scopilot` CLI, MCP server (stdio + HTTP),
+  WebEx bot, REST API — all reach the same agent.
+- **Production-ready.** Microservices, Docker / Kubernetes deployment,
+  Postgres + Redis, JSON logs, Prometheus metrics, NetworkPolicies, CI/CD.
+
+---
+
+## Architecture
 
-## What it does
+```
+                  External clients
+   ┌──────────┬─────────┬───────────┬──────────┬───────────┐
+   │   CLI    │ Claude  │ LibreChat │ Streamlit│  WebEx    │
+   │ scopilot │  UI     │           │   UI     │   Bot     │
+   └─────┬────┴────┬────┴─────┬─────┴────┬─────┴─────┬─────┘
+         │ REST    │ MCP      │ MCP/HTTP │ REST      │ webhooks
+         ▼         ▼          ▼          ▼           ▼
+    ┌──────────────────┐  ┌──────────────────┐  ┌────────────┐
+    │   api/ FastAPI   │  │  mcp_server/     │  │ webex_bot/ │
+    │  OIDC-ready auth │  │  stdio + HTTP    │  │ HMAC verify│
+    └────────┬─────────┘  └────────┬─────────┘  └─────┬──────┘
+             └─────────────┬───────┴──────────────────┘
+                           ▼
+              ┌────────────────────────┐
+              │ core/ (shared library) │
+              │  services · repos      │
+              │  events · threat intel │
+              │  sources (stream + win)│
+              └────┬──────────┬────────┘
+                   │          │
+   ┌───────────────┘          └───────────────┐
+   ▼                                          ▼
+┌──────────────┐   Redis Streams       ┌────────────────┐
+│ worker/      │◄──── events.* ───────►│ threat_daemon/ │
+│ scheduler    │      consumer groups  │ asyncssh tail  │
+│  + consumer  │                       │  + intel lookup│
+└──────┬───────┘                       └────────┬───────┘
+       │                                        │
+       └────────────────┬───────────────────────┘
+                        ▼
+               ┌────────────────┐
+               │  PostgreSQL    │  (Redis: cache + Streams)
+               │  via Alembic   │
+               └────────────────┘
+```
+
+Each service under the clients runs as its own horizontally scalable
+microservice; `core/` is not a service but the shared library that every
+microservice imports.
 
-1. Asks for a syslog source (local file or SSH to a syslog collector), an analysis window, and an SGT/DGT id→name dictionary.
-2. Pulls and parses `%RBM-6-SGACLHIT` syslog entries from the configured source.
-3. Aggregates the raw events into unique flow tuples (`sgt`, `dgt`, `protocol`, `src_port`, `dst_port`).
-4. Uses Claude to classify each flow as **business_relevant**, **default**, **business_irrelevant**, or **harmful**.
-5. Groups classified flows into one **contract** per (Source SGT, Destination SGT) pair, with one or more ACEs each.
-6. Renders the matrix as a markdown table and persists runs to SQLite.
+---
 
-The matrix-wide default rule remains `deny-ip` — the agent emits only the explicit permits (and selective denies for visibility into Business Irrelevant / Harmful flows).
+## Quick start
 
-## Install
+### Run everything with Docker Compose
 
 ```bash
-pip install -e ".[dev]"
+export SCOPILOT_ANTHROPIC__API_KEY=sk-ant-...
+docker compose -f deploy/docker-compose.yml up -d
+# api on :8000, mcp-http on :8002, worker + scheduler in the background
 ```
 
-## Run the stack
+Optional services live behind Compose profiles:
+
+```bash
+docker compose -f deploy/docker-compose.yml --profile ui --profile webex up -d
+```
 
-Phase 2 splits the app in two: a FastAPI service holds the agent + DB,
-and Streamlit / the CLI talk to it over HTTP.
+### Local dev without Docker
 
 ```bash
-# Start the API
+pip install -e ".[dev,api,worker,webex,mcp,cli,sources,ui]"
 export SCOPILOT_ANTHROPIC__API_KEY=sk-ant-...
-export SCOPILOT_API__REQUIRE_AUTH=false           # dev only
-uvicorn services.api.main:app --reload
+export SCOPILOT_REDIS__URL=memory://       # in-memory bus, no Redis needed
+export SCOPILOT_API__REQUIRE_AUTH=false    # dev only
 
-# Streamlit UI (in another terminal)
-streamlit run app.py
-# or use the CLI
-scopilot --help
-scopilot health
-scopilot sgt set 100 Employees
-scopilot run start tests/fixtures/sample.log
+# Apply migrations (SQLite by default — set SCOPILOT_DB__URL for Postgres)
+alembic upgrade head
+
+# Then start the bits you need:
+uvicorn services.api.main:app --reload          # REST + /metrics
+python -m services.worker.main --role worker    # consumes events.flow.unknown
+python -m services.worker.main --role scheduler # cron tick
+streamlit run app.py                            # UI
+scopilot --help                                 # CLI
 ```
 
-When `SCOPILOT_API__REQUIRE_AUTH=true` (prod default), set
-`SCOPILOT_API__API_KEYS=["<token>"]` and pass it via
-`Authorization: Bearer <token>` (or `SCOPILOT_API_TOKEN=<token>` for the CLI).
+---
 
-### Run the scheduler + worker (proactive autonomy)
+## Services
+
+Each service is a separate process / container. They all import from the same
+`core/` library, so a bug fix in the agent's pipeline lands everywhere at once.
+
+### `api/` — FastAPI REST + `/metrics`
+
+The control plane. Exposes runs, ingest, classify, matrix, SGT dictionary,
+proposals, healthz/readyz, and Prometheus `/metrics`. Bearer-token auth via
+`SCOPILOT_API__API_KEYS`; the auth dependency is shaped so a JWKS-based OIDC
+verifier can replace it later. Streamlit, the CLI, and the WebEx bot all talk
+through this service.
 
 ```bash
-# Required for production; for dev, set SCOPILOT_REDIS__URL=memory:// to
-# run everything single-process against the in-memory bus.
-export SCOPILOT_REDIS__URL=redis://localhost:6379/0
+uvicorn services.api.main:app --host 0.0.0.0 --port 8000
+```
 
-# One process drives the cron + Redis leader election.
-python -m services.worker.main --role scheduler
+### `worker/` — Scheduler + flow-unknown consumer
+
+Two roles in one binary:
+
+- **`--role scheduler`** runs a cron loop. Leader election through Redis lets
+  only the elected replica fire, so multiple replicas are safe. At every tick
+  it loads new `flow_events` since the per-tenant cursor, diffs them against
+  the latest approved `matrix_version`, and publishes `events.flow.unknown`
+  for every uncovered tuple.
+- **`--role worker`** consumes `events.flow.unknown` in a consumer group,
+  classifies the flow via Claude (a 7-day classification cache spares re-seen
+  flows the LLM cost), and turns the verdict into a rule proposal: `deny` for
+  harmful / business_irrelevant, `permit` otherwise. Storm collapse merges
+  proposals per SGT pair so a misconfigured source cannot flood operators.
 
-# One or more processes consume events.flow.unknown.
+```bash
+python -m services.worker.main --role scheduler
 python -m services.worker.main --role worker
 ```
 
-When the scheduler detects a flow not covered by the latest approved
-matrix it publishes `events.flow.unknown`; the worker picks it up,
-classifies via Claude, and creates a rule proposal that lands in WebEx
-(or any other notifier sink). Operators approve/reject via the existing
-Phase-3 flow.
+### `threat_daemon/` — Real-time SSH tail + threat intel
 
-### Run the threat daemon (reactive autonomy, optional)
+Tails a syslog file such as `/var/log/network/syslog` over SSH (`asyncssh`
+running `tail -F`, with exponential-backoff reconnect, heartbeat markers, and
+resilience to log rotation). Each destination IP goes through the pluggable
+threat-intel layer; a malicious verdict publishes `events.flow.unknown` with
+`trigger="threat"` and the full verdict trail. The event then joins the same
+worker pipeline as the scheduler's: same classification, proposals, approvals.
 
-Tails the syslog stream in real time and on a malicious destination IP
-publishes `events.flow.unknown` with `trigger="threat"` — the same
-worker pipeline classifies the flow and posts a proposal. Requires at
-least one threat-intel provider configured.
+Providers (mix and match — at least one required):
+
+- **AbuseIPDB** (primary; free tier, 1000 lookups/day)
+- **AlienVault OTX**
+- **VirusTotal** (strict free-tier rate limit; lookups cached aggressively)
+- **Cisco Talos** (opt-in, best-effort, brittle; never the only signal)
 
 ```bash
 export SCOPILOT_THREAT__ABUSEIPDB_API_KEY=<key>
-# optional: SCOPILOT_THREAT__OTX_API_KEY, SCOPILOT_THREAT__VIRUSTOTAL_API_KEY
-
 python -m services.threat_daemon.main \
     --host syslog.example.com \
     --username collector \
@@ -81,181 +179,174 @@ python -m services.threat_daemon.main \
     --log-path /var/log/network/syslog
 ```
 
-### Run the WebEx bot (optional)
+### `webex_bot/` — Approval loop in Cisco WebEx
+
+Webhook receiver with HMAC-SHA1 verification, adaptive-card builder, and a
+thin WebEx HTTP client. Every proposal posts an Approve / Reject card to the
+configured operators' room; button clicks drive the proposal state machine
+back through `core` (which on approve creates a new `matrix_version`).
+Inline `approve <id>` / `reject <id>` / `list pending` commands work too.
 
 ```bash
 export SCOPILOT_WEBEX__BOT_ACCESS_TOKEN=<bot token>
 export SCOPILOT_WEBEX__WEBHOOK_SECRET=<hmac secret>
 export SCOPILOT_WEBEX__OPERATORS_ROOM_ID=<room id>
 uvicorn services.webex_bot.main:app --port 8001
+# Point your WebEx bot webhook at https://<host>/webhooks/webex
 ```
 
-Point your WebEx bot's webhook at `https://<bot host>/webhooks/webex`.
-With these env vars set, every proposal created via the API automatically
-posts an adaptive card to the operators' room; clicking Approve / Reject
-drives the proposal through the state machine and, on approval, updates
-the live matrix.
+### `mcp_server/` — MCP server (stdio + HTTP)
 
-## Run the tests
+Exposes 14 tools (run lifecycle, SGT dictionary, proposals, threat intel) so
+any MCP-aware client — Claude Code, Claude Desktop, LibreChat, custom — can
+drive the agent. Two transports on one shared registry:
 
 ```bash
-pytest
+# stdio: Claude Code / Claude Desktop
+python -m services.mcp_server.stdio
+
+# streamable HTTP: LibreChat or any remote MCP client
+uvicorn services.mcp_server.http:app --port 8002
 ```
 
-## Project layout
+`set_sgt_name` is gated behind `--allow-dictionary-edit`; every other tool
+either reads state or writes proposals.
+
+### `cli/` — `scopilot` Typer CLI
+
+A thin client of the REST API. Useful for scripting and as the reference
+for what the API actually exposes.
 
+```bash
+export SCOPILOT_API_BASE=http://localhost:8000
+export SCOPILOT_API_TOKEN=<bearer>            # only if auth is on
+scopilot health
+scopilot sgt set 100 Employees
+scopilot run start tests/fixtures/sample.log
+scopilot proposal list
+scopilot proposal approve <id>
 ```
-app.py                          # Streamlit entry point (legacy; refactored to API client in Phase 2)
-alembic/                        # async SQLAlchemy migrations
-src/segmentation_copilot/
-    config.py                   # Pydantic Settings — SCOPILOT_* env vars
-    parser.py                   # %RBM-6-SGACLHIT regex parser
-    aggregator.py               # Group events into unique flow tuples
-    sgt.py                      # SGT dict load + lookup with missing-name registry
-    classify.py                 # Claude-based flow categorisation
-    contracts.py                # Build contracts + render markdown matrix
-    db.py                       # legacy sync SQLite; replaced by core/ in Phase 1
-    agent.py                    # Security Analyst system prompt + tool registry
-    tools.py                    # legacy AgentState pipeline; replaced by core/services in Phase 1
-    core/
-        db.py                   # async SQLAlchemy 2.0 engine + session factory
-        models/                 # ORM (orm.py) + Pydantic domain models (domain.py)
-        repositories/           # async repos: runs, events, classifications, contracts, sgt, proposals, matrix
-        services/               # orchestration: ingestion, classification, matrix, baseline
-    sources/
-        base.py                 # LogSource abstract base
-        local.py                # Local file backend
-        ssh.py                  # Paramiko-based SSH backend
-tests/                          # pytest suite + fixtures
-data/                           # SQLite db + uploads (gitignored)
+
+### `app.py` — Streamlit UI
+
+Pure HTTP client of the API. Configure `SCOPILOT_API_BASE` and optionally a
+bearer token, then drive runs / approvals from the browser.
+
+```bash
+SCOPILOT_API_BASE=http://localhost:8000 streamlit run app.py
 ```
 
-## Production-readiness roadmap
-
-The plan in `/root/.claude/plans/i-would-like-to-sparkling-owl.md` decomposes the
-project into six phases. **Phase 1 (this PR)** lands the foundation:
-
-- Centralized Pydantic `Settings` (`config.py`) consuming `SCOPILOT_*` env vars.
-- Async SQLAlchemy 2.0 + Alembic — supports SQLite (dev) and Postgres (prod).
-- ORM and Pydantic domain models with `tenant_id` on every tenant-scoped table.
-- Repository layer (`core/repositories/`) with idempotent upserts and the
-  optimistic-lock proposal `decide()` SQL.
-- Service layer (`core/services/`) wrapping the existing pure-function pipeline
-  (`parser`, `aggregator`, `classify`, `contracts`) with persistence and a
-  recent-flow classification cache.
-- Schema includes `proposals`, `proposal_audit`, `matrix_versions`,
-  `threat_lookups`, `audit_events` so Phases 3 and 5 can land additively.
-
-**Phase 6 (this PR)** adds:
-
-- `services/mcp_server/` — MCP server exposing 14 tools (runs, SGT,
-  proposals, threat intel). Two transports on one shared registry:
-  stdio (`python -m services.mcp_server.stdio`) for Claude Code /
-  Desktop, and streamable HTTP for LibreChat / remote clients.
-  `set_sgt_name` is gated by `--allow-dictionary-edit`.
-- `deploy/Dockerfile` — multi-stage; one image serves every role
-  (api / worker / scheduler / mcp / threat-daemon / webex-bot / ui).
-  Non-root, read-only-rootfs friendly.
-- `deploy/docker-compose.yml` — full stack (postgres + redis + every
-  service) behind Compose profiles for the optional ones (webex,
-  threat, ui).
-- `deploy/k8s/base/` — kustomize base with Deployments + Services +
-  HPAs + PodDisruptionBudget for the scheduler + Ingress + an example
-  NetworkPolicy stack. Migration runs as a pre-install Job /
-  argocd-sync-wave -10.
-- `core/observability/` — JSON structured logs + Prometheus metrics
-  (counters for flow_unknown / classifications / proposals /
-  threat_lookups). API exposes `/metrics`.
-- `.github/workflows/ci.yml` — ruff lint, Alembic migration on a real
-  Postgres service container, `pytest` against the full matrix,
-  Docker build + push to GHCR on `main`, Trivy scan.
-
-**Phase 5** added:
-
-- `core/threat/` — pluggable threat-intelligence layer with a
-  `ThreatIntelClient` Protocol and four implementations:
-  - **AbuseIPDB** (primary; free tier 1000/day)
-  - **AlienVault OTX** (pulse-count scoring)
-  - **VirusTotal** (engine-stats normalisation)
-  - **Talos** (opt-in best-effort; documented as brittle, never the
-    only signal)
-- `ThreatAggregator` runs them in parallel with a per-call timeout and
-  applies the decision policy (score ≥ threshold OR two-provider
-  category consensus). Per-provider failures don't poison the decision.
-- Redis caching (6h clean / 24h malicious / 1h 404) backed by a
-  `threat_lookups` audit table.
-- `core/sources/streaming.py` — `StreamingLogSource` Protocol +
-  `InMemoryStreamingSource` for tests/dev.
-- `core/sources/streaming_ssh.py` — production `asyncssh`-based tail
-  with `tail -F`, exponential-backoff reconnect, heartbeat markers, and
-  resilience against log rotation.
-- `services/threat_daemon/` — ties everything together: tail → parse →
-  IP lookup → on malicious verdict, publish `events.flow.unknown` with
-  `trigger="threat"` and the full verdict trail attached. Reuses the
-  Phase-4 worker pipeline downstream — same classification, same
-  proposal flow, same WebEx approval.
-
-**Phase 4** added:
-
-- `core/events/` — `EventBus` Protocol with two implementations:
-  Redis Streams for production (consumer groups, at-least-once, idempotency
-  dedup via `SET NX`) and an in-memory bus that's contract-equivalent
-  for tests / single-process dev. Picked automatically from
-  `SCOPILOT_REDIS__URL`.
-- `services/worker/` — proactive autonomy service:
-  - **Scheduler** (`--role scheduler`) — Redis-leader-elected; only the
-    elected replica fires the cron, so multiple instances are safe.
-    At every tick, loads new `flow_events` since the last cursor,
-    diffs against the latest approved `matrix_version`, and publishes
-    `events.flow.unknown` for every uncovered tuple.
-  - **Worker** (`--role worker`) — consumes `events.flow.unknown`,
-    classifies via Claude (honouring the 7-day classification cache so
-    re-seen flows don't pay LLM cost), and turns the verdict into a
-    rule proposal (deny for harmful / business_irrelevant; permit
-    otherwise). The existing storm-collapse logic keeps a misconfigured
-    syslog source from drowning operators.
-- Per-tenant scan cursor in Redis (28-day TTL); on Redis loss the worst
-  outcome is re-classifying the last few days — absorbed by the cache.
-- Bus idempotency keys make scan ticks safe to retry.
-
-**Phase 3** added:
-
-- `core/services/proposal.py` — full proposal state machine
-  (`pending → notified → {approved | rejected | expired}`,
-  `approved → applied | failed`) with **idempotency** (same shape returns
-  the existing row) and **storm collapse** (multiple proposals for the
-  same `(src_sgt, dst_sgt)` merge into one).
-- On approval, the service creates an immutable `matrix_version` whose
-  `parent_id` chains back to the previous baseline — rollback is a pointer
-  flip.
-- `core/services/notifier.py` — pluggable sink fan-out so future channels
-  (Slack, Teams, email) drop in without touching call sites.
-- `services/webex_bot/` — FastAPI service with HMAC-SHA1 webhook
-  verification, an adaptive-card builder, and a WebEx HTTP client. Drives
-  the approve/reject loop end-to-end and handles operator races gracefully.
-- API `POST /v1/proposals` now fans out to the notifier via
-  `BackgroundTasks`; `POST /v1/proposals/{id}/decision` goes through the
-  state machine.
+---
 
 ## Deploy
 
+### Docker Compose (single host)
+
 ```bash
-# Local: full stack on docker-compose
 docker compose -f deploy/docker-compose.yml up -d
-docker compose -f deploy/docker-compose.yml --profile ui --profile webex up -d
+```
+
+Brings up Postgres + Redis + migrate + api + worker + scheduler + mcp-http.
+Optional services behind profiles: `ui`, `webex`, `threat`.
+
+### Kubernetes (kustomize)
 
-# Kubernetes
+```bash
 kubectl apply -k deploy/k8s/base
 ```
 
-Both targets need at least `SCOPILOT_ANTHROPIC__API_KEY`; threat-intel
-and WebEx-bot keys are optional. In production, source secrets via
-External Secrets Operator rather than the example `Secret`.
+`deploy/k8s/base/` ships Deployments + Services + HPAs + a leader-friendly
+`PodDisruptionBudget` for the scheduler + `cert-manager`-aware Ingress for
+`api`, `mcp`, `ui`, and `webex` hosts. Postgres + Redis are single-replica
+StatefulSets — for production, point `SCOPILOT_DB__URL` at a managed
+instance and remove the StatefulSet. An example `NetworkPolicy` stack
+(default-deny + per-service allows) lives in `networkpolicy.yaml`.
+
+Pods run as non-root (uid 10001) with `readOnlyRootFilesystem`, all
+capabilities dropped, and `seccomp: RuntimeDefault`; the namespace enforces
+Pod Security Admission `restricted`. In production, replace `secret-example.yaml`
+with an `ExternalSecret` backed by Vault or a cloud secrets manager.
 
-## Apply migrations
+### Migrations
 
 ```bash
 alembic upgrade head
 ```
 
-Override the database URL via `SCOPILOT_DB__URL` (see `.env.example`).
+Migrations read `SCOPILOT_DB__URL` (async) and `SCOPILOT_DB__SYNC_URL` (sync;
+auto-derived if unset). The K8s base runs them as a pre-install Job.
+
+---
+
+## Configuration
+
+Every setting is an environment variable prefixed `SCOPILOT_`; nested settings use `__` as the separator:
+
+| Group | Examples |
+|-------|----------|
+| Core | `SCOPILOT_ENVIRONMENT`, `SCOPILOT_LOG_LEVEL`, `SCOPILOT_LOG_FORMAT`, `SCOPILOT_DEFAULT_TENANT_ID` |
+| Database | `SCOPILOT_DB__URL`, `SCOPILOT_DB__SYNC_URL`, `SCOPILOT_DB__ECHO` |
+| Redis / event bus | `SCOPILOT_REDIS__URL` (set to `memory://` for single-process dev) |
+| Anthropic | `SCOPILOT_ANTHROPIC__API_KEY`, `SCOPILOT_ANTHROPIC__MODEL` |
+| API | `SCOPILOT_API__REQUIRE_AUTH`, `SCOPILOT_API__API_KEYS` (JSON list) |
+| Scheduler | `SCOPILOT_SCHED__SCAN_INTERVAL_MINUTES`, `SCOPILOT_SCHED__CLASSIFICATION_CACHE_DAYS` |
+| Threat intel | `SCOPILOT_THREAT__ABUSEIPDB_API_KEY`, `SCOPILOT_THREAT__OTX_API_KEY`, `SCOPILOT_THREAT__VIRUSTOTAL_API_KEY`, `SCOPILOT_THREAT__TALOS_ENABLED` |
+| WebEx | `SCOPILOT_WEBEX__BOT_ACCESS_TOKEN`, `SCOPILOT_WEBEX__WEBHOOK_SECRET`, `SCOPILOT_WEBEX__OPERATORS_ROOM_ID` |
+
+See `.env.example` for the full surface with defaults.
+
+---
+
+## Tests
+
+```bash
+pip install -e ".[dev,api,worker,webex,mcp,cli,sources]"
+pytest
+```
+
+The suite is hermetic: no live Anthropic / WebEx / threat-feed calls. CI also
+runs Alembic migrations against a real Postgres service container.
+
+---
+
+## Project layout
+
+```
+app.py                              # Streamlit UI (pure HTTP client of api/)
+alembic/                            # async SQLAlchemy migrations
+deploy/
+    Dockerfile                      # multi-stage, one image serves every role
+    docker-compose.yml              # full stack with optional profiles
+    k8s/base/                       # kustomize Deployments + Services + HPAs + Ingress
+.github/workflows/ci.yml            # ruff + pytest + Postgres + Docker + Trivy
+services/
+    api/                            # FastAPI REST + /metrics
+    cli/                            # scopilot Typer client
+    worker/                         # scheduler + flow-unknown consumer (leader-elected)
+    threat_daemon/                  # real-time syslog tail + threat-intel lookup
+    webex_bot/                      # adaptive-card approval loop
+    mcp_server/                     # stdio + HTTP MCP server
+src/segmentation_copilot/
+    config.py                       # Pydantic Settings (SCOPILOT_* env vars)
+    parser.py                       # %RBM-6-SGACLHIT regex parser
+    aggregator.py                   # group events into unique flow tuples
+    sgt.py                          # SGT id→name dictionary + missing-id registry
+    classify.py                     # Claude-driven flow classification
+    contracts.py                    # build contracts + render markdown matrix
+    sources/
+        local.py / ssh.py           # fetch-by-window sources
+        streaming.py                # StreamingLogSource Protocol
+        streaming_ssh.py            # asyncssh tail with reconnect + heartbeat
+    core/
+        db.py                       # async SQLAlchemy 2.0 engine + session
+        models/                     # ORM (orm.py) + Pydantic domain models
+        repositories/               # async repos (runs, events, classifications,
+                                    #              contracts, sgt, proposals, matrix)
+        services/                   # ingestion / classification / matrix / baseline
+                                    # proposal state machine / notifier fan-out
+        events/                     # EventBus Protocol + Redis Streams + InMemory
+        threat/                     # ThreatIntelClient Protocol + 4 providers + aggregator
+        observability/              # JSON logs + Prometheus counters
+tests/                              # pytest suite + fixtures (79 tests)
+data/                               # SQLite db + uploads (gitignored)
+```