Releases: dshapi/AI-SPM

Headline: 3-node Kubernetes (kind) HA deployment by default

05 May 20:31

v1.0.1 — May 2026

This release moves dev from a single-node kind cluster to a
production-shaped HA topology that mirrors the prod target one-for-one.
A single ./deploy/scripts/bootstrap-cluster.sh brings up:

  • 3 control-plane Kubernetes nodes running on Docker Desktop via
    kind. No worker nodes — control-plane taints lifted on dev so
    application pods can schedule cluster-wide.
  • CNPG (CloudNativePG) Postgres with 1 primary + 2 replicas,
    automatic failover, streaming replication, and a spm-db-rw
    Service that always points to the current primary.
  • Bitnami Redis Sentinel with 3 nodes for cache + session HA.
  • MinIO distributed object storage (4 servers, 4 disks) backing
    Flink checkpoints and other large blobs. Replaces Longhorn.
  • Istio service mesh with sidecar injection, mTLS PeerAuthentication,
    AuthorizationPolicies per service, and a single ingress gateway at
    https://aispm.local (port 443, browser-trusted via mkcert).
  • gVisor (runsc) RuntimeClass as the default for customer-uploaded
    agent pods — per-pod kernel sandboxing for untrusted code.
  • Local container registry at localhost:5001, mirrored into every
    kind node so dev image rebuilds land in seconds.

The chart applies in 6 phased tiers (infra → data → data-init →
platform → compute → frontend), serially, with auto-recovery from
immutable-PVC errors, failed Jobs, and stale registry state.
A single canonical seeder (scripts/seed_all.py) populates models,
posture history, integrations, cases, alerts, and policies via one
idempotent K8s Job.
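
The serial, phased apply described above can be sketched roughly as follows. This is an illustration only, not the bootstrap script's actual logic: `apply_phases`, `apply_phase`, and the retry count are hypothetical; only the six tier names come from these notes.

```python
from typing import Callable, List

# Tier names as listed in the release notes; everything else is hypothetical.
PHASES = ["infra", "data", "data-init", "platform", "compute", "frontend"]


def apply_phases(apply_phase: Callable[[str], None], max_attempts: int = 2) -> List[str]:
    """Apply each tier in order, one at a time, retrying a failed tier.

    `apply_phase` stands in for whatever actually installs a tier.
    Returns the tiers applied, in order; re-raises if a tier keeps failing.
    """
    applied = []
    for phase in PHASES:
        for attempt in range(1, max_attempts + 1):
            try:
                apply_phase(phase)
                break
            except RuntimeError:
                if attempt == max_attempts:
                    raise
                # A real bootstrap would run its recovery step here
                # (for example, deleting a failed Job) before retrying.
        applied.append(phase)
    return applied
```

Because a tier only starts once the previous one has succeeded, a failure in `data` never leaves `platform` half-applied on top of it.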

Bootstrap & operability

  • One entry point: ./deploy/scripts/bootstrap-cluster.sh. Helper
    scripts (kind-cluster.sh, kind-databases-ha.sh,
    install-gvisor.sh, build-images.sh) are now invoked only from
    bootstrap and should never be run directly.
  • Cluster-destroy prompt at the start. FORCE_DESTROY /
    FORCE_KEEP / FORCE_CREATE env overrides for CI.
  • Verbose-by-default: every step prints what it does. xtrace goes to
    /tmp/bootstrap-xtrace-<pid>.log on bash 4.1+.
  • Phased apply runs serially to keep kind-on-Docker load manageable.
  • Auto-recovery from immutable-PVC errors, failed/stuck Jobs, stuck
    pod templates, stale Docker registry containers, and racing
    cert-manager Certificate reconcile loops.
  • .env keys (Postgres password, Anthropic, Tavily, Groq, etc.)
    auto-merge into platform-secrets every bootstrap — no more
    re-entering them in the Integrations UI after a rebuild.
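
As a rough illustration of the .env merge above, the sketch below parses KEY=VALUE lines and folds them into a Secret-style base64 `data` map. The function names and the key names in the usage are hypothetical, and the real bootstrap may handle quoting and precedence differently.

```python
import base64
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes (an assumption about the .env style).
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def merge_into_secret(secret_data: Dict[str, str], env: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of a Secret's base64 `data` map with .env keys merged in.

    Keys from .env overwrite existing entries; other entries are kept.
    """
    merged = dict(secret_data)
    for key, value in env.items():
        merged[key] = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return merged
```

Returning a new map rather than mutating the input keeps the merge safe to repeat on every bootstrap run.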

TLS / WSS

  • Chart's aispm-tls-certificate.yaml now gates on
    ingress.certManager, so cert-manager doesn't fight the
    bootstrap's mkcert Secret in dev.
  • Bootstrap actively repairs the WSS cert chain on every run —
    re-upserts the mkcert Secret, restarts istio-ingressgateway,
    verifies the wire issuer is mkcert development CA. Safari
    WebSocket Secure connections now work end-to-end with no manual
    steps.

Schema drift fixes (alembic 010 / 011 / 012)

  • New agent_kind enum + agents.kind column.
  • agents.risk migrated from risk_level to model_risk_tier;
    model_risk_tier extended with low/medium/critical.
  • agents.policy_status migrated from policy_status enum to
    policy_coverage; values mapped (covered → full).
  • model_provider extended with aws, azure, gcp, internal.
  • All migrations idempotent across fresh, hand-patched, and re-run
    database states.
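
One common way to make an enum extension safe to re-run on Postgres is `ALTER TYPE ... ADD VALUE IF NOT EXISTS`. The helper below is a hypothetical sketch that builds such statements for the `model_provider` values listed above; whether the project's migrations use this exact approach is an assumption.

```python
from typing import List


def enum_add_value_statements(enum_name: str, values: List[str]) -> List[str]:
    """Build Postgres statements that add enum values only if they are missing."""
    statements = []
    for value in values:
        escaped = value.replace("'", "''")  # escape single quotes for SQL literals
        statements.append(
            f"ALTER TYPE {enum_name} ADD VALUE IF NOT EXISTS '{escaped}'"
        )
    return statements


# Values taken from the release notes.
MODEL_PROVIDER_ADDITIONS = enum_add_value_statements(
    "model_provider", ["aws", "azure", "gcp", "internal"]
)
```

In an Alembic migration each statement could be passed to `op.execute`. The `IF NOT EXISTS` clause is what lets the same migration succeed on a fresh database and on one that was already hand-patched.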

Resource sizing

  • spm-api ships with explicit 2Gi memory limits (was inheriting
    the namespace LimitRange's 512Mi and getting OOMKilled every
    ~8 minutes).
  • db-seed Job memory bumped to 2Gi.

macOS dev caveat

gVisor's runsc doesn't work on Docker Desktop's LinuxKit kernel:
sandbox init crashes. values.dev.yaml overrides
agentRuntime.runtimeClassName to "" so agent pods use runc on
Mac. On a Linux dev host or in prod the gVisor sandbox is enforced
unchanged.

AI-SPM v1.0.0 — AI Security Posture Management

25 Apr 19:00

Release date: 2026-04-25
Codename: "MCP"

First production release of AI Security Posture Management (AI-SPM). Customers
can now upload their own AI agents as a single Python file, deploy
them into sandboxed containers, and have them chat through the full
security pipeline — prompt-guard → policy decider → Kafka → output-guard
— with attached policies, conversation memory, web search, and a live
activity timeline visible in the admin UI.


Highlights

  • End-to-end agent chat through the existing AI-SPM security pipeline,
    with attached per-agent policies enforced on every turn.
  • Drop-in agent uploads. Operator drops in a single agent.py, the
    platform validates it, mints per-agent tokens, spawns a sandboxed
    Docker container, and routes traffic through Kafka. No custom image
    required for the five example agent shapes we ship.
  • Provider-agnostic LLM proxy. Native dispatch for Anthropic and
    Ollama (both OpenAI-compatible and native modes); operators switch
    providers in the UI without restarts or code changes.
  • Live observability. Every chat turn, web-search call, and LLM
    call emits a lineage event that lands in session_events and tails
    in the per-agent Activity tab in the admin UI within 5 seconds.
  • DB-backed configuration. The agent SDK fetches its connection
    bundle from the controller at boot — no platform secrets in the
    agent's container env.

What's new

Agent runtime control plane

  • POST /api/spm/agents — upload agent.py (multipart) with
    deploy_after=true. Validates syntax, top-level async def main,
    and dry-import; mints per-agent mcp_token + llm_api_key; creates
    the per-agent Kafka topics; spawns the runtime container; polls for
    the SDK's aispm.ready() handshake.
  • POST /api/spm/agents/{id}/start | /stop — idempotent kick;
    UI surfaces a persistent "working…" spinner until the polled
    runtime_state actually changes.
  • DELETE /api/spm/agents/{id} — stops the container, drops the
    topics, deletes the row.
  • POST /api/spm/agents/{id}/chat — full pipeline, SSE response.
  • GET /api/spm/agents/{id}/bootstrap — DB-backed SDK boot. The
    agent's container only needs three env vars (AGENT_ID,
    MCP_TOKEN, CONTROLLER_URL); everything else is fetched here.
  • GET /api/spm/agents/{id}/policies + PUT — atomic-replace
    attach/detach. The chat handler reads linked_policies per turn
    and forwards them to OPA so policies can scope evaluation.
  • GET /api/spm/agents/{id}/activity — unified timeline (chat
    turns + AgentToolCall + AgentLLMCall), newest-first, capped at
    200 rows. Polled by the Activity tab.
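
The first two upload checks (syntax and a top-level `async def main`) can be approximated with the standard `ast` module. This is a minimal sketch under that assumption; `validate_agent_source` is a hypothetical name, and the dry-import step is not covered here.

```python
import ast
from typing import List


def validate_agent_source(source: str) -> List[str]:
    """Return a list of problems; an empty list means both checks passed."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error on line {exc.lineno}: {exc.msg}"]

    # Only module-level statements count; a method named main does not.
    has_main = any(
        isinstance(node, ast.AsyncFunctionDef) and node.name == "main"
        for node in tree.body
    )
    return [] if has_main else ["missing top-level 'async def main'"]
```

Parsing instead of importing means the uploaded file is never executed during these two checks.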

Agent-side SDK (agent_runtime/aispm)

  • aispm.ready() — lifecycle handshake.
  • aispm.chat.subscribe() / reply() — Kafka I/O. Consumer uses
    auto_offset_reset="earliest" so the very first message after deploy
    is never silently dropped during consumer-group join.
  • aispm.chat.history(session_id, limit) — replay persisted turns;
    example agents use this for conversation memory across turns.
  • aispm.mcp.call("web_fetch", ...) — JSON-RPC over HTTP to the MCP
    server; web_fetch is Tavily-backed.
  • aispm.llm.complete(messages=, model=…) — OpenAI-compatible call
    through spm-llm-proxy; the SDK no longer pins a default model so
    the operator's chosen provider model wins.
  • aispm.get_secret(name) — per-agent secret store.
  • aispm.log("step", trace=…) — structured lineage line on stdout.

Provider dispatch (spm-llm-proxy)

connector_type | Endpoint | Auth header | Model source
---------------|----------|-------------|-------------
anthropic | {base_url}/v1/messages | x-api-key + anthropic-version: 2023-06-01 | integration model (payload model honoured only when it starts with claude)
ollama (/v1) | {base_url}/chat/completions (OpenAI-compatible) | none | payload model > integration model > llama3.1:8b fallback
ollama (other) | {base_url}/api/chat (native) | none | payload model > integration model > llama3.1:8b fallback

Switching provider is a UI dropdown change on the AI-SPM Agent Runtime
Control Plane (MCP) integration row: no restart, no agent re-deploy.
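
The dispatch rules in the table above can be read as a small routing function. The sketch below is an interpretation, not the proxy's code: `Route` and `resolve_route` are hypothetical names, and treating "ollama (/v1)" as "base_url ends with /v1" is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional

FALLBACK_OLLAMA_MODEL = "llama3.1:8b"  # fallback named in the table


@dataclass
class Route:
    url: str
    headers: Dict[str, str]
    model: Optional[str]


def resolve_route(
    connector_type: str,
    base_url: str,
    api_key: str = "",
    payload_model: Optional[str] = None,
    integration_model: Optional[str] = None,
) -> Route:
    """Pick endpoint, auth headers, and model for one chat-completion call."""
    base = base_url.rstrip("/")
    if connector_type == "anthropic":
        # The payload model wins only when it looks like a Claude model.
        use_payload = bool(payload_model) and payload_model.startswith("claude")
        model = payload_model if use_payload else integration_model
        headers = {"x-api-key": api_key, "anthropic-version": "2023-06-01"}
        return Route(f"{base}/v1/messages", headers, model)
    if connector_type == "ollama":
        model = payload_model or integration_model or FALLBACK_OLLAMA_MODEL
        if base.endswith("/v1"):
            return Route(f"{base}/chat/completions", {}, model)  # OpenAI-compatible
        return Route(f"{base}/api/chat", {}, model)  # native
    raise ValueError(f"unsupported connector_type: {connector_type!r}")
```

Keeping the decision in one pure function makes it straightforward to check each row of the table in isolation.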

Observability (AgentToolCallEvent, AgentLLMCallEvent)

  • spm-mcp emits AgentToolCallEvent after every web_fetch,
    capturing tool name, args, ok/error, and duration_ms.
  • spm-llm-proxy emits AgentLLMCallEvent after every chat-completion
    call (Anthropic and Ollama paths), capturing model, prompt and
    completion token counts, and ok/error.
  • Both events publish to cpm.global.lineage_events. The existing
    lineage_consumer persists them into session_events automatically.
  • Best-effort by design: a producer init failure never blocks the
    serving path. A lineage_producer.send failed warning is the only
    signal when Kafka is unreachable; chat keeps working.
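
The best-effort behaviour described above amounts to wrapping the send in a guard that logs and moves on. A minimal sketch, assuming a `send(topic, event)` callable; `emit_best_effort` and the logger name are hypothetical.

```python
import logging
from typing import Any, Callable, Dict

log = logging.getLogger("lineage")


def emit_best_effort(
    send: Callable[[str, Dict[str, Any]], None],
    topic: str,
    event: Dict[str, Any],
) -> bool:
    """Try to publish a lineage event; never raise into the serving path.

    Returns True when the send succeeded, False when it was swallowed.
    """
    try:
        send(topic, event)
        return True
    except Exception as exc:  # deliberately broad: lineage must not break chat
        log.warning("lineage_producer.send failed: %s", exc)
        return False
```

The trade-off is that lineage gaps are only visible in logs, which is why the warning line is the one signal to watch for.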

Admin UI

  • Inventory → Agents tab lists live agents alongside mock rows
    with a runtime-state pip and risk tint.
  • PreviewPanel (right-side panel on row click) carries the
    Run/Stop toggle, Open Chat, View Detail, and Delete asset
    actions.
  • AgentChatPanel (300px inline panel) opens from PreviewPanel's
    Open Chat. Composer pinned to bottom (min-h-0 + max-h(100vh-120px)
    so it can never be pushed off-screen by long chat history).
  • AgentDetailDrawer (560px overlay) opens from PreviewPanel's
    View Detail button. Five tabs: Overview, Configure, Activity
    (live tail, polls every 5s), Sessions, Lineage.
  • PolicySelector lets operators attach/detach policies on a live
    agent without leaving the panel.
  • Add Integration modal: enum_integration fields render as real
    dropdowns of existing integrations (no more pasting UUIDs).
  • Run/Stop button stays in a "working…" state until the next poll
    observes the actual runtime-state change.

Examples

A new top-level Example agents/ folder ships five
ready-to-deploy agents — one per agent_type enum value:

File | agent_type | Demonstrates
-----|------------|-------------
custom_agent.py | custom | Bare-SDK happy path with aispm.chat.history() conversation memory and a strong web-search prompt.
langchain_agent.py | langchain | Off-the-shelf LangChain AgentExecutor + @tool calling our MCP / LLM proxies.
llamaindex_agent.py | llamaindex | LlamaIndex chat-engine routed through aispm.llm, with a hand-rolled retrieval fallback.
autogpt_agent.py | autogpt | Self-prompting plan → execute → reflect loop, capped at 3 hops.
openai_assistant_agent.py | openai_assistant | OpenAI Assistants-style request shape (system + user + tools), no framework.

The runtime image now has langchain==0.3.*, langchain-openai==0.2.*,
llama-index-core==0.11.*, and llama-index-llms-openai-like==0.2.*
baked in, so langchain_agent.py and llamaindex_agent.py deploy
cleanly without bringing your own image.


Bug fixes

  • paused agent immediately after deploy. The upload route's
    _wait_for_ready was reading a stale identity-mapped Agent row
    from its own SQLAlchemy session and timing out, then overwriting
    the (correctly running) row to crashed. Fixed with db.expire_all()
    on every poll iteration.
  • First message after deploy silently dropped. The agent's Kafka
    consumer joined the group with the default
    auto_offset_reset="latest", so any message produced between
    aispm.ready() flipping the row to running and the consumer
    registering with the broker was skipped. Fixed by switching to
    earliest.
  • "Prompt blocked by safety guard. (S2)" on the literal word "yes".
    Three different code sites (two adapters and one module-level
    function injected via guard_fn=) had the same anti-pattern that
    forced verdict=block whenever any S1–S15 category appeared, even
    when the guard's own verdict was allow. Replaced with a length-based
    bypass for inputs under GUARD_MIN_TEXT_LEN=8 chars and a
    score-threshold (GUARD_BLOCK_SCORE=0.6) gate on the
    category-escalation path.
  • 502 Load failed on chat. The agent_chat.py SSE handler was
    importing aiokafka lazily but the package wasn't in spm-api's
    requirements. Added the dep.
  • 500 ModuleNotFoundError: No module named 'services.spm_api' in
    both spm-llm-proxy and spm-mcp. Both fell back to a brittle
    cross-service import. Inlined _decode_secret and dropped the
    cross-service registry lookup so each service is self-contained.
  • POST /v1/chat/completions returning 500. The proxy was hardcoded
    to Ollama's /api/chat shape; pointing Default LLM at Anthropic
    produced a 404 from api.anthropic.com. Now branches on
    connector_type and translates request + response shape per provider.
  • web_fetch...
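
The safety-guard fix in the bug list above (a short-input bypass plus a score gate on category escalation) can be sketched as the decision function below. Only the two constants come from the notes; the function name, the verdict strings, the use of `>=`, and letting the bypass run before the guard's own verdict are assumptions.

```python
from typing import List

GUARD_MIN_TEXT_LEN = 8    # from the release notes
GUARD_BLOCK_SCORE = 0.6   # from the release notes


def decide(text: str, guard_verdict: str, categories: List[str], score: float) -> str:
    """Return "allow" or "block" for one prompt.

    Very short inputs skip the guard; a flagged category alone no longer
    forces a block unless the score clears the threshold.
    """
    if len(text.strip()) < GUARD_MIN_TEXT_LEN:
        return "allow"  # length-based bypass
    if guard_verdict == "block":
        return "block"  # the guard's own verdict stands
    if categories and score >= GUARD_BLOCK_SCORE:
        return "block"  # category escalation, gated on score
    return "allow"
```

Under these assumptions the literal word "yes" is allowed even when a category such as S2 is reported, which is the case the original bug got wrong.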