diff --git a/README.md b/README.md
index d0dea78..a86c160 100644
--- a/README.md
+++ b/README.md
@@ -42,6 +42,14 @@ Two integration paths:
 1. **Proxy only**: Point your MCP client at Helio instead of your MCP server. Zero code changes. Immediate governance.
 2. **Proxy + SDK**: Add the thin Python SDK to annotate tool calls with evidence context and action dependencies. Richer governance, under 500 lines of code.
 
+### Enforcement grades
+
+Helio governs at the strongest grade each path physically allows, and records it per call:
+
+- **Structural** (stdio MCP) — Helio owns the only path to the tool; the agent cannot route around it.
+- **Network** (HTTP MCP) — structural given you control the upstream's egress.
+- **Host-enforced** (hook adapters via the [adapter API](docs/adapter-api.md), e.g. OpenClaw) — for frameworks that run tools in-process and expose hooks rather than an MCP transport. The framework's hook gate enforces; Helio decides. This is a cooperative, lower grade than the proxy path, and Helio labels it as such rather than overclaiming. Helio's decisions still cannot be evicted from the agent's context or prompt-injected, and any attempt to route around them is visible in the audit trail.
+
 ## Quick Start (5 minutes)
 
 ### 1. Install
diff --git a/docs/adapter-api.md b/docs/adapter-api.md
new file mode 100644
index 0000000..eaf7d18
--- /dev/null
+++ b/docs/adapter-api.md
@@ -0,0 +1,151 @@
+# Adapter Governance API (sideband)
+
+> **Status: experimental.** These four endpoints are how hook-based agent frameworks (OpenClaw first) drive Helio's policy engine without an MCP transport to interpose on. The contract may change in a breaking way until a second adapter validates its neutrality. Pin your adapter to a Helio minor version.
+
+The governance API lives on the **SDK sideband** — the local server on `127.0.0.1:3200` (configurable via `sdk.*`), the same server the Python SDK uses for evidence/context. It is **not** the dashboard sideband (`:3100`, documented in [Sideband API Reference](./sideband-api.md)); the two are different servers with different jobs. Endpoints here:
+
+| Route                        | Purpose                                                                        |
+| ---------------------------- | ------------------------------------------------------------------------------ |
+| `POST /evaluate`             | Decide a tool call. **Side-effect-free** on rate/spend counters.               |
+| `POST /audit`                | Record the outcome of an evaluated call; **consumes** counters. Idempotent.    |
+| `POST /install-scan`         | Evaluate a package/skill install. Observational until install-time rules ship. |
+| `POST /approval/:id/resolve` | Record the resolution of a natively-handled approval.                          |
+
+## Why this exists, and what it does not promise
+
+Helio's headline guarantee is **structural** enforcement: an agent speaking MCP physically cannot reach a tool except through the proxy. Hook-based frameworks run their tools in-process, so there is nothing to proxy — the framework's hook dispatcher is the enforcement point, and Helio supplies the decision. This is the standard policy-decision-point / policy-enforcement-point split.
+
+Helio classifies every governed call by **enforcement grade**, surfaced via the audit `origin` column:
+
+| Grade           | Path                     | Guarantee                                                          |
+| --------------- | ------------------------ | ------------------------------------------------------------------ |
+| `structural`    | stdio MCP                | Helio owns the only path to the tool.                              |
+| `network`       | HTTP MCP                 | Structural given the operator controls egress.                     |
+| `host-enforced` | hook adapters (this API) | Enforcement by the host framework's hook gate; decisions by Helio. |
+
+The host-enforced grade is **cooperative**: it works only if the adapter faithfully calls `/evaluate`, honors the decision, and reports `/audit`. A malicious in-process skill that bypasses the hook is outside what this API can prevent (`/install-scan` exists to gate exactly that vector). Helio does not market the hook path as proxy-grade, and neither should you.
+
+### Normative adapter requirements
+
+An adapter built on this API **MUST**:
+
+1. **Fail closed.** If `/evaluate` is unreachable, times out, or returns 5xx, **block** the tool call. Never proceed on a failed decision — this is the property that couples tool execution to Helio's liveness.
+2. **Resolve before auditing.** For a `require_approval` decision, call `/approval/:id/resolve` before `/audit` (see [Approvals](#approvals)).
+3. **Carry tool definitions where it can** (`tool.input_schema`, `tool.annotations`, …) so adapter-origin tools get the same rug-pull / drift guard as MCP tools.
+4. **Authenticate** with the adapter-scope bearer token (`HELIO_ADAPTER_TOKEN`), never the SDK token.
+
+## Authentication
+
+The governance routes require `Authorization: Bearer <HELIO_ADAPTER_TOKEN>`. This is a **separate token** from the SDK's `HELIO_SDK_TOKEN`: an SDK client cannot drive policy decisions, and an adapter cannot write evidence. Both are generated per boot (and printed to stderr) unless set in the environment. Requests carrying an `Origin` header are refused (browser-forgery guard), and bodies over 1 MiB are rejected with 413.
+
+If you embed `GovernanceService` directly (instead of running `helio start`), wire an `ApprovalRouter` whenever the policy can emit `require_approval` (explicit rules, `flag_destructive: require_approval`, or `on_tool_drift: require_approval`), otherwise construction and hot-reload fail closed by throwing `GovernanceConfigError` (exported from `@gethelio/proxy`).
+
+## `POST /evaluate`
+
+```jsonc
+// Request
+{
+  "origin": "openclaw",                  // optional; default "sideband"; ^[a-z0-9_-]{1,64}$
+  "adapter_version": "0.1.0",            // optional, ≤64 chars (per-origin liveness)
+  "agent_id": "main",                    // optional
+  "session_id": "oc-session-1",          // optional; required for evidence/dependency rules
+  "tool": {
+    "name": "send_message",              // required
+    "description": "…",                  // optional ┐ full definition enables the drift guard
+    "input_schema": { },                 // optional ┤
+    "annotations": { "destructiveHint": false } // optional ┘
+  },
+  "arguments": { "channel": "#general", "text": "hi" }, // optional (≤64 KiB)
+  "metadata": { "channel_id": "C1", "sender_id": "U7" } // optional (≤4 KiB)
+}
+
+// Response 200
+{
+  "evaluation_id": "5f2…",               // correlate with /audit; present even for terminal decisions
+  "decision": "allow",                   // allow | deny | require_approval | rate_limited | spend_limited | dry_run
+  "reason": "Matched \"allow-chat\" → allow",
+  "matched_rule": "allow-chat",          // null when the default policy applied
+  "matched_rule_index": 2,
+  "feedback": { "message": "…" },        // present on blocking decisions
+  "approval": { "id": "…", "timeout_ms": 300000, "resolve_path": "/approval/…/resolve" }, // require_approval only
+  "limits": { "rate": { } },             // present when a limit rule matched
+  "dry_run": { "would_forward": true },  // dry_run only
+  "tool_drift": { "changes": [ ] }       // present when the drift gate fired
+}
+```
+
+The `decision` is an **outcome**, not Helio's internal rule action: a `rate_limit` rule that still has headroom returns `"allow"` with a `limits.rate` block; only when the bucket is exhausted does it return `"rate_limited"`. There is no `modify` decision — argument rewriting has no engine support today.
+
+**Errors:** `400` validation / invalid JSON, `401` wrong-or-missing adapter token, `403` Origin header, `413` oversized `metadata`/`tool_input`/body, `400 origin_limit_exceeded` / `400 tool_baseline_limit` / `503 evaluation_backlog_full` (memory budgets), `503 governance_unavailable` (sideband running without the service).
+
+## `POST /audit`
+
+```jsonc
+// Request
+{
+  "evaluation_id": "5f2…",
+  "status": "success" | "error" | "not_executed",
+  "error": "…",            // optional, when status == "error"
+  "duration_ms": 412,      // optional
+  "result": { },           // optional outcome summary
+  "actual_amount": 0.42    // optional, finite ≥0 — true post-execution spend; overrides the arg-derived amount
+}
+
+// Response 201 (fresh) — replays return 200
+{ "ok": true, "audit_record_id": "…" }
+```
+
+Counters are consumed here (not at `/evaluate`), and only when the call actually ran (`success`/`error`, not `not_executed`). `/audit` is **idempotent on `evaluation_id`**: an identical replay returns `200 { already_finalized: true }` with no double-consumption, so a network retry after a lost response is safe. A different payload under the same id is an adapter bug → `409 evaluation_conflict`.
+
+**Decision finalization.** `deny`, `rate_limited`, `spend_limited`, and `dry_run` are **terminal at `/evaluate`** — their audit record is written immediately, so completeness never depends on the adapter calling `/audit`. A later `/audit` for such an evaluation returns `200 { finalized_by: "evaluate" }` and accepts any payload, so adapters may audit unconditionally.
+
+`actual_amount` must be finite and `>= 0` (`400 invalid_actual_amount` otherwise) and only applies to evaluations whose decision carried a spend rule (`400 no_spend_rule` if sent for any other evaluation).
+
+**Other responses:** `404 evaluation_unknown`, `404 evaluation_expired` (the decision aged out — see below), `409 approval_unresolved` (resolve the approval first; **retryable** with short backoff).
+
+### The crash-TTL and TOCTOU caveats
+
+- An evaluation that is never audited expires after `sdk.evaluation_ttl` (default `10m`) into an audit record with `record_kind: "evaluation_expired"`. This is a **bypass/tamper signal**, not a normal block — surface it in monitoring.
+- Because decision and execution are separate calls, two concurrent `/evaluate`s can both peek the last limit slot and both execute. Counters stay truthful after the fact (both `/audit`s record), but the host-enforced tier cannot close this window from the proxy side.
+
+## `POST /install-scan`
+
+Observational until install-time policy ships: always returns `decision: "allow"` with `reason: "no install-time rules defined"`, and writes an audit record with `record_kind: "install_scan"`. The request/response shape is final now so adapters and the dashboard build against a stable contract.
+
+```jsonc
+// Request
+{ "origin": "openclaw", "package": { "name": "left-pad", "version": "1.3.0", "source": "npm" } }
+// Response 200
+{ "evaluation_id": "…", "decision": "allow", "reason": "no install-time rules defined", "matched_rule": null }
+```
+
+## Approvals
+
+A `require_approval` decision creates a **native ticket** (`channel_name: native:<origin>`): Helio does not block, start timeout timers, or notify a channel, because the adapter runs the approval in its own UI (e.g. a Telegram dialog). The dashboard shows the ticket but its approve/deny buttons return `409 native_ticket` — only the adapter can resolve it, via:
+
+```jsonc
+// POST /approval/:id/resolve
+{ "resolution": "approved" | "denied" | "timeout" | "cancelled",
+  "resolved_by": "telegram:@oli",   // required for approved/denied
+  "reason": "…", "scope": "once" | "always" }
+// Response 200
+{ "ok": true }
+```
+
+The resolution does **not** write the audit record; the subsequent `/audit` does, copying the approval status. A native ticket times out at `min(rule timeout, evaluation TTL)`; deadlines are enforced on access, so a late resolve deterministically returns `409 already_resolved`.
+
+## Audit record additions
+
+Sideband activity shares the audit schema with the MCP path, plus three columns (also used by the dashboard):
+
+- `record_kind` — `tool_call` | `drift_event` | `install_scan` | `evaluation_expired`.
+- `origin` — `mcp` for the proxy path, or the adapter origin string.
+- `metadata` — the adapter-supplied context object (reserved keys `channel_id`, `sender_id`, `sender_name`, `conversation_id`).
+
+See [Audit Trail](./audit.md) for the full record reference.
+
+## See also
+
+- [Sideband API Reference](./sideband-api.md) — the dashboard sideband (`:3100`), a different server.
+- [Configuration](./configuration.md) — the `sdk.*` block (`enabled`, `port`, `host`, `evaluation_ttl`).
+- [Policy Guide](./policies.md) — the rules these endpoints evaluate.
diff --git a/docs/audit.md b/docs/audit.md
index 466c2e0..bd3bf99 100644
--- a/docs/audit.md
+++ b/docs/audit.md
@@ -31,6 +31,9 @@ Each audit record contains the following fields:
 | `proxy_compute_ms`     | number         | Proxy compute time excluding approval wait and upstream processing.                                                                                 |
 | `flagged_destructive`  | boolean        | Whether the tool was flagged as potentially destructive (`destructiveHint: true`).                                                                  |
 | `dry_run`              | boolean        | Whether this record was produced in dry-run mode.                                                                                                   |
+| `record_kind`          | string         | Record category: `tool_call` (default), `drift_event`, `install_scan`, or `evaluation_expired` (a sideband decision that was never audited).        |
+| `origin`               | string         | Enforcement origin: `mcp` for the proxy path, or an adapter origin string (e.g. `openclaw`) for [sideband-governed](./adapter-api.md) calls.        |
+| `metadata`             | object \| null | Adapter-supplied context (reserved keys `channel_id`, `sender_id`, `sender_name`, `conversation_id`). Null for MCP-origin records.                  |
 | `created_at`           | string         | ISO 8601 timestamp of when the record was persisted to the database.                                                                                |
 
 ## Tool Definition Drift Records
diff --git a/docs/configuration.md b/docs/configuration.md
index 47512f2..3709dc5 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -298,25 +298,28 @@ Audit rows also include:
 
 ### sdk
 
-Configuration for the Python SDK sideband API, used for evidence grounding.
+Configuration for the SDK sideband API, used for evidence grounding (Python SDK) and the [adapter governance API](./adapter-api.md) (hook-based adapters such as OpenClaw).
 
-| Field     | Type    | Required | Default     | Description                          |
-| --------- | ------- | -------- | ----------- | ------------------------------------ |
-| `enabled` | boolean | No       | `false`     | Enable the SDK sideband HTTP server. |
-| `port`    | integer | No       | `3200`      | Sideband server port (1–65535).      |
-| `host`    | string  | No       | `127.0.0.1` | Sideband server bind address.        |
+| Field            | Type    | Required | Default     | Description                                                                                                       |
+| ---------------- | ------- | -------- | ----------- | ----------------------------------------------------------------------------------------------------------------- |
+| `enabled`        | boolean | No       | `false`     | Enable the SDK sideband HTTP server.                                                                              |
+| `port`           | integer | No       | `3200`      | Sideband server port (1–65535).                                                                                   |
+| `host`           | string  | No       | `127.0.0.1` | Sideband server bind address.                                                                                     |
+| `evaluation_ttl` | string  | No       | `10m`       | How long a governance `/evaluate` decision waits for its `/audit` before being finalized as `evaluation_expired`. |
 
 #### Sideband authentication
 
-When `sdk.enabled` is `true`, Helio generates a fresh 32-byte hex Bearer token on every `helio start` and prints it to stderr:
+When `sdk.enabled` is `true`, Helio generates two fresh 32-byte hex Bearer tokens on every `helio start` (unless set in the environment) and prints them to stderr:
 
 ```
 SDK sideband listening on http://127.0.0.1:3200
 SDK token (pass as HELIO_SDK_TOKEN env var to your SDK clients):
   <hex>
+Adapter token (governance routes; pass as HELIO_ADAPTER_TOKEN to your adapter):
+  <hex>
 ```
 
-The token is also written into `process.env.HELIO_SDK_TOKEN` so child processes spawned by the proxy inherit it. Every sideband request except `GET /healthz` must carry `Authorization: Bearer <token>`; mismatches return `401`. The sideband additionally rejects any request that carries an `Origin` header (including `Origin: null`) and blocks `OPTIONS` preflights with `403`, so a malicious local HTML file cannot talk to it through a browser.
+The tokens are scoped: `HELIO_SDK_TOKEN` authorizes the evidence/context routes, and `HELIO_ADAPTER_TOKEN` authorizes the governance routes (`/evaluate`, `/audit`, `/install-scan`, `/approval/:id/resolve`). An SDK client cannot drive policy decisions, and an adapter cannot write evidence. Both are written into `process.env` so child processes inherit them. Every sideband request except `GET /healthz` must carry the matching `Authorization: Bearer <token>`; mismatches return `401`. The sideband rejects any request carrying an `Origin` header (including `Origin: null`), blocks `OPTIONS` preflights with `403`, and rejects request bodies over 1 MiB with `413`.
 
 Operators who need a stable token across restarts can set `HELIO_SDK_TOKEN` explicitly in the proxy's environment — the proxy respects a pre-set value instead of generating one. Rotation, revocation, and key management are not part of the v0.1.0 trust model; a restart with a new token is the rotation primitive.
 
diff --git a/docs/sideband-api.md b/docs/sideband-api.md
index f130ae9..84d7b84 100644
--- a/docs/sideband-api.md
+++ b/docs/sideband-api.md
@@ -4,6 +4,8 @@ The dashboard sideband is a read/write REST + SSE surface served on a separate p
 
 > **Why a separate port?** The sideband is deliberately isolated from the `/mcp` port. This is what prevents an agent speaking `/mcp` from self-approving its own pending tickets — the approval REST API is mounted exclusively on this sideband, not on the MCP port.
 
+> **Not the same as the SDK sideband.** This document covers the **dashboard sideband** (`:3100`), the operator read/write surface. There is a second, separate **SDK sideband** (`:3200`, `sdk.*`) that serves the Python SDK's evidence routes and the [adapter governance API](./adapter-api.md) (`/evaluate`, `/audit`, `/install-scan`, `/approval/:id/resolve`) for hook-based adapters like OpenClaw. Different server, different port, different token.
+
 ## Overview
 
 - **Default port:** `127.0.0.1:3100` (configurable via `dashboard.port`).
diff --git a/packages/dashboard/src/components/ApprovalStatusBadge.tsx b/packages/dashboard/src/components/ApprovalStatusBadge.tsx
index 133e522..32ae456 100644
--- a/packages/dashboard/src/components/ApprovalStatusBadge.tsx
+++ b/packages/dashboard/src/components/ApprovalStatusBadge.tsx
@@ -14,6 +14,7 @@ const COLORS: Record<ApprovalStatus, string> = {
   break_glass: 'bg-purple-50 text-purple-700 ring-purple-600/20',
   client_disconnected: 'bg-gray-100 text-gray-700 ring-gray-500/20',
   shutdown_cancelled: 'bg-slate-100 text-slate-700 ring-slate-500/20',
+  cancelled: 'bg-slate-100 text-slate-700 ring-slate-500/20',
 }
 
 export const ApprovalStatusBadge = memo(function ApprovalStatusBadge({
diff --git a/packages/dashboard/src/types.ts b/packages/dashboard/src/types.ts
index acb57e4..fbc1a57 100644
--- a/packages/dashboard/src/types.ts
+++ b/packages/dashboard/src/types.ts
@@ -75,6 +75,7 @@ export type ApprovalStatus =
   | 'break_glass'
   | 'client_disconnected'
   | 'shutdown_cancelled'
+  | 'cancelled'
 
 export interface ApprovalTicket {
   readonly id: string
diff --git a/packages/proxy/src/__tests__/sideband-shared-limiter.test.ts b/packages/proxy/src/__tests__/sideband-shared-limiter.test.ts
new file mode 100644
index 0000000..1edd82b
--- /dev/null
+++ b/packages/proxy/src/__tests__/sideband-shared-limiter.test.ts
@@ -0,0 +1,75 @@
+import { describe, it, expect, vi } from 'vitest'
+import { GovernanceService } from '../sideband/governance-service.js'
+import { GovernedForwarder } from '../policy/governed-forwarder.js'
+import { compilePolicies } from '../policy/parser.js'
+import { RateLimiter } from '../policy/rate-limiter.js'
+import type { McpForwarder, McpRequest, ForwardResult } from '../mcp/types.js'
+
+// ---------------------------------------------------------------------------
+// Shared-limiter integration (issue #12): a rate counter consumed by a
+// sideband /audit must be visible to a subsequent MCP-path tools/call, and
+// vice versa — one budget, both doors.
+// ---------------------------------------------------------------------------
+
+function mockForwarder(): McpForwarder {
+  const body = { jsonrpc: '2.0' as const, id: 1, result: { content: [] } }
+  const response = { status: 200, headers: {}, body }
+  const result: ForwardResult = { response, durationMs: 1 }
+  return { forward: vi.fn().mockResolvedValue(result) }
+}
+
+function toolsCall(name: string): McpRequest {
+  return { jsonrpc: '2.0', id: 1, method: 'tools/call', params: { name, arguments: {} } }
+}
+
+function isBlocked(result: ForwardResult): boolean {
+  const body = result.response.body as { error?: unknown }
+  return body.error !== undefined
+}
+
+describe('sideband ↔ MCP shared rate limiter', () => {
+  it('counts a sideband /audit consumption against the MCP-path budget', async () => {
+    const policy = compilePolicies({
+      default: 'allow',
+      dry_run: false,
+      rules: [
+        {
+          name: 'rl',
+          match: { tool: 'send' },
+          action: 'rate_limit',
+          limits: { max_calls: 2, window: '60s' },
+        },
+      ],
+    }).policy
+
+    // One limiter instance shared by both paths.
+    const rateLimiter = new RateLimiter({ cleanupIntervalMs: 0 })
+    const service = new GovernanceService({ policy, rateLimiter, sweepIntervalMs: 0 })
+    const forwarder = new GovernedForwarder(mockForwarder(), policy, { rateLimiter })
+
+    // 1) Sideband evaluate + audit consumes slot 1 of 2.
+    const ev = service.evaluate({
+      origin: 'openclaw',
+      agent_id: null,
+      session_id: null,
+      tool: { name: 'send' },
+      arguments: {},
+      metadata: null,
+    })
+    expect(ev.body['decision']).toBe('allow')
+    service.audit({ evaluation_id: ev.body['evaluation_id'] as string, status: 'success' }, 'h')
+    expect(rateLimiter.getKeyState('tool:send')?.current).toBe(1)
+
+    // 2) MCP tools/call consumes slot 2 of 2 — allowed.
+    const first = await forwarder.forward(toolsCall('send'))
+    expect(isBlocked(first)).toBe(false)
+    expect(rateLimiter.getKeyState('tool:send')?.current).toBe(2)
+
+    // 3) MCP tools/call now over the shared budget — blocked.
+    const second = await forwarder.forward(toolsCall('send'))
+    expect(isBlocked(second)).toBe(true)
+
+    rateLimiter.close()
+    service.close()
+  })
+})
diff --git a/packages/proxy/src/approval/api.test.ts b/packages/proxy/src/approval/api.test.ts
index bbefac2..ad56a21 100644
--- a/packages/proxy/src/approval/api.test.ts
+++ b/packages/proxy/src/approval/api.test.ts
@@ -67,6 +67,39 @@ describe('Approval REST API', () => {
     }
   })
 
+  // -----------------------------------------------------------------------
+  // Native (adapter-owned) tickets are not operator-resolvable — issue #12
+  // -----------------------------------------------------------------------
+
+  describe('native ticket guard', () => {
+    it('rejects approve/deny/break-glass on native tickets with 409 native_ticket', async () => {
+      const { router: r, post } = setup()
+      router = r
+      const ticket = r.createNativeTicket({
+        tool_name: 'send',
+        tool_input: {},
+        matched_rule: undefined,
+        session_id: null,
+        origin: 'openclaw',
+      })
+
+      for (const [path, body] of [
+        [`/${ticket.id}/approve`, { approved_by: 'op' }],
+        [`/${ticket.id}/deny`, { denied_by: 'op' }],
+        [`/${ticket.id}/break-glass`, { approved_by: 'op', reason: 'x' }],
+      ] as const) {
+        const res = await post(path, body)
+        expect(res.status).toBe(409)
+        const json = (await res.json()) as { error: string; resolve_in: string }
+        expect(json.error).toBe('native_ticket')
+        expect(json.resolve_in).toBe('openclaw')
+      }
+
+      // The ticket stays pending — no operator decision leaked into it.
+      expect(r.resolveNativeTicket(ticket.id, 'approved', 'tg')).toBe(true)
+    })
+  })
+
   // -----------------------------------------------------------------------
   // GET /
   // -----------------------------------------------------------------------
diff --git a/packages/proxy/src/approval/api.ts b/packages/proxy/src/approval/api.ts
index ee66eb7..0a35ffd 100644
--- a/packages/proxy/src/approval/api.ts
+++ b/packages/proxy/src/approval/api.ts
@@ -1,6 +1,7 @@
 import { Hono } from 'hono'
 import { z } from 'zod'
 import type { ApprovalRouter } from './router.js'
+import { NATIVE_CHANNEL_PREFIX } from './router.js'
 import type { ApprovalQueue } from './queue.js'
 import type { ApprovalStatus } from './types.js'
 import { verifyBearer } from '../auth/bearer.js'
@@ -33,6 +34,7 @@ const APPROVAL_STATUSES = [
   'break_glass',
   'client_disconnected',
   'shutdown_cancelled',
+  'cancelled',
 ] as const
 const approvalStatusSet = new Set<ApprovalStatus>(APPROVAL_STATUSES)
 
@@ -162,6 +164,18 @@ export function createApprovalApp(
     if (!ticket) {
       return c.json({ error: 'Ticket not found' }, 404)
     }
+    if (ticket.channel_name.startsWith(NATIVE_CHANNEL_PREFIX)) {
+      // Adapter-owned approval (issue #12, D10): a dashboard decision cannot
+      // reach the adapter's native UI, so resolving here would record control
+      // that never propagated. Direct the operator to the owning surface.
+      return c.json(
+        {
+          error: 'native_ticket',
+          resolve_in: ticket.channel_name.slice(NATIVE_CHANNEL_PREFIX.length),
+        },
+        409,
+      )
+    }
     if (ticket.status !== 'pending') {
       return c.json({ error: 'Ticket already resolved', status: ticket.status }, 409)
     }
@@ -195,6 +209,18 @@ export function createApprovalApp(
     if (!ticket) {
       return c.json({ error: 'Ticket not found' }, 404)
     }
+    if (ticket.channel_name.startsWith(NATIVE_CHANNEL_PREFIX)) {
+      // Adapter-owned approval (issue #12, D10): a dashboard decision cannot
+      // reach the adapter's native UI, so resolving here would record control
+      // that never propagated. Direct the operator to the owning surface.
+      return c.json(
+        {
+          error: 'native_ticket',
+          resolve_in: ticket.channel_name.slice(NATIVE_CHANNEL_PREFIX.length),
+        },
+        409,
+      )
+    }
     if (ticket.status !== 'pending') {
       return c.json({ error: 'Ticket already resolved', status: ticket.status }, 409)
     }
@@ -228,6 +254,18 @@ export function createApprovalApp(
     if (!ticket) {
       return c.json({ error: 'Ticket not found' }, 404)
     }
+    if (ticket.channel_name.startsWith(NATIVE_CHANNEL_PREFIX)) {
+      // Adapter-owned approval (issue #12, D10): a dashboard decision cannot
+      // reach the adapter's native UI, so resolving here would record control
+      // that never propagated. Direct the operator to the owning surface.
+      return c.json(
+        {
+          error: 'native_ticket',
+          resolve_in: ticket.channel_name.slice(NATIVE_CHANNEL_PREFIX.length),
+        },
+        409,
+      )
+    }
     if (ticket.status !== 'pending') {
       return c.json({ error: 'Ticket already resolved', status: ticket.status }, 409)
     }
diff --git a/packages/proxy/src/approval/queue.ts b/packages/proxy/src/approval/queue.ts
index 602b1f4..f1e3103 100644
--- a/packages/proxy/src/approval/queue.ts
+++ b/packages/proxy/src/approval/queue.ts
@@ -129,7 +129,8 @@ export class ApprovalQueue {
       | 'timeout'
       | 'break_glass'
       | 'client_disconnected'
-      | 'shutdown_cancelled',
+      | 'shutdown_cancelled'
+      | 'cancelled',
     resolvedBy?: string,
     options?: { denial_reason?: string; break_glass_reason?: string },
   ): boolean {
diff --git a/packages/proxy/src/approval/router.test.ts b/packages/proxy/src/approval/router.test.ts
index f987aab..edab058 100644
--- a/packages/proxy/src/approval/router.test.ts
+++ b/packages/proxy/src/approval/router.test.ts
@@ -747,4 +747,107 @@ describe('ApprovalRouter', () => {
       expect(outcome.status).toBe('denied')
     })
   })
+
+  // -----------------------------------------------------------------------
+  // Native (adapter-owned) tickets — issue #12, D10
+  // -----------------------------------------------------------------------
+
+  describe('native tickets', () => {
+    function createNativeRouter() {
+      const queue = new ApprovalQueue({ cleanupIntervalMs: 0 })
+      const channel = mockChannel()
+      const submitted: ApprovalTicket[] = []
+      const resolved: ApprovalTicket[] = []
+      const router = new ApprovalRouter({
+        defaultTimeoutMs: 300_000,
+        defaultOnTimeout: 'deny',
+        channels: new Map([['dashboard', channel]]),
+        queue,
+        onSubmit: (t) => submitted.push(t),
+        onResolve: (t) => resolved.push(t),
+      })
+      return { router, queue, channel, submitted, resolved }
+    }
+
+    it('creates a native:<origin> ticket and fires onSubmit but not the channel', () => {
+      const { router, queue, channel, submitted } = createNativeRouter()
+
+      const ticket = router.createNativeTicket({
+        tool_name: 'send',
+        tool_input: { text: 'hi' },
+        matched_rule: makeRule(),
+        session_id: 's1',
+        origin: 'openclaw',
+      })
+
+      expect(ticket.channel_name).toBe('native:openclaw')
+      expect(queue.get(ticket.id)?.status).toBe('pending')
+      expect(submitted).toHaveLength(1) // approval_requested SSE flows
+      expect(channel.calls).toHaveLength(0) // no double-notify
+    })
+
+    it('does not start a timeout timer (no auto-resolution)', () => {
+      vi.useFakeTimers()
+      const { router, queue } = createNativeRouter()
+      const ticket = router.createNativeTicket({
+        tool_name: 'send',
+        tool_input: {},
+        matched_rule: makeRule({ approval: { channel: 'dashboard', timeoutMs: 1000 } }),
+        session_id: null,
+        origin: 'openclaw',
+      })
+      vi.advanceTimersByTime(5000)
+      expect(queue.get(ticket.id)?.status).toBe('pending') // never auto-timed-out
+    })
+
+    it('resolveNativeTicket resolves and fires onResolve', () => {
+      const { router, queue, resolved } = createNativeRouter()
+      const ticket = router.createNativeTicket({
+        tool_name: 'send',
+        tool_input: {},
+        matched_rule: makeRule(),
+        session_id: null,
+        origin: 'openclaw',
+      })
+
+      const ok = router.resolveNativeTicket(ticket.id, 'approved', 'telegram:@oli')
+      expect(ok).toBe(true)
+      expect(queue.get(ticket.id)?.status).toBe('approved')
+      expect(queue.get(ticket.id)?.resolved_by).toBe('telegram:@oli')
+      expect(resolved).toHaveLength(1)
+    })
+
+    it('supports the cancelled resolution', () => {
+      const { router, queue } = createNativeRouter()
+      const ticket = router.createNativeTicket({
+        tool_name: 'send',
+        tool_input: {},
+        matched_rule: makeRule(),
+        session_id: null,
+        origin: 'openclaw',
+      })
+      expect(router.resolveNativeTicket(ticket.id, 'cancelled')).toBe(true)
+      expect(queue.get(ticket.id)?.status).toBe('cancelled')
+    })
+
+    it('refuses to resolve a router-managed (submit) ticket as native', async () => {
+      const { router, queue } = createNativeRouter()
+      const promise = router.submit(submitParams())
+      const ticketId = queue.listPending()[0]?.id as string
+
+      // The submit ticket is not native; resolveNativeTicket must refuse it so
+      // the held promise is never left hanging.
+      expect(router.resolveNativeTicket(ticketId, 'approved', 'x')).toBe(false)
+      expect(queue.get(ticketId)?.status).toBe('pending')
+
+      // Clean up the still-pending promise.
+      router.approve(ticketId, 'admin')
+      await promise
+    })
+
+    it('returns false for an unknown ticket id', () => {
+      const { router } = createNativeRouter()
+      expect(router.resolveNativeTicket('nope', 'approved')).toBe(false)
+    })
+  })
 })
diff --git a/packages/proxy/src/approval/router.ts b/packages/proxy/src/approval/router.ts
index 40ab994..537255d 100644
--- a/packages/proxy/src/approval/router.ts
+++ b/packages/proxy/src/approval/router.ts
@@ -16,6 +16,14 @@ import type { CompiledPolicyRule } from '../policy/types.js'
 // human decision arrives (via REST API) or the timeout fires.
 // ---------------------------------------------------------------------------
 
+/**
+ * Channel-name prefix marking a ticket as adapter-owned (native sideband
+ * approval). The full channel_name is `native:<origin>` (e.g.
+ * `native:openclaw`). Dashboard approve/deny endpoints refuse these tickets;
+ * only the adapter can resolve them via the sideband. (issue #12, D10.)
+ */
+export const NATIVE_CHANNEL_PREFIX = 'native:'
+
 /** Internal state for a pending approval. */
 interface PendingApproval {
   readonly resolve: (outcome: ApprovalOutcome) => void
@@ -59,6 +67,21 @@ export interface ApprovalSubmitParams {
   readonly session_id: string | null
 }
 
+/** Resolution statuses a native (sideband) ticket can be moved to. */
+export type NativeResolution = 'approved' | 'denied' | 'timeout' | 'cancelled'
+
+/** Parameters for creating a native (adapter-owned) approval ticket. */
+export interface NativeTicketParams {
+  readonly tool_name: string
+  readonly tool_input: Record<string, unknown>
+  readonly matched_rule: CompiledPolicyRule | undefined
+  readonly session_id: string | null
+  /** Adapter origin (e.g. "openclaw"); stored as channel_name `native:<origin>`. */
+  readonly origin: string
+  /** Ticket timeout in ms (rule timeout, else the router default). */
+  readonly timeout_ms?: number
+}
+
 export class ApprovalRouter {
   private readonly defaultTimeoutMs: number
   readonly defaultOnTimeout: 'allow' | 'deny'
@@ -194,6 +217,70 @@ export class ApprovalRouter {
     return outcome
   }
 
+  /**
+   * Create a native (adapter-owned) approval ticket without holding a Promise.
+   *
+   * Used by the sideband governance path (issue #12, D10): when `/evaluate`
+   * yields `require_approval`, the adapter runs the approval in its own UI
+   * (e.g. OpenClaw's Telegram dialog), so Helio must NOT block, start
+   * timeout/escalation timers, or notify a channel — doing so would
+   * double-notify. We still create the queue ticket and fire `onSubmit` so the
+   * dashboard's `approval_requested` SSE event flows and the ticket is visible.
+   *
+   * The ticket's `channel_name` is `native:<origin>`, which marks it as
+   * adapter-owned: the dashboard approve/deny endpoints refuse it (it can only
+   * be resolved through the adapter), and {@link resolveNativeTicket} is the
+   * resolution path.
+   */
+  createNativeTicket(params: NativeTicketParams): ApprovalTicket {
+    const rule = params.matched_rule
+    const timeoutMs = params.timeout_ms ?? rule?.approval?.timeoutMs ?? this.defaultTimeoutMs
+
+    const ticket = this.queue.add({
+      tool_name: params.tool_name,
+      tool_input: params.tool_input,
+      matched_rule: rule?.name ?? null,
+      rule_index: rule?.index ?? null,
+      channel_name: `${NATIVE_CHANNEL_PREFIX}${params.origin}`,
+      session_id: params.session_id,
+      timeout_ms: timeoutMs,
+    })
+
+    this.onSubmit?.(ticket)
+    return ticket
+  }
+
+  /**
+   * Resolve a native ticket created by {@link createNativeTicket}.
+   *
+   * Resolves the queue ticket and fires `onResolve` (→ `approval_resolved`
+   * SSE), with no held Promise to settle. Refuses tickets that have a pending
+   * router Promise (those are MCP-path tickets; resolving them here would leave
+   * the held request hanging) and tickets that are not `native:`-prefixed.
+   *
+   * @returns `true` if resolved, `false` if not found, already resolved, not a
+   *          native ticket, or router-managed.
+   */
+  resolveNativeTicket(
+    ticketId: string,
+    status: NativeResolution,
+    resolvedBy?: string,
+    options?: { denial_reason?: string },
+  ): boolean {
+    if (this.pending.has(ticketId)) return false // router-managed; not ours
+    const ticket = this.queue.get(ticketId)
+    if (!ticket || !ticket.channel_name.startsWith(NATIVE_CHANNEL_PREFIX)) return false
+
+    const resolved = this.queue.resolve(ticketId, status, resolvedBy, {
+      denial_reason: options?.denial_reason,
+    })
+    if (!resolved) return false
+
+    const updated = this.queue.get(ticketId)
+    if (updated) this.onResolve?.(updated)
+    return true
+  }
+
   /**
    * Approve a pending ticket. Resolves the held Promise so the governed
    * forwarder can forward the request upstream.
@@ -241,6 +328,11 @@ export class ApprovalRouter {
     })
   }
 
+  /** Look up a ticket by id (delegates to the queue). */
+  getTicket(ticketId: string): ApprovalTicket | undefined {
+    return this.queue.get(ticketId)
+  }
+
   /** Clean up all pending timers and resolve all pending promises. */
   close(): void {
     this.closed = true
diff --git a/packages/proxy/src/approval/types.ts b/packages/proxy/src/approval/types.ts
index 0fc86fe..027d597 100644
--- a/packages/proxy/src/approval/types.ts
+++ b/packages/proxy/src/approval/types.ts
@@ -16,6 +16,13 @@ export type ApprovalStatus =
   | 'break_glass'
   | 'client_disconnected'
   | 'shutdown_cancelled'
+  /**
+   * The approver's native UI cancelled the request (e.g. an OpenClaw
+   * `/approve` dialog dismissed without a decision). Distinct from
+   * `client_disconnected`, which the router reserves for the requesting agent
+   * aborting the held MCP request. Sideband (native) approvals only. (#12.)
+   */
+  | 'cancelled'
 
 /** A failed attempt to deliver an approval notification. */
 export interface ApprovalNotificationFailure {
diff --git a/packages/proxy/src/audit/csv.test.ts b/packages/proxy/src/audit/csv.test.ts
index a7deab7..ec348c0 100644
--- a/packages/proxy/src/audit/csv.test.ts
+++ b/packages/proxy/src/audit/csv.test.ts
@@ -31,6 +31,9 @@ function makeRecord(overrides: Partial<AuditRecord> = {}): AuditRecord {
     proxy_compute_ms: 1.2,
     flagged_destructive: false,
     dry_run: false,
+    record_kind: 'tool_call',
+    origin: 'mcp',
+    metadata: null,
     created_at: '2025-01-15T10:00:00.100Z',
   }
   return {
diff --git a/packages/proxy/src/audit/store.test.ts b/packages/proxy/src/audit/store.test.ts
index 14cfdc6..499ae29 100644
--- a/packages/proxy/src/audit/store.test.ts
+++ b/packages/proxy/src/audit/store.test.ts
@@ -36,6 +36,9 @@ function makeRecord(overrides: Partial<InsertRecord> = {}): InsertRecord {
     proxy_compute_ms: 5,
     flagged_destructive: false,
     dry_run: false,
+    record_kind: 'tool_call',
+    origin: 'mcp',
+    metadata: null,
   }
   return {
     ...defaults,
@@ -283,6 +286,32 @@ CREATE TABLE IF NOT EXISTS audit_records (
       expect(record.tool_input).toEqual(input)
     })
 
+    it('round-trips record_kind, origin, and metadata (issue #12)', () => {
+      const metadata = { channel_id: 'C1', sender_id: 'U7', nested: { a: 1 } }
+      const record = insertAndGet(
+        store,
+        makeRecord({ record_kind: 'install_scan', origin: 'openclaw', metadata }),
+      )
+      expect(record.record_kind).toBe('install_scan')
+      expect(record.origin).toBe('openclaw')
+      expect(record.metadata).toEqual(metadata)
+    })
+
+    it('stores null metadata as null, not the string "null"', () => {
+      const record = insertAndGet(store, makeRecord({ metadata: null }))
+      expect(record.metadata).toBeNull()
+    })
+
+    it('filters by record_kind and origin', () => {
+      store.insert(makeRecord({ record_kind: 'tool_call', origin: 'mcp' }))
+      store.insert(makeRecord({ record_kind: 'install_scan', origin: 'openclaw' }))
+      store.insert(makeRecord({ record_kind: 'install_scan', origin: 'openclaw' }))
+
+      expect(store.list({ record_kind: 'install_scan' }).total).toBe(2)
+      expect(store.list({ origin: 'mcp' }).total).toBe(1)
+      expect(store.list({ origin: 'openclaw' }).total).toBe(2)
+    })
+
     it('stores upstream_response when includeResponses is true', () => {
       const response = { status: 'ok', data: [1, 2] }
       const record = insertAndGet(store, makeRecord({ upstream_response: response }))
diff --git a/packages/proxy/src/audit/store.ts b/packages/proxy/src/audit/store.ts
index 7c9c7b1..d8880c7 100644
--- a/packages/proxy/src/audit/store.ts
+++ b/packages/proxy/src/audit/store.ts
@@ -50,6 +50,9 @@ CREATE TABLE IF NOT EXISTS audit_records (
   proxy_compute_ms  REAL NOT NULL,
   flagged_destructive INTEGER NOT NULL DEFAULT 0,
   dry_run           INTEGER NOT NULL DEFAULT 0,
+  record_kind       TEXT NOT NULL DEFAULT 'tool_call',
+  origin            TEXT NOT NULL DEFAULT 'mcp',
+  metadata          TEXT,
   created_at        TEXT NOT NULL
 );
 `
@@ -61,6 +64,8 @@ CREATE INDEX IF NOT EXISTS idx_audit_policy_decision  ON audit_records (policy_d
 CREATE INDEX IF NOT EXISTS idx_audit_session_id       ON audit_records (session_id);
 CREATE INDEX IF NOT EXISTS idx_audit_block_reason     ON audit_records (block_reason);
 CREATE INDEX IF NOT EXISTS idx_audit_upstream_status_created_at ON audit_records (upstream_http_status, created_at);
+CREATE INDEX IF NOT EXISTS idx_audit_record_kind     ON audit_records (record_kind);
+CREATE INDEX IF NOT EXISTS idx_audit_origin          ON audit_records (origin);
 `
 
 const INSERT_SQL = `
@@ -70,14 +75,14 @@ INSERT INTO audit_records (
   approved_by, upstream_response, upstream_error, upstream_latency_ms,
   upstream_http_status,
   total_duration_ms, approval_wait_ms, proxy_compute_ms,
-  flagged_destructive, dry_run, created_at
+  flagged_destructive, dry_run, record_kind, origin, metadata, created_at
 ) VALUES (
   @id, @timestamp, @session_id, @agent_id, @environment, @tool_name, @tool_input,
   @policy_decision, @block_reason, @matched_rule, @matched_rule_index, @evidence_chain, @approval_status,
   @approved_by, @upstream_response, @upstream_error, @upstream_latency_ms,
   @upstream_http_status,
   @total_duration_ms, @approval_wait_ms, @proxy_compute_ms,
-  @flagged_destructive, @dry_run, @created_at
+  @flagged_destructive, @dry_run, @record_kind, @origin, @metadata, @created_at
 )
 `
 
@@ -89,6 +94,9 @@ const REQUIRED_AUDIT_COLUMNS = [
   'approval_wait_ms',
   'proxy_compute_ms',
   'upstream_http_status',
+  'record_kind',
+  'origin',
+  'metadata',
 ] as const
 
 // ---------------------------------------------------------------------------
@@ -119,6 +127,9 @@ interface RawAuditRow {
   proxy_compute_ms: number
   flagged_destructive: number
   dry_run: number
+  record_kind: string
+  origin: string
+  metadata: string | null
   created_at: string
 }
 
@@ -155,6 +166,9 @@ function deserializeRow(row: RawAuditRow): AuditRecord {
     proxy_compute_ms: row.proxy_compute_ms,
     flagged_destructive: row.flagged_destructive === 1,
     dry_run: row.dry_run === 1,
+    record_kind: row.record_kind as AuditRecord['record_kind'],
+    origin: row.origin,
+    metadata: row.metadata ? (JSON.parse(row.metadata) as Record<string, unknown>) : null,
     created_at: row.created_at,
   }
 }
@@ -181,6 +195,14 @@ function buildWhereClause(filters: AuditQueryFilters): {
   if (filters.blocked !== undefined) {
     conditions.push(filters.blocked ? 'block_reason IS NOT NULL' : 'block_reason IS NULL')
   }
+  if (filters.record_kind !== undefined) {
+    conditions.push('record_kind = ?')
+    params.push(filters.record_kind)
+  }
+  if (filters.origin !== undefined) {
+    conditions.push('origin = ?')
+    params.push(filters.origin)
+  }
   if (filters.session_id !== undefined) {
     conditions.push('session_id = ?')
     params.push(filters.session_id)
@@ -364,6 +386,9 @@ export class AuditStore {
       proxy_compute_ms: record.proxy_compute_ms,
       flagged_destructive: record.flagged_destructive ? 1 : 0,
       dry_run: record.dry_run ? 1 : 0,
+      record_kind: record.record_kind,
+      origin: record.origin,
+      metadata: record.metadata ? JSON.stringify(record.metadata) : null,
       created_at: now,
     })
 
diff --git a/packages/proxy/src/audit/types.ts b/packages/proxy/src/audit/types.ts
index 741153a..5f26554 100644
--- a/packages/proxy/src/audit/types.ts
+++ b/packages/proxy/src/audit/types.ts
@@ -55,6 +55,27 @@ export interface AuditRecord {
   readonly flagged_destructive: boolean
   /** Whether this record was produced in dry-run mode. */
   readonly dry_run: boolean
+  /**
+   * Record category discriminator (issues #12/#16). `'tool_call'` is the
+   * default for governed tool calls; `'drift_event'` for tool-definition drift
+   * records; `'install_scan'` for sideband install evaluations; and
+   * `'evaluation_expired'` for sideband evaluations whose `/audit` never
+   * arrived (the bypass/tamper signal — block_reason stays null so they do not
+   * count as enforcement blocks).
+   */
+  readonly record_kind: 'tool_call' | 'drift_event' | 'install_scan' | 'evaluation_expired'
+  /**
+   * Enforcement origin: `'mcp'` for the proxy path, or an adapter-supplied
+   * origin string (e.g. `'openclaw'`) for sideband-governed calls. Surfaces
+   * the enforcement-grade ladder (structural vs host-enforced) per record.
+   */
+  readonly origin: string
+  /**
+   * Adapter-supplied context object (reserved keys: `channel_id`, `sender_id`,
+   * `sender_name`, `conversation_id`). Null for MCP-origin records. Backs
+   * `match.metadata.*` (#13) and the dashboard metadata columns (#16).
+   */
+  readonly metadata: Record<string, unknown> | null
   /** ISO 8601 timestamp of when the record was persisted. */
   readonly created_at: string
 }
@@ -85,6 +106,10 @@ export interface AuditQueryFilters {
   readonly flagged_destructive?: boolean
   /** Include only dry-run records (true) or non-dry-run records (false). */
   readonly dry_run?: boolean
+  /** Filter by record kind (tool_call / drift_event / install_scan / evaluation_expired). */
+  readonly record_kind?: string
+  /** Filter by enforcement origin (e.g. 'mcp', 'openclaw'). */
+  readonly origin?: string
   /** Include only records where upstream HTTP status is >= this value. */
   readonly upstream_status_min?: number
   /** Include only records where upstream HTTP status is <= this value. */
diff --git a/packages/proxy/src/audit/writer.test.ts b/packages/proxy/src/audit/writer.test.ts
index 698392e..d630912 100644
--- a/packages/proxy/src/audit/writer.test.ts
+++ b/packages/proxy/src/audit/writer.test.ts
@@ -33,6 +33,9 @@ function makeRecord(overrides: Partial<InsertRecord> = {}): InsertRecord {
     proxy_compute_ms: 5,
     flagged_destructive: false,
     dry_run: false,
+    record_kind: 'tool_call',
+    origin: 'mcp',
+    metadata: null,
   }
   return {
     ...defaults,
diff --git a/packages/proxy/src/audit/writer.ts b/packages/proxy/src/audit/writer.ts
index cf322ee..6ecc6c2 100644
--- a/packages/proxy/src/audit/writer.ts
+++ b/packages/proxy/src/audit/writer.ts
@@ -80,9 +80,8 @@ export class AuditWriter {
    * is scheduled. This keeps request-path latency bounded even under bursty
    * write load.
    */
-  push(record: Omit<AuditRecord, 'id' | 'created_at'>): void {
+  push(record: Omit<AuditRecord, 'id' | 'created_at'>, id: string = randomUUID()): void {
     if (this.closed) return
-    const id = randomUUID()
     this.buffer.push({ id, record })
     this.onPush?.(record, id)
     if (this.buffer.length >= this.bufferSize) {
@@ -98,9 +97,8 @@ export class AuditWriter {
    * A fatal-process crash still invokes the crash-drain hook, which calls
    * `flush()` synchronously before exit.
    */
-  pushImmediate(record: Omit<AuditRecord, 'id' | 'created_at'>): void {
+  pushImmediate(record: Omit<AuditRecord, 'id' | 'created_at'>, id: string = randomUUID()): void {
     if (this.closed) return
-    const id = randomUUID()
     this.buffer.push({ id, record })
     this.onPush?.(record, id)
     this.scheduleFlushSoon()
diff --git a/packages/proxy/src/cli.test.ts b/packages/proxy/src/cli.test.ts
index d7614b6..62ebb66 100644
--- a/packages/proxy/src/cli.test.ts
+++ b/packages/proxy/src/cli.test.ts
@@ -1077,6 +1077,9 @@ audit:
         proxy_compute_ms: 2,
         flagged_destructive: false,
         dry_run: false,
+        record_kind: 'tool_call',
+        origin: 'mcp',
+        metadata: null,
       }
       return {
         ...defaults,
diff --git a/packages/proxy/src/cli.ts b/packages/proxy/src/cli.ts
index b25d7d4..751a855 100644
--- a/packages/proxy/src/cli.ts
+++ b/packages/proxy/src/cli.ts
@@ -14,6 +14,7 @@ import { GovernedForwarder } from './policy/governed-forwarder.js'
 import type { AnnotationCachePrimeResult } from './policy/governed-forwarder.js'
 import { AuditStore, AuditWriter } from './audit/index.js'
 import { EvidenceStore, createSidebandApp } from './evidence/index.js'
+import { GovernanceService } from './sideband/governance-service.js'
 import {
   ApprovalQueue,
   ApprovalRouter,
@@ -443,6 +444,8 @@ async function startCommand(configPath: string, options: StartOptions): Promise<
   let sidebandHandle: ServerHandle | undefined
   let sidebandToken: string | undefined
   let sidebandTokenSource: 'generated' | 'env' | undefined
+  let adapterToken: string | undefined
+  let governanceService: GovernanceService | undefined
   if (config.sdk.enabled) {
     sidebandToken = process.env['HELIO_SDK_TOKEN']
     if (!sidebandToken || sidebandToken.length === 0) {
@@ -452,7 +455,37 @@ async function startCommand(configPath: string, options: StartOptions): Promise<
     } else {
       sidebandTokenSource = 'env'
     }
-    const sidebandApp = createSidebandApp(evidenceStore, { token: sidebandToken })
+
+    // Governance routes carry a SEPARATE adapter-scope token (issue #12, D1):
+    // an SDK client must not be able to drive policy decisions, nor an adapter
+    // write evidence it was not granted. Same per-boot/env provisioning as the
+    // SDK token.
+    adapterToken = process.env['HELIO_ADAPTER_TOKEN']
+    if (!adapterToken || adapterToken.length === 0) {
+      adapterToken = randomBytes(32).toString('hex')
+      process.env['HELIO_ADAPTER_TOKEN'] = adapterToken
+    }
+
+    // The service reuses the SAME limiter/queue/router/writer instances as the
+    // MCP path — one budget, both doors. A counter consumed via /audit is
+    // visible to a subsequent MCP tools/call and vice versa.
+    governanceService = new GovernanceService({
+      policy,
+      environment: config.environment,
+      evidenceStore,
+      approvalRouter,
+      rateLimiter,
+      spendLimiter,
+      auditWriter,
+      approvalTimeoutMs: parseDuration(config.approval.timeout),
+      ttlMs: parseDuration(config.sdk.evaluation_ttl),
+    })
+
+    const sidebandApp = createSidebandApp(evidenceStore, {
+      token: sidebandToken,
+      adapterToken,
+      governance: governanceService,
+    })
     sidebandHandle = startSidebandServer(sidebandApp, config.sdk.port, config.sdk.host)
   }
 
@@ -508,6 +541,11 @@ async function startCommand(configPath: string, options: StartOptions): Promise<
         `SDK token (${source}; pass as HELIO_SDK_TOKEN env var to your SDK clients):\n  ${sidebandToken}`,
       )
     }
+    if (adapterToken) {
+      console.error(
+        `Adapter token (governance routes; pass as HELIO_ADAPTER_TOKEN to your adapter):\n  ${adapterToken}`,
+      )
+    }
   }
   if (dashboardHandle) {
     console.error(
@@ -541,6 +579,7 @@ async function startCommand(configPath: string, options: StartOptions): Promise<
       initialConfig: config,
       onPolicyReload: (newPolicy, reloadWarnings, restartRequiredPaths) => {
         governedForwarder.updatePolicy(newPolicy)
+        governanceService?.updatePolicy(newPolicy)
         const count = newPolicy.rules.length
         console.error(
           `[helio] Policy reloaded: ${String(count)} rule${count !== 1 ? 's' : ''} (default: ${newPolicy.defaultAction})`,
@@ -587,6 +626,7 @@ async function startCommand(configPath: string, options: StartOptions): Promise<
     closeDashboardApp,
     dashboardHandle,
     eventBus,
+    governanceService,
   )
 }
 
@@ -745,6 +785,7 @@ function registerShutdown(
   closeDashboardApp?: () => void,
   dashboardHandle?: ServerHandle,
   eventBus?: DashboardEventBus,
+  governanceService?: GovernanceService,
 ): void {
   let isShuttingDown = false
   const shutdown = () => {
@@ -762,6 +803,7 @@ function registerShutdown(
       if (configWatcher) configWatcher.close()
       if (closeDashboardApp) closeDashboardApp()
       if (eventBus) eventBus.close()
+      if (governanceService) governanceService.close()
       if (rateLimiter) rateLimiter.close()
       if (spendLimiter) spendLimiter.close()
       if (approvalRouter) approvalRouter.close()
diff --git a/packages/proxy/src/config/schema.ts b/packages/proxy/src/config/schema.ts
index 3c5c1f3..0312b5e 100644
--- a/packages/proxy/src/config/schema.ts
+++ b/packages/proxy/src/config/schema.ts
@@ -316,6 +316,13 @@ const sdkSchema = z.object({
   enabled: z.boolean().default(false),
   port: z.number().int().min(1).max(65535).default(3200),
   host: z.string().default('127.0.0.1'),
+  /**
+   * How long a sideband `/evaluate` decision waits for its `/audit` before the
+   * proxy finalizes it as `evaluation_expired` (issue #12, D4). Bounds the
+   * pending-evaluation registry; an adapter crash cannot silently drop a
+   * decided-allowed call from the trail.
+   */
+  evaluation_ttl: durationSchema.default('10m'),
 })
 
 // ---------------------------------------------------------------------------
diff --git a/packages/proxy/src/dashboard/api.test.ts b/packages/proxy/src/dashboard/api.test.ts
index ef3bb0b..7ec41a8 100644
--- a/packages/proxy/src/dashboard/api.test.ts
+++ b/packages/proxy/src/dashboard/api.test.ts
@@ -113,6 +113,9 @@ function insertAuditRecord(
     proxy_compute_ms: 1,
     flagged_destructive: false,
     dry_run: false,
+    record_kind: 'tool_call',
+    origin: 'mcp',
+    metadata: null,
   }
 
   return store.insert(
diff --git a/packages/proxy/src/evidence/api.ts b/packages/proxy/src/evidence/api.ts
index 078a599..7c8fcee 100644
--- a/packages/proxy/src/evidence/api.ts
+++ b/packages/proxy/src/evidence/api.ts
@@ -1,8 +1,14 @@
 import { Hono } from 'hono'
+import { bodyLimit } from 'hono/body-limit'
 import { z } from 'zod'
 import type { EvidenceStore } from './store.js'
 import { verifyBearer } from '../auth/bearer.js'
 import { formatZodErrors } from '../util/format-zod-errors.js'
+import type { GovernanceService } from '../sideband/governance-service.js'
+import { createGovernanceApp, isGovernancePath } from '../sideband/governance-api.js'
+
+/** Max request body accepted on the sideband (issue #12, D1/F7). */
+const SIDEBAND_BODY_LIMIT_BYTES = 1 * 1_024 * 1_024
 
 // ---------------------------------------------------------------------------
 // Request body schemas
@@ -29,12 +35,24 @@ const postContextBody = z.object({
 /** Options for constructing the SDK sideband Hono app. */
 export interface SidebandAppOptions {
   /**
-   * Bearer token that every request must carry in its `Authorization`
-   * header (except `GET /healthz`). When omitted or empty, the sideband
-   * runs open — useful for local development when the operator has
-   * explicitly disabled the per-boot token via env.
+   * Bearer token for the evidence/context/session routes (the SDK scope).
+   * When omitted or empty, those routes run open — useful for local
+   * development when the operator has disabled the per-boot token via env.
    */
   readonly token?: string
+  /**
+   * Bearer token for the governance routes (the adapter scope, issue #12/F6).
+   * Distinct from `token` so an SDK client cannot drive policy decisions and
+   * an adapter cannot write evidence it was not granted. When omitted, the
+   * governance routes run open (and only mount if `governance` is provided).
+   */
+  readonly adapterToken?: string
+  /**
+   * The governance service backing `/evaluate`, `/audit`, `/install-scan`, and
+   * `/approval/:id/resolve`. When omitted, those routes return 503
+   * `governance_unavailable` (evidence-only deployments, and existing tests).
+   */
+  readonly governance?: GovernanceService
 }
 
 /**
@@ -60,7 +78,9 @@ export interface SidebandAppOptions {
  */
 export function createSidebandApp(store: EvidenceStore, options: SidebandAppOptions = {}): Hono {
   const app = new Hono()
-  const token = options.token && options.token.length > 0 ? options.token : undefined
+  const sdkToken = options.token && options.token.length > 0 ? options.token : undefined
+  const adapterToken =
+    options.adapterToken && options.adapterToken.length > 0 ? options.adapterToken : undefined
 
   // CORS guard — applied globally, fires before auth. The SDK never sends
   // an Origin header; any request that does is browser-originated traffic
@@ -76,26 +96,40 @@ export function createSidebandApp(store: EvidenceStore, options: SidebandAppOpti
     await next()
   })
 
-  // Bearer auth — applied globally when a token is configured, but
-  // `/healthz` stays open so container/orchestrator probes can succeed
-  // without the secret.
-  if (token) {
-    app.use('*', async (c, next) => {
-      if (c.req.path === '/healthz') {
-        await next()
-        return
-      }
-      const authHeader = c.req.header('authorization')
-      if (!verifyBearer(authHeader, token)) {
-        return c.json({ error: 'Unauthorized' }, 401)
-      }
+  // Body-size limit — bounds memory before any handler parses a body
+  // (issue #12, F7). 1 MiB is generous for evidence payloads and tool inputs;
+  // per-field caps (4 KiB metadata, 64 KiB tool_input) are enforced downstream.
+  app.use(
+    '*',
+    bodyLimit({
+      maxSize: SIDEBAND_BODY_LIMIT_BYTES,
+      onError: (c) => c.json({ error: 'request_body_too_large' }, 413),
+    }),
+  )
+
+  // Scoped bearer auth — `/healthz` stays open for probes. Governance routes
+  // require the adapter-scope token; everything else requires the SDK-scope
+  // token. A scope whose token is unset runs open (local dev / disabled-token
+  // posture), matching the prior single-token behavior.
+  app.use('*', async (c, next) => {
+    if (c.req.path === '/healthz') {
       await next()
-    })
-  }
+      return
+    }
+    const expected = isGovernancePath(c.req.path) ? adapterToken : sdkToken
+    if (expected && !verifyBearer(c.req.header('authorization'), expected)) {
+      return c.json({ error: 'Unauthorized' }, 401)
+    }
+    await next()
+  })
 
   // Health check
   app.get('/healthz', (c) => c.json({ status: 'ok' }))
 
+  // Governance routes (issue #12) — /evaluate, /audit, /install-scan,
+  // /approval/:id/resolve. Mounted at root; return 503 when no service.
+  app.route('/', createGovernanceApp(options.governance))
+
   // POST /evidence — SDK reports evidence from a tool output
   app.post('/evidence', async (c) => {
     let body: unknown
diff --git a/packages/proxy/src/index.ts b/packages/proxy/src/index.ts
index bb5783d..27f4444 100644
--- a/packages/proxy/src/index.ts
+++ b/packages/proxy/src/index.ts
@@ -35,6 +35,16 @@ export type {
 } from './policy/index.js'
 export { EvidenceStore, createSidebandApp } from './evidence/index.js'
 export type { EvidenceEntry, SessionState, EvidenceStoreOptions } from './evidence/index.js'
+export { GovernanceService } from './sideband/governance-service.js'
+export { GovernanceConfigError } from './sideband/errors.js'
+export type {
+  GovernanceServiceOptions,
+  WireDecision,
+  EvaluateInput,
+  AuditInput,
+  InstallScanInput,
+  ResolveApprovalInput,
+} from './sideband/governance-service.js'
 export { AuditStore, AuditWriter } from './audit/index.js'
 export type {
   AuditRecord,
diff --git a/packages/proxy/src/policy/annotation-cache.test.ts b/packages/proxy/src/policy/annotation-cache.test.ts
index 728e065..26a8116 100644
--- a/packages/proxy/src/policy/annotation-cache.test.ts
+++ b/packages/proxy/src/policy/annotation-cache.test.ts
@@ -432,3 +432,69 @@ describe('baseline and drift', () => {
     expect(cache.isDrifted('t')).toBe(true)
   })
 })
+
+// ---------------------------------------------------------------------------
+// updateSingle() — incremental per-tool merge for the sideband governance
+// path (issue #12, D6). Must NOT wipe other tools' present/current state.
+// ---------------------------------------------------------------------------
+
+describe('updateSingle', () => {
+  it('baselines a single tool on first sight', () => {
+    const cache = new ToolAnnotationCache()
+    const r = cache.updateSingle({ name: 'send', annotations: { destructiveHint: true } })
+    expect(r.updated).toBe(true)
+    expect(r.baselined).toEqual(['send'])
+    expect(r.drifted).toEqual([])
+    expect(cache.has('send')).toBe(true)
+    expect(cache.get('send')).toEqual({ destructiveHint: true })
+  })
+
+  it('does not touch other tools’ current annotations (no wholesale wipe)', () => {
+    const cache = new ToolAnnotationCache()
+    cache.updateSingle({ name: 'a', annotations: { readOnlyHint: true } })
+    cache.updateSingle({ name: 'b', annotations: { destructiveHint: true } })
+
+    // A later single update for 'a' must leave 'b' present and queryable —
+    // the F3 regression: update() would have rebuilt present/current and
+    // dropped 'b' from the current-annotations map.
+    cache.updateSingle({ name: 'a', annotations: { readOnlyHint: true } })
+    expect(cache.has('b')).toBe(true)
+    expect(cache.getCurrent('b')).toEqual({ destructiveHint: true })
+    expect(cache.getCurrent('a')).toEqual({ readOnlyHint: true })
+  })
+
+  it('detects drift when a tool’s definition changes after baseline', () => {
+    const cache = new ToolAnnotationCache()
+    cache.updateSingle({ name: 'send', description: 'original' })
+    const r = cache.updateSingle({ name: 'send', description: 'tampered' })
+    expect(r.drifted).toHaveLength(1)
+    expect(r.drifted[0]?.toolName).toBe('send')
+    expect(r.drifted[0]?.changes.map((c) => c.aspect)).toContain('description')
+    expect(cache.isDrifted('send')).toBe(true)
+  })
+
+  it('reverts drift when the definition returns to baseline', () => {
+    const cache = new ToolAnnotationCache()
+    cache.updateSingle({ name: 'send', description: 'original' })
+    cache.updateSingle({ name: 'send', description: 'tampered' })
+    const r = cache.updateSingle({ name: 'send', description: 'original' })
+    expect(r.reverted).toEqual(['send'])
+    expect(cache.isDrifted('send')).toBe(false)
+  })
+
+  it('does not re-emit an unchanged drift', () => {
+    const cache = new ToolAnnotationCache()
+    cache.updateSingle({ name: 'send', description: 'original' })
+    cache.updateSingle({ name: 'send', description: 'tampered' })
+    const again = cache.updateSingle({ name: 'send', description: 'tampered' })
+    expect(again.drifted).toEqual([])
+    expect(cache.isDrifted('send')).toBe(true)
+  })
+
+  it('ignores malformed input', () => {
+    const cache = new ToolAnnotationCache()
+    expect(cache.updateSingle(null).updated).toBe(false)
+    expect(cache.updateSingle({ noName: true }).updated).toBe(false)
+    expect(cache.size).toBe(0)
+  })
+})
diff --git a/packages/proxy/src/policy/annotation-cache.ts b/packages/proxy/src/policy/annotation-cache.ts
index d907c1b..bed8d66 100644
--- a/packages/proxy/src/policy/annotation-cache.ts
+++ b/packages/proxy/src/policy/annotation-cache.ts
@@ -1,4 +1,5 @@
 import type { ToolAnnotationHints } from './types.js'
+import { canonicalize } from '../util/canonical-json.js'
 
 /** Aspects of a tool definition reported in drift events. */
 export type ToolDriftAspect =
@@ -184,6 +185,82 @@ export class ToolAnnotationCache {
     return { updated: true, baselined, drifted, reverted }
   }
 
+  /**
+   * Incrementally merge a single tool definition into the cache (issue #12, D6).
+   *
+   * Unlike {@link update}, this touches only the named tool: it adds to (never
+   * rebuilds) the `present` set and `currentAnnotations` map. The sideband
+   * governance path feeds adapter-origin tools one definition at a time (each
+   * `/evaluate` carries at most one), so routing them through the whole-list
+   * `update()` would wipe every other tool's current-annotation snapshot on
+   * each call and silently degrade the stricter-of-both log-mode drift
+   * evaluation. The MCP whole-list path is unaffected — it keeps calling
+   * `update()`. Each origin owns its own cache instance, so the accumulate
+   * semantics here never mix with update()'s replace semantics.
+   *
+   * `toolDefinition` must already be in MCP shape (`inputSchema`/`outputSchema`
+   * camelCase); the governance service maps the wire `tool` object before
+   * calling. Returns the same result shape as `update()` (for one tool).
+   */
+  updateSingle(toolDefinition: unknown): ToolCacheUpdateResult {
+    if (typeof toolDefinition !== 'object' || toolDefinition === null) {
+      return { updated: false, baselined: [], drifted: [], reverted: [] }
+    }
+    const t = toolDefinition as Record<string, unknown>
+    const name = t['name']
+    if (typeof name !== 'string') {
+      return { updated: false, baselined: [], drifted: [], reverted: [] }
+    }
+
+    const baselined: string[] = []
+    const drifted: ToolDriftEvent[] = []
+    const reverted: string[] = []
+
+    this.present.add(name)
+    const annotations = extractAnnotations(t)
+    this.currentAnnotations.set(name, annotations)
+    const definitionKey = canonicalize(t)
+
+    const baseline = this.baselines.get(name)
+    if (!baseline) {
+      this.baselines.set(name, { definition: t, definitionKey, annotations })
+      baselined.push(name)
+      if (this.driftedTools.has(name)) {
+        this.driftedTools.delete(name)
+        reverted.push(name)
+      }
+      return { updated: true, baselined, drifted, reverted }
+    }
+
+    if (definitionKey === baseline.definitionKey) {
+      if (this.driftedTools.has(name)) {
+        this.driftedTools.delete(name)
+        reverted.push(name)
+      }
+      return { updated: true, baselined, drifted, reverted }
+    }
+
+    const changes: ToolDriftChange[] = []
+    for (const field of ASPECT_FIELDS) {
+      const baselineValue = baseline.definition[field]
+      const currentValue = t[field]
+      if (canonicalize(baselineValue) !== canonicalize(currentValue)) {
+        changes.push({ aspect: field, baseline: baselineValue, current: currentValue })
+      }
+    }
+    if (changes.length === 0) {
+      changes.push({ aspect: 'other', baseline: baseline.definition, current: t })
+    }
+
+    const event: ToolDriftEvent = { toolName: name, changes }
+    const existing = this.driftedTools.get(name)
+    const isNewDrift = !existing || canonicalize(existing.changes) !== canonicalize(changes)
+    this.driftedTools.set(name, event)
+    if (isNewDrift) drifted.push(event)
+
+    return { updated: true, baselined, drifted, reverted }
+  }
+
   /**
    * Get the **baseline** annotations for a tool — the definition first seen,
    * not the latest upstream claim. Returns `undefined` if the tool has no
@@ -226,38 +303,6 @@ function extractAnnotations(tool: Record<string, unknown>): ToolAnnotationHints
     : undefined
 }
 
-/** Deterministic JSON encoding with recursively sorted object keys. */
-function canonicalize(value: unknown): string {
-  // JSON.stringify returns undefined for top-level `undefined` at runtime even
-  // though its TS overloads type the return as `string` for non-undefined
-  // inputs. The explicit widening annotation keeps the coalesce legitimate.
-  const encoded: string | undefined = JSON.stringify(sortKeysDeep(value))
-  // eslint-disable-next-line @typescript-eslint/no-unnecessary-condition -- see above: runtime diverges from TS overload
-  return encoded ?? ''
-}
-
-function sortKeysDeep(value: unknown): unknown {
-  if (Array.isArray(value)) return value.map(sortKeysDeep)
-  if (value !== null && typeof value === 'object') {
-    const source = value as Record<string, unknown>
-    const out: Record<string, unknown> = {}
-    for (const key of Object.keys(source).sort()) {
-      // Object.defineProperty (not assignment) so a JSON-parsed "__proto__"
-      // key becomes an own property instead of silently setting the
-      // prototype — otherwise content under that key never registers as
-      // drift, a blind spot in a security control.
-      Object.defineProperty(out, key, {
-        value: sortKeysDeep(source[key]),
-        enumerable: true,
-        writable: true,
-        configurable: true,
-      })
-    }
-    return out
-  }
-  return value
-}
-
 /**
  * Extract the tools array from a JSON-RPC response body.
  *
diff --git a/packages/proxy/src/policy/decision-pipeline.test.ts b/packages/proxy/src/policy/decision-pipeline.test.ts
new file mode 100644
index 0000000..cb0fce5
--- /dev/null
+++ b/packages/proxy/src/policy/decision-pipeline.test.ts
@@ -0,0 +1,211 @@
+import { describe, it, expect, vi } from 'vitest'
+import { decide } from './decision-pipeline.js'
+import type { DecideInput } from './decision-pipeline.js'
+import { compilePolicies } from './parser.js'
+import type { PoliciesConfig } from '../config/schema.js'
+import type { CompiledPolicy } from './types.js'
+import type { ToolDriftEvent } from './annotation-cache.js'
+import { EvidenceStore } from '../evidence/index.js'
+
+// ---------------------------------------------------------------------------
+// Helpers
+// ---------------------------------------------------------------------------
+
+function compile(config: Omit<PoliciesConfig, 'dry_run'> & { dry_run?: boolean }): CompiledPolicy {
+  return compilePolicies({ dry_run: false, ...config }).policy
+}
+
+function input(overrides: Partial<DecideInput> & { policy: CompiledPolicy }): DecideInput {
+  return {
+    toolName: 'send',
+    toolArguments: {},
+    sessionId: undefined,
+    environment: undefined,
+    evidenceStore: undefined,
+    baselineAnnotations: undefined,
+    currentAnnotations: undefined,
+    driftEvent: undefined,
+    ...overrides,
+  }
+}
+
+const drift = (aspects: string[]): ToolDriftEvent => ({
+  toolName: 'send',
+  changes: aspects.map((aspect) => ({ aspect: aspect as never, baseline: 1, current: 2 })),
+})
+
+// ---------------------------------------------------------------------------
+// decide()
+// ---------------------------------------------------------------------------
+
+describe('decide', () => {
+  it('returns the matched rule action', () => {
+    const policy = compile({
+      default: 'allow',
+      rules: [{ name: 'block-send', match: { tool: 'send' }, action: 'deny' }],
+    })
+    const r = decide(input({ policy }))
+    expect(r.decision.action).toBe('deny')
+    expect(r.decision.matchedRule?.name).toBe('block-send')
+  })
+
+  it('falls back to the default action with no matched rule', () => {
+    const policy = compile({ default: 'allow', rules: [] })
+    const r = decide(input({ policy }))
+    expect(r.decision.action).toBe('allow')
+    expect(r.decision.matchedRule).toBeUndefined()
+  })
+
+  describe('drift gate', () => {
+    it('block mode denies a drifted tool', () => {
+      const policy = compile({ default: 'allow', on_tool_drift: 'block', rules: [] })
+      const r = decide(input({ policy, driftEvent: drift(['description']) }))
+      expect(r.decision.action).toBe('deny')
+      expect(r.driftBlocked).toBe(true)
+    })
+
+    it('require_approval mode escalates a drifted tool', () => {
+      const policy = compile({ default: 'allow', on_tool_drift: 'require_approval', rules: [] })
+      const r = decide(input({ policy, driftEvent: drift(['inputSchema']) }))
+      expect(r.decision.action).toBe('require_approval')
+      expect(r.driftBlocked).toBe(false)
+    })
+
+    it('log mode keeps the stricter of baseline and current decisions', () => {
+      // Baseline annotations (readOnly) match an allow rule; current (destructive)
+      // matches a deny rule. Stricter (deny) must win.
+      const policy = compile({
+        default: 'allow',
+        on_tool_drift: 'log',
+        rules: [
+          { name: 'allow-ro', match: { annotations: { readOnlyHint: true } }, action: 'allow' },
+          {
+            name: 'deny-destructive',
+            match: { annotations: { destructiveHint: true } },
+            action: 'deny',
+          },
+        ],
+      })
+      const r = decide(
+        input({
+          policy,
+          driftEvent: drift(['annotations']),
+          baselineAnnotations: { readOnlyHint: true },
+          currentAnnotations: { destructiveHint: true },
+        }),
+      )
+      expect(r.decision.action).toBe('deny')
+      expect(r.driftBlocked).toBe(false) // log mode never sets driftBlocked
+    })
+  })
+
+  describe('flag_destructive', () => {
+    it('log mode warns but leaves the decision unchanged', () => {
+      const spy = vi.spyOn(console, 'error').mockImplementation(() => {})
+      const policy = compile({ default: 'allow', flag_destructive: 'log', rules: [] })
+      const r = decide(input({ policy, baselineAnnotations: { destructiveHint: true } }))
+      expect(r.flaggedDestructive).toBe(true)
+      expect(r.decision.action).toBe('allow')
+      expect(spy).toHaveBeenCalledOnce()
+      spy.mockRestore()
+    })
+
+    it('require_approval mode escalates an unguarded destructive tool', () => {
+      const policy = compile({
+        default: 'allow',
+        flag_destructive: 'require_approval',
+        rules: [],
+      })
+      const r = decide(input({ policy, baselineAnnotations: { destructiveHint: true } }))
+      expect(r.flaggedDestructive).toBe(true)
+      expect(r.decision.action).toBe('require_approval')
+    })
+
+    it('does not flag when a rule already matched', () => {
+      const policy = compile({
+        default: 'allow',
+        flag_destructive: 'require_approval',
+        rules: [{ name: 'explicit', match: { tool: 'send' }, action: 'allow' }],
+      })
+      const r = decide(input({ policy, baselineAnnotations: { destructiveHint: true } }))
+      expect(r.flaggedDestructive).toBe(false)
+      expect(r.decision.action).toBe('allow')
+    })
+  })
+
+  describe('evidence / session gating', () => {
+    it('denies an evidence-gated rule when no session id is present', () => {
+      const policy = compile({
+        default: 'deny',
+        rules: [
+          {
+            name: 'needs-evidence',
+            match: { tool: 'send' },
+            action: 'allow',
+            evidence: { requires: ['lookup'] },
+          },
+        ],
+      })
+      const r = decide(input({ policy, sessionId: undefined }))
+      expect(r.sessionBlocked).toBe(true)
+      expect(r.evidenceBlocked).toBe(true)
+      expect(r.decision.action).toBe('deny')
+    })
+
+    it('denies when required evidence is missing for a session', () => {
+      const store = new EvidenceStore()
+      store.setAllowedEvidenceKeys(['lookup'])
+      const policy = compile({
+        default: 'deny',
+        rules: [
+          {
+            name: 'needs-evidence',
+            match: { tool: 'send' },
+            action: 'allow',
+            evidence: { requires: ['lookup'] },
+          },
+        ],
+      })
+      const r = decide(input({ policy, sessionId: 'sess-1', evidenceStore: store }))
+      expect(r.evidenceBlocked).toBe(true)
+      expect(r.decision.action).toBe('deny')
+      expect(r.evidenceResult?.satisfied).toBe(false)
+      store.close()
+    })
+  })
+
+  describe('dry-run determination', () => {
+    it('flags a per-rule dry_run action', () => {
+      const policy = compile({
+        default: 'allow',
+        rules: [{ name: 'dr', match: { tool: 'send' }, action: 'dry_run' }],
+      })
+      const r = decide(input({ policy }))
+      expect(r.isDryRun).toBe(true)
+    })
+
+    it('flags the global dry_run policy', () => {
+      const policy = compile({ default: 'allow', dry_run: true, rules: [] })
+      const r = decide(input({ policy }))
+      expect(r.isDryRun).toBe(true)
+    })
+
+    it('does not treat a session-blocked call as dry-run', () => {
+      const policy = compile({
+        default: 'allow',
+        dry_run: true,
+        rules: [
+          {
+            name: 'needs-evidence',
+            match: { tool: 'send' },
+            action: 'allow',
+            evidence: { requires: ['lookup'] },
+          },
+        ],
+      })
+      const r = decide(input({ policy, sessionId: undefined }))
+      expect(r.sessionBlocked).toBe(true)
+      expect(r.isDryRun).toBe(false)
+    })
+  })
+})
diff --git a/packages/proxy/src/policy/decision-pipeline.ts b/packages/proxy/src/policy/decision-pipeline.ts
new file mode 100644
index 0000000..b9ef2fd
--- /dev/null
+++ b/packages/proxy/src/policy/decision-pipeline.ts
@@ -0,0 +1,242 @@
+// ---------------------------------------------------------------------------
+// Decision pipeline — the side-effect-free policy decision half shared by the
+// MCP forwarder and the sideband governance API (issue #12, D2).
+//
+// `decide()` is the portion of the old GovernedForwarder.handleToolsCall that
+// turns a tool call into a PolicyDecision: rule evaluation, drift gating,
+// flag_destructive escalation, and evidence/dependency grounding. It computes
+// what should happen but **never touches limiter state** — each caller applies
+// the limiter step itself (the MCP path consumes via check() at execution
+// time; the sideband peeks at /evaluate and records at /audit). Keeping this
+// here means both paths share one engine instead of forking security-critical
+// orchestration.
+// ---------------------------------------------------------------------------
+
+import type { CompiledPolicy, PolicyAction, ToolAnnotationHints } from './types.js'
+import { evaluatePolicy } from './engine.js'
+import type { PolicyDecision } from './engine.js'
+import type { ToolDriftEvent } from './annotation-cache.js'
+import type { EvidenceStore } from '../evidence/store.js'
+import { checkEvidence, checkDependencies } from '../evidence/grounding.js'
+import type { EvidenceCheckResult, DependencyCheckResult } from '../evidence/grounding.js'
+
+/** Drift response mode resolved from `policies.on_tool_drift` (default block). */
+export type DriftMode = 'block' | 'require_approval' | 'log'
+
+/**
+ * Everything `decide()` needs about one tool call. Annotation lookups are
+ * passed in resolved (the MCP path reads them from its single cache; the
+ * sideband from a per-origin cache) so the pipeline does not depend on any
+ * particular cache instance.
+ */
+export interface DecideInput {
+  readonly toolName: string
+  readonly toolArguments: Record<string, unknown> | undefined
+  readonly sessionId: string | undefined
+  readonly policy: CompiledPolicy
+  readonly environment: string | undefined
+  readonly evidenceStore: EvidenceStore | undefined
+  /** Baseline annotations — the definition the operator reviewed. */
+  readonly baselineAnnotations: ToolAnnotationHints | undefined
+  /** Latest-claim annotations — used for stricter-of-both in drift log mode. */
+  readonly currentAnnotations: ToolAnnotationHints | undefined
+  /** Active drift event for this tool, if its definition moved off baseline. */
+  readonly driftEvent: ToolDriftEvent | undefined
+}
+
+/** The decision plus all the metadata the execution/audit steps consume. */
+export interface PipelineDecision {
+  /** Final decision after drift gate, flag_destructive, and evidence checks. */
+  readonly decision: PolicyDecision
+  /** The action before evidence checks may have overridden it to `deny`. */
+  readonly originalAction: PolicyAction
+  readonly driftEvent: ToolDriftEvent | undefined
+  readonly driftMode: DriftMode
+  readonly driftBlocked: boolean
+  readonly flaggedDestructive: boolean
+  readonly evidenceResult: EvidenceCheckResult | undefined
+  readonly dependencyResult: DependencyCheckResult | undefined
+  readonly evidenceBlocked: boolean
+  readonly sessionBlocked: boolean
+  readonly isDryRun: boolean
+}
+
+/**
+ * Run the policy decision pipeline for one tool call.
+ *
+ * Pure with respect to limiter and audit state. The only side effect is an
+ * operational `console.error` when `flag_destructive: log` matches an
+ * unguarded destructive tool — preserved verbatim from the original forwarder
+ * so MCP behavior is bit-identical.
+ */
+export function decide(input: DecideInput): PipelineDecision {
+  const { toolName, toolArguments, sessionId, policy, environment, evidenceStore } = input
+
+  const annotations = input.baselineAnnotations
+  const driftEvent = input.driftEvent
+  const driftMode: DriftMode = policy.onToolDrift ?? 'block'
+
+  let decision = evaluatePolicy(policy, {
+    toolName,
+    annotations,
+    toolArguments,
+    environment,
+  })
+
+  // In log mode a drifted tool is evaluated against BOTH the baseline
+  // annotations (what the operator reviewed) and the current upstream claim —
+  // the stricter decision wins, so a definition flip cannot weaken enforcement
+  // in either direction.
+  if (driftEvent && driftMode === 'log') {
+    const currentDecision = evaluatePolicy(policy, {
+      toolName,
+      annotations: input.currentAnnotations,
+      toolArguments,
+      environment,
+    })
+    decision = stricterDecision(decision, currentDecision)
+  }
+
+  // Irreversible action detection: when no explicit rule matched, check if the
+  // tool is destructive and apply flag_destructive behavior.
+  const baselineDestructive = annotations?.destructiveHint ?? true // MCP default
+  const currentDestructive =
+    driftEvent && driftMode === 'log' ? (input.currentAnnotations?.destructiveHint ?? true) : false
+  const isDestructive = baselineDestructive || currentDestructive
+  let flaggedDestructive = false
+
+  if (isDestructive && !decision.matchedRule && policy.flagDestructive) {
+    flaggedDestructive = true
+
+    if (policy.flagDestructive === 'log') {
+      // eslint-disable-next-line no-console -- Intentional operational warning
+      console.error(`[helio] Destructive tool detected: ${toolName} (no matching rule)`)
+    } else {
+      // require_approval — override the decision to escalate
+      decision = {
+        action: 'require_approval',
+        matchedRule: undefined,
+        reason: `Destructive tool "${toolName}" auto-escalated by flag_destructive policy`,
+      }
+    }
+  }
+
+  // Tool definition drift gate: a drifted tool no longer matches the definition
+  // the operator reviewed, so on_tool_drift overrides whatever rule evaluation
+  // produced ("log" leaves the decision untouched).
+  let driftBlocked = false
+  if (driftEvent && driftMode !== 'log') {
+    driftBlocked = driftMode === 'block'
+    decision = {
+      action: driftMode === 'block' ? 'deny' : 'require_approval',
+      matchedRule: undefined,
+      reason: `Tool "${toolName}" definition drifted from baseline (${driftEvent.changes
+        .map((change) => change.aspect)
+        .join(', ')})`,
+    }
+  }
+
+  // Capture original action before evidence checks may override to 'deny'
+  const originalAction = decision.action
+
+  // Evidence grounding + dependency chain checks. Gate all non-deny actions
+  // (allow, require_approval, dry_run, etc.)
+  let evidenceResult: EvidenceCheckResult | undefined
+  let dependencyResult: DependencyCheckResult | undefined
+  let evidenceBlocked = false
+  let sessionBlocked = false
+
+  const requiresGroundedSession =
+    decision.action !== 'deny' &&
+    !!decision.matchedRule &&
+    ((decision.matchedRule.evidence?.requires.length ?? 0) > 0 ||
+      (decision.matchedRule.requires?.length ?? 0) > 0)
+
+  if (requiresGroundedSession && !sessionId) {
+    sessionBlocked = true
+    evidenceBlocked = true
+    decision = {
+      action: 'deny',
+      matchedRule: decision.matchedRule,
+      reason: 'Mcp-Session-Id is required for evidence/dependency-gated policy rules',
+    }
+  }
+
+  if (decision.action !== 'deny' && evidenceStore && sessionId && decision.matchedRule) {
+    const rule = decision.matchedRule
+
+    // Check evidence.requires
+    if (rule.evidence?.requires.length) {
+      evidenceResult = checkEvidence(evidenceStore, sessionId, rule.evidence.requires)
+      if (!evidenceResult.satisfied) {
+        evidenceBlocked = true
+        const problemKeys = [...evidenceResult.missing, ...evidenceResult.expired]
+        decision = {
+          action: 'deny',
+          matchedRule: rule,
+          reason: `Required evidence not satisfied: ${problemKeys.join(', ')}`,
+        }
+      }
+    }
+
+    // Check requires (dependency chains) — only if evidence check passed
+    if (!evidenceBlocked && rule.requires?.length) {
+      dependencyResult = checkDependencies(evidenceStore, sessionId, rule.requires, {
+        requireSuccess: rule.requiresSuccess ?? true,
+      })
+      if (!dependencyResult.satisfied) {
+        evidenceBlocked = true
+        decision = {
+          action: 'deny',
+          matchedRule: rule,
+          reason: `Required tool calls not completed: ${dependencyResult.missing.join(', ')}`,
+        }
+      }
+    }
+  }
+
+  // Dry-run detection: per-rule (action: dry_run) or global (policies.dry_run)
+  const isPerRuleDryRun = originalAction === 'dry_run'
+  const isGlobalDryRun = policy.dryRun === true
+  const isDryRun = (isPerRuleDryRun || isGlobalDryRun) && !sessionBlocked
+
+  return {
+    decision,
+    originalAction,
+    driftEvent,
+    driftMode,
+    driftBlocked,
+    flaggedDestructive,
+    evidenceResult,
+    dependencyResult,
+    evidenceBlocked,
+    sessionBlocked,
+    isDryRun,
+  }
+}
+
+/**
+ * Strictness ranking for policy actions — higher wins in stricter-of-both
+ * (log-mode drift evaluation). The ranking encodes whether an action can reach
+ * upstream and how strongly the operator constrained it:
+ * - deny and require_approval dominate everything (require_approval may
+ *   forward, but only through an explicit human gate);
+ * - dry_run outranks the limit actions and allow because it never forwards;
+ * - between the two limit actions, cross-conflicts (baseline matches one,
+ *   current matches the other) deterministically pick spend_limit. Log mode is
+ *   advisory by operator choice; operators needing hard guarantees use the
+ *   default "block".
+ */
+const ACTION_SEVERITY: Record<PolicyAction, number> = {
+  deny: 5,
+  require_approval: 4,
+  dry_run: 3,
+  spend_limit: 2,
+  rate_limit: 1,
+  allow: 0,
+}
+
+/** Pick the stricter of two policy decisions (ties keep the first). */
+export function stricterDecision(a: PolicyDecision, b: PolicyDecision): PolicyDecision {
+  return ACTION_SEVERITY[b.action] > ACTION_SEVERITY[a.action] ? b : a
+}
diff --git a/packages/proxy/src/policy/governed-forwarder.ts b/packages/proxy/src/policy/governed-forwarder.ts
index b1ffe66..210bcd5 100644
--- a/packages/proxy/src/policy/governed-forwarder.ts
+++ b/packages/proxy/src/policy/governed-forwarder.ts
@@ -6,15 +6,14 @@ import type {
   McpResponse,
 } from '../mcp/types.js'
 import { INTERNAL_ERROR } from '../mcp/types.js'
-import type { CompiledPolicy, PolicyAction } from './types.js'
-import { evaluatePolicy } from './engine.js'
+import type { CompiledPolicy } from './types.js'
 import type { PolicyDecision } from './engine.js'
+import { decide } from './decision-pipeline.js'
 import { ToolAnnotationCache } from './annotation-cache.js'
 import type { ToolDriftEvent, ToolCacheUpdateResult } from './annotation-cache.js'
 import type { AuditWriter } from '../audit/writer.js'
 import type { AuditRecord } from '../audit/types.js'
 import type { EvidenceStore } from '../evidence/store.js'
-import { checkEvidence, checkDependencies } from '../evidence/grounding.js'
 import type { EvidenceCheckResult, DependencyCheckResult } from '../evidence/grounding.js'
 import {
   buildPolicyDeniedFeedback,
@@ -278,6 +277,9 @@ export class GovernedForwarder implements McpForwarder {
       proxy_compute_ms: 0,
       flagged_destructive: false,
       dry_run: false,
+      record_kind: 'drift_event',
+      origin: 'mcp',
+      metadata: null,
     })
   }
 
@@ -297,145 +299,29 @@ export class GovernedForwarder implements McpForwarder {
         ? (params['arguments'] as Record<string, unknown>)
         : undefined
 
-    const annotations = this.annotationCache.get(toolName)
-    const driftEvent = this.annotationCache.getDrift(toolName)
-    const driftMode = this.policy.onToolDrift ?? 'block'
-
-    let decision = evaluatePolicy(this.policy, {
+    const {
+      decision,
+      driftEvent,
+      driftMode,
+      driftBlocked,
+      flaggedDestructive,
+      evidenceResult,
+      dependencyResult,
+      evidenceBlocked,
+      sessionBlocked,
+      isDryRun,
+    } = decide({
       toolName,
-      annotations,
       toolArguments,
+      sessionId: request.sessionId,
+      policy: this.policy,
       environment: this.environment,
+      evidenceStore: this.evidenceStore,
+      baselineAnnotations: this.annotationCache.get(toolName),
+      currentAnnotations: this.annotationCache.getCurrent(toolName),
+      driftEvent: this.annotationCache.getDrift(toolName),
     })
 
-    // In log mode a drifted tool is evaluated against BOTH the baseline
-    // annotations (what the operator reviewed) and the current upstream
-    // claim — the stricter decision wins, so a definition flip cannot weaken
-    // enforcement in either direction.
-    if (driftEvent && driftMode === 'log') {
-      const currentDecision = evaluatePolicy(this.policy, {
-        toolName,
-        annotations: this.annotationCache.getCurrent(toolName),
-        toolArguments,
-        environment: this.environment,
-      })
-      decision = stricterDecision(decision, currentDecision)
-    }
-
-    // Irreversible action detection: when no explicit rule matched,
-    // check if the tool is destructive and apply flag_destructive behavior.
-    const baselineDestructive = annotations?.destructiveHint ?? true // MCP default
-    const currentDestructive =
-      driftEvent && driftMode === 'log'
-        ? (this.annotationCache.getCurrent(toolName)?.destructiveHint ?? true)
-        : false
-    const isDestructive = baselineDestructive || currentDestructive
-    let flaggedDestructive = false
-
-    if (isDestructive && !decision.matchedRule && this.policy.flagDestructive) {
-      flaggedDestructive = true
-
-      if (this.policy.flagDestructive === 'log') {
-        // eslint-disable-next-line no-console -- Intentional operational warning
-        console.error(`[helio] Destructive tool detected: ${toolName} (no matching rule)`)
-      } else {
-        // require_approval — override the decision to escalate
-        decision = {
-          action: 'require_approval',
-          matchedRule: undefined,
-          reason: `Destructive tool "${toolName}" auto-escalated by flag_destructive policy`,
-        }
-      }
-    }
-
-    // Tool definition drift gate: a drifted tool no longer matches the
-    // definition the operator reviewed, so on_tool_drift overrides whatever
-    // rule evaluation produced ("log" leaves the decision untouched).
-    let driftBlocked = false
-    if (driftEvent && driftMode !== 'log') {
-      driftBlocked = driftMode === 'block'
-      decision = {
-        action: driftMode === 'block' ? 'deny' : 'require_approval',
-        matchedRule: undefined,
-        reason: `Tool "${toolName}" definition drifted from baseline (${driftEvent.changes
-          .map((change) => change.aspect)
-          .join(', ')})`,
-      }
-    }
-
-    // Capture original action before evidence checks may override to 'deny'
-    const originalAction = decision.action
-
-    // Evidence grounding + dependency chain checks.
-    // Gate all non-deny actions (allow, require_approval, dry_run, etc.)
-    let evidenceResult: EvidenceCheckResult | undefined
-    let dependencyResult: DependencyCheckResult | undefined
-    let evidenceBlocked = false
-    let sessionBlocked = false
-
-    const requiresGroundedSession =
-      decision.action !== 'deny' &&
-      !!decision.matchedRule &&
-      ((decision.matchedRule.evidence?.requires.length ?? 0) > 0 ||
-        (decision.matchedRule.requires?.length ?? 0) > 0)
-
-    if (requiresGroundedSession && !request.sessionId) {
-      sessionBlocked = true
-      evidenceBlocked = true
-      decision = {
-        action: 'deny',
-        matchedRule: decision.matchedRule,
-        reason: 'Mcp-Session-Id is required for evidence/dependency-gated policy rules',
-      }
-    }
-
-    if (
-      decision.action !== 'deny' &&
-      this.evidenceStore &&
-      request.sessionId &&
-      decision.matchedRule
-    ) {
-      const rule = decision.matchedRule
-
-      // Check evidence.requires
-      if (rule.evidence?.requires.length) {
-        evidenceResult = checkEvidence(
-          this.evidenceStore,
-          request.sessionId,
-          rule.evidence.requires,
-        )
-        if (!evidenceResult.satisfied) {
-          evidenceBlocked = true
-          const problemKeys = [...evidenceResult.missing, ...evidenceResult.expired]
-          decision = {
-            action: 'deny',
-            matchedRule: rule,
-            reason: `Required evidence not satisfied: ${problemKeys.join(', ')}`,
-          }
-        }
-      }
-
-      // Check requires (dependency chains) — only if evidence check passed
-      if (!evidenceBlocked && rule.requires?.length) {
-        dependencyResult = checkDependencies(this.evidenceStore, request.sessionId, rule.requires, {
-          requireSuccess: rule.requiresSuccess ?? true,
-        })
-        if (!dependencyResult.satisfied) {
-          evidenceBlocked = true
-          decision = {
-            action: 'deny',
-            matchedRule: rule,
-            reason: `Required tool calls not completed: ${dependencyResult.missing.join(', ')}`,
-          }
-        }
-      }
-    }
-
-    // Dry-run detection: per-rule (action: dry_run) or global (policies.dry_run: true)
-    const isPerRuleDryRun = originalAction === 'dry_run'
-    const isGlobalDryRun = this.policy.dryRun === true
-    const isDryRun = (isPerRuleDryRun || isGlobalDryRun) && !sessionBlocked
-
     let result: ForwardResult
     let approvalOutcome: ApprovalOutcome | undefined
     let approvalWaitMs = 0
@@ -960,6 +846,9 @@ export class GovernedForwarder implements McpForwarder {
       proxy_compute_ms: proxyComputeMs,
       flagged_destructive: flaggedDestructive,
       dry_run: isDryRun ?? false,
+      record_kind: 'tool_call',
+      origin: 'mcp',
+      metadata: null,
     }
 
     // Security-critical decisions — denies, approval resolutions, and
@@ -1125,32 +1014,6 @@ function collectAllowedEvidenceKeys(policy: CompiledPolicy): string[] {
   return [...keys]
 }
 
-/**
- * Strictness ranking for policy actions — higher wins in stricter-of-both
- * (log-mode drift evaluation). The ranking encodes whether an action can
- * reach upstream and how strongly the operator constrained it:
- * - deny and require_approval dominate everything (require_approval may
- *   forward, but only through an explicit human gate);
- * - dry_run outranks the limit actions and allow because it never forwards;
- * - between the two limit actions, cross-conflicts (baseline matches one,
- *   current matches the other) deterministically pick spend_limit. Log mode
- *   is advisory by operator choice; operators needing hard guarantees use
- *   the default "block".
- */
-const ACTION_SEVERITY: Record<PolicyAction, number> = {
-  deny: 5,
-  require_approval: 4,
-  dry_run: 3,
-  spend_limit: 2,
-  rate_limit: 1,
-  allow: 0,
-}
-
-/** Pick the stricter of two policy decisions (ties keep the first). */
-function stricterDecision(a: PolicyDecision, b: PolicyDecision): PolicyDecision {
-  return ACTION_SEVERITY[b.action] > ACTION_SEVERITY[a.action] ? b : a
-}
-
 /** Build a ForwardResult containing a JSON-RPC error response. */
 function makeErrorResult(
   request: McpRequest,
diff --git a/packages/proxy/src/policy/rate-limiter.test.ts b/packages/proxy/src/policy/rate-limiter.test.ts
index 70432d1..8418cb5 100644
--- a/packages/proxy/src/policy/rate-limiter.test.ts
+++ b/packages/proxy/src/policy/rate-limiter.test.ts
@@ -393,4 +393,79 @@ describe('RateLimiter', () => {
       limiter.close()
     })
   })
+
+  // -------------------------------------------------------------------------
+  // record() — unconditional commit path for the sideband /audit endpoint
+  // (issue #12, D3). The external call already executed, so unlike check()
+  // this always appends, even when the bucket is already at/over the limit.
+  // -------------------------------------------------------------------------
+
+  describe('record', () => {
+    it('appends even when already at/over the limit', () => {
+      const { limiter } = createLimiter()
+
+      // Fill to the limit via check().
+      limiter.check({ key: 'tool:send', maxCalls: 2, windowMs: 60_000 })
+      limiter.check({ key: 'tool:send', maxCalls: 2, windowMs: 60_000 })
+      expect(limiter.check({ key: 'tool:send', maxCalls: 2, windowMs: 60_000 }).allowed).toBe(false)
+
+      // record() commits past the limit — the call really happened.
+      const r = limiter.record({ key: 'tool:send', maxCalls: 2, windowMs: 60_000 })
+      expect(r.current).toBe(3)
+      expect(r.allowed).toBe(false) // over limit, but recorded
+      expect(limiter.getKeyState('tool:send')?.current).toBe(3)
+    })
+
+    it('creates the bucket on first record', () => {
+      const { limiter } = createLimiter()
+
+      const r = limiter.record({ key: 'tool:fresh', maxCalls: 5, windowMs: 60_000 })
+      expect(r.current).toBe(1)
+      expect(r.allowed).toBe(true)
+      expect(limiter.getKeyState('tool:fresh')?.current).toBe(1)
+    })
+
+    it('evicts expired timestamps before appending', () => {
+      const { limiter, advance } = createLimiter()
+
+      limiter.record({ key: 'tool:t', maxCalls: 3, windowMs: 1_000 })
+      advance(2_000) // first entry now expired
+      const r = limiter.record({ key: 'tool:t', maxCalls: 3, windowMs: 1_000 })
+      expect(r.current).toBe(1)
+    })
+
+    it('emits onWarning while within the limit', () => {
+      const time = 1_000_000
+      const warnings: Array<{ current: number }> = []
+      const limiter = new RateLimiter({
+        now: () => time,
+        cleanupIntervalMs: 0,
+        warningThreshold: 0.8,
+        onWarning: (s) => warnings.push({ current: s.current }),
+      })
+
+      limiter.record({ key: 'tool:w', maxCalls: 5, windowMs: 60_000 }) // 1/5
+      limiter.record({ key: 'tool:w', maxCalls: 5, windowMs: 60_000 }) // 2/5
+      limiter.record({ key: 'tool:w', maxCalls: 5, windowMs: 60_000 }) // 3/5
+      expect(warnings).toHaveLength(0)
+      limiter.record({ key: 'tool:w', maxCalls: 5, windowMs: 60_000 }) // 4/5 -> 0.8
+      expect(warnings).toEqual([{ current: 4 }])
+    })
+
+    it('does NOT emit onWarning on over-limit appends (no flood)', () => {
+      const time = 1_000_000
+      const warnings: number[] = []
+      const limiter = new RateLimiter({
+        now: () => time,
+        cleanupIntervalMs: 0,
+        warningThreshold: 0.8,
+        onWarning: (s) => warnings.push(s.current),
+      })
+
+      limiter.record({ key: 'tool:flood', maxCalls: 1, windowMs: 60_000 }) // 1/1 -> warns
+      limiter.record({ key: 'tool:flood', maxCalls: 1, windowMs: 60_000 }) // 2/1 over
+      limiter.record({ key: 'tool:flood', maxCalls: 1, windowMs: 60_000 }) // 3/1 over
+      expect(warnings).toEqual([1]) // only the within-limit append warned
+    })
+  })
 })
diff --git a/packages/proxy/src/policy/rate-limiter.ts b/packages/proxy/src/policy/rate-limiter.ts
index f8f7c3c..fd73995 100644
--- a/packages/proxy/src/policy/rate-limiter.ts
+++ b/packages/proxy/src/policy/rate-limiter.ts
@@ -150,6 +150,54 @@ export class RateLimiter {
     }
   }
 
+  /**
+   * Unconditionally record a call against the rate limit.
+   *
+   * Unlike check(), this always appends the timestamp — even when the bucket
+   * is already at/over the limit — because the call it represents has already
+   * executed. The sideband splits decision from execution: /evaluate peeks
+   * (non-destructive), and /audit calls record() once the external call ran,
+   * so refusing to record at the limit (as check() does) would let real calls
+   * escape accounting and under-count subsequent peeks. (issue #12, D3.)
+   *
+   * Warnings fire only while the post-append count stays within the limit —
+   * exact parity with check(), which never warns on its over-limit path — so a
+   * burst of over-limit audits cannot flood the dashboard's limit_warning feed.
+   */
+  record(params: RateLimitCheckParams): RateLimitResult {
+    const { key, maxCalls, windowMs } = params
+    const now = this.now()
+    const windowStart = now - windowMs
+
+    let bucket = this.buckets.get(key)
+    if (!bucket) {
+      bucket = { timestamps: [], maxCalls, windowMs }
+      this.buckets.set(key, bucket)
+    }
+
+    // Update stored config (may change on policy hot-reload)
+    bucket.maxCalls = maxCalls
+    bucket.windowMs = windowMs
+
+    // Evict expired timestamps, then append unconditionally.
+    bucket.timestamps = bucket.timestamps.filter((ts) => ts > windowStart)
+    bucket.timestamps.push(now)
+    const current = bucket.timestamps.length
+    const resetAtMs = (bucket.timestamps[0] ?? now) + windowMs
+
+    if (this.onWarning && current <= maxCalls && current / maxCalls >= this.warningThreshold) {
+      this.onWarning({ key, current, limit: maxCalls, window_ms: windowMs, reset_at_ms: resetAtMs })
+    }
+
+    return {
+      allowed: current <= maxCalls,
+      current,
+      limit: maxCalls,
+      windowMs,
+      resetAtMs,
+    }
+  }
+
   /**
    * Check the rate limit without recording the call (non-destructive).
    *
diff --git a/packages/proxy/src/policy/spend-limiter.test.ts b/packages/proxy/src/policy/spend-limiter.test.ts
index f7ebbe8..46a2710 100644
--- a/packages/proxy/src/policy/spend-limiter.test.ts
+++ b/packages/proxy/src/policy/spend-limiter.test.ts
@@ -558,4 +558,76 @@ describe('SpendLimiter', () => {
       limiter.close()
     })
   })
+
+  // -------------------------------------------------------------------------
+  // record() — unconditional commit path for the sideband /audit endpoint
+  // (issue #12, D3). The spend already happened, so this always appends —
+  // even past the limit — and throws on invalid amounts (which must have been
+  // rejected at /evaluate; reaching record() with one is a logic bug).
+  // -------------------------------------------------------------------------
+
+  describe('record', () => {
+    it('appends even when the spend exceeds the limit', () => {
+      const { limiter } = createLimiter()
+
+      limiter.check({ key: 'tool:pay', amount: 18, limit: 20, windowMs: 60_000 })
+      // A spend of 5 would exceed via check() and be rejected, but it executed.
+      const r = limiter.record({ key: 'tool:pay', amount: 5, limit: 20, windowMs: 60_000 })
+      expect(r.currentSpend).toBe(23)
+      expect(r.allowed).toBe(false) // over limit, but recorded
+      expect(limiter.getKeyState('tool:pay')?.current_spend).toBe(23)
+    })
+
+    it('creates the bucket on first record', () => {
+      const { limiter } = createLimiter()
+
+      const r = limiter.record({ key: 'tool:new', amount: 4.5, limit: 20, windowMs: 60_000 })
+      expect(r.currentSpend).toBe(4.5)
+      expect(r.allowed).toBe(true)
+      expect(limiter.getKeyState('tool:new')?.current_spend).toBe(4.5)
+    })
+
+    it('evicts expired entries before appending', () => {
+      const { limiter, advance } = createLimiter()
+
+      limiter.record({ key: 'tool:s', amount: 10, limit: 20, windowMs: 1_000 })
+      advance(2_000)
+      const r = limiter.record({ key: 'tool:s', amount: 3, limit: 20, windowMs: 1_000 })
+      expect(r.currentSpend).toBe(3)
+    })
+
+    it('throws on a negative or non-finite amount', () => {
+      const { limiter } = createLimiter()
+
+      expect(() => limiter.record({ key: 'k', amount: -1, limit: 20, windowMs: 1_000 })).toThrow(
+        RangeError,
+      )
+      expect(() => limiter.record({ key: 'k', amount: NaN, limit: 20, windowMs: 1_000 })).toThrow(
+        RangeError,
+      )
+      expect(() =>
+        limiter.record({ key: 'k', amount: Infinity, limit: 20, windowMs: 1_000 }),
+      ).toThrow(RangeError)
+      // No bucket state was created by the rejected attempts.
+      expect(limiter.getKeyState('k')).toBeUndefined()
+    })
+
+    it('emits onWarning while within the limit but not past it', () => {
+      const time = 1_000_000
+      const warnings: number[] = []
+      const limiter = new SpendLimiter({
+        now: () => time,
+        cleanupIntervalMs: 0,
+        warningThreshold: 0.8,
+        onWarning: (s) => warnings.push(s.current_spend),
+      })
+
+      limiter.record({ key: 'tool:w', amount: 10, limit: 20, windowMs: 60_000 }) // 10/20
+      expect(warnings).toHaveLength(0)
+      limiter.record({ key: 'tool:w', amount: 6, limit: 20, windowMs: 60_000 }) // 16/20 -> 0.8
+      expect(warnings).toEqual([16])
+      limiter.record({ key: 'tool:w', amount: 10, limit: 20, windowMs: 60_000 }) // 26/20 over
+      expect(warnings).toEqual([16]) // no warning on the over-limit append
+    })
+  })
 })
diff --git a/packages/proxy/src/policy/spend-limiter.ts b/packages/proxy/src/policy/spend-limiter.ts
index c95b73f..7d8b2b3 100644
--- a/packages/proxy/src/policy/spend-limiter.ts
+++ b/packages/proxy/src/policy/spend-limiter.ts
@@ -195,6 +195,68 @@ export class SpendLimiter {
     }
   }
 
+  /**
+   * Unconditionally record a spend against the limit.
+   *
+   * Unlike check(), this always appends the amount — even when it pushes the
+   * window past the limit — because the spend it represents has already been
+   * incurred. The sideband peeks at /evaluate and commits here at /audit once
+   * the external call ran (issue #12, D3).
+   *
+   * Throws on a negative or non-finite amount: such amounts are rejected at
+   * /evaluate, so one reaching record() is a logic bug we surface loudly rather
+   * than silently corrupt the sliding-window sum. Warnings fire only while the
+   * post-append spend stays within the limit (parity with check()).
+   */
+  record(params: SpendLimitCheckParams): SpendLimitResult {
+    const { key, amount, limit, windowMs } = params
+
+    if (!Number.isFinite(amount) || amount < 0) {
+      throw new RangeError(
+        `SpendLimiter.record() received an invalid amount (${String(amount)}); ` +
+          'invalid amounts must be rejected at /evaluate, never committed',
+      )
+    }
+
+    const now = this.now()
+    const windowStart = now - windowMs
+
+    let bucket = this.buckets.get(key)
+    if (!bucket) {
+      bucket = { entries: [], limit, currency: '', windowMs }
+      this.buckets.set(key, bucket)
+    }
+
+    // Update stored config (may change on policy hot-reload)
+    bucket.limit = limit
+    bucket.windowMs = windowMs
+
+    // Evict expired entries, then append unconditionally.
+    bucket.entries = bucket.entries.filter((e) => e.timestamp > windowStart)
+    bucket.entries.push({ timestamp: now, amount })
+    const currentSpend = bucket.entries.reduce((sum, e) => sum + e.amount, 0)
+    const resetAtMs = (bucket.entries[0]?.timestamp ?? now) + windowMs
+
+    if (this.onWarning && currentSpend <= limit && currentSpend / limit >= this.warningThreshold) {
+      this.onWarning({
+        key,
+        current_spend: currentSpend,
+        limit,
+        currency: bucket.currency,
+        window_ms: windowMs,
+        reset_at_ms: resetAtMs,
+      })
+    }
+
+    return {
+      allowed: currentSpend <= limit,
+      currentSpend,
+      limit,
+      windowMs,
+      resetAtMs,
+    }
+  }
+
   /**
    * Check the spend limit without recording the spend (non-destructive).
    *
diff --git a/packages/proxy/src/server.test.ts b/packages/proxy/src/server.test.ts
index d90d341..3793942 100644
--- a/packages/proxy/src/server.test.ts
+++ b/packages/proxy/src/server.test.ts
@@ -45,7 +45,7 @@ const minimalConfig = {
     retention: '90d',
     include_responses: true,
   },
-  sdk: { enabled: false, port: 3200, host: '127.0.0.1' },
+  sdk: { enabled: false, port: 3200, host: '127.0.0.1', evaluation_ttl: '10m' },
 } as HelioConfig
 
 // ---------------------------------------------------------------------------
diff --git a/packages/proxy/src/sideband/errors.ts b/packages/proxy/src/sideband/errors.ts
new file mode 100644
index 0000000..75513e1
--- /dev/null
+++ b/packages/proxy/src/sideband/errors.ts
@@ -0,0 +1,16 @@
+/**
+ * Error thrown when the GovernanceService is wired in a way that could fail
+ * open — for example, an approval-capable policy with no ApprovalRouter to
+ * route `require_approval` decisions through (issue #12). Surfaced at
+ * construction and on hot-reload so a misconfiguration crashes loudly rather
+ * than silently degrading an approval into an unenforced allow.
+ *
+ * Direct embedders of GovernanceService can catch this distinctly; the bundled
+ * CLI always provides a router, so it never fires in the shipped path.
+ */
+export class GovernanceConfigError extends Error {
+  constructor(message: string) {
+    super(message)
+    this.name = 'GovernanceConfigError'
+  }
+}
diff --git a/packages/proxy/src/sideband/governance-api.test.ts b/packages/proxy/src/sideband/governance-api.test.ts
new file mode 100644
index 0000000..faac847
--- /dev/null
+++ b/packages/proxy/src/sideband/governance-api.test.ts
@@ -0,0 +1,191 @@
+import { describe, it, expect, afterEach } from 'vitest'
+import { createSidebandApp } from '../evidence/api.js'
+import { EvidenceStore } from '../evidence/store.js'
+import { GovernanceService } from './governance-service.js'
+import { compilePolicies } from '../policy/parser.js'
+
+// ---------------------------------------------------------------------------
+// Helpers
+// ---------------------------------------------------------------------------
+
+const SDK_TOKEN = 'sdk-token-aaaaaaaaaaaaaaaa'
+const ADAPTER_TOKEN = 'adapter-token-bbbbbbbbbbbb'
+
+function setup(opts?: { withGovernance?: boolean; tokens?: boolean }) {
+  const store = new EvidenceStore({ cleanupIntervalMs: 0 })
+  const policy = compilePolicies({
+    default: 'allow',
+    dry_run: false,
+    rules: [{ name: 'no-send', match: { tool: 'blocked' }, action: 'deny' }],
+  }).policy
+
+  const governance =
+    opts?.withGovernance === false
+      ? undefined
+      : new GovernanceService({ policy, sweepIntervalMs: 0 })
+
+  const app = createSidebandApp(store, {
+    token: opts?.tokens ? SDK_TOKEN : undefined,
+    adapterToken: opts?.tokens ? ADAPTER_TOKEN : undefined,
+    governance,
+  })
+
+  const post = (path: string, body: unknown, headers: Record<string, string> = {}) =>
+    app.request(path, {
+      method: 'POST',
+      headers: { 'content-type': 'application/json', ...headers },
+      body: typeof body === 'string' ? body : JSON.stringify(body),
+    })
+
+  return { app, store, governance, post }
+}
+
+const evalBody = (overrides?: Record<string, unknown>) => ({
+  origin: 'openclaw',
+  tool: { name: 'send' },
+  arguments: {},
+  ...overrides,
+})
+
+// ---------------------------------------------------------------------------
+// Tests
+// ---------------------------------------------------------------------------
+
+describe('sideband governance routes', () => {
+  let store: EvidenceStore | null = null
+  let governance: GovernanceService | null = null
+  afterEach(() => {
+    store?.close()
+    governance?.close()
+    store = null
+    governance = null
+  })
+
+  it('evaluates a tool call and returns a decision', async () => {
+    const ctx = setup()
+    store = ctx.store
+    governance = ctx.governance ?? null
+    const res = await ctx.post('/evaluate', evalBody())
+    expect(res.status).toBe(200)
+    const json = (await res.json()) as { decision: string; evaluation_id: string }
+    expect(json.decision).toBe('allow')
+    expect(json.evaluation_id).toBeTruthy()
+  })
+
+  it('returns 503 governance_unavailable when no service is wired', async () => {
+    const ctx = setup({ withGovernance: false })
+    store = ctx.store
+    const res = await ctx.post('/evaluate', evalBody())
+    expect(res.status).toBe(503)
+    expect(((await res.json()) as { error: string }).error).toBe('governance_unavailable')
+  })
+
+  it('rejects malformed JSON with 400', async () => {
+    const ctx = setup()
+    store = ctx.store
+    governance = ctx.governance ?? null
+    const res = await ctx.post('/evaluate', '{ not json')
+    expect(res.status).toBe(400)
+    expect(((await res.json()) as { error: string }).error).toBe('Invalid JSON')
+  })
+
+  it('rejects a bad origin with a validation error', async () => {
+    const ctx = setup()
+    store = ctx.store
+    governance = ctx.governance ?? null
+    const res = await ctx.post('/evaluate', evalBody({ origin: 'NOT VALID!' }))
+    expect(res.status).toBe(400)
+    expect(((await res.json()) as { error: string }).error).toBe('Validation error')
+  })
+
+  it('rejects oversized metadata with 413', async () => {
+    const ctx = setup()
+    store = ctx.store
+    governance = ctx.governance ?? null
+    const res = await ctx.post('/evaluate', evalBody({ metadata: { blob: 'x'.repeat(5000) } }))
+    expect(res.status).toBe(413)
+  })
+
+  it('refuses cross-origin (browser) requests on governance routes', async () => {
+    const ctx = setup()
+    store = ctx.store
+    governance = ctx.governance ?? null
+    const res = await ctx.post('/evaluate', evalBody(), { origin: 'http://evil.local' })
+    expect(res.status).toBe(403)
+  })
+
+  describe('scoped tokens (F6)', () => {
+    it('governance routes require the adapter token, not the SDK token', async () => {
+      const ctx = setup({ tokens: true })
+      store = ctx.store
+      governance = ctx.governance ?? null
+
+      // No token → 401
+      expect((await ctx.post('/evaluate', evalBody())).status).toBe(401)
+      // SDK token is the wrong scope → 401
+      expect(
+        (await ctx.post('/evaluate', evalBody(), { authorization: `Bearer ${SDK_TOKEN}` })).status,
+      ).toBe(401)
+      // Adapter token → ok
+      expect(
+        (await ctx.post('/evaluate', evalBody(), { authorization: `Bearer ${ADAPTER_TOKEN}` }))
+          .status,
+      ).toBe(200)
+    })
+
+    it('evidence routes reject the adapter token (wrong scope)', async () => {
+      const ctx = setup({ tokens: true })
+      store = ctx.store
+      governance = ctx.governance ?? null
+
+      const body = {
+        session_id: 's1',
+        tool_name: 't',
+        evidence_key: 'k',
+        evidence_data: { x: 1 },
+      }
+      expect(
+        (await ctx.post('/evidence', body, { authorization: `Bearer ${ADAPTER_TOKEN}` })).status,
+      ).toBe(401)
+      expect(
+        (await ctx.post('/evidence', body, { authorization: `Bearer ${SDK_TOKEN}` })).status,
+      ).toBe(201)
+    })
+  })
+
+  it('runs the full evaluate → audit loop', async () => {
+    const ctx = setup()
+    store = ctx.store
+    governance = ctx.governance ?? null
+
+    const ev = await ctx.post('/evaluate', evalBody())
+    const { evaluation_id } = (await ev.json()) as { evaluation_id: string }
+
+    const audit = await ctx.post('/audit', { evaluation_id, status: 'success', duration_ms: 12 })
+    expect(audit.status).toBe(201)
+    const json = (await audit.json()) as { ok: boolean; audit_record_id: string }
+    expect(json.ok).toBe(true)
+    expect(json.audit_record_id).toBeTruthy()
+
+    // Idempotent replay of the same payload.
+    const replay = await ctx.post('/audit', {
+      evaluation_id,
+      status: 'success',
+      duration_ms: 12,
+    })
+    expect(replay.status).toBe(200)
+    expect(((await replay.json()) as { already_finalized: boolean }).already_finalized).toBe(true)
+  })
+
+  it('returns observational allow for install-scan', async () => {
+    const ctx = setup()
+    store = ctx.store
+    governance = ctx.governance ?? null
+    const res = await ctx.post('/install-scan', {
+      origin: 'openclaw',
+      package: { name: 'left-pad', source: 'npm' },
+    })
+    expect(res.status).toBe(200)
+    expect(((await res.json()) as { decision: string }).decision).toBe('allow')
+  })
+})
diff --git a/packages/proxy/src/sideband/governance-api.ts b/packages/proxy/src/sideband/governance-api.ts
new file mode 100644
index 0000000..df5e1a1
--- /dev/null
+++ b/packages/proxy/src/sideband/governance-api.ts
@@ -0,0 +1,214 @@
+import { Hono } from 'hono'
+import type { Context } from 'hono'
+import { z } from 'zod'
+import { createHash } from 'node:crypto'
+import type { GovernanceService } from './governance-service.js'
+import { canonicalize } from '../util/canonical-json.js'
+import { formatZodErrors } from '../util/format-zod-errors.js'
+
+// ---------------------------------------------------------------------------
+// Request schemas (issue #12). snake_case crosses the wire; the contract is
+// framework-neutral (D13) — no adapter-specific field names or enums.
+// ---------------------------------------------------------------------------
+
+const originSchema = z
+  .string()
+  .regex(/^[a-z0-9_-]{1,64}$/, 'origin must match ^[a-z0-9_-]{1,64}$')
+  .default('sideband')
+
+const metadataSchema = z.record(z.string(), z.unknown()).nullish()
+
+const toolDefinitionSchema = z.object({
+  name: z.string().min(1),
+  description: z.string().optional(),
+  input_schema: z.unknown().optional(),
+  output_schema: z.unknown().optional(),
+  title: z.string().optional(),
+  annotations: z.record(z.string(), z.unknown()).optional(),
+})
+
+const evaluateBody = z.object({
+  origin: originSchema,
+  adapter_version: z.string().max(64).optional(),
+  agent_id: z.string().nullish(),
+  session_id: z.string().nullish(),
+  tool: toolDefinitionSchema,
+  arguments: z.record(z.string(), z.unknown()).optional(),
+  metadata: metadataSchema,
+})
+
+const installScanBody = z.object({
+  origin: originSchema,
+  agent_id: z.string().nullish(),
+  session_id: z.string().nullish(),
+  package: z.object({
+    name: z.string().min(1),
+    version: z.string().optional(),
+    source: z.string().max(64).optional(),
+    spec: z.string().optional(),
+    url: z.string().optional(),
+  }),
+  metadata: metadataSchema,
+})
+
+const auditBody = z.object({
+  evaluation_id: z.string().min(1),
+  status: z.enum(['success', 'error', 'not_executed']),
+  error: z.string().optional(),
+  duration_ms: z.number().optional(),
+  result: z.unknown().optional(),
+  actual_amount: z.number().optional(),
+})
+
+const resolveBody = z.object({
+  resolution: z.enum(['approved', 'denied', 'timeout', 'cancelled']),
+  resolved_by: z.string().optional(),
+  reason: z.string().optional(),
+  scope: z.enum(['once', 'always']).optional(),
+})
+
+/** Max serialized size of the `metadata` object (D11/D15). */
+const MAX_METADATA_BYTES = 4 * 1_024
+
+// ---------------------------------------------------------------------------
+// App
+// ---------------------------------------------------------------------------
+
+/**
+ * Build the Hono sub-app for the four sideband governance endpoints
+ * (issue #12). Mounted at the sideband root by `createSidebandApp`; the
+ * adapter-token bearer guard and Origin/body-size middleware are applied
+ * there. When `service` is undefined, every route returns 503 so the routes
+ * exist (and document themselves) even in evidence-only deployments.
+ */
+export function createGovernanceApp(service: GovernanceService | undefined): Hono {
+  const app = new Hono()
+
+  const unavailable = () => ({ error: 'governance_unavailable' }) as const
+
+  app.post('/evaluate', async (c) => {
+    if (!service) return c.json(unavailable(), 503)
+    const parsed = await parseJson(c)
+    if ('error' in parsed) return c.json(parsed.error, 400)
+    const result = evaluateBody.safeParse(parsed.body)
+    if (!result.success) {
+      return c.json({ error: 'Validation error', details: formatZodErrors(result.error) }, 400)
+    }
+    if (metadataTooLarge(result.data.metadata)) {
+      return c.json({ error: 'metadata_too_large' }, 413)
+    }
+    const r = service.evaluate({
+      origin: result.data.origin,
+      adapter_version: result.data.adapter_version,
+      agent_id: result.data.agent_id ?? null,
+      session_id: result.data.session_id ?? null,
+      tool: result.data.tool,
+      arguments: result.data.arguments,
+      metadata: result.data.metadata ?? null,
+    })
+    return c.json(r.body, asStatus(r.status))
+  })
+
+  app.post('/audit', async (c) => {
+    if (!service) return c.json(unavailable(), 503)
+    const parsed = await parseJson(c)
+    if ('error' in parsed) return c.json(parsed.error, 400)
+    const result = auditBody.safeParse(parsed.body)
+    if (!result.success) {
+      return c.json({ error: 'Validation error', details: formatZodErrors(result.error) }, 400)
+    }
+    const hash = auditPayloadHash(result.data)
+    const r = service.audit(result.data, hash)
+    return c.json(r.body, asStatus(r.status))
+  })
+
+  app.post('/install-scan', async (c) => {
+    if (!service) return c.json(unavailable(), 503)
+    const parsed = await parseJson(c)
+    if ('error' in parsed) return c.json(parsed.error, 400)
+    const result = installScanBody.safeParse(parsed.body)
+    if (!result.success) {
+      return c.json({ error: 'Validation error', details: formatZodErrors(result.error) }, 400)
+    }
+    if (metadataTooLarge(result.data.metadata)) {
+      return c.json({ error: 'metadata_too_large' }, 413)
+    }
+    const r = service.installScan({
+      origin: result.data.origin,
+      agent_id: result.data.agent_id ?? null,
+      session_id: result.data.session_id ?? null,
+      package: result.data.package,
+      metadata: result.data.metadata ?? null,
+    })
+    return c.json(r.body, asStatus(r.status))
+  })
+
+  app.post('/approval/:id/resolve', async (c) => {
+    if (!service) return c.json(unavailable(), 503)
+    const parsed = await parseJson(c)
+    if ('error' in parsed) return c.json(parsed.error, 400)
+    const result = resolveBody.safeParse(parsed.body)
+    if (!result.success) {
+      return c.json({ error: 'Validation error', details: formatZodErrors(result.error) }, 400)
+    }
+    if (
+      (result.data.resolution === 'approved' || result.data.resolution === 'denied') &&
+      !result.data.resolved_by
+    ) {
+      return c.json({ error: 'resolved_by is required for approved/denied' }, 400)
+    }
+    const r = service.resolveApproval(c.req.param('id'), result.data)
+    return c.json(r.body, asStatus(r.status))
+  })
+
+  return app
+}
+
+/** Path prefixes that the adapter-token guard (not the SDK token) protects. */
+export function isGovernancePath(path: string): boolean {
+  return (
+    path === '/evaluate' ||
+    path === '/audit' ||
+    path === '/install-scan' ||
+    path.startsWith('/approval/')
+  )
+}
+
+// ---------------------------------------------------------------------------
+// Helpers
+// ---------------------------------------------------------------------------
+
+async function parseJson(c: Context): Promise<{ body: unknown } | { error: { error: string } }> {
+  try {
+    return { body: await c.req.json() }
+  } catch {
+    return { error: { error: 'Invalid JSON' } }
+  }
+}
+
+function metadataTooLarge(metadata: Record<string, unknown> | null | undefined): boolean {
+  if (metadata == null) return false
+  return Buffer.byteLength(canonicalize(metadata), 'utf8') > MAX_METADATA_BYTES
+}
+
+/**
+ * SHA-256 over the canonical JSON of the semantic /audit fields (D5). Key
+ * order, whitespace, and omitted-vs-default fields cannot produce spurious
+ * idempotency conflicts; a retry that recomputes duration_ms is an adapter bug
+ * and correctly surfaces as a conflict.
+ */
+function auditPayloadHash(data: z.infer<typeof auditBody>): string {
+  const semantic = {
+    status: data.status,
+    error: data.error ?? null,
+    duration_ms: data.duration_ms ?? null,
+    result: data.result ?? null,
+    actual_amount: data.actual_amount ?? null,
+  }
+  return createHash('sha256').update(canonicalize(semantic)).digest('hex')
+}
+
+/** Narrow a numeric status to Hono's accepted status type. */
+function asStatus(status: number): 200 | 201 | 400 | 404 | 409 | 413 | 503 {
+  return status as 200 | 201 | 400 | 404 | 409 | 413 | 503
+}
diff --git a/packages/proxy/src/sideband/governance-service.test.ts b/packages/proxy/src/sideband/governance-service.test.ts
new file mode 100644
index 0000000..28aeb9e
--- /dev/null
+++ b/packages/proxy/src/sideband/governance-service.test.ts
@@ -0,0 +1,623 @@
+import { describe, it, expect } from 'vitest'
+import { GovernanceService } from './governance-service.js'
+import type { EvaluateInput, AuditInput } from './governance-service.js'
+import { GovernanceConfigError } from './errors.js'
+import { compilePolicies } from '../policy/parser.js'
+import type { PoliciesConfig } from '../config/schema.js'
+import type { CompiledPolicy } from '../policy/types.js'
+import type { AuditRecord } from '../audit/types.js'
+import type { AuditWriter } from '../audit/writer.js'
+import { RateLimiter } from '../policy/rate-limiter.js'
+import { SpendLimiter } from '../policy/spend-limiter.js'
+import { ApprovalRouter } from '../approval/router.js'
+import { ApprovalQueue } from '../approval/queue.js'
+import { EvidenceStore } from '../evidence/index.js'
+
+// ---------------------------------------------------------------------------
+// Helpers
+// ---------------------------------------------------------------------------
+
+function compile(config: Omit<PoliciesConfig, 'dry_run'> & { dry_run?: boolean }): CompiledPolicy {
+  return compilePolicies({ dry_run: false, ...config }).policy
+}
+
+interface Captured {
+  id: string
+  record: Omit<AuditRecord, 'id' | 'created_at'>
+  immediate: boolean
+}
+
+function fakeWriter() {
+  const records: Captured[] = []
+  let counter = 0
+  const writer = {
+    push: (record: Omit<AuditRecord, 'id' | 'created_at'>, id = `id-${String(++counter)}`) =>
+      records.push({ id, record, immediate: false }),
+    pushImmediate: (
+      record: Omit<AuditRecord, 'id' | 'created_at'>,
+      id = `id-${String(++counter)}`,
+    ) => records.push({ id, record, immediate: true }),
+  } as unknown as AuditWriter
+  return { writer, records }
+}
+
+interface ServiceHarness {
+  service: GovernanceService
+  records: Captured[]
+  advance: (ms: number) => void
+  rateLimiter: RateLimiter | undefined
+  spendLimiter: SpendLimiter | undefined
+  approvalRouter: ApprovalRouter | undefined
+  evidenceStore: EvidenceStore | undefined
+}
+
+function makeService(opts?: {
+  policy?: CompiledPolicy
+  withLimiters?: boolean
+  withApprovals?: boolean
+  withEvidence?: boolean
+  ttlMs?: number
+}): ServiceHarness {
+  let time = 1_000_000
+  const now = () => time
+  const advance = (ms: number) => {
+    time += ms
+  }
+  const { writer, records } = fakeWriter()
+  const policy = opts?.policy ?? compile({ default: 'allow', rules: [] })
+
+  const rateLimiter = opts?.withLimiters
+    ? new RateLimiter({ now, cleanupIntervalMs: 0 })
+    : undefined
+  const spendLimiter = opts?.withLimiters
+    ? new SpendLimiter({ now, cleanupIntervalMs: 0 })
+    : undefined
+  const queue = opts?.withApprovals ? new ApprovalQueue({ now, cleanupIntervalMs: 0 }) : undefined
+  const approvalRouter =
+    opts?.withApprovals && queue
+      ? new ApprovalRouter({
+          defaultTimeoutMs: 300_000,
+          defaultOnTimeout: 'deny',
+          channels: new Map(),
+          queue,
+          now,
+        })
+      : undefined
+  const evidenceStore = opts?.withEvidence ? new EvidenceStore({ now }) : undefined
+
+  const service = new GovernanceService({
+    policy,
+    auditWriter: writer,
+    rateLimiter,
+    spendLimiter,
+    approvalRouter,
+    evidenceStore,
+    ttlMs: opts?.ttlMs ?? 600_000,
+    now,
+    sweepIntervalMs: 0,
+  })
+
+  return { service, records, advance, rateLimiter, spendLimiter, approvalRouter, evidenceStore }
+}
+
+function evalInput(overrides?: Partial<EvaluateInput>): EvaluateInput {
+  return {
+    origin: 'openclaw',
+    agent_id: 'main',
+    session_id: null,
+    tool: { name: 'send' },
+    arguments: {},
+    metadata: null,
+    ...overrides,
+  }
+}
+
+function auditInput(id: string, overrides?: Partial<AuditInput>): AuditInput {
+  return { evaluation_id: id, status: 'success', ...overrides }
+}
+
+// ---------------------------------------------------------------------------
+// evaluate
+// ---------------------------------------------------------------------------
+
+describe('GovernanceService.evaluate', () => {
+  it('returns allow for a default-allow policy and creates a pending entry', () => {
+    const { service } = makeService()
+    const res = service.evaluate(evalInput())
+    expect(res.status).toBe(200)
+    expect(res.body['decision']).toBe('allow')
+    expect(res.body['evaluation_id']).toBeTruthy()
+  })
+
+  it('audits deny terminally at evaluate time (no /audit needed)', () => {
+    const policy = compile({
+      default: 'allow',
+      rules: [{ name: 'no-send', match: { tool: 'send' }, action: 'deny' }],
+    })
+    const { service, records } = makeService({ policy })
+    const res = service.evaluate(evalInput())
+    expect(res.body['decision']).toBe('deny')
+    expect(res.body['feedback']).toBeTruthy()
+    expect(records).toHaveLength(1)
+    expect(records[0]?.record.policy_decision).toBe('deny')
+    expect(records[0]?.record.block_reason).toBe('policy_denied')
+    expect(records[0]?.record.origin).toBe('openclaw')
+    expect(records[0]?.immediate).toBe(true)
+  })
+
+  it('does not consume rate counters at evaluate (peek only)', () => {
+    const policy = compile({
+      default: 'allow',
+      rules: [
+        {
+          name: 'rl',
+          match: { tool: 'send' },
+          action: 'rate_limit',
+          limits: { max_calls: 2, window: '60s' },
+        },
+      ],
+    })
+    const { service, rateLimiter } = makeService({ policy, withLimiters: true })
+    service.evaluate(evalInput())
+    service.evaluate(evalInput())
+    service.evaluate(evalInput())
+    // Three evaluates, zero consumption — the bucket is still empty.
+    expect(rateLimiter?.getKeyState('tool:send')).toBeUndefined()
+  })
+
+  it('returns rate_limited terminally when peek is over the limit', () => {
+    const policy = compile({
+      default: 'allow',
+      rules: [
+        {
+          name: 'rl',
+          match: { tool: 'send' },
+          action: 'rate_limit',
+          limits: { max_calls: 1, window: '60s' },
+        },
+      ],
+    })
+    const { service, rateLimiter, records } = makeService({ policy, withLimiters: true })
+    // Pre-fill the bucket so peek is over the limit.
+    rateLimiter?.record({ key: 'tool:send', maxCalls: 1, windowMs: 60_000 })
+    const res = service.evaluate(evalInput())
+    expect(res.body['decision']).toBe('rate_limited')
+    expect(records.at(-1)?.record.block_reason).toBe('rate_limited')
+  })
+
+  it('creates a native approval ticket for require_approval', () => {
+    const policy = compile({
+      default: 'allow',
+      rules: [{ name: 'ap', match: { tool: 'send' }, action: 'require_approval' }],
+    })
+    const { service, approvalRouter } = makeService({ policy, withApprovals: true })
+    const res = service.evaluate(evalInput())
+    expect(res.body['decision']).toBe('require_approval')
+    const approval = res.body['approval'] as { id: string; resolve_path: string }
+    expect(approval.id).toBeTruthy()
+    expect(approval.resolve_path).toBe(`/approval/${approval.id}/resolve`)
+    expect(approvalRouter?.getTicket(approval.id)?.channel_name).toBe('native:openclaw')
+  })
+
+  describe('drift guard (D6)', () => {
+    it('detects drift across two evaluates and gates per on_tool_drift: block', () => {
+      const policy = compile({ default: 'allow', on_tool_drift: 'block', rules: [] })
+      const { service } = makeService({ policy })
+      const first = service.evaluate(evalInput({ tool: { name: 'send', description: 'v1' } }))
+      expect(first.body['decision']).toBe('allow')
+      const second = service.evaluate(evalInput({ tool: { name: 'send', description: 'v2' } }))
+      expect(second.body['decision']).toBe('deny')
+      expect(second.body['tool_drift']).toBeTruthy()
+    })
+
+    it('isolates drift baselines per origin', () => {
+      const policy = compile({ default: 'allow', on_tool_drift: 'block', rules: [] })
+      const { service } = makeService({ policy })
+      service.evaluate(evalInput({ origin: 'openclaw', tool: { name: 'send', description: 'v1' } }))
+      // A different origin baselining the same tool name with a different def
+      // must NOT be flagged as drift — separate caches.
+      const other = service.evaluate(
+        evalInput({ origin: 'hermes', tool: { name: 'send', description: 'vX' } }),
+      )
+      expect(other.body['decision']).toBe('allow')
+    })
+  })
+
+  describe('memory budgets (D15)', () => {
+    it('rejects tool_input over 64 KiB with 413', () => {
+      const { service } = makeService()
+      const big = 'x'.repeat(70 * 1024)
+      const res = service.evaluate(evalInput({ arguments: { blob: big } }))
+      expect(res.status).toBe(413)
+    })
+
+    it('rejects first-seen tools past the per-origin cap but allows existing updates', () => {
+      const policy = compile({ default: 'allow', on_tool_drift: 'block', rules: [] })
+      const { service } = makeService({ policy })
+      // The cap is large (1024); simulate by spying is overkill — instead verify
+      // that an already-baselined tool keeps evaluating (the at-cap rule).
+      service.evaluate(evalInput({ tool: { name: 'send', description: 'v1' } }))
+      const again = service.evaluate(evalInput({ tool: { name: 'send', description: 'v1' } }))
+      expect(again.status).toBe(200)
+    })
+  })
+})
+
+// ---------------------------------------------------------------------------
+// audit
+// ---------------------------------------------------------------------------
+
+describe('GovernanceService.audit', () => {
+  it('consumes the rate counter on success and is idempotent on replay', () => {
+    const policy = compile({
+      default: 'allow',
+      rules: [
+        {
+          name: 'rl',
+          match: { tool: 'send' },
+          action: 'rate_limit',
+          limits: { max_calls: 5, window: '60s' },
+        },
+      ],
+    })
+    const { service, rateLimiter } = makeService({ policy, withLimiters: true })
+    const ev = service.evaluate(evalInput())
+    const id = ev.body['evaluation_id'] as string
+
+    const a1 = service.audit(auditInput(id), 'hash-1')
+    expect(a1.status).toBe(201)
+    expect(rateLimiter?.getKeyState('tool:send')?.current).toBe(1)
+
+    // Identical replay — idempotent, no double consumption.
+    const a2 = service.audit(auditInput(id), 'hash-1')
+    expect(a2.status).toBe(200)
+    expect(a2.body['already_finalized']).toBe(true)
+    expect(rateLimiter?.getKeyState('tool:send')?.current).toBe(1)
+  })
+
+  it('returns 409 evaluation_conflict on a different payload for a finalized id', () => {
+    const { service } = makeService()
+    const ev = service.evaluate(evalInput())
+    const id = ev.body['evaluation_id'] as string
+    service.audit(auditInput(id), 'hash-1')
+    const conflict = service.audit(auditInput(id, { status: 'error' }), 'hash-2')
+    expect(conflict.status).toBe(409)
+    expect(conflict.body['error']).toBe('evaluation_conflict')
+  })
+
+  it('does not consume counters for not_executed', () => {
+    const policy = compile({
+      default: 'allow',
+      rules: [
+        {
+          name: 'rl',
+          match: { tool: 'send' },
+          action: 'rate_limit',
+          limits: { max_calls: 5, window: '60s' },
+        },
+      ],
+    })
+    const { service, rateLimiter } = makeService({ policy, withLimiters: true })
+    const ev = service.evaluate(evalInput())
+    const id = ev.body['evaluation_id'] as string
+    service.audit(auditInput(id, { status: 'not_executed' }), 'h')
+    expect(rateLimiter?.getKeyState('tool:send')).toBeUndefined()
+  })
+
+  it('returns 404 for an unknown evaluation id', () => {
+    const { service } = makeService()
+    expect(service.audit(auditInput('nope'), 'h').status).toBe(404)
+  })
+
+  it('returns 200 already_finalized for a terminal-at-evaluate decision', () => {
+    const policy = compile({
+      default: 'allow',
+      rules: [{ name: 'd', match: { tool: 'send' }, action: 'deny' }],
+    })
+    const { service } = makeService({ policy })
+    const ev = service.evaluate(evalInput())
+    const id = ev.body['evaluation_id'] as string
+    const res = service.audit(auditInput(id), 'any')
+    expect(res.status).toBe(200)
+    expect(res.body['finalized_by']).toBe('evaluate')
+  })
+
+  it('overrides spend with actual_amount at audit time', () => {
+    const policy = compile({
+      default: 'allow',
+      rules: [
+        {
+          name: 'sp',
+          match: { tool: 'send' },
+          action: 'spend_limit',
+          limits: { max_spend: { field: '$.cost', limit: 100, currency: 'USD', window: '1d' } },
+        },
+      ],
+    })
+    const { service, spendLimiter } = makeService({ policy, withLimiters: true })
+    const ev = service.evaluate(evalInput({ arguments: { cost: 10 } }))
+    const id = ev.body['evaluation_id'] as string
+    service.audit(auditInput(id, { actual_amount: 42 }), 'h')
+    expect(spendLimiter?.getKeyState('tool:send')?.current_spend).toBe(42)
+  })
+})
+
+// ---------------------------------------------------------------------------
+// approvals + deadlines
+// ---------------------------------------------------------------------------
+
+describe('GovernanceService approvals and deadlines', () => {
+  it('blocks /audit with 409 until the approval is resolved, then copies status', () => {
+    const policy = compile({
+      default: 'allow',
+      rules: [{ name: 'ap', match: { tool: 'send' }, action: 'require_approval' }],
+    })
+    const { service, records } = makeService({ policy, withApprovals: true })
+    const ev = service.evaluate(evalInput())
+    const id = ev.body['evaluation_id'] as string
+    const approval = ev.body['approval'] as { id: string }
+
+    expect(service.audit(auditInput(id), 'h').status).toBe(409)
+
+    const resolve = service.resolveApproval(approval.id, {
+      resolution: 'approved',
+      resolved_by: 'telegram:@oli',
+    })
+    expect(resolve.status).toBe(200)
+
+    const audited = service.audit(auditInput(id), 'h')
+    expect(audited.status).toBe(201)
+    expect(records.at(-1)?.record.approval_status).toBe('approved')
+    expect(records.at(-1)?.record.approved_by).toBe('telegram:@oli')
+  })
+
+  it('enforces evaluation TTL on access (404 evaluation_expired before the sweep)', () => {
+    const { service, advance, records } = makeService({ ttlMs: 1000 })
+    const ev = service.evaluate(evalInput())
+    const id = ev.body['evaluation_id'] as string
+
+    advance(2000) // past TTL, but no sweep has run (sweepIntervalMs: 0)
+    const res = service.audit(auditInput(id), 'h')
+    expect(res.status).toBe(404)
+    expect(res.body['error']).toBe('evaluation_expired')
+    // An evaluation_expired record was written (the bypass signal).
+    expect(records.at(-1)?.record.record_kind).toBe('evaluation_expired')
+    expect(records.at(-1)?.record.block_reason).toBeNull()
+  })
+
+  it('times out a native ticket when its deadline passes before the evaluation TTL', () => {
+    const policy = compile({
+      default: 'allow',
+      rules: [
+        {
+          name: 'ap',
+          match: { tool: 'send' },
+          action: 'require_approval',
+          approval: { channel: 'native', timeout: '1s' },
+        },
+      ],
+    })
+    const { service, advance, approvalRouter } = makeService({
+      policy,
+      withApprovals: true,
+      ttlMs: 600_000,
+    })
+    const ev = service.evaluate(evalInput())
+    const approval = ev.body['approval'] as { id: string }
+    advance(2000) // ticket deadline passed, evaluation TTL not
+    service.sweep()
+    expect(approvalRouter?.getTicket(approval.id)?.status).toBe('timeout')
+  })
+
+  it('rejects a late resolve before the sweep with 409 (on-access enforcement)', () => {
+    const policy = compile({
+      default: 'allow',
+      rules: [
+        {
+          name: 'ap',
+          match: { tool: 'send' },
+          action: 'require_approval',
+          approval: { channel: 'native', timeout: '1s' },
+        },
+      ],
+    })
+    const { service, advance } = makeService({ policy, withApprovals: true, ttlMs: 600_000 })
+    const ev = service.evaluate(evalInput())
+    const approval = ev.body['approval'] as { id: string }
+
+    advance(2000) // past the ticket deadline, but NO sweep has run
+    const res = service.resolveApproval(approval.id, { resolution: 'approved', resolved_by: 'x' })
+    expect(res.status).toBe(409)
+    expect(res.body['error']).toBe('already_resolved')
+    expect(res.body['status']).toBe('timeout')
+  })
+})
+
+// ---------------------------------------------------------------------------
+// actual_amount validation (§4 policy) and memory-budget overshoot
+// ---------------------------------------------------------------------------
+
+describe('GovernanceService /audit validation and budgets', () => {
+  const spendPolicy = () =>
+    compile({
+      default: 'allow',
+      rules: [
+        {
+          name: 'sp',
+          match: { tool: 'send' },
+          action: 'spend_limit',
+          limits: { max_spend: { field: '$.cost', limit: 100, currency: 'USD', window: '1d' } },
+        },
+      ],
+    })
+
+  it('rejects a negative actual_amount with 400 invalid_actual_amount', () => {
+    const { service } = makeService({ policy: spendPolicy(), withLimiters: true })
+    const ev = service.evaluate(evalInput({ arguments: { cost: 10 } }))
+    const id = ev.body['evaluation_id'] as string
+    const res = service.audit(auditInput(id, { actual_amount: -5 }), 'h')
+    expect(res.status).toBe(400)
+    expect(res.body['error']).toBe('invalid_actual_amount')
+  })
+
+  it('rejects actual_amount on a non-spend evaluation with 400 no_spend_rule', () => {
+    const { service } = makeService() // default allow, no spend plan
+    const ev = service.evaluate(evalInput())
+    const id = ev.body['evaluation_id'] as string
+    const res = service.audit(auditInput(id, { actual_amount: 5 }), 'h')
+    expect(res.status).toBe(400)
+    expect(res.body['error']).toBe('no_spend_rule')
+  })
+
+  it('does not overshoot the pending-bytes cap (the crossing entry is refused)', () => {
+    // A tiny byte cap so the second pending entry crosses it.
+    const service = new GovernanceService({
+      policy: compile({ default: 'allow', rules: [] }),
+      sweepIntervalMs: 0,
+      maxPendingBytes: 30,
+    })
+    // First evaluate fits; second would push pendingBytes over the cap.
+    const first = service.evaluate(evalInput({ arguments: { a: 'xxxxxxxxxx' } }))
+    expect(first.status).toBe(200)
+    const second = service.evaluate(evalInput({ arguments: { b: 'yyyyyyyyyy' } }))
+    expect(second.status).toBe(503)
+    expect(second.body['error']).toBe('evaluation_backlog_full')
+    service.close()
+  })
+})
+
+// ---------------------------------------------------------------------------
+// approval-router wiring hardening (fail-closed guard)
+// ---------------------------------------------------------------------------
+
+describe('GovernanceService approval wiring', () => {
+  it('throws at construction when a require_approval rule exists without approvalRouter', () => {
+    const policy = compile({
+      default: 'allow',
+      rules: [{ name: 'ap', match: { tool: 'send' }, action: 'require_approval' }],
+    })
+    expect(() => new GovernanceService({ policy, sweepIntervalMs: 0 })).toThrow(
+      GovernanceConfigError,
+    )
+  })
+
+  it('throws at construction when flag_destructive can emit require_approval', () => {
+    const policy = compile({
+      default: 'allow',
+      flag_destructive: 'require_approval',
+      rules: [],
+    })
+    expect(() => new GovernanceService({ policy, sweepIntervalMs: 0 })).toThrow(
+      GovernanceConfigError,
+    )
+  })
+
+  it('throws at construction when on_tool_drift can emit require_approval', () => {
+    const policy = compile({
+      default: 'allow',
+      on_tool_drift: 'require_approval',
+      rules: [],
+    })
+    expect(() => new GovernanceService({ policy, sweepIntervalMs: 0 })).toThrow(
+      GovernanceConfigError,
+    )
+  })
+
+  it('allows construction when approvalRouter is present', () => {
+    const queue = new ApprovalQueue({ cleanupIntervalMs: 0 })
+    const router = new ApprovalRouter({
+      defaultTimeoutMs: 1000,
+      defaultOnTimeout: 'deny',
+      channels: new Map(),
+      queue,
+    })
+    const policy = compile({
+      default: 'allow',
+      rules: [{ name: 'ap', match: { tool: 'send' }, action: 'require_approval' }],
+    })
+    expect(
+      () =>
+        new GovernanceService({
+          policy,
+          approvalRouter: router,
+          sweepIntervalMs: 0,
+        }),
+    ).not.toThrow()
+    router.close()
+    queue.close()
+  })
+
+  it('rejects an updatePolicy upgrade to approval-capable without approvalRouter', () => {
+    const oldPolicy = compile({
+      default: 'allow',
+      rules: [{ name: 'deny-send', match: { tool: 'send' }, action: 'deny' }],
+    })
+    const nextPolicy = compile({
+      default: 'allow',
+      rules: [{ name: 'ap', match: { tool: 'send' }, action: 'require_approval' }],
+    })
+    const service = new GovernanceService({ policy: oldPolicy, sweepIntervalMs: 0 })
+    const before = service.evaluate(evalInput())
+    expect(before.body['decision']).toBe('deny')
+
+    expect(() => {
+      service.updatePolicy(nextPolicy)
+    }).toThrow(GovernanceConfigError)
+
+    const after = service.evaluate(evalInput())
+    expect(after.body['decision']).toBe('deny')
+    service.close()
+  })
+})
+
+// ---------------------------------------------------------------------------
+// install-scan + resolve
+// ---------------------------------------------------------------------------
+
+describe('GovernanceService.installScan', () => {
+  it('returns observational allow and writes an install_scan record', () => {
+    const { service, records } = makeService()
+    const res = service.installScan({
+      origin: 'openclaw',
+      agent_id: null,
+      session_id: null,
+      package: { name: 'left-pad', source: 'npm', version: '1.3.0' },
+      metadata: null,
+    })
+    expect(res.status).toBe(200)
+    expect(res.body['decision']).toBe('allow')
+    expect(records[0]?.record.record_kind).toBe('install_scan')
+    expect(records[0]?.record.tool_name).toBe('install:npm:left-pad')
+  })
+})
+
+describe('GovernanceService.resolveApproval', () => {
+  it('404s an unknown ticket and 409s a non-native ticket', () => {
+    const queue = new ApprovalQueue({ cleanupIntervalMs: 0 })
+    const router = new ApprovalRouter({
+      defaultTimeoutMs: 1000,
+      defaultOnTimeout: 'deny',
+      channels: new Map(),
+      queue,
+    })
+    const service = new GovernanceService({
+      policy: compile({ default: 'allow', rules: [] }),
+      approvalRouter: router,
+      sweepIntervalMs: 0,
+    })
+    expect(service.resolveApproval('nope', { resolution: 'approved' }).status).toBe(404)
+
+    // A router-managed (non-native) ticket cannot be resolved via the sideband.
+    void router.submit({
+      tool_name: 't',
+      tool_input: {},
+      matched_rule: undefined,
+      session_id: null,
+    })
+    const ticketId = queue.listPending()[0]?.id as string
+    expect(service.resolveApproval(ticketId, { resolution: 'approved' }).body['error']).toBe(
+      'not_a_native_ticket',
+    )
+    router.close()
+  })
+})
diff --git a/packages/proxy/src/sideband/governance-service.ts b/packages/proxy/src/sideband/governance-service.ts
new file mode 100644
index 0000000..a22a0b3
--- /dev/null
+++ b/packages/proxy/src/sideband/governance-service.ts
@@ -0,0 +1,1015 @@
+// ---------------------------------------------------------------------------
+// GovernanceService — the engine behind the sideband governance endpoints
+// (issue #12). Hook-based adapters (OpenClaw and future ones) drive Helio's
+// policy engine over HTTP without an MCP transport to interpose on:
+//
+//   POST /evaluate  → decide a tool call, side-effect-free on limit counters
+//   POST /audit     → record the outcome, consuming counters (idempotent)
+//   POST /install-scan → evaluate an install (observational until #13)
+//   /approval/:id/resolve → record a natively-handled approval
+//
+// This service owns the decision pipeline reuse (shared with the MCP path),
+// the pending-evaluation registry with TTL + memory budgets (D4/D15), per-
+// origin drift caches (D6), idempotent finalize semantics (D5), and the
+// native-approval deadline rules (D10). It deliberately holds NO HTTP concern;
+// the route layer (governance-api.ts) validates and adapts.
+// ---------------------------------------------------------------------------
+
+import { randomUUID } from 'node:crypto'
+import type { CompiledPolicy, PolicyAction, CompiledLimits } from '../policy/types.js'
+import { decide } from '../policy/decision-pipeline.js'
+import { ToolAnnotationCache } from '../policy/annotation-cache.js'
+import type { ToolDriftChange } from '../policy/annotation-cache.js'
+import { resolvePath } from '../policy/matchers.js'
+import { canonicalize } from '../util/canonical-json.js'
+import type { EvidenceStore } from '../evidence/store.js'
+import type { AuditWriter } from '../audit/writer.js'
+import type { AuditRecord } from '../audit/types.js'
+import type { ApprovalRouter, NativeResolution } from '../approval/router.js'
+import type { ApprovalTicket } from '../approval/types.js'
+import type { RateLimiter } from '../policy/rate-limiter.js'
+import type { SpendLimiter } from '../policy/spend-limiter.js'
+import { GovernanceConfigError } from './errors.js'
+
+// ---------------------------------------------------------------------------
+// Memory budgets (D15) — constants in v1, tunable later without contract change
+// ---------------------------------------------------------------------------
+
+const MAX_ORIGINS = 32
+const MAX_TOOLS_PER_ORIGIN = 1_024
+const MAX_TOOL_INPUT_BYTES = 64 * 1_024
+const MAX_PENDING_COUNT = 10_000
+const MAX_PENDING_BYTES = 64 * 1_024 * 1_024
+
+const SWEEP_INTERVAL_MS = 30_000
+
+// ---------------------------------------------------------------------------
+// Wire-facing types (snake_case crosses the boundary; the route layer maps)
+// ---------------------------------------------------------------------------
+
+/** The outcome vocabulary adapters branch on (D8) — never internal rule actions. */
+export type WireDecision =
+  | 'allow'
+  | 'deny'
+  | 'require_approval'
+  | 'rate_limited'
+  | 'spend_limited'
+  | 'dry_run'
+
+/** Tool definition carried by /evaluate (optional; enables the drift guard). */
+export interface WireToolDefinition {
+  readonly name: string
+  readonly description?: string
+  readonly input_schema?: unknown
+  readonly output_schema?: unknown
+  readonly title?: string
+  readonly annotations?: Record<string, unknown>
+}
+
+export interface EvaluateInput {
+  readonly origin: string
+  readonly adapter_version?: string
+  readonly agent_id: string | null
+  readonly session_id: string | null
+  readonly tool: WireToolDefinition
+  readonly arguments: Record<string, unknown> | undefined
+  readonly metadata: Record<string, unknown> | null
+}
+
+export interface InstallScanInput {
+  readonly origin: string
+  readonly agent_id: string | null
+  readonly session_id: string | null
+  readonly package: {
+    readonly name: string
+    readonly version?: string
+    readonly source?: string
+    readonly spec?: string
+    readonly url?: string
+  }
+  readonly metadata: Record<string, unknown> | null
+}
+
+export interface AuditInput {
+  readonly evaluation_id: string
+  readonly status: 'success' | 'error' | 'not_executed'
+  readonly error?: string
+  readonly duration_ms?: number
+  readonly result?: unknown
+  readonly actual_amount?: number
+}
+
+export interface ResolveApprovalInput {
+  readonly resolution: 'approved' | 'denied' | 'timeout' | 'cancelled'
+  readonly resolved_by?: string
+  readonly reason?: string
+  readonly scope?: 'once' | 'always'
+}
+
+/** A service result: HTTP status + JSON body, adapted verbatim by the route. */
+export interface ServiceResult {
+  readonly status: number
+  readonly body: Record<string, unknown>
+}
+
+// ---------------------------------------------------------------------------
+// Internal registry types
+// ---------------------------------------------------------------------------
+
+interface LimitPlan {
+  readonly kind: 'rate' | 'spend'
+  readonly key: string
+  readonly limits: CompiledLimits
+  /** Resolved spend amount at evaluate time (spend only). */
+  readonly amount?: number
+  readonly currency?: string
+}
+
+interface PendingEvaluation {
+  readonly evaluationId: string
+  readonly origin: string
+  readonly agentId: string | null
+  readonly sessionId: string | null
+  readonly toolName: string
+  readonly toolInput: Record<string, unknown>
+  readonly metadata: Record<string, unknown> | null
+  readonly action: PolicyAction
+  readonly matchedRuleName: string | null
+  readonly matchedRuleIndex: number | null
+  readonly flaggedDestructive: boolean
+  readonly limitPlan: LimitPlan | undefined
+  readonly approvalTicketId: string | undefined
+  readonly timestampIso: string
+  readonly createdAtMs: number
+  readonly evaluationExpiresAtMs: number
+  readonly ticketTimeoutAtMs: number | undefined
+  readonly bytes: number
+}
+
+type FinalizedBy = 'evaluate' | 'audit' | 'expired'
+
+interface Tombstone {
+  readonly auditRecordId: string
+  /** null = any payload accepted idempotently (terminal-at-evaluate / expired). */
+  readonly payloadHash: string | null
+  readonly finalizedBy: FinalizedBy
+  readonly expiresAtMs: number
+}
+
+export interface GovernanceServiceOptions {
+  readonly policy: CompiledPolicy
+  readonly environment?: string
+  readonly evidenceStore?: EvidenceStore
+  readonly approvalRouter?: ApprovalRouter
+  readonly rateLimiter?: RateLimiter
+  readonly spendLimiter?: SpendLimiter
+  readonly auditWriter?: AuditWriter
+  /** Default approval timeout (ms) when a rule sets none. */
+  readonly approvalTimeoutMs?: number
+  /** Pending-evaluation TTL (ms). Default 10 minutes. */
+  readonly ttlMs?: number
+  /** Clock for testable time. Defaults to Date.now. */
+  readonly now?: () => number
+  /** Sweep interval (ms). 0 disables the periodic GC backstop. Default 30s. */
+  readonly sweepIntervalMs?: number
+  /** Max pending evaluations (count). Default 10,000 (D15). Overridable for tests. */
+  readonly maxPending?: number
+  /** Max pending-evaluation footprint (serialized bytes). Default 64 MiB (D15). */
+  readonly maxPendingBytes?: number
+}
+
+// ---------------------------------------------------------------------------
+// GovernanceService
+// ---------------------------------------------------------------------------
+
+export class GovernanceService {
+  private policy: CompiledPolicy
+  private readonly environment: string | undefined
+  private readonly evidenceStore: EvidenceStore | undefined
+  private readonly approvalRouter: ApprovalRouter | undefined
+  private readonly rateLimiter: RateLimiter | undefined
+  private readonly spendLimiter: SpendLimiter | undefined
+  private readonly auditWriter: AuditWriter | undefined
+  private readonly approvalTimeoutMs: number
+  private readonly ttlMs: number
+  private readonly now: () => number
+  private readonly maxPending: number
+  private readonly maxPendingBytes: number
+
+  private readonly pending = new Map<string, PendingEvaluation>()
+  private readonly tombstones = new Map<string, Tombstone>()
+  private readonly caches = new Map<string, ToolAnnotationCache>()
+  /** Native approval ticket id → its pending evaluation id, for on-access
+   * deadline enforcement on the resolve path (R4-1). */
+  private readonly ticketToEvaluation = new Map<string, string>()
+  private pendingBytes = 0
+  private sweepTimer: ReturnType<typeof setInterval> | null = null
+  private closed = false
+
+  constructor(options: GovernanceServiceOptions) {
+    this.policy = options.policy
+    this.environment = options.environment
+    this.evidenceStore = options.evidenceStore
+    this.approvalRouter = options.approvalRouter
+    this.rateLimiter = options.rateLimiter
+    this.spendLimiter = options.spendLimiter
+    this.auditWriter = options.auditWriter
+    this.approvalTimeoutMs = options.approvalTimeoutMs ?? 300_000
+    this.ttlMs = options.ttlMs ?? 600_000
+    this.now = options.now ?? Date.now
+    this.maxPending = options.maxPending ?? MAX_PENDING_COUNT
+    this.maxPendingBytes = options.maxPendingBytes ?? MAX_PENDING_BYTES
+    this.assertApprovalRouter(this.policy)
+
+    const sweepMs = options.sweepIntervalMs ?? SWEEP_INTERVAL_MS
+    if (sweepMs > 0) {
+      this.sweepTimer = setInterval(() => {
+        this.sweep()
+      }, sweepMs)
+      this.sweepTimer.unref()
+    }
+  }
+
+  /** Swap the compiled policy on hot-reload (mirrors GovernedForwarder). */
+  updatePolicy(policy: CompiledPolicy): void {
+    this.assertApprovalRouter(policy)
+    this.policy = policy
+  }
+
+  // -------------------------------------------------------------------------
+  // POST /evaluate
+  // -------------------------------------------------------------------------
+
+  evaluate(req: EvaluateInput): ServiceResult {
+    // D15 budgets: tool_input size, origin cardinality, pending pressure.
+    const inputBytes = byteLength(req.arguments ?? {})
+    if (inputBytes > MAX_TOOL_INPUT_BYTES) {
+      return { status: 413, body: { error: 'tool_input_too_large' } }
+    }
+    // Account for this entry's footprint up front so the global cap cannot be
+    // overshot by the entry that crosses it.
+    const entryBytes = inputBytes + byteLength(req.metadata ?? {})
+    if (!this.caches.has(req.origin) && this.caches.size >= MAX_ORIGINS) {
+      return { status: 400, body: { error: 'origin_limit_exceeded' } }
+    }
+    if (
+      this.pending.size >= this.maxPending ||
+      this.pendingBytes + entryBytes > this.maxPendingBytes
+    ) {
+      return { status: 503, body: { error: 'evaluation_backlog_full' } }
+    }
+
+    const cache = this.cacheFor(req.origin)
+    const toolName = req.tool.name
+
+    // Drift guard (D6): merge the supplied definition into the per-origin
+    // cache. A first-seen tool past the per-origin cap is refused fail-closed;
+    // updates to already-baselined tools always proceed.
+    const hasDefinition = definitionProvided(req.tool)
+    if (hasDefinition) {
+      if (!cache.has(toolName) && cache.size >= MAX_TOOLS_PER_ORIGIN) {
+        return { status: 400, body: { error: 'tool_baseline_limit' } }
+      }
+      cache.updateSingle(toMcpToolDef(req.tool))
+    }
+
+    const pipeline = decide({
+      toolName,
+      toolArguments: req.arguments,
+      sessionId: req.session_id ?? undefined,
+      policy: this.policy,
+      environment: this.environment,
+      evidenceStore: this.evidenceStore,
+      baselineAnnotations: cache.get(toolName),
+      currentAnnotations: cache.getCurrent(toolName),
+      driftEvent: cache.getDrift(toolName),
+    })
+
+    const { decision } = pipeline
+    const evaluationId = randomUUID()
+    const timestampIso = new Date(this.now()).toISOString()
+
+    // Resolve the limit step (peek — never consume at /evaluate).
+    let wire: WireDecision
+    let limitPlan: LimitPlan | undefined
+    let limitsBlock: Record<string, unknown> | undefined
+
+    if (pipeline.isDryRun) {
+      wire = 'dry_run'
+    } else if (decision.action === 'deny') {
+      wire = 'deny'
+    } else if (decision.action === 'require_approval') {
+      wire = 'require_approval'
+    } else if (decision.action === 'rate_limit') {
+      const planned = this.planRate(decision, toolName, req.session_id)
+      limitPlan = planned?.plan
+      limitsBlock = planned?.block ? { rate: planned.block } : undefined
+      wire = planned?.allowed ? 'allow' : 'rate_limited'
+    } else if (decision.action === 'spend_limit') {
+      const planned = this.planSpend(decision, toolName, req.session_id, req.arguments)
+      limitPlan = planned?.plan
+      limitsBlock = planned?.block ? { spend: planned.block } : undefined
+      wire = planned?.allowed ? 'allow' : 'spend_limited'
+    } else {
+      wire = 'allow'
+    }
+
+    const matchedRuleName = decision.matchedRule?.name ?? null
+    const matchedRuleIndex = decision.matchedRule?.index ?? null
+
+    const responseBody: Record<string, unknown> = {
+      evaluation_id: evaluationId,
+      decision: wire,
+      reason: decision.reason,
+      matched_rule: matchedRuleName,
+      matched_rule_index: matchedRuleIndex,
+    }
+    if (isBlocking(wire)) {
+      responseBody['feedback'] = buildFeedback(decision.matchedRule, decision.reason)
+    }
+    if (limitsBlock) responseBody['limits'] = limitsBlock
+    if (wire === 'dry_run') {
+      responseBody['dry_run'] = {
+        would_forward: decision.action === 'allow' && !pipeline.evidenceBlocked,
+        evidence_satisfied: !pipeline.evidenceBlocked,
+        limits_ok: true,
+      }
+    }
+    if (pipeline.driftEvent) {
+      responseBody['tool_drift'] = { changes: pipeline.driftEvent.changes }
+    }
+
+    // Terminal decisions (D5): audit immediately, tombstone, no pending entry.
+    if (isTerminalAtEvaluate(wire)) {
+      const auditId = this.writeAudit({
+        timestampIso,
+        origin: req.origin,
+        agentId: req.agent_id,
+        sessionId: req.session_id,
+        toolName,
+        toolInput: req.arguments ?? {},
+        metadata: req.metadata,
+        action: decision.action,
+        wire,
+        matchedRuleName,
+        matchedRuleIndex,
+        flaggedDestructive: pipeline.flaggedDestructive,
+        dryRun: wire === 'dry_run',
+        recordKind: 'tool_call',
+        limitsChain: limitsBlock,
+      })
+      this.tombstones.set(evaluationId, {
+        auditRecordId: auditId,
+        payloadHash: null,
+        finalizedBy: 'evaluate',
+        expiresAtMs: this.now() + this.ttlMs,
+      })
+      return { status: 200, body: responseBody }
+    }
+
+    // allow / require_approval → pending entry awaiting /audit.
+    let approvalTicketId: string | undefined
+    let ticketTimeoutAtMs: number | undefined
+    if (wire === 'require_approval') {
+      // Invariant: a require_approval decision can only arise from an
+      // approval-capable policy, and assertApprovalRouter() guarantees a router
+      // exists for such policies at construct/reload time. Assert rather than
+      // silently skip — a missing router here would otherwise fail open (a
+      // require_approval response with no ticket the adapter could resolve).
+      const router = this.approvalRouter
+      if (!router) {
+        throw new GovernanceConfigError(
+          '[helio] invariant violation: require_approval decision without an approvalRouter',
+        )
+      }
+      const timeoutMs = decision.matchedRule?.approval?.timeoutMs ?? this.approvalTimeoutMs
+      const ticket = router.createNativeTicket({
+        tool_name: toolName,
+        tool_input: req.arguments ?? {},
+        matched_rule: decision.matchedRule,
+        session_id: req.session_id,
+        origin: req.origin,
+        timeout_ms: timeoutMs,
+      })
+      approvalTicketId = ticket.id
+      ticketTimeoutAtMs = this.now() + timeoutMs
+      responseBody['approval'] = {
+        id: ticket.id,
+        timeout_ms: timeoutMs,
+        resolve_path: `/approval/${ticket.id}/resolve`,
+      }
+    }
+
+    const entry: PendingEvaluation = {
+      evaluationId,
+      origin: req.origin,
+      agentId: req.agent_id,
+      sessionId: req.session_id,
+      toolName,
+      toolInput: req.arguments ?? {},
+      metadata: req.metadata,
+      action: decision.action,
+      matchedRuleName,
+      matchedRuleIndex,
+      flaggedDestructive: pipeline.flaggedDestructive,
+      limitPlan,
+      approvalTicketId,
+      timestampIso,
+      createdAtMs: this.now(),
+      evaluationExpiresAtMs: this.now() + this.ttlMs,
+      ticketTimeoutAtMs,
+      bytes: entryBytes,
+    }
+    this.pending.set(evaluationId, entry)
+    this.pendingBytes += entryBytes
+    if (approvalTicketId) this.ticketToEvaluation.set(approvalTicketId, evaluationId)
+
+    return { status: 200, body: responseBody }
+  }
+
+  // -------------------------------------------------------------------------
+  // POST /audit
+  // -------------------------------------------------------------------------
+
+  audit(req: AuditInput, payloadHash: string): ServiceResult {
+    const id = req.evaluation_id
+
+    // Idempotency: a prior finalize leaves a tombstone (D5 response matrix).
+    const tomb = this.tombstones.get(id)
+    if (tomb) {
+      if (tomb.finalizedBy === 'expired') {
+        return { status: 404, body: { error: 'evaluation_expired' } }
+      }
+      if (tomb.finalizedBy === 'evaluate') {
+        return {
+          status: 200,
+          body: {
+            ok: true,
+            audit_record_id: tomb.auditRecordId,
+            already_finalized: true,
+            finalized_by: 'evaluate',
+          },
+        }
+      }
+      // finalizedBy 'audit' — identical replay is idempotent; a different
+      // payload under the same id is an adapter bug.
+      if (tomb.payloadHash === payloadHash) {
+        return {
+          status: 200,
+          body: { ok: true, audit_record_id: tomb.auditRecordId, already_finalized: true },
+        }
+      }
+      return { status: 409, body: { error: 'evaluation_conflict' } }
+    }
+
+    const entry = this.pending.get(id)
+    if (!entry) {
+      return { status: 404, body: { error: 'evaluation_unknown' } }
+    }
+
+    // On-access deadline enforcement (R4-1): apply any crossed deadline before
+    // handling, so behavior is deterministic regardless of sweep timing.
+    if (this.enforceDeadlines(entry) === 'expired') {
+      return { status: 404, body: { error: 'evaluation_expired' } }
+    }
+
+    // require_approval must be resolved before auditing (D10). Retryable.
+    let approvalStatus: string | null = null
+    let approvedBy: string | null = null
+    if (entry.approvalTicketId) {
+      const ticket = this.getTicketStatus(entry.approvalTicketId)
+      const status = ticket?.status
+      if (!status || status === 'pending') {
+        return { status: 409, body: { error: 'approval_unresolved' } }
+      }
+      approvalStatus = status
+      approvedBy = ticket.resolved_by ?? null
+    }
+
+    // Validate the optional post-hoc spend override (§4 actual_amount policy).
+    // Done before any state mutation: a bad value is an adapter bug we reject
+    // explicitly rather than letting it throw inside the limiter.
+    if (req.actual_amount !== undefined) {
+      if (!Number.isFinite(req.actual_amount) || req.actual_amount < 0) {
+        return { status: 400, body: { error: 'invalid_actual_amount' } }
+      }
+      if (entry.limitPlan?.kind !== 'spend') {
+        return { status: 400, body: { error: 'no_spend_rule' } }
+      }
+    }
+
+    const callHappened = req.status === 'success' || req.status === 'error'
+
+    // Consume limit counters now (D3) — only when the call actually executed.
+    let limitsChain: Record<string, unknown> | undefined
+    if (callHappened && entry.limitPlan) {
+      limitsChain = this.commitLimit(entry.limitPlan, req.actual_amount)
+    }
+
+    // Record the tool call for evidence/dependency tracking.
+    if (callHappened && this.evidenceStore && entry.sessionId) {
+      this.evidenceStore.recordToolCall(entry.sessionId, entry.toolName, req.status === 'success')
+    }
+
+    const auditId = this.writeAudit({
+      timestampIso: entry.timestampIso,
+      origin: entry.origin,
+      agentId: entry.agentId,
+      sessionId: entry.sessionId,
+      toolName: entry.toolName,
+      toolInput: entry.toolInput,
+      metadata: entry.metadata,
+      action: entry.action,
+      wire: entry.action === 'require_approval' ? 'require_approval' : 'allow',
+      matchedRuleName: entry.matchedRuleName,
+      matchedRuleIndex: entry.matchedRuleIndex,
+      flaggedDestructive: entry.flaggedDestructive,
+      dryRun: false,
+      recordKind: 'tool_call',
+      limitsChain,
+      approvalStatus,
+      approvedBy,
+      upstreamError: req.status === 'error' ? (req.error ?? 'tool call failed') : null,
+      upstreamResponse: req.result ?? null,
+      upstreamLatencyMs: req.duration_ms ?? null,
+    })
+
+    this.discardPending(entry)
+    this.tombstones.set(id, {
+      auditRecordId: auditId,
+      payloadHash,
+      finalizedBy: 'audit',
+      expiresAtMs: this.now() + this.ttlMs,
+    })
+
+    return { status: 201, body: { ok: true, audit_record_id: auditId } }
+  }
+
+  // -------------------------------------------------------------------------
+  // POST /install-scan (observational until #13 — D9)
+  // -------------------------------------------------------------------------
+
+  installScan(req: InstallScanInput): ServiceResult {
+    const evaluationId = randomUUID()
+    const toolName = `install:${req.package.source ?? 'pkg'}:${req.package.name}`
+    const auditId = this.writeAudit({
+      timestampIso: new Date(this.now()).toISOString(),
+      origin: req.origin,
+      agentId: req.agent_id,
+      sessionId: req.session_id,
+      toolName,
+      toolInput: { ...req.package },
+      metadata: req.metadata,
+      action: 'allow',
+      wire: 'allow',
+      matchedRuleName: null,
+      matchedRuleIndex: null,
+      flaggedDestructive: false,
+      dryRun: false,
+      recordKind: 'install_scan',
+    })
+    this.tombstones.set(evaluationId, {
+      auditRecordId: auditId,
+      payloadHash: null,
+      finalizedBy: 'evaluate',
+      expiresAtMs: this.now() + this.ttlMs,
+    })
+    return {
+      status: 200,
+      body: {
+        evaluation_id: evaluationId,
+        decision: 'allow',
+        reason: 'no install-time rules defined',
+        matched_rule: null,
+        matched_rule_index: null,
+      },
+    }
+  }
+
+  // -------------------------------------------------------------------------
+  // POST /approval/:id/resolve (D10)
+  // -------------------------------------------------------------------------
+
+  resolveApproval(ticketId: string, req: ResolveApprovalInput): ServiceResult {
+    if (!this.approvalRouter) {
+      return { status: 503, body: { error: 'governance_unavailable' } }
+    }
+    const ticket = this.getTicketStatus(ticketId)
+    if (!ticket) {
+      return { status: 404, body: { error: 'ticket_not_found' } }
+    }
+    if (!ticket.channel_name.startsWith('native:')) {
+      return { status: 409, body: { error: 'not_a_native_ticket' } }
+    }
+
+    // On-access deadline enforcement (R4-1): a resolve arriving after the
+    // ticket/evaluation deadline but before the sweep must not succeed. Apply
+    // any crossed deadline to the linked evaluation first, then read the ticket
+    // post-transition — exactly as the /audit path does.
+    const evaluationId = this.ticketToEvaluation.get(ticketId)
+    const entry = evaluationId ? this.pending.get(evaluationId) : undefined
+    if (entry) this.enforceDeadlines(entry)
+
+    const current = this.getTicketStatus(ticketId)
+    if (!current || current.status !== 'pending') {
+      return { status: 409, body: { error: 'already_resolved', status: current?.status } }
+    }
+
+    const resolved = this.approvalRouter.resolveNativeTicket(
+      ticketId,
+      req.resolution as NativeResolution,
+      req.resolved_by,
+      { denial_reason: req.resolution === 'denied' ? req.reason : undefined },
+    )
+    if (!resolved) {
+      return { status: 409, body: { error: 'already_resolved' } }
+    }
+    return { status: 200, body: { ok: true } }
+  }
+
+  // -------------------------------------------------------------------------
+  // Sweep — GC backstop for callers that never return (D4/D10/R4-1)
+  // -------------------------------------------------------------------------
+
+  sweep(): void {
+    for (const entry of [...this.pending.values()]) {
+      this.enforceDeadlines(entry)
+    }
+    const now = this.now()
+    for (const [id, tomb] of this.tombstones) {
+      if (tomb.expiresAtMs <= now) this.tombstones.delete(id)
+    }
+  }
+
+  close(): void {
+    if (this.closed) return
+    this.closed = true
+    if (this.sweepTimer) {
+      clearInterval(this.sweepTimer)
+      this.sweepTimer = null
+    }
+    this.pending.clear()
+    this.tombstones.clear()
+    this.caches.clear()
+    this.pendingBytes = 0
+  }
+
+  // -------------------------------------------------------------------------
+  // Internals
+  // -------------------------------------------------------------------------
+
+  /** Apply crossed deadlines to one pending entry. Returns its post-state. */
+  private enforceDeadlines(entry: PendingEvaluation): 'active' | 'expired' {
+    const now = this.now()
+
+    // Evaluation TTL fires first → finalize expired (and time out a live ticket).
+    if (now >= entry.evaluationExpiresAtMs) {
+      if (entry.approvalTicketId) {
+        this.approvalRouter?.resolveNativeTicket(entry.approvalTicketId, 'timeout')
+      }
+      const auditId = this.writeAudit({
+        timestampIso: entry.timestampIso,
+        origin: entry.origin,
+        agentId: entry.agentId,
+        sessionId: entry.sessionId,
+        toolName: entry.toolName,
+        toolInput: entry.toolInput,
+        metadata: entry.metadata,
+        action: entry.action,
+        wire: entry.action === 'require_approval' ? 'require_approval' : 'allow',
+        matchedRuleName: entry.matchedRuleName,
+        matchedRuleIndex: entry.matchedRuleIndex,
+        flaggedDestructive: entry.flaggedDestructive,
+        dryRun: false,
+        recordKind: 'evaluation_expired',
+        sidebandUnreported: true,
+      })
+      this.discardPending(entry)
+      this.tombstones.set(entry.evaluationId, {
+        auditRecordId: auditId,
+        payloadHash: null,
+        finalizedBy: 'expired',
+        expiresAtMs: now + this.ttlMs,
+      })
+      // eslint-disable-next-line no-console -- Operational bypass/tamper signal
+      console.error(
+        `[helio] Sideband evaluation ${entry.evaluationId} expired without /audit ` +
+          `(origin=${entry.origin}, tool=${entry.toolName}) — recorded as evaluation_expired`,
+      )
+      return 'expired'
+    }
+
+    // Ticket timeout fires first → time out the ticket; evaluation stays pending.
+    if (
+      entry.approvalTicketId &&
+      entry.ticketTimeoutAtMs !== undefined &&
+      now >= entry.ticketTimeoutAtMs
+    ) {
+      this.approvalRouter?.resolveNativeTicket(entry.approvalTicketId, 'timeout')
+    }
+    return 'active'
+  }
+
+  private cacheFor(origin: string): ToolAnnotationCache {
+    let cache = this.caches.get(origin)
+    if (!cache) {
+      cache = new ToolAnnotationCache()
+      this.caches.set(origin, cache)
+    }
+    return cache
+  }
+
+  private discardPending(entry: PendingEvaluation): void {
+    if (this.pending.delete(entry.evaluationId)) {
+      this.pendingBytes -= entry.bytes
+    }
+    if (entry.approvalTicketId) this.ticketToEvaluation.delete(entry.approvalTicketId)
+  }
+
+  private getTicketStatus(ticketId: string): ApprovalTicket | undefined {
+    return this.approvalRouter?.getTicket(ticketId)
+  }
+
+  private planRate(
+    decision: ReturnType<typeof decide>['decision'],
+    toolName: string,
+    sessionId: string | null,
+  ): { plan?: LimitPlan; block?: Record<string, unknown>; allowed: boolean } | undefined {
+    const limits = decision.matchedRule?.limits
+    if (!this.rateLimiter || !limits?.maxCalls || !limits.windowMs) {
+      return { allowed: true }
+    }
+    const key = buildLimitKey(limits.key, toolName, sessionId)
+    const peek = this.rateLimiter.peek({
+      key,
+      maxCalls: limits.maxCalls,
+      windowMs: limits.windowMs,
+    })
+    return {
+      plan: { kind: 'rate', key, limits },
+      block: {
+        current: peek.current,
+        limit: peek.limit,
+        window_ms: peek.windowMs,
+        reset_at_ms: peek.resetAtMs,
+      },
+      allowed: peek.allowed,
+    }
+  }
+
+  private planSpend(
+    decision: ReturnType<typeof decide>['decision'],
+    toolName: string,
+    sessionId: string | null,
+    args: Record<string, unknown> | undefined,
+  ): { plan?: LimitPlan; block?: Record<string, unknown>; allowed: boolean } | undefined {
+    const maxSpend = decision.matchedRule?.limits?.maxSpend
+    if (!this.spendLimiter || !maxSpend) return { allowed: true }
+
+    const key = buildLimitKey(maxSpend.key, toolName, sessionId)
+    const rawAmount = resolvePath(maxSpend.field, args ?? {})
+    if (typeof rawAmount !== 'number' || !Number.isFinite(rawAmount) || rawAmount < 0) {
+      // Invalid amount — terminal block (mirrors the MCP invalid-amount deny).
+      return { allowed: false, block: { reason: 'invalid_amount', limit: maxSpend.limit } }
+    }
+    const peek = this.spendLimiter.peek({
+      key,
+      amount: rawAmount,
+      limit: maxSpend.limit,
+      windowMs: maxSpend.windowMs,
+    })
+    return {
+      plan: {
+        kind: 'spend',
+        key,
+        limits: decision.matchedRule.limits,
+        amount: rawAmount,
+        currency: maxSpend.currency,
+      },
+      block: {
+        current_spend: peek.currentSpend,
+        limit: peek.limit,
+        currency: maxSpend.currency,
+        window_ms: peek.windowMs,
+        reset_at_ms: peek.resetAtMs,
+      },
+      allowed: peek.allowed,
+    }
+  }
+
+  /** Commit a limit plan at /audit time and return the evidence_chain block. */
+  private commitLimit(
+    plan: LimitPlan,
+    actualAmount: number | undefined,
+  ): Record<string, unknown> | undefined {
+    if (plan.kind === 'rate' && this.rateLimiter && plan.limits.maxCalls && plan.limits.windowMs) {
+      const r = this.rateLimiter.record({
+        key: plan.key,
+        maxCalls: plan.limits.maxCalls,
+        windowMs: plan.limits.windowMs,
+      })
+      return {
+        rate_limit: {
+          allowed: r.allowed,
+          current: r.current,
+          limit: r.limit,
+          window_ms: r.windowMs,
+          reset_at_ms: r.resetAtMs,
+        },
+      }
+    }
+    if (plan.kind === 'spend' && this.spendLimiter && plan.limits.maxSpend) {
+      const amount = actualAmount ?? plan.amount ?? 0
+      const r = this.spendLimiter.record({
+        key: plan.key,
+        amount,
+        limit: plan.limits.maxSpend.limit,
+        windowMs: plan.limits.maxSpend.windowMs,
+      })
+      this.spendLimiter.setCurrency(plan.key, plan.limits.maxSpend.currency)
+      return {
+        spend_limit: {
+          allowed: r.allowed,
+          current_spend: r.currentSpend,
+          limit: r.limit,
+          window_ms: r.windowMs,
+          reset_at_ms: r.resetAtMs,
+        },
+      }
+    }
+    return undefined
+  }
+
+  private writeAudit(args: WriteAuditArgs): string {
+    const id = randomUUID()
+    if (!this.auditWriter) return id
+
+    const blockReason = deriveBlockReason(args)
+    let evidenceChain: Record<string, unknown> | null = args.limitsChain ?? null
+    if (args.sidebandUnreported) {
+      evidenceChain = { ...(evidenceChain ?? {}), sideband: { unreported: true } }
+    }
+
+    const record: Omit<AuditRecord, 'id' | 'created_at'> = {
+      timestamp: args.timestampIso,
+      session_id: args.sessionId,
+      agent_id: args.agentId,
+      environment: this.environment ?? null,
+      tool_name: args.toolName,
+      tool_input: args.toolInput,
+      policy_decision: args.action,
+      block_reason: blockReason,
+      matched_rule: args.matchedRuleName,
+      matched_rule_index: args.matchedRuleIndex,
+      evidence_chain: evidenceChain,
+      approval_status: args.approvalStatus ?? null,
+      approved_by: args.approvedBy ?? null,
+      upstream_response: args.upstreamResponse ?? null,
+      upstream_error: args.upstreamError ?? null,
+      upstream_http_status: null,
+      upstream_latency_ms: args.upstreamLatencyMs ?? null,
+      total_duration_ms: 0,
+      approval_wait_ms: 0,
+      proxy_compute_ms: 0,
+      flagged_destructive: args.flaggedDestructive,
+      dry_run: args.dryRun,
+      record_kind: args.recordKind,
+      origin: args.origin,
+      metadata: args.metadata,
+    }
+
+    // Enforcement decisions and approvals go on the priority flush queue; plain
+    // allows stay buffered. Crash durability is the crash-drain hook's job.
+    const isEnforcement =
+      args.recordKind === 'evaluation_expired' ||
+      blockReason !== null ||
+      args.approvalStatus != null
+    if (isEnforcement) this.auditWriter.pushImmediate(record, id)
+    else this.auditWriter.push(record, id)
+    return id
+  }
+
+  private assertApprovalRouter(policy: CompiledPolicy): void {
+    if (!policyCanRequireApproval(policy) || this.approvalRouter) return
+    throw new GovernanceConfigError(
+      '[helio] GovernanceService misconfiguration: approval-capable policy ' +
+        '(a require_approval rule, or flag_destructive/on_tool_drift set to require_approval) ' +
+        'requires an approvalRouter',
+    )
+  }
+}
+
+// ---------------------------------------------------------------------------
+// Free helpers
+// ---------------------------------------------------------------------------
+
+interface WriteAuditArgs {
+  timestampIso: string
+  origin: string
+  agentId: string | null
+  sessionId: string | null
+  toolName: string
+  toolInput: Record<string, unknown>
+  metadata: Record<string, unknown> | null
+  action: PolicyAction
+  wire: WireDecision
+  matchedRuleName: string | null
+  matchedRuleIndex: number | null
+  flaggedDestructive: boolean
+  dryRun: boolean
+  recordKind: AuditRecord['record_kind']
+  limitsChain?: Record<string, unknown>
+  approvalStatus?: string | null
+  approvedBy?: string | null
+  upstreamError?: string | null
+  upstreamResponse?: unknown
+  upstreamLatencyMs?: number | null
+  sidebandUnreported?: boolean
+}
+
+function deriveBlockReason(args: WriteAuditArgs): string | null {
+  if (args.recordKind === 'evaluation_expired') return null // bypass signal, not a block (F4)
+  if (args.dryRun) return null
+  if (args.approvalStatus === 'denied') return 'approval_denied'
+  if (args.approvalStatus === 'timeout') return 'approval_timeout'
+  if (args.approvalStatus === 'cancelled') return 'cancelled'
+  switch (args.wire) {
+    case 'deny':
+      return 'policy_denied'
+    case 'rate_limited':
+      return 'rate_limited'
+    case 'spend_limited':
+      return 'spend_limited'
+    default:
+      return null
+  }
+}
+
+function buildFeedback(
+  rule: { feedback?: { message: string; suggestion?: string } } | undefined,
+  reason: string,
+): Record<string, unknown> {
+  const message = rule?.feedback?.message ?? reason
+  const suggestion = rule?.feedback?.suggestion
+  return suggestion ? { message, suggestion } : { message }
+}
+
+function isBlocking(wire: WireDecision): boolean {
+  return wire === 'deny' || wire === 'rate_limited' || wire === 'spend_limited'
+}
+
+function isTerminalAtEvaluate(wire: WireDecision): boolean {
+  return (
+    wire === 'deny' || wire === 'rate_limited' || wire === 'spend_limited' || wire === 'dry_run'
+  )
+}
+
+function policyCanRequireApproval(policy: CompiledPolicy): boolean {
+  if (policy.flagDestructive === 'require_approval' || policy.onToolDrift === 'require_approval') {
+    return true
+  }
+  return policy.rules.some((rule) => rule.action === 'require_approval')
+}
+
+function buildLimitKey(
+  keyType: 'tool' | 'agent' | 'session' | undefined,
+  toolName: string,
+  sessionId: string | null,
+): string {
+  switch (keyType) {
+    case 'session':
+      return `session:${sessionId ?? 'unknown'}`
+    case 'agent':
+    case 'tool':
+    default:
+      return `tool:${toolName}`
+  }
+}
+
+function definitionProvided(tool: WireToolDefinition): boolean {
+  return (
+    tool.description !== undefined ||
+    tool.input_schema !== undefined ||
+    tool.output_schema !== undefined ||
+    tool.title !== undefined ||
+    tool.annotations !== undefined
+  )
+}
+
+/** Map the wire tool object to an MCP-shaped definition for the cache. */
+function toMcpToolDef(tool: WireToolDefinition): Record<string, unknown> {
+  const def: Record<string, unknown> = { name: tool.name }
+  if (tool.description !== undefined) def['description'] = tool.description
+  if (tool.input_schema !== undefined) def['inputSchema'] = tool.input_schema
+  if (tool.output_schema !== undefined) def['outputSchema'] = tool.output_schema
+  if (tool.title !== undefined) def['title'] = tool.title
+  if (tool.annotations !== undefined) def['annotations'] = tool.annotations
+  return def
+}
+
+function byteLength(value: unknown): number {
+  return Buffer.byteLength(canonicalize(value), 'utf8')
+}
+
+/** Drift changes are re-exported for the route layer's response typing. */
+export type { ToolDriftChange }
diff --git a/packages/proxy/src/upstream/e2e.test.ts b/packages/proxy/src/upstream/e2e.test.ts
index a55540a..bfb1848 100644
--- a/packages/proxy/src/upstream/e2e.test.ts
+++ b/packages/proxy/src/upstream/e2e.test.ts
@@ -60,7 +60,7 @@ function makeConfig(upstreamUrl: string): HelioConfig {
       retention: '90d',
       include_responses: true,
     },
-    sdk: { enabled: false, port: 3200, host: '127.0.0.1' },
+    sdk: { enabled: false, port: 3200, host: '127.0.0.1', evaluation_ttl: '10m' },
   } as HelioConfig
 }
 
diff --git a/packages/proxy/src/util/canonical-json.ts b/packages/proxy/src/util/canonical-json.ts
new file mode 100644
index 0000000..e588726
--- /dev/null
+++ b/packages/proxy/src/util/canonical-json.ts
@@ -0,0 +1,40 @@
+// ---------------------------------------------------------------------------
+// Canonical JSON — deterministic, key-sorted serialization.
+//
+// Used wherever two structurally-equal values must produce the same string:
+// tool-definition fingerprinting for drift detection (annotation-cache) and
+// idempotency hashing of /audit payloads (sideband governance, issue #12, D5).
+// ---------------------------------------------------------------------------
+
+/** Deterministic JSON encoding with recursively sorted object keys. */
+export function canonicalize(value: unknown): string {
+  // JSON.stringify returns undefined for top-level `undefined` at runtime even
+  // though its TS overloads type the return as `string` for non-undefined
+  // inputs. The explicit widening annotation keeps the coalesce legitimate.
+  const encoded: string | undefined = JSON.stringify(sortKeysDeep(value))
+  // eslint-disable-next-line @typescript-eslint/no-unnecessary-condition -- see above: runtime diverges from TS overload
+  return encoded ?? ''
+}
+
+/** Recursively sort object keys so structurally-equal values encode equally. */
+export function sortKeysDeep(value: unknown): unknown {
+  if (Array.isArray(value)) return value.map(sortKeysDeep)
+  if (value !== null && typeof value === 'object') {
+    const source = value as Record<string, unknown>
+    const out: Record<string, unknown> = {}
+    for (const key of Object.keys(source).sort()) {
+      // Object.defineProperty (not assignment) so a JSON-parsed "__proto__"
+      // key becomes an own property instead of silently setting the
+      // prototype — otherwise content under that key never registers, a blind
+      // spot in a security control.
+      Object.defineProperty(out, key, {
+        value: sortKeysDeep(source[key]),
+        enumerable: true,
+        writable: true,
+        configurable: true,
+      })
+    }
+    return out
+  }
+  return value
+}