From 9db72dcad04a4ede6073a05b896458f3b859c7db Mon Sep 17 00:00:00 2001
From: Federico Kamelhar <federico.kamelhar@oracle.com>
Date: Fri, 1 May 2026 20:55:17 -0400
Subject: [PATCH] docs: rewrite tools/idempotency/hooks/streaming/mcp + new
 homepage hero + Capabilities + dark-mode diagrams
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Concept-page rewrites — same template as the provider pages: pitch /
when to pick / getting started / capabilities (each runnable) /
gotchas / source. Pages sized 74→156, 56→138, 82→148, 68→140,
51→139.

Homepage rewrite:
- H1 → "Build agents that reason and solve together." (verb-led, two
  beats, captures both reasoning + multi-agent pillars).
- Subhead → "The Oracle Gen AI Multi-Agent Reasoning SDK." (matches
  the logo wordmark, Oracle-anchored category claim).
- Body → reasoning lives inside the loop (Reflexion / Grounding /
  Causal) + six-shape verb chain (Compose / Orchestrate / Swarm /
  Handoff / StateGraph / Functional) plus A2A + Oracle proof closer.
- "Six things you can ship" section — developer-narrative ordering
  (Reason → Act → Coordinate → Mesh → Decide → Deploy), benefit-led
  H3 titles, 14–20-line code blocks with real imports + commented
  output, descriptive concept + tutorial links (no opaque "Tutorial
  03" labels).
- H1 margin tightened (1.4rem → 0.6rem) so subhead sits closer.
- Drop title='travel_concierge.py' from the hero code block.

Capabilities page (renamed from "Feature matrix"):
- Honest naming — it's a capability list, not a comparison matrix.
- New top-of-page Oracle-distinctive admonition (Oracle-red border +
  sand-tinted gradient + red star icon, defined in CSS) calls out
  the six wedge features at a glance.
- Every section uses the same 3-column table (Feature / What it does
  / Surface + concept link). Drops the prose dumps.
- mkdocs.yml nav label updated.

Dark-mode diagrams:
- Architecture SVGs (agent-loop, multi-agent-patterns, the
  per-pattern SVGs under img/patterns/, architecture, sequence-26ai)
  were authored against a light background; dark-mode rendering hid
  the dark-grey strokes. Added a near-white card background for them
  in the slate (dark) scheme.

Model id fix mirrored: openai.gpt-5.5 → openai.gpt-5 in homepage
code blocks, README, and openai provider doc. (Same fix shipping
in PR #38 across the rest of the codebase.)

Signed-off-by: Federico Kamelhar <federico.kamelhar@oracle.com>
---
 README.md                         |  25 +--
 docs/FEATURES.md                  | 200 ++++++++++++++--------
 docs/concepts/hooks.md            | 206 +++++++++++++++++-----
 docs/concepts/idempotency.md      | 152 +++++++++++++----
 docs/concepts/mcp.md              | 163 +++++++++++++++---
 docs/concepts/providers/openai.md |  10 +-
 docs/concepts/streaming.md        | 173 +++++++++++++++----
 docs/concepts/tools.md            | 203 ++++++++++++++++++----
 docs/img/sequence-26ai.svg        |   2 +-
 docs/index.md                     | 272 +++++++++++++++++++++++-------
 docs/stylesheets/locus.css        |  58 ++++++-
 mkdocs.yml                        |   2 +-
 12 files changed, 1146 insertions(+), 320 deletions(-)
diff --git a/README.md b/README.md
index ce67082..507836b 100644
--- a/README.md
+++ b/README.md
@@ -3,8 +3,8 @@
 </p>
 
 <p align="center">
-  <strong>Build AI workflows that actually ship.</strong><br>
-  Oracle Generative AI · Multi-Agent · Reasoning · Orchestrator SDK.
+  <strong>Build agents that reason and solve together.</strong><br>
+  The Oracle Gen AI Multi-Agent Reasoning SDK.
 </p>
 
 <p align="center">
@@ -26,14 +26,17 @@
 
 ---
 
-Spin up a **swarm** of specialists. Hand a conversation off across an
-**escalation desk**. Run an **orchestrator** of experts in parallel.
-Wire up a **state graph** that loops until confident. Mesh agents
-**across processes** with A2A. Or just ship one self-correcting agent
-that knows when to stop.
+Reasoning lives inside the loop. **Reflexion** evaluates every turn.
+**Grounding** verifies every claim against its source. **Causal**
+traces root cause from symptom.
 
-Six multi-agent shapes plus A2A. One Oracle-native runtime. Every
-model on OCI the day it lands.
+Six shapes for six problems. **Compose** linear pipelines.
+**Orchestrate** specialists in parallel. **Swarm** for peer-to-peer
+research. **Handoff** for escalation desks. **StateGraph** loops
+until confident. **Functional** maps across agents. Plus **A2A** to
+mesh across processes.
+
+Every model on Oracle Generative AI the day it lands.
 
 ```bash
 pip install "locus[oci]"
@@ -58,7 +61,7 @@ def book_flight(flight_id: str, customer_id: str) -> dict:
     return billing.charge_and_book(flight_id, customer_id)
 
 agent = Agent(
-    model="oci:openai.gpt-5.5",
+    model="oci:openai.gpt-5",
     tools=[search_flights, book_flight],
     system_prompt="You are a travel concierge. Find a flight, then book it.",
     reflexion=True,                                      # self-correct mid-run
@@ -191,7 +194,7 @@ def book_meeting(date: str, attendees: list[str]) -> dict:
     return calendar.book(date, attendees)
 
 agent = Agent(
-    model="oci:openai.gpt-5.5",
+    model="oci:openai.gpt-5",
     tools=[get_today_date, book_meeting],
     system_prompt="You are a scheduling assistant.",
 )
diff --git a/docs/FEATURES.md b/docs/FEATURES.md
index a38f4c0..76d038f 100644
--- a/docs/FEATURES.md
+++ b/docs/FEATURES.md
@@ -1,93 +1,157 @@
-# Locus feature matrix
+# Capabilities
 
-What ships in `locus`, grouped by area.
+Everything `locus` ships, what it does, and where to find it.
+
+!!! oracle-distinctive "Distinctive to locus"
+    These ship as core primitives, inside the ReAct loop — not as
+    middleware, plugins, or third-party libraries:
+
+    - **Idempotent tools** — `@tool(idempotent=True)` dedupes on `(name, args)` inside the loop. No double-charge, double-book, double-page.
+    - **Reasoning loop nodes** — Reflexion, Grounding, Causal as first-class
+      Think → Execute → **Reflect** → Think nodes, not bolted-on libraries.
+    - **GSAR** — typed-grounding layer from [arXiv:2604.23366](https://arxiv.org/abs/2604.23366) with four-way claim partition + tiered replanning.
+    - **Termination algebra** — `MaxIterations(10) | TextMention("DONE") & ConfidenceMet(0.9)` is real Python (`__or__` / `__and__` operator overloads).
+    - **Six multi-agent shapes plus A2A** — Composition, Orchestrator, Swarm, Handoff, StateGraph, Functional + A2A for cross-process meshes.
+    - **OCI Generative AI day-zero** — two transports (V1 and native SDK), auto-routed by model id.
 
 ## Agent core
 
-| Feature | Surface |
-|---|---|
-| `Agent` + `AgentConfig` + `AgentResult` | `locus.agent` |
-| Composable termination algebra (`MaxIterations \| ToolCalled & ConfidenceMet`) | `locus.core.termination` |
-| Idempotent tools — `@tool(idempotent=True)` dedupes repeat calls | `locus.tools.decorator` |
-| Reflexion (`reflexion=True`) + Grounding (`grounding=True`) | `locus.reasoning` |
-| Causal chains (standalone graph builder) | `locus.reasoning.causal.CausalChain` |
-| Cancel signal (thread-safe `agent.cancel()`) | `Agent.cancel` |
-| Interrupts + resume (HITL) | `agent.run` yields `InterruptEvent`; `agent.resume(...)` |
-| Structured output (`output_schema=` Pydantic) | `locus.agent.config`, `locus.core.structured` |
-| Hooks lifecycle (before/after × invocation × tool × model + iteration) | `locus.hooks.provider` |
-| Plugin bundling (hooks + tools as one unit) | `locus.hooks.plugin` |
-
-## Memory
-
-| Feature | Backends |
-|---|---|
-| Native checkpointers | `MemoryCheckpointer`, `FileCheckpointer`, `HTTPCheckpointer`, `OCIBucketBackend` |
-| Storage-backed (auto-wrapped via `StorageBackendAdapter`) | `SQLiteBackend`, `RedisBackend`, `PostgreSQLBackend`, `OpenSearchBackend`, `OracleBackend` |
-| Conversation managers | `SlidingWindowManager`, `SummarizingManager`, `LLMCompactor` |
-| Long-term key-value store with optimistic locking (`version` counter) | `locus.memory.store` |
+| Feature | What it does | Surface |
+|---|---|---|
+| **Agent** + `AgentConfig` + `AgentResult` | The Think → Execute → Reflect → Terminate loop | `locus.agent` · [Agent loop](concepts/agent-loop.md) |
+| **Termination algebra** | Compose stop conditions with `&` and `\|` operator overloads | `locus.core.termination` · [Termination](concepts/termination.md) |
+| **Idempotent tools** | `@tool(idempotent=True)` dedupes repeat calls inside the loop — exactly-once side effects | `locus.tools.decorator` · [Idempotency](concepts/idempotency.md) |
+| **Reflexion** | Self-evaluation node in the ReAct cycle; rewrites the next turn when the last one was wrong | `Agent(reflexion=True)` · [Reasoning](concepts/reasoning.md) |
+| **Grounding** | LLM-as-judge claim verification against tool results; below-threshold triggers replanning | `Agent(grounding=True)` · [Reasoning](concepts/reasoning.md) |
+| **Causal chains** | Cause-effect graph builder with cycle/contradiction detection | `locus.reasoning.causal.CausalChain` · [Reasoning](concepts/reasoning.md) |
+| **GSAR** | Typed-grounding safety layer (arXiv:2604.23366) — four-way claim partition + tiered replanning | `Agent(gsar=GSARConfig(...))` · [GSAR](concepts/gsar.md) |
+| **Cancel** | Thread-safe abort during a run; emits `TerminateEvent` with reason | `agent.cancel()` · [Agent loop](concepts/agent-loop.md) |
+| **Interrupts (HITL)** | Pause via `InterruptEvent`; resume with `agent.resume(...)` | `locus.core.interrupt` · [Interrupts](concepts/interrupts.md) |
+| **Structured output** | Pass `output_schema=` (Pydantic), final answer is parsed into a typed instance | `locus.agent.config`, `locus.core.structured` · [Structured output](concepts/structured-output.md) |
+| **Hooks** | before/after × invocation × tool × model lifecycle observation + steering | `locus.hooks.provider` · [Hooks](concepts/hooks.md) |
+| **Plugins** | Bundle hooks + tools as one drop-in unit | `locus.hooks.plugin` · [Hooks](concepts/hooks.md) |
+
+## Multi-agent
+
+| Shape | What it does | Surface |
+|---|---|---|
+| **Composition** | Linear chain · fan-out + merge — the simplest multi-agent shape | `locus.multiagent.composition` · [Composition](concepts/multi-agent/composition.md) |
+| **Orchestrator** | One coordinator dispatches specialists in parallel | `locus.multiagent.orchestrator` · [Orchestrator](concepts/multi-agent/orchestrator.md) |
+| **Swarm** | Open-ended peer-to-peer collaboration | `locus.multiagent.swarm` · [Swarm](concepts/multi-agent/swarm.md) |
+| **Handoff** | Specialist-to-specialist context transfer with chain-of-custody | `locus.multiagent.handoff` · [Handoff](concepts/multi-agent/handoff.md) |
+| **StateGraph** | Cycles, conditional edges, subgraphs — when DAG isn't enough | `locus.multiagent.graph` · [StateGraph](concepts/multi-agent/graph.md) |
+| **Functional API** | Map / reduce over agents with `@task` and `@entrypoint` | `locus.multiagent.functional` · [Functional](concepts/multi-agent/functional.md) |
+| **A2A** | Cross-process agent meshes — `AgentCard` discovery + HTTP/SSE transport | `locus.a2a` · [A2A](concepts/multi-agent/a2a.md) |
+
+## Reasoning
+
+| Feature | What it does | Surface |
+|---|---|---|
+| **Reflexion** | After each turn, the agent self-evaluates and re-plans on wrong premises | `Agent(reflexion=True)` · [Reasoning](concepts/reasoning.md) |
+| **Grounding** | LLM-as-judge over claims vs the tool results that produced them | `Agent(grounding=True)` · [Reasoning](concepts/reasoning.md) |
+| **Causal** | Build a cause-effect graph from the trace; surface contradictions | `build_causal_chain()` · [Reasoning](concepts/reasoning.md) |
+| **GSAR** | Typed claim partition (cited / supported / unsupported / mismatched) + `proceed`/`regenerate`/`replan`/`abstain` decision | `Agent(gsar=GSARConfig(...))` · [GSAR](concepts/gsar.md) |
 
 ## Tools
 
-| Feature | Surface |
-|---|---|
-| `@tool` decorator with auto JSON-Schema | `locus.tools.decorator` |
-| Sequential / Concurrent / CircuitBreaker executors | `locus.tools.executor` |
-| Tool-result store offload (large outputs) | `locus.tools.result_storage` |
-| MCP — client + server | `locus.integrations.fastmcp` |
-| Path/URL safety helpers | `locus.tools.path_safety`, `locus.tools.url_safety` |
+| Feature | What it does | Surface |
+|---|---|---|
+| `@tool` decorator | Function → JSON-Schema-typed tool the model can call | `locus.tools.decorator` · [Tools](concepts/tools.md) |
+| Idempotent dedup | `@tool(idempotent=True)` skips repeat calls (same args) in the loop | `locus.tools.decorator` · [Idempotency](concepts/idempotency.md) |
+| **Sequential executor** | Run tool calls one at a time | `locus.tools.executor` · [Executors](concepts/executors.md) |
+| **Concurrent executor** | Run tool calls in parallel | `locus.tools.executor` · [Executors](concepts/executors.md) |
+| **CircuitBreaker executor** | Auto-disable a tool after N failures | `locus.tools.executor` · [Executors](concepts/executors.md) |
+| Result-store offload | Move large tool results to object storage; agent sees a pointer | `locus.tools.result_storage` |
+| Path / URL safety | Validate filesystem and network access from tool args | `locus.tools.path_safety`, `locus.tools.url_safety` · [Safety](concepts/safety.md) |
+| **MCP — client + server** | Talk to / be talked to by Anthropic-spec MCP servers | `locus.integrations.fastmcp` · [MCP](concepts/mcp.md) |
+
+## Memory — checkpointer backends
+
+| Backend | Best for | Surface |
+|---|---|---|
+| `MemoryCheckpointer` | Tests, REPL — in-process dict | `locus.memory.backends.memory` · [Checkpointers](concepts/checkpointers.md) |
+| `FileCheckpointer` | Local dev — JSON files on disk | `locus.memory.backends.file` |
+| `HTTPCheckpointer` | A remote checkpoint service you already run | `locus.memory.backends.http` |
+| **`OCIBucketBackend`** | OCI-native, lifecycle policies, region replication | `locus.memory.backends.oci_bucket` |
+| `SQLiteBackend` | Single-process durability | `locus.memory.backends.sqlite` |
+| `RedisBackend` | Multi-replica, fast, TTLs | `locus.memory.backends.redis` |
+| `PostgreSQLBackend` | Production DB with metadata queries | `locus.memory.backends.postgresql` |
+| `OpenSearchBackend` | Full-text search across past runs | `locus.memory.backends.opensearch` |
+| `OracleBackend` | Oracle DB with JSON queries | `locus.memory.backends.oracle` |
+
+## Memory — context management
+
+| Feature | What it does | Surface |
+|---|---|---|
+| `SlidingWindowManager` | Keeps the last N messages; drops the rest | `locus.memory.compactor` · [Conversation management](concepts/conversation-management.md) |
+| `SummarizingManager` | LLM rollup of older turns | `locus.memory.compactor` |
+| **`LLMCompactor`** | Budget-aware compaction with head + tail protection | `locus.memory.compactor` |
+| Long-term key-value store | Cross-run user prefs / results with optimistic-locking `version` counter | `locus.memory.store` |
 
 ## Hooks (built-in)
 
-`LoggingHook`, `StructuredLoggingHook`, `TelemetryHook` (OpenTelemetry),
-`NoOpTelemetryHook`, `ModelRetryHook`, `GuardrailsHook`,
-`ContentFilterHook`, `SteeringHook` — all import from
-`locus.hooks.builtin`.
+| Hook | What it does | Import |
+|---|---|---|
+| `LoggingHook` / `StructuredLoggingHook` | Stdlib / structured-JSON logs of every event | `locus.hooks.builtin` · [Observability](concepts/observability.md) |
+| **`TelemetryHook`** | OpenTelemetry traces + metrics (counters, histograms) | `locus.hooks.builtin` |
+| `NoOpTelemetryHook` | Opt-out variant for tests | `locus.hooks.builtin` |
+| `ModelRetryHook` | Auto-retry model calls on throttle/empty with exponential back-off | `locus.hooks.builtin` · [Retry](concepts/retry.md) |
+| **`GuardrailsHook`** | Block dangerous tools, redact PII, enforce content/topic policies | `locus.hooks.builtin` · [Safety](concepts/safety.md) |
+| `ContentFilterHook` | Standalone content moderation | `locus.hooks.builtin` |
+| **`SteeringHook`** | LLM-as-judge approval gate on every tool call | `locus.hooks.builtin` · [Safety](concepts/safety.md) |
 
-## Multi-agent
+## Streaming + Server
 
-`SequentialPipeline` / `ParallelPipeline` / `LoopAgent`
-(plus `sequential()`, `parallel()`, `loop()` helpers); `Orchestrator` +
-`Specialist`; `Swarm` + `SharedContext`; `Handoff` + `HandoffAgent`;
-`StateGraph` (cycles, conditional edges, subgraphs); Functional API
-(`@task` / `@entrypoint`); `A2AServer` + `A2AClient` + `AgentCard`.
+| Feature | What it does | Surface |
+|---|---|---|
+| **Typed events** | Frozen Pydantic events for `match`-statement consumers | `locus.core.events` · [Events](concepts/events.md) |
+| `StructuredStream` | Incremental Pydantic-partial parsing during streaming | `locus.core.structured` |
+| Console + SSE handlers | Render to terminal or stream over Server-Sent Events | `locus.core.events` · [Streaming](concepts/streaming.md) |
+| **`AgentServer`** | Drop-in FastAPI app: `/invoke`, `/stream`, `/threads/{id}`, `/health` | `locus.server` · [Agent Server](concepts/server.md) |
+| Per-principal threads | Bearer-token auth + thread-id namespacing prevents cross-tenant leaks | `AgentServer(api_key=...)` · [Agent Server](concepts/server.md) |
+| Graph streaming | Multi-agent state-graph event streams | `locus.multiagent.graph` · [Graph streaming](concepts/graph-streaming.md) |
 
 ## RAG
 
-Seven vector stores under `locus.rag.stores`: Chroma, in-memory,
-OpenSearch, Oracle 26ai, pgvector, Pinecone, Qdrant. Embeddings:
-`OCIEmbeddings`, `OpenAIEmbeddings`. Multimodal processors:
-`TextProcessor`, `ImageProcessor`, `PDFProcessor`, `AudioProcessor`,
-`MultimodalProcessor`.
+| Component | Options | Surface |
+|---|---|---|
+| Vector stores | Oracle 26ai · OpenSearch · pgvector · Qdrant · Pinecone · Chroma · in-memory | `locus.rag.stores` · [RAG](concepts/rag.md) |
+| Embeddings | `OCIEmbeddings` (Cohere) · `OpenAIEmbeddings` | `locus.rag.embeddings` |
+| Multimodal processors | Text · PDF (text + OCR) · Image (OCR) · Audio (transcription) | `locus.rag.multimodal` |
+| Tool wiring | `create_rag_tool(retriever)` exposes the retriever as a `@tool` | `locus.rag.tools` |
 
-## Streaming + Server
+## Models
 
-Typed events (`ThinkEvent`, `ModelChunkEvent`, `ToolStartEvent`,
-`ToolCompleteEvent`, `ReflectEvent`, `GroundingEvent`, `InterruptEvent`,
-`TerminateEvent`); `StructuredStream` (incremental Pydantic partials);
-console + SSE handlers; `AgentServer` with `/invoke`, `/stream`,
-`GET /threads/{id}`, `DELETE /threads/{id}`, `/health` and
-bearer-principal-scoped thread namespaces.
+| Provider | Models | Surface |
+|---|---|---|
+| **OCI Generative AI — V1 transport** | `openai.*`, `meta.*`, `xai.*`, `google.*`, `mistral.*` on OCI | `locus.models.providers.oci.openai_compat` · [OCI](concepts/providers/oci.md) |
+| **OCI Generative AI — SDK transport** | Cohere `command-r-*` series — proprietary chat shape | `locus.models.providers.oci.OCIModel` · [OCI](concepts/providers/oci.md) |
+| OpenAI | All commercial models (gpt-5, o-series, etc.) | `locus.models.providers.openai` · [OpenAI](concepts/providers/openai.md) |
+| Anthropic | Claude 4 / 4.5 / 4.7 / 4.8 — direct API | `locus.models.providers.anthropic` · [Anthropic](concepts/providers/anthropic.md) |
+| Ollama | Local models | `locus.models.providers.ollama` · [Ollama](concepts/providers/ollama.md) |
+| Auto-routing | `get_model("oci:openai.gpt-5")` picks transport from id | `locus.models.registry.get_model` |
+| Decorators | Failover · pooled · cached · rate-limited wrappers over any provider | `locus.models.decorators` |
 
 ## Skills + Playbooks
 
-Three-tier skill disclosure (`SkillsPlugin`); `PlaybookEnforcer` with
-YAML / JSON / Python loaders; `Skill.from_directory()` activation.
-
-## Models
-
-`OpenAIModel`, `AnthropicModel`, `OllamaModel`, `OCIModel` (native SDK
-transport for Cohere R-series), `OCIOpenAIModel` (`/openai/v1` for
-openai.*/ meta.* / xai.*/ google.* / mistral.* on OCI). `get_model()`
-auto-routes by model id. Failover, pooled, caching, rate-limit
-decorators included.
+| Feature | What it does | Surface |
+|---|---|---|
+| **Skills** | AgentSkills.io progressive disclosure (catalog → instructions → resources) | `locus.skills.SkillsPlugin` · [Skills](concepts/skills.md) |
+| `Skill.from_directory()` | Load a folder of `SKILL.md` bundles | `locus.skills.models.Skill` |
+| **Playbooks** | Numbered execution plans with per-step `PlaybookEnforcer` | `locus.playbooks` · [Playbooks](concepts/playbooks.md) |
+| YAML / JSON / Python loaders | Author playbooks in any of three formats | `locus.playbooks.loader` |
 
 ## Evaluation
 
-`EvalCase`, `EvalRunner`, `EvalReport`, `EvalResult` — pass/score/duration
-reporting, custom evaluators, `expected_tools` / `expected_output_contains`
-matchers.
+| Class | What it does | Surface |
+|---|---|---|
+| `EvalCase` | A single test case — expected tools / output / iteration / duration budgets | `locus.evaluation` · [Evaluation](concepts/evaluation.md) |
+| `EvalRunner` | Runs a list of cases against an agent, returns `EvalReport` | `locus.evaluation` |
+| `EvalResult` | Per-case pass / score / duration + diagnostic checks | `locus.evaluation` |
+| `EvalReport` | Aggregate stats with `summary()` + JSON serialisation | `locus.evaluation` |
 
-## Source pointers
+## Where to next
 
-For depth on any feature, the README headlines link to its source
-directory; canonical entry is `src/locus/__init__.py`.
+- **For first-time visitors**: [Quickstart](how-to/quickstart.md) ships a working agent in five minutes.
+- **For architecture**: [Agent loop](concepts/agent-loop.md) is the canonical reference.
+- **For depth on any feature**: most rows on this page link to a concept page. Source lives at [`src/locus/`](https://github.com/oracle-samples/locus/tree/main/src/locus); the canonical entry point is [`src/locus/__init__.py`](https://github.com/oracle-samples/locus/blob/main/src/locus/__init__.py).
diff --git a/docs/concepts/hooks.md b/docs/concepts/hooks.md
index e5c8a79..32fc117 100644
--- a/docs/concepts/hooks.md
+++ b/docs/concepts/hooks.md
@@ -1,20 +1,47 @@
 # Hooks
 
-Hooks observe and modify agent behavior at lifecycle points. Every
-hook inherits `HookProvider` and is registered in a `HookRegistry`.
-Events fire at six phases:
+Hooks are how you **observe and modify** agent behaviour at the
+moments that matter — before / after the run starts, before / after
+each model call, before / after each tool call. Every cross-cutting
+concern that *isn't* the agent's primary task lives here: logging,
+telemetry, retry policy, guardrails, PII redaction, LLM-as-judge tool
+approval.
 
-1. `on_before_invocation` — before the agent starts
-2. `on_after_invocation` — after the agent finishes
-3. `on_before_model_call` — before each model request
-4. `on_after_model_call` — after each model response
-5. `on_before_tool_call` — before each tool runs
-6. `on_after_tool_call` — after each tool completes
+You can use the ones locus ships (they cover most production needs out
+of the box) or write your own — a hook is a small subclass with the
+methods it cares about.
 
-## Writing a hook
+## When to write a hook
+
+| You want… | Write a hook |
+|---|---|
+| Log every tool call to your aggregator | ✓ |
+| Add OpenTelemetry spans / metrics | ✓ — use the built-in `TelemetryHook` |
+| Retry model calls with backoff | ✓ — `ModelRetryHook` |
+| Reject tool calls that look dangerous | ✓ — `GuardrailsHook`, `ContentFilterHook`, `SteeringHook` |
+| Add a tool to the registry | use [`tools=[...]` on Agent](tools.md) |
+| Change the system prompt mid-run | hooks can read state but not mutate the prompt; use a [skill](skills.md) instead |
+
+## The six lifecycle phases
+
+A hook can subscribe to any of these. Each method receives a typed,
+write-protected event object.
+
+| Phase | Fires | Useful for |
+|---|---|---|
+| `on_before_invocation` | once, when `agent.run()` starts | initialise per-run state, open spans |
+| `on_after_invocation` | once, after the agent finishes | flush metrics, close spans |
+| `on_before_model_call` | before each request to the model | redact PII, count tokens |
+| `on_after_model_call` | after each response from the model | log usage, retry on empty |
+| `on_before_tool_call` | before each tool body runs | guardrails, audit, approval gates |
+| `on_after_tool_call` | after each tool body completes | log result, update metrics |
+
+## Getting started
+
+### 1. Subclass `HookProvider`
 
 ```python
-from locus.hooks.provider import HookProvider, HookPriority
+from locus.hooks.provider import HookPriority, HookProvider
 
 class AuditHook(HookProvider):
     name = "audit"
@@ -25,58 +52,145 @@ class AuditHook(HookProvider):
 
     async def on_after_tool_call(self, event):
         print(f"← {event.tool_name} = {event.result}")
-
-agent = Agent(..., hooks=[AuditHook()])
 ```
 
-## Priorities
-
-Hooks run in priority order (lower number first for `before_*`,
-reversed for `after_*` so teardown pairs with setup):
-
-| Range | Intended use |
-|---|---|
-| 0–99 | Security (guardrails, PII redaction) |
-| 100–199 | Observability (logging, telemetry) |
-| 200–299 | Business logic |
-| 300+ | Cosmetic |
+Override only the phases you care about. Unimplemented phases inherit
+no-op defaults from the base class.
 
-Use the constants in `HookPriority` instead of magic numbers.
+### 2. Pass to the agent
 
-## Write-protected events
+```python
+agent = Agent(
+    model="oci:openai.gpt-5",
+    tools=[search, book_flight],
+    hooks=[AuditHook()],
+)
+```
 
-Event objects are Pydantic models with frozen fields. You cannot
-accidentally mutate them from a hook. Methods that exist to let hooks
-steer the agent — cancelling a tool, retrying a model call — are
-explicit, so the intent is unambiguous.
+### 3. Run
 
-## Built-in hooks
+The hook fires automatically — no further wiring.
 
-Locus ships these out of the box:
+## What you get out of the box
 
-| Hook | What it does |
-|---|---|
-| `LoggingHook` / `StructuredLoggingHook` | Plain or JSON-structured logs at every phase |
-| `TelemetryHook` / `NoOpTelemetryHook` | OpenTelemetry spans + counters + histograms |
-| `ModelRetryHook` | Backoff retries on empty / rate-limited model responses |
-| `GuardrailsHook` / `ContentFilterHook` | PII / SQL / XSS / command-injection regex policies |
-| `SteeringHook` | LLM-as-judge tool approval (a second model votes before each tool call) |
+locus ships these hooks. Composed in this order, they cover most
+production needs without writing custom code.
 
 ```python
 from locus.hooks.builtin import (
-    GuardrailsHook,
-    LoggingHook,
+    LoggingHook, StructuredLoggingHook,
+    TelemetryHook,
     ModelRetryHook,
+    GuardrailsHook, ContentFilterHook,
     SteeringHook,
-    TelemetryHook,
 )
 
 agent = Agent(
-    ...,
+    model="oci:openai.gpt-5",
+    tools=[...],
     hooks=[
-        LoggingHook(),
-        ModelRetryHook(max_retries=3),
-        GuardrailsHook(),
+        StructuredLoggingHook(),       # JSON logs at every phase
+        TelemetryHook(),               # OTel spans + counters + histograms
+        ModelRetryHook(max_retries=3), # backoff on empty / rate-limited responses
+        GuardrailsHook(),              # PII / SQL / XSS / command-injection
+        SteeringHook(approver=second_model),  # LLM-as-judge tool approval
     ],
 )
 ```
+
+### `LoggingHook` / `StructuredLoggingHook`
+
+Plain-text or JSON-structured logs at every lifecycle phase. Drop in
+when you want a paper trail without writing your own logger.
+
+### `TelemetryHook`
+
+OpenTelemetry spans for every model + tool call, counters for tool
+invocations, histograms for latency. Use `NoOpTelemetryHook` when
+you want the API surface but no actual export (useful for tests).
+
+### `ModelRetryHook`
+
+Backoff retries on empty model responses, rate-limit errors, and
+transient connection failures. Configurable `max_retries` and
+`backoff_seconds`. Doesn't intercept your tool calls — only the
+model layer.
+
+### `GuardrailsHook` / `ContentFilterHook`
+
+Regex-based policies on tool inputs (`GuardrailsHook`) and model
+outputs (`ContentFilterHook`). Catches PII, SQL injection patterns,
+shell-command injection, and credit-card-shaped strings. Reject or
+redact at the boundary.
+
+### `SteeringHook` — LLM-as-judge tool approval
+
+A *second model* sees each tool call before it runs and votes
+"approve / reject / rewrite". Use this when the cost of a wrong tool
+call is higher than the cost of a second model round-trip.
+
+```python
+agent = Agent(
+    ...,
+    hooks=[SteeringHook(approver="oci:openai.gpt-5")],
+)
+```
+
+## Priorities — the ordering rules
+
+Hooks run in priority order. Lower numbers run first on `before_*`
+phases; the order reverses for `after_*` so teardown pairs with
+setup.
+
+| Range | Intended use |
+|---|---|
+| `0`–`99` | **Security** — guardrails, PII redaction (must run first to short-circuit unsafe calls) |
+| `100`–`199` | **Observability** — logging, telemetry |
+| `200`–`299` | **Business logic** — domain-specific hooks |
+| `300+` | **Cosmetic** — pretty-printing, console UI |
+
+Use the constants in `HookPriority` (e.g. `HookPriority.SECURITY_MAX`,
+`HookPriority.OBSERVABILITY_MIN`) instead of magic numbers — the
+intent is more obvious in code review.
+
+## Write-protected events — by design
+
+Event objects are frozen Pydantic models. You **cannot** accidentally
+mutate them from a hook — try and you get a `ValidationError`. The
+methods that *do* let hooks steer the agent (`event.cancel()`,
+`event.retry()`, `event.replace_arguments(...)`) are explicit and
+named for what they do, so the intent is unambiguous in a review:
+
+```python
+async def on_before_tool_call(self, event):
+    if "DROP TABLE" in str(event.arguments):
+        event.cancel(reason="SQL injection blocked by GuardrailsHook")
+```
+
+Compare to a callback-based system where any code can monkey-patch
+any field; this is intentionally tight.
+
+## Common gotchas
+
+| Symptom | Likely cause |
+|---|---|
+| Hook never fires | Forgot to pass it on `Agent(hooks=[...])`. The `HookRegistry` only sees what you register. |
+| Hook fires in the wrong order | Set `priority` explicitly. The default priority is intentionally mid-range so security hooks always come before yours. |
+| `ValidationError: cannot mutate frozen instance` | You tried to write `event.foo = bar`. Hooks observe events rather than mutate them; use the explicit steering methods. |
+| `on_after_tool_call` doesn't see the result | The tool raised. Check `event.error` instead of `event.result`. |
+| Telemetry spans aren't exported | `TelemetryHook` needs an OTel exporter configured upstream — see [Observability](observability.md). |
+
+## Source and examples
+
+- [`HookProvider` and `HookOrchestrator`](https://github.com/oracle-samples/locus/blob/main/src/locus/hooks/provider.py)
+- [Built-in hooks](https://github.com/oracle-samples/locus/tree/main/src/locus/hooks/builtin)
+- [`tutorial_05_agent_hooks.py`](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_05_agent_hooks.py) — write your first hook.
+- [`tutorial_27_hooks_advanced.py`](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_27_hooks_advanced.py) — guardrails + steering, end to end.
+
+## See also
+
+- [Tools](tools.md) — the things hooks observe.
+- [Events](events.md) — the typed event objects hooks receive.
+- [Safety & guardrails](safety.md) — production policies built on `GuardrailsHook`.
+- [Observability](observability.md) — wiring `TelemetryHook` to your OTel collector.
+- [Retry strategies](retry.md) — how `ModelRetryHook` works under the hood.
diff --git a/docs/concepts/idempotency.md b/docs/concepts/idempotency.md
index 4e19c90..061c136 100644
--- a/docs/concepts/idempotency.md
+++ b/docs/concepts/idempotency.md
@@ -1,11 +1,36 @@
 # Idempotency
 
-The single most important word in production agents is **once**. The
-model is allowed to retry; the side-effect isn't. locus makes that a
-one-keyword decision on the tool.
+> The single most important word in production agents is **once**.
+
+The model is *allowed* to retry. The side effect *isn't*. locus
+makes that distinction a one-keyword decision on the tool, enforced
+inside the ReAct loop. This is a locus-specific primitive — none of
+LangChain / LangGraph / CrewAI / Strands ship it.
+
+If you ever plan to run an agent that **books**, **charges**,
+**emails**, **pages**, or **writes**, this is the single most
+important page on the docs site.
+
+## When to use `idempotent=True`
+
+| Situation | `idempotent=True`? |
+|---|---|
+| Side-effecting tool with real-world cost (charge, email, page, book) | **yes — always** |
+| Database write you can't trivially roll back | **yes** |
+| External service that's already idempotent on its end | yes — locus dedupes the round-trip too |
+| Read-only catalogue lookup | no — re-reads are cheap, leave it to the model |
+| Tool that *intentionally* generates a new entity each call (e.g. `mint_uuid`) | no — that breaks the contract |
+
+## How it works
+
+Inside a single agent run, locus hashes the tool's
+`(name, arguments)` tuple as the model emits each call. **The first
+call with a given key hits the function body** and the result is
+recorded. **Every subsequent call with the same key short-circuits
+to the cached response** without invoking the body.
 
 ```python
-from locus.tools.decorator import tool
+from locus import tool
 
 @tool(idempotent=True)
 def transfer(from_acct: str, to_acct: str, amount: float) -> dict:
@@ -13,44 +38,103 @@ def transfer(from_acct: str, to_acct: str, amount: float) -> dict:
     return ledger.transfer(from_acct, to_acct, amount)
 ```
 
-Inside a single agent run, locus hashes the tool's `(name, kwargs)`
-tuple. The first call hits the body and the result is cached. Every
-subsequent call with identical arguments — whether the model retried,
-got confused, or asked again on a later turn — short-circuits to the
-cached response.
+The argument hash is the trust boundary:
+
+- **Same call**: the model re-emits `transfer("A", "B", 100)` after
+  seeing the receipt → cache hit, body skipped.
+- **Different call**: the model emits `transfer("A", "B", 200)` →
+  different key, body runs.
+
+Caching is keyed on the **canonical JSON form** of the arguments, so
+key order, default values, and whitespace don't matter.
 
 ## Why this matters
 
-- **Booking, billing, payments.** The model that calls `book_flight`
-  twice is more common than you think. Without idempotency you have a
-  duplicate charge and an angry customer.
-- **Outbound side-effects.** `email_cfo`, `page_oncall`, `submit_po` —
-  one and done.
-- **Database writes you can't easily roll back.**
+### Booking, billing, payments
+
+The model that calls `book_flight` twice in one run is more common
+than you think. Sometimes it sees an ambiguous tool result and tries
+again "to be sure". Sometimes the network glitches and the model
+believes the call failed. Without idempotency, you charge the
+customer twice and they're on the phone with their bank.
+
+```python
+@tool(idempotent=True)
+def book_flight(flight_id: str, customer_id: str) -> dict:
+    return billing.charge_and_book(flight_id, customer_id)
+```
+
+The customer gets billed once. Always.
 
-The argument hash is the trust boundary: if the model re-issues the
-*same* call, you fire once. If it changes any argument, that's a new
-call and the body runs.
+### Outbound side-effects
 
-## When to use it
+`email_cfo`, `page_oncall`, `submit_po`, `slack_alert` — anything
+that touches a human or a downstream system. **One and done**.
 
-| Situation | `idempotent=True`? |
+### Database writes you can't roll back
+
+Inserting into a journal table, appending to a Kafka topic, signing
+a JWT: these are operations where retrying isn't free. Idempotent
+tools turn the "exactly once" problem into a
+"not-our-problem-after-the-first-call" guarantee.
+
+### Replays after checkpoint resume
+
+When a checkpointer resumes a stalled run, the model may decide to
+re-issue tool calls it's already seen. Idempotent tools see the
+cache pre-populated from the checkpoint and skip the side effect on
+replay. (This requires `tool_executions` to be restored from the
+checkpoint; locus's [native checkpointers](checkpointers.md) handle
+it.)
+
+## What it is *not*
+
+| Concept | Idempotency is… | Idempotency is *not*… |
+|---|---|---|
+| Scope | within a single agent run | cross-run — restart and the cache is gone (use a [checkpointer](checkpointers.md)) |
+| Failure | one fire per identical call | retry — if the body raises, the exception propagates as the cached "result" |
+| Boundary | per-agent | network — two different agents both calling `transfer(a, b, 100)` each fire once |
+
+If you need cross-run idempotency, configure a checkpointer + an
+idempotent server-side endpoint. The combo gives you "the side
+effect runs at most once across all replays of all agents".
+
+## Practical recipe — vendor PO approval
+
+A canonical multi-agent idempotency shape: an agent (or three of
+them, debating) loops over a vendor decision, then writes once.
+
+```python
+@tool(idempotent=True)
+def submit_po(vendor_id: str, line_items: list[dict]) -> dict:
+    return procurement.submit(vendor_id, line_items)
+
+@tool(idempotent=True)
+def email_cfo(po_id: str, summary: str) -> str:
+    return mail.send(to="cfo@org.com", subject=f"PO {po_id}", body=summary)
+```
+
+The agent can iterate ten times reasoning about whether to approve.
+The PO ships once. The CFO email lands once. The model can fail
+mid-run and a checkpointer-backed resume re-issues the same calls;
+the side effects still fire exactly once.
+
+## Common gotchas
+
+| Symptom | Likely cause |
 |---|---|
-| Side-effecting tool with a real-world cost (charge, email, page) | **yes** |
-| Read-only catalogue lookup | no — caching the model's reads is its problem, not yours |
-| Tool that *intentionally* generates a new entity each call (e.g. `mint_uuid`) | no |
-| External service that's already idempotent | yes anyway — locus dedupes the round-trip too |
+| Tool re-fires despite `idempotent=True` | Argument changed between calls. Check that the model isn't mutating ids / amounts between turns. |
+| Idempotent cache survives across runs unexpectedly | It shouldn't: only the checkpointer persists state. If you see this, the run is probably being resumed from a checkpoint you didn't intend to load. |
+| Body raised first time, cache returns the exception | This is by design — the failure is part of the "result" of the first call. The model sees the failure and can react. To re-attempt, the model must change an argument. |
+| Read-only lookup tagged `idempotent=True` | Harmless but wasteful — the cache hit savings are negligible vs the read itself. Leave it off. |
 
-## What it is not
+## Source and tutorial
 
-- It's not idempotency *across runs*. Restart the agent and the cache
-  is gone — that's what your **checkpointer** is for.
-- It's not retry. If the body raises, the exception propagates.
-- It's not a network-layer cache. Two different agents calling
-  `transfer(a, b, 100)` each fire once.
+- [`@tool` decorator with idempotency hook](https://github.com/oracle-samples/locus/blob/main/src/locus/tools/decorator.py)
+- [`_find_matching_execution`](https://github.com/oracle-samples/locus/blob/main/src/locus/loop/nodes.py#L114) — where the dedup actually happens, in the ReAct loop's Execute node.
+- [`tutorial_03_tools_and_state.py`](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_03_tools_and_state.py) — walks through `@tool(idempotent=True)` end-to-end.
 
-## Source and tutorials
+## See also
 
-- `src/locus/tools/decorator.py` — the `@tool` decorator and idempotency hook.
-- Tutorial: [`tutorial_03_tools_and_state.py`](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_03_tools_and_state.py)
-  walks through `@tool(idempotent=True)` end-to-end.
+- [Tools](tools.md) — the full `@tool` decorator surface.
+- [Checkpointers](checkpointers.md) — durable runs where idempotency interacts with replay.
diff --git a/docs/concepts/mcp.md b/docs/concepts/mcp.md
index 3a69c9a..f0fec86 100644
--- a/docs/concepts/mcp.md
+++ b/docs/concepts/mcp.md
@@ -1,51 +1,166 @@
-# MCP (both ways)
+# MCP — Model Context Protocol
 
 The [Model Context Protocol](https://modelcontextprotocol.io) is an
-Anthropic-spec interop standard for tools. locus speaks MCP in both
-directions.
+Anthropic-spec interop standard for tools. Define a tool once,
+expose it over MCP, and any MCP-compatible client (Claude Desktop,
+Cline, Strands, another locus agent) can call it. Or consume tools
+from existing MCP servers (filesystem, git, postgres, github,
+sequential-thinking) without writing any glue.
 
-## Consume MCP servers
+**locus speaks MCP both ways**. That's a deliberate differentiator —
+most agent frameworks consume MCP servers but don't expose their own
+tools as MCP. Round-trip means an agent built with locus can sit on
+either side of the conversation.
 
-`MCPClient` wraps an external MCP server's tools so the agent can call
-them as if they were native locus tools.
+## When to use MCP
+
+| You want… | Use MCP |
+|---|---|
+| Your locus agent to use Anthropic's published filesystem / git / postgres servers | ✓ — `MCPClient` |
+| Your `@tool` library to be callable by Claude Desktop / Cline / other agents | ✓ — `LocusMCPServer` |
+| Two locus agents to share tools across processes / machines | ✓ — works, but [A2A](multi-agent/a2a.md) is the better protocol |
+| In-process multi-agent — share tools by importing | use the [tools](tools.md) directly, not MCP |
+| Deterministic tests | use [Ollama](providers/ollama.md) + plain `@tool` — MCP adds I/O |
+
+## Getting started — consume an MCP server
+
+### 1. Install the MCP extras
+
+```bash
+pip install "locus[mcp]"
+```
+
+### 2. Spawn the server and wrap it with `MCPClient`
 
 ```python
 from locus.integrations.fastmcp import MCPClient
 
-# spawn the MCP server as a subprocess (stdio transport)
-fs = MCPClient.stdio(command=["npx", "-y", "@modelcontextprotocol/server-filesystem", "/data"])
+# Spawn Anthropic's filesystem server as a subprocess (stdio transport):
+fs = MCPClient.stdio(
+    command=["npx", "-y", "@modelcontextprotocol/server-filesystem", "/data"],
+)
+```
+
+`MCPClient.stdio` runs the subprocess, opens an MCP session over its
+stdin/stdout, and discovers what tools the server exposes.
 
-agent = Agent(model=..., tools=[*fs.tools()])  # MCP tools become locus tools
+### 3. Pass the tools straight into an Agent
+
+```python
+from locus import Agent
+
+agent = Agent(
+    model="oci:openai.gpt-5.5",
+    tools=[*fs.tools()],          # MCP tools become locus tools
+    system_prompt="You can read files in /data.",
+)
+result = agent.run_sync("Summarise the README in /data.")
 ```
 
-The client registers every MCP tool with locus's tool registry, with
-schema, descriptions, and call-through plumbing intact.
+`fs.tools()` returns a list of locus `Tool` objects with full
+schemas, descriptions, and call-through plumbing. The agent doesn't
+know they're MCP — they look like any other `@tool`.
 
-## Expose locus tools as MCP
+## Getting started — expose your tools as MCP
 
-`LocusMCPServer` turns a set of locus tools into an MCP server other
-agents can consume.
+### 1. Wrap a tool list in `LocusMCPServer`
 
 ```python
 from locus.integrations.fastmcp import LocusMCPServer
 
 server = LocusMCPServer(tools=[search_vendors, submit_po])
-server.run_stdio()        # or .run_http(port=7400)
 ```
 
-Anthropic Claude, Strands, or any MCP-spec client can now call your
-locus tools.
+### 2. Pick a transport
+
+```python
+server.run_stdio()                    # option 1: stdio, for desktop clients
+server.run_http(port=7400)            # option 2: HTTP (call one, not both)
+```
+
+`run_stdio()` is what Claude Desktop, Cline, and most MCP clients
+expect. `run_http()` runs an HTTP MCP server (transport + JSON-RPC)
+that any HTTP MCP client can reach.
+
+### 3. Point a client at it
+
+For Claude Desktop on macOS, edit `~/Library/Application Support/Claude/claude_desktop_config.json`:
+
+```json
+{
+  "mcpServers": {
+    "my-locus-tools": {
+      "command": "python",
+      "args": ["-m", "my_package.mcp_server"]
+    }
+  }
+}
+```
+
+Restart Claude Desktop. Your `search_vendors` and `submit_po` tools
+appear in the model's tool list.
+
+## What you get out of the box
+
+### Schema preservation
+
+A `@tool`'s function name, docstring, and type hints become the MCP
+tool's name, description, and JSON schema, losslessly. The MCP
+client sees the same parameter types, defaults, and descriptions a
+locus agent would.
+
+### Both transports
+
+| Transport | Use case |
+|---|---|
+| **stdio** — process pipes | Desktop clients (Claude Desktop, Cline). The MCP server is spawned as a subprocess. |
+| **HTTP** — JSON-RPC over POST | Browser-side or networked clients. Good for shared tool servers. |
+
+### Idempotency carries through
+
+A tool tagged `@tool(idempotent=True)` keeps that semantic when
+exposed via MCP. The dedup happens locus-side; the MCP client
+doesn't need to know.
 
 ## Round-trip example
 
-A common shape: locus agent A consumes an MCP filesystem server, plus
-a locus agent B exposed as MCP that A can also call. Same client API,
-different transports.
+A common shape: a locus agent A consumes a filesystem MCP server,
+*and* exposes its own tools as MCP for another agent B to consume:
+
+```python
+# Agent A — consumes filesystem, exposes its own analytics tools
+fs = MCPClient.stdio(command=[...])      # consumer side
+analytics = LocusMCPServer(              # producer side
+    tools=[summarise_csv, plot_histogram],
+)
+analytics.run_http(port=7400, in_background=True)
+
+agent_a = Agent(
+    model="oci:openai.gpt-5.5",
+    tools=[*fs.tools(), summarise_csv, plot_histogram],
+)
+```
+
+Same `MCPClient` API on the consumer side, same `LocusMCPServer` on
+the producer side, same tool definitions. The transport is an
+implementation detail.
+
+## Common gotchas
+
+| Symptom | Likely cause |
+|---|---|
+| `MCP server failed to start` | The MCP server subprocess crashed before establishing the session. Run the command manually to see the error. |
+| `Tool 'X' not found in MCP discovery` | The server exposes a different name than you expected. Print `[t.name for t in fs.tools()]` to see the actual list. |
+| `Schema validation failed on call` | A value passed to or returned by the MCP tool doesn't match its declared schema. Common with hand-written MCP servers; the standard ones are fine. |
+| Claude Desktop doesn't show your locus tools | `claude_desktop_config.json` not picked up — check the file lives at the right path and Claude has been restarted. |
+| Hangs on `MCPClient.stdio` startup | The MCP subprocess is waiting for input on stdin (some servers expect a handshake). Pass `wait_for_init=True` and a timeout. |
 
-## Tutorial
+## Source and tutorial
 
-[`tutorial_12_mcp_integration.py`](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_12_mcp_integration.py).
+- [`locus.integrations.fastmcp`](https://github.com/oracle-samples/locus/blob/main/src/locus/integrations/fastmcp.py) — built on FastMCP.
+- [`tutorial_12_mcp_integration.py`](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_12_mcp_integration.py) — consumer + producer end-to-end.
 
-## Source
+## See also
 
-`src/locus/integrations/fastmcp.py` — built on FastMCP.
+- [Tools](tools.md) — the `@tool` decorator MCP wraps.
+- [A2A](multi-agent/a2a.md) — purpose-built protocol for cross-process locus-to-locus agent meshes.
diff --git a/docs/concepts/providers/openai.md b/docs/concepts/providers/openai.md
index d3c21c8..5810501 100644
--- a/docs/concepts/providers/openai.md
+++ b/docs/concepts/providers/openai.md
@@ -33,12 +33,12 @@ That's the only setup. locus reads the env var automatically.
 ```python
 from locus import Agent
 
-agent = Agent(model="openai:gpt-5.5", system_prompt="You are helpful.")
+agent = Agent(model="openai:gpt-5", system_prompt="You are helpful.")
 ```
 
-The string `"openai:gpt-5.5"` does two things: tells locus to use the
+The string `"openai:gpt-5"` does two things: tells locus to use the
 OpenAI provider (`openai:` prefix), and which model id to call
-(`gpt-5.5`). Any model id OpenAI accepts, locus accepts.
+(`gpt-5`). Any model id OpenAI accepts, locus accepts.
 
 ### 3. Run it
 
@@ -55,7 +55,7 @@ without further configuration.
 
 ### Chat completions across the GPT family
 
-Every chat-shaped OpenAI model: `gpt-4o`, `gpt-4.1`, `gpt-5`, `gpt-5.5`,
+Every chat-shaped OpenAI model: `gpt-4o`, `gpt-4.1`, `gpt-5`,
 `gpt-image-1`. Vision input (image URLs / base64), audio input, and
 function calling work the same way you'd use them on the OpenAI SDK
 directly — locus just normalises the events the model emits.
@@ -107,7 +107,7 @@ class Answer(BaseModel):
     confidence: float
 
 agent = Agent(
-    model="openai:gpt-5.5",
+    model="openai:gpt-5",
     output_schema=Answer,
     system_prompt="Reply as JSON matching the schema.",
 )
diff --git a/docs/concepts/streaming.md b/docs/concepts/streaming.md
index babc0ce..7f367c3 100644
--- a/docs/concepts/streaming.md
+++ b/docs/concepts/streaming.md
@@ -1,13 +1,46 @@
 # Streaming
 
-Every locus agent emits typed events as it runs. They are real
-classes, not strings — drop them into `match` statements and let the
-type checker verify your handler is exhaustive.
+Every locus agent emits a **typed event stream** as it runs. The
+events aren't strings or `dict[str, Any]` blobs — they're frozen
+Pydantic classes, designed to drop into a `match` statement and let
+your type checker verify the handler is exhaustive.
+
+This is the surface a UI consumes (live token rendering, tool-call
+indicators, reasoning bubbles), the surface telemetry hooks observe,
+and the surface `AgentServer` re-emits over Server-Sent Events for
+browsers.
+
+## When to consume the event stream
+
+| You want… | Use… |
+|---|---|
+| Live token-by-token rendering in a UI | `async for event in agent.run(...)` |
+| The final answer as a single value (tests, scripts, REPL) | `agent.run_sync(prompt).message` — no event handling |
+| Spans / metrics on every model + tool call | install [`TelemetryHook`](hooks.md#telemetryhook) |
+| To stream over HTTP to a browser | [`AgentServer`](server.md) re-emits as SSE |
+
+## Getting started
+
+### 1. Use `agent.run(prompt)` instead of `run_sync`
+
+```python
+async for event in agent.run("Plan a trip to Paris."):
+    print(event)
+```
+
+`agent.run(...)` returns an async iterator. Each iteration yields one
+event in the order it occurred.
+
+### 2. Pattern-match on the event types
 
 ```python
 from locus.core.events import (
-    ThinkEvent, ToolStartEvent, ToolCompleteEvent,
-    ModelChunkEvent, ReflectEvent, TerminateEvent,
+    ThinkEvent,
+    ToolStartEvent,
+    ToolCompleteEvent,
+    ModelChunkEvent,
+    ReflectEvent,
+    TerminateEvent,
 )
 
 async for event in agent.run("Plan a trip to Paris."):
@@ -19,50 +52,126 @@ async for event in agent.run("Plan a trip to Paris."):
         case ToolCompleteEvent(tool_name=n, result=r):
             print(f"   ↳ {r}")
         case ModelChunkEvent(content=c) if c:
-            print(c, end="", flush=True)        # token-level streaming
+            print(c, end="", flush=True)            # token-level streaming
         case ReflectEvent(assessment=a, new_confidence=c):
             print(f"🪞 {a} ({c:.2f})")
         case TerminateEvent(final_message=m):
             print(f"\n✅ {m}")
 ```
 
-## Event taxonomy
+`match` tests the event against each `case` in order. A type checker
+can flag a mistyped field name (e.g. `reasonng` instead of
+`reasoning`) and, given an `assert_never` fallback, a missing branch.
 
-| Event | When |
-|---|---|
-| `ThinkEvent` | Model emits reasoning (extended-thinking models). |
-| `ModelChunkEvent` | Each streamed text chunk. Pipe straight to a UI. |
-| `ToolStartEvent` | Agent decided to call a tool. |
-| `ToolCompleteEvent` | Tool returned (or raised). |
-| `ReflectEvent` | Reflexion loop emitted a self-evaluation. |
-| `GroundingEvent` | Grounding evaluation finished. |
-| `InterruptEvent` | A tool requested human-in-the-loop input. |
-| `TerminateEvent` | The run is done — terminal condition met. |
+## The event taxonomy
+
+| Event | When it fires | Useful for |
+|---|---|---|
+| `ThinkEvent` | The model emits reasoning (extended-thinking models like Claude 4 / o-series) | Render "thinking…" bubbles in a UI |
+| `ModelChunkEvent` | Each streamed text chunk from the model | Token-level live rendering |
+| `ToolStartEvent` | The agent decided to call a tool | Show a "calling X" indicator |
+| `ToolCompleteEvent` | A tool returned (or raised — check `error`) | Show the result inline |
+| `ReflectEvent` | Reflexion emitted a self-evaluation | Show "I'm checking my work" |
+| `GroundingEvent` | Grounding evaluation finished | Show "verifying claims" |
+| `InterruptEvent` | A tool requested human-in-the-loop input | Block on user approval |
+| `TerminateEvent` | The run finished — terminal condition met | Show the final answer |
+
+Every event carries an `event_type` discriminator and a UTC
+`timestamp`, so persisted streams replay deterministically.
+
+## Write-protected — by design
+
+Events are **frozen** Pydantic models. A hook can read every field;
+it **cannot** mutate one. Try and you get a `ValidationError`. If a
+hook wants to steer the agent (cancel a tool, retry a model call),
+it uses an explicit method on the event (`event.cancel()`,
+`event.retry()`, `event.replace_arguments(...)`) — the intent is
+visible in code review.
+
+Why this is important: in callback-based event systems any code can
+silently mutate a field and you find out three hops downstream when
+the value's wrong. locus's frozen events make that impossible.
 
-Every event carries `event_type` and a UTC `timestamp`.
+## Sync wrapper — when you don't need the stream
 
-## Write-protected
+```python
+result = agent.run_sync("What is 2+2?")
+print(result.message)        # e.g. Four.
+print(result.metrics.iterations)
+```
+
+`agent.run_sync(prompt)` consumes the event stream internally and
+returns the final `AgentResult`. The events still emit (hooks still
+fire), but you get a single value back. Use this in tests, REPLs,
+and scripts where the trace doesn't matter.
+
+## Practical recipe — render to a terminal UI
+
+```python
+async for event in agent.run("Find Q3 revenue and email it to me."):
+    match event:
+        case ToolStartEvent(tool_name=n):
+            print(f"\n🔧 {n}", end="", flush=True)
+        case ToolCompleteEvent(error=e) if e:
+            print(f" ✗ {e}")
+        case ToolCompleteEvent():
+            print(" ✓")
+        case ModelChunkEvent(content=c) if c:
+            print(c, end="", flush=True)
+        case TerminateEvent():
+            print()
+```
 
-Events are write-protected value objects. A hook *cannot* mutate one;
-the type system enforces it. If a hook needs to influence the run, it
-returns a control directive (e.g. `Cancel`, `Retry`).
+Every event class is a small Pydantic record — there's no hidden
+state. What you see is what gets serialised over SSE, what your
+checkpointer persists, what your structured logger records.
 
-## Sync wrapper
+## SSE over HTTP — for browser UIs
 
-If you don't want to consume events, `agent.run_sync(prompt)` returns
-the final `AgentResult` directly.
+The reference [`AgentServer`](server.md) maps the same event stream
+onto Server-Sent Events. Same `event_type`, same fields, just
+`Content-Type: text/event-stream` over HTTP.
 
-## SSE over HTTP
+```python
+from locus.server import AgentServer
+import uvicorn
+
+server = AgentServer(agent=agent)
+uvicorn.run(server.app, port=8000)
+```
+
+```javascript
+// Browser-side
+const es = new EventSource('/stream?prompt=...');
+es.addEventListener('ModelChunkEvent', (e) => {
+    const { content } = JSON.parse(e.data);
+    document.getElementById('out').innerText += content;
+});
+```
 
-The reference [AgentServer](server.md) maps the same events onto
-Server-Sent Events for browser consumption — same shape, different
-transport.
+## Common gotchas
+
+| Symptom | Likely cause |
+|---|---|
+| `async for` exhausts immediately | You're calling `agent.run_sync()` (sync) instead of `agent.run()` (async). |
+| `ModelChunkEvent`s but no `TerminateEvent` | Generator was cancelled mid-stream. Check for exceptions in the consumer. |
+| Same event fires twice | A hook re-yielded an event it received. Hooks observe, they don't re-emit. |
+| Browser SSE drops every 30s | Default proxy timeout. Set `proxy_read_timeout` higher or have the agent send heartbeats. |
 
 ## Tutorials
 
-- [`tutorial_04_agent_streaming.py`](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_04_agent_streaming.py)
-- [`tutorial_21_sse_streaming.py`](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_21_sse_streaming.py)
+- [`tutorial_04_agent_streaming.py`](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_04_agent_streaming.py) — your first event consumer.
+- [`tutorial_21_sse_streaming.py`](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_21_sse_streaming.py) — full SSE wiring against `AgentServer`.
 
 ## Source
 
-`src/locus/streaming/` and `src/locus/core/events.py`.
+- [`locus.core.events`](https://github.com/oracle-samples/locus/blob/main/src/locus/core/events.py) — every event class.
+- [`Agent.run`](https://github.com/oracle-samples/locus/blob/main/src/locus/agent/agent.py) — the iterator that emits them.
+- [`AgentServer`](https://github.com/oracle-samples/locus/tree/main/src/locus/server) — the SSE wrapper.
+
+## See also
+
+- [Events](events.md) — full taxonomy in reference form.
+- [Hooks](hooks.md) — observe the same stream from inside the loop.
+- [Agent Server](server.md) — re-emit over HTTP/SSE.
+- [Graph streaming](graph-streaming.md) — multi-agent state-graph event streams.
diff --git a/docs/concepts/tools.md b/docs/concepts/tools.md
index 9143e15..0d80735 100644
--- a/docs/concepts/tools.md
+++ b/docs/concepts/tools.md
@@ -1,27 +1,69 @@
 # Tools
 
-Tools are the agent's way of affecting the world. You write a regular
-Python function, decorate it, and pass it to `Agent(tools=[...])`. The
-`@tool` decorator introspects the signature and docstring to build a
-JSON-schema description the model can call.
+Tools are how a locus agent affects the world. The model decides
+*"call `search` with query='hnsw'"*; locus runs your `search`
+function, captures the return value, and feeds it back. From your
+side, a tool is **a regular Python function with a `@tool`
+decorator** — locus introspects the signature and docstring to build
+the schema the model sees.
+
+This is the seam most production code touches. Get tools right and
+the rest of the framework gets out of your way.
+
+## When to write a tool
+
+| You want… | Write a tool |
+|---|---|
+| The model to call your API / database / file system | ✓ |
+| Side-effecting actions the model should be able to invoke | ✓ |
+| Read-only lookups (catalogue search, status checks) | ✓ |
+| To mutate the agent's *internal* state (system prompt, config) | use a [hook](hooks.md), not a tool |
+| To intercept *every* tool call (logging, retry) | use a [hook](hooks.md) |
+
+## Getting started
+
+### 1. Decorate a function
 
 ```python
 from locus import tool
 
 @tool
 def search(query: str, limit: int = 10) -> list[str]:
-    """Search the knowledge base for `query`, up to `limit` results."""
+    """Search the knowledge base for ``query``, up to ``limit`` results."""
     return backend.search(query, limit)
 ```
 
-The docstring becomes the tool description. Parameters are taken
-from the signature — type hints drive the JSON schema. Defaults are
-optional parameters.
+The docstring becomes the tool description the model reads. Type
+hints (`str`, `int`, `list[str]`) build the JSON schema. Defaults
+mark optional parameters.
+
+### 2. Pass to the agent
+
+```python
+agent = Agent(model="oci:openai.gpt-5.5", tools=[search])
+```
+
+That's the wiring. The model now sees `search` in its tool list and
+can call it whenever it decides to.
+
+### 3. Run it
+
+```python
+result = agent.run_sync("Find documents about HNSW.")
+```
+
+If the model decides to call `search("hnsw")`, locus invokes your
+function with that argument, captures the return value, and feeds it
+into the next model turn. You write Python; locus handles the
+schema marshalling.
 
-## Idempotent tools
+## What you get out of the box
 
-Some tools have side effects you never want duplicated — bookings,
-transfers, writes. Mark them idempotent:
+### Idempotent tools — the model can retry; the side effect can't
+
+This is locus's flagship tool primitive. Some side-effecting tools
+must run *exactly once* per logical request — bookings, charges,
+emails, paging. Mark them `idempotent=True`:
 
 ```python
 @tool(idempotent=True)
@@ -33,42 +75,133 @@ def book_flight(flight_id: str, customer_id: str) -> dict:
 ```
 
 When the model re-issues a tool call with the same
-`(name, arguments)` that already ran in this agent run, the ReAct
-loop reuses the prior result instead of invoking the function again.
-Useful for defending against:
+`(name, arguments)` tuple that already ran in this agent run, the
+ReAct loop **reuses the prior result instead of invoking the
+function again**. Defends against:
 
-- Models that repeat calls after seeing the result.
-- Network glitches where a call looks failed but actually succeeded.
+- Models that re-emit the same call after seeing the result.
+- Network glitches where a call appears failed but actually succeeded.
 - Users re-prompting "do X" when X has already been done.
+- Replays after a checkpoint resume.
+
+Read the [idempotency concept page](idempotency.md) for the full
+picture and the matching tutorial.
+
+### Sync and async bodies
+
+Both shapes are supported. Async bodies run on the agent's event
+loop directly; sync bodies run in a thread-pool executor so the loop
+is never blocked.
+
+```python
+@tool
+def add(a: int, b: int) -> int:
+    return a + b                        # sync — runs in thread pool
+
+@tool
+async def fetch(url: str) -> str:
+    async with httpx.AsyncClient() as c:
+        return (await c.get(url)).text   # async — runs on the loop
+```
+
+### Parallel by default — fast when the model wants multiple things
 
-This is a Locus-specific primitive; LangChain, LangGraph, and Strands
-do not ship it.
+```python
+agent = Agent(
+    model=...,
+    tools=[search_a, search_b, search_c],
+    tool_execution="concurrent",   # default
+)
+```
+
+When the model emits multiple tool calls in one turn, locus runs
+them concurrently via `asyncio.gather`. Three independent searches
+finish in roughly `max(t1, t2, t3)`, not `t1+t2+t3`.
+
+If your tools have side effects that must be ordered, switch to
+`tool_execution="sequential"`.
 
-## Custom names and descriptions
+### Error handling — tool failures don't crash the agent
 
-Override the defaults via keyword arguments:
+If a tool raises, the executor catches the exception, wraps it as a
+`ToolResult(success=False, error=...)`, and feeds it back into the
+next model turn. The model sees the failure and can react: retry,
+try a different tool, or report to the user.
 
 ```python
-@tool(name="find_customer", description="Look up a customer by email.")
-async def _find(email: str) -> Customer:
+@tool
+def lookup_by_id(id: str) -> dict:
+    record = db.get(id)
+    if record is None:
+        raise ValueError(f"no record with id={id}")
+    return record
+```
+
+The model sees `"no record with id=42"` and decides what to do.
+Behind the scenes, locus chains the original exception as the cause
+on a `ToolExecutionError` for your structured logs.
+
+### Custom names and descriptions
+
+Override the auto-derived defaults when the function name doesn't
+read well to the model:
+
+```python
+@tool(name="find_customer", description="Look up a customer by email address.")
+async def _find_customer_internal(email: str) -> Customer:
     ...
 ```
 
-Both sync and async bodies are supported. Sync bodies run in a
-thread-pool executor so the event loop is not blocked.
+The model sees `find_customer`; your code keeps the internal name.
+
+## Practical recipes
+
+### Read-only lookups
+
+```python
+@tool
+def get_order_status(order_id: str) -> dict:
+    """Return the current status and shipment info for an order."""
+    return orders.get(order_id)
+```
+
+No need for `idempotent=True` — read-only calls are safe to repeat.
+
+### Idempotent writes
+
+```python
+@tool(idempotent=True)
+def submit_po(vendor_id: str, line_items: list[dict]) -> dict:
+    """Submit a purchase order. Re-fires return the cached PO id."""
+    return procurement.submit(vendor_id, line_items)
+```
+
+### A tool that's also exposed via MCP
+
+If you've built a tool you want other agents to reach, expose it
+through `LocusMCPServer` — same `@tool`, no rewrite. See
+[MCP](mcp.md).
+
+## Common gotchas
 
-## Parallel vs sequential execution
+| Symptom | Likely cause |
+|---|---|
+| Model never calls the tool | Description / docstring isn't telling the model when to use it. Be explicit: *"Use this tool when the user asks about X."* |
+| Tool fires twice on the same input | You're seeing the model retry. Add `idempotent=True`. |
+| `TypeError: missing 1 required positional argument` at call time | Function signature has a parameter without a default that you didn't surface in the docstring; the model omitted it. Add a default or explain the parameter. |
+| Tool returns Python objects but the model echoes `<__main__.X object at 0x…>` | Tool return value isn't JSON-serialisable. Return a dict / Pydantic model / list of strings, not arbitrary objects. |
+| Async tool blocks the event loop | The "async" body is calling sync I/O. Wrap the blocking call in `asyncio.to_thread(...)` or use an async client. |
 
-The agent decides based on `config.tool_execution`:
+## Source
 
-- `"concurrent"` (default) — tool calls run in parallel via
-  `asyncio.gather`.
-- `"sequential"` — tool calls run one at a time. Pick this when tool
-  side effects must be ordered.
+- [`@tool` decorator and `Tool` class](https://github.com/oracle-samples/locus/blob/main/src/locus/tools/decorator.py)
+- [`ToolRegistry`](https://github.com/oracle-samples/locus/blob/main/src/locus/tools/registry.py)
+- [Built-in tools](https://github.com/oracle-samples/locus/tree/main/src/locus/tools/builtins) — `get_today_date`, `task_complete`, `ask_user`
 
-## Error handling
+## See also
 
-If a tool raises, the exception is caught at the executor boundary,
-wrapped as a `ToolResult(success=False, error=...)`, and passed to the
-model so it can react. The original exception is chained as the cause
-on a `ToolExecutionError` (see [Errors](errors.md)).
+- [Idempotency](idempotency.md) — the full story on `idempotent=True`.
+- [Hooks](hooks.md) — for cross-cutting concerns (logging, retry, guardrails).
+- [Executors](executors.md) — how concurrent vs sequential tool execution works.
+- [MCP](mcp.md) — expose your tools to other agents over the Model Context Protocol.
+- [Errors](errors.md) — how tool failures surface in the event stream.
diff --git a/docs/img/sequence-26ai.svg b/docs/img/sequence-26ai.svg
index 0173c8e..609a3ed 100644
--- a/docs/img/sequence-26ai.svg
+++ b/docs/img/sequence-26ai.svg
@@ -54,7 +54,7 @@
   <g filter="url(#card)">
     <rect x="700" y="148" width="140" height="44" rx="8" fill="#312D2A"/>
     <text x="770" y="169" text-anchor="middle" font-size="13" font-weight="700" fill="#FFFFFF">OCI GenAI</text>
-    <text x="770" y="184" text-anchor="middle" font-size="10" fill="#D6D3D1" font-family="ui-monospace, monospace">gpt-5.5 · cohere-embed</text>
+    <text x="770" y="184" text-anchor="middle" font-size="10" fill="#D6D3D1" font-family="ui-monospace, monospace">gpt-5 · cohere-embed</text>
   </g>
 
   <g filter="url(#card)">
diff --git a/docs/index.md b/docs/index.md
index e957f6a..63557d8 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -7,21 +7,23 @@ hide:
 <div class="locus-hero" markdown>
 <div class="locus-hero__copy" markdown>
 
-# Build AI workflows that <span class="accent">actually ship</span>
+# Build agents that reason and <span class="accent">solve together.</span>
 
-**Oracle Generative AI · Multi-Agent · Reasoning · Orchestrator SDK.**
+**The Oracle Gen AI Multi-Agent Reasoning SDK.**
 
-Spin up a **swarm** of specialists. Hand a conversation off across an
-**escalation desk**. Run an **orchestrator** of experts in parallel.
-Wire up a **state graph** that loops until confident. Mesh agents
-**across processes** with A2A. Or just ship one self-correcting agent
-that knows when to stop.
+Reasoning lives inside the loop. **Reflexion** evaluates every turn.
+**Grounding** verifies every claim against its source. **Causal**
+traces root cause from symptom.
 
-Six multi-agent shapes. One Oracle-native runtime. Every model on OCI
-the day it lands. The agent stack you'd actually let near a credit
-card.
+Seven shapes for seven problems. **Compose** linear pipelines.
+**Orchestrate** specialists in parallel. **Swarm** for peer-to-peer
+research. **Handoff** for escalation desks. **StateGraph** loops
+until confident. **Functional** maps across agents. **A2A** meshes
+across processes.
 
-[See what you can build](#what-you-can-build){ .md-button .md-button--primary }
+Every model on Oracle Generative AI the day it lands.
+
+[See what you can build](#six-things-you-can-ship){ .md-button .md-button--primary }
 [GitHub](https://github.com/oracle-samples/locus){ .md-button }
 
 ```bash
@@ -34,7 +36,7 @@ pip install "locus[oci]"
 
 <div class="locus-hero__code" markdown>
 
-```python title="travel_concierge.py"
+```python
 from locus import Agent
 from locus.tools.decorator import tool
 from locus.memory.backends import OCIBucketBackend
@@ -53,7 +55,7 @@ def book_flight(flight_id: str, customer_id: str) -> dict:
     return billing.charge_and_book(flight_id, customer_id)
 
 agent = Agent(
-    model="oci:openai.gpt-5.5",
+    model="oci:openai.gpt-5",
     tools=[search_flights, book_flight],
     system_prompt="You are a travel concierge. Find a flight, then book it.",
     reflexion=True,                                 # self-correct mid-run
@@ -77,70 +79,220 @@ print(result.message)
 </div>
 </div>
 
-## What you can build
+## Six things you can ship
+
+### Claims grounded. Citations real. Hallucinations dropped
+
+**Reflexion** evaluates every turn and feeds the next Think a sharper
+plan. **Grounding** scores each claim against the tool result it came
+from; below-threshold claims get dropped or sent back for re-research.
+**Causal** traces root cause from symptom in incident-triage runs.
+
+```python
+from locus import Agent
+from locus.tools.decorator import tool
+
+@tool
+def search_web(query: str) -> str:
+    """Search the web for facts."""
+    return search_api.query(query)
+
+@tool
+def read_url(url: str) -> str:
+    """Fetch and clean text from a URL."""
+    return http.fetch_text(url)
+
+agent = Agent(
+    model="oci:openai.gpt-5",
+    tools=[search_web, read_url],
+    reflexion=True,    # self-evaluate every turn
+    grounding=True,    # verify claims against tool results
+)
+
+result = agent.run_sync("Summarise the Q3 earnings call. Cite every number.")
+print(result.message)
+print(f"grounding score: {result.grounding_score:.2f}")
+# → grounding score: 0.94  (illustrative: three claims grounded; the revenue-mix claim dropped)
+```
+
+→ [Reasoning inside the loop](concepts/reasoning.md) ·
+[Turn on Reflexion + Grounding in one line](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_14_reasoning_patterns.py)
 
-Six concrete workflows. All of them ship in production with locus
-today. None of them require a graph editor, a YAML DAG, or a
-separate orchestration platform.
+### Side effects fire once. Even when the model retries
 
-### Approval workflows that don't double-fire
+The model can re-emit the same call after seeing an ambiguous result,
+after a network glitch, or after a checkpointed restart. With
+**`@tool(idempotent=True)`** the body fires exactly once per
+`(name, arguments)` hash. Booking, billing, paging — safe by design.
 
-A vendor PO comes in. Procurement and Compliance debate it against
-your live Oracle 26ai catalogue. They reach a recommendation. A human
-clicks `[y/N]`. The Approval Officer fires `submit_po` and
-`email_cfo` — once, even if the model retries the same call three
-times.
+```python
+from locus import Agent
+from locus.tools.decorator import tool
 
-> *Procurement and Compliance disagree on three of nine vendors. The
-> human approves two. Submit + email fire exactly once. Your CFO is
-> happy.*
+@tool(idempotent=True)
+def submit_po(vendor_id: str, line_items: list[dict]) -> dict:
+    """Submit the PO. Re-fires within the run return the cached receipt."""
+    return procurement.submit(vendor_id, line_items)
 
-### Research crews that catch their own mistakes
+@tool(idempotent=True)
+def email_cfo(po_id: str, body: str) -> str:
+    """Send the CFO note. Same arguments → same delivery."""
+    return mail.send(to="cfo@org.com", subject=f"PO {po_id}", body=body)
 
-An agent reads, summarises, and fact-checks. **Grounding**
-auto-verifies every claim against the source it cited. When a claim
-fails grounding the agent goes back and re-reads. **Reflexion**
-spots loops on wrong premises before they cost you ten turns of
-tokens. You get cited, grounded answers — not hallucinated narratives.
+agent = Agent(
+    model="oci:openai.gpt-5",
+    tools=[search_vendors, submit_po, email_cfo],   # search_vendors: another @tool, defined elsewhere
+    system_prompt="Approve a vendor; submit the PO; email the CFO.",
+)
 
-### Customer support that survives every deploy
+result = agent.run_sync("Approve Acme for the $42k laptop refresh.")
+# → PO-2847 submitted. CFO emailed once. Three model retries deduped on
+#   the (name, kwargs) hash inside the ReAct loop's Execute node.
+```
 
-Triage decides whether the conversation needs Billing or Shipping.
-The whole transcript hands over. The customer sees one continuous
-reply. The conversation thread is checkpointed to OCI Object Storage,
-so a redeploy mid-chat doesn't lose context. The customer doesn't
-have to re-explain.
+→ [Idempotent tools in the ReAct loop](concepts/idempotency.md) ·
+[Walk through a vendor PO with human approval](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_09_human_in_the_loop.py)
 
-### Autonomous workflows that stop when they should
+### One conversation, many specialists
 
-Compose stop conditions like algebra:
+**Handoff** transfers context, tool history, and confidence from
+specialist to specialist. The customer sees one continuous reply;
+each team ships its specialist on its own schedule, in its own repo.
 
 ```python
-terminate = (ToolCalled("submit") & ConfidenceMet(0.9)) | MaxIterations(15)
+from locus.multiagent.handoff import (
+    create_handoff_agent, create_handoff_manager,
+)
+
+triage = create_handoff_agent(
+    name="Triage",
+    description="Routes incoming customer issues",
+    system_prompt="Decide: Billing or Shipping. Then hand off.",
+)
+billing = create_handoff_agent(
+    name="Billing",
+    description="Resolves invoices, refunds, charges",
+    system_prompt="Resolve the billing issue end-to-end.",
+)
+shipping = create_handoff_agent(
+    name="Shipping",
+    description="Tracks orders, reroutes shipments",
+    system_prompt="Resolve the shipping issue end-to-end.",
+)
+triage.can_delegate_to = [billing.id, shipping.id]
+
+desk = create_handoff_manager(
+    agents=[triage, billing, shipping],
+    max_chain=5,
+)
+# Illustrative transcript: [Triage → Billing] "Refunded $129. Confirmation RF-19340."
 ```
 
-The loop stops when the work is actually done — not when the budget
-runs out, not when the agent gives up halfway. Inspect, unit-test,
-audit; termination is just data.
+→ [Handoff with chain-of-custody](concepts/multi-agent/handoff.md) ·
+[Wire a Triage / Billing / Shipping handoff desk](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_16_agent_handoff.py)
 
-### Multi-agent meshes across teams and processes
+### Agent meshes across teams and processes
 
-Your research agent calls a finance agent on another team's service
-over **A2A**. They share one event stream. Each agent advertises an
-`AgentCard` that lists its capability tags; the calling agent fetches
-the card from a known URL and decides whether to delegate. You ship
-one agent at a time, on your team's schedule, in your team's repo —
-and they still talk.
+Each agent publishes an **`AgentCard`** at `/agent-card`. Your research
+agent fetches the card from the Finance team's URL, reads the skills
+list, and decides whether to delegate. HTTP+SSE under the hood, no
+shared infrastructure required.
 
-### Agents that ship to your users on day one
+```python
+import asyncio
+from locus.a2a import A2AClient
+
+async def main():
+    # The Finance team publishes their agent at this URL.
+    finance = A2AClient(url="https://finance.example.com")
+
+    # Discover capabilities (name, description, skills).
+    card = await finance.get_agent_card()
+    print(f"Calling {card.name} — {card.description}")
+    print(f"Skills: {card.skills}")
+
+    # Delegate.
+    answer = await finance.invoke(
+        "Pull Q3 OPEX vs forecast for line items 4100-4250."
+    )
+    print(answer)
+    # → Q3 OPEX: $47M vs forecast $51M (-8%, supply-chain delays).
+
+asyncio.run(main())
+```
+
+→ [A2A — agents across processes](concepts/multi-agent/a2a.md) ·
+[Call another team's agent over A2A](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_34_a2a_protocol.py)
+
+### Stop conditions you can compose
+
+Compose stop conditions with Python's `&` and `|` operators: each one is
+a typed class that overloads `__and__` / `__or__`, so it stays inspectable,
+unit-testable, and serialisable. You can grep your codebase for *exactly
+when* an agent decides to stop. The loop ends when the work is done.
+
+```python
+from locus import Agent
+from locus.core.termination import (
+    MaxIterations, ToolCalled, ConfidenceMet, TextMention,
+)
+
+termination = (
+    (ToolCalled("submit_po") & ConfidenceMet(0.9))   # work done + confident
+    | TextMention(r"\bDONE\b")                         # …or model says DONE
+    | MaxIterations(15)                                # …or safety cap
+)
+
+agent = Agent(
+    model="oci:openai.gpt-5",
+    tools=[search_vendors, submit_po],   # both @tool functions, defined elsewhere
+    termination=termination,
+)
+
+result = agent.run_sync("Approve and submit the laptop PO.")
+print(result.termination_reason)
+# → ToolCalled('submit_po') and ConfidenceMet(0.92)
+```
+
+→ [Termination algebra](concepts/termination.md) ·
+[Compose stop conditions like algebra](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_37_termination.py)
+
+### Day-one production deployment
+
+**`AgentServer`** wraps any agent as a FastAPI app: `POST /invoke`,
+`POST /stream` for SSE, `GET`/`DELETE /threads/{id}` with per-principal
+persistence — two API keys can't read each other's threads. Ship to
+OKE, Container Instances, OCI Functions, or anywhere FastAPI runs.
+
+```python
+import os
+from locus import Agent
+from locus.memory.backends import oci_bucket_checkpointer
+from locus.server import AgentServer
+
+agent = Agent(
+    model="oci:openai.gpt-5",
+    tools=[lookup_invoice, refund],   # both @tool functions, defined elsewhere
+    checkpointer=oci_bucket_checkpointer(
+        bucket_name="support-threads",
+        namespace="<your-tenancy>",
+    ),
+)
+
+server = AgentServer(
+    agent=agent,
+    api_key=os.environ["LOCUS_SERVER_API_KEY"],
+)
+server.run(host="0.0.0.0", port=8080)
+
+# $ curl -X POST http://localhost:8080/invoke \
+#       -H "Authorization: Bearer $LOCUS_SERVER_API_KEY" -H "Content-Type: application/json" \
+#       -d '{"prompt":"Refund order ORD-42","thread_id":"user-c42"}'
+# → {"message": "Refunded $129. Confirmation RF-19340.", "thread_id": "user-c42"}
+```
 
-`AgentServer` is a drop-in FastAPI app: `POST /invoke` for synchronous
-runs, `POST /stream` for SSE-streamed events, `GET` / `DELETE
-/threads/{id}` for per-thread persistence (scoped to the bearer
-principal so two API keys can't read each other's conversations).
-Native to Oracle Generative AI — every model the day OCI ships it.
-Two transports, one auth surface, zero glue between laptop and
-production.
+→ [Agent Server — drop-in FastAPI app](concepts/server.md) ·
+[Deploy a locus agent as a FastAPI service](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_28_agent_server.py)
 
 ## The locus agent loop
 
@@ -294,7 +446,7 @@ def book_flight(flight_id: str, customer_id: str) -> dict:
     return billing.charge_and_book(flight_id, customer_id)
 
 agent = Agent(
-    model="oci:openai.gpt-5.5",
+    model="oci:openai.gpt-5",
     tools=[book_flight],
     system_prompt="You are a travel concierge. Book the flight the user asks for.",
 )
diff --git a/docs/stylesheets/locus.css b/docs/stylesheets/locus.css
index f7798e4..1f722de 100644
--- a/docs/stylesheets/locus.css
+++ b/docs/stylesheets/locus.css
@@ -198,11 +198,18 @@
   grid-template-columns: 1.05fr 1fr;
   gap: 2.4rem;
   align-items: start;          /* both columns top-align — no dead vertical space */
-  margin: 1rem 0 1.4rem;
-  padding: 1.6rem 0 0.8rem;
+  margin: -0.4rem 0 1.4rem;
+  padding: 0.4rem 0 0.8rem;
   isolation: isolate;
   overflow: hidden;
 }
+/* Tighten the gap between the tabs strip and the hero on the home page —
+   the default Material content padding leaves dead space above the H1. */
+.md-content__inner:has(> .locus-hero),
+.md-content__inner:has(> div > .locus-hero) {
+  padding-top: 0.2rem;
+  margin-top: 0;
+}
 /* Oracle-red soft glow behind the H1 — the warm spotlight */
 .md-typeset .locus-hero::before {
   content: "";
@@ -238,7 +245,7 @@
   font-weight: 800;
   text-transform: lowercase;
   color: var(--locus-ink);
-  margin: 0 0 1.4rem;
+  margin: 0 0 0.6rem;
 }
 .md-typeset .locus-hero h1 .accent {
   color: var(--or-red);
@@ -692,6 +699,35 @@
   letter-spacing: -0.01em;
 }
 
+/* ---------------------------------------------------------------------------
+   Oracle-distinctive callout — used on the Capabilities page to highlight
+   wedge features. Oracle-red border + warm sand tint, red star icon.
+   Trigger with: `!!! oracle-distinctive "Distinctive to locus"`
+   --------------------------------------------------------------------------- */
+:root {
+  --md-admonition-icon--oracle-distinctive: url('data:image/svg+xml;charset=utf-8,<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="M12 17.27 18.18 21l-1.64-7.03L22 9.24l-7.19-.61L12 2 9.19 8.63 2 9.24l5.46 4.73L5.82 21z"/></svg>');
+}
+.md-typeset .admonition.oracle-distinctive,
+.md-typeset details.oracle-distinctive {
+  border-color: var(--or-red);
+  background: linear-gradient(135deg,
+    rgba(199, 70, 52, 0.04) 0%,
+    rgba(240, 204, 113, 0.04) 100%);
+}
+.md-typeset .oracle-distinctive > .admonition-title,
+.md-typeset .oracle-distinctive > summary {
+  background-color: rgba(199, 70, 52, 0.08);
+  color: var(--or-red-deep);
+  font-weight: 700;
+  letter-spacing: -0.01em;
+}
+.md-typeset .oracle-distinctive > .admonition-title::before,
+.md-typeset .oracle-distinctive > summary::before {
+  background-color: var(--or-red);
+  -webkit-mask-image: var(--md-admonition-icon--oracle-distinctive);
+          mask-image: var(--md-admonition-icon--oracle-distinctive);
+}
+
 /* ---------------------------------------------------------------------------
    Diagrams — responsive sizing for the agent-loop SVG and similar.
    --------------------------------------------------------------------------- */
@@ -709,6 +745,22 @@
   max-width: 920px;
 }
 
+/* The architecture diagrams (agent-loop, multi-agent-patterns, the
+   per-pattern SVGs under img/patterns/, and the architecture / sequence
+   topologies) are authored against a light background. In dark mode the
+   dark-grey strokes and labels vanish, so render them on a near-white
+   card. */
+[data-md-color-scheme="slate"] .md-typeset img[src*="agent-loop"],
+[data-md-color-scheme="slate"] .md-typeset img[src*="multi-agent-patterns"],
+[data-md-color-scheme="slate"] .md-typeset img[src*="img/patterns/"],
+[data-md-color-scheme="slate"] .md-typeset img[src*="architecture"],
+[data-md-color-scheme="slate"] .md-typeset img[src*="sequence-26ai"],
+[data-md-color-scheme="slate"] .md-typeset img.diagram {
+  background-color: #FBF9F8;
+  padding: 1rem;
+  border-radius: 0.5rem;
+}
+
 /* ---------------------------------------------------------------------------
    Dark-mode logo handling.
    - Oracle wordmark (black-fill) → invert filter flips to white.
diff --git a/mkdocs.yml b/mkdocs.yml
index bdb3465..bfb28d1 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -170,4 +170,4 @@ nav:
   - Checkpointers: api/checkpointers.md
   - Tools: api/tools.md
   - Events: api/events.md
-- Feature matrix: FEATURES.md
+- Capabilities: FEATURES.md