From 9db72dcad04a4ede6073a05b896458f3b859c7db Mon Sep 17 00:00:00 2001 From: Federico Kamelhar Date: Fri, 1 May 2026 20:55:17 -0400 Subject: [PATCH] docs: rewrite tools/idempotency/hooks/streaming/mcp + new homepage hero + Capabilities + dark-mode diagrams MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Concept-page rewrites — same template as the provider pages: pitch / when to pick / getting started / capabilities (each runnable) / gotchas / source. Pages sized 74→156, 56→138, 82→148, 68→140, 51→139. Homepage rewrite: - H1 → "Build agents that reason and solve together." (verb-led, two beats, captures both reasoning + multi-agent pillars). - Subhead → "The Oracle Gen AI Multi-Agent Reasoning SDK." (matches the logo wordmark, Oracle-anchored category claim). - Body → reasoning lives inside the loop (Reflexion / Grounding / Causal) + six-shape verb chain (Compose / Orchestrate / Swarm / Handoff / StateGraph / Functional / A2A) + Oracle proof closer. - "Six things you can ship" section — developer-narrative ordering (Reason → Act → Coordinate → Mesh → Decide → Deploy), benefit-led H3 titles, 14–20-line code blocks with real imports + commented output, descriptive concept + tutorial links (no opaque "Tutorial 03" labels). - H1 margin tightened (1.4rem → 0.6rem) so subhead sits closer. - Drop title='travel_concierge.py' from the hero code block. Capabilities page (renamed from "Feature matrix"): - Honest naming — it's a capability list, not a comparison matrix. - New top-of-page Oracle-distinctive admonition (Oracle-red border + sand-tinted gradient + red star icon, defined in CSS) calls out the six wedge features at a glance. - Every section uses the same 3-column table (Feature / What it does / Surface + concept link). Drops the prose dumps. - mkdocs.yml nav label updated. 
Dark-mode diagrams: - Architecture SVGs (agent-loop, multi-agent-patterns, the per-pattern SVGs under img/patterns/, architecture, sequence-26ai) were authored against a light background; dark-mode rendering hid the dark-grey strokes. Added a near-white card in slate scheme. Model id fix mirrored: openai.gpt-5.5 → openai.gpt-5 in homepage code blocks, README, and openai provider doc. (Same fix shipping in PR #38 across the rest of the codebase.) Signed-off-by: Federico Kamelhar --- README.md | 25 +-- docs/FEATURES.md | 200 ++++++++++++++-------- docs/concepts/hooks.md | 206 +++++++++++++++++----- docs/concepts/idempotency.md | 152 +++++++++++++---- docs/concepts/mcp.md | 163 +++++++++++++++--- docs/concepts/providers/openai.md | 10 +- docs/concepts/streaming.md | 173 +++++++++++++++---- docs/concepts/tools.md | 203 ++++++++++++++++++---- docs/img/sequence-26ai.svg | 2 +- docs/index.md | 272 +++++++++++++++++++++++------- docs/stylesheets/locus.css | 58 ++++++- mkdocs.yml | 2 +- 12 files changed, 1146 insertions(+), 320 deletions(-) diff --git a/README.md b/README.md index ce67082..507836b 100644 --- a/README.md +++ b/README.md @@ -3,8 +3,8 @@

- Build AI workflows that actually ship.
- Oracle Generative AI · Multi-Agent · Reasoning · Orchestrator SDK. + Build agents that reason and solve together.
+ The Oracle Gen AI Multi-Agent Reasoning SDK.

@@ -26,14 +26,17 @@ --- -Spin up a **swarm** of specialists. Hand a conversation off across an -**escalation desk**. Run an **orchestrator** of experts in parallel. -Wire up a **state graph** that loops until confident. Mesh agents -**across processes** with A2A. Or just ship one self-correcting agent -that knows when to stop. +Reasoning lives inside the loop. **Reflexion** evaluates every turn. +**Grounding** verifies every claim against its source. **Causal** +traces root cause from symptom. -Six multi-agent shapes plus A2A. One Oracle-native runtime. Every -model on OCI the day it lands. +Six shapes for six problems. **Compose** linear pipelines. +**Orchestrate** specialists in parallel. **Swarm** for peer-to-peer +research. **Handoff** for escalation desks. **StateGraph** loops +until confident. **Functional** maps across agents. **A2A** meshes +across processes. + +Every model on Oracle Generative AI the day it lands. ```bash pip install "locus[oci]" @@ -58,7 +61,7 @@ def book_flight(flight_id: str, customer_id: str) -> dict: return billing.charge_and_book(flight_id, customer_id) agent = Agent( - model="oci:openai.gpt-5.5", + model="oci:openai.gpt-5", tools=[search_flights, book_flight], system_prompt="You are a travel concierge. Find a flight, then book it.", reflexion=True, # self-correct mid-run @@ -191,7 +194,7 @@ def book_meeting(date: str, attendees: list[str]) -> dict: return calendar.book(date, attendees) agent = Agent( - model="oci:openai.gpt-5.5", + model="oci:openai.gpt-5", tools=[get_today_date, book_meeting], system_prompt="You are a scheduling assistant.", ) diff --git a/docs/FEATURES.md b/docs/FEATURES.md index a38f4c0..76d038f 100644 --- a/docs/FEATURES.md +++ b/docs/FEATURES.md @@ -1,93 +1,157 @@ -# Locus feature matrix +# Capabilities -What ships in `locus`, grouped by area. +Everything `locus` ships, what it does, and where to find it. + +!!! 
oracle-distinctive "Distinctive to locus" + These ship as core primitives, inside the ReAct loop — not as + middleware, plugins, or third-party libraries: + + - **Idempotent tools** — `@tool(idempotent=True)` dedupes on `(name, args)` inside the loop. No double-charge, double-book, double-page. + - **Reasoning loop nodes** — Reflexion, Grounding, Causal as first-class + Think → Execute → **Reflect** → Think nodes, not bolted-on libraries. + - **GSAR** — typed-grounding layer from [arXiv:2604.23366](https://arxiv.org/abs/2604.23366) with four-way claim partition + tiered replanning. + - **Termination algebra** — `MaxIterations(10) | TextMention("DONE") & ConfidenceMet(0.9)` is real Python (`__or__` / `__and__` operator overloads). + - **Six multi-agent shapes plus A2A** — Composition, Orchestrator, Swarm, Handoff, StateGraph, Functional + A2A for cross-process meshes. + - **OCI Generative AI day-zero** — two transports (V1 and native SDK), auto-routed by model id. ## Agent core -| Feature | Surface | -|---|---| -| `Agent` + `AgentConfig` + `AgentResult` | `locus.agent` | -| Composable termination algebra (`MaxIterations \| ToolCalled & ConfidenceMet`) | `locus.core.termination` | -| Idempotent tools — `@tool(idempotent=True)` dedupes repeat calls | `locus.tools.decorator` | -| Reflexion (`reflexion=True`) + Grounding (`grounding=True`) | `locus.reasoning` | -| Causal chains (standalone graph builder) | `locus.reasoning.causal.CausalChain` | -| Cancel signal (thread-safe `agent.cancel()`) | `Agent.cancel` | -| Interrupts + resume (HITL) | `agent.run` yields `InterruptEvent`; `agent.resume(...)` | -| Structured output (`output_schema=` Pydantic) | `locus.agent.config`, `locus.core.structured` | -| Hooks lifecycle (before/after × invocation × tool × model + iteration) | `locus.hooks.provider` | -| Plugin bundling (hooks + tools as one unit) | `locus.hooks.plugin` | - -## Memory - -| Feature | Backends | -|---|---| -| Native checkpointers | `MemoryCheckpointer`, 
`FileCheckpointer`, `HTTPCheckpointer`, `OCIBucketBackend` | -| Storage-backed (auto-wrapped via `StorageBackendAdapter`) | `SQLiteBackend`, `RedisBackend`, `PostgreSQLBackend`, `OpenSearchBackend`, `OracleBackend` | -| Conversation managers | `SlidingWindowManager`, `SummarizingManager`, `LLMCompactor` | -| Long-term key-value store with optimistic locking (`version` counter) | `locus.memory.store` | +| Feature | What it does | Surface | +|---|---|---| +| **Agent** + `AgentConfig` + `AgentResult` | The Think → Execute → Reflect → Terminate loop | `locus.agent` · [Agent loop](concepts/agent-loop.md) | +| **Termination algebra** | Compose stop conditions with `&` and `\|` operator overloads | `locus.core.termination` · [Termination](concepts/termination.md) | +| **Idempotent tools** | `@tool(idempotent=True)` dedupes repeat calls inside the loop — exactly-once side effects | `locus.tools.decorator` · [Idempotency](concepts/idempotency.md) | +| **Reflexion** | Self-evaluation node in the ReAct cycle; rewrites the next turn when the last one was wrong | `Agent(reflexion=True)` · [Reasoning](concepts/reasoning.md) | +| **Grounding** | LLM-as-judge claim verification against tool results; below-threshold triggers replanning | `Agent(grounding=True)` · [Reasoning](concepts/reasoning.md) | +| **Causal chains** | Cause-effect graph builder with cycle/contradiction detection | `locus.reasoning.causal.CausalChain` · [Reasoning](concepts/reasoning.md) | +| **GSAR** | Typed-grounding safety layer (arXiv:2604.23366) — four-way claim partition + tiered replanning | `Agent(gsar=GSARConfig(...))` · [GSAR](concepts/gsar.md) | +| **Cancel** | Thread-safe abort during a run; emits `TerminateEvent` with reason | `agent.cancel()` · [Agent loop](concepts/agent-loop.md) | +| **Interrupts (HITL)** | Pause via `InterruptEvent`; resume with `agent.resume(...)` | `locus.core.interrupt` · [Interrupts](concepts/interrupts.md) | +| **Structured output** | Pass `output_schema=` (Pydantic), final 
answer is parsed into a typed instance | `locus.agent.config`, `locus.core.structured` · [Structured output](concepts/structured-output.md) | +| **Hooks** | before/after × invocation × tool × model lifecycle observation + steering | `locus.hooks.provider` · [Hooks](concepts/hooks.md) | +| **Plugins** | Bundle hooks + tools as one drop-in unit | `locus.hooks.plugin` · [Hooks](concepts/hooks.md) | + +## Multi-agent + +| Shape | What it does | Surface | +|---|---|---| +| **Composition** | Linear chain · fan-out + merge — the simplest multi-agent shape | `locus.multiagent.composition` · [Composition](concepts/multi-agent/composition.md) | +| **Orchestrator** | One coordinator dispatches specialists in parallel | `locus.multiagent.orchestrator` · [Orchestrator](concepts/multi-agent/orchestrator.md) | +| **Swarm** | Open-ended peer-to-peer collaboration | `locus.multiagent.swarm` · [Swarm](concepts/multi-agent/swarm.md) | +| **Handoff** | Specialist-to-specialist context transfer with chain-of-custody | `locus.multiagent.handoff` · [Handoff](concepts/multi-agent/handoff.md) | +| **StateGraph** | Cycles, conditional edges, subgraphs — when DAG isn't enough | `locus.multiagent.graph` · [StateGraph](concepts/multi-agent/graph.md) | +| **Functional API** | Map / reduce over agents with `@task` and `@entrypoint` | `locus.multiagent.functional` · [Functional](concepts/multi-agent/functional.md) | +| **A2A** | Cross-process agent meshes — `AgentCard` discovery + HTTP/SSE transport | `locus.a2a` · [A2A](concepts/multi-agent/a2a.md) | + +## Reasoning + +| Feature | What it does | Surface | +|---|---|---| +| **Reflexion** | After each turn, the agent self-evaluates and re-plans on wrong premises | `Agent(reflexion=True)` · [Reasoning](concepts/reasoning.md) | +| **Grounding** | LLM-as-judge over claims vs the tool results that produced them | `Agent(grounding=True)` · [Reasoning](concepts/reasoning.md) | +| **Causal** | Build a cause-effect graph from the trace; surface 
contradictions | `build_causal_chain()` · [Reasoning](concepts/reasoning.md) | +| **GSAR** | Typed claim partition (cited / supported / unsupported / mismatched) + `proceed`/`regenerate`/`replan`/`abstain` decision | `Agent(gsar=GSARConfig(...))` · [GSAR](concepts/gsar.md) | ## Tools -| Feature | Surface | -|---|---| -| `@tool` decorator with auto JSON-Schema | `locus.tools.decorator` | -| Sequential / Concurrent / CircuitBreaker executors | `locus.tools.executor` | -| Tool-result store offload (large outputs) | `locus.tools.result_storage` | -| MCP — client + server | `locus.integrations.fastmcp` | -| Path/URL safety helpers | `locus.tools.path_safety`, `locus.tools.url_safety` | +| Feature | What it does | Surface | +|---|---|---| +| `@tool` decorator | Function → JSON-Schema-typed tool the model can call | `locus.tools.decorator` · [Tools](concepts/tools.md) | +| Idempotent dedup | `@tool(idempotent=True)` skips repeat calls (same args) in the loop | `locus.tools.decorator` · [Idempotency](concepts/idempotency.md) | +| **Sequential executor** | Run tool calls one at a time | `locus.tools.executor` · [Executors](concepts/executors.md) | +| **Concurrent executor** | Run tool calls in parallel | `locus.tools.executor` · [Executors](concepts/executors.md) | +| **CircuitBreaker executor** | Auto-disable a tool after N failures | `locus.tools.executor` · [Executors](concepts/executors.md) | +| Result-store offload | Move large tool results to object storage; agent sees a pointer | `locus.tools.result_storage` | +| Path / URL safety | Validate filesystem and network access from tool args | `locus.tools.path_safety`, `locus.tools.url_safety` · [Safety](concepts/safety.md) | +| **MCP — client + server** | Talk to / be talked to by Anthropic-spec MCP servers | `locus.integrations.fastmcp` · [MCP](concepts/mcp.md) | + +## Memory — checkpointer backends + +| Backend | Best for | Surface | +|---|---|---| +| `MemoryCheckpointer` | Tests, REPL — in-process dict | 
`locus.memory.backends.memory` · [Checkpointers](concepts/checkpointers.md) | +| `FileCheckpointer` | Local dev — JSON files on disk | `locus.memory.backends.file` | +| `HTTPCheckpointer` | A remote checkpoint service you already run | `locus.memory.backends.http` | +| **`OCIBucketBackend`** | OCI-native, lifecycle policies, region replication | `locus.memory.backends.oci_bucket` | +| `SQLiteBackend` | Single-process durability | `locus.memory.backends.sqlite` | +| `RedisBackend` | Multi-replica, fast, TTLs | `locus.memory.backends.redis` | +| `PostgreSQLBackend` | Production DB with metadata queries | `locus.memory.backends.postgresql` | +| `OpenSearchBackend` | Full-text search across past runs | `locus.memory.backends.opensearch` | +| `OracleBackend` | Oracle DB with JSON queries | `locus.memory.backends.oracle` | + +## Memory — context management + +| Feature | What it does | Surface | +|---|---|---| +| `SlidingWindowManager` | Keeps the last N messages; drops the rest | `locus.memory.compactor` · [Conversation management](concepts/conversation-management.md) | +| `SummarizingManager` | LLM rollup of older turns | `locus.memory.compactor` | +| **`LLMCompactor`** | Budget-aware compaction with head + tail protection | `locus.memory.compactor` | +| Long-term key-value store | Cross-run user prefs / results with optimistic-locking `version` counter | `locus.memory.store` | ## Hooks (built-in) -`LoggingHook`, `StructuredLoggingHook`, `TelemetryHook` (OpenTelemetry), -`NoOpTelemetryHook`, `ModelRetryHook`, `GuardrailsHook`, -`ContentFilterHook`, `SteeringHook` — all import from -`locus.hooks.builtin`. 
+| Hook | What it does | Import | +|---|---|---| +| `LoggingHook` / `StructuredLoggingHook` | Stdlib / structured-JSON logs of every event | `locus.hooks.builtin` · [Observability](concepts/observability.md) | +| **`TelemetryHook`** | OpenTelemetry traces + metrics (counters, histograms) | `locus.hooks.builtin` | +| `NoOpTelemetryHook` | Opt-out variant for tests | `locus.hooks.builtin` | +| `ModelRetryHook` | Auto-retry model calls on throttle/empty with exponential back-off | `locus.hooks.builtin` · [Retry](concepts/retry.md) | +| **`GuardrailsHook`** | Block dangerous tools, redact PII, enforce content/topic policies | `locus.hooks.builtin` · [Safety](concepts/safety.md) | +| `ContentFilterHook` | Standalone content moderation | `locus.hooks.builtin` | +| **`SteeringHook`** | LLM-as-judge approval gate on every tool call | `locus.hooks.builtin` · [Safety](concepts/safety.md) | -## Multi-agent +## Streaming + Server -`SequentialPipeline` / `ParallelPipeline` / `LoopAgent` -(plus `sequential()`, `parallel()`, `loop()` helpers); `Orchestrator` + -`Specialist`; `Swarm` + `SharedContext`; `Handoff` + `HandoffAgent`; -`StateGraph` (cycles, conditional edges, subgraphs); Functional API -(`@task` / `@entrypoint`); `A2AServer` + `A2AClient` + `AgentCard`. 
+| Feature | What it does | Surface | +|---|---|---| +| **Typed events** | Frozen Pydantic events for `match`-statement consumers | `locus.core.events` · [Events](concepts/events.md) | +| `StructuredStream` | Incremental Pydantic-partial parsing during streaming | `locus.core.structured` | +| Console + SSE handlers | Render to terminal or stream over Server-Sent Events | `locus.core.events` · [Streaming](concepts/streaming.md) | +| **`AgentServer`** | Drop-in FastAPI app: `/invoke`, `/stream`, `/threads/{id}`, `/health` | `locus.server` · [Agent Server](concepts/server.md) | +| Per-principal threads | Bearer-token auth + thread-id namespacing prevents cross-tenant leaks | `AgentServer(api_key=...)` · [Agent Server](concepts/server.md) | +| Graph streaming | Multi-agent state-graph event streams | `locus.multiagent.graph` · [Graph streaming](concepts/graph-streaming.md) | ## RAG -Seven vector stores under `locus.rag.stores`: Chroma, in-memory, -OpenSearch, Oracle 26ai, pgvector, Pinecone, Qdrant. Embeddings: -`OCIEmbeddings`, `OpenAIEmbeddings`. Multimodal processors: -`TextProcessor`, `ImageProcessor`, `PDFProcessor`, `AudioProcessor`, -`MultimodalProcessor`. 
+| Component | Options | Surface | +|---|---|---| +| Vector stores | Oracle 26ai · OpenSearch · pgvector · Qdrant · Pinecone · Chroma · in-memory | `locus.rag.stores` · [RAG](concepts/rag.md) | +| Embeddings | `OCIEmbeddings` (Cohere) · `OpenAIEmbeddings` | `locus.rag.embeddings` | +| Multimodal processors | Text · PDF (text + OCR) · Image (OCR) · Audio (transcription) | `locus.rag.multimodal` | +| Tool wiring | `create_rag_tool(retriever)` exposes the retriever as a `@tool` | `locus.rag.tools` | -## Streaming + Server +## Models -Typed events (`ThinkEvent`, `ModelChunkEvent`, `ToolStartEvent`, -`ToolCompleteEvent`, `ReflectEvent`, `GroundingEvent`, `InterruptEvent`, -`TerminateEvent`); `StructuredStream` (incremental Pydantic partials); -console + SSE handlers; `AgentServer` with `/invoke`, `/stream`, -`GET /threads/{id}`, `DELETE /threads/{id}`, `/health` and -bearer-principal-scoped thread namespaces. +| Provider | Models | Surface | +|---|---|---| +| **OCI Generative AI — V1 transport** | `openai.*`, `meta.*`, `xai.*`, `google.*`, `mistral.*` on OCI | `locus.models.providers.oci.openai_compat` · [OCI](concepts/providers/oci.md) | +| **OCI Generative AI — SDK transport** | Cohere `command-r-*` series — proprietary chat shape | `locus.models.providers.oci.OCIModel` · [OCI](concepts/providers/oci.md) | +| OpenAI | All commercial models (gpt-5, o-series, etc) | `locus.models.providers.openai` · [OpenAI](concepts/providers/openai.md) | +| Anthropic | Claude 4 / 4.5 / 4.7 / 4.8 — direct API | `locus.models.providers.anthropic` · [Anthropic](concepts/providers/anthropic.md) | +| Ollama | Local models | `locus.models.providers.ollama` · [Ollama](concepts/providers/ollama.md) | +| Auto-routing | `get_model("oci:openai.gpt-5")` picks transport from id | `locus.models.registry.get_model` | +| Decorators | Failover · pooled · cached · rate-limited wrappers over any provider | `locus.models.decorators` | ## Skills + Playbooks -Three-tier skill disclosure (`SkillsPlugin`); 
`PlaybookEnforcer` with -YAML / JSON / Python loaders; `Skill.from_directory()` activation. - -## Models - -`OpenAIModel`, `AnthropicModel`, `OllamaModel`, `OCIModel` (native SDK -transport for Cohere R-series), `OCIOpenAIModel` (`/openai/v1` for -openai.*/ meta.* / xai.*/ google.* / mistral.* on OCI). `get_model()` -auto-routes by model id. Failover, pooled, caching, rate-limit -decorators included. +| Feature | What it does | Surface | +|---|---|---| +| **Skills** | AgentSkills.io progressive disclosure (catalog → instructions → resources) | `locus.skills.SkillsPlugin` · [Skills](concepts/skills.md) | +| `Skill.from_directory()` | Load a folder of `SKILL.md` bundles | `locus.skills.models.Skill` | +| **Playbooks** | Numbered execution plans with per-step `PlaybookEnforcer` | `locus.playbooks` · [Playbooks](concepts/playbooks.md) | +| YAML / JSON / Python loaders | Author playbooks in any of three formats | `locus.playbooks.loader` | ## Evaluation -`EvalCase`, `EvalRunner`, `EvalReport`, `EvalResult` — pass/score/duration -reporting, custom evaluators, `expected_tools` / `expected_output_contains` -matchers. +| Class | What it does | Surface | +|---|---|---| +| `EvalCase` | A single test case — expected tools / output / iteration / duration budgets | `locus.evaluation` · [Evaluation](concepts/evaluation.md) | +| `EvalRunner` | Runs a list of cases against an agent, returns `EvalReport` | `locus.evaluation` | +| `EvalResult` | Per-case pass / score / duration + diagnostic checks | `locus.evaluation` | +| `EvalReport` | Aggregate stats with `summary()` + JSON serialisation | `locus.evaluation` | -## Source pointers +## Where to next -For depth on any feature, the README headlines link to its source -directory; canonical entry is `src/locus/__init__.py`. +- **For first-time visitors**: [Quickstart](how-to/quickstart.md) ships a working agent in five minutes. +- **For architecture**: [Agent loop](concepts/agent-loop.md) is the canonical reference. 
+- **For depth on any feature**: every row in this matrix links to its concept page. Source lives at [`src/locus/`](https://github.com/oracle-samples/locus/tree/main/src/locus); canonical entry is [`src/locus/__init__.py`](https://github.com/oracle-samples/locus/blob/main/src/locus/__init__.py). diff --git a/docs/concepts/hooks.md b/docs/concepts/hooks.md index e5c8a79..32fc117 100644 --- a/docs/concepts/hooks.md +++ b/docs/concepts/hooks.md @@ -1,20 +1,47 @@ # Hooks -Hooks observe and modify agent behavior at lifecycle points. Every -hook inherits `HookProvider` and is registered in a `HookRegistry`. -Events fire at six phases: +Hooks are how you **observe and modify** agent behaviour at the +moments that matter — before / after the run starts, before / after +each model call, before / after each tool call. Every cross-cutting +concern that *isn't* the agent's primary task lives here: logging, +telemetry, retry policy, guardrails, PII redaction, LLM-as-judge tool +approval. -1. `on_before_invocation` — before the agent starts -2. `on_after_invocation` — after the agent finishes -3. `on_before_model_call` — before each model request -4. `on_after_model_call` — after each model response -5. `on_before_tool_call` — before each tool runs -6. `on_after_tool_call` — after each tool completes +You can use the ones locus ships (covers most production needs out +of the box) or write your own — a hook is a small subclass with the +methods it cares about. 
-## Writing a hook +## When to write a hook + +| You want… | Write a hook | +|---|---| +| Log every tool call to your aggregator | ✓ | +| Add OpenTelemetry spans / metrics | ✓ — use the built-in `TelemetryHook` | +| Retry model calls with backoff | ✓ — `ModelRetryHook` | +| Reject tool calls that look dangerous | ✓ — `GuardrailsHook`, `ContentFilterHook`, `SteeringHook` | +| Add a tool to the registry | use [`tools=[...]` on Agent](tools.md) | +| Change the system prompt mid-run | hooks can read state but not mutate the prompt; use a [skill](skills.md) instead | + +## The six lifecycle phases + +A hook can subscribe to any of these. Each method receives a typed, +write-protected event object. + +| Phase | Fires | Useful for | +|---|---|---| +| `on_before_invocation` | once, when `agent.run()` starts | initialise per-run state, open spans | +| `on_after_invocation` | once, after the agent finishes | flush metrics, close spans | +| `on_before_model_call` | before each request to the model | redact PII, count tokens | +| `on_after_model_call` | after each response from the model | log usage, retry on empty | +| `on_before_tool_call` | before each tool body runs | guardrails, audit, approval gates | +| `on_after_tool_call` | after each tool body completes | log result, update metrics | + +## Getting started + +### 1. 
Subclass `HookProvider` ```python -from locus.hooks.provider import HookProvider, HookPriority +from locus.hooks.provider import HookPriority, HookProvider class AuditHook(HookProvider): name = "audit" @@ -25,58 +52,145 @@ class AuditHook(HookProvider): async def on_after_tool_call(self, event): print(f"← {event.tool_name} = {event.result}") - -agent = Agent(..., hooks=[AuditHook()]) ``` -## Priorities - -Hooks run in priority order (lower number first for `before_*`, -reversed for `after_*` so teardown pairs with setup): - -| Range | Intended use | -|---|---| -| 0–99 | Security (guardrails, PII redaction) | -| 100–199 | Observability (logging, telemetry) | -| 200–299 | Business logic | -| 300+ | Cosmetic | +Override only the phases you care about. Unimplemented phases inherit +no-op defaults from the base class. -Use the constants in `HookPriority` instead of magic numbers. +### 2. Pass to the agent -## Write-protected events +```python +agent = Agent( + model="oci:openai.gpt-5.5", + tools=[search, book_flight], + hooks=[AuditHook()], +) +``` -Event objects are Pydantic models with frozen fields. You cannot -accidentally mutate them from a hook. Methods that exist to let hooks -steer the agent — cancelling a tool, retrying a model call — are -explicit, so the intent is unambiguous. +### 3. Run -## Built-in hooks +The hook fires automatically — no further wiring. -Locus ships these out of the box: +## What you get out of the box -| Hook | What it does | -|---|---| -| `LoggingHook` / `StructuredLoggingHook` | Plain or JSON-structured logs at every phase | -| `TelemetryHook` / `NoOpTelemetryHook` | OpenTelemetry spans + counters + histograms | -| `ModelRetryHook` | Backoff retries on empty / rate-limited model responses | -| `GuardrailsHook` / `ContentFilterHook` | PII / SQL / XSS / command-injection regex policies | -| `SteeringHook` | LLM-as-judge tool approval (a second model votes before each tool call) | +locus ships these hooks. 
Composed in this order, they cover most +production needs without writing custom code. ```python from locus.hooks.builtin import ( - GuardrailsHook, - LoggingHook, + LoggingHook, StructuredLoggingHook, + TelemetryHook, ModelRetryHook, + GuardrailsHook, ContentFilterHook, SteeringHook, - TelemetryHook, ) agent = Agent( - ..., + model="oci:openai.gpt-5.5", + tools=[...], hooks=[ - LoggingHook(), - ModelRetryHook(max_retries=3), - GuardrailsHook(), + StructuredLoggingHook(), # JSON logs at every phase + TelemetryHook(), # OTel spans + metrics + histograms + ModelRetryHook(max_retries=3), # backoff on empty / rate-limited responses + GuardrailsHook(), # PII / SQL / XSS / command-injection + SteeringHook(approver=second_model), # LLM-as-judge tool approval ], ) ``` + +### `LoggingHook` / `StructuredLoggingHook` + +Plain-text or JSON-structured logs at every lifecycle phase. Drop in +when you want a paper trail without writing your own logger. + +### `TelemetryHook` + +OpenTelemetry spans for every model + tool call, counters for tool +invocations, histograms for latency. Use `NoOpTelemetryHook` when +you want the API surface but no actual export (useful for tests). + +### `ModelRetryHook` + +Backoff retries on empty model responses, rate-limit errors, and +transient connection failures. Configurable `max_retries` and +`backoff_seconds`. Doesn't intercept your tool calls — only the +model layer. + +### `GuardrailsHook` / `ContentFilterHook` + +Regex-based policies on tool inputs (`GuardrailsHook`) and model +outputs (`ContentFilterHook`). Catches PII, SQL injection patterns, +shell-command injection, and credit-card-shaped strings. Reject or +redact at the boundary. + +### `SteeringHook` — LLM-as-judge tool approval + +A *second model* sees each tool call before it runs and votes +"approve / reject / rewrite". Use this when the cost of a wrong tool +call is higher than the cost of a second model round-trip. 
+ +```python +agent = Agent( + ..., + hooks=[SteeringHook(approver="oci:openai.gpt-5.5")], +) +``` + +## Priorities — the ordering rules + +Hooks run in priority order. Lower numbers run first on `before_*` +phases; the order reverses for `after_*` so teardown pairs with +setup. + +| Range | Intended use | +|---|---| +| `0`–`99` | **Security** — guardrails, PII redaction (must run first to short-circuit unsafe calls) | +| `100`–`199` | **Observability** — logging, telemetry | +| `200`–`299` | **Business logic** — domain-specific hooks | +| `300+` | **Cosmetic** — pretty-printing, console UI | + +Use the constants in `HookPriority` (e.g. `HookPriority.SECURITY_MAX`, +`HookPriority.OBSERVABILITY_MIN`) instead of magic numbers — the +intent is more obvious in code review. + +## Write-protected events — by design + +Event objects are frozen Pydantic models. You **cannot** accidentally +mutate them from a hook — try and you get a `ValidationError`. The +methods that *do* let hooks steer the agent (`event.cancel()`, +`event.retry()`, `event.replace_arguments(...)`) are explicit and +named for what they do, so the intent is unambiguous in a review: + +```python +async def on_before_tool_call(self, event): + if "DROP TABLE" in str(event.arguments): + event.cancel(reason="SQL injection blocked by GuardrailsHook") +``` + +Compare to a callback-based system where any code can monkey-patch +any field; this is intentionally tight. + +## Common gotchas + +| Symptom | Likely cause | +|---|---| +| Hook never fires | Forgot to pass it on `Agent(hooks=[...])`. The `HookRegistry` only sees what you register. | +| Hook fires in the wrong order | Set `priority` explicitly. The default priority is intentionally mid-range so security hooks always come before yours. | +| `ValidationError: cannot mutate frozen instance` | You tried to write `event.foo = bar`. Hooks observe, not mutate; use the explicit steering methods. | +| `on_after_tool_call` doesn't see the result | The tool raised. 
Check `event.error` instead of `event.result`. | +| Telemetry spans aren't exported | `TelemetryHook` needs an OTel exporter configured upstream — see [Observability](observability.md). | + +## Source and examples + +- [`HookProvider` and `HookOrchestrator`](https://github.com/oracle-samples/locus/blob/main/src/locus/hooks/provider.py) +- [Built-in hooks](https://github.com/oracle-samples/locus/tree/main/src/locus/hooks/builtin) +- [`tutorial_05_agent_hooks.py`](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_05_agent_hooks.py) — write your first hook. +- [`tutorial_27_hooks_advanced.py`](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_27_hooks_advanced.py) — guardrails + steering, end to end. + +## See also + +- [Tools](tools.md) — the things hooks observe. +- [Events](events.md) — the typed event objects hooks receive. +- [Safety & guardrails](safety.md) — production policies built on `GuardrailsHook`. +- [Observability](observability.md) — wiring `TelemetryHook` to your OTel collector. +- [Retry strategies](retry.md) — how `ModelRetryHook` works under the hood. diff --git a/docs/concepts/idempotency.md b/docs/concepts/idempotency.md index 4e19c90..061c136 100644 --- a/docs/concepts/idempotency.md +++ b/docs/concepts/idempotency.md @@ -1,11 +1,36 @@ # Idempotency -The single most important word in production agents is **once**. The -model is allowed to retry; the side-effect isn't. locus makes that a -one-keyword decision on the tool. +> The single most important word in production agents is **once**. + +The model is *allowed* to retry. The side effect *isn't*. locus +makes that distinction a one-keyword decision on the tool, enforced +inside the ReAct loop. This is a locus-specific primitive — none of +LangChain / LangGraph / CrewAI / Strands ship it. + +If you ever plan to run an agent that **books**, **charges**, +**emails**, **pages**, or **writes**, this is the most important +single page on the docs site. 
+ +## When to use `idempotent=True` + +| Situation | `idempotent=True`? | +|---|---| +| Side-effecting tool with real-world cost (charge, email, page, book) | **yes — always** | +| Database write you can't trivially roll back | **yes** | +| External service that's already idempotent on its end | yes — locus dedupes the round-trip too | +| Read-only catalogue lookup | no — re-reads are cheap, leave it to the model | +| Tool that *intentionally* generates a new entity each call (e.g. `mint_uuid`) | no — that breaks the contract | + +## How it works + +Inside a single agent run, locus hashes the tool's +`(name, arguments)` tuple as the model emits each call. **The first +call with a given key hits the function body** and the result is +recorded. **Every subsequent call with the same key short-circuits +to the cached response** without invoking the body. ```python -from locus.tools.decorator import tool +from locus import tool @tool(idempotent=True) def transfer(from_acct: str, to_acct: str, amount: float) -> dict: @@ -13,44 +38,103 @@ def transfer(from_acct: str, to_acct: str, amount: float) -> dict: return ledger.transfer(from_acct, to_acct, amount) ``` -Inside a single agent run, locus hashes the tool's `(name, kwargs)` -tuple. The first call hits the body and the result is cached. Every -subsequent call with identical arguments — whether the model retried, -got confused, or asked again on a later turn — short-circuits to the -cached response. +The argument hash is the trust boundary: + +- **Same call**: the model re-emits `transfer("A", "B", 100)` after + seeing the receipt → cache hit, body skipped. +- **Different call**: the model emits `transfer("A", "B", 200)` → + different key, body runs. + +Caching is keyed on the **canonical JSON form** of the arguments, so +key order, default values, and whitespace don't matter. ## Why this matters -- **Booking, billing, payments.** The model that calls `book_flight` - twice is more common than you think. 
Without idempotency you have a - duplicate charge and an angry customer. -- **Outbound side-effects.** `email_cfo`, `page_oncall`, `submit_po` — - one and done. -- **Database writes you can't easily roll back.** +### Booking, billing, payments + +The model that calls `book_flight` twice in one run is more common +than you think. Sometimes it sees an ambiguous tool result and tries +again "to be sure". Sometimes the network glitches and the model +believes the call failed. Without idempotency, you charge the +customer twice and they're on the phone with their bank. + +```python +@tool(idempotent=True) +def book_flight(flight_id: str, customer_id: str) -> dict: + return billing.charge_and_book(flight_id, customer_id) +``` + +The customer gets billed once. Always. -The argument hash is the trust boundary: if the model re-issues the -*same* call, you fire once. If it changes any argument, that's a new -call and the body runs. +### Outbound side-effects -## When to use it +`email_cfo`, `page_oncall`, `submit_po`, `slack_alert` — anything +that touches a human or a downstream system. **One and done**. -| Situation | `idempotent=True`? | +### Database writes you can't roll back + +Insert into a journal table, append to a Kafka topic, sign a JWT — +operations where retrying isn't free. Idempotent tools turn the +"exactly once" problem into a "not-our-problem-after-the-first-call" +guarantee. + +### Replays after checkpoint resume + +When a checkpointer resumes a stalled run, the model may decide to +re-issue tool calls it's already seen. Idempotent tools see the +cache pre-populated from the checkpoint and skip the side effect on +replay. (This requires `tool_executions` to be restored from the +checkpoint; locus's [native checkpointers](checkpointers.md) handle +it.) 
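The dedup and replay behaviour described on this page can be sketched in plain Python. This is an illustrative stand-in under stated assumptions, not the locus implementation: `call_key` and `RunCache` are hypothetical names, and the `restored` argument only stands in for tool executions reloaded from a checkpoint. The key is a hash of the canonical JSON form of `(name, arguments)`, so keyword order and omitted defaults map to the same key. Unlike the behaviour the page describes, this sketch does not cache a raised exception.

```python
import hashlib
import inspect
import json
from typing import Any, Callable, Dict, List, Optional, Tuple


def call_key(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> str:
    """Hash (name, arguments) after normalising the arguments."""
    bound = inspect.signature(fn).bind(*args, **kwargs)
    bound.apply_defaults()  # omitted and explicit defaults hash the same
    canonical = json.dumps(
        [fn.__name__, dict(bound.arguments)],
        sort_keys=True,            # key order does not matter
        separators=(",", ":"),     # whitespace does not matter
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class RunCache:
    """Hypothetical per-run cache: the first call runs the body, repeats reuse it."""

    def __init__(self, restored: Optional[Dict[str, Any]] = None) -> None:
        # `restored` stands in for executions reloaded from a checkpoint.
        self._results: Dict[str, Any] = dict(restored or {})

    def execute(self, fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        key = call_key(fn, *args, **kwargs)
        if key not in self._results:
            self._results[key] = fn(*args, **kwargs)  # side effect fires here only
        return self._results[key]


calls: List[Tuple[str, str, float]] = []  # records every real side effect


def transfer(from_acct: str, to_acct: str, amount: float, memo: str = "") -> dict:
    calls.append((from_acct, to_acct, amount))
    return {"receipt": len(calls)}


cache = RunCache()
first = cache.execute(transfer, "A", "B", 100.0)
# Same logical call, different spelling: keywords reordered, default made explicit.
again = cache.execute(transfer, to_acct="B", from_acct="A", amount=100.0, memo="")
# A changed argument is a new key, so the body runs again.
other = cache.execute(transfer, "A", "B", 200.0)
```

Here `first` and `again` share one receipt because the second call is a cache hit, while `other` produces a new one. Seeding `RunCache(restored=...)` with a previously recorded key mimics a checkpoint resume: the replayed call returns the stored result without touching `transfer`.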
+ +## What it is *not* + +| Concept | Idempotency is… | Idempotency is *not*… | +|---|---|---| +| Scope | within a single agent run | cross-run — restart and the cache is gone (use a [checkpointer](checkpointers.md)) | +| Failure | one fire per identical call | retry — if the body raises, the exception propagates as the cached "result" | +| Boundary | per-agent | network — two different agents both calling `transfer(a, b, 100)` each fire once | + +If you need cross-run idempotency, configure a checkpointer + an +idempotent server-side endpoint. The combo gives you "the side +effect runs at most once across all replays of all agents". + +## Practical recipe — vendor PO approval + +A canonical multi-agent idempotency shape: an agent (or three of +them, debating) loops over a vendor decision, then writes once. + +```python +@tool(idempotent=True) +def submit_po(vendor_id: str, line_items: list[dict]) -> dict: + return procurement.submit(vendor_id, line_items) + +@tool(idempotent=True) +def email_cfo(po_id: str, summary: str) -> str: + return mail.send(to="cfo@org.com", subject=f"PO {po_id}", body=summary) +``` + +The agent can iterate ten times reasoning about whether to approve. +The PO ships once. The CFO email lands once. The model can fail +mid-run and a checkpointer-backed resume re-issues the same calls; +the side effects still fire exactly once. + +## Common gotchas + +| Symptom | Likely cause | |---|---| -| Side-effecting tool with a real-world cost (charge, email, page) | **yes** | -| Read-only catalogue lookup | no — caching the model's reads is its problem, not yours | -| Tool that *intentionally* generates a new entity each call (e.g. `mint_uuid`) | no | -| External service that's already idempotent | yes anyway — locus dedupes the round-trip too | +| Tool re-fires despite `idempotent=True` | Argument changed between calls. Check that the model isn't mutating ids / amounts between turns. 
| +| Idempotent cache survives across runs unexpectedly | It shouldn't — only the checkpointer persists state. If you're seeing this, you're loading state from a checkpoint and don't want to. | +| Body raised first time, cache returns the exception | This is by design — the failure is part of the "result" of the first call. The model sees the failure and can react. To re-attempt, the model must change an argument. | +| Read-only lookup tagged `idempotent=True` | Harmless but wasteful — the cache hit savings are negligible vs the read itself. Leave it off. | -## What it is not +## Source and tutorial -- It's not idempotency *across runs*. Restart the agent and the cache - is gone — that's what your **checkpointer** is for. -- It's not retry. If the body raises, the exception propagates. -- It's not a network-layer cache. Two different agents calling - `transfer(a, b, 100)` each fire once. +- [`@tool` decorator with idempotency hook](https://github.com/oracle-samples/locus/blob/main/src/locus/tools/decorator.py) +- [`_find_matching_execution`](https://github.com/oracle-samples/locus/blob/main/src/locus/loop/nodes.py#L114) — where the dedup actually happens, in the ReAct loop's Execute node. +- [`tutorial_03_tools_and_state.py`](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_03_tools_and_state.py) — walks through `@tool(idempotent=True)` end-to-end. -## Source and tutorials +## See also -- `src/locus/tools/decorator.py` — the `@tool` decorator and idempotency hook. -- Tutorial: [`tutorial_03_tools_and_state.py`](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_03_tools_and_state.py) - walks through `@tool(idempotent=True)` end-to-end. +- [Tools](tools.md) — the full `@tool` decorator surface. +- [Checkpointers](checkpointers.md) — durable runs where idempotency interacts with replay. 
diff --git a/docs/concepts/mcp.md b/docs/concepts/mcp.md index 3a69c9a..f0fec86 100644 --- a/docs/concepts/mcp.md +++ b/docs/concepts/mcp.md @@ -1,51 +1,166 @@ -# MCP (both ways) +# MCP — Model Context Protocol The [Model Context Protocol](https://modelcontextprotocol.io) is an -Anthropic-spec interop standard for tools. locus speaks MCP in both -directions. +Anthropic-spec interop standard for tools. Define a tool once, +expose it over MCP, and any MCP-compatible client (Claude Desktop, +Cline, Strands, another locus agent) can call it. Or consume tools +from existing MCP servers (filesystem, git, postgres, github, +sequential-thinking) without writing any glue. -## Consume MCP servers +**locus speaks MCP both ways**. That's a deliberate differentiator — +most agent frameworks consume MCP servers but don't expose their own +tools as MCP. Round-trip means an agent built with locus can be +either side of the conversation. -`MCPClient` wraps an external MCP server's tools so the agent can call -them as if they were native locus tools. +## When to use MCP + +| You want… | Use MCP | +|---|---| +| Your locus agent to use Anthropic's published filesystem / git / postgres servers | ✓ — `MCPClient` | +| Your `@tool` library to be callable by Claude Desktop / Cline / other agents | ✓ — `LocusMCPServer` | +| Two locus agents to share tools across processes / machines | ✓ — works, but [A2A](multi-agent/a2a.md) is the better protocol | +| In-process multi-agent — share tools by importing | use the [tools](tools.md) directly, not MCP | +| Deterministic tests | use [Ollama](providers/ollama.md) + plain `@tool` — MCP adds I/O | + +## Getting started — consume an MCP server + +### 1. Install the MCP extras + +```bash +pip install "locus[mcp]" +``` + +### 2. 
Spawn the server and wrap it with `MCPClient` ```python from locus.integrations.fastmcp import MCPClient -# spawn the MCP server as a subprocess (stdio transport) -fs = MCPClient.stdio(command=["npx", "-y", "@modelcontextprotocol/server-filesystem", "/data"]) +# Spawn Anthropic's filesystem server as a subprocess (stdio transport): +fs = MCPClient.stdio( + command=["npx", "-y", "@modelcontextprotocol/server-filesystem", "/data"], +) +``` + +`MCPClient.stdio` runs the subprocess, opens an MCP session over its +stdin/stdout, and discovers what tools the server exposes. -agent = Agent(model=..., tools=[*fs.tools()]) # MCP tools become locus tools +### 3. Pass the tools straight into an Agent + +```python +from locus import Agent + +agent = Agent( + model="oci:openai.gpt-5.5", + tools=[*fs.tools()], # MCP tools become locus tools + system_prompt="You can read files in /data.", +) +result = agent.run_sync("Summarise the README in /data.") ``` -The client registers every MCP tool with locus's tool registry, with -schema, descriptions, and call-through plumbing intact. +`fs.tools()` returns a list of locus `Tool` objects with full +schemas, descriptions, and call-through plumbing. The agent doesn't +know they're MCP — they look like any other `@tool`. -## Expose locus tools as MCP +## Getting started — expose your tools as MCP -`LocusMCPServer` turns a set of locus tools into an MCP server other -agents can consume. +### 1. Wrap a tool list in `LocusMCPServer` ```python from locus.integrations.fastmcp import LocusMCPServer server = LocusMCPServer(tools=[search_vendors, submit_po]) -server.run_stdio() # or .run_http(port=7400) ``` -Anthropic Claude, Strands, or any MCP-spec client can now call your -locus tools. +### 2. Pick a transport + +```python +server.run_stdio() # for desktop clients +server.run_http(port=7400) # for HTTP MCP clients +``` + +`run_stdio()` is what Claude Desktop, Cline, and most MCP clients +expect. 
`run_http()` runs an HTTP MCP server (transport + JSON-RPC) +that any HTTP MCP client can reach. + +### 3. Point a client at it + +For Claude Desktop, edit `~/Library/Application Support/Claude/claude_desktop_config.json`: + +```json +{ + "mcpServers": { + "my-locus-tools": { + "command": "python", + "args": ["-m", "my_package.mcp_server"] + } + } +} +``` + +Restart Claude Desktop. Your `search_vendors` and `submit_po` tools +appear in the model's tool list. + +## What you get out of the box + +### Schema preservation + +`@tool`'s docstring + type hints become the MCP tool's name, +description, and JSON schema — losslessly. The MCP client sees the +same parameter types, defaults, and descriptions a locus agent +would. + +### Both transports + +| Transport | Use case | +|---|---| +| **stdio** — process pipes | Desktop clients (Claude Desktop, Cline). The MCP server is spawned as a subprocess. | +| **HTTP** — JSON-RPC over POST | Browser-side or networked clients. Good for shared tool servers. | + +### Idempotency carries through + +A tool tagged `@tool(idempotent=True)` keeps that semantic when +exposed via MCP. The dedup happens locus-side; the MCP client +doesn't need to know. ## Round-trip example -A common shape: locus agent A consumes an MCP filesystem server, plus -a locus agent B exposed as MCP that A can also call. Same client API, -different transports. 
+A common shape: a locus agent A consumes a filesystem MCP server, +*and* exposes its own tools as MCP for another agent B to consume: + +```python +# Agent A — consumes filesystem, exposes its own analytics tools +fs = MCPClient.stdio(command=[...]) # consumer side +analytics = LocusMCPServer( # producer side + tools=[summarise_csv, plot_histogram], +) +analytics.run_http(port=7400, in_background=True) + +agent_a = Agent( + model="oci:openai.gpt-5.5", + tools=[*fs.tools(), summarise_csv, plot_histogram], +) +``` + +Same `MCPClient` API on the consumer side, same `LocusMCPServer` on +the producer side, same tool definitions. The transport is an +implementation detail. + +## Common gotchas + +| Symptom | Likely cause | +|---|---| +| `MCP server failed to start` | The MCP server subprocess crashed before establishing the session. Run the command manually to see the error. | +| `Tool 'X' not found in MCP discovery` | The server exposes a different name than you expected. Print `[t.name for t in fs.tools()]` to see the actual list. | +| `Schema validation failed on call` | MCP tool returned an arg type that doesn't match its declared schema. Common with hand-written MCP servers; the standard ones are fine. | +| Claude Desktop doesn't show your locus tools | `claude_desktop_config.json` not picked up — check the file lives at the right path and Claude has been restarted. | +| Hangs on `MCPClient.stdio` startup | The MCP subprocess is waiting for input on stdin (some servers expect a handshake). Pass `wait_for_init=True` and a timeout. | -## Tutorial +## Source and tutorial -[`tutorial_12_mcp_integration.py`](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_12_mcp_integration.py). +- [`locus.integrations.fastmcp`](https://github.com/oracle-samples/locus/blob/main/src/locus/integrations/fastmcp.py) — built on FastMCP. 
+- [`tutorial_12_mcp_integration.py`](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_12_mcp_integration.py) — consumer + producer end-to-end. -## Source +## See also -`src/locus/integrations/fastmcp.py` — built on FastMCP. +- [Tools](tools.md) — the `@tool` decorator MCP wraps. +- [A2A](multi-agent/a2a.md) — purpose-built protocol for cross-process locus-to-locus agent meshes. diff --git a/docs/concepts/providers/openai.md b/docs/concepts/providers/openai.md index d3c21c8..5810501 100644 --- a/docs/concepts/providers/openai.md +++ b/docs/concepts/providers/openai.md @@ -33,12 +33,12 @@ That's the only setup. locus reads the env var automatically. ```python from locus import Agent -agent = Agent(model="openai:gpt-5.5", system_prompt="You are helpful.") +agent = Agent(model="openai:gpt-5", system_prompt="You are helpful.") ``` -The string `"openai:gpt-5.5"` does two things: tells locus to use the +The string `"openai:gpt-5"` does two things: tells locus to use the OpenAI provider (`openai:` prefix), and which model id to call -(`gpt-5.5`). Any model id OpenAI accepts, locus accepts. +(`gpt-5`). Any model id OpenAI accepts, locus accepts. ### 3. Run it @@ -55,7 +55,7 @@ without further configuration. ### Chat completions across the GPT family -Every chat-shaped OpenAI model: `gpt-4o`, `gpt-4.1`, `gpt-5`, `gpt-5.5`, +Every chat-shaped OpenAI model: `gpt-4o`, `gpt-4.1`, `gpt-5`, `gpt-image-1`. Vision input (image URLs / base64), audio input, and function calling work the same way you'd use them on the OpenAI SDK directly — locus just normalises the events the model emits. 
@@ -107,7 +107,7 @@ class Answer(BaseModel): confidence: float agent = Agent( - model="openai:gpt-5.5", + model="openai:gpt-5", output_schema=Answer, system_prompt="Reply as JSON matching the schema.", ) diff --git a/docs/concepts/streaming.md b/docs/concepts/streaming.md index babc0ce..7f367c3 100644 --- a/docs/concepts/streaming.md +++ b/docs/concepts/streaming.md @@ -1,13 +1,46 @@ # Streaming -Every locus agent emits typed events as it runs. They are real -classes, not strings — drop them into `match` statements and let the -type checker verify your handler is exhaustive. +Every locus agent emits a **typed event stream** as it runs. The +events aren't strings or `dict[str, Any]` blobs — they're frozen +Pydantic classes, designed to drop into a `match` statement and let +your type checker verify the handler is exhaustive. + +This is the surface a UI consumes (live token rendering, tool-call +indicators, reasoning bubbles), the surface telemetry hooks observe, +and the surface `AgentServer` re-emits over Server-Sent Events for +browsers. + +## When to consume the event stream + +| You want… | Use… | +|---|---| +| Live token-by-token rendering in a UI | `async for event in agent.run(...)` | +| The final answer as a single value (tests, scripts, REPL) | `agent.run_sync(prompt).message` — no event handling | +| Spans / metrics on every model + tool call | install [`TelemetryHook`](hooks.md#telemetryhook) | +| To stream over HTTP to a browser | [`AgentServer`](server.md) re-emits as SSE | + +## Getting started + +### 1. Use `agent.run(prompt)` instead of `run_sync` + +```python +async for event in agent.run("Plan a trip to Paris."): + print(event) +``` + +`agent.run(...)` returns an async iterator. Each iteration yields one +event in the order it occurred. + +### 2. 
Pattern-match on the event types ```python from locus.core.events import ( - ThinkEvent, ToolStartEvent, ToolCompleteEvent, - ModelChunkEvent, ReflectEvent, TerminateEvent, + ThinkEvent, + ToolStartEvent, + ToolCompleteEvent, + ModelChunkEvent, + ReflectEvent, + TerminateEvent, ) async for event in agent.run("Plan a trip to Paris."): @@ -19,50 +52,126 @@ async for event in agent.run("Plan a trip to Paris."): case ToolCompleteEvent(tool_name=n, result=r): print(f" ↳ {r}") case ModelChunkEvent(content=c) if c: - print(c, end="", flush=True) # token-level streaming + print(c, end="", flush=True) # token-level streaming case ReflectEvent(assessment=a, new_confidence=c): print(f"🪞 {a} ({c:.2f})") case TerminateEvent(final_message=m): print(f"\n✅ {m}") ``` -## Event taxonomy +`match` checks every branch against the event class. If you forget a +branch your IDE underlines it; if you mistype a field name (e.g. +`reasonng` instead of `reasoning`) you get a static error. -| Event | When | -|---|---| -| `ThinkEvent` | Model emits reasoning (extended-thinking models). | -| `ModelChunkEvent` | Each streamed text chunk. Pipe straight to a UI. | -| `ToolStartEvent` | Agent decided to call a tool. | -| `ToolCompleteEvent` | Tool returned (or raised). | -| `ReflectEvent` | Reflexion loop emitted a self-evaluation. | -| `GroundingEvent` | Grounding evaluation finished. | -| `InterruptEvent` | A tool requested human-in-the-loop input. | -| `TerminateEvent` | The run is done — terminal condition met. 
| +## The event taxonomy + +| Event | When it fires | Useful for | +|---|---|---| +| `ThinkEvent` | The model emits reasoning (extended-thinking models like Claude 4 / o-series) | Render "thinking…" bubbles in a UI | +| `ModelChunkEvent` | Each streamed text chunk from the model | Token-level live rendering | +| `ToolStartEvent` | The agent decided to call a tool | Show a "calling X" indicator | +| `ToolCompleteEvent` | A tool returned (or raised — check `error`) | Show the result inline | +| `ReflectEvent` | Reflexion emitted a self-evaluation | Show "I'm checking my work" | +| `GroundingEvent` | Grounding evaluation finished | Show "verifying claims" | +| `InterruptEvent` | A tool requested human-in-the-loop input | Block on user approval | +| `TerminateEvent` | The run finished — terminal condition met | Show the final answer | + +Every event carries an `event_type` discriminator and a UTC +`timestamp`, so persisted streams replay deterministically. + +## Write-protected — by design + +Events are **frozen** Pydantic models. A hook can read every field; +it **cannot** mutate one. Try and you get a `ValidationError`. If a +hook wants to steer the agent (cancel a tool, retry a model call), +it uses an explicit method on the event (`event.cancel()`, +`event.retry()`, `event.replace_arguments(...)`) — the intent is +visible in code review. + +Why this is important: in callback-based event systems any code can +silently mutate a field and you find out three hops downstream when +the value's wrong. locus's frozen events make that impossible. -Every event carries `event_type` and a UTC `timestamp`. +## Sync wrapper — when you don't need the stream -## Write-protected +```python +result = agent.run_sync("What is 2+2?") +print(result.message) # 'Four.' +print(result.metrics.iterations) +``` + +`agent.run_sync(prompt)` consumes the event stream internally and +returns the final `AgentResult`. The events still emit (hooks still +fire), but you get a single value back. 
Use this in tests, REPLs, +and scripts where the trace doesn't matter. + +## Practical recipe — render to a terminal UI + +```python +async for event in agent.run("Find Q3 revenue and email it to me."): + match event: + case ToolStartEvent(tool_name=n): + print(f"\n🔧 {n}", end="", flush=True) + case ToolCompleteEvent(error=e) if e: + print(f" ✗ {e}") + case ToolCompleteEvent(): + print(" ✓") + case ModelChunkEvent(content=c) if c: + print(c, end="", flush=True) + case TerminateEvent(): + print() +``` -Events are write-protected value objects. A hook *cannot* mutate one; -the type system enforces it. If a hook needs to influence the run, it -returns a control directive (e.g. `Cancel`, `Retry`). +Every event class is a small Pydantic record — there's no hidden +state. What you see is what gets serialised over SSE, what your +checkpointer persists, what your structured logger records. -## Sync wrapper +## SSE over HTTP — for browser UIs -If you don't want to consume events, `agent.run_sync(prompt)` returns -the final `AgentResult` directly. +The reference [`AgentServer`](server.md) maps the same event stream +onto Server-Sent Events. Same `event_type`, same fields, just +`Content-Type: text/event-stream` over HTTP. -## SSE over HTTP +```python +from locus.server import AgentServer +import uvicorn + +server = AgentServer(agent=agent) +uvicorn.run(server.app, port=8000) +``` + +```javascript +// Browser-side +const es = new EventSource('/stream?prompt=...'); +es.addEventListener('ModelChunkEvent', (e) => { + const { content } = JSON.parse(e.data); + document.getElementById('out').innerText += content; +}); +``` -The reference [AgentServer](server.md) maps the same events onto -Server-Sent Events for browser consumption — same shape, different -transport. +## Common gotchas + +| Symptom | Likely cause | +|---|---| +| `async for` exhausts immediately | You're calling `agent.run_sync()` (sync) instead of `agent.run()` (async). 
| +| `ModelChunkEvent`s but no `TerminateEvent` | Generator was cancelled mid-stream. Check for exceptions in the consumer. | +| Same event fires twice | A hook re-yielded an event it received. Hooks observe, they don't re-emit. | +| Browser SSE drops every 30s | Default proxy timeout. Set `proxy_read_timeout` higher or have the agent send heartbeats. | ## Tutorials -- [`tutorial_04_agent_streaming.py`](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_04_agent_streaming.py) -- [`tutorial_21_sse_streaming.py`](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_21_sse_streaming.py) +- [`tutorial_04_agent_streaming.py`](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_04_agent_streaming.py) — your first event consumer. +- [`tutorial_21_sse_streaming.py`](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_21_sse_streaming.py) — full SSE wiring against `AgentServer`. ## Source -`src/locus/streaming/` and `src/locus/core/events.py`. +- [`locus.core.events`](https://github.com/oracle-samples/locus/blob/main/src/locus/core/events.py) — every event class. +- [`Agent.run`](https://github.com/oracle-samples/locus/blob/main/src/locus/agent/agent.py) — the iterator that emits them. +- [`AgentServer`](https://github.com/oracle-samples/locus/tree/main/src/locus/server) — the SSE wrapper. + +## See also + +- [Events](events.md) — full taxonomy in reference form. +- [Hooks](hooks.md) — observe the same stream from inside the loop. +- [Agent Server](server.md) — re-emit over HTTP/SSE. +- [Graph streaming](graph-streaming.md) — multi-agent state-graph event streams. diff --git a/docs/concepts/tools.md b/docs/concepts/tools.md index 9143e15..0d80735 100644 --- a/docs/concepts/tools.md +++ b/docs/concepts/tools.md @@ -1,27 +1,69 @@ # Tools -Tools are the agent's way of affecting the world. You write a regular -Python function, decorate it, and pass it to `Agent(tools=[...])`. 
The -`@tool` decorator introspects the signature and docstring to build a -JSON-schema description the model can call. +Tools are how a locus agent affects the world. The model decides +*"call `search` with query='hnsw'"*; locus runs your `search` +function, captures the return value, and feeds it back. From your +side, a tool is **a regular Python function with a `@tool` +decorator** — locus introspects the signature and docstring to build +the schema the model sees. + +This is the seam most production code touches. Get tools right and +the rest of the framework gets out of your way. + +## When to write a tool + +| You want… | Write a tool | +|---|---| +| The model to call your API / database / file system | ✓ | +| Side-effecting actions the model should be able to invoke | ✓ | +| Read-only lookups (catalogue search, status checks) | ✓ | +| To mutate the agent's *internal* state (system prompt, config) | use a [hook](hooks.md), not a tool | +| To intercept *every* tool call (logging, retry) | use a [hook](hooks.md) | + +## Getting started + +### 1. Decorate a function ```python from locus import tool @tool def search(query: str, limit: int = 10) -> list[str]: - """Search the knowledge base for `query`, up to `limit` results.""" + """Search the knowledge base for ``query``, up to ``limit`` results.""" return backend.search(query, limit) ``` -The docstring becomes the tool description. Parameters are taken -from the signature — type hints drive the JSON schema. Defaults are -optional parameters. +The docstring becomes the tool description the model reads. Type +hints (`str`, `int`, `list[str]`) build the JSON schema. Defaults +mark optional parameters. + +### 2. Pass to the agent + +```python +agent = Agent(model="oci:openai.gpt-5.5", tools=[search]) +``` + +That's the wiring. The model now sees `search` in its tool list and +can call it whenever it decides to. + +### 3. 
Run it + +```python +result = agent.run_sync("Find documents about HNSW.") +``` + +If the model decides to call `search("hnsw")`, locus invokes your +function with that argument, captures the return value, and feeds it +into the next model turn. You write Python; locus handles the +schema marshalling. -## Idempotent tools +## What you get out of the box -Some tools have side effects you never want duplicated — bookings, -transfers, writes. Mark them idempotent: +### Idempotent tools — the model can retry; the side effect can't + +This is locus's flagship tool primitive. Some side-effecting tools +must run *exactly once* per logical request — bookings, charges, +emails, paging. Mark them `idempotent=True`: ```python @tool(idempotent=True) @@ -33,42 +75,133 @@ def book_flight(flight_id: str, customer_id: str) -> dict: ``` When the model re-issues a tool call with the same -`(name, arguments)` that already ran in this agent run, the ReAct -loop reuses the prior result instead of invoking the function again. -Useful for defending against: +`(name, arguments)` tuple that already ran in this agent run, the +ReAct loop **reuses the prior result instead of invoking the +function again**. Defends against: -- Models that repeat calls after seeing the result. -- Network glitches where a call looks failed but actually succeeded. +- Models that re-emit the same call after seeing the result. +- Network glitches where a call appears failed but actually succeeded. - Users re-prompting "do X" when X has already been done. +- Replays after a checkpoint resume. + +Read the [idempotency concept page](idempotency.md) for the full +picture and the matching tutorial. + +### Sync and async bodies + +Both shapes are supported. Async bodies run on the agent's event +loop directly; sync bodies run in a thread-pool executor so the loop +is never blocked. 
+ +```python +@tool +def add(a: int, b: int) -> int: + return a + b # sync — runs in thread pool + +@tool +async def fetch(url: str) -> str: + async with httpx.AsyncClient() as c: + return (await c.get(url)).text # async — runs on the loop +``` + +### Parallel by default — fast when the model wants multiple things -This is a Locus-specific primitive; LangChain, LangGraph, and Strands -do not ship it. +```python +agent = Agent( + model=..., + tools=[search_a, search_b, search_c], + tool_execution="concurrent", # default +) +``` + +When the model emits multiple tool calls in one turn, locus runs +them concurrently via `asyncio.gather`. Three independent searches +finish in `max(t1, t2, t3)`, not `t1+t2+t3`. + +If your tools have side effects that must be ordered, switch to +`tool_execution="sequential"`. -## Custom names and descriptions +### Error handling — tool failures don't crash the agent -Override the defaults via keyword arguments: +If a tool raises, the executor catches the exception, wraps it as a +`ToolResult(success=False, error=...)`, and feeds it back into the +next model turn. The model sees the failure and can react: retry, +try a different tool, or report to the user. ```python -@tool(name="find_customer", description="Look up a customer by email.") -async def _find(email: str) -> Customer: +@tool +def lookup_by_id(id: str) -> dict: + record = db.get(id) + if record is None: + raise ValueError(f"no record with id={id}") + return record +``` + +The model sees `"no record with id=42"` and decides what to do. +Behind the scenes, locus chains the original exception as the cause +on a `ToolExecutionError` for your structured logs. + +### Custom names and descriptions + +Override the auto-derived defaults when the function name doesn't +read well to the model: + +```python +@tool(name="find_customer", description="Look up a customer by email address.") +async def _find_customer_internal(email: str) -> Customer: ... 
``` -Both sync and async bodies are supported. Sync bodies run in a -thread-pool executor so the event loop is not blocked. +The model sees `find_customer`; your code keeps the internal name. + +## Practical recipes + +### Read-only lookups + +```python +@tool +def get_order_status(order_id: str) -> dict: + """Return the current status and shipment info for an order.""" + return orders.get(order_id) +``` + +No need for `idempotent=True` — read-only calls are safe to repeat. + +### Idempotent writes + +```python +@tool(idempotent=True) +def submit_po(vendor_id: str, line_items: list[dict]) -> dict: + """Submit a purchase order. Re-fires return the cached PO id.""" + return procurement.submit(vendor_id, line_items) +``` + +### A tool that's also exposed via MCP + +If you've built a tool you want other agents to reach, expose it +through `LocusMCPServer` — same `@tool`, no rewrite. See +[MCP](mcp.md). + +## Common gotchas -## Parallel vs sequential execution +| Symptom | Likely cause | +|---|---| +| Model never calls the tool | Description / docstring isn't telling the model when to use it. Be explicit: *"Use this tool when the user asks about X."* | +| Tool fires twice on the same input | You're seeing the model retry. Add `idempotent=True`. | +| `TypeError: missing 1 required positional argument` at call time | Function signature has a parameter without a default that you didn't surface in the docstring; the model omitted it. Add a default or explain the parameter. | +| Tool returns Python objects but the model echoes `<__main__.X object at 0x…>` | Tool return value isn't JSON-serialisable. Return a dict / Pydantic model / list of strings, not arbitrary objects. | +| Async tool blocks the event loop | The "async" body is calling sync I/O. Wrap the blocking call in `asyncio.to_thread(...)` or use an async client. | -The agent decides based on `config.tool_execution`: +## Source -- `"concurrent"` (default) — tool calls run in parallel via - `asyncio.gather`. 
-- `"sequential"` — tool calls run one at a time. Pick this when tool - side effects must be ordered. +- [`@tool` decorator and `Tool` class](https://github.com/oracle-samples/locus/blob/main/src/locus/tools/decorator.py) +- [`ToolRegistry`](https://github.com/oracle-samples/locus/blob/main/src/locus/tools/registry.py) +- [Built-in tools](https://github.com/oracle-samples/locus/tree/main/src/locus/tools/builtins) — `get_today_date`, `task_complete`, `ask_user` -## Error handling +## See also -If a tool raises, the exception is caught at the executor boundary, -wrapped as a `ToolResult(success=False, error=...)`, and passed to the -model so it can react. The original exception is chained as the cause -on a `ToolExecutionError` (see [Errors](errors.md)). +- [Idempotency](idempotency.md) — the full story on `idempotent=True`. +- [Hooks](hooks.md) — for cross-cutting concerns (logging, retry, guardrails). +- [Executors](executors.md) — how concurrent vs sequential tool execution works. +- [MCP](mcp.md) — expose your tools to other agents over the Model Context Protocol. +- [Errors](errors.md) — how tool failures surface in the event stream. diff --git a/docs/img/sequence-26ai.svg b/docs/img/sequence-26ai.svg index 0173c8e..609a3ed 100644 --- a/docs/img/sequence-26ai.svg +++ b/docs/img/sequence-26ai.svg @@ -54,7 +54,7 @@ OCI GenAI - gpt-5.5 · cohere-embed + gpt-5 · cohere-embed diff --git a/docs/index.md b/docs/index.md index e957f6a..63557d8 100644 --- a/docs/index.md +++ b/docs/index.md @@ -7,21 +7,23 @@ hide:

-# Build AI workflows that actually ship +# Build agents that reason and solve together. -**Oracle Generative AI · Multi-Agent · Reasoning · Orchestrator SDK.** +**The Oracle Gen AI Multi-Agent Reasoning SDK.** -Spin up a **swarm** of specialists. Hand a conversation off across an -**escalation desk**. Run an **orchestrator** of experts in parallel. -Wire up a **state graph** that loops until confident. Mesh agents -**across processes** with A2A. Or just ship one self-correcting agent -that knows when to stop. +Reasoning lives inside the loop. **Reflexion** evaluates every turn. +**Grounding** verifies every claim against its source. **Causal** +traces root cause from symptom. -Six multi-agent shapes. One Oracle-native runtime. Every model on OCI -the day it lands. The agent stack you'd actually let near a credit -card. +Seven shapes for seven problems. **Compose** linear pipelines. +**Orchestrate** specialists in parallel. **Swarm** for peer-to-peer +research. **Handoff** for escalation desks. **StateGraph** loops +until confident. **Functional** maps across agents. **A2A** meshes +across processes. -[See what you can build](#what-you-can-build){ .md-button .md-button--primary } +Every model on Oracle Generative AI the day it lands. + +[See what you can build](#six-things-you-can-ship){ .md-button .md-button--primary } [GitHub](https://github.com/oracle-samples/locus){ .md-button } ```bash @@ -34,7 +36,7 @@ pip install "locus[oci]"
-```python title="travel_concierge.py" +```python from locus import Agent from locus.tools.decorator import tool from locus.memory.backends import OCIBucketBackend @@ -53,7 +55,7 @@ def book_flight(flight_id: str, customer_id: str) -> dict: return billing.charge_and_book(flight_id, customer_id) agent = Agent( - model="oci:openai.gpt-5.5", + model="oci:openai.gpt-5", tools=[search_flights, book_flight], system_prompt="You are a travel concierge. Find a flight, then book it.", reflexion=True, # self-correct mid-run @@ -77,70 +79,220 @@ print(result.message)
-## What you can build +## Six things you can ship + +### Claims grounded. Citations real. Hallucinations dropped + +**Reflexion** evaluates every turn and feeds the next Think a sharper +plan. **Grounding** scores each claim against the tool result it came +from; below-threshold claims get dropped or sent back for re-research. +**Causal** traces root cause from symptom in incident-triage runs. + +```python +from locus import Agent +from locus.tools.decorator import tool + +@tool +def search_web(query: str) -> str: + """Search the web for facts.""" + return search_api.query(query) + +@tool +def read_url(url: str) -> str: + """Fetch and clean text from a URL.""" + return http.fetch_text(url) + +agent = Agent( + model="oci:openai.gpt-5", + tools=[search_web, read_url], + reflexion=True, # self-evaluate every turn + grounding=True, # verify claims against tool results +) + +result = agent.run_sync("Summarise the Q3 earnings call. Cite every number.") +print(result.message) +print(f"grounding score: {result.grounding_score:.2f}") +# → grounding score: 0.94 — three claims grounded, one dropped (revenue mix) +``` + +→ [Reasoning inside the loop](concepts/reasoning.md) · +[Turn on Reflexion + Grounding in one line](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_14_reasoning_patterns.py) -Six concrete workflows. All of them ship in production with locus -today. None of them require a graph editor, a YAML DAG, or a -separate orchestration platform. +### Side effects fire once. Even when the model retries -### Approval workflows that don't double-fire +The model can re-emit the same call after seeing an ambiguous result, +after a network glitch, after a checkpointed restart. With +**`@tool(idempotent=True)`** the body fires exactly once per +`(name, arguments)` hash. Booking, billing, paging — safe by design. -A vendor PO comes in. Procurement and Compliance debate it against -your live Oracle 26ai catalogue. They reach a recommendation. 
A human -clicks `[y/N]`. The Approval Officer fires `submit_po` and -`email_cfo` — once, even if the model retries the same call three -times. +```python +from locus import Agent +from locus.tools.decorator import tool -> *Procurement and Compliance disagree on three of nine vendors. The -> human approves two. Submit + email fire exactly once. Your CFO is -> happy.* +@tool(idempotent=True) +def submit_po(vendor_id: str, line_items: list[dict]) -> dict: + """Submit the PO. Re-fires within the run return the cached receipt.""" + return procurement.submit(vendor_id, line_items) -### Research crews that catch their own mistakes +@tool(idempotent=True) +def email_cfo(po_id: str, body: str) -> str: + """Send the CFO note. Same arguments → same delivery.""" + return mail.send(to="cfo@org.com", subject=f"PO {po_id}", body=body) -An agent reads, summarises, and fact-checks. **Grounding** -auto-verifies every claim against the source it cited. When a claim -fails grounding the agent goes back and re-reads. **Reflexion** -spots loops on wrong premises before they cost you ten turns of -tokens. You get cited, grounded answers — not hallucinated narratives. +agent = Agent( + model="oci:openai.gpt-5", + tools=[search_vendors, submit_po, email_cfo], + system_prompt="Approve a vendor; submit the PO; email the CFO.", +) -### Customer support that survives every deploy +result = agent.run_sync("Approve Acme for the $42k laptop refresh.") +# → PO-2847 submitted. CFO emailed once. Three model retries deduped on +# the (name, kwargs) hash inside the ReAct loop's Execute node. +``` -Triage decides whether the conversation needs Billing or Shipping. -The whole transcript hands over. The customer sees one continuous -reply. The conversation thread is checkpointed to OCI Object Storage, -so a redeploy mid-chat doesn't lose context. The customer doesn't -have to re-explain. 
+→ [Idempotent tools in the ReAct loop](concepts/idempotency.md) · +[Walk through a vendor PO with human approval](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_09_human_in_the_loop.py) -### Autonomous workflows that stop when they should +### One conversation, many specialists -Compose stop conditions like algebra: +**Handoff** transfers context, tool history, and confidence from +specialist to specialist. The customer sees one continuous reply; +each team ships their specialist on its own schedule, in its own repo. ```python -terminate = (ToolCalled("submit") & ConfidenceMet(0.9)) | MaxIterations(15) +from locus.multiagent.handoff import ( + create_handoff_agent, create_handoff_manager, HandoffReason, +) + +triage = create_handoff_agent( + name="Triage", + description="Routes incoming customer issues", + system_prompt="Decide: Billing or Shipping. Then hand off.", +) +billing = create_handoff_agent( + name="Billing", + description="Resolves invoices, refunds, charges", + system_prompt="Resolve the billing issue end-to-end.", +) +shipping = create_handoff_agent( + name="Shipping", + description="Tracks orders, reroutes shipments", + system_prompt="Resolve the shipping issue end-to-end.", +) +triage.can_delegate_to = [billing.id, shipping.id] + +desk = create_handoff_manager( + agents=[triage, billing, shipping], + max_chain=5, +) +# → [Triage → Billing] "Refunded $129. Confirmation RF-19340." ``` -The loop stops when the work is actually done — not when the budget -runs out, not when the agent gives up halfway. Inspect, unit-test, -audit; termination is just data. 
+→ [Handoff with chain-of-custody](concepts/multi-agent/handoff.md) · +[Wire a Triage / Billing / Shipping handoff desk](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_16_agent_handoff.py) -### Multi-agent meshes across teams and processes +### Agent meshes across teams and processes -Your research agent calls a finance agent on another team's service -over **A2A**. They share one event stream. Each agent advertises an -`AgentCard` that lists its capability tags; the calling agent fetches -the card from a known URL and decides whether to delegate. You ship -one agent at a time, on your team's schedule, in your team's repo — -and they still talk. +Each agent publishes an **`AgentCard`** at `/agent-card`. Your research +agent fetches the card from the Finance team's URL, reads the skills +list, and decides whether to delegate. HTTP+SSE under the hood, no +shared infrastructure required. -### Agents that ship to your users on day one +```python +import asyncio +from locus.a2a import A2AClient + +async def main(): + # The Finance team publishes their agent at this URL. + finance = A2AClient(url="https://finance.example.com") + + # Discover capabilities (name, description, skills). + card = await finance.get_agent_card() + print(f"Calling {card.name} — {card.description}") + print(f"Skills: {card.skills}") + + # Delegate. + answer = await finance.invoke( + "Pull Q3 OPEX vs forecast for line items 4100-4250." + ) + print(answer) + # → Q3 OPEX: $47M vs forecast $51M (-8%, supply-chain delays). + +asyncio.run(main()) +``` + +→ [A2A — agents across processes](concepts/multi-agent/a2a.md) · +[Call another team's agent over A2A](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_34_a2a_protocol.py) + +### Stop conditions you can compose + +Compose stop conditions with Python's `&` and `|` operators on typed +classes — `__and__` / `__or__` overloads. Inspectable, unit-testable, +serialisable. 
You can grep your codebase for *exactly when* an agent +decides to stop. The loop ends when the work is done. + +```python +from locus import Agent +from locus.core.termination import ( + MaxIterations, ToolCalled, ConfidenceMet, TextMention, +) + +termination = ( + (ToolCalled("submit_po") & ConfidenceMet(0.9)) # work done + confident + | TextMention(r"\bDONE\b") # …or model says DONE + | MaxIterations(15) # …or safety cap +) + +agent = Agent( + model="oci:openai.gpt-5", + tools=[search_vendors, submit_po], + termination=termination, +) + +result = agent.run_sync("Approve and submit the laptop PO.") +print(result.termination_reason) +# → ToolCalled('submit_po') and ConfidenceMet(0.92) +``` + +→ [Termination algebra](concepts/termination.md) · +[Compose stop conditions like algebra](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_37_termination.py) + +### Day-one production deployment + +**`AgentServer`** wraps any agent as a FastAPI app: `POST /invoke`, +`POST /stream` for SSE, `GET`/`DELETE /threads/{id}` with per-principal +persistence — two API keys can't read each other's threads. Ship to +OKE, Container Instances, OCI Functions, or anywhere FastAPI runs. + +```python +import os +from locus import Agent +from locus.memory.backends import oci_bucket_checkpointer +from locus.server import AgentServer + +agent = Agent( + model="oci:openai.gpt-5", + tools=[lookup_invoice, refund], + checkpointer=oci_bucket_checkpointer( + bucket_name="support-threads", + namespace="", + ), +) + +server = AgentServer( + agent=agent, + api_key=os.environ["LOCUS_SERVER_API_KEY"], +) +server.run(host="0.0.0.0", port=8080) + +# $ curl -X POST http://localhost:8080/invoke \ +# -H "Authorization: Bearer $LOCUS_SERVER_API_KEY" \ +# -d '{"prompt":"Refund order ORD-42","thread_id":"user-c42"}' +# → {"message": "Refunded $129. 
Confirmation RF-19340.", "thread_id": "user-c42"} +``` -`AgentServer` is a drop-in FastAPI app: `POST /invoke` for synchronous -runs, `POST /stream` for SSE-streamed events, `GET` / `DELETE -/threads/{id}` for per-thread persistence (scoped to the bearer -principal so two API keys can't read each other's conversations). -Native to Oracle Generative AI — every model the day OCI ships it. -Two transports, one auth surface, zero glue between laptop and -production. +→ [Agent Server — drop-in FastAPI app](concepts/server.md) · +[Deploy a locus agent as a FastAPI service](https://github.com/oracle-samples/locus/blob/main/examples/tutorial_28_agent_server.py) ## The locus agent loop @@ -294,7 +446,7 @@ def book_flight(flight_id: str, customer_id: str) -> dict: return billing.charge_and_book(flight_id, customer_id) agent = Agent( - model="oci:openai.gpt-5.5", + model="oci:openai.gpt-5", tools=[book_flight], system_prompt="You are a travel concierge. Book the flight the user asks for.", ) diff --git a/docs/stylesheets/locus.css b/docs/stylesheets/locus.css index f7798e4..1f722de 100644 --- a/docs/stylesheets/locus.css +++ b/docs/stylesheets/locus.css @@ -198,11 +198,18 @@ grid-template-columns: 1.05fr 1fr; gap: 2.4rem; align-items: start; /* both columns top-align — no dead vertical space */ - margin: 1rem 0 1.4rem; - padding: 1.6rem 0 0.8rem; + margin: -0.4rem 0 1.4rem; + padding: 0.4rem 0 0.8rem; isolation: isolate; overflow: hidden; } +/* Tighten the gap between the tabs strip and the hero on the home page — + the default Material content padding leaves dead space above the H1. 
*/ +.md-content__inner:has(> .locus-hero), +.md-content__inner:has(> div > .locus-hero) { + padding-top: 0.2rem; + margin-top: 0; +} /* Oracle-red soft glow behind the H1 — the warm spotlight */ .md-typeset .locus-hero::before { content: ""; @@ -238,7 +245,7 @@ font-weight: 800; text-transform: lowercase; color: var(--locus-ink); - margin: 0 0 1.4rem; + margin: 0 0 0.6rem; } .md-typeset .locus-hero h1 .accent { color: var(--or-red); @@ -692,6 +699,35 @@ letter-spacing: -0.01em; } +/* --------------------------------------------------------------------------- + Oracle-distinctive callout — used on the Capabilities page to highlight + wedge features. Oracle-red border + warm sand tint, red star icon. + Trigger with: `!!! oracle-distinctive "Distinctive to locus"` + --------------------------------------------------------------------------- */ +:root { + --md-admonition-icon--oracle-distinctive: url('data:image/svg+xml;charset=utf-8,'); +} +.md-typeset .admonition.oracle-distinctive, +.md-typeset details.oracle-distinctive { + border-color: var(--or-red); + background: linear-gradient(135deg, + rgba(199, 70, 52, 0.04) 0%, + rgba(240, 204, 113, 0.04) 100%); +} +.md-typeset .oracle-distinctive > .admonition-title, +.md-typeset .oracle-distinctive > summary { + background-color: rgba(199, 70, 52, 0.08); + color: var(--or-red-deep); + font-weight: 700; + letter-spacing: -0.01em; +} +.md-typeset .oracle-distinctive > .admonition-title::before, +.md-typeset .oracle-distinctive > summary::before { + background-color: var(--or-red); + -webkit-mask-image: var(--md-admonition-icon--oracle-distinctive); + mask-image: var(--md-admonition-icon--oracle-distinctive); +} + /* --------------------------------------------------------------------------- Diagrams — responsive sizing for the agent-loop SVG and similar. 
--------------------------------------------------------------------------- */ @@ -709,6 +745,22 @@ max-width: 920px; } +/* The architecture diagrams (agent-loop, multi-agent-patterns, the + per-pattern SVGs under img/patterns/, and the architecture / sequence + topologies) are authored against a light background. In dark mode the + dark-grey strokes and labels vanish, so render them on a near-white + card. */ +[data-md-color-scheme="slate"] .md-typeset img[src*="agent-loop"], +[data-md-color-scheme="slate"] .md-typeset img[src*="multi-agent-patterns"], +[data-md-color-scheme="slate"] .md-typeset img[src*="img/patterns/"], +[data-md-color-scheme="slate"] .md-typeset img[src*="architecture"], +[data-md-color-scheme="slate"] .md-typeset img[src*="sequence-26ai"], +[data-md-color-scheme="slate"] .md-typeset img.diagram { + background-color: #FBF9F8; + padding: 1rem; + border-radius: 0.5rem; +} + /* --------------------------------------------------------------------------- Dark-mode logo handling. - Oracle wordmark (black-fill) → invert filter flips to white. diff --git a/mkdocs.yml b/mkdocs.yml index bdb3465..bfb28d1 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -170,4 +170,4 @@ nav: - Checkpointers: api/checkpointers.md - Tools: api/tools.md - Events: api/events.md -- Feature matrix: FEATURES.md +- Capabilities: FEATURES.md