docs(agent-sdk): add Ollama LLM streaming example #133
Add `02-ollama.ts` to the agent-sdk TypeScript examples — step 2 of the example ladder after `01-echo.ts`. Same `AgentService` shape as the echo agent; the only change is the `onPrompt` body, which forwards each prompt to a local Ollama (`/api/generate`, plain `fetch`, no new deps) and streams the model's reply back token by token via repeated `response.send(...)`. Model is a `MODEL` constant (default `llama3.2`), overridable via `OLLAMA_MODEL`; host via `OLLAMA_URL`. README updated with the new row and a short run note. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude finished @M64GitHub's task in 1m 50s

PR #133 Review: docs(agent-sdk): add Ollama LLM streaming example

Summary

The overall shape is correct. The streaming logic […]

Two real issues found (both have inline suggestions):

README update is clean: table row and run note are accurate and well-formatted.
      body: JSON.stringify({ model: MODEL, prompt, stream: true }),
    });
    if (!res.ok || res.body === null) {
      throw new Error(`Ollama request failed: ${res.status} ${res.statusText}`);
Missing Content-Type: application/json header. Ollama happens to work without it in practice, but HTTP semantics require the header for a POST with a JSON body — some Ollama versions or proxies in front of it may reject the request.
Suggested change:

    - throw new Error(`Ollama request failed: ${res.status} ${res.statusText}`);
    + body: JSON.stringify({ model: MODEL, prompt, stream: true }),
    + headers: { "Content-Type": "application/json" },
    + });
Fixed in ffc628e — added headers: { "Content-Type": "application/json" } to the POST.
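For readers following the thread, here is a minimal sketch of the request options after this fix. The `GenerateRequest` type and the `buildGenerateRequest` helper are hypothetical names used only for illustration; the example file in this PR may inline the options directly in its `fetch` call.

```typescript
// Hypothetical shape of the options passed to fetch(); not part of the SDK.
interface GenerateRequest {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the POST options for Ollama's /api/generate endpoint.
// The Content-Type header declares that the body is JSON.
function buildGenerateRequest(model: string, prompt: string): GenerateRequest {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, prompt, stream: true }),
  };
}

const request = buildGenerateRequest("llama3.2", "hello!");
```

An object of this shape can be passed as the second argument to `fetch`, with the `/api/generate` URL as the first.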
    buffer = lines.pop() ?? "";
    for (const line of lines) {
      if (line.trim() === "") continue;
      yield (JSON.parse(line) as { response?: string }).response ?? "";
The done packet Ollama sends looks like {"done":true,"response":"","done_reason":"stop",...}. The response field is an empty string, so the generator yields "" one extra time and the handler calls response.send("") — sending a vacuous ResponseChunk over the wire. Not protocol-breaking, but worth filtering:
Suggested change:

    - yield (JSON.parse(line) as { response?: string }).response ?? "";
    + const parsed = JSON.parse(line) as { response?: string; done?: boolean };
    + if (parsed.done) break;
    + const token = parsed.response ?? "";
    + if (token) yield token;
Fixed in ffc628e. Verified against a live Ollama: the old path emitted 2 chunks (1 empty, the done packet); the fix yields only non-empty response text, so 1 chunk / 0 empties. To be explicit for anyone reading this thread — send("") was not the §6.5 stream terminator: it encodes to a 29-byte {"type":"response","data":""} chunk, whereas the SDK's terminator is a genuinely zero-byte body emitted automatically after the handler returns. So no early close; this was just trimming a vacuous chunk (also keeps the protocol visualizer clean for the demo).
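To make the behaviour described above concrete, here is a small sketch that filters empty tokens out of already-split NDJSON lines. The `tokensFromLines` name and the sample lines are hypothetical, and the packets are trimmed to the fields quoted in this thread. The last line checks the 29-byte figure for the chunk that `send("")` would have produced, assuming the encoding quoted above.

```typescript
// Fields of one streamed line that this sketch cares about (illustrative only).
interface GenerateLine {
  response?: string;
  done?: boolean;
}

// Yield only non-empty response text; blank lines and empty tokens are skipped,
// so the final done packet (whose response is "") produces nothing.
function* tokensFromLines(lines: Iterable<string>): Generator<string> {
  for (const line of lines) {
    if (line.trim() === "") continue;
    const token = (JSON.parse(line) as GenerateLine).response ?? "";
    if (token) yield token;
  }
}

// Hypothetical sample stream: two tokens, a blank line, then the done packet.
const sample = [
  '{"response":"Hel","done":false}',
  '{"response":"lo","done":false}',
  "",
  '{"done":true,"response":"","done_reason":"stop"}',
];
const tokens = Array.from(tokensFromLines(sample));

// The chunk an empty send would encode to is pure ASCII,
// so its string length equals its byte count.
const emptyChunkLength = JSON.stringify({ type: "response", data: "" }).length;
```

Filtering on the token rather than on `done` also keeps any text that a final packet might carry, at the cost of not stopping early.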
- Send `Content-Type: application/json` on the Ollama POST — correct HTTP
semantics; some Ollama versions / fronting proxies may reject without it.
- Skip Ollama's final `{"done":true,"response":""}` packet so the handler
never streams a vacuous `{"type":"response","data":""}` chunk to the
caller (verified against a live Ollama: trailing empty chunk eliminated).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@claude - please review again
Claude finished @M64GitHub's task in 1m 19s

PR #133 Re-Review: docs(agent-sdk): add Ollama LLM streaming example

Both previous issues are resolved ✅

1. `headers: { "Content-Type": "application/json" },`
2. Spurious empty chunk from […]

    const token = (JSON.parse(line) as { response?: string }).response ?? "";
    if (token) yield token;

This filters at the generator level rather than checking […]

One new issue found

README copy-paste bug: wrong subject in […]

The […]

    nats req agents.prompt.echo.<you>.main "hello!" \

The Ollama agent registers as […]

    nats req agents.prompt.ollama.<you>.main "hello!" \

The expected output block at […]

Overall

The implementation itself ( […]
The `02-ollama` section was inserted above echo's "drive it with `nats req`" block, stranding echo's example (and its `agents.prompt.echo...` subject + `echo: hello!` output) under the Ollama heading. Restore echo's drive/output block to the echo section, and give the ollama section its own correct `agents.prompt.ollama...` invocation with a representative streamed output. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Good catch, fixed in 1ad9ec8.

Root cause: the […]

Fix: echo's drive/output block is back under the echo run section, and […]
What

Adds `agent-sdk/typescript/examples/02-ollama.ts`, step 2 of the agent-sdk example ladder, after `01-echo.ts`.

It's deliberately the echo agent with one thing changed: the `AgentService` setup, connection resolution, and shutdown are identical; only the `onPrompt` body differs. Instead of a single `echo: <prompt>` reply, it forwards the prompt to a local Ollama and streams the model's answer back token by token.

The single helper, `ollamaTokens()`, is an async generator that hits `/api/generate` with plain `fetch` (no new dependencies) and reassembles Ollama's newline-delimited JSON into tokens. The SDK owns stream termination, so the handler just returns when the loop ends.

Config

- `MODEL` constant, default `llama3.2`, overridable via `OLLAMA_MODEL`
- `OLLAMA_URL` (default `http://localhost:11434`)

Context
Built as conference-talk material (reThinkConn) demonstrating the Synadia Agent Protocol in three small steps: echo → LLM prompt/response → tool calls. This is step 2; step 3 (tool calls) to follow.
Verification

- `bun run typecheck`, `bun run lint`, `bun run format:check`: clean

README updated with the new row and a run note. No SDK surface, wire-format, or dependency changes; examples-only.
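The newline-delimited JSON reassembly mentioned under "What" can be sketched in isolation. This is a simplified illustration under stated assumptions, not the code in `02-ollama.ts`: the `LineBuffer` class is a hypothetical name, and it takes already-decoded string chunks rather than reading a `fetch` body stream. The point is that a network chunk can end mid-line, so the incomplete tail is held back until the next chunk arrives.

```typescript
// Hypothetical helper: accumulates text chunks and releases complete lines.
class LineBuffer {
  private buffer = "";

  // Append a decoded chunk and return every complete, non-blank line it finishes.
  push(chunk: string): string[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is either "" or an incomplete line; keep it for later.
    this.buffer = lines.pop() ?? "";
    return lines.filter((line) => line.trim() !== "");
  }

  // Return any trailing text that never received a final newline.
  flush(): string[] {
    const rest = this.buffer;
    this.buffer = "";
    return rest.trim() === "" ? [] : [rest];
  }
}

// Hypothetical chunks that split two JSON lines across chunk boundaries.
const chunks = ['{"response":"Hel', 'lo"}\n{"response":" wor', 'ld"}\n', '{"done":true}'];

const lineBuffer = new LineBuffer();
const completeLines: string[] = [];
for (const chunk of chunks) completeLines.push(...lineBuffer.push(chunk));
completeLines.push(...lineBuffer.flush());

// Join the response text carried by each reassembled line.
const text = completeLines
  .map((line) => (JSON.parse(line) as { response?: string }).response ?? "")
  .join("");
```

Parsing happens only on complete lines, so `JSON.parse` never sees a fragment of a packet.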
🤖 Generated with Claude Code