diff --git a/CHANGELOG.md b/CHANGELOG.md index 58f9918..2405f88 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -50,6 +50,97 @@ `actions/record_stop` against the Telnyx Call Control API when `recording=True` was passed; the README copy lagged behind the code. +### Fixed + +- **Python: pipeline mode crashed immediately on stream start with + `AttributeError: 'ClientConnection' object has no attribute 'closed'` + (#111, #113).** Three WS-liveness checks + (`libraries/python/getpatter/stream_handler.py:2192` / + `:2230` and + `libraries/python/getpatter/providers/elevenlabs_ws_tts.py:453`) + still used the legacy `websockets<11` `.closed` property, but Patter + pins `websockets>=14,<16` in + `libraries/python/pyproject.toml` where `.closed` was removed in v12. + Promoted the existing `_is_parked_ws_alive` helper out of + `stream_handler.py` into + `libraries/python/getpatter/utils/ws.py` as `is_ws_alive`, and + re-used it at every call site. Handles modern (`state`, + `close_code`), legacy (`closed`), and unknown shapes; never defaults + to "alive" on unknown shapes so a dead socket can't be handed to the + live adapter. 8 new unit tests in + `libraries/python/tests/test_utils_ws.py`. Thanks + [@knowsuchagency](https://github.com/knowsuchagency). + +- **Python: pipeline mode did not inject the built-in `transfer_call` + / `end_call` tools into the `LLMLoop`, so pipeline agents could not + initiate a handoff or hangup no matter what the system prompt said + (#110, #115).** Realtime mode had been injecting both built-ins at + `libraries/python/getpatter/stream_handler.py:997` + (`agent_tools + [TRANSFER_CALL_TOOL, END_CALL_TOOL]`), but the + pipeline path at + `libraries/python/getpatter/stream_handler.py:2426` was passing + through only the user-provided tools. 
Added + `_augment_with_builtin_handoff_tools` that builds handler closures + with the `(arguments, call_context)` signature expected by + `ToolExecutor._invoke_handler` and wires them to the existing + telephony-level `_transfer_fn` / `_hangup_fn` already attached to + `PipelineStreamHandler`. Built-ins are skipped when the + corresponding telephony fn is missing (keeps the non-telephony test + harness path clean). Verified end-to-end against `gpt-4o-mini` on + Twilio: caller says "transfer me", LLM emits + `transfer_call({"number": "+1..."})`, `_twilio_transfer` fires, the + call bridges. 6 new unit tests in + `libraries/python/tests/test_pipeline_builtin_tools.py`. Thanks + [@knowsuchagency](https://github.com/knowsuchagency). + +- **TypeScript: pipeline mode missing built-in `transfer_call` / + `end_call` tools — parity fix for #115.** Both `new LLMLoop(...)` + call sites in `libraries/typescript/src/stream-handler.ts:1891` and + `:1906` were passing `agent.tools` through unchanged; the built-ins + shipped in `server.ts` (now exported as `TRANSFER_CALL_TOOL` / + `END_CALL_TOOL`) were only injected into the Realtime path at + `server.ts:374`. Added `augmentWithBuiltinHandoffTools` in + `libraries/typescript/src/stream-handler.ts` that mirrors the Python + helper: appends the two built-ins with handler closures that + validate E.164 / default `reason` and dispatch to the existing + telephony bridge methods (`this.deps.bridge.transferCall` / + `endCall`). 8 new unit tests in + `libraries/typescript/tests/pipeline-builtin-tools.test.ts`. Closes + the parity gap surfaced by #115. + +- **Docs: `docs/typescript-sdk/events.mdx` advertised the same + non-existent `phone.events.on(PatterEventType.X, handler)` API as + the Python events page — TypeScript parity fix for #114.** The TS + `Patter` class never exposed an `.events` attribute; `EventBus` is + instantiated per `StreamHandler`. 
Replaced the broken `EventBus` + section with documentation of the APIs that actually exist on the + TypeScript `Patter` class: **Speech-edge events** via the attribute + setters (`onUserSpeechStarted` / `onUserSpeechEnded` / + `onUserSpeechEos` / `onAgentSpeechStarted` / `onAgentSpeechEnded` / + `onLlmToken` / `onAudioOut`, proxied to `this.speechEvents` at + `libraries/typescript/src/client.ts:241-330`) and **Tool events via + `onTranscript`** (tool invocations surface with `role === "tool"`, + `tool_name`, `tool_args`, `tool_result` — payload defined at + `libraries/typescript/src/stream-handler.ts:2988-3010`). + +- **Docs: `docs/python-sdk/events.mdx` advertised a non-existent + `phone.events.on(PatterEventType.X, handler)` API that crashed + immediately with `AttributeError: 'Patter' object has no attribute + 'events'` (#112, #114).** The `_EventBus` is instantiated per + `StreamHandler` (`libraries/python/getpatter/stream_handler.py:517`) + and never exposed on the `Patter` class. Replaced the broken + `EventBus` section with documentation of the APIs that actually + exist: **Speech-edge events** via the attribute setters on `Patter` + (`on_user_speech_started` / `on_user_speech_ended` / + `on_user_speech_eos` / `on_agent_speech_started` / + `on_agent_speech_ended` / `on_llm_token` / `on_audio_out`, proxied + to `self.speech_events` at + `libraries/python/getpatter/client.py:351-410`) and **Tool events + via `on_transcript`** (tool invocations surface with `role="tool"`, + `tool_name`, `tool_args`, `tool_result` — payload defined at + `libraries/python/getpatter/stream_handler.py:929`). Thanks + [@knowsuchagency](https://github.com/knowsuchagency). + ## 0.6.2 (2026-05-25) ### Added diff --git a/docs/typescript-sdk/events.mdx b/docs/typescript-sdk/events.mdx index c95b935..397e30d 100644 --- a/docs/typescript-sdk/events.mdx +++ b/docs/typescript-sdk/events.mdx @@ -18,7 +18,7 @@ Patter emits events at key moments during a call. 
Register callbacks in `serve() | `onMessage` | User transcript ready for response (pipeline mode) | `serve()` | | `onMetrics` | After each conversational turn completes | `serve()` | -For **fine-grained pipeline observability** (every interim transcript, every LLM chunk, every TTS chunk, every tool start) subscribe to the [EventBus](#eventbus) below — it complements these callbacks rather than replacing them. +For **fine-grained pipeline observability** see the [Speech-edge events](#speech-edge-events) and [Tool events via onTranscript](#tool-events-via-ontranscript) sections below — they complement the lifecycle callbacks rather than replacing them. For **mutating prompts and responses** (RAG augmentation, output validation, PII redaction) use [PipelineHooks](#pipelinehooks-beforellm--afterllm) — they sit *inside* the LLM step rather than firing alongside it. @@ -241,26 +241,75 @@ await phone.serve({ } ``` -## EventBus +## Speech-edge events -The `EventBus` exposes fine-grained pipeline events that don't have first-class callbacks. Subscribe with `events.on(eventType, handler)` from anywhere you have a `Patter` reference. +For turn-taking, TTFT measurement, and barge-in / interrupt observability, set the speech-edge callbacks directly on the `Patter` instance. They proxy to a per-process `SpeechEvents` dispatcher and fire from any in-flight call. ```typescript -import { Patter, PatterEventType } from "getpatter"; +phone.onUserSpeechEos = async (ev) => { + // Committed end-of-utterance — anchor TTFT here. + console.log(`EOS via ${ev.trigger} at ${ev.timestamp_ms}ms`); +}; + +phone.onLlmToken = async (ev) => { + // First LLM token of the turn — TTFT marker. 
+ console.log(`TTFT, model=${ev.model}, t=${ev.timestamp_ms}ms`); +}; -phone.events.on(PatterEventType.TRANSCRIPT_PARTIAL, (ev) => console.log("partial:", ev.text)); -phone.events.on(PatterEventType.LLM_CHUNK, (ev) => logChunk(ev.call_id, ev.text)); +phone.onAgentSpeechEnded = async (ev) => { + const status = ev.interrupted ? "interrupted" : "completed"; + console.log(`Turn ${ev.turn_idx} ${status}`); +}; + +phone.onUserSpeechStarted = async (ev) => { /* raw VAD positive edge */ }; +phone.onUserSpeechEnded = async (ev) => { /* raw VAD trailing edge */ }; +phone.onAgentSpeechStarted = async (ev) => { /* first wire-time audio chunk */ }; +phone.onAudioOut = async (ev) => { /* first TTS audio bytes produced */ }; ``` -| `PatterEventType` | Fires | +| Attribute | Fires | |---|---| -| `TRANSCRIPT_PARTIAL` | Every interim STT result (before endpointing). | -| `TRANSCRIPT_FINAL` | Every final STT result (after endpointing). Same payload as `onTranscript`. | -| `LLM_CHUNK` | Every streamed LLM token / chunk. | -| `TTS_CHUNK` | Every TTS audio chunk written to the carrier. | -| `TOOL_CALL_STARTED` | Tool dispatched (paired with the existing `tool_call_completed` you can observe via `onCallEnd`). | +| `onUserSpeechStarted` | Raw VAD positive edge (caller begins speaking). | +| `onUserSpeechEnded` | Raw VAD trailing edge (caller stops speaking). | +| `onUserSpeechEos` | Committed end-of-utterance — anchor TTFT here. | +| `onAgentSpeechStarted` | First wire-time agent audio chunk — turn-start marker for the caller. | +| `onAgentSpeechEnded` | Last agent audio chunk. Payload includes `interrupted` flag for barge-in. | +| `onLlmToken` | First LLM token of the turn — TTFT marker. | +| `onAudioOut` | First TTS audio bytes produced — TTS warmup signal. | + +Callbacks are async. Throwing inside a callback logs the error but does not interrupt the call. + +## Tool events via `onTranscript` -Handlers are non-blocking (fire-and-forget). 
Throwing inside a handler logs the error but does not interrupt the call. +Tool invocations (including the built-in `transfer_call` and `end_call`) surface through the same `onTranscript` callback you pass to `phone.serve(...)`. Filter on `role === "tool"` to handle them: + +```typescript +await phone.serve({ + agent, + onTranscript: async (ev) => { + if (ev.role === "tool") { + console.log( + `tool=${ev.tool_name} ` + + `args=${JSON.stringify(ev.tool_args)} ` + + `result=${ev.tool_result}`, + ); + } else { + console.log(`[${ev.role}] ${ev.text}`); + } + }, +}); +``` + +The event payload for tool calls carries: + +| Key | Type | Notes | +|---|---|---| +| `role` | `"tool"` | Always `"tool"` for tool events. | +| `tool_name` | `string` | The tool that was dispatched. | +| `tool_args` | `Record<string, unknown>` | Arguments emitted by the LLM. | +| `tool_result` | `string \| null` | Result returned by the tool handler (truncated for log readability). | +| `call_id` | `string` | The active call ID. | +| `text` | `string` | Pre-formatted `tool_name(args) → result` string. 
| --- diff --git a/libraries/typescript/package-lock.json b/libraries/typescript/package-lock.json index 7052d90..84af80e 100644 --- a/libraries/typescript/package-lock.json +++ b/libraries/typescript/package-lock.json @@ -1,12 +1,12 @@ { "name": "getpatter", - "version": "0.6.1", + "version": "0.6.2", "lockfileVersion": 3, "requires": true, "packages": { "": { "name": "getpatter", - "version": "0.6.1", + "version": "0.6.2", "license": "MIT", "dependencies": { "express": "^5.2.1", diff --git a/libraries/typescript/src/server.ts b/libraries/typescript/src/server.ts index c3b71e8..3f95489 100644 --- a/libraries/typescript/src/server.ts +++ b/libraries/typescript/src/server.ts @@ -60,7 +60,7 @@ export interface LocalConfig { type AIAdapter = OpenAIRealtimeAdapter | ElevenLabsConvAIAdapter; -const TRANSFER_CALL_TOOL = { +export const TRANSFER_CALL_TOOL = { name: 'transfer_call', description: 'Transfer the call to a human agent at the specified phone number', parameters: { @@ -75,7 +75,7 @@ const TRANSFER_CALL_TOOL = { }, }; -const END_CALL_TOOL = { +export const END_CALL_TOOL = { name: 'end_call', description: 'End the current phone call. 
Use when the conversation is complete or the user says goodbye.', parameters: { diff --git a/libraries/typescript/src/stream-handler.ts b/libraries/typescript/src/stream-handler.ts index 8dd7d02..efa55b5 100644 --- a/libraries/typescript/src/stream-handler.ts +++ b/libraries/typescript/src/stream-handler.ts @@ -23,7 +23,7 @@ import { MCPManager } from './tools/mcp-client'; import type { AgentOptions, Guardrail, HookContext, PipelineMessageHandler, ToolDefinition, VADProvider } from './types'; import type { MetricsStore } from './dashboard/store'; import { getLogger } from './logger'; -import { validateTwilioSid } from './server'; +import { validateTwilioSid, TRANSFER_CALL_TOOL, END_CALL_TOOL } from './server'; import type { ProviderPricing } from './pricing'; import { SentenceChunker } from './sentence-chunker'; import { PipelineHookExecutor } from './pipeline-hooks'; @@ -117,6 +117,56 @@ function isValidE164(number: string): boolean { return /^\+[1-9]\d{6,14}$/.test(number); } +/** + * Augment a tool list with the built-in `transfer_call` / `end_call` tools, + * wired to the telephony-level transfer / hangup callbacks. Used by pipeline + * mode to match the Realtime path's tool surface (Realtime injects the same + * two built-ins at `server.ts` and dispatches them via the bridge in this + * file's tool dispatcher around line 3100). Without this the pipeline LLM + * never sees the built-ins and cannot initiate a transfer or hangup + * regardless of system-prompt instructions. Parity with Python helper + * `_augment_with_builtin_handoff_tools` in `stream_handler.py`. + * + * Built-ins are skipped when the corresponding callback is missing (keeps + * non-telephony test harnesses clean). User-provided tools keep their + * original order; the built-ins are appended. 
+ */ +export function augmentWithBuiltinHandoffTools( + userTools: ToolDefinition[] | null | undefined, + callbacks: { + transferCall?: (number: string) => Promise<void>; + endCall?: (reason: string) => Promise<void>; + }, +): ToolDefinition[] { + const out: ToolDefinition[] = [...(userTools ?? [])]; + if (callbacks.transferCall) { + const transferCall = callbacks.transferCall; + out.push({ + ...TRANSFER_CALL_TOOL, + handler: async (args: Record<string, unknown>): Promise<string> => { + const number = typeof args.number === 'string' ? args.number : ''; + if (!isValidE164(number)) { + return JSON.stringify({ error: 'Invalid phone number format', status: 'rejected' }); + } + await transferCall(number); + return JSON.stringify({ status: 'transferring', to: number }); + }, + }); + } + if (callbacks.endCall) { + const endCall = callbacks.endCall; + out.push({ + ...END_CALL_TOOL, + handler: async (args: Record<string, unknown>): Promise<string> => { + const reason = typeof args.reason === 'string' ? args.reason : 'conversation_complete'; + await endCall(reason); + return JSON.stringify({ status: 'ending', reason }); + }, + }); + } + return out; +} + /** * Short words / phrases that Whisper (and, less often, Deepgram) routinely * emit when fed silence or TTS echo on mulaw 8 kHz. Dropping them as turns @@ -1888,11 +1938,23 @@ export class StreamHandler { } // eslint-disable-next-line @typescript-eslint/no-explicit-any const providerModel = (this.deps.agent.llm as any)?.model ?? ''; + // Inject the built-in transfer_call / end_call tools — parity with the + // Realtime path which injects them at `server.ts` and dispatches via + // the bridge in this file's tool dispatcher. Without this, pipeline-mode + // LLMs never see the built-ins and can't initiate a handoff or hangup + // no matter what the system prompt says. 
+ const augmentedTools = augmentWithBuiltinHandoffTools( + this.deps.agent.tools as ToolDefinition[] | null | undefined, + { + transferCall: (number) => this.deps.bridge.transferCall(this.callId, number), + endCall: () => this.deps.bridge.endCall(this.callId, this.ws), + }, + ); this.llmLoop = new LLMLoop( '', // apiKey unused when llmProvider is supplied providerModel, // propagate so calculateLlmCost can match the price row resolvedPrompt, - this.deps.agent.tools as ToolDefinition[] | undefined, + augmentedTools, this.deps.agent.llm, this.deps.agent.disablePhonePreamble ?? false, ); @@ -1903,11 +1965,18 @@ export class StreamHandler { } else if (!this.deps.onMessage && this.deps.config.openaiKey) { let llmModel = this.deps.agent.model || 'gpt-4o-mini'; if (llmModel.includes('realtime')) llmModel = 'gpt-4o-mini'; + const augmentedTools = augmentWithBuiltinHandoffTools( + this.deps.agent.tools as ToolDefinition[] | null | undefined, + { + transferCall: (number) => this.deps.bridge.transferCall(this.callId, number), + endCall: () => this.deps.bridge.endCall(this.callId, this.ws), + }, + ); this.llmLoop = new LLMLoop( this.deps.config.openaiKey, llmModel, resolvedPrompt, - this.deps.agent.tools as ToolDefinition[] | undefined, + augmentedTools, undefined, this.deps.agent.disablePhonePreamble ?? false, ); diff --git a/libraries/typescript/tests/pipeline-builtin-tools.test.ts b/libraries/typescript/tests/pipeline-builtin-tools.test.ts new file mode 100644 index 0000000..c3fd156 --- /dev/null +++ b/libraries/typescript/tests/pipeline-builtin-tools.test.ts @@ -0,0 +1,127 @@ +/** + * Regression for upstream issue #110 (Python PR #115) — TypeScript parity. + * + * Pipeline mode previously passed only the user-provided tools to `LLMLoop` + * — the built-in `transfer_call` / `end_call` tools that the Realtime path + * injects were missing, so pipeline LLMs could never initiate a handoff or + * hangup regardless of the system prompt. 
+ * + * These tests exercise the `augmentWithBuiltinHandoffTools` helper that + * bolts the built-ins onto the tool list with handler closures wired to the + * telephony-level transfer / hangup callbacks. + */ +import { describe, it, expect } from 'vitest'; +import { augmentWithBuiltinHandoffTools } from '../src/stream-handler'; +import type { ToolDefinition } from '../src/types'; + +describe('augmentWithBuiltinHandoffTools', () => { + it('appends transfer_call and end_call when both callbacks present', () => { + const tools = augmentWithBuiltinHandoffTools(null, { + transferCall: async () => {}, + endCall: async () => {}, + }); + expect(tools.map((t) => t.name)).toEqual(['transfer_call', 'end_call']); + expect(typeof tools[0].handler).toBe('function'); + expect(typeof tools[1].handler).toBe('function'); + }); + + it('preserves user tools order with built-ins appended', () => { + const userTools: ToolDefinition[] = [ + { name: 'lookup_customer', description: '', parameters: { type: 'object' } }, + { name: 'send_sms', description: '', parameters: { type: 'object' } }, + ]; + const tools = augmentWithBuiltinHandoffTools(userTools, { + transferCall: async () => {}, + endCall: async () => {}, + }); + expect(tools.map((t) => t.name)).toEqual([ + 'lookup_customer', + 'send_sms', + 'transfer_call', + 'end_call', + ]); + }); + + it('skips built-ins when callbacks are missing', () => { + const userTools: ToolDefinition[] = [ + { name: 'lookup_customer', description: '', parameters: {} }, + ]; + const tools = augmentWithBuiltinHandoffTools(userTools, {}); + expect(tools.map((t) => t.name)).toEqual(['lookup_customer']); + }); + + it('skips only the built-in whose callback is missing', () => { + const tools = augmentWithBuiltinHandoffTools(null, { + endCall: async () => {}, + }); + expect(tools.map((t) => t.name)).toEqual(['end_call']); + }); + + it('transfer handler dispatches the number to transferCall', async () => { + const captured: string[] = []; + const tools = 
augmentWithBuiltinHandoffTools(null, { + transferCall: async (n) => { + captured.push(n); + }, + }); + const handler = tools[0].handler; + if (typeof handler !== 'function') throw new Error('handler missing'); + const result = await (handler as (a: Record<string, unknown>, c: Record<string, unknown>) => Promise<string>)( + { number: '+14155551234' }, + { call_id: 'CAtest' }, + ); + expect(captured).toEqual(['+14155551234']); + expect(result).toContain('transferring'); + expect(result).toContain('+14155551234'); + }); + + it('transfer handler rejects invalid E.164 without dispatching', async () => { + const captured: string[] = []; + const tools = augmentWithBuiltinHandoffTools(null, { + transferCall: async (n) => { + captured.push(n); + }, + }); + const handler = tools[0].handler; + if (typeof handler !== 'function') throw new Error('handler missing'); + const result = await (handler as (a: Record<string, unknown>, c: Record<string, unknown>) => Promise<string>)( + { number: 'not-a-number' }, + { call_id: 'CAtest' }, + ); + expect(captured).toEqual([]); + expect(result).toContain('rejected'); + }); + + it('end_call handler dispatches with default reason', async () => { + const reasons: string[] = []; + const tools = augmentWithBuiltinHandoffTools(null, { + endCall: async (r) => { + reasons.push(r); + }, + }); + const handler = tools[0].handler; + if (typeof handler !== 'function') throw new Error('handler missing'); + const result = await (handler as (a: Record<string, unknown>, c: Record<string, unknown>) => Promise<string>)( + {}, + { call_id: 'CAtest' }, + ); + expect(reasons).toEqual(['conversation_complete']); + expect(result).toContain('ending'); + }); + + it('end_call handler passes through user-supplied reason', async () => { + const reasons: string[] = []; + const tools = augmentWithBuiltinHandoffTools(null, { + endCall: async (r) => { + reasons.push(r); + }, + }); + const handler = tools[0].handler; + if (typeof handler !== 'function') throw new Error('handler missing'); + await (handler as (a: Record<string, unknown>, c: Record<string, unknown>) => Promise<string>)( + { reason: 'user_requested' }, + { call_id: 'CAtest' 
}, + ); + expect(reasons).toEqual(['user_requested']); + }); +});