91 changes: 91 additions & 0 deletions CHANGELOG.md
@@ -50,6 +50,97 @@
`actions/record_stop` against the Telnyx Call Control API when
`recording=True` was passed; the README copy lagged behind the code.

### Fixed

- **Python: pipeline mode crashed immediately on stream start with
`AttributeError: 'ClientConnection' object has no attribute 'closed'`
(#111, #113).** Three WS-liveness checks
(`libraries/python/getpatter/stream_handler.py:2192` /
`:2230` and
`libraries/python/getpatter/providers/elevenlabs_ws_tts.py:453`)
still used the legacy `websockets<11` `.closed` property, but Patter
pins `websockets>=14,<16` in
`libraries/python/pyproject.toml` where `.closed` was removed in v12.
Promoted the existing `_is_parked_ws_alive` helper out of
`stream_handler.py` into
`libraries/python/getpatter/utils/ws.py` as `is_ws_alive`, and
re-used it at every call site. Handles modern (`state`,
`close_code`), legacy (`closed`), and unknown shapes; never defaults
to "alive" on unknown shapes so a dead socket can't be handed to the
live adapter. 8 new unit tests in
`libraries/python/tests/test_utils_ws.py`. Thanks
[@knowsuchagency](https://github.com/knowsuchagency).

- **Python: pipeline mode did not inject the built-in `transfer_call`
/ `end_call` tools into the `LLMLoop`, so pipeline agents could not
initiate a handoff or hangup no matter what the system prompt said
(#110, #115).** Realtime mode had been injecting both built-ins at
`libraries/python/getpatter/stream_handler.py:997`
(`agent_tools + [TRANSFER_CALL_TOOL, END_CALL_TOOL]`), but the
pipeline path at
`libraries/python/getpatter/stream_handler.py:2426` was passing
through only the user-provided tools. Added
`_augment_with_builtin_handoff_tools` that builds handler closures
with the `(arguments, call_context)` signature expected by
`ToolExecutor._invoke_handler` and wires them to the existing
telephony-level `_transfer_fn` / `_hangup_fn` already attached to
`PipelineStreamHandler`. Built-ins are skipped when the
corresponding telephony fn is missing (keeps the non-telephony test
harness path clean). Verified end-to-end against `gpt-4o-mini` on
Twilio: caller says "transfer me", LLM emits
`transfer_call({"number": "+1..."})`, `_twilio_transfer` fires, the
call bridges. 6 new unit tests in
`libraries/python/tests/test_pipeline_builtin_tools.py`. Thanks
[@knowsuchagency](https://github.com/knowsuchagency).

- **TypeScript: pipeline mode missing built-in `transfer_call` /
`end_call` tools — parity fix for #115.** Both `new LLMLoop(...)`
call sites in `libraries/typescript/src/stream-handler.ts:1891` and
`:1906` were passing `agent.tools` through unchanged; the built-ins
shipped in `server.ts` (now exported as `TRANSFER_CALL_TOOL` /
`END_CALL_TOOL`) were only injected into the Realtime path at
`server.ts:374`. Added `augmentWithBuiltinHandoffTools` in
`libraries/typescript/src/stream-handler.ts` that mirrors the Python
helper: appends the two built-ins with handler closures that
validate E.164 / default `reason` and dispatch to the existing
telephony bridge methods (`this.deps.bridge.transferCall` /
`endCall`). 8 new unit tests in
`libraries/typescript/tests/pipeline-builtin-tools.test.ts`. Closes
the parity gap surfaced by #115.

- **Docs: `docs/typescript-sdk/events.mdx` advertised the same
non-existent `phone.events.on(PatterEventType.X, handler)` API as
the Python events page — TypeScript parity fix for #114.** The TS
`Patter` class never exposed an `.events` attribute; `EventBus` is
instantiated per `StreamHandler`. Replaced the broken `EventBus`
section with documentation of the APIs that actually exist on the
TypeScript `Patter` class: **Speech-edge events** via the attribute
setters (`onUserSpeechStarted` / `onUserSpeechEnded` /
`onUserSpeechEos` / `onAgentSpeechStarted` / `onAgentSpeechEnded` /
`onLlmToken` / `onAudioOut`, proxied to `this.speechEvents` at
`libraries/typescript/src/client.ts:241-330`) and **Tool events via
`onTranscript`** (tool invocations surface with `role === "tool"`,
`tool_name`, `tool_args`, `tool_result` — payload defined at
`libraries/typescript/src/stream-handler.ts:2988-3010`).

- **Docs: `docs/python-sdk/events.mdx` advertised a non-existent
`phone.events.on(PatterEventType.X, handler)` API that crashed
immediately with `AttributeError: 'Patter' object has no attribute
'events'` (#112, #114).** The `_EventBus` is instantiated per
`StreamHandler` (`libraries/python/getpatter/stream_handler.py:517`)
and never exposed on the `Patter` class. Replaced the broken
`EventBus` section with documentation of the APIs that actually
exist: **Speech-edge events** via the attribute setters on `Patter`
(`on_user_speech_started` / `on_user_speech_ended` /
`on_user_speech_eos` / `on_agent_speech_started` /
`on_agent_speech_ended` / `on_llm_token` / `on_audio_out`, proxied
to `self.speech_events` at
`libraries/python/getpatter/client.py:351-410`) and **Tool events
via `on_transcript`** (tool invocations surface with `role="tool"`,
`tool_name`, `tool_args`, `tool_result` — payload defined at
`libraries/python/getpatter/stream_handler.py:929`). Thanks
[@knowsuchagency](https://github.com/knowsuchagency).

## 0.6.2 (2026-05-25)

### Added
75 changes: 62 additions & 13 deletions docs/typescript-sdk/events.mdx
@@ -18,7 +18,7 @@ Patter emits events at key moments during a call. Register callbacks in `serve()
| `onMessage` | User transcript ready for response (pipeline mode) | `serve()` |
| `onMetrics` | After each conversational turn completes | `serve()` |

For **fine-grained pipeline observability** (every interim transcript, every LLM chunk, every TTS chunk, every tool start) subscribe to the [EventBus](#eventbus) below — it complements these callbacks rather than replacing them.
For **fine-grained pipeline observability** see the [Speech-edge events](#speech-edge-events) and [Tool events via onTranscript](#tool-events-via-ontranscript) sections below — they complement the lifecycle callbacks rather than replacing them.

For **mutating prompts and responses** (RAG augmentation, output validation, PII redaction) use [PipelineHooks](#pipelinehooks-beforellm--afterllm) — they sit *inside* the LLM step rather than firing alongside it.

@@ -241,26 +241,75 @@ await phone.serve({
}
```

## EventBus
## Speech-edge events

The `EventBus` exposes fine-grained pipeline events that don't have first-class callbacks. Subscribe with `events.on(eventType, handler)` from anywhere you have a `Patter` reference.
For turn-taking, TTFT measurement, and barge-in / interrupt observability, set the speech-edge callbacks directly on the `Patter` instance. They proxy to a per-process `SpeechEvents` dispatcher and fire from any in-flight call.

```typescript
import { Patter, PatterEventType } from "getpatter";
phone.onUserSpeechEos = async (ev) => {
// Committed end-of-utterance — anchor TTFT here.
console.log(`EOS via ${ev.trigger} at ${ev.timestamp_ms}ms`);
};

phone.onLlmToken = async (ev) => {
// First LLM token of the turn — TTFT marker.
console.log(`TTFT, model=${ev.model}, t=${ev.timestamp_ms}ms`);
};

phone.events.on(PatterEventType.TRANSCRIPT_PARTIAL, (ev) => console.log("partial:", ev.text));
phone.events.on(PatterEventType.LLM_CHUNK, (ev) => logChunk(ev.call_id, ev.text));
phone.onAgentSpeechEnded = async (ev) => {
const status = ev.interrupted ? "interrupted" : "completed";
console.log(`Turn ${ev.turn_idx} ${status}`);
};

phone.onUserSpeechStarted = async (ev) => { /* raw VAD positive edge */ };
phone.onUserSpeechEnded = async (ev) => { /* raw VAD trailing edge */ };
phone.onAgentSpeechStarted = async (ev) => { /* first wire-time audio chunk */ };
phone.onAudioOut = async (ev) => { /* first TTS audio bytes produced */ };
```

| `PatterEventType` | Fires |
| Attribute | Fires |
|---|---|
| `TRANSCRIPT_PARTIAL` | Every interim STT result (before endpointing). |
| `TRANSCRIPT_FINAL` | Every final STT result (after endpointing). Same payload as `onTranscript`. |
| `LLM_CHUNK` | Every streamed LLM token / chunk. |
| `TTS_CHUNK` | Every TTS audio chunk written to the carrier. |
| `TOOL_CALL_STARTED` | Tool dispatched (paired with the existing `tool_call_completed` you can observe via `onCallEnd`). |
| `onUserSpeechStarted` | Raw VAD positive edge (caller begins speaking). |
| `onUserSpeechEnded` | Raw VAD trailing edge (caller stops speaking). |
| `onUserSpeechEos` | Committed end-of-utterance — anchor TTFT here. |
| `onAgentSpeechStarted` | First wire-time agent audio chunk — turn-start marker for the caller. |
| `onAgentSpeechEnded` | Last agent audio chunk. Payload includes `interrupted` flag for barge-in. |
| `onLlmToken` | First LLM token of the turn — TTFT marker. |
| `onAudioOut` | First TTS audio bytes produced — TTS warmup signal. |

Callbacks are async. Throwing inside a callback logs the error but does not interrupt the call.
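
The TTFT anchoring described above (end-of-utterance to first LLM token) can be sketched as a small tracker. This is a minimal illustration, not part of the SDK: `TimedEvent` and `TtftTracker` are hypothetical names, only the `timestamp_ms` field is taken from the examples above, and the sketch assumes both callbacks report timestamps on the same clock.

```typescript
// Hypothetical minimal event shape: only `timestamp_ms` comes from the
// examples above; the real payload types exported by the SDK may differ.
interface TimedEvent {
  timestamp_ms: number;
}

// Tracks time-to-first-token: the gap between the committed end-of-utterance
// and the first LLM token of the same turn.
class TtftTracker {
  private eosAtMs: number | null = null;
  readonly samplesMs: number[] = [];

  // Intended to be called from `phone.onUserSpeechEos`.
  onEos(ev: TimedEvent): void {
    this.eosAtMs = ev.timestamp_ms;
  }

  // Intended to be called from `phone.onLlmToken`. Returns the TTFT in ms,
  // or null when no end-of-utterance has been recorded for this turn.
  onFirstToken(ev: TimedEvent): number | null {
    if (this.eosAtMs === null) return null;
    const ttft = ev.timestamp_ms - this.eosAtMs;
    this.eosAtMs = null; // reset so a later token cannot reuse a stale anchor
    this.samplesMs.push(ttft);
    return ttft;
  }
}

// Wiring sketch (assumes a `phone` instance as in the snippets above):
//   const tracker = new TtftTracker();
//   phone.onUserSpeechEos = async (ev) => tracker.onEos(ev);
//   phone.onLlmToken = async (ev) => { tracker.onFirstToken(ev); };
```

Resetting the anchor after each sample keeps one measurement per turn, which matches `onLlmToken` firing on the first token only.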

## Tool events via `onTranscript`

Handlers are non-blocking (fire-and-forget). Throwing inside a handler logs the error but does not interrupt the call.
Tool invocations (including the built-in `transfer_call` and `end_call`) surface through the same `onTranscript` callback you pass to `phone.serve(...)`. Filter on `role === "tool"` to handle them:

```typescript
await phone.serve({
agent,
onTranscript: async (ev) => {
if (ev.role === "tool") {
console.log(
`tool=${ev.tool_name} ` +
`args=${JSON.stringify(ev.tool_args)} ` +
`result=${ev.tool_result}`,
);
} else {
console.log(`[${ev.role}] ${ev.text}`);
}
},
});
```

The event payload for tool calls carries:

| Key | Type | Notes |
|---|---|---|
| `role` | `"tool"` | Always `"tool"` for tool events. |
| `tool_name` | `string` | The tool that was dispatched. |
| `tool_args` | `Record<string, unknown>` | Arguments emitted by the LLM. |
| `tool_result` | `string \| null` | Result returned by the tool handler (truncated for log readability). |
| `call_id` | `string` | The active call ID. |
| `text` | `string` | Pre-formatted `tool_name(args) → result` string. |
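
The `role === "tool"` filter can also be expressed as a type guard so the tool-specific keys are only accessed after narrowing. The interfaces below are hypothetical and simply mirror the table above; the types actually exported by the SDK may be named or shaped differently, and the non-tool `role` values are an assumption.

```typescript
// Hypothetical shapes mirroring the payload table; not the SDK's own types.
interface ToolTranscriptEvent {
  role: "tool";
  tool_name: string;
  tool_args: Record<string, unknown>;
  tool_result: string | null;
  call_id: string;
  text: string;
}

interface SpeechTranscriptEvent {
  role: string; // assumed to be values such as "user" or "assistant"
  text: string;
}

type TranscriptEvent = ToolTranscriptEvent | SpeechTranscriptEvent;

// Narrow to the tool payload before touching tool_* keys.
function isToolEvent(ev: TranscriptEvent): ev is ToolTranscriptEvent {
  return ev.role === "tool";
}

// Produce one log line per transcript event.
function formatTranscript(ev: TranscriptEvent): string {
  if (isToolEvent(ev)) {
    const args = JSON.stringify(ev.tool_args);
    return `tool=${ev.tool_name} args=${args} result=${ev.tool_result ?? "null"}`;
  }
  return `[${ev.role}] ${ev.text}`;
}
```

Inside `onTranscript`, a handler could then call `console.log(formatTranscript(ev))` without branching on `role` itself.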

---

4 changes: 2 additions & 2 deletions libraries/typescript/package-lock.json


4 changes: 2 additions & 2 deletions libraries/typescript/src/server.ts
@@ -60,7 +60,7 @@ export interface LocalConfig {

type AIAdapter = OpenAIRealtimeAdapter | ElevenLabsConvAIAdapter;

const TRANSFER_CALL_TOOL = {
export const TRANSFER_CALL_TOOL = {
name: 'transfer_call',
description: 'Transfer the call to a human agent at the specified phone number',
parameters: {
@@ -75,7 +75,7 @@ const TRANSFER_CALL_TOOL = {
},
};

const END_CALL_TOOL = {
export const END_CALL_TOOL = {
name: 'end_call',
description: 'End the current phone call. Use when the conversation is complete or the user says goodbye.',
parameters: {
75 changes: 72 additions & 3 deletions libraries/typescript/src/stream-handler.ts
@@ -23,7 +23,7 @@ import { MCPManager } from './tools/mcp-client';
import type { AgentOptions, Guardrail, HookContext, PipelineMessageHandler, ToolDefinition, VADProvider } from './types';
import type { MetricsStore } from './dashboard/store';
import { getLogger } from './logger';
import { validateTwilioSid } from './server';
import { validateTwilioSid, TRANSFER_CALL_TOOL, END_CALL_TOOL } from './server';
import type { ProviderPricing } from './pricing';
import { SentenceChunker } from './sentence-chunker';
import { PipelineHookExecutor } from './pipeline-hooks';
@@ -117,6 +117,56 @@ function isValidE164(number: string): boolean {
return /^\+[1-9]\d{6,14}$/.test(number);
}

/**
* Augment a tool list with the built-in `transfer_call` / `end_call` tools,
* wired to the telephony-level transfer / hangup callbacks. Used by pipeline
* mode to match the Realtime path's tool surface (Realtime injects the same
* two built-ins at `server.ts` and dispatches them via the bridge in this
* file's tool dispatcher around line 3100). Without this the pipeline LLM
* never sees the built-ins and cannot initiate a transfer or hangup
* regardless of system-prompt instructions. Parity with Python helper
* `_augment_with_builtin_handoff_tools` in `stream_handler.py`.
*
* Built-ins are skipped when the corresponding callback is missing (keeps
* non-telephony test harnesses clean). User-provided tools keep their
* original order; the built-ins are appended.
*/
export function augmentWithBuiltinHandoffTools(
userTools: ToolDefinition[] | null | undefined,
callbacks: {
transferCall?: (number: string) => Promise<void>;
endCall?: (reason: string) => Promise<void>;
},
): ToolDefinition[] {
const out: ToolDefinition[] = [...(userTools ?? [])];
if (callbacks.transferCall) {
const transferCall = callbacks.transferCall;
out.push({
...TRANSFER_CALL_TOOL,
handler: async (args: Record<string, unknown>): Promise<string> => {
const number = typeof args.number === 'string' ? args.number : '';
if (!isValidE164(number)) {
return JSON.stringify({ error: 'Invalid phone number format', status: 'rejected' });
}
await transferCall(number);
return JSON.stringify({ status: 'transferring', to: number });
},
});
}
if (callbacks.endCall) {
const endCall = callbacks.endCall;
out.push({
...END_CALL_TOOL,
handler: async (args: Record<string, unknown>): Promise<string> => {
const reason = typeof args.reason === 'string' ? args.reason : 'conversation_complete';
await endCall(reason);
return JSON.stringify({ status: 'ending', reason });
},
});
}
return out;
}

/**
* Short words / phrases that Whisper (and, less often, Deepgram) routinely
* emit when fed silence or TTS echo on mulaw 8 kHz. Dropping them as turns
@@ -1888,11 +1938,23 @@ export class StreamHandler {
}
// eslint-disable-next-line @typescript-eslint/no-explicit-any
const providerModel = (this.deps.agent.llm as any)?.model ?? '';
// Inject the built-in transfer_call / end_call tools — parity with the
// Realtime path which injects them at `server.ts` and dispatches via
// the bridge in this file's tool dispatcher. Without this, pipeline-mode
// LLMs never see the built-ins and can't initiate a handoff or hangup
// no matter what the system prompt says.
const augmentedTools = augmentWithBuiltinHandoffTools(
this.deps.agent.tools as ToolDefinition[] | null | undefined,
{
transferCall: (number) => this.deps.bridge.transferCall(this.callId, number),
endCall: () => this.deps.bridge.endCall(this.callId, this.ws),
},
);
this.llmLoop = new LLMLoop(
'', // apiKey unused when llmProvider is supplied
providerModel, // propagate so calculateLlmCost can match the price row
resolvedPrompt,
this.deps.agent.tools as ToolDefinition[] | undefined,
augmentedTools,
this.deps.agent.llm,
this.deps.agent.disablePhonePreamble ?? false,
);
@@ -1903,11 +1965,18 @@ export class StreamHandler {
} else if (!this.deps.onMessage && this.deps.config.openaiKey) {
let llmModel = this.deps.agent.model || 'gpt-4o-mini';
if (llmModel.includes('realtime')) llmModel = 'gpt-4o-mini';
const augmentedTools = augmentWithBuiltinHandoffTools(
this.deps.agent.tools as ToolDefinition[] | null | undefined,
{
transferCall: (number) => this.deps.bridge.transferCall(this.callId, number),
endCall: () => this.deps.bridge.endCall(this.callId, this.ws),
},
);
this.llmLoop = new LLMLoop(
this.deps.config.openaiKey,
llmModel,
resolvedPrompt,
this.deps.agent.tools as ToolDefinition[] | undefined,
augmentedTools,
undefined,
this.deps.agent.disablePhonePreamble ?? false,
);