[BOT ISSUE] Anthropic integration does not capture time_to_first_token metric #53

@braintrust-bot

Summary

The Anthropic integration (trace/contrib/anthropic) does not capture the time_to_first_token metric in either the streaming or the non-streaming path. Both OpenAI tracers in the same SDK (chat completions and responses) capture this metric consistently.

What is missing

1. No startTime field on messagesTracer

The messagesTracer struct in trace/contrib/anthropic/messages.go (lines 20–24) has no startTime field:

type messagesTracer struct {
    cfg       *middlewareConfig
    streaming bool
    metadata  map[string]any
}

Compare with the OpenAI chatCompletionsTracer in trace/contrib/openai/chatcompletions.go (lines 20–25):

type chatCompletionsTracer struct {
    cfg       *middlewareConfig
    streaming bool
    metadata  map[string]any
    startTime time.Time  // <-- missing from Anthropic
}

2. Streaming path does not track first chunk arrival

In the Anthropic streaming handler parseStreamingResponse (messages.go lines 115–189), there is no tracking of when the first SSE data chunk arrives. The OpenAI streaming handler (chatcompletions.go lines 109–166) captures this:

if timeToFirstToken == 0 {
    timeToFirstToken = time.Since(ct.startTime)
}

and writes it to metrics:

metrics["time_to_first_token"] = timeToFirstToken.Seconds()

The Anthropic handler has no equivalent logic.

3. Non-streaming path also missing

The OpenAI non-streaming handler records TTFT as full response latency (chatcompletions.go line 277), providing a consistent metric across modes. The Anthropic non-streaming handler (parseResponse / handleMessageResponse, lines 314–369) does not record any timing metric.
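Following the OpenAI convention of recording full response latency as TTFT in the non-streaming path, the equivalent Anthropic fix could look like the sketch below. recordLatency and the blocking call are hypothetical stand-ins for the real handler and Messages request.

```go
package main

import (
	"fmt"
	"time"
)

// recordLatency times a blocking call and stores the full response
// latency as time_to_first_token, matching the OpenAI non-streaming
// behavior so the metric is consistent across modes.
func recordLatency(metrics map[string]any, call func() string) {
	start := time.Now()
	_ = call()
	metrics["time_to_first_token"] = time.Since(start).Seconds()
}

func main() {
	metrics := map[string]any{}
	recordLatency(metrics, func() string {
		time.Sleep(time.Millisecond) // stand-in for the Messages request
		return "response"
	})
	fmt.Println(metrics["time_to_first_token"].(float64) > 0) // true
}
```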

4. OpenAI has test coverage for TTFT; Anthropic does not

The OpenAI integration has explicit test assertions for time_to_first_token (traceopenai_test.go lines 247–252, 344; chatcompletions_test.go line 1092). The Anthropic test suite has no TTFT assertions.

Impact

time_to_first_token is a key latency metric for LLM observability, especially for streaming use cases. Users tracing Anthropic calls through Braintrust see token counts and content but not TTFT, while equivalent OpenAI traces include it. This creates an inconsistent observability experience across providers.

Braintrust docs status

Braintrust docs state the Anthropic integration provides "metric collection (including cached tokens)" during streaming. TTFT is not explicitly mentioned for any provider. The Braintrust observability docs mention "Token counts, latency, and cost" as viewable metrics. Status: unclear (latency metrics are mentioned generically but TTFT is not called out by name).

Upstream sources

  • Anthropic streaming API docs: https://docs.anthropic.com/en/api/messages-streaming
  • Anthropic streaming uses SSE with event types message_start, content_block_start, content_block_delta, message_delta, etc.
  • The first content_block_delta event marks the arrival of the first generated token, making TTFT measurable from the same SSE stream already being parsed

Local repo files inspected

  • trace/contrib/anthropic/messages.go — messagesTracer struct (no startTime), parseStreamingResponse (no TTFT), handleMessageResponse (no TTFT)
  • trace/contrib/openai/chatcompletions.go — reference implementation with startTime and time_to_first_token in both paths
  • trace/contrib/openai/responses.go — reference implementation with startTime and time_to_first_token in both paths
  • trace/contrib/openai/traceopenai_test.go — TTFT test assertions (lines 247–252, 344)
  • trace/contrib/openai/chatcompletions_test.go — TTFT test assertion (line 1092)
  • trace/contrib/anthropic/traceanthropic_test.go — no TTFT assertions
