
Commit 13e7f11

etiennep and claude committed
otlp: refactor config API and deprecate legacy handler
This commit improves the OTLP handler API and prepares for v6 by deprecating the legacy alpha handler implementation.

Breaking Changes:
- Rename SDKConfig.Endpoint to EndpointURL to clarify it requires a full URL with scheme (http:// or https://)
- Use WithEndpointURL instead of WithEndpoint to avoid known gRPC bug when using http:// scheme
- Remove enforced defaults for ExportInterval and ExportTimeout, allowing SDK to use its own defaults (60s and 30s respectively)

Deprecations:
- Deprecate otlp.Handler (Alpha since 2022, minimal usage)
- Deprecate otlp.HTTPClient
- Deprecate otlp.NewHTTPClient()

All will be removed in v6.0.0. Migration path provided in deprecation notices with code examples.

Improvements:
- Add Example_fullyConfiguredByEnvironment showing empty SDKConfig usage
- Enhance IMPLEMENTATION_NOTES.md with user-focused temporality explanation
- Clarify instrument caching is internal implementation detail
- Update all examples and tests to use EndpointURL with proper scheme
- Add blank lines around markdown lists to fix linting warnings

Documentation:
- Update HISTORY.md with comprehensive v5.9.0 release notes
- Document breaking changes and migration path
- Add notes about exponential histograms and temporality configuration
- Update all code examples throughout documentation

All tests pass. No functional changes to SDKHandler behavior.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
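
After the `EndpointURL` rename, a bare `host:port` value no longer meets the documented requirement of a full URL with scheme. The following is a minimal sketch of a pre-flight check a caller could run before building an `SDKConfig`. The helper name `validateEndpointURL` is hypothetical and is not part of the `otlp` package; whether `NewSDKHandler` performs a similar check itself is not shown in this commit.

```go
package main

import (
	"fmt"
	"net/url"
)

// validateEndpointURL is a hypothetical helper (not part of the otlp package).
// It returns an error unless raw parses as a URL with an http or https scheme
// and a non-empty host.
func validateEndpointURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("parse endpoint URL %q: %w", raw, err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return fmt.Errorf("endpoint URL %q must start with http:// or https://", raw)
	}
	if u.Host == "" {
		return fmt.Errorf("endpoint URL %q has no host", raw)
	}
	return nil
}

func main() {
	for _, raw := range []string{
		"http://localhost:4317",              // new style: accepted
		"https://collector.example.com:4318", // new style: accepted
		"localhost:4317",                     // old Endpoint style: rejected
	} {
		fmt.Printf("%-36s valid=%t\n", raw, validateEndpointURL(raw) == nil)
	}
}
```

Running the check at startup turns a silently misrouted export into an immediate configuration error, which may be useful while migrating existing `Endpoint` values.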
1 parent 82249fe commit 13e7f11

8 files changed

Lines changed: 262 additions & 97 deletions


HISTORY.md

Lines changed: 25 additions & 7 deletions
@@ -1,10 +1,10 @@
11
# History
22

3-
### v5.9.0 (February 6, 2026)
3+
### v5.9.0 (February 19, 2026)
44

55
Add full OpenTelemetry OTLP exporter support with official SDK integration.
66

7-
**New Feature: OpenTelemetry OTLP Exporter**
7+
**New Feature: OpenTelemetry OTLP Exporter (SDKHandler)**
88

99
The `otlp` package now includes a production-ready `SDKHandler` that uses the
1010
official OpenTelemetry SDK with comprehensive support for modern observability
@@ -17,6 +17,10 @@ requirements:
1717
- **Automatic Resource Detection**: Built-in support for AWS (EC2, ECS, EKS, Lambda),
1818
GCP (Compute Engine), Azure (VM), Kubernetes, host, and process metadata
1919
- **All Metric Types**: Counter, Gauge, and Histogram with proper semantics
20+
- **Exponential Histograms**: Optional support for exponential histogram aggregation
21+
with configurable bucket size and scale
22+
- **Temporality Configuration**: Configurable metric temporality (cumulative or delta)
23+
with cumulative as the default for Prometheus compatibility
2024
- **Tag Preservation**: Automatic conversion of stats tags to OpenTelemetry attributes
2125
- **Production Ready**: Thread-safe instrument caching, proper context handling,
2226
and comprehensive error handling
@@ -40,19 +44,33 @@ stats.Register(handler)
4044

4145
// Or with explicit configuration
4246
handler, err := otlp.NewSDKHandler(ctx, otlp.SDKConfig{
43-
Protocol: otlp.ProtocolGRPC,
44-
Endpoint: "localhost:4317",
47+
Protocol: otlp.ProtocolGRPC,
48+
EndpointURL: "http://localhost:4317",
4549
})
4650
```
4751

4852
**Implementation Details:**
4953

50-
- Gauges use `UpDownCounter` with delta calculation to maintain absolute value
51-
semantics (workaround until stable OTel SDK adds Gauge instrument)
54+
- Gauges use native `Float64Gauge` instrument for instantaneous value recording
5255
- Background context for metric recording to prevent context cancellation issues
53-
- Lock-free reads for instrument lookup in the hot path
56+
- Efficient two-level locking pattern for instrument caching (read locks in hot path)
57+
- Cumulative temporality by default (Prometheus-compatible)
5458
- Comprehensive documentation including cloud resource detector examples
5559

60+
**Breaking Changes:**
61+
62+
- Config field renamed: `Endpoint` → `EndpointURL` (must include `http://` or `https://` scheme)
63+
- SDK defaults are now used instead of hardcoded values (ExportInterval: 60s, ExportTimeout: 30s)
64+
65+
**Deprecated:**
66+
67+
- `otlp.Handler` is now deprecated in favor of `otlp.SDKHandler` (will be removed in v6.0.0)
68+
- `otlp.HTTPClient` is now deprecated in favor of `otlp.SDKHandler` with `ProtocolHTTPProtobuf` (will be removed in v6.0.0)
69+
- `otlp.NewHTTPClient()` is now deprecated (will be removed in v6.0.0)
70+
71+
The legacy `Handler` has been marked as Alpha since 2022 and has minimal to zero usage.
72+
Migration is straightforward - see deprecation notices in code for examples.
73+
5674
See the [otlp package documentation](./otlp/README.md) for complete details and examples.
5775

5876
### v5.8.0 (December 15, 2025)

otlp/IMPLEMENTATION_NOTES.md

Lines changed: 86 additions & 26 deletions
@@ -46,17 +46,19 @@ gauge.Record(ctx, 42.0, opts)
4646

4747
### 3. Instrument Caching
4848

49-
**Implementation**: Thread-safe two-level locking pattern
49+
**Note**: This is an internal implementation detail - users don't need to worry about this.
50+
51+
**Implementation**: Thread-safe two-level locking pattern for efficient instrument reuse
5052
```go
51-
// Fast path: read lock for lookup
53+
// Fast path: read lock for lookup (common case - instrument already exists)
5254
h.mu.RLock()
5355
inst, exists := h.instruments[metricName]
5456
h.mu.RUnlock()
5557

56-
// Slow path: write lock only if creating new instrument
58+
// Slow path: write lock only if creating new instrument (rare - first time seeing this metric)
5759
if !exists {
5860
h.mu.Lock()
59-
// Double-check after acquiring write lock
61+
// Double-check after acquiring write lock (another goroutine may have created it)
6062
inst, exists = h.instruments[metricName]
6163
if !exists {
6264
inst = h.createInstruments(meter, metricName, field.Type())
@@ -66,7 +68,13 @@ if !exists {
6668
}
6769
```
6870

69-
**Why**: Instruments are created once per metric name and reused. This pattern minimizes lock contention in the hot path (metric recording) while ensuring thread-safety during instrument creation.
71+
**Why**: OpenTelemetry instruments are expensive to create but cheap to reuse. This pattern:
72+
73+
- **Minimizes lock contention** in the hot path (metric recording uses fast read locks)
74+
- **Ensures thread-safety** during instrument creation (write locks only when needed)
75+
- **Scales well** under concurrent load (multiple goroutines can look up instruments simultaneously)
76+
77+
The double-check pattern prevents duplicate instrument creation when multiple goroutines race to create the same instrument for the first time.
7078

7179
### 4. Attribute Handling
7280

@@ -227,38 +235,90 @@ if config.ExponentialHistogram {
227235

228236
## Temporality Configuration
229237

230-
### Default: Cumulative Temporality
238+
### What is Temporality?
231239

232-
**Decision**: Use cumulative temporality for all metric instruments (Prometheus-compatible)
240+
Temporality determines whether metrics are reported as **cumulative totals** (since application start) or **deltas** (change since last export).
233241

234-
**Implementation**: OTLP exporters use `DefaultTemporalitySelector` by default
235-
```go
236-
// If no TemporalitySelector is provided, the exporter uses:
237-
// DefaultTemporalitySelector -> CumulativeTemporality for all instruments
238-
```
242+
**Example - Request Counter**:
243+
- **Cumulative**: Export "1000 total requests" → "1150 total requests" → "1320 total requests"
244+
- **Delta**: Export "1000 new requests" → "150 new requests" → "170 new requests"
239245

240-
**Why**:
241-
- **Prometheus compatibility**: Prometheus expects cumulative counters
242-
- **Standard practice**: Most OTLP backends expect cumulative semantics
243-
- **Query simplicity**: Easier to query and understand (total since start)
244-
- **No data loss**: Cumulative data can be converted to delta, but not vice versa
246+
### Why We Use Cumulative Temporality (Default)
247+
248+
This handler uses **cumulative temporality** for all metrics by default. Here's why:
249+
250+
#### Compatibility with Prometheus and Standard Backends
251+
252+
- Prometheus expects cumulative counters and will graph them correctly
253+
- Most OTLP backends (Grafana, Datadog, etc.) work best with cumulative data
254+
- Industry standard practice in the OpenTelemetry ecosystem
255+
256+
#### Reliability and Query Simplicity
257+
258+
- **No data loss on export failures**: If an export fails, the next one still has complete data
259+
- **Easier to query**: "How many total requests?" vs "Sum all deltas"
260+
- **Converts to delta easily**: Backend can calculate rates from cumulative, but can't reconstruct cumulative from deltas
261+
262+
#### Lower Cognitive Load
263+
264+
- Counters show totals since start - intuitive and matches mental model
265+
- Histograms show full distribution of all observations
266+
267+
### How It Works
268+
269+
**Cumulative semantics by instrument type**:
245270

246-
**Cumulative semantics by instrument**:
247-
- **Counter**: Total count since application start (e.g., total requests)
248-
- **Histogram**: Cumulative distribution of all observed values
249-
- **UpDownCounter/Gauge**: Current absolute value (naturally stateful)
271+
- **Counter** (`stats.Incr`, `stats.Add`): Total count since application start
272+
- Example: `requests.total` reports 1000, then 1150, then 1320
273+
274+
- **Histogram** (`stats.Observe`): Cumulative distribution of all observed values
275+
- Example: Latency histogram includes all requests since start
276+
277+
- **Gauge** (`stats.Set`): Current absolute value (temporality doesn't apply)
278+
- Example: `memory.used` always reports current memory usage
279+
280+
### Trade-offs
281+
282+
#### Advantages of Cumulative
283+
284+
- ✅ Prometheus and Grafana work out-of-box
285+
- ✅ Resilient to export failures (no data loss)
286+
- ✅ Backend can derive rates automatically
287+
- ✅ Simpler mental model for most users
288+
289+
#### Disadvantages of Cumulative
290+
291+
- ❌ Slightly higher memory usage for high-cardinality counters
292+
- ❌ Backend must calculate deltas for rate queries (minor overhead)
293+
- ❌ Some specialized telemetry systems expect delta temporality
294+
295+
#### When Delta Might Be Better
296+
297+
- Your backend explicitly requires delta temporality (check docs)
298+
- Extreme cardinality where cumulative memory overhead matters
299+
- Building a custom metrics pipeline optimized for deltas
300+
301+
### Changing Temporality (Advanced)
302+
303+
If you need delta temporality, you can override the default:
250304

251-
**User override**: Advanced users can specify custom temporality via `TemporalitySelector`:
252305
```go
253306
handler, err := otlp.NewSDKHandler(ctx, otlp.SDKConfig{
307+
Protocol: otlp.ProtocolGRPC,
308+
EndpointURL: "http://localhost:4317",
309+
// Use delta temporality for all instruments
254310
TemporalitySelector: sdkmetric.DeltaTemporalitySelector,
255311
})
256312
```
257313

258-
**Trade-offs**:
259-
- **Memory**: Cumulative uses slightly more memory than delta for high-cardinality counters
260-
- **Backend requirements**: Some specialized backends prefer delta temporality
261-
- **Migration**: Changing temporality requires coordinated backend configuration changes
314+
**Available selectors**:
315+
316+
- `sdkmetric.DefaultTemporalitySelector` - Cumulative for all (default, recommended)
317+
- `sdkmetric.CumulativeTemporalitySelector` - Cumulative for all (explicit)
318+
- `sdkmetric.DeltaTemporalitySelector` - Delta for all
319+
- `sdkmetric.LowMemoryTemporalitySelector` - Delta for Counters/Histograms, Cumulative for UpDownCounters
320+
321+
**⚠️ Warning**: Changing temporality requires updating your backend configuration and queries. Most users should stick with the default cumulative temporality.
262322

263323
## Future Enhancements
264324

otlp/README.md

Lines changed: 14 additions & 14 deletions
@@ -38,7 +38,7 @@ func main() {
3838
// Create handler with gRPC transport
3939
handler, err := otlp.NewSDKHandler(ctx, otlp.SDKConfig{
4040
Protocol: otlp.ProtocolGRPC,
41-
Endpoint: "localhost:4317",
41+
EndpointURL: "http://localhost:4317",
4242
})
4343
if err != nil {
4444
log.Fatal(err)
@@ -59,7 +59,7 @@ func main() {
5959
```go
6060
handler, err := otlp.NewSDKHandler(ctx, otlp.SDKConfig{
6161
Protocol: otlp.ProtocolHTTPProtobuf,
62-
Endpoint: "http://localhost:4318",
62+
EndpointURL: "http://localhost:4318",
6363
})
6464
```
6565

@@ -83,7 +83,7 @@ type SDKConfig struct {
8383
// Protocol: "grpc" or "http/protobuf" (default: "grpc")
8484
Protocol Protocol
8585

86-
// Endpoint: OTLP collector endpoint
86+
// EndpointURL: Full OTLP collector endpoint URL (with http:// or https:// scheme)
8787
// gRPC: "localhost:4317"
8888
// HTTP: "http://localhost:4318"
8989
Endpoint string
@@ -150,7 +150,7 @@ import (
150150

151151
handler, err := otlp.NewSDKHandler(ctx, otlp.SDKConfig{
152152
Protocol: otlp.ProtocolGRPC,
153-
Endpoint: "collector.example.com:4317",
153+
EndpointURL: "http://collector.example.com:4317",
154154
GRPCOptions: []otlpmetricgrpc.Option{
155155
// Use TLS
156156
otlpmetricgrpc.WithTLSCredentials(
@@ -173,7 +173,7 @@ import "go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp"
173173

174174
handler, err := otlp.NewSDKHandler(ctx, otlp.SDKConfig{
175175
Protocol: otlp.ProtocolHTTPProtobuf,
176-
Endpoint: "https://collector.example.com:4318",
176+
EndpointURL: "https://collector.example.com:4318",
177177
HTTPOptions: []otlpmetrichttp.Option{
178178
// Add custom headers
179179
otlpmetrichttp.WithHeaders(map[string]string{
@@ -209,7 +209,7 @@ res, err := resource.New(ctx,
209209

210210
handler, err := otlp.NewSDKHandler(ctx, otlp.SDKConfig{
211211
Protocol: otlp.ProtocolGRPC,
212-
Endpoint: "localhost:4317",
212+
EndpointURL: "http://localhost:4317",
213213
Resource: res,
214214
})
215215
```
@@ -377,7 +377,7 @@ res, err := resource.New(ctx,
377377

378378
handler, err := otlp.NewSDKHandler(ctx, otlp.SDKConfig{
379379
Protocol: otlp.ProtocolGRPC,
380-
Endpoint: "localhost:4317",
380+
EndpointURL: "http://localhost:4317",
381381
Resource: res,
382382
})
383383
```
@@ -428,7 +428,7 @@ func main() {
428428
// Create handler with detected resources
429429
handler, err := otlp.NewSDKHandler(ctx, otlp.SDKConfig{
430430
Protocol: otlp.ProtocolGRPC,
431-
Endpoint: "collector.us-west-2.amazonaws.com:4317",
431+
EndpointURL: "http://collector.us-west-2.amazonaws.com:4317",
432432
Resource: res,
433433
})
434434
if err != nil {
@@ -452,13 +452,13 @@ Send metrics to multiple destinations:
452452
// Send to local collector
453453
localHandler, _ := otlp.NewSDKHandler(ctx, otlp.SDKConfig{
454454
Protocol: otlp.ProtocolGRPC,
455-
Endpoint: "localhost:4317",
455+
EndpointURL: "http://localhost:4317",
456456
})
457457

458458
// Send to cloud service
459459
cloudHandler, _ := otlp.NewSDKHandler(ctx, otlp.SDKConfig{
460460
Protocol: otlp.ProtocolHTTPProtobuf,
461-
Endpoint: "https://api.example.com/v1/metrics",
461+
EndpointURL: "https://api.example.com/v1/metrics",
462462
HTTPOptions: []otlpmetrichttp.Option{
463463
otlpmetrichttp.WithHeaders(map[string]string{
464464
"Authorization": "Bearer " + apiKey,
@@ -545,7 +545,7 @@ By default, histograms use explicit bucket aggregation with fixed bucket boundar
545545
```go
546546
handler, err := otlp.NewSDKHandler(ctx, otlp.SDKConfig{
547547
Protocol: otlp.ProtocolGRPC,
548-
Endpoint: "localhost:4317",
548+
EndpointURL: "http://localhost:4317",
549549
ExponentialHistogram: true, // Enable exponential histograms
550550
})
551551
```
@@ -561,7 +561,7 @@ handler, err := otlp.NewSDKHandler(ctx, otlp.SDKConfig{
561561
```go
562562
handler, err := otlp.NewSDKHandler(ctx, otlp.SDKConfig{
563563
Protocol: otlp.ProtocolGRPC,
564-
Endpoint: "localhost:4317",
564+
EndpointURL: "http://localhost:4317",
565565
ExponentialHistogram: true,
566566
ExponentialHistogramMaxSize: 160, // Max buckets (default: 160)
567567
ExponentialHistogramMaxScale: 20, // Max resolution (default: 20)
@@ -596,7 +596,7 @@ For advanced use cases, you can configure custom temporality:
596596
```go
597597
handler, err := otlp.NewSDKHandler(ctx, otlp.SDKConfig{
598598
Protocol: otlp.ProtocolGRPC,
599-
Endpoint: "localhost:4317",
599+
EndpointURL: "http://localhost:4317",
600600
TemporalitySelector: sdkmetric.DeltaTemporalitySelector, // Use delta for all metrics
601601
})
602602
```
@@ -624,7 +624,7 @@ The handler uses **native OpenTelemetry SDK batching** via `PeriodicReader`:
624624
```go
625625
handler, err := otlp.NewSDKHandler(ctx, otlp.SDKConfig{
626626
Protocol: otlp.ProtocolGRPC,
627-
Endpoint: "localhost:4317",
627+
EndpointURL: "http://localhost:4317",
628628
ExportInterval: 5 * time.Second, // Export every 5 seconds
629629
ExportTimeout: 15 * time.Second, // 15 second timeout per export
630630
})

otlp/client.go

Lines changed: 20 additions & 0 deletions
@@ -12,10 +12,28 @@ import (
1212
"google.golang.org/protobuf/proto"
1313
)
1414

15+
// Deprecated: Client is deprecated and will be removed in v6.
16+
// It is only used by the deprecated Handler. Use SDKHandler instead.
1517
type Client interface {
1618
Handle(context.Context, *colmetricpb.ExportMetricsServiceRequest) error
1719
}
1820

21+
// Deprecated: HTTPClient is deprecated and will be removed in v6.
22+
// Use SDKHandler with ProtocolHTTPProtobuf instead, which provides the official
23+
// OpenTelemetry SDK with retry logic, proper timeout handling, and full OTLP support.
24+
//
25+
// Migration example:
26+
//
27+
// // Old (deprecated)
28+
// client := otlp.NewHTTPClient("http://localhost:4318/v1/metrics")
29+
// handler := &otlp.Handler{Client: client}
30+
//
31+
// // New (recommended)
32+
// handler, err := otlp.NewSDKHandler(ctx, otlp.SDKConfig{
33+
// Protocol: otlp.ProtocolHTTPProtobuf,
34+
// EndpointURL: "http://localhost:4318",
35+
// })
36+
//
1937
// HTTPClient implements the Client interface and is used to export metrics to
2038
// an OpenTelemetry Collector through the HTTP interface.
2139
//
@@ -26,6 +44,8 @@ type HTTPClient struct {
2644
endpoint string
2745
}
2846

47+
// Deprecated: NewHTTPClient is deprecated. Use SDKHandler with ProtocolHTTPProtobuf instead.
48+
// See HTTPClient documentation for migration example.
2949
func NewHTTPClient(endpoint string) *HTTPClient {
3050
return &HTTPClient{
3151
// TODO: add sane default timeout configuration.
