feat: Add internal event processing OTEL metrics#617

Merged
keelerm84 merged 10 commits into feat/fdv2 from mk/sdk-2104/enhanced-telemetry
Apr 2, 2026

@keelerm84 keelerm84 commented Mar 31, 2026

Establish an EventMetrics interface pattern for reporting internal
event processing telemetry via OpenTelemetry. Each subsystem defines
its own interface; the metrics package provides an OTEL-backed
implementation (EventMetricsRecorder) that satisfies both the ld-relay
and go-sdk-events interfaces via Go structural typing.
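
A minimal sketch of the structural-typing idea, assuming hypothetical method sets: only `RecordPendingEvents` is named in this PR, and `RecordEventsDropped`, `RecordEventsSent`, and the toy `Recorder` type are illustrative stand-ins, not the real `EventMetricsRecorder` API. Each consumer declares the interface it needs, and one concrete type satisfies both without importing either declaration.

```go
package main

import "fmt"

// RelayEventMetrics stands in for the interface the relay's events package
// would declare. The method set is an assumption made for illustration.
type RelayEventMetrics interface {
	RecordPendingEvents(count int)
	RecordEventsDropped(count int)
}

// SDKEventMetrics stands in for the interface go-sdk-events would declare.
// Again, the method set is hypothetical.
type SDKEventMetrics interface {
	RecordPendingEvents(count int)
	RecordEventsSent(count int, payloadBytes int)
}

// Recorder is a toy replacement for the OTEL-backed recorder. It stores plain
// integers instead of writing to OpenTelemetry instruments.
type Recorder struct {
	pending, dropped, sent, sentBytes int
}

func (r *Recorder) RecordPendingEvents(count int) { r.pending = count }
func (r *Recorder) RecordEventsDropped(count int) { r.dropped += count }
func (r *Recorder) RecordEventsSent(count int, payloadBytes int) {
	r.sent += count
	r.sentBytes += payloadBytes
}

// Compile-time checks: the build fails if Recorder stops satisfying either
// interface, even though Recorder never names them in its own declaration.
var (
	_ RelayEventMetrics = (*Recorder)(nil)
	_ SDKEventMetrics   = (*Recorder)(nil)
)

func main() {
	r := &Recorder{}
	var relay RelayEventMetrics = r
	var sdk SDKEventMetrics = r

	relay.RecordEventsDropped(2)
	sdk.RecordEventsSent(10, 2048)
	sdk.RecordPendingEvents(3)

	fmt.Println(r.pending, r.dropped, r.sent, r.sentBytes) // prints "3 2 10 2048"
}
```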

New metrics under the events.received namespace:

  • events.received.bytes: bytes of event data received from SDKs
    (renamed from events.ingested.bytes)

New metrics under the events.sent namespace:

  • events.sent.count: events successfully delivered to LaunchDarkly
  • events.sent.bytes: payload bytes delivered (pre-compression)
  • events.sent.failures: events in batches that failed after retries
  • events.sent.dropped: events discarded due to capacity overflow
  • events.sent.pending: current number of events buffered (gauge)

All metrics are recorded from both the verbatim relay
(HTTPEventPublisher) and the summarizing relay (go-sdk-events
DefaultEventProcessor) paths. A temporary go.mod replace directive
points to the local go-sdk-events worktree for development.


Note

Medium Risk
Adds new OpenTelemetry instruments and wires them into the async event forwarding path (verbatim + summarizing), which could affect event pipeline performance/behavior if mis-instrumented, but is largely additive and guarded with no-op defaults.

Overview
Adds internal event-processing telemetry via a new events.EventMetrics interface, with an OTEL-backed EventMetricsRecorder that tracks event queue depth, dropped events, bytes sent, successful sends, and failed sends (including statusCode).

Renames the existing ingress metric from launchdarkly.relay.events.ingested.bytes to launchdarkly.relay.events.received.bytes, updates middleware/docs/tests accordingly, and threads the metrics recorder into both the verbatim (HTTPEventPublisher) and summarizing (go-sdk-events processor) relay paths.

Also updates CI to grant integration-test workflow id-token/contents permissions and bumps go-sdk-events to v3.6.0 alongside related module tidy changes.

Written by Cursor Bugbot for commit 26f084d. This will update automatically on new commits.

@keelerm84 keelerm84 requested a review from a team as a code owner March 31, 2026 16:59
@keelerm84 keelerm84 requested a review from kinyoklion March 31, 2026 17:00

The test helper was manually constructing a partial Instruments struct
with only the original 4 fields, leaving the 5 new event metric
instruments as nil. This could cause panics if any test exercises
EventMetricsRecorder through this helper, since the recorder
nil-checks the struct pointer but not individual instrument fields.

Switch to NewInstrumentsForTest which creates a complete Instruments
struct with all fields populated. Add TestEventMetricsRecorderViaTestHelper
to exercise the recorder through the test helper and catch any future
regressions.
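
The failure mode described in this commit can be sketched as follows. This assumes a simplified `instruments` struct with hypothetical field names and a toy `counter` type in place of real OTEL instruments; `newInstrumentsForTest` only mimics the role of `NewInstrumentsForTest`.

```go
package main

import "fmt"

// counter is a stand-in for an OTEL instrument.
type counter struct{ total int64 }

func (c *counter) Add(n int64) { c.total += n }

// instruments mirrors the idea of the Instruments struct.
// Field names are hypothetical.
type instruments struct {
	receivedBytes *counter
	sentCount     *counter
	dropped       *counter
}

// newInstrumentsForTest populates every field, so adding a new instrument
// only requires updating this one constructor.
func newInstrumentsForTest() *instruments {
	return &instruments{
		receivedBytes: &counter{},
		sentCount:     &counter{},
		dropped:       &counter{},
	}
}

type recorder struct{ inst *instruments }

// RecordEventsDropped guards the struct pointer only, not individual fields.
func (r *recorder) RecordEventsDropped(n int64) {
	if r.inst == nil {
		return
	}
	r.inst.dropped.Add(n) // nil-pointer panic if the field was never set
}

// panics reports whether f panicked.
func panics(f func()) (panicked bool) {
	defer func() { panicked = recover() != nil }()
	f()
	return false
}

func main() {
	// Partial construction: only one of the fields is populated.
	partial := &recorder{inst: &instruments{receivedBytes: &counter{}}}
	complete := &recorder{inst: newInstrumentsForTest()}

	fmt.Println(panics(func() { partial.RecordEventsDropped(1) }))  // prints "true"
	fmt.Println(panics(func() { complete.RecordEventsDropped(1) })) // prints "false"
}
```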

The mock's fields are written by the event loop goroutine and flush
goroutines while tests read them concurrently via assert.Eventually.
Add a sync.Mutex and thread-safe accessor methods to prevent races
under go test -race.

When the publisher receives an unrecoverable failure and clears its
queues, the pending events gauge was not being reset to 0, leaving a
stale non-zero value in the metric.
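
The fix can be illustrated with a toy publisher. This is a sketch under assumptions: `publisher`, `enqueue`, `handleUnrecoverableFailure`, and `lastValue` are hypothetical names, not the actual HTTPEventPublisher internals.

```go
package main

import "fmt"

// pendingGauge is the slice of the metrics interface this sketch needs.
type pendingGauge interface {
	RecordPendingEvents(count int)
}

// lastValue remembers the most recently reported gauge value.
type lastValue struct{ value int }

func (g *lastValue) RecordPendingEvents(count int) { g.value = count }

// publisher is a simplified stand-in for the verbatim relay publisher.
type publisher struct {
	queue    [][]byte
	metrics  pendingGauge
	disabled bool
}

func (p *publisher) enqueue(event []byte) {
	if p.disabled {
		return
	}
	p.queue = append(p.queue, event)
	p.metrics.RecordPendingEvents(len(p.queue))
}

// handleUnrecoverableFailure clears the queue and, crucially, reports the
// new depth so the gauge does not keep its last non-zero value.
func (p *publisher) handleUnrecoverableFailure() {
	p.disabled = true
	p.queue = nil
	p.metrics.RecordPendingEvents(0)
}

func main() {
	g := &lastValue{}
	p := &publisher{metrics: g}

	p.enqueue([]byte(`{"kind":"custom"}`))
	p.enqueue([]byte(`{"kind":"identify"}`))
	fmt.Println(g.value) // prints "2"

	p.handleUnrecoverableFailure()
	fmt.Println(g.value) // prints "0"
}
```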

Run gofmt on metrics package files and preallocate the opts slice in
newEventVerbatimRelay with the correct capacity.

…race

attribute.NewSet() sorts the input slice in place, which caused a data
race when multiple goroutines concurrently accessed the shared envKVs
slice (the event loop goroutine via RecordPendingEvents and HTTP
handler goroutines via buildRequestAttributes).

Fix by making a private copy of envKVs at recorder construction time
and pre-computing the attribute.Set once. Methods that only need env
attributes reuse the pre-computed set. RecordEventsFailedSend still
builds a fresh set per call since it adds the statusCode attribute,
but copies from the recorder's private slice which is safe.
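
The copy-then-precompute approach can be sketched with the standard library alone. Here `kv` and `newSet` are stand-ins for `attribute.KeyValue` and `attribute.NewSet`; `newSet` sorts its input in place to mirror the behavior this commit describes, and the attribute keys other than `statusCode` are made up for the example.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// kv is a stand-in for attribute.KeyValue.
type kv struct{ Key, Value string }

// newSet is a stand-in for attribute.NewSet. It sorts the caller's slice in
// place (the hazard described above) and returns an independent copy.
func newSet(kvs []kv) []kv {
	sort.Slice(kvs, func(i, j int) bool { return kvs[i].Key < kvs[j].Key })
	return append([]kv(nil), kvs...)
}

type recorder struct {
	envKVs []kv // private copy, never handed to newSet after construction
	envSet []kv // computed once and reused by env-only methods
}

func newRecorder(envKVs []kv) *recorder {
	private := append([]kv(nil), envKVs...) // the caller's slice is left untouched
	return &recorder{envKVs: private, envSet: newSet(private)}
}

// failedSendAttrs builds a fresh slice per call because it adds statusCode.
// It only reads r.envKVs, so concurrent calls do not mutate shared state.
func (r *recorder) failedSendAttrs(statusCode int) []kv {
	kvs := make([]kv, 0, len(r.envKVs)+1)
	kvs = append(kvs, r.envKVs...)
	kvs = append(kvs, kv{Key: "statusCode", Value: strconv.Itoa(statusCode)})
	return newSet(kvs)
}

func main() {
	shared := []kv{{Key: "relayId", Value: "r1"}, {Key: "env", Value: "prod"}}
	r := newRecorder(shared)

	fmt.Println(shared[0].Key)          // prints "relayId": caller's order preserved
	fmt.Println(r.envSet[0].Key)        // prints "env"
	fmt.Println(r.failedSendAttrs(503)) // sorted set including statusCode
}
```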

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Align the struct field, function, and doc comment names with the
renamed metric (events.received.bytes). Also remove a duplicated
doc comment on EventMetricsRecorder.

Add a no-op implementation of EventMetrics used as the default when no
metrics implementation is provided. Removes all nil checks at call
sites in the publisher and dispatcher, following the null object
pattern.
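
A minimal sketch of the null object pattern as applied here, assuming a hypothetical two-method `EventMetrics` interface and a toy `publisher`; the real interface has more methods and the real constructor differs. The default is installed once at construction, so call sites invoke the metrics methods unconditionally.

```go
package main

import "fmt"

// EventMetrics is a reduced, hypothetical version of the interface.
type EventMetrics interface {
	RecordPendingEvents(count int)
	RecordEventsDropped(count int)
}

// noopEventMetrics satisfies EventMetrics and discards everything.
type noopEventMetrics struct{}

func (noopEventMetrics) RecordPendingEvents(int) {}
func (noopEventMetrics) RecordEventsDropped(int) {}

type publisher struct {
	metrics  EventMetrics
	queue    []string
	capacity int
}

// newPublisher substitutes the no-op implementation when none is supplied,
// which is the only place a nil check is needed.
func newPublisher(capacity int, metrics EventMetrics) *publisher {
	if metrics == nil {
		metrics = noopEventMetrics{}
	}
	return &publisher{metrics: metrics, capacity: capacity}
}

// enqueue reports whether the event was accepted. Note the absence of
// "if p.metrics != nil" guards around the metrics calls.
func (p *publisher) enqueue(event string) bool {
	if len(p.queue) >= p.capacity {
		p.metrics.RecordEventsDropped(1)
		return false
	}
	p.queue = append(p.queue, event)
	p.metrics.RecordPendingEvents(len(p.queue))
	return true
}

func main() {
	p := newPublisher(1, nil) // no metrics implementation provided
	fmt.Println(p.enqueue("first"))  // prints "true"
	fmt.Println(p.enqueue("second")) // prints "false", dropped without a panic
}
```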
@keelerm84 keelerm84 merged commit e3403b6 into feat/fdv2 Apr 2, 2026
14 of 16 checks passed
@keelerm84 keelerm84 deleted the mk/sdk-2104/enhanced-telemetry branch April 2, 2026 15:36