Surfaced by the Firefly platform observability E2E audit (2026-06-25, firefly-prod).
H6 — No distributed trace propagation
Every trace is confined to a single service; there is no cross-service correlation.
Evidence (GreptimeDB traces.opentelemetry_traces, firefly-prod):
- Span kinds present: only SPAN_KIND_SERVER and SPAN_KIND_INTERNAL; 0 CLIENT/PRODUCER/CONSUMER spans.
- Every trace_id maps to exactly one service_name (0 multi-service traces over 15 min).
- Instrumentation scope is org.springframework.boot, not the OTel Java agent.
Without CLIENT spans and W3C traceparent propagation, a request cannot be followed across service boundaries, so cross-service root-cause diagnosis is not possible.
Ask: enable client auto-instrumentation (RestTemplate/WebClient/Feign/messaging) and W3C trace-context propagation in the observability foundation so outbound calls create CLIENT/PRODUCER spans and propagate traceparent.
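For reference, a minimal sketch of what propagation means on the wire, using only the JDK. It is not the foundation's implementation: the class name, method names and target URL are hypothetical, and in practice the instrumentation library would create the CLIENT span and inject the header itself. The header layout (version, 32-hex trace id, 16-hex parent span id, 2-hex flags) follows the W3C Trace Context format.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.security.SecureRandom;

// Hypothetical illustration only; real services should rely on auto-instrumentation.
final class TraceparentSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Returns byteCount random bytes as lowercase hex (8 bytes -> 16 hex chars).
    static String randomHex(int byteCount) {
        byte[] bytes = new byte[byteCount];
        RANDOM.nextBytes(bytes);
        StringBuilder sb = new StringBuilder(byteCount * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // W3C traceparent layout: version "00", trace id, parent (span) id, trace flags.
    static String traceparent(String traceId, String clientSpanId, boolean sampled) {
        return "00-" + traceId + "-" + clientSpanId + "-" + (sampled ? "01" : "00");
    }

    // Builds an outbound request that keeps the caller's trace id and
    // carries a fresh span id standing in for the CLIENT span.
    static HttpRequest outbound(URI target, String traceId, boolean sampled) {
        String clientSpanId = randomHex(8);
        return HttpRequest.newBuilder(target)
                .header("traceparent", traceparent(traceId, clientSpanId, sampled))
                .GET()
                .build();
    }
}
```

When the downstream service reads this header and reuses the trace id for its SERVER span, both services appear under one trace_id, which is the condition the evidence above shows is never met today.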
M3 — Spans never carry error status
The count of spans with span_status_code = STATUS_CODE_ERROR is 0, despite real production traffic that includes 5xx responses and exceptions.
Ask: ensure the instrumentation sets span.setStatus(ERROR) / recordException on 5xx responses and unhandled exceptions, so error spans are queryable/alertable.
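As a rough sketch of the rule being asked for, the decision could look like the following. The enums and class are local stand-ins invented for this example, not OpenTelemetry API types, and the 4xx handling reflects my reading of the OTel HTTP semantic conventions (4xx left unset on SERVER spans, marked as error on CLIENT spans), which should be checked against the version in use. In real instrumentation the ERROR branch would correspond to calling span.setStatus(StatusCode.ERROR) and, for exceptions, span.recordException(throwable).

```java
// Hypothetical stand-in types; not part of the OpenTelemetry API.
final class SpanStatusSketch {
    enum Kind { SERVER, CLIENT }
    enum Status { UNSET, ERROR }

    // Decides the span status from the span kind, the HTTP status code,
    // and an unhandled exception (null when the request completed normally).
    static Status statusFor(Kind kind, int httpStatus, Throwable unhandled) {
        if (unhandled != null) {
            return Status.ERROR;   // unhandled exception always marks the span as error
        }
        if (httpStatus >= 500) {
            return Status.ERROR;   // 5xx is an error for both SERVER and CLIENT spans
        }
        if (httpStatus >= 400 && kind == Kind.CLIENT) {
            return Status.ERROR;   // 4xx counts as an error only from the caller's side
        }
        return Status.UNSET;       // everything else keeps the default status
    }
}
```

With a rule like this in place, a query filtering on span_status_code = STATUS_CODE_ERROR would return the 5xx and exception spans that are currently invisible.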
Coverage context: traces flow for 76/78 firefly-prod services with complete k8s labels; these are completeness/quality gaps, not a transport break.