feat: add configurable liveness and readiness probes to OtelCollector #6680
Open
kyledharrington wants to merge 1 commit into
Expose LivenessProbe and ReadinessProbe fields on DynaKube.spec.templates.otelCollector so users can configure pod health checks for the OpenTelemetry collector StatefulSet. Both fields are optional *corev1.Probe pointers; when nil, no probe is applied (preserving existing behavior). The fields are added to the v1beta5 and v1beta6 (latest) API versions; v1beta4 is left unchanged and conversion drops the new fields when downgrading.
Description
The DynaKube CR's `spec.templates.otelCollector` exposes fields for replicas, resources, tolerations, topology spread constraints, etc., but does not allow users to configure liveness or readiness probes for the OpenTelemetry Collector StatefulSet. As a result, the collector container has no health checks: Kubernetes can't detect a hung or unhealthy collector, won't restart it, and won't gate traffic readiness on its health.

This PR adds optional `livenessProbe` and `readinessProbe` fields (both `*corev1.Probe`) to `OpenTelemetryCollectorSpec`, mirroring Kubernetes' own `PodSpec` convention so users can configure any probe type (HTTP/TCP/exec) with full control. When nil, no probe is applied, so existing deployments are unaffected.

Summary of changes

- Added `LivenessProbe` and `ReadinessProbe` to `OpenTelemetryCollectorSpec` in `pkg/api/v1beta5/dynakube/opentelemetry.go` and `pkg/api/latest/dynakube/opentelemetry.go` (v1beta6).
- Applied the probes to the collector container in `pkg/controllers/dynakube/otelc/statefulset/container.go`.
- Regenerated `zz_generated.deepcopy.go` and the CRD manifest YAMLs via `make manifests`.
- Added unit tests (`TestProbes`) covering both the default (nil) case and the custom-probe case.

v1beta4 is intentionally left unchanged; conversion drops the new fields when stepping down.
No GitHub issue or ticket is linked; this addresses a gap encountered when deploying the operator against an environment that requires health-gated rollouts.
How can this be tested?
A cluster with an existing DynaKube that has the OtelCollector active (via `spec.extensions.prometheus` or `spec.telemetryIngest`) is sufficient.

1. Deploy the operator built from this branch (e.g. `make deploy`).
2. Add probes to the DynaKube under `spec.templates.otelCollector`, for example:

```yaml
spec:
  templates:
    otelCollector:
      livenessProbe:
        httpGet:
          path: /
          port: 13133
        initialDelaySeconds: 10
        periodSeconds: 30
      readinessProbe:
        httpGet:
          path: /
          port: 13133
        initialDelaySeconds: 5
        periodSeconds: 10
```

3. Check that the probes appear on the collector StatefulSet:

```sh
# Find the first StatefulSet whose name contains "otel"
STS=$(kubectl get sts -n dynatrace -o name | grep -i otel | head -1)
# Print the liveness probe, then the readiness probe, of the first container
kubectl get "$STS" -n dynatrace -o jsonpath='{.spec.template.spec.containers[0].livenessProbe}{"\n"}{.spec.template.spec.containers[0].readinessProbe}'
```
Both probes should print as JSON matching the CR. When the fields are unset on the CR, the container should have no probes (preserving previous behavior).
Note: the snippet above uses the otel-collector's default `health_check` extension port (13133) for illustration. The probe handler is fully user-controlled, so an `exec` or `tcpSocket` probe works equally well if you don't run the collector's HTTP health extension.
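
As a rough illustration of the non-HTTP handlers mentioned in the note, the fragment below sketches a `tcpSocket` liveness probe and an `exec` readiness probe using the standard Kubernetes probe fields. The port (4317, the conventional OTLP gRPC port) and the command path are assumptions for illustration only and are not taken from this PR; substitute a port the collector actually listens on and a command that exists in the collector image.

```yaml
spec:
  templates:
    otelCollector:
      # Passes when a TCP connection to the port can be opened.
      # Port 4317 is an assumption; use a port your collector config exposes.
      livenessProbe:
        tcpSocket:
          port: 4317
        initialDelaySeconds: 10
        periodSeconds: 30
      # Passes when the command exits with status 0.
      # The command path is hypothetical; the image may not ship such a binary.
      readinessProbe:
        exec:
          command:
            - /path/to/healthcheck
        initialDelaySeconds: 5
        periodSeconds: 10
```

The same `kubectl get ... -o jsonpath` check from the test steps should then print these handlers in place of `httpGet`.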