feat(observability): bridge OTLP metrics + logs to AMP/Loki and scrape hubble L7 #65
Merged
Completes the grafana-agent (Alloy) telemetry pipeline so two previously dropped signals actually reach the backends.

OTLP metrics + logs → AMP / Loki (#62). The otelcol.receiver.otlp listened on :4317/:4318 but only routed traces to Tempo: every OTLP metric and log a tenant app pushed was silently discarded, and no Service exposed the receiver ports, so grafana-agent.monitoring.svc:4318 wasn't even reachable. This:
- wires the receiver's metrics + logs outputs through new otelcol.exporter.prometheus and otelcol.exporter.loki components into the same AMP remote-write (SigV4) and Loki sinks the scrape + tail pipelines already use;
- exposes :4317/:4318 on the agent Service via agent.extraPorts, so workloads in other namespaces can reach the OTLP endpoint.

This is the shared prerequisite for the per-tenant o11y retrofits; their metrics half was blocked on it.

Hubble L7 metrics → AMP (#63). The hubble_http_* L7 flow metrics are served per cilium-agent pod on :9965 but exposed only via the headless hubble-metrics Service; the annotation-gated pod scrape can't reach them (the agent pod's prometheus.io/port is already its own :9962). Adds an endpoints-discovery scrape of the hubble-metrics service, so the hubble-overview dashboard renders.

Validated offline: grafana/alloy fmt parses the config; alloy validate output is identical to that of the deployed config except for line shifts in the pre-existing env() deprecation warnings (no new issues); helm template confirms the Service + container expose 4317/4318. End-to-end verification (metrics landing in AMP) requires a live cluster.

Closes #62. Closes #63.
CI Results
All validations passed.
Completes the grafana-agent (Alloy) pipeline so two previously-dropped signals reach the backends.
OTLP metrics + logs → AMP / Loki (#62)
The `otelcol.receiver.otlp` listened on :4317/:4318 but only routed traces to Tempo; every OTLP metric/log a tenant pushed was silently dropped, and no Service exposed the receiver ports. This:
- wires the receiver's `metrics`/`logs` outputs through new `otelcol.exporter.prometheus` + `otelcol.exporter.loki` into the existing AMP (SigV4) remote-write + Loki sinks;
- adds `agent.extraPorts` so :4317/:4318 are reachable at `grafana-agent.monitoring.svc`.

This was the shared prerequisite gating the 4 per-tenant retrofits' metrics half.
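
For reviewers unfamiliar with the Alloy wiring, the receiver-to-exporter routing described in this section might look roughly like the fragment below. This is a sketch, not the merged config: the component labels (`default`, `amp`, `tempo`) are hypothetical, and the `prometheus.remote_write`, `loki.write`, and Tempo exporter components it forwards to are assumed to be defined elsewhere in the existing config.

```alloy
// Sketch only: labels and referenced sink components are assumptions.
otelcol.receiver.otlp "default" {
  grpc {
    endpoint = "0.0.0.0:4317"
  }

  http {
    endpoint = "0.0.0.0:4318"
  }

  output {
    // Pre-existing route: traces to Tempo (exporter type and label assumed).
    traces = [otelcol.exporter.otlp.tempo.input]

    // New routes: metrics and logs are no longer dropped.
    metrics = [otelcol.exporter.prometheus.amp.input]
    logs    = [otelcol.exporter.loki.default.input]
  }
}

// Converts OTLP metrics to Prometheus samples and hands them to the
// remote-write component that already ships scraped metrics to AMP.
otelcol.exporter.prometheus "amp" {
  forward_to = [prometheus.remote_write.amp.receiver]
}

// Converts OTLP logs to Loki entries and hands them to the existing
// Loki writer used by the tail pipeline.
otelcol.exporter.loki "default" {
  forward_to = [loki.write.default.receiver]
}
```

Reusing the existing sinks means the SigV4 signing for AMP stays configured in one place instead of being duplicated for the OTLP path.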
Hubble L7 → AMP (#63)
`hubble_http_*` (:9965) is served per cilium-agent pod but exposed only via the headless `hubble-metrics` Service, so the pod-annotation scrape cannot reach it. Adds an endpoints-discovery scrape so the hubble-overview board renders.
Validation (offline)
`grafana/alloy fmt` parses the config; `alloy validate` output is identical to that of the deployed config except for line shifts in the pre-existing `env()` deprecation warnings, so there are no new issues. `helm template` confirms the Service + container expose 4317/4318.
Closes #62. Closes #63.
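
For reference, the endpoints-discovery scrape described under "Hubble L7 → AMP" could be sketched as follows. This is an illustration rather than the merged config: the `kube-system` namespace, the `hubble` labels, and the `prometheus.remote_write.amp` target are assumptions, and the remote-write component is assumed to exist already.

```alloy
// Sketch only: namespace, labels, and the remote-write target are assumptions.

// Discover every endpoint backing Services in the namespace where
// Cilium is assumed to run.
discovery.kubernetes "hubble" {
  role = "endpoints"

  namespaces {
    names = ["kube-system"]
  }
}

// Keep only the endpoints that belong to the headless hubble-metrics Service.
discovery.relabel "hubble" {
  targets = discovery.kubernetes.hubble.targets

  rule {
    source_labels = ["__meta_kubernetes_service_name"]
    regex         = "hubble-metrics"
    action        = "keep"
  }
}

// Scrape the surviving targets and forward samples to the existing AMP sink.
prometheus.scrape "hubble" {
  targets    = discovery.relabel.hubble.output
  forward_to = [prometheus.remote_write.amp.receiver]
}
```

Discovering by endpoints rather than by pod annotation sidesteps the conflict noted above, since each cilium-agent pod can advertise only one annotated scrape port.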