Summary
No metrics are currently collected from Cribl Stream, Cribl Edge, or the OTEL Collector itself. Without metrics, alerting and dashboards are impossible. This issue adds Prometheus as the metrics store, fed by OTEL Collector scraping internal Cribl APIs.
Motivation
- Cribl Stream exposes internal stats at /api/v1/system/stats (port 9000)
- Cribl Edge exposes metrics at port 9420
- OTEL Collector exposes self-metrics at :8888/metrics
- All three are currently invisible at the infrastructure level
Approach
OTEL Collector changes (k8s/base/otel-collector/configmap.yaml)
- Add prometheus receiver scraping:
  - cribl-stream-standalone:9000/api/v1/system/stats (Cribl internal metrics)
  - cribl-edge-standalone:9420/metrics (Edge self-metrics)
  - localhost:8888/metrics (OTEL self-metrics)
- Add prometheusremotewrite exporter pointing to the Prometheus StatefulSet.
- Expose OTEL self-metrics port 8888 in the Service.
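The collector changes above might look roughly like the following fragment of the collector config. This is a sketch under assumptions, not the final manifest: the job names and the 30s scrape interval are illustrative, and the remote-write URL assumes the Prometheus Service is named prometheus in the monitoring namespace. It also assumes the Cribl Stream stats endpoint returns Prometheus text format without authentication, which should be verified before relying on it.

```yaml
# Hypothetical fragment for k8s/base/otel-collector/configmap.yaml.
# Job names and scrape_interval are assumptions, not taken from the issue.
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: cribl-stream
          scrape_interval: 30s
          # Assumes this endpoint serves Prometheus exposition format.
          metrics_path: /api/v1/system/stats
          static_configs:
            - targets: ["cribl-stream-standalone:9000"]
        - job_name: cribl-edge
          scrape_interval: 30s
          metrics_path: /metrics
          static_configs:
            - targets: ["cribl-edge-standalone:9420"]
        - job_name: otel-collector
          scrape_interval: 30s
          metrics_path: /metrics
          static_configs:
            - targets: ["localhost:8888"]

exporters:
  prometheusremotewrite:
    # /api/v1/write is Prometheus's remote-write path; the host name is assumed.
    endpoint: http://prometheus.monitoring.svc.cluster.local:9090/api/v1/write

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      exporters: [prometheusremotewrite]
```

Both the prometheus receiver and the prometheusremotewrite exporter ship in the contrib distribution of the collector, so the image in use needs to include them.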
New k8s/base/prometheus/ directory
- StatefulSet: prom/prometheus:latest, ~256Mi memory, 10Gi PVC
- ConfigMap: prometheus.yml with scrape configs and retention (15d)
- Service: ClusterIP for OTEL remote write, NodePort :30090 for local access
- NetworkPolicy: ingress from otel-collector on remote-write port (9090), egress to Cribl pods on stats ports
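As a starting point, the StatefulSet could be sketched as below. The resource names, labels, ConfigMap name, and mount paths are assumptions. Two details are worth noting: Prometheus only accepts remote-write traffic when started with --web.enable-remote-write-receiver, and retention is set here through the --storage.tsdb.retention.time flag, which is the long-standing way to configure it. The ~256Mi figure is treated as a memory request; whether it should also be a limit is left open.

```yaml
# Hypothetical k8s/base/prometheus/statefulset.yaml; names and labels are assumed.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus
  namespace: monitoring
spec:
  serviceName: prometheus
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      securityContext:
        # The prom/prometheus image runs as nobody (65534); fsGroup lets it write to the PVC.
        fsGroup: 65534
      containers:
        - name: prometheus
          image: prom/prometheus:latest
          args:
            # args replaces the image's default command arguments, so the
            # config and storage paths are restated explicitly.
            - --config.file=/etc/prometheus/prometheus.yml
            - --storage.tsdb.path=/prometheus
            - --storage.tsdb.retention.time=15d
            # Required for the OTEL prometheusremotewrite exporter to push here.
            - --web.enable-remote-write-receiver
          ports:
            - name: http
              containerPort: 9090
          resources:
            requests:
              memory: 256Mi
          volumeMounts:
            - name: config
              mountPath: /etc/prometheus
            - name: data
              mountPath: /prometheus
      volumes:
        - name: config
          configMap:
            name: prometheus-config   # assumed ConfigMap name
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```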
NetworkPolicy updates
- OTEL Collector: add egress to stream:9000, edge:9420
- Prometheus: ingress from OTEL on 9090
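The two policy updates could be expressed along these lines. This is a sketch only: the policy names and the app labels are hypothetical, and plain podSelector rules assume all pods share the monitoring namespace (a namespaceSelector would be needed otherwise). The last egress rule, to Prometheus on 9090, is not listed above but would be required for remote write if the collector's egress is default-deny. DNS egress is assumed to be covered by an existing policy.

```yaml
# Hypothetical NetworkPolicy manifests; names and labels are assumptions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: prometheus-ingress
  namespace: monitoring
spec:
  podSelector:
    matchLabels:
      app: prometheus
  policyTypes: [Ingress]
  ingress:
    # Allow remote-write traffic from the OTEL Collector.
    - from:
        - podSelector:
            matchLabels:
              app: otel-collector
      ports:
        - protocol: TCP
          port: 9090
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: otel-collector-egress-metrics
  namespace: monitoring
spec:
  podSelector:
    matchLabels:
      app: otel-collector
  policyTypes: [Egress]
  egress:
    # Scrape Cribl Stream stats.
    - to:
        - podSelector:
            matchLabels:
              app: cribl-stream
      ports:
        - protocol: TCP
          port: 9000
    # Scrape Cribl Edge metrics.
    - to:
        - podSelector:
            matchLabels:
              app: cribl-edge
      ports:
        - protocol: TCP
          port: 9420
    # Push to Prometheus via remote write (assumed necessary under default-deny egress).
    - to:
        - podSelector:
            matchLabels:
              app: prometheus
      ports:
        - protocol: TCP
          port: 9090
```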
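Acceptance Criteria
- With kubectl --context orbstack -n monitoring port-forward svc/prometheus 9090 running, Cribl metrics are visible in the Prometheus UI
- OTEL Collector self-metrics are queryable (otelcol_process_uptime_seconds)
- outBytes, inBytes, outputDroppedEventsTotal are queryable
- make validate passes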
Notes
This is a foundational dependency for:
- Alertmanager rules (see related issue: Alertmanager with pipeline stall rules)
- Grafana dashboards (see related issue: Grafana dashboards for pipeline visibility)
- OTEL Collector self-monitoring pipeline
Implement this before alerting or dashboards.
Observability Roadmap
This is the foundational issue for the monitoring observability stack. The following issues have been consolidated here (2026-04-24):
Implement this first; the above will be reopened when scheduled.