Istio collects a rich set of metrics related to the data plane and control plane. These metrics are stored in Prometheus.
We could leverage the collected metrics to produce alerts describing service mesh performance, for instance:
- high error rate on the communication path between two services,
- high latency on the communication path between two services,
- failure on configuring a proxy by Istio Pliot,
- failure on rotating a proxy certificate by Istio Citadel.
- failure on validating a config by Istio Galley.
First, we should look for existing alerting rules in open source and evaluate them. Second, we should attempt to elaborate custom set of alerting rules.
Istio collects a rich set of metrics related to the data plane and control plane. These metrics are stored in Prometheus.
We could leverage the collected metrics to produce alerts describing service mesh performance, for instance:
First, we should look for existing alerting rules in open source and evaluate them. Second, we should attempt to elaborate custom set of alerting rules.