4 changes: 3 additions & 1 deletion pkg/autoscaler/autoscaler/metric_collector.go
@@ -193,7 +193,9 @@ func (collector *MetricCollector) processPrometheusString(metricStr string, past
}
if err != nil {
klog.Errorf("error decoding metric: %v", err)
- continue
+ // Stop decoding on malformed input to avoid spinning forever
+ // if decoder keeps returning the same non-EOF error.
+ break
medium

Using `break` here correctly prevents the infinite loop, but it lets the collector proceed with partial metrics for the current pod. Because `instanceMetricMap` accumulates across all pods and missing metrics are filled with 0 at the end of this function (lines 238-242), a parse failure can produce an artificially low aggregate metric value. The autoscaler might then make incorrect decisions, such as scaling down during a period of high load.

Consider refactoring `processPrometheusString` to return an error, and in the caller (`fetchMetricsFromPods`), handle that error by marking the instance as failed (`instanceInfo.IsFailed = true`). This would keep corrupted or partial data from affecting autoscaling logic.
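A minimal sketch of the suggested shape, not the project's actual code: the parser returns an error on the first malformed line, and the caller marks the pod as failed instead of merging partial data. The names `parseMetrics`, `collect`, and the `instanceInfo` struct here are hypothetical stand-ins, and the parser handles simple `name value` lines rather than using the real Prometheus text decoder.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// instanceInfo is a hypothetical stand-in for the collector's per-pod record.
type instanceInfo struct {
	Name     string
	IsFailed bool
}

// parseMetrics is a simplified stand-in for processPrometheusString.
// It returns an error on the first malformed line, so the caller never
// sees a partially filled map.
func parseMetrics(metricStr string) (map[string]float64, error) {
	out := make(map[string]float64)
	for _, line := range strings.Split(metricStr, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and comments
		}
		fields := strings.Fields(line)
		if len(fields) != 2 {
			return nil, fmt.Errorf("malformed metric line %q", line)
		}
		v, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			return nil, fmt.Errorf("bad value in %q: %w", line, err)
		}
		out[fields[0]] = v
	}
	return out, nil
}

// collect plays the role of the caller (fetchMetricsFromPods in the review).
// Pods whose payload fails to parse are marked failed and contribute nothing
// to the aggregate, rather than contributing a partial result.
func collect(pods []*instanceInfo, raw map[string]string) map[string]float64 {
	total := make(map[string]float64)
	for _, pod := range pods {
		metrics, err := parseMetrics(raw[pod.Name])
		if err != nil {
			pod.IsFailed = true
			continue
		}
		for name, v := range metrics {
			total[name] += v
		}
	}
	return total
}

func main() {
	pods := []*instanceInfo{{Name: "pod-a"}, {Name: "pod-b"}}
	raw := map[string]string{
		"pod-a": "qps 10\n",
		"pod-b": "qps 5\ngarbage",
	}
	total := collect(pods, raw)
	fmt.Println(total["qps"], pods[0].IsFailed, pods[1].IsFailed)
}
```

With this shape, how a failed pod is treated downstream (excluded from the average, retried, or reported) becomes an explicit decision in the caller instead of a side effect of zero-filling.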

}
if len(mf.Metric) < 1 {
klog.Errorf("metric is invalid")