[Draft] Phase 1: PD-aware autoscaling (soft role coordination)#948
WHOIM1205 wants to merge 1 commit into volcano-sh:main
Conversation
Adds an optional Coordination block on HeterogeneousTarget to softly bias replica allocation across roles using a cost+pressure blend clipped to a preferred ratio band. No-op when Mode is Off or unset; bypassed in panic mode.

Signed-off-by: WHOIM1205 <rathourprateek8@gmail.com>
[APPROVALNOTIFIER] This PR is NOT APPROVED. Needs approval from an approver in each of these files. The full list of commands accepted by this bot can be found here.
Code Review
This pull request introduces PD-aware coordination for heterogeneous targets, enabling the autoscaler to distribute replicas based on pressure signals and preferred ratio bands. The implementation includes new API types, deep-copy methods, and logic in the optimizer to blend cost and pressure shares. Feedback highlights an issue in the replica rounding logic that can cause total count drift and suggests additional validation to ensure the sum of minimum preferred ratios does not exceed 100%.
```go
for _, nm := range names {
	if out[nm] < minReps[nm] {
		out[nm] = minReps[nm]
	} else if out[nm] > maxReps[nm] {
		out[nm] = maxReps[nm]
	}
}
```
The current implementation of roundSharesToReplicas applies clamping to minReps and maxReps after the total number of replicas has been distributed. This can lead to a final replica map where the sum of replicas does not match the total (the recommended instances). Since total is guaranteed to be within [sum(minReps), sum(maxReps)], a valid distribution always exists. To avoid this "drift", the distribution logic should be aware of the per-parameter bounds during the allocation process rather than as a post-processing step. This is important to ensure the autoscaler respects the total capacity recommended by the algorithm.
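A minimal, self-contained illustration of the drift with made-up numbers (`clampOnly` is a hypothetical stand-in for the clamping loop quoted above, not the PR's actual function):

```go
package main

import "fmt"

// clampOnly applies per-role [min,max] bounds as a post-processing step,
// exactly like the quoted loop: it makes no attempt to preserve the total.
func clampOnly(names []string, out, minReps, maxReps map[string]int32) map[string]int32 {
	for _, nm := range names {
		if out[nm] < minReps[nm] {
			out[nm] = minReps[nm]
		} else if out[nm] > maxReps[nm] {
			out[nm] = maxReps[nm]
		}
	}
	return out
}

func main() {
	// 10 replicas were distributed as prefill=9, decode=1 before clamping.
	res := clampOnly(
		[]string{"prefill", "decode"},
		map[string]int32{"prefill": 9, "decode": 1},
		map[string]int32{"prefill": 1, "decode": 4},
		map[string]int32{"prefill": 8, "decode": 8},
	)
	fmt.Println(res, res["prefill"]+res["decode"]) // sum drifts to 12, not the requested 10
}
```

Here decode is pulled up to its min (1→4) while prefill is capped at its max (9→8), so the map sums to 12 even though 10 replicas were requested.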
```go
for name, r := range coord.PreferredRatio {
	entry := ratioPath.Key(name)
	if r.Min < 0 || r.Min > 100 {
		allErrs = append(allErrs, field.Invalid(entry.Child("min"), r.Min, "must be between 0 and 100"))
	}
	if r.Max < 0 || r.Max > 100 {
		allErrs = append(allErrs, field.Invalid(entry.Child("max"), r.Max, "must be between 0 and 100"))
	}
	if r.Min > r.Max {
		allErrs = append(allErrs, field.Invalid(entry, r, "min must be <= max"))
	}
	if _, ok := validNames[name]; !ok {
		allErrs = append(allErrs, field.Invalid(entry, name, "must match a heterogeneousTarget.params[].target.targetRef.name"))
	}
}
```
The validation for preferredRatio should ensure that the sum of the minimum percentages across all roles does not exceed 100%. If sum(Min) > 100, the renormalization logic in applyCoordination will scale down the shares, potentially resulting in an allocation that violates the user-specified minimum ratios and leading to unexpected behavior.
```go
var sumMin int32
for name, r := range coord.PreferredRatio {
	entry := ratioPath.Key(name)
	if r.Min < 0 || r.Min > 100 {
		allErrs = append(allErrs, field.Invalid(entry.Child("min"), r.Min, "must be between 0 and 100"))
	}
	if r.Max < 0 || r.Max > 100 {
		allErrs = append(allErrs, field.Invalid(entry.Child("max"), r.Max, "must be between 0 and 100"))
	}
	if r.Min > r.Max {
		allErrs = append(allErrs, field.Invalid(entry, r, "min must be <= max"))
	}
	if _, ok := validNames[name]; !ok {
		allErrs = append(allErrs, field.Invalid(entry, name, "must match a heterogeneousTarget.params[].target.targetRef.name"))
	}
	sumMin += r.Min
}
if sumMin > 100 {
	allErrs = append(allErrs, field.Invalid(ratioPath, coord.PreferredRatio, "sum of min ratios must not exceed 100"))
}
```
Pull request overview
This PR introduces an optional “PD-aware coordination” mechanism for heterogeneous autoscaling, adding an API surface to configure coordination and implementing an optimizer-side biasing step that blends cost-based allocation with per-role pressure signals, with optional preferred-ratio bands and panic-mode bypass.
Changes:
- Add `coordination` configuration under `HeterogeneousTarget` (mode + preferred ratio bands).
- Validate coordination configuration in the AutoscalingPolicyBinding admission validator.
- Apply coordination biasing in the autoscaler optimizer when enabled (and not in panic mode).
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| pkg/model-booster-controller/webhook/autoscaling_binding_validator.go | Adds admission-time validation for the new coordination block and preferred ratio constraints. |
| pkg/autoscaler/autoscaler/optimizer.go | Implements Phase 1 coordination blending (cost + pressure) and preferred-ratio clipping during replica distribution. |
| pkg/apis/workload/v1alpha1/coordination_deepcopy.go | Adds temporary hand-written DeepCopy implementations for new coordination types (intended to be replaced by generated code). |
| pkg/apis/workload/v1alpha1/autoscalingpolicybinding_types.go | Extends the API types to include Coordination configuration on HeterogeneousTarget. |
```go
// Build the set of valid role names from existing params so we can flag typos.
validNames := make(map[string]struct{}, len(asp_binding.Spec.HeterogeneousTarget.Params))
for _, p := range asp_binding.Spec.HeterogeneousTarget.Params {
	validNames[p.Target.TargetRef.Name] = struct{}{}
}
```
validateCoordination builds validNames from heterogeneousTarget.params[].target.targetRef.name but doesn’t validate that those names are non-empty and unique. Because coordination (and other optimizer code paths) key maps by name, empty/duplicate names can silently overwrite entries and make preferredRatio validation unreliable. Please add validation to require each param’s targetRef.name be set and unique (at least when coordination is present/enabled).
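A sketch of what the requested check could look like, with simplified types (the real validator attaches `field.ErrorList` entries with full field paths; `validateUniqueNames` and plain error strings are stand-ins for illustration):

```go
package main

import "fmt"

// validateUniqueNames flags empty and duplicate role names before any map
// is keyed by name, so typos/collisions surface at admission time instead
// of silently overwriting entries later.
func validateUniqueNames(names []string) []string {
	var errs []string
	seen := make(map[string]struct{}, len(names))
	for i, nm := range names {
		if nm == "" {
			errs = append(errs, fmt.Sprintf("params[%d].target.targetRef.name must not be empty", i))
			continue
		}
		if _, dup := seen[nm]; dup {
			errs = append(errs, fmt.Sprintf("params[%d].target.targetRef.name %q is duplicated", i, nm))
		}
		seen[nm] = struct{}{}
	}
	return errs
}

func main() {
	fmt.Println(validateUniqueNames([]string{"prefill", "", "prefill"}))
}
```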
```go
// Hand-written DeepCopy implementations for the Phase 1 PD-aware
// coordination types.
//
// IMPORTANT (reviewers): zz_generated.deepcopy.go is intentionally left
// untouched. Before merging, run `make generate` so controller-gen
// regenerates that file — this will (a) duplicate the methods below into
// zz_generated.deepcopy.go and (b) extend HeterogeneousTarget.DeepCopyInto
// to deep-copy the new Coordination pointer. After regeneration, this file
// can be deleted.
```
The API types were updated to add Coordination, but the generated deepcopy file (zz_generated.deepcopy.go) hasn’t been regenerated, so AutoscalingPolicyBinding/HeterogeneousTarget DeepCopy* currently won’t deep-copy the new Coordination pointer. This can lead to shared pointers/maps across copies. Also, keeping these hand-written DeepCopy methods in-tree will cause duplicate-method compile errors the next time make generate runs. Please run code generation and commit the updated zz_generated.deepcopy.go (including deep-copying HeterogeneousTarget.Coordination), then remove this temporary file.
```go
// Convert to integers via largest-remainder so the sum is preserved,
// then clamp to each param's Min/Max. Drift after clamping is allowed.
return roundSharesToReplicas(names, raw, total, minReps, maxReps)
```
roundSharesToReplicas clamps each role to its [min,max] after rounding but then accepts total drift. That means the returned replicasMap can sum to a value different from recommendedInstances, which breaks the optimizer’s implied invariant (only distribution changes, not the total) and desynchronizes Status history (it records the corrected total, not the applied total). Please adjust the post-clamp result to preserve the requested total (e.g., redistribute the delta across roles with remaining headroom/room-down while respecting min/max), or avoid clamping by ensuring the rounding step already respects min/max.
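One possible shape for the suggested fix — a total-preserving variant that does largest-remainder rounding, clamps, and then redistributes the clamping drift through roles with remaining room. `distributeWithBounds` is a simplified sketch assuming `total` lies in `[sum(minReps), sum(maxReps)]`, not the PR's actual implementation:

```go
package main

import (
	"fmt"
	"sort"
)

// distributeWithBounds converts fractional shares into integer replica
// counts whose sum equals total, while respecting per-role [min,max].
func distributeWithBounds(names []string, share map[string]float64, total int32, minReps, maxReps map[string]int32) map[string]int32 {
	out := make(map[string]int32, len(names))
	type rem struct {
		name string
		frac float64
	}
	var rems []rem
	var used int32
	for _, nm := range names {
		exact := share[nm] * float64(total)
		fl := int32(exact)
		out[nm] = fl
		used += fl
		rems = append(rems, rem{nm, exact - float64(fl)})
	}
	// Largest remainder first; stable sort with name tie-break keeps it deterministic.
	sort.SliceStable(rems, func(i, j int) bool {
		if rems[i].frac != rems[j].frac {
			return rems[i].frac > rems[j].frac
		}
		return rems[i].name < rems[j].name
	})
	for i := 0; used < total && i < len(rems); i++ {
		out[rems[i].name]++
		used++
	}
	// Clamp to [min,max], tracking how far the sum drifted from total.
	var excess int32 // sum(out) - total after clamping
	for _, nm := range names {
		if out[nm] < minReps[nm] {
			excess += minReps[nm] - out[nm]
			out[nm] = minReps[nm]
		} else if out[nm] > maxReps[nm] {
			excess -= out[nm] - maxReps[nm]
			out[nm] = maxReps[nm]
		}
	}
	// Walk the drift back through roles that still have room, so the
	// returned map sums to exactly total.
	for _, nm := range names {
		for excess > 0 && out[nm] > minReps[nm] {
			out[nm]--
			excess--
		}
		for excess < 0 && out[nm] < maxReps[nm] {
			out[nm]++
			excess++
		}
	}
	return out
}

func main() {
	out := distributeWithBounds(
		[]string{"prefill", "decode"},
		map[string]float64{"prefill": 0.9, "decode": 0.1},
		10,
		map[string]int32{"prefill": 1, "decode": 4},
		map[string]int32{"prefill": 8, "decode": 8},
	)
	fmt.Println(out) // sums to 10 even though both roles were clamped
}
```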
```go
// pressureMetricNames are the Phase 1 placeholder signals the optimizer
// consults to estimate per-role load. They are read by name from whatever
// the configured MetricCollector exposes — there is no hard coupling: a
// missing metric simply contributes zero. Phase 2 will let users configure
// these per role; for now keeping them as a small constant set keeps the
// blast radius small and the diff reviewable.
var pressureMetricNames = []string{
	"queue_depth",          // prefill + decode: pending requests
	"kv_cache_utilization", // decode: KV-cache pressure (typically 0..1)
	"ttft",                 // prefill: time-to-first-token (seconds)
}
```
The comment says these pressure signals are read from whatever the MetricCollector exposes, but MetricCollector currently only scrapes metrics listed in autoscalePolicy.Spec.Metrics (its WatchMetricList). As a result, coordination will almost always see these metrics as missing unless users also add them to scaling metrics (which will then affect the total recommendation, not just distribution). Consider extending MetricCollector (or the optimizer) to collect a separate/extra metric set for coordination without feeding them into MetricTargets, and/or clarify this requirement in the comment/docs.
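A sketch of the "separate/extra metric set" idea, with hypothetical names (`watchList`, `policyMetrics`) standing in for the collector's real WatchMetricList plumbing — the point is that the coordination signals are unioned in for scraping but kept out of the scaling metric set:

```go
package main

import (
	"fmt"
	"sort"
)

// pressureMetricNames mirrors the Phase 1 constant set from the PR.
var pressureMetricNames = []string{"queue_depth", "kv_cache_utilization", "ttft"}

// watchList computes the full set of metric names to scrape: the metrics
// from autoscalePolicy.Spec.Metrics plus, when coordination is enabled,
// the pressure signals. Only the policy metrics would feed MetricTargets;
// the extras exist purely for distribution biasing.
func watchList(policyMetrics []string, coordinationEnabled bool) []string {
	set := make(map[string]struct{}, len(policyMetrics))
	for _, m := range policyMetrics {
		set[m] = struct{}{}
	}
	if coordinationEnabled {
		for _, m := range pressureMetricNames {
			set[m] = struct{}{}
		}
	}
	out := make([]string, 0, len(set))
	for m := range set {
		out = append(out, m)
	}
	sort.Strings(out) // deterministic order for a stable scrape config
	return out
}

func main() {
	// "ttft" appears in both sets but is scraped once.
	fmt.Println(watchList([]string{"gpu_utilization", "ttft"}, true))
}
```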
```go
for _, nm := range names {
	r, ok := coord.PreferredRatio[nm]
	if !ok {
		continue
	}
	lo, hi := float64(r.Min)/100.0, float64(r.Max)/100.0
	if raw[nm] < lo {
		raw[nm] = lo
	}
	if raw[nm] > hi {
		raw[nm] = hi
	}
```
applyCoordination assumes PreferredRatio values are well-formed (0<=Min<=Max<=100). If an older object bypasses webhook validation/CRD validation (or validation is disabled), negative/invalid ranges can push raw shares negative and produce negative replica counts after rounding. Please defensively sanitize/ignore invalid RoleRange entries here (clamp to [0,100] and skip if Min>Max) so the optimizer can’t emit negative replicas even with bad inputs.
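A minimal sketch of the defensive sanitization, assuming the PR's `RoleRange` has `Min`/`Max` int32 percentage fields (`sanitizeRange` is a hypothetical helper; the optimizer could call it per role before clipping):

```go
package main

import "fmt"

// RoleRange mirrors the PR's percentage band type (assumed shape).
type RoleRange struct {
	Min, Max int32
}

// sanitizeRange normalizes a band so the optimizer can never produce
// negative shares even if webhook/CRD validation was bypassed: clamp both
// ends into [0,100] and report inverted bands so the caller can skip them.
func sanitizeRange(r RoleRange) (RoleRange, bool) {
	clamp := func(v int32) int32 {
		if v < 0 {
			return 0
		}
		if v > 100 {
			return 100
		}
		return v
	}
	r.Min, r.Max = clamp(r.Min), clamp(r.Max)
	if r.Min > r.Max {
		return RoleRange{}, false // inverted band: ignore clipping for this role
	}
	return r, true
}

func main() {
	fmt.Println(sanitizeRange(RoleRange{Min: -20, Max: 150})) // clamped to [0,100]
	fmt.Println(sanitizeRange(RoleRange{Min: 80, Max: 20}))   // rejected
}
```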
```go
allErrs = append(allErrs, validateOptimizeAndScalingPolicyExistence(asp_binding)...)
allErrs = append(allErrs, v.validateAutoscalingPolicyExistence(ctx, asp_binding)...)
allErrs = append(allErrs, validateBindingTargetKind(asp_binding)...)
allErrs = append(allErrs, validateCoordination(asp_binding)...)
```
New coordination validation was added, but there are no unit tests covering it. Since this package already has autoscaling_binding_validator_test.go, please add cases for invalid coordination.mode, out-of-range min/max, min>max, and unknown preferredRatio role names to prevent regressions.
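A table-driven sketch of the requested cases. `validateRange` is a simplified stand-in for the min/max portion of the real `validateCoordination` (which returns `field.ErrorList`); the case table shows the shape the `autoscaling_binding_validator_test.go` additions could take:

```go
package main

import "fmt"

// validateRange re-implements just the band checks from the validator,
// returning plain error strings instead of field.ErrorList for brevity.
func validateRange(min, max int32) []string {
	var errs []string
	if min < 0 || min > 100 {
		errs = append(errs, "min must be between 0 and 100")
	}
	if max < 0 || max > 100 {
		errs = append(errs, "max must be between 0 and 100")
	}
	if min > max {
		errs = append(errs, "min must be <= max")
	}
	return errs
}

func main() {
	cases := []struct {
		name     string
		min, max int32
		wantErrs int
	}{
		{"valid band", 10, 60, 0},
		{"min out of range", -5, 60, 1},
		{"max out of range", 10, 150, 1},
		{"min greater than max", 70, 30, 1},
	}
	for _, c := range cases {
		got := len(validateRange(c.min, c.max))
		fmt.Printf("%s: got %d errors, want %d\n", c.name, got, c.wantErrs)
	}
}
```

Cases for invalid `coordination.mode` and unknown `preferredRatio` role names would follow the same table pattern against the full validator.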
Phase 1: PD-aware autoscaling (soft role coordination)
This PR introduces an optional coordination mechanism for `HeterogeneousTarget` that allows the optimizer to bias replica distribution across roles (e.g. prefill / decode) using observed pressure signals, while preserving the existing cost-based behavior.

What this adds

- A `coordination` field on `HeterogeneousTarget` with modes `Off` (default) | `Preferred`
- `PreferredRatio`: soft per-role min/max percentage band

When enabled (`mode: Preferred`), the optimizer:

- reads the pressure signals `queue_depth`, `kv_cache_utilization`, and `ttft`
- respects each role's `minReplicas`/`maxReplicas`

Key properties
Implementation scope

- New API types: `Coordination`, `CoordinationMode`, `RoleRange`
- Optimizer helpers: `costShareFromBaseline`, `pressureShareNormalized`, `roundSharesToReplicas`

Notes for reviewers
This is a draft PR intended to validate the approach and integration point before adding tests and further polish.
Pressure signals are currently a small fixed set (Phase 1 decision to keep scope minimal).
Largest-remainder rounding is made deterministic via stable sorting.
Required before merge
- `make generate` must be run so that `zz_generated.deepcopy.go` includes the new `Coordination` field.

Until regeneration:

- `HeterogeneousTarget.DeepCopyInto` does not deep-copy the `Coordination` pointer

This is intentional for the draft to keep the diff minimal and reviewable.
Follow-ups (not part of this PR)
Goal of this PR
Validate:
before proceeding to Phase 2 and test coverage.