feat: Allow configuring priorityClassName and terminationGracePeriodS…#7783
cmwylie19 wants to merge 1 commit
…econds on the NLLB Envoy Pod Signed-off-by: Case Wylie <cmwylie19@gmail.com>
I wonder if we should rather actually "hardcode" the priority. Like you said, having priority at 0 makes no sense, as this really is a node-critical component. I can't think of a reason why anyone would actually have it on any other priority level.

The other thing I'm wondering is whether we'd be able to use the recently merged resource patches feature for this. That would allow users to override basically anything on the generated manifest without us having to maintain a bazillion different options in the config.
I couldn't agree more, it definitely needs
Just let me know how you would like me to proceed: whether you want a default priorityClass, or whether we need to close this. Happy to help!
@jnummelin If I am following correctly, I am not sure we can use resource patches, because the name of the NLLB pod is derived from the node name, and the node name is going to be random.

nllb-canes-b29b   1/1   Running   0   7d19h   192.168.4.67   canes-b29b

apiVersion: k0s.k0sproject.io/v1beta1
kind: ClusterConfig
metadata:
  name: k0s
  namespace: kube-system
spec:
  network:
    nodeLocalLoadBalancing:
      envoyProxy:
        patches:
          - target:
              kind: Pod
              name: ???
              namespace: kube-system
            patch:
              type: StrategicMergePatch
              content: |
                spec:
                  terminationGracePeriodSeconds: 60
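To illustrate the naming concern raised above: the kubelet names the mirror pod of a static pod by appending the lower-cased node name to the static pod's name. The sketch below reproduces only that naming rule. It assumes the static pod's base name is `nllb` (inferred from the listing above), and the node name `worker-7f3a` is made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// mirrorPodName mimics the kubelet's naming rule for static pods:
// the static pod name, a dash, then the lower-cased node name.
func mirrorPodName(staticPodName, nodeName string) string {
	return staticPodName + "-" + strings.ToLower(nodeName)
}

func main() {
	// "canes-b29b" comes from the listing above; "worker-7f3a" is hypothetical.
	for _, node := range []string{"canes-b29b", "worker-7f3a"} {
		fmt.Println(mirrorPodName("nllb", node))
	}
}
```

Because the resulting name differs on every node, a patch target that must match a single fixed `name` cannot cover all workers.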
This pull request has merge conflicts that need to be resolved.
Description
The node-local load balancing (NLLB) Envoy Pod is a static pod generated by k0s in makePodManifest(). Its EnvoyProxy config only exposes image, imagePullPolicy, apiServerBindPort and konnectivityServerBindPort, so there is no way to set a priorityClassName or a terminationGracePeriodSeconds on the Pod.

This matters in practice. The Envoy Pod runs at priority 0, yet it is the worker's load-balanced path to the control plane. With graceful node shutdown enabled (shutdownGracePeriod / shutdownGracePeriodCriticalPods via a worker profile), the kubelet shutdown manager kills non-critical pods first and critical pods last. Because the Envoy Pod is priority 0, it is killed in the first phase, severing the worker's path to the API server ([::1]:7443) before the remaining pods can drain or report status.

This change exposes two new optional fields on spec.network.nodeLocalLoadBalancing.envoyProxy:

- priorityClassName (string): e.g. system-node-critical, so the Pod is protected from node-pressure eviction and shut down last during graceful node shutdown.
- terminationGracePeriodSeconds (int64, >= 0): overrides the default 30s grace period to let Envoy drain in-flight connections.

Both fields are plumbed through envoyPodParams into makePodManifest() and set on the Pod spec. Both default to unset, so existing behavior is unchanged.

Fixes #7782
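As a rough illustration of the behavior described above, here is a minimal, self-contained sketch of optional fields that are validated and then copied onto a pod spec only when set. It is not the code in this PR: the `envoyProxy` and `podSpec` types and the `validate` and `applyTo` helpers are hypothetical stand-ins, and the real change operates on the k0s config types and the Kubernetes Pod spec.

```go
package main

import (
	"errors"
	"fmt"
)

// envoyProxy is a hypothetical, trimmed-down stand-in for the k0s EnvoyProxy config.
type envoyProxy struct {
	PriorityClassName             string
	TerminationGracePeriodSeconds *int64 // nil means "not set"
}

// validate rejects a negative grace period; an unset value is always valid.
func (e envoyProxy) validate() error {
	if s := e.TerminationGracePeriodSeconds; s != nil && *s < 0 {
		return errors.New("terminationGracePeriodSeconds must be >= 0")
	}
	return nil
}

// podSpec holds only the two fields of a Pod spec that matter for this sketch.
type podSpec struct {
	PriorityClassName             string
	TerminationGracePeriodSeconds *int64
}

// applyTo copies the optional fields onto spec and leaves it untouched when they are unset.
func (e envoyProxy) applyTo(spec *podSpec) {
	if e.PriorityClassName != "" {
		spec.PriorityClassName = e.PriorityClassName
	}
	if e.TerminationGracePeriodSeconds != nil {
		v := *e.TerminationGracePeriodSeconds // copy so the spec does not alias the config
		spec.TerminationGracePeriodSeconds = &v
	}
}

func main() {
	grace := int64(60)
	cfg := envoyProxy{PriorityClassName: "system-node-critical", TerminationGracePeriodSeconds: &grace}
	if err := cfg.validate(); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	var spec podSpec
	cfg.applyTo(&spec)
	fmt.Println(spec.PriorityClassName, *spec.TerminationGracePeriodSeconds)
}
```

Using a pointer for the grace period keeps "unset" distinguishable from an explicit `0`, which is what allows the default behavior to stay unchanged when the field is omitted.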
Type of change
How Has This Been Tested?
Manual: configured envoyProxy.priorityClassName: system-node-critical and terminationGracePeriodSeconds in k0s.yaml, restarted the worker, and confirmed the resulting static Pod carries both fields. With system-node-critical set, the Envoy Pod is shut down in the critical-pods phase during graceful node shutdown instead of immediately.

Auto: TestMakePodManifest asserts the fields propagate to the Pod spec (and are absent by default); TestEnvoyProxy_PriorityClassAndGracePeriod covers config parsing and rejection of a negative grace period.

Checklist