Pull request overview
This PR aims to reduce flakiness when installing kubelogin in Azure pipeline runs by adding retry logic around az aks install-cli and introducing a GitHub-download fallback when Azure CLI installation repeatedly fails.
Changes:
- Add exponential backoff retries for `az aks install-cli` when `kubelogin` is missing.
- Add a fallback installation path that downloads `kubelogin` from GitHub releases if Azure CLI installation fails after retries.
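As a rough illustration of the retry behavior described in the first change, here is a minimal sketch of an exponential backoff wrapper. The function name `retry_with_backoff` and its argument order are hypothetical and are not taken from the PR; the actual script may structure its loop differently.

```shell
# Hypothetical helper (not from the PR): retry a command with exponential backoff.
# Usage: retry_with_backoff <max_retries> <initial_delay_seconds> <command> [args...]
retry_with_backoff() {
  local max_retries=$1
  local delay=$2
  shift 2
  local attempt=1
  while [ "$attempt" -le "$max_retries" ]; do
    # Run the command; stop as soon as it succeeds.
    if "$@"; then
      return 0
    fi
    # Only sleep if another attempt remains, doubling the delay each time.
    if [ "$attempt" -lt "$max_retries" ]; then
      echo "Attempt $attempt/$max_retries failed; retrying in ${delay}s" >&2
      sleep "$delay"
      delay=$((delay * 2))
    fi
    attempt=$((attempt + 1))
  done
  # All attempts failed; the caller decides whether to fall back.
  return 1
}
```

A caller could then write something like `retry_with_backoff 3 5 az aks install-cli || <fallback>`, where the retry count and initial delay are placeholder values chosen for illustration.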
    curl -LO "https://github.com/Azure/kubelogin/releases/download/${VERSION}/kubelogin-linux-amd64.zip"
    unzip -o kubelogin-linux-amd64.zip
    sudo mv bin/linux_amd64/kubelogin /usr/local/bin/
    rm -rf kubelogin-linux-amd64.zip bin/
The fallback path introduces a dependency on unzip, but this repo’s setup step only checks for jq and doesn’t install/verify unzip. If the agent image changes (or a self-hosted agent is used), this can fail at runtime. Either ensure unzip is installed/checked before use or extract the archive using a tool already guaranteed to exist in your environment.
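One way to act on this suggestion is to verify the required tools up front and fail with a clear message. The sketch below assumes a hypothetical helper named `require_tools`; it is not part of the PR, and the list of tools to check would depend on the actual script.

```shell
# Hypothetical guard (not from the PR): report every missing tool, then fail.
# Usage: require_tools <tool> [tool...]
require_tools() {
  local missing=0
  local tool
  for tool in "$@"; do
    # command -v is the POSIX way to test whether a command is available.
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "Required tool not found: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `require_tools curl unzip || exit 1` before the fallback download would surface a missing `unzip` immediately instead of partway through the install.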
    install_success=false

    for i in $(seq 1 $max_retries); do
      if az aks install-cli; then
Maybe it would be better to cache this CLI installation in the telescope package so that it does not fail.
I added the binary to the telescope storage account so that it will not time out. Please have a look at this PR:
https://github.com/Azure/telescope/pull/1055/changes
Several of the NAP runs are failing the install kubelogin step due to timeouts. This adds retries and a fallback to a direct install if all attempts fail.