fix: correct order-service image name typo causing ImagePullBackOff #41

Open
github-actions[bot] wants to merge 1 commit into main from fix/cluster-doctor/order-service-image-typo-20260329
Summary

Fixes #40 — ArgoCD agentic-platform-engineering-demo health status Degraded.

Root Cause

A one-character typo in Act-3/argocd/apps/broken-aks-store-all-in-one.yaml caused the order-service Deployment to reference a non-existent container image, triggering ImagePullBackOff and a stuck rolling update.

- image: ghcr.io/azure-samples/aks-store-demo/order-servic:2.1.0
+ image: ghcr.io/azure-samples/aks-store-demo/order-service:2.1.0

The trailing e was missing from order-service.
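
One way to catch this class of typo before merge is to list every image reference in the manifest and check each one against the registry. The following is a minimal sketch, not part of this PR: the `list_images` helper is hypothetical, and it assumes each reference sits on its own `image:` line. The registry check is shown only as a comment because it needs network access.

```shell
# list_images: print the unique container image references found in a manifest.
# Hypothetical helper; assumes each reference is on its own "image: <ref>" line.
list_images() {
  grep -E '^[[:space:]]*-?[[:space:]]*image:[[:space:]]*' "$1" \
    | sed -E 's/^[[:space:]]*-?[[:space:]]*image:[[:space:]]*//; s/[[:space:]]+$//' \
    | tr -d "\"'" \
    | sort -u
}

# Example use (requires registry access, so it is not run here):
#   list_images Act-3/argocd/apps/broken-aks-store-all-in-one.yaml |
#     while read -r ref; do
#       docker manifest inspect "$ref" >/dev/null 2>&1 || echo "not found: $ref"
#     done
```

A reference such as `order-servic:2.1.0` would be reported as not found, provided the registry is reachable and the image is public or credentials are configured.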

Evidence Collected

| Signal | Observation |
| --- | --- |
| Pod order-service-74887bf86-lxmxs | ImagePullBackOff for 2d4h; 13,881 failures |
| Deployment condition | Progressing: False (ProgressDeadlineExceeded) |
| Deployment revision | v17; NewReplicaSet: order-service-74887bf86 (broken) |
| Cluster | msftgbb / agentic-platform-engineering / AKS 1.33.7, canadacentral |
| ArgoCD sync status | Synced (repo matches cluster), but Health: Degraded because the deployment is stuck |
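
The Deployment and Pod signals listed under Evidence Collected can be reproduced with standard `kubectl` JSONPath queries. This is a sketch under assumptions: the helper names `progressing_reason` and `waiting_reasons` are hypothetical, and the `default` namespace and `app=order-service` label are taken from the Test Plan rather than confirmed against the cluster.

```shell
# progressing_reason: print the reason of the Deployment's Progressing condition.
# Hypothetical helper; arguments are the deployment name and the namespace.
progressing_reason() {
  kubectl get deployment "$1" -n "$2" \
    -o jsonpath='{.status.conditions[?(@.type=="Progressing")].reason}'
}

# waiting_reasons: print each matching pod name with its container waiting
# reason(s), for example ImagePullBackOff.
# Hypothetical helper; arguments are the label selector and the namespace.
waiting_reasons() {
  kubectl get pods -l "$1" -n "$2" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].state.waiting.reason}{"\n"}{end}'
}

# Example use against the cluster:
#   progressing_reason order-service default
#   waiting_reasons app=order-service default
```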

Cluster State Before Fix

NAME                                READY   STATUS             RESTARTS   AGE
order-service-575df9db99-x6kmq      1/1     Running            0          (old RS, correct image)
order-service-74887bf86-lxmxs       0/1     ImagePullBackOff   0          2d4h

Impact

  • The order-service Deployment rolling update was permanently blocked.
  • Kubernetes kept the old (working) pod alive, so order processing was not fully interrupted, but the Deployment stopped progressing (ProgressDeadlineExceeded) and ArgoCD health was Degraded.

Fix

Single-character correction: order-servic → order-service in the image field.

Test Plan

  1. Merge this PR.
  2. ArgoCD will detect the manifest change and sync the updated Deployment template.
  3. Kubernetes will create a new ReplicaSet with the correct image and roll forward.
  4. Verify:
    kubectl get pods -n default -l app=order-service
    # Expected: 1/1 Running (new RS), old broken pod terminated
    kubectl get deployment order-service -n default
    # Expected: READY 1/1, UP-TO-DATE 1, AVAILABLE 1
  5. ArgoCD app health should return to Healthy.
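
The verification step could also be scripted so that it blocks until the rollout finishes and then compares the live image with the expected one. This is a minimal sketch: `verify_rollout` is a hypothetical helper, the 180-second timeout is an arbitrary choice, and the comparison assumes the Deployment has a single container.

```shell
# verify_rollout: wait for the rollout to finish, then confirm the live image.
# Hypothetical helper; arguments are deployment, namespace, expected image.
# Assumes a single-container Deployment, so the JSONPath yields one value.
verify_rollout() {
  kubectl rollout status "deployment/$1" -n "$2" --timeout=180s || return 1
  live=$(kubectl get deployment "$1" -n "$2" \
    -o jsonpath='{.spec.template.spec.containers[*].image}')
  [ "$live" = "$3" ]
}

# Example use against the cluster:
#   verify_rollout order-service default \
#     ghcr.io/azure-samples/aks-store-demo/order-service:2.1.0 && echo "rollout OK"
```

The function returns non-zero if the rollout times out or if the Deployment still references a different image.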

Rollback

Revert this PR or run:

kubectl rollout undo deployment/order-service -n default

…ice)

The image name was missing the trailing 'e', causing ImagePullBackOff
on the order-service pod and the ArgoCD app health to degrade.

Fixes #40

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

🚨 ArgoCD Deployment Failed: agentic-platform-engineering-demo