Supervisor Services failed to get images when deploying

Article ID: 389539

Products

VMware vSphere Kubernetes Service

Issue/Introduction

Issue Clarification:  
The kubelet failed to pull the Supervisor Service image, in this case for the Consumption Interface (cci-service), due to a timeout error. The error message indicated that the image was in the "resolving" state on an ESXi node.

Issue Verification:  
1. Verified that the image resolution was successful.  
2. Confirmed that the image remained in a "resolving" state despite resolution success.  
3. Observed that the affected envoy pods were not starting due to image issues.  
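
A quick way to repeat this verification is to filter the image and imagedisk listings for rows that mention the "resolving" state. This is only a minimal sketch: the helper name show_resolving is hypothetical, and it assumes the state text appears in the default table output of kubectl get, which may not hold in every release.

```shell
# Hypothetical helper: keep the header row plus any row that mentions
# "resolving" (case-insensitive). Reads a kubectl table listing on stdin.
# Assumption: the state is visible in the default table output.
show_resolving() {
  awk 'NR == 1 || tolower($0) ~ /resolving/'
}
```

It could be used as `kubectl get image -A | show_resolving`, or in the same way with `kubectl get imagedisks -A`.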

Environment

vCenter 8.0U3

vSphere Supervisor

Cause

The "resolving" state typically indicates an issue with the image resolution process in the container runtime, such as:
1. Incomplete or corrupted image resolution on the node.  
2. Residual artifacts (e.g., imagedisks) associated with the problematic image.  

 

Resolution

1. First, check storage: datastores, policies, and resource limits. Make sure that sufficient space is available and allocated.


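2. Delete the imagedisks and images for cci. The object names and version suffixes below are examples; list the images and imagedisks in your environment first (the check commands in step 4 show them) and substitute the actual names and namespace.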

kubectl delete -n vmware-system-kubeimage imagedisk.imagecontroller.vmware.com/1afcb3c1e07f65f30e1b0c3842a0aca44634cb3af27f4829027d40f42a9c83c7-v4484250
kubectl delete -n vmware-system-kubeimage imagedisk.imagecontroller.vmware.com/4dfaf62e4c45e48a1fb49557dbd6e812a4f7c7e51b81dbc895849c448e9bbcb9-v38231648
kubectl delete -n svc-cci-service-domain-cXXXX image.imagecontroller.vmware.com/cci-namespace-ui-se-d2e627b6ba4f8a15ebc193b46b50326fe80def61-v65467
kubectl delete -n svc-cci-service-domain-cXXXX image.imagecontroller.vmware.com/cci-supervisor-serv-7bc683ef56feae22a116a6f0e5ee94eb5e523f56-v73275
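
Because the names differ per environment, the delete commands can be generated from the image listing rather than typed by hand. The sketch below only prints commands for review and does not run them. The function name gen_cci_image_cleanup is hypothetical, and it assumes that in `kubectl get image -A` output the first column is the namespace, the second is the image name, and the last is the imagedisk name (the same last-column assumption used by the check command in step 4).

```shell
# Hypothetical helper: read `kubectl get image -A` output on stdin and print
# (not execute) delete commands for cci images and their imagedisks.
# Assumptions: $1 = namespace, $2 = image name, $NF = imagedisk name.
gen_cci_image_cleanup() {
  awk '
    $1 ~ /^svc-cci/ {
      n++
      ns[n] = $1; img[n] = $2; disk[n] = $NF
    }
    END {
      # Imagedisks first, then images, matching the order used above
      for (i = 1; i <= n; i++)
        print "kubectl delete -n vmware-system-kubeimage imagedisk.imagecontroller.vmware.com/" disk[i]
      for (i = 1; i <= n; i++)
        print "kubectl delete -n " ns[i] " image.imagecontroller.vmware.com/" img[i]
    }'
}
```

A possible invocation is `kubectl get image -A | gen_cci_image_cleanup`. Review the printed commands before running any of them.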

3. Delete the consumption interface pods. Redeploying the pods should force the system to fetch the image afresh, bypassing the "resolving" state issue.  

# Set a variable for the Consumption Interface namespace (check that this name is correct for your environment)
export TNS="svc-cci-service-domain-cXXXX"

# Check the pods in the namespace
kubectl get pods -n "$TNS"

# Generate a kubectl delete command for each cci pod in the namespace
kubectl get pods -n "$TNS" | awk -v tns="$TNS" '$0 ~ /cci/ {print "kubectl delete pod -n", tns, $1}'

# *** Copy the generated commands and run them to delete the pods ***

# Check that the pods in the namespace come back; this may take a while
kubectl get pods -n "$TNS"
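
While waiting for the pods to come back, it can help to list only the ones that are not yet healthy. The following is a sketch rather than an official procedure: the function name pods_not_ready is hypothetical, and it relies on the standard `kubectl get pods` columns (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Hypothetical helper: read `kubectl get pods -n <namespace>` output on stdin
# and print NAME, READY and STATUS for pods that are not fully ready.
# A pod counts as healthy when STATUS is Running and READY is n/n,
# or when STATUS is Completed.
pods_not_ready() {
  awk 'NR > 1 {
    if ($3 == "Completed") next
    split($2, ready, "/")
    if ($3 != "Running" || ready[1] != ready[2]) print $1, $2, $3
  }'
}
```

For example, `kubectl get pods -n "$TNS" | pods_not_ready` prints nothing once every pod is up. A pod that stays in a status such as ImagePullBackOff suggests the image is still not being pulled.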

4. Check that the images and imagedisks have been recreated.

# Check the images
kubectl get image -A | grep -E "^NAME|svc-cci"

# Check the imagedisks that belong to the cci images
kubectl get imagedisks -A | grep -E "$(kubectl get image -A | grep -E "^NAME|svc-cci" | awk 'BEGIN { ORS = "|" } $0 ~ /cci/ { print $NF }')NAME"

# For information, these imagedisks are typically roughly 80Mi and 820Mi in most environments. Example output:
# kubectl get imagedisks -A | grep -E "$(kubectl get image -A | grep -E "^NAME|svc-cci" | awk 'BEGIN { ORS = "|" } $0 ~ /cci/ { print $NF }')NAME"
# NAMESPACE                 NAME                                                                         STATUS   DISK                                   SIZE
# vmware-system-kubeimage   1afcb3c1e07f65f30e1b0c3842a0aca44634cb3af27f4829027d40f42a9c83c7-v12325951   Ready    868fdc15-13a0-4501-aa51-6c7faa794765   80012Ki
# vmware-system-kubeimage   7719d42ac092eb4f5168501316737db333d07782ea8e6d7d728452216cef7936-v8525705    Ready    03a4aeb0-551c-4e8f-9b71-20a2d9d153dd   837621Ki
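
To compare the recreated imagedisks against the approximate sizes mentioned above, the SIZE column can be converted from Ki to Mi. This is a minimal sketch that assumes the column layout of the sample output (NAME in the second column, SIZE in the last column with a Ki suffix); the function name imagedisk_sizes_mi is hypothetical.

```shell
# Hypothetical helper: read `kubectl get imagedisks -A` output on stdin and
# print each imagedisk name with its size converted from Ki to Mi.
# Rows whose last column does not end in "Ki" (such as the header) are skipped.
imagedisk_sizes_mi() {
  awk '$NF ~ /^[0-9]+Ki$/ {
    ki = $NF
    sub(/Ki$/, "", ki)
    printf "%s %.1fMi\n", $2, ki / 1024
  }'
}
```

Applied to the two sample rows above, this reports about 78.1Mi and 818.0Mi, in line with the rough 80Mi and 820Mi figures.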