TKGi pods fail with ErrImageNeverPull during cluster upgrade

Article ID: 408400

Products

VMware Tanzu Kubernetes Grid Integrated Edition

Issue/Introduction

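A cluster is being upgraded from TKGi 1.20 to 1.21. The upgrade includes a bump of the syncer image used by the vsphere-csi-webhook pod.

However, during the upgrade the vsphere-csi-webhook pods fail with ErrImageNeverPull. Other pods, such as antrea-controller and konnectivity-agent, may show the same status: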

NAMESPACE           NAME                                                              READY   STATUS              RESTARTS   AGE
kube-system         antrea-controller-764ff6d879-v6d4p                                0/1     ErrImageNeverPull   0          3m25s
kube-system         konnectivity-agent-86a8645b-2cbb-4a97-a963-fb7fa21ec924-647hjdh   0/1     ErrImageNeverPull   0          3m31s
kube-system         konnectivity-agent-86a8645b-2cbb-4a97-a963-fb7fa21ec924-64vrppk   0/1     ErrImageNeverPull   0          3m31s
vmware-system-csi   vsphere-csi-webhook-98bf88cd5-khsxf                               0/1     ErrImageNeverPull   0          4m2s
vmware-system-csi   vsphere-csi-webhook-98bf88cd5-ptk6k                               0/1     ErrImageNeverPull   0          4m2s
vmware-system-csi   vsphere-csi-webhook-98bf88cd5-rmtf9                               0/1     ErrImageNeverPull   0          4m2s 

Environment

TKGi 1.21 and 1.22

Cause

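The vsphere-csi-webhook deployment is updated during the upgrade of the Master nodes, but the new image is not available locally on a Worker node until that node has been upgraded. ErrImageNeverPull indicates that the container's image pull policy is Never, so the kubelet does not try to pull the missing image and the pod cannot start.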
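
To see which pods are hitting this condition, the pod listing can be filtered on its STATUS column. The sketch below is illustrative only: the function name never_pull_pods is hypothetical, and it assumes the default column layout of `kubectl get pods -A` (NAMESPACE, NAME, READY, STATUS, ...), as in the output shown in the Issue section.

```shell
#!/bin/sh
# never_pull_pods: hypothetical helper, not part of TKGi or kubectl.
# Reads `kubectl get pods -A` output on stdin and prints "namespace/name"
# for every pod whose STATUS column (4th field) is ErrImageNeverPull.
# The header row is skipped naturally because its 4th field is "STATUS".
never_pull_pods() {
  awk '$4 == "ErrImageNeverPull" { print $1 "/" $2 }'
}

# Example usage (requires cluster access):
#   kubectl get pods -A | never_pull_pods
```

For each pod listed, `kubectl get pod <name> -n <namespace> -o jsonpath='{.spec.nodeName}'` prints the node it was scheduled on, which identifies the Worker node that still lacks the image.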

Resolution

This will be resolved in a future TKGi release.

The issue resolves itself once the vsphere-csi-webhook pods start on an upgraded Worker node.

Alternatively, the image can be manually imported onto the Worker node using the steps below.

Copy the attached image, registry.k8s.io_csi-vsphere_syncer-v3.3.1.tgz, to the Worker node where the pod is attempting to start

bosh -d service-instance_<GUID> scp registry.k8s.io_csi-vsphere_syncer-v3.3.1.tgz <worker ID>:/tmp

Import the image into containerd

bosh -d service-instance_<GUID> ssh <worker ID>
sudo -i
cd /tmp
gunzip registry.k8s.io_csi-vsphere_syncer-v3.3.1.tgz
/var/vcap/packages/containerd/bin/ctr --address /var/vcap/sys/run/containerd/containerd.sock -n k8s.io image import /tmp/registry.k8s.io_csi-vsphere_syncer-v3.3.1.tar
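
The import command refers to a .tar file even though a .tgz was copied, because gunzip treats .tgz as shorthand for .tar.gz and writes the decompressed archive with a .tar suffix. A minimal sketch of that naming rule follows, in case the path needs to be derived in a script; the helper name tar_name_for is hypothetical.

```shell
#!/bin/sh
# tar_name_for: hypothetical helper that mirrors how gunzip names its output.
#   foo.tgz    -> foo.tar
#   foo.tar.gz -> foo.tar
tar_name_for() {
  case "$1" in
    *.tgz) printf '%s\n' "${1%.tgz}.tar" ;;
    *.gz)  printf '%s\n' "${1%.gz}" ;;
    *)     printf '%s\n' "$1" ;;   # not a gzip name; returned unchanged
  esac
}

# Example usage:
#   tar_name_for /tmp/registry.k8s.io_csi-vsphere_syncer-v3.3.1.tgz
```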

Verify that the image has been imported on the Worker node and that the pods are Running

crictl images | grep sync
kubectl get pods -n vmware-system-csi
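
The pod check above can also be scripted. The following is a sketch under stated assumptions rather than a supported tool: the helper name all_running is hypothetical, and it assumes the default single-namespace column layout of `kubectl get pods` (NAME, READY, STATUS, ...).

```shell
#!/bin/sh
# all_running: hypothetical helper, not part of kubectl.
# Reads `kubectl get pods -n <namespace>` output on stdin and succeeds only if
# at least one pod is listed and every pod's STATUS column (3rd field) is Running.
all_running() {
  awk '
    $1 == "NAME" { next }      # skip the header row
    NF == 0      { next }      # skip blank lines
    { seen++ }                 # count every pod row
    $3 != "Running" { bad++ }  # count pods in any other state
    END { exit (seen > 0 && bad == 0) ? 0 : 1 }
  '
}

# Example usage (requires cluster access):
#   kubectl get pods -n vmware-system-csi | all_running && echo "all pods Running"
```

A non-zero exit status means that at least one pod is still in another state, such as ErrImageNeverPull, or that no pods were listed.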

Attachments

registry.k8s.io_csi-vsphere_syncer-v3.3.1.tgz