vSphere Kubernetes Cluster Node Stuck Deleting, Failed to Detach Volumes due to Missing Virtual Machine Image

Article ID: 388682

Products

VMware vSphere with Tanzu, VMware vSphere 7.0 with Tanzu, vSphere with Tanzu, Tanzu Kubernetes Runtime

Issue/Introduction

In a vSphere Kubernetes Cluster, one or more nodes are stuck in Deleting state.

In this scenario, the node is stuck in Deleting state because it is failing to detach volumes from, or attach volumes to, another node which uses an image that no longer exists in the environment.

Pods are stuck in Init state, unable to attach volumes because the necessary volumes are still attached to another node in the cluster which is using the missing image.

 

While connected to the Supervisor cluster context, the following symptoms are present:

  • The affected vSphere Kubernetes Cluster's virtualmachines are on different images; the below is an example and image names will vary by environment:
    • kubectl get vm -o wide -n <affected cluster namespace>

      NAMESPACE     NAME                               POWERSTATE   IMAGE
      <namespace>   <cluster-control-plane-a>          poweredOn    <photon-ova-image-1>
      <namespace>   <cluster-control-plane-b>          poweredOn    <photon-ova-image-1>
      <namespace>   <cluster-control-plane-c>          poweredOn    <photon-ova-image-1>
      <namespace>   <cluster-worker-node-nodepool-z>   poweredOn    <photon-ova-image-2>
      <namespace>   <cluster-worker-node-nodepool-y>   poweredOn    <photon-ova-image-2>
      <namespace>   <cluster-worker-node-nodepool-x>   poweredOn    <photon-ova-image-1>

  • One or more of the above images are no longer present in the environment; image names will vary by environment:
    • kubectl get virtualmachineimage
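    • Example output (illustrative only; columns are abbreviated and image names will vary by environment). In this scenario only <photon-ova-image-1> is returned and <photon-ova-image-2> is absent:

      NAME
      <photon-ova-image-1>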

  • Describing the affected cluster shows an error message similar to the following under Conditions, where "<photon-ova-image-2>" is the missing image:
    • kubectl describe cluster <affected cluster name> -n <affected cluster namespace>
    • message: 'Failed to get VirtualMachineImage <photon-ova-image-2>:
          VirtualMachineImage.vmoperator.vmware.com "<photon-ova-image-2>" not found'

  • The VMOP controller pod logs show error messages similar to the following, where the missing image name will vary by environment:
    • error validating image hardware version for PVC: VirtualMachineImage.vmoperator.vmware.com "<photon-ova-image-2>" not found

  • The capi-controller pod logs show error messages similar to the following regarding the stuck Deleting node:
    • Waiting for machine 1 of 1 to be deleted
    • Waiting for node volumes to be detached

 

While connected to the affected vSphere Kubernetes cluster's context, the following symptoms are present:

  • The Deleting node is in SchedulingDisabled state:
    • kubectl get nodes
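    • Example output (illustrative only; node names, roles, ages and versions will vary by environment). The node pending deletion reports SchedulingDisabled:

      NAME                     STATUS                     ROLES    AGE   VERSION
      <deleting node name>     Ready,SchedulingDisabled   <none>   45d   v1.xx.x
      <other node name>        Ready                      <none>   45d   v1.xx.x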

  • All application pods have been successfully drained from the deleting node:
    • kube-system and vmware-system pods will remain on the node, as these facilitate the draining process.
    • kubectl get pods -A -o wide | grep <deleting node name>
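    • Example output (illustrative only; pod names will vary by environment and columns are abbreviated). Only kube-system and vmware-system pods remain on the node:

      kube-system         kube-proxy-<id>          1/1   Running   0   45d   <pod IP>   <deleting node name>
      vmware-system-csi   vsphere-csi-node-<id>    3/3   Running   0   45d   <pod IP>   <deleting node name>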

  • Application volumes have not finished detaching from the deleting node and show a True state for Attached:
    • kubectl get volumeattachments -A | grep <deleting node name>
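    • Example output (illustrative only; names and ages will vary by environment). ATTACHED remains true for the volume on the deleting node:

      NAME        ATTACHER                 PV                            NODE                   ATTACHED   AGE
      csi-<id>    csi.vsphere.vmware.com   <pvc-pod-persistent-volume>   <deleting node name>   true       45d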

  • The vsphere-csi-controller pod logs in the affected cluster show error messages similar to the below, where "<photon-ova-image-2>" is the missing image and "<pvc-pod-persistent-volume>" is one of the volumes failing to detach from a node in the cluster; both will vary by environment:
    • kubectl logs -n vmware-system-csi vsphere-csi-controller -c csi-attacher
    • error syncing volume "<pvc-pod-persistent-volume>": persistentvolume <pvc-pod-persistent-volume> is still attached to node <deleting node name>
    • Time out to update VirtualMachines "<deleting node name>" with Error: admission webhook "default.validating.virtualmachine.vmoperator.vmware.com" denied the request: spec.imageName: Invalid value: "<photon-ova-image-2>": error validating image hardware version for PVC: VirtualMachineImage.vmoperator.vmware.com "<photon-ova-image-2>" not found

Environment

vSphere with Tanzu 7.0

This issue can occur regardless of whether the vSphere Kubernetes Cluster is managed by Tanzu Mission Control (TMC).

 


Cause

The node is unable to delete because the volumes for the pods originally running on it cannot be detached from the node or attached to another node in the cluster.

CSI is unable to attach or detach volumes because it cannot locate the image used by the node it is trying to detach volumes from or attach volumes to.

This is caused by a manual change to the content library attached to the affected cluster's namespace, or by manual removal of the image noted above.

Removing a content library or its images would not normally trigger this missing image error because the content library service maintains a cache of the images.

In this scenario, the content library or image was removed so long ago that the image cache entry no longer exists.

 

This issue can also occur if the missing image was renamed instead of being removed from the content library. It is not supported to rename images in content libraries.

 


Resolution

The missing image needs to be re-added into the environment so that the vmop-controller-manager pod in the Supervisor cluster, which reconciles virtual machine images periodically, can make it available again.

IMPORTANT: Changes to content libraries and images used by existing VMs will result in rolling redeployments of all nodes in every cluster in the Supervisor cluster that uses that content library or image.

  1. Connect to the Supervisor cluster context.

  2. Check the content sources in the environment:
    • The contentsource IDs match the corresponding content library IDs found in the vSphere web client
    • kubectl get contentsources -A
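    • Example output (illustrative only; IDs and ages will vary by environment). Each entry's name is a content library ID that can be cross-referenced in the vSphere web client:

      NAME                                   AGE
      <content library ID>                   300d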

  3. Confirm which virtualmachineimages the VMs in the affected cluster are running on:
    • kubectl get vm -o wide -n <affected cluster namespace>

  4. Compare the above virtualmachineimages to the virtualmachineimages in the environment:
    • kubectl get virtualmachineimage

  5. The missing virtualmachineimage will need to be added back into the content library that the affected cluster's namespace is associated with.
  6. Once the missing image is present again in the associated content library, the system will need to reconcile the image in the Supervisor cluster.
    • The vmop-controller-manager pod in the Supervisor cluster reconciles virtual machine images periodically; however, it can be restarted to clean up stale images and quickly reconcile the images available in the environment.
    • kubectl rollout restart deploy -n vmware-system-vmop vmop-controller-manager
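    • Expected confirmation output from the restart command:

      deployment.apps/vmop-controller-manager restarted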

  7. Confirm that the missing virtualmachineimage is now present again in the environment:
    • kubectl get virtualmachineimage

  8. Monitor the progress of the rolling redeployment for the affected cluster's nodes on the desired virtualmachineimage:
    • Note: It is expected that all nodes in the cluster will rolling-redeploy onto the same image, even if a node is already on that image.
    • watch kubectl get vm,ma -o wide -n <affected cluster namespace>
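    • Example output during the redeployment (illustrative only; names and images will vary by environment, and machine objects are omitted here for brevity). Replacement nodes are created on the desired image while the old nodes are powered off and deleted:

      NAME                                   POWERSTATE   IMAGE
      <replacement worker node name>         poweredOn    <desired image>
      <cluster-worker-node-nodepool-z>       poweredOff   <photon-ova-image-2>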

Additional Information

Duplicate image issues have been resolved in vSphere with Tanzu 8.0 and higher.