Harbor Pods Stuck in PodVMCreationFailed State with “Cannot Find VM” Error

Article ID: 420557

Products

VMware vSphere Kubernetes Service

Issue/Introduction

In certain environments, administrators may observe that Harbor pods fail to start and remain stuck in the PodVMCreationFailed state. Running kubectl describe pod -n <namespace> shows the following error message:

Status: Failed
Reason: PodVMCreationFailed
Message: Cannot find VM vm-<ID>

This condition prevents Harbor pods from running successfully and can disrupt container registry services. The issue is typically encountered when the underlying VM resources required for pod scheduling are unavailable.
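When several pods are affected, it can be quicker to list them in one pass than to describe each pod individually. The following is a minimal sketch, assuming that the Reason and Message lines shown by kubectl describe correspond to the pod's status.reason and status.message fields. The function name list_podvm_failures is hypothetical and is not part of any VMware or Kubernetes tooling.

```shell
#!/usr/bin/env bash
# Sketch only: list pods in a namespace whose status.reason is
# PodVMCreationFailed, together with the reported status message.
# The function name is illustrative (an assumption, not a vendor tool).
#
# Usage: list_podvm_failures <namespace>
list_podvm_failures() {
  local ns="$1"
  # The JSONPath filter keeps only pods whose status.reason matches,
  # then prints "<pod-name><TAB><status.message>" per line.
  kubectl get pods -n "$ns" \
    -o jsonpath='{range .items[?(@.status.reason=="PodVMCreationFailed")]}{.metadata.name}{"\t"}{.status.message}{"\n"}{end}'
}
```

An empty result suggests that no pod in the namespace currently reports this reason, in which case kubectl describe on the individual pod remains the authoritative check.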

Environment

VMware vSphere Kubernetes Service
Supervisor Service using vSphere Pods

Cause

The PodVMCreationFailed state occurs when Kubernetes attempts to schedule a pod onto a virtual machine that is no longer available or accessible. In such cases, the scheduler references a VM ID that cannot be found, resulting in the error message:

Reason: PodVMCreationFailed
Message: Cannot find VM vm-<ID>

This condition is typically triggered by underlying infrastructure faults such as:

Power-off events affecting the control plane VM.
I/O write failures or storage issues that cause VM files to become truncated or corrupted.
Unexpected removal or unavailability of VM resources referenced by the pod.

Because the pod depends on the control plane VM for creation, any disruption to that VM prevents the pod from being instantiated successfully, leaving it stuck in the PodVMCreationFailed state.

Resolution

To remediate the issue, complete both parts below:

Part A: Reference Guidance

Follow the resolution steps in Broadcom Knowledge Base article 389895: https://knowledge.broadcom.com/external/article?articleNumber=389895

Part B: Pod Cleanup Procedure

1. Delete all pods that are not in a running state:

kubectl delete pod -n <namespace> $(kubectl get pods -n <namespace> --field-selector=status.phase!=Running -o jsonpath='{.items[*].metadata.name}')

2. Force delete pods if standard deletion fails:

kubectl delete pod <pod-name> -n <namespace> --grace-period=0 --force
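The cleanup steps above can be wrapped in a small helper that skips the delete when no pods match, since kubectl delete pod returns an error when it is given no pod names. This is a sketch under stated assumptions, not a vendor-supplied script: the function name cleanup_non_running_pods and its optional --force argument are hypothetical. Note that the status.phase!=Running selector also matches pods in the Succeeded phase, so review the list before deleting if the namespace contains completed pods that should be kept.

```shell
#!/usr/bin/env bash
# Sketch only: delete every pod in a namespace that is not in the Running
# phase. Function name and the optional "--force" argument are illustrative.
#
# Usage: cleanup_non_running_pods <namespace> [--force]
cleanup_non_running_pods() {
  local ns="$1" mode="${2:-}"
  local pods

  # Collect the names of all pods whose phase is not Running
  # (space-separated, as produced by the JSONPath expression).
  pods="$(kubectl get pods -n "$ns" --field-selector=status.phase!=Running \
    -o jsonpath='{.items[*].metadata.name}')"

  # Nothing to delete: exit cleanly instead of calling kubectl delete
  # without any pod names.
  if [ -z "$pods" ]; then
    echo "No non-Running pods found in namespace $ns"
    return 0
  fi

  # Word splitting of $pods is intentional: each name becomes one argument.
  # shellcheck disable=SC2086
  if [ "$mode" = "--force" ]; then
    # Step 2: force deletion, for pods that a standard delete cannot remove.
    kubectl delete pod -n "$ns" $pods --grace-period=0 --force
  else
    # Step 1: standard deletion.
    kubectl delete pod -n "$ns" $pods
  fi
}
```

Run it first without --force, for example cleanup_non_running_pods <namespace>, and add --force only if the standard deletion does not complete.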

Additional Information

For detailed instructions and further context, refer to the Broadcom Knowledge Base article: https://knowledge.broadcom.com/external/article?articleNumber=389895