All pods are in Pending State and unable to locate the workload/VKS cluster worker nodes.

Article ID: 389359

Products

VMware vSphere Kubernetes Service

Issue/Introduction

  • All pods are in a Pending state, or pod status cannot be viewed, when running the command: kubectl get pods -n <namespace>
  • All VKS worker nodes have disappeared from the vCenter UI, and worker node status cannot be viewed over SSH from either the Supervisor or the VKS cluster.
  • vCenter's /var/log/vmware/vmon/vmon.log shows DRS-related error messages such as the following:

    YYYY-MM-DDTHH:MM:SSZ stderr F E0201      1 controller.go:317] controller/virtualmachine "msg"="Reconciler error" "error"="deploy from content library failed for image \"ob-####-tkgs-ova-photon-3-v####\": deploy error: Target datastore must be specified in order to deploy the OVF template to the vSphere DRS disabled cluster General-Cluster." "name"="worker-nodepool-####" "namespace"="general" "reconciler group"="vmoperator.vmware.com" "reconciler kind"="VirtualMachine"

    YYYY-MM-DDTHH:MM:SSZ  stderr F E0201     1 virtualmachine_controller.go:748] VirtualMachine "msg"="Provider failed to create VirtualMachine" "error"="deploy from content library failed for image \"ob-####-tkgs-ova-photon-3-v####\": deploy error: Target datastore must be specified in order to deploy the OVF template to the vSphere DRS disabled cluster General-Cluster." "name"="general/worker-nodepool-####"

    YYYY-MM-DDTHH:MM:SSZ  stderr F E0201     1 virtualmachine_controller.go:263] VirtualMachine "msg"="Failed to reconcile VirtualMachine" "error"="deploy from content library failed for image \"ob-####-tkgs-ova-photon-3-v####\": deploy error: Target datastore must be specified in order to deploy the OVF template to the vSphere DRS disabled cluster General-Cluster." "name"="general/worker-nodepool-####"
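
To check a saved copy of the log for this failure signature, a minimal Python sketch along the following lines may help. The function name `find_drs_disabled_clusters` and the regular expression are illustrative assumptions derived only from the sample messages above; this is not a VMware-provided tool, and the message format may differ between releases.

```python
import re
from typing import Iterable, Set

# Pattern built from the sample messages above (an assumption, not an official format).
# The cluster name is captured up to the closing '."' of the quoted error string.
DRS_DISABLED = re.compile(
    r'Target datastore must be specified in order to deploy the OVF template '
    r'to the vSphere DRS disabled cluster (?P<cluster>.+?)\."'
)


def find_drs_disabled_clusters(lines: Iterable[str]) -> Set[str]:
    """Return the cluster names mentioned in DRS-disabled deploy errors."""
    clusters = set()
    for line in lines:
        match = DRS_DISABLED.search(line)
        if match:
            clusters.add(match.group("cluster"))
    return clusters


# Shortened sample modelled on the messages shown above.
sample = (
    'deploy error: Target datastore must be specified in order to deploy the '
    'OVF template to the vSphere DRS disabled cluster General-Cluster." '
    '"name"="worker-nodepool-####"'
)
print(find_drs_disabled_clusters([sample]))  # prints {'General-Cluster'}
```

Passing an open file handle instead of a list scans the log line by line, so the same function can be applied to a copy of the log file.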

Environment

vSphere Kubernetes Service

Cause

Fully automated DRS is a prerequisite for vSphere with Tanzu environments. This issue occurs when DRS is disabled on the cluster, or when its automation level is set to manual or partially automated.
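
The condition above can be expressed as a small check. The sketch below assumes the two values have already been read from the cluster's DRS configuration (in the vSphere API these correspond to the `enabled` and `defaultVmBehavior` fields of `ClusterDrsConfigInfo`, whose behavior values are `manual`, `partiallyAutomated`, and `fullyAutomated`). The function name `drs_meets_prerequisite` is hypothetical, and how the values are retrieved is deliberately left out.

```python
def drs_meets_prerequisite(enabled: bool, behavior: str) -> bool:
    """Return True only when DRS is enabled and fully automated.

    `behavior` is expected to be one of the vSphere DrsBehavior values:
    "manual", "partiallyAutomated", or "fullyAutomated".
    """
    return bool(enabled) and behavior == "fullyAutomated"


# Each of the three failing configurations named above, then the passing one.
for enabled, behavior in [
    (False, "fullyAutomated"),     # DRS disabled
    (True, "manual"),              # manual mode
    (True, "partiallyAutomated"),  # partially automated mode
    (True, "fullyAutomated"),      # meets the prerequisite
]:
    print(enabled, behavior, drs_meets_prerequisite(enabled, behavior))
```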

Resolution

Enable DRS on the vSphere cluster and set its automation level to fully automated. After this change, all worker nodes and pods gradually transition to a ready state.
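
Because recovery is gradual, it can be useful to track which pods have not yet come up. The following sketch parses the JSON produced by `kubectl get pods -n <namespace> -o json` and lists pods whose phase is neither `Running` nor `Succeeded`. The function name `pods_not_running` is hypothetical, and the check looks only at the pod phase, not at container readiness.

```python
import json
from typing import List


def pods_not_running(kubectl_json: str) -> List[str]:
    """Return names of pods whose phase is not Running or Succeeded."""
    pod_list = json.loads(kubectl_json)
    waiting = []
    for pod in pod_list.get("items", []):
        # Pods with no reported phase are treated as not yet running.
        phase = pod.get("status", {}).get("phase", "Unknown")
        if phase not in ("Running", "Succeeded"):
            waiting.append(pod["metadata"]["name"])
    return waiting


# Illustrative input with made-up pod names, shaped like `kubectl ... -o json` output.
example = json.dumps({
    "items": [
        {"metadata": {"name": "web-0"}, "status": {"phase": "Running"}},
        {"metadata": {"name": "web-1"}, "status": {"phase": "Pending"}},
    ]
})
print(pods_not_running(example))  # prints ['web-1']
```

An empty list indicates that every pod in the namespace has reached the Running or Succeeded phase.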