Issue with kubernetes VM Creation in the vCenter


Article ID: 392475


Products

Tanzu Kubernetes Runtime, VMware Telco Cloud Automation

Issue/Introduction

  • The VM is not created in vCenter.
  • The corresponding Machine resource is stuck in the Pending state.
  • Nodes remain in a NotReady state after the CPU and memory of the worker nodes are resized with kubectl edit cluster.

Environment

2.x, 3.x

Cause

The issue is primarily caused by stale or inconsistent resource states in the Cluster API (CAPI) and Cluster API Provider vSphere (CAPV) controllers, or by an inconsistency in the NodeConfig Operator. Either condition prevents the management cluster from communicating with vCenter to initiate VM cloning or creation, so the Machine resource stays in the Pending state indefinitely.

Resolution

  • If the VM is not created in vCenter and the Machine is in a Pending state, complete the resource cleanup in step 1 before restarting the components in step 2:

    1. Resource Cleanup
      • Delete the following Kubernetes resources related to the failed provisioning:
        • List the Machines of the affected cluster that are stuck in Pending (replace <cluster-name> with the name of the cluster):
          kubectl get machines -A | grep <cluster-name> | grep Pending
        • Delete each stuck Machine:
          kubectl delete machine <machine-name> -n <namespace>
        • List every Machine that has a vSphere provider ID, together with the name of its associated VSphereMachine:
          kubectl get machines -A -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.infrastructureRef.name}{"\t"}{.spec.providerID}{"\n"}{end}' | grep vsphere://
        • Delete the VSphereMachine and VSphereVM associated with the provider ID you want to remove:
          kubectl delete vspheremachine <vspheremachine-name> -n <namespace>
          kubectl delete vspherevm <vspherevm-name> -n <namespace>
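
Because a plain grep on the cluster name matches any column of the output, a stricter filter may be useful when several clusters share a name prefix. The following is a minimal sketch, not part of the original procedure: the helper name pending_machines is hypothetical, and it assumes the Machine objects expose .spec.clusterName and .status.phase, as Cluster API Machines normally do.

```shell
#!/bin/sh
# Hypothetical helper: reads rows of "namespace name cluster phase" on stdin
# and prints "namespace name" for Machines that belong to the given cluster
# and whose phase is exactly "Pending".
pending_machines() {
  awk -v cluster="$1" '$3 == cluster && $4 == "Pending" { print $1, $2 }'
}

# Intended usage against a live management cluster (not executed here):
#   kubectl get machines -A --no-headers \
#     -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,CLUSTER:.spec.clusterName,PHASE:.status.phase \
#     | pending_machines <cluster-name>
```

Each output row gives the namespace and name to pass to kubectl delete machine. Review the list before deleting anything.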
    2. Restart Components
      • Restart the following pods in the management cluster:
        • List the CAPI and CAPV pods:
          kubectl get pods -n capi-system && kubectl get pods -n capv-system
        • Restart the CAPI and CAPV pods by deleting them; their Deployments recreate them automatically:
          kubectl delete pods --all -n capi-system && kubectl delete pods --all -n capv-system
        • List the nodeconfig-operator pod:
          kubectl get pods -n tca-system | grep nodeconfig-operator
        • Restart the nodeconfig-operator pod:
          kubectl delete pod -n tca-system $(kubectl get pods -n tca-system --no-headers -o custom-columns=":metadata.name" | grep ^nodeconfig-operator-)
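
The CAPI and CAPV restart above can be wrapped so that the script also waits for the replacement pods to become Ready. This is a sketch under assumptions: the helper name restart_namespace_pods, the KUBECTL override variable, and the 300-second timeout are illustrative choices, not values from this article. It applies only to namespaces where every pod should be restarted (capi-system and capv-system), not to tca-system.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the
# commands without touching a cluster. Defaults to the real kubectl.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: delete every pod in the namespace, then wait until
# the recreated pods report the Ready condition. The timeout is an assumption.
restart_namespace_pods() {
  ns="$1"
  $KUBECTL delete pods --all -n "$ns" &&
    $KUBECTL wait --for=condition=Ready pods --all -n "$ns" --timeout=300s
}

# Intended usage (not executed here):
#   restart_namespace_pods capi-system && restart_namespace_pods capv-system
```

Running the helper with KUBECTL=echo first prints the two commands it would issue, which is a cheap way to check the namespace argument.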

       

    3. Validation
      • After restarting the components, verify that the VMs are created successfully in vCenter.
      • Confirm that provisioning completes as expected and that the nodes join the cluster and reach the Ready state.
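
The validation can also be checked from the management cluster by looking for Machines that have not reached the Running phase. The helper below is a hedged sketch: the name machines_not_running is hypothetical, and it assumes the phase is read from .status.phase of each Machine.

```shell
#!/bin/sh
# Hypothetical helper: reads rows of "namespace name phase" on stdin, prints
# every row whose phase is not "Running", and returns a non-zero exit status
# if at least one such row was found.
machines_not_running() {
  awk '$3 != "Running" { print; bad = 1 } END { exit bad + 0 }'
}

# Intended usage against a live management cluster (not executed here):
#   kubectl get machines -A --no-headers \
#     -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,PHASE:.status.phase \
#     | machines_not_running
```

An empty result with exit status 0 suggests that all Machines are Running. Node readiness still needs to be confirmed separately with kubectl get nodes on the workload cluster.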