Issue with Kubernetes VM Creation in vCenter

Products

Tanzu Kubernetes Runtime
VMware Telco Cloud Automation

Issue/Introduction

The VM is not created in vCenter.
The corresponding Machine resource is stuck in the Pending state.
Nodes are in a NotReady state after the CPU and memory of the worker nodes are resized using kubectl edit cluster.

Environment

2.x, 3.x

Cause

The issue is primarily caused by stale or inconsistent resource states within the Cluster API (CAPI) and Cluster API Provider vSphere (CAPV) controllers, or by an inconsistency in the NodeConfig Operator. This prevents the Management Cluster from communicating with vCenter to initiate VM cloning or creation, leaving the Machine resource stuck in the Pending state.
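
To confirm that a Machine is stuck rather than still provisioning, it can help to read its status conditions on the management cluster. The following is a minimal sketch, not part of the documented procedure; `machine_conditions` is a hypothetical helper name, and it assumes the Machine resource exposes `.status.conditions` as in current Cluster API versions.

```shell
# Sketch: print the status conditions of one Machine as tab-separated rows
# (type, status, reason, message). Run against the management cluster.
# machine_conditions is a hypothetical helper name, not part of any product.
machine_conditions() {
  local name="$1" namespace="$2"
  kubectl get machine "$name" -n "$namespace" \
    -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.reason}{"\t"}{.message}{"\n"}{end}'
}
```

Example usage: `machine_conditions <machine-name> <namespace>`. A condition whose status is False, together with its reason and message, usually indicates which stage of provisioning is blocked.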

Resolution

If the VM is not created in vCenter and the Machine is in a Pending state, please follow the steps below before restarting the components:
1. Resource Cleanup
  - Delete the following Kubernetes resources related to the failed provisioning:
    - Use the following command to list the Pending machines of the affected cluster:
```
kubectl get machines -A | grep <cluster-name> | grep Pending
```
      Use the following command to delete each Pending machine:
```
kubectl delete machine <machine-name> -n <namespace>
```
      Use the following command to list each Machine that has a vSphere provider ID, together with the name of its associated VSphereMachine:
```
kubectl get machines -A -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.infrastructureRef.name}{"\t"}{.spec.providerID}{"\n"}{end}' | grep vsphere://
```
      Use the following commands to delete the VSphereMachine and VSphereVM associated with the Machine you want to remove:
```
kubectl delete vspheremachine <vspheremachine-name> -n <namespace>
kubectl delete vspherevm <vspherevm-name> -n <namespace>
```
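The `grep`-based listing above matches the cluster name and the word Pending anywhere in a row. As a stricter alternative, the sketch below compares exact column values. It is an illustration under assumptions: `list_machine_rows` and `filter_pending` are hypothetical helper names, and it assumes the Machine resource exposes `.spec.clusterName` and `.status.phase`.

```shell
# Sketch: print "<namespace> <machine-name>" for Pending Machines of one cluster.
# Both function names are hypothetical helpers, not part of any product.

# Reads rows of "namespace name cluster phase" on stdin and keeps exact matches.
filter_pending() {
  local cluster="$1"
  awk -v c="$cluster" '$3 == c && $4 == "Pending" { print $1, $2 }'
}

# Produces the rows that filter_pending expects.
# Requires kubectl access to the management cluster.
list_machine_rows() {
  kubectl get machines -A --no-headers \
    -o custom-columns='NS:.metadata.namespace,NAME:.metadata.name,CLUSTER:.spec.clusterName,PHASE:.status.phase'
}
```

Example usage: `list_machine_rows | filter_pending <cluster-name>`. Review the output before deleting anything; each row gives the namespace and name to pass to `kubectl delete machine`.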
2. Restart Components
  Restart the following pods in the management cluster:
  • Use the following command to list the CAPI and CAPV pods:
```
kubectl get pods -n capi-system && kubectl get pods -n capv-system
```
  • Use the following command to restart the CAPI and CAPV pods:
```
kubectl delete pods --all -n capi-system && kubectl delete pods --all -n capv-system
```
  • Use the following command to list the nodeconfig-operator pod:
```
kubectl get pods -n tca-system | grep nodeconfig-operator
```
  • Use the following command to restart the nodeconfig-operator pod:
```
kubectl delete pod -n tca-system $(kubectl get pods -n tca-system --no-headers -o custom-columns=":metadata.name" | grep ^nodeconfig-operator-)
```
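After the pods are deleted, their controllers recreate them. The sketch below waits for that to finish before moving on to validation. It assumes the restarted components run as Deployments, which is not stated in this article; `wait_for_controllers` is a hypothetical helper name and the 300-second timeout is an arbitrary choice.

```shell
# Sketch: wait until every Deployment in each given namespace reports Available.
# wait_for_controllers is a hypothetical helper name; the timeout is arbitrary.
wait_for_controllers() {
  local ns
  for ns in "$@"; do
    # Stop at the first namespace that does not become Available in time.
    kubectl wait --for=condition=Available deployment --all \
      -n "$ns" --timeout=300s || return 1
  done
}
```

Example usage: `wait_for_controllers capi-system capv-system`. Because a Deployment can still report Available for a moment right after its pods are deleted, it is worth also listing the pods again and checking that they are Running with a recent age.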
3. Validation:
  • After restarting the components, verify that the VMs are being created successfully in vCenter.
  • Confirm that provisioning completes as expected and that the nodes join the cluster and reach the Ready state.
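
The validation above can be partly checked from the management cluster by looking for Machines that have not reached the Running phase. This is a minimal sketch under assumptions: `list_machine_phases` and `not_running` are hypothetical helper names, and it assumes the Machine resource exposes `.status.phase`.

```shell
# Sketch: report Machines whose phase is not Running.
# Both function names are hypothetical helpers, not part of any product.

# Reads rows of "namespace name phase" on stdin, prints the rows that are not
# Running, and returns non-zero if at least one such row was found.
not_running() {
  awk '$3 != "Running" { print; bad = 1 } END { exit bad + 0 }'
}

# Produces the rows that not_running expects.
# Requires kubectl access to the management cluster.
list_machine_phases() {
  kubectl get machines -A --no-headers \
    -o custom-columns='NS:.metadata.namespace,NAME:.metadata.name,PHASE:.status.phase'
}
```

Example usage: `list_machine_phases | not_running && echo "all Machines are Running"`. Node readiness is checked separately with `kubectl get nodes` against the workload cluster.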

Article ID: 392475