Node in NotReady state and Machine resource failed or stuck in Provisioning state

Article ID: 378405

Products

VMware Telco Cloud Automation

Issue/Introduction

  • Machine resources in TKG may show as failed or remain stuck in a provisioning state.

Environment

VMware Telco Cloud Automation 2.3

Cause

  • Sometimes, after an unexpected reboot of the nodes or the cluster, nodes appear in NotReady state.
  • An application or workload issue can also leave the node(s) in NotReady state.
  • The Machine resource remains stuck in provisioning because a pod on the node is not in Ready status.
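The symptom above can be confirmed with a few kubectl checks against the TKG management cluster. This is a sketch: the namespace and node name are hypothetical placeholders, and DRY_RUN defaults to 1 so the commands are only printed for review rather than executed.

```shell
#!/bin/sh
# Placeholders: substitute your own namespace and node name.
NAMESPACE="${NAMESPACE:-tkg-ns}"
NODENAME="${NODENAME:-workload-node-1}"

# DRY_RUN defaults to 1: print each command instead of running it.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$@"; else "$@"; fi; }

# Affected nodes show STATUS "NotReady".
run kubectl get nodes
# Affected Machines show PHASE "Provisioning" or "Failed".
run kubectl get machines -n "$NAMESPACE"
# Pods that are not Ready on the node can hold the Machine back.
run kubectl get pods -A --field-selector "spec.nodeName=$NODENAME"
```

Set DRY_RUN=0 (with kubectl pointed at the management cluster context) to execute the checks for real.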

Resolution

Remove the Machine, VSphereMachine, and VSphereVM resources and let CAPI/CAPV (Cluster API and its vSphere provider) recreate them, using the following steps:

  1. Identify the Machine resource in the TKG management cluster:
    kubectl get machine -n NAMESPACE | grep MACHINENAME
  2. Delete the Machine resource:

    kubectl delete machine MACHINENAME -n NAMESPACE


    NOTE: If step 2 does not automatically provision a new machine in Running status, proceed with the next steps.

  3. Delete the VSphereMachine resource:

    kubectl delete vspheremachine VSPHEREMACHINENAME -n NAMESPACE
  4. Delete the VSphereVM resource:

    kubectl delete vspherevm VSPHEREVMNAME -n NAMESPACE
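The four steps above can be sketched as a single script. The namespace and machine name are hypothetical placeholders, and the script assumes the VSphereMachine/VSphereVM names match the Machine name (in CAPV they often do, but verify with kubectl get first). DRY_RUN defaults to 1, so the commands are printed for review rather than executed.

```shell
#!/bin/sh
# Placeholders: substitute the values from your own environment.
NAMESPACE="${NAMESPACE:-tkg-ns}"
MACHINENAME="${MACHINENAME:-workload-md-0-abcde}"

# DRY_RUN defaults to 1: print each command instead of running it.
# Set DRY_RUN=0 to actually run against the management cluster.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$@"; else "$@"; fi; }

# Step 1: identify the stuck Machine resource.
run kubectl get machine -n "$NAMESPACE"

# Step 2: delete the Machine and let CAPI/CAPV recreate it.
run kubectl delete machine "$MACHINENAME" -n "$NAMESPACE"

# Steps 3-4: only needed if a new Machine does not reach Running status.
run kubectl delete vspheremachine "$MACHINENAME" -n "$NAMESPACE"
run kubectl delete vspherevm "$MACHINENAME" -n "$NAMESPACE"
```

Review the printed commands, then re-run with DRY_RUN=0 to apply them.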

Additional Information

After the machines are removed from the CLI, the nodes may be recreated automatically because Machine Health Check is enabled.
To synchronize the replica count for the nodes, edit the cluster configuration from the TCA-M GUI with the correct node replica count and wait for the nodes to be provisioned.