The cluster worker node does not reach the Running phase within the expected time

Article ID: 422292

Updated On:

Products

VMware Tanzu Mission Control - SM

Issue/Introduction

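A VKS cluster worker node fails to come up and reach the Running phase within the expected time, for example more than 30 minutes after creation.

Check the node provisioning progress with the command “kubectl get ma -n $NS”, where $NS is the namespace that contains the cluster's Machine objects. Every machine should be in the Running phase. In the example below, the machine is still in the Provisioned phase after 38 minutes: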

NAME                    CLUSTER   NODENAME   PROVIDERID                                       PHASE         AGE     VERSION
tmc-sm-sznsx-wf8vz      tmc-sm               vsphere://423ec211-43bc-4ad5-e175-075888e6d4d0   Provisioned   38m16s   v1.30.1+vmware.1-fips
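When a namespace contains many machines, the check above can be narrowed to the ones that are not yet Running. The following is a minimal sketch, not part of the original article: the function name not_running_machines is hypothetical, and it assumes the Machine objects expose their phase in .status.phase, as Cluster API Machines do.

```shell
#!/bin/sh
# Sketch: list Machines in a namespace whose phase is not "Running".
# Assumption: the phase is available at .status.phase on each Machine.
# Prints one "name<TAB>phase" line per machine that is not Running.
not_running_machines() {
  kubectl get ma -n "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}' |
    awk -F '\t' '$2 != "Running" { print $1 "\t" $2 }'
}

# Usage (run manually): not_running_machines "$NS"
```

An empty result means every machine in the namespace reports the Running phase.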

Environment

VKS

Tanzu Mission Control - SM

Resolution

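Delete the stuck machine so that it is recreated, using the command “kubectl delete ma $machineName -n $NS”. For the machine in the example above, the command is “kubectl delete ma tmc-sm-sznsx-wf8vz -n $NS”. Do not prefix the literal machine name with “$”, because the shell would then treat it as a variable.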
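To reduce the risk of deleting a healthy machine by mistake, the delete can be guarded by a phase check. This is a hedged sketch rather than a documented procedure: the function name delete_if_not_running is hypothetical, and it again assumes the phase is readable from .status.phase.

```shell
#!/bin/sh
# Sketch: delete a Machine only when it is not in the Running phase.
# Arguments: $1 = namespace, $2 = machine name.
delete_if_not_running() {
  ns=$1
  name=$2
  # Read the current phase; stop if the Machine cannot be read.
  phase=$(kubectl get ma "$name" -n "$ns" -o jsonpath='{.status.phase}') || return 1
  if [ "$phase" = "Running" ]; then
    echo "skip: $name is Running"
    return 0
  fi
  # The controller that owns the Machine is expected to create a replacement.
  kubectl delete ma "$name" -n "$ns"
}

# Usage (run manually): delete_if_not_running "$NS" tmc-sm-sznsx-wf8vz
```

After the delete, rerunning “kubectl get ma -n $NS” shows whether the replacement machine progresses to Running.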

Additional Information

Sometimes a VKS cluster worker node gets stuck in the deleting state for an extended period, for example more than 30 minutes, because some Pods cannot be evicted.

Solution:

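Drain the node from the cluster with the command “kubectl drain $nodeName --ignore-daemonsets --delete-emptydir-data --disable-eviction”. The --disable-eviction flag makes kubectl delete the Pods directly instead of using the eviction API, so PodDisruptionBudgets are bypassed; use it with caution.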
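Because --disable-eviction bypasses PodDisruptionBudgets, one cautious variation is to attempt a normal eviction-based drain first and fall back to the forced drain only if that fails. The sketch below is an assumption layered on top of the article's command: the function name drain_node and the 120s default timeout are hypothetical choices, while --timeout is a standard kubectl drain flag.

```shell
#!/bin/sh
# Sketch: try an eviction-based drain first, then fall back to deleting Pods.
# Arguments: $1 = node name, $2 = optional timeout for the first attempt.
drain_node() {
  node=$1
  timeout=${2:-120s}
  # First attempt respects PodDisruptionBudgets and gives up after $timeout.
  if kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data --timeout="$timeout"; then
    return 0
  fi
  echo "eviction-based drain failed; retrying with --disable-eviction" >&2
  # Fallback is the command from the article: Pods are deleted, not evicted.
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data --disable-eviction
}

# Usage (run manually): drain_node "$nodeName"
```

To see which Pods are still scheduled on the node before draining, “kubectl get pods -A --field-selector spec.nodeName=$nodeName” can help identify the ones blocking eviction.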
