When a worker node is rebooted, it will come back online in a 'Ready,SchedulingDisabled' state:
NAME                                                       STATUS                     ROLES           AGE    VERSION
mit-shared-prod-test-1-cdm64-xmlsf                         Ready                      control-plane   2d4h   v1.27.5+vmware.1
mit-shared-prod-test-1-lin-0-v5g7p-cfbbbc7f7xf55qp-wt58q   Ready,SchedulingDisabled   <none>          29h    v1.27.5+vmware.1
You will also notice that when checking the machine's status from the management cluster context, it appears in a deleting state.
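To see this, switch to the management cluster context and list the Cluster API machine objects; the context name below is a placeholder for your environment:
kubectl config use-context <management-cluster-context>
kubectl get machines -A
The affected machine will report a PHASE of 'Deleting' in this output.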
When a MachineHealthCheck (MHC) is configured on a workload cluster, the default timeout before a machine is marked as unhealthy and scheduled for deletion is 5 minutes. This issue arises when, from the management cluster's perspective, the node takes longer than 5 minutes to come back online. There are two possible reasons for this:
- The node may simply take longer than 5 minutes to reboot and rejoin the cluster while the MHC is still using its default unhealthy conditions, which correspond to:
tanzu cluster machinehealthcheck node set test-cluster --unhealthy-conditions "Ready:False:5m,Ready:Unknown:5m"
- Another possible cause for this issue is a time discrepancy between the workload cluster and the management cluster. Since it typically takes around two to three minutes for a node to transition from "NotReady" to "Ready" after a reboot, a time difference of more than three minutes can lead to the node being flagged for deletion if the MHC is using the default 5-minute thresholds.
To address this, ensure proper time synchronisation by configuring NTP servers on both the management and workload clusters, as outlined in this KB article:
https://knowledge.broadcom.com/external/article?articleNumber=337407
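If you want to confirm that the clocks actually differ, one simple check (assuming you have SSH access to a node in each cluster, for example as the capv user on vSphere-based deployments; the IP addresses below are placeholders) is to compare UTC timestamps:
ssh capv@<management-cluster-node-ip> date -u
ssh capv@<workload-cluster-node-ip> date -u
A difference of more than a couple of minutes between the two outputs indicates the time discrepancy described above.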
As a temporary workaround, you can adjust the MachineHealthCheck (MHC) configuration to be more lenient. For example, you can configure the MHC to delete a node only after it has been in a "NotReady" state for 10 minutes or more. This adjustment should provide enough time for the node to come back online, even when there is a significant time discrepancy:
tanzu cluster machinehealthcheck node set test-cluster --unhealthy-conditions "Ready:False:10m,Ready:Unknown:10m"
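To confirm that the new thresholds are in effect, you can inspect the MachineHealthCheck object from the management cluster context; the object and namespace names below are placeholders:
kubectl get machinehealthcheck -A
kubectl get machinehealthcheck <mhc-name> -n <workload-cluster-namespace> -o yaml
The spec.unhealthyConditions section of the output should reflect the 10-minute timeouts.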
The "SchedulingDisabled" component of this issue arises when a node is marked for deletion but cannot be properly drained. This is typically caused by a PodDisruptionBudget (PDB) that restricts the draining process. This situation is more commonly encountered in clusters where a single control plane and worker node are connected to TMC, often due to the "gatekeeper-controller" PodDisruptionBudget. To resolve this and allow the node to complete its deletion, there are two potential approaches:
- Force delete the pods protected by the PodDisruptionBudget (in this case, the pods in the gatekeeper-system namespace) so that the drain can complete:
kubectl delete pod -n gatekeeper-system --all --force
- Scale out the worker nodes so that the PodDisruptionBudget can still be satisfied while the affected node is drained:
tanzu cluster scale test-cluster --controlplane-machine-count 1 --worker-machine-count 3
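If you are unsure which PodDisruptionBudget is blocking the drain, you can list the budgets on the workload cluster; an entry with an ALLOWED DISRUPTIONS value of 0 is usually the one preventing the node from being drained:
kubectl get pdb -A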
Note: It is not advisable to connect a cluster with a single control plane node and a single worker node to TMC. If an upgrade or machine recreation occurs for reasons other than those outlined in this article, the worker node may again get stuck during the draining process.