Workload cluster deployment fails due to "\"ControlPlaneIsStable\" preflight failed"

Article ID: 378056

Updated On: 09-27-2024

Products

VMware Tanzu Kubernetes Grid Management

Issue/Introduction

In the CAPV/CAPI controller logs, you may see messages similar to:

test-workload-cluster-controlplane-42plf is provisioning (\"ControlPlaneIsStable\" preflight failed). The operation will continue after the preflight check(s) pass" controller="machineset" controllerGroup="cluster.x-k8s.io" controllerKind="MachineSet" MachineSet="default/test-workload-cluster-md-dfjko" namespace="default" name="test-workload-cluster-md-dfjko" reconcileID=dee86c96-fac6-46a0-9bba-x7886c8906 MachineDeployment="default/test-workload-cluster" Cluster="default/test-workload-cluster"

You may also observe that no new machines are provisioned in vSphere and that no Machine objects are created for the new cluster in the management cluster. As a result, the Tanzu CLI reports the cluster as stuck in the "creating" state.
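The symptoms above can be checked from the management cluster context with a short helper. This is only a sketch: the function name `check_preflight_symptom` is hypothetical, and the `capi-system` namespace and `capi-controller-manager` deployment name are assumptions based on a default Cluster API installation, so adjust them if your environment differs.

```shell
#!/usr/bin/env bash
# Sketch: count recent "ControlPlaneIsStable" log lines and Machine objects
# for one cluster. Run from the management cluster context.
# Assumption: the CAPI controller runs as deployment/capi-controller-manager
# in the capi-system namespace.

# check_preflight_symptom CLUSTER_NAME CLUSTER_NAMESPACE
check_preflight_symptom() {
  local cluster="$1" namespace="$2" hits machines
  # Count log lines from the last hour that mention both the preflight
  # check and the cluster name.
  hits="$(kubectl logs -n capi-system deployment/capi-controller-manager --since=1h |
    grep 'ControlPlaneIsStable' | grep -c -- "$cluster")"
  # Count Machine objects labeled for this cluster.
  machines="$(kubectl get machines.cluster.x-k8s.io -n "$namespace" \
    -l "cluster.x-k8s.io/cluster-name=${cluster}" -o name | wc -l | tr -d ' ')"
  echo "preflight_log_lines=${hits}"
  echo "machine_objects=${machines}"
}
```

Calling `check_preflight_symptom CLUSTER-NAME CLUSTER-NAMESPACE` prints both counts. A non-zero log count together with zero Machine objects is consistent with the behavior described in this article.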

Cause

The exact cause of this issue is unclear, but it can occur after a cluster deletion has been initiated. In such cases, some top-level components may be removed successfully while other objects remain, leaving the management cluster in an inconsistent state.

Resolution

To ensure the cluster can be deployed successfully, delete the problematic cluster using the Tanzu CLI and allow sufficient time for the deletion to complete. Afterward, run the following commands from the management cluster context to confirm that all objects associated with the cluster have been fully removed:

kubectl api-resources --verbs=list --namespaced -o name | xargs -n 1 kubectl get --show-kind --ignore-not-found -l "cluster.x-k8s.io/cluster-name=CLUSTER-NAME" -n CLUSTER-NAMESPACE

kubectl api-resources --verbs=list --namespaced -o name | xargs -n 1 kubectl get --show-kind --ignore-not-found -l "tkg.tanzu.vmware.com/cluster-name=CLUSTER-NAME" -n CLUSTER-NAMESPACE
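The two checks above can be combined into one helper. The following is a minimal sketch rather than an official procedure: the function name `find_leftovers` is hypothetical, and it uses `-o name` output with a `while read` loop in place of `xargs` so that objects carrying both labels are listed only once.

```shell
#!/usr/bin/env bash
# Sketch: list leftover namespaced objects for a cluster under both label
# keys used in this article. Run from the management cluster context.

# find_leftovers CLUSTER_NAME CLUSTER_NAMESPACE
find_leftovers() {
  local cluster="$1" namespace="$2" key resource
  for key in cluster.x-k8s.io/cluster-name tkg.tanzu.vmware.com/cluster-name; do
    # Enumerate every namespaced resource type that supports "list".
    kubectl api-resources --verbs=list --namespaced -o name |
      while read -r resource; do
        # Print matching objects as type/name; errors for resource types
        # that cannot be queried are suppressed.
        kubectl get "$resource" -o name --ignore-not-found \
          -l "${key}=${cluster}" -n "$namespace" </dev/null 2>/dev/null
      done
  done | sort -u   # drop duplicates found under both label keys
}
```

Running `find_leftovers CLUSTER-NAME CLUSTER-NAMESPACE` prints one `type/name` line per remaining object. Empty output suggests that no labeled objects are left in that namespace.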

If any objects remain after the cluster deletion, remove them manually with kubectl by running the following command for each remaining object:

kubectl delete <resource> <name> -n <namespace>

Replace <resource>, <name>, and <namespace> with the appropriate values. Once all lingering objects have been cleared, you can proceed with redeploying the cluster. The process should then complete successfully without further issues.

If an object remains in a terminating state after you delete it, you may need to remove its finalizers. Before doing so, confirm that the object belongs to this cluster and not to another one.
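One way to follow this advice is to check the object's cluster-name labels before clearing its finalizers. The sketch below is illustrative only: the function name `remove_finalizers` is hypothetical, and it assumes the object carries at least one of the two labels queried earlier in this article. Removing finalizers skips the controller's own cleanup, so treat it as a last resort.

```shell
#!/usr/bin/env bash
# Sketch: clear finalizers on one object, but only if its labels show that
# it belongs to the expected cluster. Run from the management cluster context.

# remove_finalizers RESOURCE NAME NAMESPACE CLUSTER_NAME
remove_finalizers() {
  local resource="$1" name="$2" namespace="$3" cluster="$4"
  local owner_capi owner_tkg
  # Read both cluster-name labels; dots in label keys are escaped for jsonpath.
  owner_capi="$(kubectl get "$resource" "$name" -n "$namespace" \
    -o jsonpath='{.metadata.labels.cluster\.x-k8s\.io/cluster-name}')"
  owner_tkg="$(kubectl get "$resource" "$name" -n "$namespace" \
    -o jsonpath='{.metadata.labels.tkg\.tanzu\.vmware\.com/cluster-name}')"
  # Refuse to continue unless at least one label names the expected cluster.
  if [ "$owner_capi" != "$cluster" ] && [ "$owner_tkg" != "$cluster" ]; then
    echo "refusing: ${resource}/${name} is not labeled for ${cluster}" >&2
    return 1
  fi
  # A merge patch that sets finalizers to null removes all of them.
  kubectl patch "$resource" "$name" -n "$namespace" --type=merge \
    -p '{"metadata":{"finalizers":null}}'
}
```

For example, `remove_finalizers <resource> <name> <namespace> CLUSTER-NAME` patches the object only when its labels match CLUSTER-NAME, and otherwise returns a non-zero status without changing anything.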
