TMC create cluster only shows "Creating" using NSX Topology with errors "failed to reconcile loadbalanced endpoint for WCPCluster" and "Missing LB source IP for namespace"

Article ID: 386360

Updated On: 01-21-2025

Products

VMware Tanzu Mission Control

Tanzu Mission Control

Tanzu Mission Control Prepaid Commitment Plan per Core

VMware Tanzu Mission Control - SM

VMware Tanzu Mission Control Self-Managed

Issue/Introduction

Scenario:

Unable to complete creation of a new Kubernetes guest cluster using TMC in an individual vSphere Namespace.

When using TMC, the guest cluster object is created on vSphere, but no virtual machines are created for the control plane or worker nodes.

Attempts to create guest clusters in other vSphere Namespaces are successful.

Attempts to create a guest cluster manually with the CLI in the problematic vSphere Namespace result in the same failure.

Environment

vSphere IaaS Control Plane (vSphere with Tanzu)

NSX-T

Cause

Summary:

This occurred because one of the two deployed NSX nodes was in maintenance mode. It is also worth checking for expired NSX certificates and rotating any that are found.
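The certificate check mentioned above can be sketched as a small helper. This is a minimal illustration only: it assumes the certificate inventory has already been exported into a list of dictionaries, and the keys `name` and `not_after` are hypothetical, not an NSX API schema. Adapt them to whatever your export actually contains.

```python
from datetime import datetime, timezone


def expired_certificates(certs, now=None):
    """Return the names of certificates whose expiry is at or before `now`.

    `certs` is a list of dicts with the hypothetical keys `name` (str) and
    `not_after` (timezone-aware datetime). These keys are assumptions made
    for this sketch, not fields taken from NSX.
    """
    now = now or datetime.now(timezone.utc)
    return [c["name"] for c in certs if c["not_after"] <= now]
```

Any name returned by the helper is a candidate for rotation.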

Details:

Errors from: kubectl describe cluster CLUSTER_NAME

Warning ReconcileFailure 17m (x10 over 17m) vmware-system-capw/vmware-system-capw-controller-manager/WCPCluster unexpected error while reconciling control plane endpoint for <VS NAMESPACE>: failed to reconcile loadbalanced endpoint for WCPCluster <VS NAMESPACE>/<VS CLUSTER>: failed to get control plane endpoint for Cluster <VS NAMESPACE>/<VS CLUSTER>: VirtualMachineService LB does not yet have VIP assigned: VirtualMachineService LoadBalancer does not have any Ingresses
Warning ReconcileFailure 59s (x10 over 17m) vmware-system-capw/vmware-system-capw-controller-manager/WCPCluster failed to configure cluster network for WCPCluster <VS NAMESPACE>/<VS CLUSTER>: virtual network ready status is: 'False' in cluster <VS NAMESPACE>/<VS CLUSTER>. reason: NetworkNotRealized, message:
1 wcpmachine_controller.go:315] vmware-system-capw-controller-manager/WCPMachine/<VS NAMESPACE>/<VS CLUSTER>/<VS CLUSTER>-default-nodepool-qzxvj-9jjmf "msg"="Waiting for the control plane to be initialized"

Errors from: vmware-system-capw-controller-manager log

E0114 18:58:45.235498 1 controller.go:317] controller/WCPCluster "msg"="Reconciler error" "error"="failed to configure cluster network for WCPCluster <VS NAMESPACE>/<VS CLUSTER>: virtual network ready status is: 'False' in cluster <VS NAMESPACE>/<VS CLUSTER>. reason: NetworkNotRealized, message: Cannot realize network" "name"="<NETWORK NAME>" "namespace"="<VS NAMESPACE>" "reconciler group"="infrastructure.cluster.vmware.com" "reconciler kind"="WCPCluster"

Errors from: capi-kubeadm-controller log

NOTE: This log explains why no VMs are created. It shows that the initial control plane Machine object never comes up:

I0114 18:24:44.682681 1 scale.go:212] "msg"="Waiting for control plane to pass preflight checks" "cluster-name"="CLUSTER_NAME" "name"="CLUSTER_NAME-control-plane" "namespace"="VS_NAMESPACE" "failures"="[machineCLUSTER_NAME-control-plane-5w92q reports ControllerManagerPodHealthy condition is false (Info, Waiting for startup or readiness probes), machine CLUSTER_NAME-control-plane-5w92q reports SchedulerPodHealthy condition is false (Info, Waiting for startup or readiness probes), machine CLUSTER_NAME-control-plane-8cnzp reports APIServerPodHealthy condition is false (Error, Missing node), machine CLUSTER_NAME-control-plane-8cnzp reports ControllerManagerPodHealthy condition is false (Error, Missing node), machine CLUSTER_NAME-control-plane-8cnzp reports SchedulerPodHealthy condition is false (Error, Missing node), machine CLUSTER_NAME-control-plane-8cnzp reports EtcdPodHealthy condition is false (Error, Missing node), machine CLUSTER_NAME-control-plane-8cnzp does not have EtcdMember
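A preflight message like the one above packs many failures into a single line, which makes it hard to see which machine is reporting what. The following is a minimal sketch that groups the entries by machine. The regular expression is inferred from the wording of this one log excerpt and is an assumption, not a documented log format. The function name is illustrative.

```python
import re
from collections import defaultdict

# Matches entries such as:
#   machine NAME reports EtcdPodHealthy condition is false (Error, Missing node)
# The space after "machine" is optional because the excerpt omits it once.
FAILURE_RE = re.compile(
    r"machine\s*(?P<machine>\S+) reports (?P<condition>\w+) "
    r"condition is false \((?P<severity>\w+), (?P<reason>[^)]*)\)"
)


def group_preflight_failures(log_line):
    """Return {machine: [(condition, severity, reason), ...]} for one log line."""
    grouped = defaultdict(list)
    for m in FAILURE_RE.finditer(log_line):
        grouped[m["machine"]].append((m["condition"], m["severity"], m["reason"]))
    return dict(grouped)
```

Applied to the excerpt, a machine whose entries all read "Error, Missing node" has no backing node, which is consistent with no VM having been created for it.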

Error from: vmware-system-nsx_nsx-ncp log

2025-01-17T16:34:40.883504843Z stderr F [ncp GreenThread-50 W] nsx_ujo.ncp.nsx.policy.firewall_service Missing LB source IP for namespace VSPHERE_NAMESPACE
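To check quickly whether collected logs match this article, the error strings above can be searched for directly. The sketch below is only an illustration: the substrings are copied from the excerpts in this article, while the labels, the `SIGNATURES` table and the `find_signatures` function are names made up for this example.

```python
# Substrings copied from the log excerpts in this article.
# The labels on the left are illustrative names, not product terminology.
SIGNATURES = {
    "capw: no VIP on control plane LB": "VirtualMachineService LB does not yet have VIP assigned",
    "capw: network not realized": "reason: NetworkNotRealized",
    "ncp: missing LB source IP": "Missing LB source IP for namespace",
}


def find_signatures(log_text):
    """Return the sorted labels of every known signature present in log_text."""
    return sorted(label for label, needle in SIGNATURES.items() if needle in log_text)
```

One way to use it is to read each saved log file into a string and pass it to `find_signatures`. If all three labels appear across the collected logs, the symptoms match the ones described here.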

Resolution

Take the affected NSX node out of maintenance mode. If any expired NSX certificates are found, rotate them as well.
