The error "status.LoadbalancerError" can be seen in the Kubernetes Container Clusters UI on the Cluster Information page, in the Events tab:
Event Details
Name: LoadBalancerError
Type: Error
Time: <DATE>
Resource Name: N/A
error: failed to create load balancer for the cluster [<CLUSTER_NAME>(urn:vcloud:entity:vmware:capvcdCluster:<CLUSTER_ID>)]: [unable to create load balancer. err [unable to determine if gateway [<EDGE_NAME>] is using Ip Spaces or not. error [[ENF] entity not found: got 0 Edge Gateways by name '<EDGE_NAME>']]]
The /var/log/cloud-final.error file on the Bootstrap EPHEMERAL-TEMP-VM has repeating errors of the form:
<DATE> <CLUSTER_NAME>@<ORG_NAME>/<USER_NAME>: Error: "<CLUSTER_NAME>-kubeconfig" not found in namespace "<CLUSTER_NAME>-ns": secrets "<CLUSTER_NAME>-kubeconfig" not found
<DATE> <CLUSTER_NAME>@<ORG_NAME>/<USER_NAME>: + [[ -n '' ]]
<DATE> <CLUSTER_NAME>@<ORG_NAME>/<USER_NAME>: + echo 'Waiting for kubeconfig to be available...'
<DATE> <CLUSTER_NAME>@<ORG_NAME>/<USER_NAME>: Waiting for kubeconfig to be available...
<DATE> <CLUSTER_NAME>@<ORG_NAME>/<USER_NAME>: + sleep 5
The capvcd-controller-manager pod in the capvcd-system namespace in the KIND cluster on the Bootstrap EPHEMERAL-TEMP-VM shows errors of the form:
<DATE> ERROR Reconciler error {"controller": "vcdcluster", "controllerGroup": "infrastructure.cluster.x-k8s.io", "controllerKind": "VCDCluster", "VCDCluster": {"name":"<CLUSTER_NAME>","namespace":"<CLUSTER_NAME>-ns"}, "namespace": "<CLUSTER_NAME>-ns", "name": "<CLUSTER_NAME>", "reconcileID": "<RECONCILE_ID>", "error": "failed to create load balancer for the cluster [<CLUSTER_NAME>(urn:vcloud:entity:vmware:capvcdCluster:<CLUSTER_ID>)]: [unable to create load balancer. err [unable to determine if gateway [<EDGE_NAME>] is using Ip Spaces or not. error [[ENF] entity not found: got 0 Edge Gateways by name '<EDGE_NAME>']]]
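To confirm which Edge Gateway name the lookup failed on, the name can be pulled out of the error text. The following is a minimal sketch, assuming the message contains the "got 0 Edge Gateways by name" fragment exactly as shown in the errors above; the function name is hypothetical and is not part of CAPVCD or CSE.

```python
import re
from typing import Optional

# Pattern based only on the error text quoted in this article.
_MISSING_GATEWAY = re.compile(r"got 0 Edge Gateways by name '([^']*)'")


def extract_missing_gateway_name(message: str) -> Optional[str]:
    """Return the gateway name from the first matching fragment, or None."""
    match = _MISSING_GATEWAY.search(message)
    return match.group(1) if match else None
```

The extracted name can then be compared with the Edge Gateway names listed in the Tenant portal.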
The /root/cse.log file on the CSE server shows errors of the form:
{"level":"error","ts":"<DATE>","caller":"app/main.go:660","msg":"error creating cluster [<CLUSTER_NAME>(urn:vcloud:entity:vmware:capvcdCluster:<CLUSTER_ID>)] : [error occurred during the execution of cluster creation script for [<CLUSTER_NAME>(urn:vcloud:entity:vmware:capvcdCluster:<CLUSTER_ID>)]: [error while bootstrapping the machine [<CLUSTER_NAME>/EPHEMERAL-TEMP-VM]; timeout for post customization phase [guestinfo.cloudinit.target.cluster.get.kubeconfig.status]; this usually means that the Avi Load-Balancer is not getting created; check if virtual service for the cluster is created, if not, please check your Avi configuration]]","workerID":"<WORKER_ID>","stacktrace":"main.processRDE\n\t/app/main.go:660"}
{"level":"info","ts":"<DATE>","caller":"repair/heartbeat.go:96","msg":"Received heartbeat message [Waiting for postcustomization status [guestinfo.cloudinit.target.cluster.get.kubeconfig.status] exceeded timeout [600.000000] seconds] for RDE [<CLUSTER_NAME>(urn:vcloud:entity:vmware:capvcdCluster:<CLUSTER_ID>)]","workerID":"<WORKER_ID>"}
{"level":"info","ts":"<DATE>","caller":"utils/rdeUtils.go:917","msg":"Assigning value [error] to key [State] in [types.VCDKEStatus] for cluster [<CLUSTER_NAME>(urn:vcloud:entity:vmware:capvcdCluster:<CLUSTER_ID>)]","workerID":"<WORKER_ID>"}
This issue occurs when the Edge Gateway and Routed Organization VDC Network are scoped to a Data Center Group in VMware Cloud Director (VCD), IP Spaces are in use, and no Control Plane IP (Optional) is provided when creating the TKG cluster.
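The combination of conditions in the cause statement can be written as a simple check. This is an illustrative sketch only: the class and field names are hypothetical, and the values would have to be gathered by hand from the Tenant portal, since no Cloud Director API is called here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClusterRequest:
    # Hypothetical fields describing a planned TKG cluster deployment.
    gateway_scoped_to_dc_group: bool
    network_scoped_to_dc_group: bool
    ip_spaces_in_use: bool
    control_plane_ip: Optional[str] = None  # None or "" means auto-assigned


def matches_known_issue(req: ClusterRequest) -> bool:
    """True only when every condition from the cause statement holds."""
    return (
        req.gateway_scoped_to_dc_group
        and req.network_scoped_to_dc_group
        and req.ip_spaces_in_use
        and not req.control_plane_ip
    )
```

Under this sketch, supplying a Control Plane IP or using a network that is not scoped to a Data Center Group makes the check return False.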
To confirm whether the Edge Gateway and Routed Organization VDC Network are part of a Data Center Group, check the Scope column in the Tenant portal of Cloud Director under Networking > Edge Gateways and Networking > Networks.
To view the Data Center Groups in the Tenant portal of Cloud Director, navigate to Networking > Data Center Groups.
This issue is resolved in the Kubernetes External Cloud Provider for VMware Cloud Director 1.6.1 release.
To configure CSE to use CPI version 1.6.1 when creating Kubernetes clusters, update the CSE server configuration through the Cloud Director Provider portal as described in the documentation, Update Server Configuration.
To work around this issue, deploy the Kubernetes clusters to a Routed Organization VDC network rather than a Data Center Group network.
Alternatively, specify a Control Plane IP when creating a TKG cluster instead of allowing one to be automatically assigned.
To determine which IPs are available, review the IP Space used with the Edge Gateway and Routed Organization VDC Network to which the Kubernetes cluster is connected. In the Tenant portal of Cloud Director, this is under Networking > IP Spaces > IP Space Name > Floating IPs.
For more information on the Control Plane IP, see the CSE documentation on Using VMware Cloud Director Container Service Extension as a Tenant User, Create a Tanzu Kubernetes Grid Cluster.
Please also see the KB article for the related issue when creating load balancer services, Error: "entity not found: got 0 Edge Gateways by name '<EDGE_NAME>'" when creating a load balancer service.