Error: "entity not found: got 0 Edge Gateways by name '<EDGE_NAME>'" when creating a Kubernetes cluster

Article ID: 369778

Products

VMware Cloud Director

Issue/Introduction

  • Attempting to create a Tanzu Kubernetes Grid (TKG) cluster using VMware Cloud Director Container Service Extension (CSE) fails and shows an Error status.
  • A "LoadBalancerError" event appears in the Events tab of the Cluster Information page in the Kubernetes Container Clusters UI:

    Event Details
    Name:   LoadBalancerError
    Type:   Error
    Time:   <DATE>
    Resource Name:  N/A
    error:  failed to create load balancer for the cluster [<CLUSTER_NAME>(urn:vcloud:entity:vmware:capvcdCluster:<CLUSTER_ID>)]: [unable to create load balancer. err [unable to determine if gateway [<EDGE_NAME>] is using Ip Spaces or not. error [[ENF] entity not found: got 0 Edge Gateways by name '<EDGE_NAME>']]]

  • The /var/log/cloud-final.error file on the Bootstrap EPHEMERAL-TEMP-VM has repeating errors of the form:

    <DATE> <CLUSTER_NAME>@<ORG_NAME>/<USER_NAME>: Error: "<CLUSTER_NAME>-kubeconfig" not found in namespace "<CLUSTER_NAME>-ns": secrets "<CLUSTER_NAME>-kubeconfig" not found
    <DATE> <CLUSTER_NAME>@<ORG_NAME>/<USER_NAME>: + [[ -n '' ]]
    <DATE> <CLUSTER_NAME>@<ORG_NAME>/<USER_NAME>: + echo 'Waiting for kubeconfig to be available...'
    <DATE> <CLUSTER_NAME>@<ORG_NAME>/<USER_NAME>: Waiting for kubeconfig to be available...
    <DATE> <CLUSTER_NAME>@<ORG_NAME>/<USER_NAME>: + sleep 5

  • The capvcd-controller-manager pod in the capvcd-system namespace in the KIND cluster on the Bootstrap EPHEMERAL-TEMP-VM shows errors of the form:

    <DATE>        ERROR   Reconciler error        {"controller": "vcdcluster", "controllerGroup": "infrastructure.cluster.x-k8s.io", "controllerKind": "VCDCluster", "VCDCluster": {"name":"<CLUSTER_NAME>","namespace":"<CLUSTER_NAME>-ns"}, "namespace": "<CLUSTER_NAME>-ns", "name": "<CLUSTER_NAME>", "reconcileID": "<RECONCILE_ID>", "error": "failed to create load balancer for the cluster [<CLUSTER_NAME>(urn:vcloud:entity:vmware:capvcdCluster:<CLUSTER_ID>)]: [unable to create load balancer. err [unable to determine if gateway [<EDGE_NAME>] is using Ip Spaces or not. error [[ENF] entity not found: got 0 Edge Gateways by name '<EDGE_NAME>']]]

  • The /root/cse.log file on the CSE server shows errors of the form:

    {"level":"error","ts":"<DATE>","caller":"app/main.go:660","msg":"error creating cluster [<CLUSTER_NAME>(urn:vcloud:entity:vmware:capvcdCluster:<CLUSTER_ID>)] : [error occurred during the execution of cluster creation script for [<CLUSTER_NAME>(urn:vcloud:entity:vmware:capvcdCluster:<CLUSTER_ID>)]: [error while bootstrapping the machine [<CLUSTER_NAME>/EPHEMERAL-TEMP-VM]; timeout for post customization phase [guestinfo.cloudinit.target.cluster.get.kubeconfig.status]; this usually means that the Avi Load-Balancer is not getting created; check if virtual service for the cluster is created, if not, please check your Avi configuration]]","workerID":"<WORKER_ID>","stacktrace":"main.processRDE\n\t/app/main.go:660"}
    {"level":"info","ts":"<DATE>","caller":"repair/heartbeat.go:96","msg":"Received heartbeat message [Waiting for postcustomization status [guestinfo.cloudinit.target.cluster.get.kubeconfig.status] exceeded timeout [600.000000] seconds] for RDE [<CLUSTER_NAME>(urn:vcloud:entity:vmware:capvcdCluster:<CLUSTER_ID>)]","workerID":"<WORKER_ID>"}
    {"level":"info","ts":"<DATE>","caller":"utils/rdeUtils.go:917","msg":"Assigning value [error] to key [State] in [types.VCDKEStatus] for cluster [<CLUSTER_NAME>(urn:vcloud:entity:vmware:capvcdCluster:<CLUSTER_ID>)]","workerID":"<WORKER_ID>"}

  • IP Spaces are used with the Edge Gateway and Routed Organization VDC Network to which the Kubernetes cluster is connected.
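The signatures above can also be checked with a script. The following is a minimal sketch, assuming the log lines have the shape shown in this article. The helper names edge_gateway_from_error and kubeconfig_timeouts are hypothetical and are not part of CSE or CAPVCD.

```python
import json
import re

# Matches the "[ENF] entity not found" fragment shown in the event and pod log above.
ENF_PATTERN = re.compile(r"entity not found: got 0 Edge Gateways by name '([^']*)'")

# Post-customization phase named in the cse.log timeout message above.
KUBECONFIG_PHASE = "guestinfo.cloudinit.target.cluster.get.kubeconfig.status"


def edge_gateway_from_error(text):
    """Return the Edge Gateway name from the ENF error text, or None if absent."""
    match = ENF_PATTERN.search(text)
    return match.group(1) if match else None


def kubeconfig_timeouts(log_lines):
    """Yield the 'msg' of error-level JSON log lines that mention the kubeconfig phase."""
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict):
            continue  # skip JSON values that are not objects
        if entry.get("level") == "error" and KUBECONFIG_PHASE in entry.get("msg", ""):
            yield entry["msg"]
```

For example, passing an open file handle for a copy of /root/cse.log to kubeconfig_timeouts lists the timeout messages, and passing the event error text to edge_gateway_from_error returns the Edge Gateway name that could not be found.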

Environment

  • VMware Cloud Director Container Service Extension 4.2.1
  • VMware Cloud Director 10.5.1.1
  • Kubernetes External Cloud Provider for VMware Cloud Director (CPI) 1.6

Cause

This issue occurs when all of the following are true: the Edge Gateway and Routed Organization VDC Network are scoped to a Data Center Group in VMware Cloud Director (VCD), IP Spaces are in use, and no Control Plane IP (Optional) is provided when creating the TKG cluster.

To confirm whether the Edge Gateway and Routed Organization VDC Network are part of a Data Center Group, check the Scope column in the Cloud Director Tenant portal under Networking > Edge Gateways and Networking > Networks.

To view the Data Center Groups, navigate to Networking > Data Center Groups in the Cloud Director Tenant portal.
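The Scope column in the Tenant portal is the check described in this article. Where owner IDs for Edge Gateways or networks have already been exported from Cloud Director by other means, the same check could be scripted along these lines. This is a sketch only: the function name scope_kind is hypothetical, and the URN types "vdcGroup" and "vdc" are assumptions about the ID format that should be verified against the IDs in your own environment.

```python
def scope_kind(owner_id):
    """Classify an owner URN as a Data Center Group or an Organization VDC.

    Assumption: owner IDs look like urn:vcloud:<type>:<uuid>, where <type> is
    'vdcGroup' for a Data Center Group and 'vdc' for an Organization VDC.
    """
    parts = owner_id.split(":")
    if len(parts) < 4 or parts[0] != "urn" or parts[1] != "vcloud":
        return "unknown"
    entity_type = parts[2]
    if entity_type == "vdcGroup":
        return "Data Center Group"
    if entity_type == "vdc":
        return "Organization VDC"
    return "unknown"
```

A result of "Data Center Group" for both the Edge Gateway and the network matches the first condition listed in the Cause section.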

Resolution

This issue is resolved in the Kubernetes External Cloud Provider for VMware Cloud Director 1.6.1 release.

To configure CSE to use CPI 1.6.1 when creating Kubernetes clusters, update the CSE server configuration through the Cloud Director Provider portal as described in the documentation, Update Server Configuration.
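When auditing several environments, a quick comparison can show whether a configured CPI version already includes the fix. The following is a minimal sketch, assuming plain numeric version strings such as "1.6" or "v1.6.1". The names parse_version and has_cpi_fix are hypothetical, and pre-release suffixes are not handled.

```python
# Release named in the Resolution section as containing the fix.
FIXED_CPI_VERSION = (1, 6, 1)


def parse_version(version):
    """Turn '1.6' or 'v1.6.1' into a 3-tuple of ints, padding missing parts with 0."""
    parts = version.strip().lstrip("v").split(".")
    numbers = [int(part) for part in parts[:3]]
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def has_cpi_fix(version):
    """Return True if the given CPI version is 1.6.1 or later."""
    return parse_version(version) >= FIXED_CPI_VERSION
```

For the environment listed in this article, has_cpi_fix("1.6") returns False, while has_cpi_fix("1.6.1") returns True.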

Workaround

To work around this issue, deploy the Kubernetes clusters to a Routed Organization VDC Network that is not scoped to a Data Center Group.

Alternatively, specify a Control Plane IP when creating the TKG cluster instead of allowing one to be assigned automatically.

To determine which IPs are available, review the IP Space used with the Edge Gateway and Routed Organization VDC Network to which the Kubernetes cluster is connected. In the Cloud Director Tenant portal, navigate to Networking > IP Spaces > IP Space Name > Floating IPs.

Additional Information

For more information on the Control Plane IP, see the CSE documentation on Using VMware Cloud Director Container Service Extension as a Tenant User, Create a Tanzu Kubernetes Grid Cluster.

See also the KB article for the related issue when creating load balancer services, Error: "entity not found: got 0 Edge Gateways by name '<EDGE_NAME>'" when creating a load balancer service.