Cannot delete Kubernetes Container cluster within VMware Cloud Director Container Service Extension

Article ID: 325672

Products

VMware Cloud Director

Issue/Introduction

  • Deleting a TKG cluster in Cloud Director fails with the following error, and the cluster is never removed:

error deleting resources by rdeId: [urn:vcloud:entity:vmware:capvcdCluster:########-####-####-####-########110a], [error deleting resources by rde ID: [urn:vcloud:entity:vmware:capvcdCluster:########-####-####-####-########110a] after [10] retry attempts: [error occurred deleting L4 loadbalancer for rde [<name of virtual service>(urn:vcloud:entity:vmware:capvcdCluster:########-####-####-####-########110a)]: [virtual service [<name of virtual service>-urn:vcloud:entity:vmware:capvcdCluster:########-####-####-####-########110a-tcp] is busy]. [0] remaining retry attempts]]

  • The following entries appear in cse.log:
E0111 12:17:21.341433   17639 gateway.go:1049] Virtual service [<cluster_name>-urn:vcloud:entity:vmware:capvcdCluster:########-####-####-####-########110a-tcp] is still being configured. Virtual service status: [REALIZATION_FAILED]
E0111 12:17:21.341476   17639 gateway.go:1788] delete virtual service failed; virtual service [<cluster_name>-urn:vcloud:entity:vmware:capvcdCluster:########-####-####-####-########110a-tcp] is busy: [virtual service [<cluster_name>-urn:vcloud:entity:vmware:capvcdCluster:########-####-####-####-########110a-tcp] is busy]
{"level":"error","ts":"2024-01-11T12:17:21.341Z","caller":"cluster/clusterManager.go:821","msg":"error occurred deleting L4 loadbalancer for rde [<cluster_name>(urn:vcloud:entity:vmware:capvcdCluster:########-####-####-####-########110a)]: [virtual service [<cluster_name>-urn:vcloud:entity:vmware:capvcdCluster:########-####-####-####-########110a-tcp] is busy]. [33] remaining retry attempts","workerID":"########-####-####-####-########234f","stacktrace":"########/core-build/vcd-k8s-provider/src/cluster.DeleteResourcesByRDEId\n\t/app/src/cluster/clusterManager.go:821\n########/core-build/vcd-k8s-provider/src/cluster.DeleteWithoutScript\n\t/app/src/cluster/clusterManager.go:1110\ngitlab.eng.vmware.com/core-build/vcd-k8s-provider/src/cluster.DeleteCluster\n\t/app/src/cluster/clusterManager.go:429\nmain.processRDE\n\t/app/main.go:691"}
  • The Kubernetes cluster cannot be deleted.
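
When checking whether a given cse.log shows this symptom, it can help to pull out the affected cluster IDs and the reported virtual service status. The following is a minimal sketch, not part of CSE. It assumes the log lines follow the shape of the sample entries above; the function name `summarize_cse_log` and both regular expressions are illustrative. The ID pattern also accepts `#` so that it matches the masked samples shown in this article.

```python
import re

# One character of a cluster ID: a hex digit, or '#' as used in the masked samples.
_ID_CHAR = "[0-9a-fA-F#]"

# Cluster entity URN followed by an ID in 8-4-4-4-12 layout.
# Bounding the ID length keeps the trailing "-tcp" of the virtual service name out.
URN_RE = re.compile(
    "urn:vcloud:entity:vmware:capvcdCluster:"
    + "-".join(_ID_CHAR + "{%d}" % n for n in (8, 4, 4, 4, 12))
)

# Status reported in lines such as "Virtual service status: [REALIZATION_FAILED]".
STATUS_RE = re.compile(r"Virtual service status: \[([A-Z_]+)\]")


def summarize_cse_log(lines):
    """Return (sorted cluster URNs, sorted virtual service statuses) found in lines."""
    urns, statuses = set(), set()
    for line in lines:
        urns.update(URN_RE.findall(line))
        statuses.update(STATUS_RE.findall(line))
    return sorted(urns), sorted(statuses)


if __name__ == "__main__":
    # Assumes cse.log is in the current directory; adjust the path as needed.
    with open("cse.log", encoding="utf-8", errors="replace") as handle:
        found_urns, found_statuses = summarize_cse_log(handle)
    print("Cluster IDs:", found_urns)
    print("Virtual service statuses:", found_statuses)
```

A status of REALIZATION_FAILED together with "is busy" delete failures for the same cluster ID matches the entries shown above.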

Environment

VMware Cloud Director 10.x
VMware Container Service Extension 4.x

Cause

This is a known issue caused by the network configuration used with the failed cluster deployments. If NSX-T is not used, VMware Container Service Extension (CSE) cannot find the edge gateway reference it needs to delete the L4 load balancer components, which results in the error.

Resolution

To work around the issue, carry out the following steps:

  1. Delete any remaining vApps associated with the failed clusters. They should have the same name as the cluster.
  2. Confirm that no load balancer components (virtual service, LB pool members) remain for the cluster being deleted. These components are prefixed with the cluster name.
  3. In the interface, click the cluster "Overview" tab and select "vApp Details" to locate the cluster ID. Example format: urn:vcloud:entity:vmware:capvcdCluster:#####-#####-####-####-####



  4. For each cluster, execute the following API requests in the order shown, using the cluster ID noted in step 3, to remove the cluster from the UI. The cluster ID used in the API requests is the full string, in the format urn:vcloud:entity:vmware:capvcdCluster:#####-#####-####-####-####

a. POST {{vcd-ip-or-fqdn}}/cloudapi/1.0.0/entities/{{clusterId}}/resolve
b. DELETE {{vcd-ip-or-fqdn}}/cloudapi/1.0.0/entities/{{clusterId}}

Note: If these API requests return an "Access to resource is forbidden" response, ensure that the "Auto Repair on Errors" and "Node Health Check" settings are disabled. To check, click the "SETTINGS" button on the cluster (visible in the screenshot above in step 3).
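
For several clusters, the two requests in step 4 could be scripted rather than issued by hand. The following is a minimal sketch using only the Python standard library. The function names are illustrative. The bearer-token `Authorization` header, the `Accept: application/json;version=...` header, and the `https://` scheme in the base URL are assumptions about how the Cloud Director API is called in your environment; obtain the token and the supported API version from your own installation.

```python
import urllib.request
from urllib.parse import quote


def build_delete_requests(base_url, cluster_id, token, api_version):
    """Build the resolve (POST) and delete (DELETE) requests, in that order."""
    # The cluster ID is the full URN from step 3; colons are kept as-is in the path.
    entity_url = (
        f"{base_url.rstrip('/')}/cloudapi/1.0.0/entities/"
        f"{quote(cluster_id, safe=':')}"
    )
    headers = {
        # Assumption: API version is negotiated through the Accept header.
        "Accept": f"application/json;version={api_version}",
        # Assumption: an existing session token is passed as a bearer token.
        "Authorization": f"Bearer {token}",
    }
    return [
        urllib.request.Request(f"{entity_url}/resolve", method="POST", headers=headers),
        urllib.request.Request(entity_url, method="DELETE", headers=headers),
    ]


def remove_cluster_entity(base_url, cluster_id, token, api_version):
    """Send both requests in order; urlopen raises HTTPError on a 4xx or 5xx response."""
    for request in build_delete_requests(base_url, cluster_id, token, api_version):
        with urllib.request.urlopen(request) as response:
            print(request.get_method(), request.full_url, response.status)
```

Because `urlopen` raises on an error response, a failed resolve stops the loop before the DELETE is sent, which preserves the required order. An "Access to resource is forbidden" response would surface here as an `HTTPError` with status 403.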


Example API request from the API Explorer (embedded in the Cloud Director UI).