Namespace-specific services on vSphere Supervisor are different compared to the Guest Cluster

Article ID: 417915

Products

VMware vSphere Kubernetes Service

Issue/Introduction

  • Under some specific circumstances, services of type LoadBalancer in individual namespaces get out of sync between the vSphere Supervisor perspective and the Guest Cluster (TKC) perspective.
  • Comparing the service count on the Supervisor with the count on the Guest Cluster itself shows the drift:
    root@Supervisor [ ~ ]# k get services -n <vsphere-namespace> --no-headers | grep -E "tkc-.*LoadBalancer" | grep -v "-control-plane-service" | wc -l
    3

    root@tkc-#####-#####:~# k get services -A --no-headers | grep LoadBalancer | wc -l
    2
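The grep-based counts above can also be computed from the JSON output, which is less fragile than parsing column text. A minimal sketch, assuming jq is available on the host running kubectl (the helper name count_lb is hypothetical):

```shell
# Count LoadBalancer services, excluding the Supervisor-only
# control-plane service, from `kubectl get services ... -o json` output.
count_lb() {
  jq '[.items[]
       | select(.spec.type == "LoadBalancer")
       | select(.metadata.name | endswith("-control-plane-service") | not)]
      | length'
}

# Usage (run against each perspective and compare the numbers):
# kubectl get services -n <vsphere-namespace> -o json | count_lb   # Supervisor
# kubectl get services -A -o json | count_lb                       # Guest Cluster
```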

Environment

VMware vSphere with Kubernetes

Cause

A cause observed in the past is that the required finalizers were removed from Service resources within the Guest Cluster. Without those finalizers, the vSphere Supervisor is never informed about the service removal, so its corresponding resources remain behind.
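Whether finalizers have gone missing can be spot-checked from inside the Guest Cluster by listing LoadBalancer services that lack the standard cleanup finalizer. This is only a sketch, under the assumptions that jq is available and that the cluster uses the upstream service.kubernetes.io/load-balancer-cleanup finalizer (verify the exact finalizer name in your environment; the helper name find_unfinalized is hypothetical):

```shell
# Print "namespace/name" for LoadBalancer services whose finalizer list
# lacks the load-balancer cleanup entry (run with kubectl pointed at the TKC).
find_unfinalized() {
  jq -r '.items[]
         | select(.spec.type == "LoadBalancer")
         | select((.metadata.finalizers // [])
                  | index("service.kubernetes.io/load-balancer-cleanup") | not)
         | "\(.metadata.namespace)/\(.metadata.name)"'
}

# Usage:
# kubectl get services -A -o json | find_unfinalized
```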

Resolution

To get them back in sync, the differences must first be identified. Then, if the Supervisor has more services than the Guest Cluster, the leftovers need to be deleted from the vSphere Supervisor.

Note: If there are any further questions or concerns, please consult with Broadcom Support prior to deleting the virtualmachineservice resource.

Steps:

  1. First, get all services from a specific namespace from the perspective of the Supervisor:
    root@SV [ ~ ]# kubectl get services -n <vsphere-namespace> --no-headers | grep -E "tkc-.*LoadBalancer"
    tkc-0###################a   LoadBalancer   ##.##.##.10   ##.##.##.200   1337/TCP    137m
    tkc-1###################b   LoadBalancer   ##.##.##.11   ##.##.##.210   1337/TCP    128d
    tkc-2###################c   LoadBalancer   ##.##.##.12    <pending>      1337/TCP    137m
    tkc-control-plane-service   LoadBalancer   ##.##.##.1     ##.##.##.220   6443/TCP    135d  --> OK and expected

    root@SV [ ~ ]# kubectl get services -n <vsphere-namespace> --no-headers | grep -E "tkc-.*LoadBalancer" | grep -v "control-plane-service" | wc -l
    3
    From this Supervisor perspective, there are 4 services of type LoadBalancer. The service "tkc-control-plane-service" exists solely on the Supervisor and must not be deleted, which leaves 3 relevant services in total.

  2. Now, connect to the Guest Cluster (via SSH or kubectl) and list the configured services from its perspective:
    root@tkc-#####-#####:~# kubectl get services -A --no-headers | grep LoadBalancer
    test-namespace    app-1    LoadBalancer   ##.##.##.12  <pending>    1337:8080/TCP   137m
    test-namespace    app-2    LoadBalancer   ##.##.##.10   ##.#.##.200 1337:8080/TCP   137m

    root@tkc-#####-#####:~# kubectl get services -A --no-headers | grep LoadBalancer | wc -l
    2

    From this Guest Cluster perspective, there are 2 services in total.

  3. In this example, the Supervisor has 3 user-created services while the Guest Cluster has 2. In other words, the Supervisor holds 1 possible leftover service.
  4. The next step is identifying that leftover service based on the IP address and port definitions:
    1. Comparing the relevant services in the outputs above, the service with IP ##.##.##.200 and port 1337/tcp is present from both perspectives. Its ID is tkc-0###################a. This one is OK.
    2. However, the service with IP ##.##.##.210 and the same port 1337/tcp is present only on the Supervisor and not on the Guest Cluster. Its ID is tkc-1###################b; hence it is most likely a leftover.
  5. As service tkc-1###################b has been identified as the likely leftover, double-check that it is no longer present on the Guest Cluster and no longer in use.
  6. Verify that the above service is indeed absent from the respective Guest Cluster by using the labels and spec of its virtualmachineservice resource, which record the original service name, namespace, ports, and load-balancer IP:
    root@SV [ ~ ]# kubectl describe virtualmachineservice -n test-namespace tkc-1###################b
    Name:         tkc-1###################b
    Namespace:    test-namespace
    Labels:       run.tanzu.vmware.com/cluster.name=tkc
                run.tanzu.vmware.com/service.name=web-app-http
                run.tanzu.vmware.com/service.namespace=webapp
    [...]
    Spec:
    Ports:
      Name:         http
      Port:         1337
      Protocol:     TCP
      Target Port:  8080
    [...]
    Status:
    Load Balancer:
      Ingress:
        Ip:  ##.##.##.210
  7. Once the above has been confirmed, the leftover can be deleted from the Supervisor by running:
    kubectl delete virtualmachineservice -n <vsphere-namespace> tkc-1###################b
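The manual IP/port comparison in steps 3 and 4 can also be sketched as a small diff. The helper below and the file names are hypothetical, and the sketch assumes that the external IP plus the service port (the number before ":" or "/") is a stable identifier across both perspectives:

```shell
#!/bin/bash
# Print "IP PORT" pairs present only in the first file, i.e. candidates
# that exist on the Supervisor but not on the Guest Cluster.
leftovers() {
  comm -23 <(sort "$1") <(sort "$2")
}

# Usage: collect the "EXTERNAL-IP PORT" pairs from each perspective first, e.g.
#   kubectl get services -n <vsphere-namespace> --no-headers \
#     | grep -E "tkc-.*LoadBalancer" | grep -v "control-plane-service" \
#     | awk '{split($5,p,"[:/]"); print $4, p[1]}' > supervisor.txt    # Supervisor
#   kubectl get services -A --no-headers | grep LoadBalancer \
#     | awk '{split($6,p,"[:/]"); print $5, p[1]}' > guest.txt         # Guest Cluster
#   leftovers supervisor.txt guest.txt
```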