How to recover supervisor and guest cluster control plane LB IP hosted on Avi Load balancer



Article ID: 390201


Updated On:

Products

VMware Avi Load Balancer

Issue/Introduction

User is unable to connect to the control-plane service and receives the error: "Error occurred during HTTP request: Get "https://<redacted>/wcp/loginbanner": tls: failed to verify certificate: x509: certificate is valid for <redacted>, not <redacted>"

root@worker1:~# kubectl vsphere login --server=https://<redacted>
ERRO[2025-04-08 14:11:41.262] Error occurred during HTTP request: Get "https://<redacted>/wcp/loginbanner": tls: failed to verify certificate: x509: certificate is valid for <redacted>, not <redacted>
There was an error when trying to connect to the server.
Please check the server URL and try again.

 

The IP addresses of the Supervisor Cluster and Guest Cluster control-plane services have changed.

Environment

  • vSphere with Tanzu

Cause

There are a couple of scenarios that can lead to a change in the IP addresses of the control-plane services.

  •  If the kube-apiserver-lb-svc service is impacted, AKO may delete the corresponding configuration on Avi. When the service later comes back up healthy, AKO is notified and recreates the services on Avi. Because all the virtual services hosting the control-plane services are deleted and recreated by AKO, their IPs can change.
  •  In an SDDC-managed infrastructure, if multiple vCenter Servers share a single NSX cloud and a new cluster is created in a new workload domain, existing virtual services on Avi that belong to another cluster can be deleted. This happens because the auto-generated cluster name prefix (domain-<>) can end up being the same for two different clusters that are part of different workload domains. This is currently an unsupported configuration; the newly created cluster must be deleted, followed by an AKO restart on the Supervisor, to recover the virtual services for the old cluster.
    Reference document:

https://techdocs.broadcom.com/us/en/vmware-cis/vsphere/vsphere-supervisor/8-0/vsphere-supervisor-concepts-and-planning/supervisor-architecture-and-components/supervisor-networking.html 

 

 

Resolution

To restore the IPs of the Supervisor and guest cluster control-plane services, the user must have a mapping of the old IP for each control-plane service.

Steps to restore the IPs in both VDS and NSX WCP deployments -

1. Run the below command on the Supervisor cluster to get the currently assigned IPs for the control-plane services.

kubectl get service -A | awk '/LoadBalancer/ {print $1,$2,$5}' > svcs
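To illustrate the format of the resulting "svcs" file, the snippet below runs the same awk filter against a sample of `kubectl get service -A` output (the namespace, service name, and IPs are hypothetical; your output will differ):

```shell
# Offline illustration of what the awk filter produces. Column 1 is the
# namespace, column 2 the service name, column 5 the EXTERNAL-IP.
sample='NAMESPACE   NAME                    TYPE           CLUSTER-IP   EXTERNAL-IP    PORT(S)
ns1         kube-apiserver-lb-svc   LoadBalancer   10.0.0.1     192.168.10.5   443/TCP
ns2         some-app                ClusterIP      10.0.0.2     <none>         80/TCP'
printf '%s\n' "$sample" | awk '/LoadBalancer/ {print $1,$2,$5}'
```

Each line of "svcs" therefore holds a `<namespace> <service> <ip>` triplet, which the restore script in step 6 consumes.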

2. Edit the above-created "svcs" file, replacing each current IP with the correct (old) IP from the user-provided mapping.
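If there are many services, the manual edit can be scripted. The sketch below (file names, service name, and IPs are hypothetical) applies a "current IP → old IP" mapping file to the "svcs" file:

```shell
# Optional helper: apply an IP mapping to the svcs file instead of
# editing it by hand. mapping.txt pairs each current IP with the
# desired old IP, one pair per line (all values here are illustrative).
cat > mapping.txt <<'EOF'
192.168.10.5 192.168.10.2
EOF
cat > svcs <<'EOF'
ns1 kube-apiserver-lb-svc 192.168.10.5
EOF
while read -r cur old; do
  # Replace the trailing IP field only, so names are never touched.
  sed -i "s/ $cur\$/ $old/" svcs
done < mapping.txt
cat svcs
```

Always review the final "svcs" file before proceeding, since step 6 writes these IPs back into the Gateway objects.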

3. Run the below command to identify the number of replicas for the net-operator deployment:

kubectl get deployments -n vmware-system-netop -o yaml

Check and note "spec.replicas" in the output of the above command.
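The replica count can also be read directly with a jsonpath query instead of scanning the full YAML; the offline demo below shows the equivalent extraction on a sample YAML fragment (the replica value of 2 is illustrative):

```shell
# Live query (requires cluster access):
#   kubectl get deployment vmware-system-netop-controller-manager \
#     -n vmware-system-netop -o jsonpath='{.spec.replicas}'
# Offline demo of pulling spec.replicas out of a YAML fragment:
yaml='spec:
  replicas: 2
  selector: {}'
replicas=$(printf '%s\n' "$yaml" | awk '/^  replicas:/ {print $2}')
echo "$replicas"
```

Note this value; it is needed in step 8 to scale the deployment back up.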

4. Run the below command to scale the net-operator deployment to zero and remove the current net-operator pods:

kubectl patch deployment vmware-system-netop-controller-manager -n vmware-system-netop --subresource='scale' --type='merge' -p '{"spec":{"replicas":0}}'

5. In the Avi UI, delete the Virtual Services (VS) and the referenced objects created by AKO. Select all the VSs that need to be deleted; a confirmation pop-up will be shown.

6. Run the below script as "./<script.sh> svcs". (Make sure that the script has execute permissions.)

#!/usr/bin/env bash
# Usage: ./<script.sh> svcs
# Reads "<namespace> <service> <ip>" triplets from the input file and
# replaces each Gateway's empty address list with the desired static IP.
filename="$1"
while IFS=' ' read -r f1 f2 f3; do
  # f1 = namespace, f2 = gateway name, f3 = IP to restore
  kubectl get gateway "$f2" -n "$f1" -oyaml \
    | sed 's/addresses: \[\]/addresses:\n    - type: IPAddress\n      value: '"$f3"'/1' \
    | kubectl replace -f -
done < "$filename"
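To see what the script's sed substitution does before running it against live Gateway objects, the snippet below applies it to a sample Gateway spec fragment (the IP is hypothetical; GNU sed is assumed, since `\n` in the replacement text is a GNU extension):

```shell
# Offline demo of the substitution: the empty "addresses: []" list in
# the Gateway spec is replaced with a static IPAddress entry.
gw='spec:
  addresses: []'
ip='192.168.10.2'
out=$(printf '%s\n' "$gw" \
  | sed 's/addresses: \[\]/addresses:\n    - type: IPAddress\n      value: '"$ip"'/1')
printf '%s\n' "$out"
```

The resulting YAML pins the Gateway to the supplied address, which is what prompts the restored IP to be programmed on Avi.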

7. Restart AKO.

8. Run the below command to scale the net-operator deployment back to the replica count noted in step 3:

kubectl patch deployment vmware-system-netop-controller-manager -n vmware-system-netop --subresource='scale' --type='merge' -p '{"spec":{"replicas":<Count from Step 3>}}'