TKG upgrade failed with "unable to retrieve the complete list of server APIs"

Article ID: 313118

Products

VMware

Issue/Introduction

Symptoms:
The TKG Management cluster upgrade fails with the following error:
action failed after 9 attempts: unable to retrieve the complete list of server APIs: controlplane.antrea.tanzu.vmware.com/v1beta1


Environment

VMware Tanzu Kubernetes Grid Plus 1.x

Cause

The upgrade cannot complete API discovery because APIService v1beta1.controlplane.antrea.tanzu.vmware.com is unavailable. Check its status; it will most likely be in the FailedDiscoveryCheck state:
kubectl get apiservices
v1beta1.controlplane.antrea.tanzu.vmware.com           kube-system/antrea                          False (FailedDiscoveryCheck)
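
On a cluster with many APIServices, the unavailable ones can be filtered out of the listing. The following is a minimal sketch, assuming the default kubectl get apiservices column layout (NAME, SERVICE, AVAILABLE, AGE). The helper name unavailable_apiservices is hypothetical and is not part of kubectl.

```shell
# Hypothetical helper: reads `kubectl get apiservices` output on stdin and
# prints the NAME of every APIService whose AVAILABLE column is not "True".
# Assumes the default column order: NAME SERVICE AVAILABLE AGE.
unavailable_apiservices() {
  awk 'NF >= 3 && $1 != "NAME" && $3 != "True" { print $1 }'
}

# Usage (requires access to the cluster):
#   kubectl get apiservices | unavailable_apiservices
```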

Resolution

Confirm that APIService v1beta1.controlplane.antrea.tanzu.vmware.com is no longer used in the relevant TKG version.
In this example, the TKG 1.6 release notes confirm that it uses Antrea 1.5.3, and the Antrea 1.5.3 antrea.yaml file confirms that it supports v1beta2.controlplane.antrea.tanzu.vmware.com, not v1beta1.
It is therefore safe to delete APIService v1beta1.controlplane.antrea.tanzu.vmware.com. Before deleting it, however, remove the APIService from the Antrea CRS secret.
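
The manifest check described above can be sketched as a small helper. This assumes a local copy of the Antrea manifest for the relevant release; the function name list_antrea_controlplane_apiservices is hypothetical, and the pattern only looks for names of the form <version>.controlplane.antrea.tanzu.vmware.com.

```shell
# Hypothetical helper: prints the distinct controlplane APIService names
# (for example v1beta2.controlplane.antrea.tanzu.vmware.com) that appear
# in the given manifest file.
list_antrea_controlplane_apiservices() {
  grep -oE 'v[0-9]+[a-z0-9]*\.controlplane\.antrea\.tanzu\.vmware\.com' "$1" | sort -u
}

# Usage, assuming the manifest was saved locally as antrea.yaml:
#   list_antrea_controlplane_apiservices antrea.yaml
```

If v1beta1 does not appear in the output for the target Antrea version, the manifest no longer registers it.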

Export the Antrea CRS secret to antrea-crs.yaml, then edit the file and remove the definition of APIService v1beta1.controlplane.antrea.tanzu.vmware.com:
kubectl get secret -n tkg-system <Mgmt Cluster Name>-antrea-crs -o jsonpath='{.data.value}' | base64 -d > antrea-crs.yaml
vi antrea-crs.yaml
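
As an alternative to the manual edit in vi, the removal can be scripted. The sketch below assumes that antrea-crs.yaml is a multi-document YAML stream in which the APIService is its own document, separated by --- lines; that layout is an assumption and should be checked against the exported file. The helper name drop_apiservice_doc is hypothetical.

```shell
# Hypothetical helper: copies a multi-document YAML stream from stdin to
# stdout, omitting the document whose top-level kind is APIService and
# which contains a "name:" field equal to the given target.
drop_apiservice_doc() {
  awk -v target="$1" '
    # Print the buffered document unless it is the targeted APIService.
    function emit_doc() {
      if (!(is_apiservice && is_target)) printf "%s", buf
      buf = ""; is_apiservice = 0; is_target = 0
    }
    /^---[ \t]*$/ { emit_doc(); buf = $0 "\n"; next }   # a new document starts
    { buf = buf $0 "\n" }
    /^kind:[ \t]*APIService[ \t]*$/ { is_apiservice = 1 }
    $1 == "name:" && $2 == target { is_target = 1 }
    END { emit_doc() }
  '
}

# Usage: write to a new file and review the difference before using it.
#   drop_apiservice_doc v1beta1.controlplane.antrea.tanzu.vmware.com \
#     < antrea-crs.yaml > antrea-crs.edited.yaml
#   diff antrea-crs.yaml antrea-crs.edited.yaml
```

Writing to a separate file keeps the original export intact, so the result can be compared with diff before it replaces antrea-crs.yaml.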

Base64-encode the updated antrea-crs.yaml file, then update .data.value in the Antrea CRS secret with the encoded data:
cat antrea-crs.yaml | base64 -w 0
kubectl edit secret -n tkg-system <Mgmt Cluster Name>-antrea-crs 
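
Pasting a long base64 string into an editor is error-prone, so the update can also be applied as a patch. The following is a sketch under assumptions, not a documented TKG procedure: it assumes GNU base64 and a kubectl version that supports the --patch-file option of kubectl patch. The helper name build_crs_secret_patch and the file name antrea-crs-patch.json are hypothetical.

```shell
# Hypothetical helper: writes a JSON merge patch to stdout that sets
# .data.value of a Secret to the base64 encoding of the given file.
# Assumes GNU base64 (the -w 0 flag disables line wrapping).
build_crs_secret_patch() {
  printf '{"data":{"value":"'
  base64 -w 0 "$1"
  printf '"}}\n'
}

# Usage (requires access to the cluster):
#   build_crs_secret_patch antrea-crs.yaml > antrea-crs-patch.json
#   kubectl patch secret -n tkg-system <Mgmt Cluster Name>-antrea-crs \
#     --type merge --patch-file antrea-crs-patch.json
```

The patch is written to a file rather than passed inline because an encoded manifest can exceed the operating system's limit on the length of a single command-line argument.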

Delete the APIService
kubectl delete apiservice v1beta1.controlplane.antrea.tanzu.vmware.com

The TKG Management cluster upgrade can now be started again.

Additional Information

While this KB is specific to controlplane.antrea.tanzu.vmware.com/v1beta1, the same troubleshooting and resolution steps can be applied to other APIServices.