TKG upgrade failed with "unable to retrieve the complete list of server APIs"


Article ID: 313118


TKG Management cluster upgrade fails with error:
action failed after 9 attempts: unable to retrieve the complete list of server APIs:


VMware Tanzu Kubernetes Grid Plus 1.x


Check the status of the APIService. It will most likely be in the FailedDiscoveryCheck state:
kubectl get apiservices
          kube-system/antrea                          False (FailedDiscoveryCheck)
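
On a cluster with many registered APIs, the failing entry is easier to find if the listing is filtered. The following is a minimal sketch, assuming the default column layout of kubectl get apiservices (NAME, SERVICE, AVAILABLE, AGE). The function name unavailable_apiservices is hypothetical and is not part of kubectl or TKG.

```shell
# Print the name of every APIService whose AVAILABLE column is not "True".
# Reads the table produced by `kubectl get apiservices` on stdin and
# skips the header row.
unavailable_apiservices() {
  awk 'NR > 1 && $3 != "True" { print $1 }'
}

# Usage (requires access to the management cluster):
#   kubectl get apiservices | unavailable_apiservices
```

Any name printed by this filter is a candidate for the check described in the next step.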




Confirm that the APIService is no longer used in the relevant TKG version.
In this example, the TKG 1.6 release notes confirm that it uses Antrea 1.5.3, and the Antrea 1.5.3 antrea.yaml file confirms that it does not support v1beta1.
It is therefore safe to delete the APIService. First, however, the APIService needs to be removed from the Antrea CRS secret.

Export the Antrea CRS secret to antrea-crs.yaml, then edit the file and remove the APIService definition
kubectl get secret -n tkg-system <Mgmt Cluster Name>-antrea-crs -o jsonpath='{.data.value}' | base64 -d > antrea-crs.yaml
vi antrea-crs.yaml
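
As an alternative to editing by hand, the removal can be scripted. This is only a sketch: it assumes antrea-crs.yaml is a multi-document YAML file separated by "---" lines and that the APIService manifest has "kind: APIService" at the start of a line. The function name drop_apiservice is hypothetical, and the APIService name must be supplied by the operator. Because any "name:" field inside the document counts as a match, review the result before encoding it.

```shell
# Drop every YAML document whose kind is APIService and which contains a
# "name:" field equal to the given name. All other documents pass through
# unchanged. Reads YAML on stdin, writes YAML on stdout.
drop_apiservice() {
  awk -v name="$1" '
    # Print the buffered document unless it is the APIService to remove.
    function flush() {
      if (n > 0 && !(is_api && has_name)) {
        for (i = 1; i <= n; i++) print buf[i]
      }
      n = 0; is_api = 0; has_name = 0
    }
    /^---[ \t]*$/ { flush() }                  # separator starts a new document
    { buf[++n] = $0 }                          # buffer the current line
    /^kind:[ \t]*APIService[ \t]*$/ { is_api = 1 }
    $1 == "name:" && $2 == name { has_name = 1 }
    END { flush() }
  '
}

# Usage:
#   drop_apiservice "<APIService Name>" < antrea-crs.yaml > antrea-crs.new.yaml
#   diff antrea-crs.yaml antrea-crs.new.yaml
```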

Encode the updated antrea-crs.yaml file and update the Antrea CRS secret .data.value with the encoded data
cat antrea-crs.yaml | base64 -w 0
kubectl edit secret -n tkg-system <Mgmt Cluster Name>-antrea-crs 
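
Pasting a long base64 string into an editor is error-prone, so the secret can instead be updated with a merge patch. The following is a sketch under stated assumptions: it relies on GNU base64 (for -w 0), and the helper name crs_secret_patch is hypothetical.

```shell
# Build a JSON merge patch that sets .data.value of a Secret to the
# base64-encoded content of the file given as the first argument.
# Base64 output contains no characters that need JSON escaping.
crs_secret_patch() {
  printf '{"data":{"value":"%s"}}' "$(base64 -w 0 "$1")"
}

# Usage (requires access to the management cluster):
#   kubectl patch secret -n tkg-system "<Mgmt Cluster Name>-antrea-crs" \
#     --type merge -p "$(crs_secret_patch antrea-crs.yaml)"
```

Re-running the kubectl get secret command from the previous step and decoding the value is a quick way to confirm that the APIService entry is gone.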

Delete the APIService
kubectl delete apiservice <APIService Name>

The TKG Management cluster upgrade can now be started again.

Additional Information

While this KB is specific to the Antrea APIService, the same troubleshooting and resolution steps can be applied to other APIServices.