The Supervisor cluster upgrade is stuck because AKO detects the wrong cloud for the Supervisor cluster.
Check the namespace annotation for cloud information by running the below command. The cloud name in the annotation "ako.vmware.com/wcp-cloud-name" will be different from the expected NSX-T cloud.
kubectl get ns vmware-system-ako -oyaml
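For illustration only, the relevant part of the output might look like the sketch below; the cloud names are placeholders taken from the examples later in this article, and here the annotation points to a cloud other than the expected one:
apiVersion: v1
kind: Namespace
metadata:
  name: vmware-system-ako
  annotations:
    ako.vmware.com/wcp-cloud-name: other-nsxt-cloud   # expected value: nsxt-cloud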
Environment: Avi with NSX-T cloud deployed with TKGS and AKO.
Cause: Multiple NSX-T clouds are configured on the same data transport zone.
Workaround: AKO uses the nsxt-alb user's credentials to make API calls to the Avi Controller. The nsxt-alb user's permissions are defined in the Nsxt-Alb-Admin role. The issue can be resolved by limiting the Nsxt-Alb-Admin role's access to the concerned cloud and its associated resources.
Log in to the Avi Controller and launch "shell" from the Avi CLI.
Restrict cloud read access in Avi. This can be done by running the below commands in the Avi Controller shell:
[admin:10-167-118-30]: > configure controller properties
[admin:10-167-118-30]: controllerproperties> restrict_cloud_read_access
[admin:10-167-118-30]: controllerproperties> save
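Optionally, confirm the change from the same shell by displaying the controller properties and checking that restrict_cloud_read_access is set to True (the show command below is assumed to be available in your Avi version):
[admin:10-167-118-30]: > show controllerproperties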
++ Add a marker to the intended cloud with the key clustername and the value set to the Supervisor cluster name (domain-c10 in this example)
[admin:10-167-118-30]: > configure cloud nsxt-cloud
[admin:10-167-118-30]: cloud> markers
New object being created
[admin:10-167-118-30]: cloud:markers> key clustername
[admin:10-167-118-30]: cloud:markers> values domain-c10
[admin:10-167-118-30]: cloud:markers> save
[admin:10-167-118-30]: cloud> save
++ Add a marker to the conflicting clouds with a key other than clustername
[admin:10-167-118-30]: > configure cloud other-nsxt-cloud
[admin:10-167-118-30]: cloud> markers
New object being created
[admin:10-167-118-30]: cloud:markers> key avi
[admin:10-167-118-30]: cloud:markers> values avi
[admin:10-167-118-30]: cloud:markers> save
[admin:10-167-118-30]: cloud> save
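Optionally, confirm the markers on both clouds from the shell; the markers section of each output should list the key and values configured above (the show command below is assumed to be available for the cloud object in your Avi version):
[admin:10-167-118-30]: > show cloud nsxt-cloud
[admin:10-167-118-30]: > show cloud other-nsxt-cloud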
++ Update the Nsxt-Alb-Admin role to filter out the other clouds by adding a label-based filter to the role. This is done using the Avi UI.
++ Log in to the Avi Controller UI and navigate to Administration > Accounts > Roles
++ Edit the Nsxt-Alb-Admin role, scroll down to the Labels section and click Add
++ Configure the label filter with the following values:
Key → clustername
Criteria → Unix styled glob match
Values → domain* (this matches any value that starts with domain)
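As a sketch, the resulting label filter can also be inspected from the Avi shell after saving the role; this assumes your Avi version provides a show command for the Role object and lists the label filter in its output:
[admin:10-167-118-30]: > show role Nsxt-Alb-Admin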
Restart the AKO pod in the Supervisor cluster. This can be done by logging in to the Supervisor control plane node and running the below command:
kubectl delete pod <pod_name> -n vmware-system-ako
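If the AKO pod name is not known, list the pods in the namespace first and pick the AKO pod from the output (the exact pod name varies by release):
kubectl get pods -n vmware-system-ako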
Verify
===
Run the below command to check the cloud detected by AKO.
kubectl get ns vmware-system-ako -oyaml
The cloud name in the annotation "ako.vmware.com/wcp-cloud-name" should now reflect the correct cloud (refer to the Issue section above).
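Alternatively, the annotation value can be read directly with a JSONPath query; the backslash-escaped dots in the annotation key are required by kubectl's JSONPath syntax:
kubectl get ns vmware-system-ako -o jsonpath='{.metadata.annotations.ako\.vmware\.com/wcp-cloud-name}'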