During a vSphere Supervisor upgrade, a "Component AKOUpgrade failed" error message is received.


Article ID: 423482

Updated On:

Products

VMware vSphere Kubernetes Service

Issue/Introduction

While upgrading a vSphere Supervisor to the next incremental version, the upgrade fails with a configuration error stating "Component AKOUpgrade failed":

  • A general system error occurred. Error message: Component AKOUpgrade failed: Failed to run command: ['kubectl', 'rollout', 'status', 'deployment', 'vmware-system-ako-ako-controller-manager', '-n', 'vmware-system-ako', '--timeout=3m', '--watch=true'] ret=1 out=Waiting for deployment "vmware-system-ako-ako-controller-manager" rollout to finish: 0 of 1 updated replicas are available... err=error: timed out waiting for the condition Component upgrade failed..
  • The vmware-system-ako-ako-controller-manager pod restarts repeatedly and ends up in a CrashLoopBackOff state:
    • NAMESPACE                                   NAME                                                              READY   STATUS             RESTARTS        AGE
      vmware-system-ako                           vmware-system-ako-ako-controller-manager-##########-#####         1/2     CrashLoopBackOff   51 (2m6s ago)   3h58m
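
To spot this symptom quickly, a pod listing can be filtered for the CrashLoopBackOff status. The following is a minimal Python sketch, assuming the output of a command such as `kubectl get pods -A` has been captured as text. The function name `find_crashlooping_pods` is hypothetical, and the column order is taken from the listing above.

```python
def find_crashlooping_pods(listing):
    """Return (namespace, name, ready) for rows whose STATUS column is CrashLoopBackOff."""
    hits = []
    for line in listing.splitlines():
        fields = line.split()
        # Assumed column order: NAMESPACE NAME READY STATUS RESTARTS AGE
        if len(fields) >= 4 and fields[3] == "CrashLoopBackOff":
            hits.append((fields[0], fields[1], fields[2]))
    return hits


# Sample rows copied from the listing above (pod suffix masked as in the article).
sample = (
    "NAMESPACE           NAME                                                        READY   STATUS             RESTARTS        AGE\n"
    "vmware-system-ako   vmware-system-ako-ako-controller-manager-##########-#####   1/2     CrashLoopBackOff   51 (2m6s ago)   3h58m\n"
)

for namespace, name, ready in find_crashlooping_pods(sample):
    print(f"{namespace}/{name} is crash-looping (ready {ready})")
```

A READY value of 1/2 alongside CrashLoopBackOff indicates that one of the two containers in the pod is failing, which matches the rollout timeout reported by the upgrade.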

Environment

  • vSphere Supervisor with NSX-T and AVI-ALB

Cause

The Avi Controller information is missing because the "avi-secret" secret cannot be found, as shown in the vmware-system-ako-ako-controller-manager logs.

  • The vmware-system-ako-ako-controller-manager logs may include entries similar to the following:
    • 2025-##-##T##:##:##.#########Z stdout F 2025-##-##T##:##:##.###Z        ESC[31mERRORESC[0m      lib/lib.go:1720 secrets "avi-secret" not found
      2025-##-##T##:##:##.#########Z stdout F 2025-##-##T##:##:##.###Z        ESC[33mWARNESC[0m       ako-main/main.go:237    Error while fetching secret for AKO bootstrap secrets "avi-secret" not found
      2025-##-##T##:##:##.#########Z stdout F 2025-##-##T##:##:##.###Z        ESC[34mINFOESC[0m       api/api.go:68   Shutting down the API server
      2025-##-##T##:##:##.#########Z stdout F 2025-##-##T##:##:##.###Z        ESC[31mFATALESC[0m      cache/avi_ctrl_clients.go:52    Avi Controller information missing (username: , password: , authToken: , controller: ###.###.###.###). Update them 
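
When reviewing a large log bundle, these entries can be located by searching for their signature strings. The following is a minimal Python sketch, assuming the AKO container log has already been read into a string. The function name `find_ako_secret_errors` is hypothetical, and the signatures are taken from the excerpt above. The color codes are stripped first, both as real ANSI escapes and as the literal "ESC[" rendering shown in the excerpt.

```python
import re

# Matches real ANSI color codes (\x1b[31m) and the literal "ESC[31m" rendering seen in the excerpt.
ANSI_COLOR = re.compile(r"(?:\x1b|ESC)\[[0-9;]*m")

# Signature strings taken from the log excerpt above.
SIGNATURES = (
    'secrets "avi-secret" not found',
    "Avi Controller information missing",
)


def find_ako_secret_errors(log_text):
    """Return color-stripped log lines that contain one of the known signatures."""
    matches = []
    for raw in log_text.splitlines():
        line = ANSI_COLOR.sub("", raw)
        if any(signature in line for signature in SIGNATURES):
            matches.append(line.strip())
    return matches


# Shortened sample lines based on the excerpt above (timestamps omitted).
sample_log = "\n".join([
    'ESC[31mERRORESC[0m      lib/lib.go:1720 secrets "avi-secret" not found',
    "ESC[34mINFOESC[0m       api/api.go:68   Shutting down the API server",
    "ESC[31mFATALESC[0m      cache/avi_ctrl_clients.go:52    Avi Controller information missing",
])

for entry in find_ako_secret_errors(sample_log):
    print(entry)
```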

 

This issue can occur when a custom certificate is applied to the Avi Load Balancer and its certificate chain is not available within NSX Manager.

  • Logs from the nsx-ncp pods will include entries similar to the following:
    • 2025-##-##T##:##:##.#########Z stderr F [ncp GreenThread-132 W] vmware_nsxlib.v3.client The HTTP request returned error code 400, whereas 201/200 response codes were expected. Response body {'httpStatus': 'BAD_REQUEST', 'error_code': 500016, 'module_name': 'Policy', 'error_message': 'Error: I/O error on GET request for \"https://###.###.###.###/api/user\": PKIX path building failed: java.security.cert.CertPathBuilderException: Unable to find certificate chain.; nested exception is javax.net.ssl.SSLHandshakeException: PKIX path building failed: java.security.cert.CertPathBuilderException: Unable to find certificate chain.'}","annotations":{"last-sync":"##########.#######","prometheus.io/port":"8001","prometheus.io/scrape":"true"},"namespace_name":"vmware-system-nsx","pod_id":"########-####-####-####-############","labels":{"pod-template-hash":"#########","tier":"nsx-networking","component":"nsx-ncp","version":"v1"},"procid":"nsx-ncp-#########-#####"}]
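
To confirm which endpoint NSX fails to trust, the request URL can be pulled out of the nsx-ncp entries that report the certificate chain failure. The following is a minimal Python sketch, assuming the NCP log text is available as a string and that the message format matches the entry above. The function name `find_untrusted_endpoints` is hypothetical.

```python
import re

# The URL appears inside backslash-escaped quotes in the NCP log entry.
REQUEST_URL = re.compile(r'GET request for \\?"(https?://[^"\\]+)')
CHAIN_ERROR = "PKIX path building failed"


def find_untrusted_endpoints(log_text):
    """Return the unique URLs for which the log reports a certificate chain failure."""
    urls = []
    for line in log_text.splitlines():
        if CHAIN_ERROR not in line:
            continue
        match = REQUEST_URL.search(line)
        if match and match.group(1) not in urls:
            urls.append(match.group(1))
    return urls


# Shortened sample based on the entry above (address masked as in the article).
sample_entry = (
    "[ncp GreenThread-132 W] vmware_nsxlib.v3.client "
    "'error_message': 'Error: I/O error on GET request for "
    '\\"https://###.###.###.###/api/user\\": '
    "PKIX path building failed: java.security.cert.CertPathBuilderException: "
    "Unable to find certificate chain.'"
)

print(find_untrusted_endpoints(sample_entry))
```

If the returned address belongs to the Avi Load Balancer controller, that is consistent with the missing certificate chain described above.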

Resolution

Review KB article 385435 for additional details on this particular issue, and the steps to resolve it: