AKO Pod Crashing Due to Expired Avi Controller Certificate
search cancel

AKO Pod Crashing Due to Expired Avi Controller Certificate

book

Article ID: 439552

calendar_today

Updated On:

Products

VMware Avi Load Balancer

Issue/Introduction

  • The Avi Kubernetes Operator (AKO) pod on the Supervisor Cluster enters a CrashLoopBackOff state. The AKO pod status can be verified using the following command:
kubectl get pods -n vmware-system-ako

Example output:
 
vmware-system-ako ako-controller-manager-######## 1/2 CrashLoopBackOff
 
  • The AKO pod logs can be collected using the command: kubectl logs <ako-pod-name> -n vmware-system-ako. The following error is observed in the AKO logs:
Client error for URI: login. Error: Post "https://[IP_ADDRESS]/login": tls: failed to verify certificate: x509: certificate has expired or is not yet valid
 




Environment

VKS with NSX and Avi LB

Cause

The Avi controller portal certificate has expired, preventing the AKO pod from establishing a secure connection to the controller.

To confirm if the Avi portal certificate has expired, follow the below steps: 

  1. Login to Avi Controller UI and navigate to Administration > system settings and note the name of the certificate under SSL - SSL/TLS Certificate in system settings. 
  2. Navigate to templates > security > ssl/tls certificate and check the "valid until" date for the certificate noted in step 1. This will confirm the expiry status of the certificate.  

 

 

 

Resolution

Follow these steps to renew the certificate and restore AKO pod functionality:

  1. Generate a new certificate for Avi Load Balancer:

    • Create a server certificate for Avi load balancer. You can either self-sign the certificate or get it signed by a CA. This certificate should include the
      Avi Load Balancer
      Controller IP in its
      Subject Alternative Name (SAN)
      field.
  2. Apply the certificate to the Avi portal:

    • Navigate to Administration > System Settings.
    • Click Edit and go to SSL/TLS Certificates.
    • Select the newly created controller certificate under SSL/TLS Certificates section. 
  3. Deregister the Avi Load Balancer from NSX using the below API Call: 

    curl -k --location -u 'NSX_admin_user:NSX_admin_password' --request DELETE https://[HOSTNAME]/policy/api/v1/infra/alb-onboarding-workflow/LCM

     

  4. Register the Avi Load Balancer to NSX using the following command: 

    curl -k --location -u 'NSX_admin_user:NSX_admin_password' --request PUT 'https://<nsx_mgr_address>/policy/api/v1/infra/alb-onboarding-workflow' \
    --header 'X-Allow-Overwrite: True' \
    --header 'Content-Type: application/json' \
    --data-raw '{
    "owned_by": "LCM",
    "cluster_ip": "<cluster_ip>",
    "infra_admin_username" : "<Avi_cluster_username>",
    "infra_admin_password" : "<Avi_cluster_password>",
    "dns_servers": ["<dns-server ip>"],
    "ntp_servers": ["<ntp server ip>"]
    }'

     

  5. Validate the registration is successful using the below API: 

    curl -k -u 'NSX_admin_user:NSX_admin_password' --location --request GET \
    https://<nsx-mgr-ip>/policy/api/v1/infra/sites/default/enforcement-points/alb-endpoint

    A successful registration returns the status as: DEACTIVATE_PROVIDER

  6. Restart NSX and AKO pods: Delete the existing pods to trigger a restart on the supervisor cluster:

    • Restart NCP pods: 
      kubectl delete pod <ncp-pod> -n vmware-system-nsx
    • Restart AKO pods: 
      kubectl delete pod <ako-pod> -n vmware-system-ako 
 
 

Additional Information

Below is a reference KB on how to register Avi load balancer with a NSX manager instance: 

Registering-avi-load-balancer-cluster