Problem Statement:
The AKO (Avi Kubernetes Operator) pod running on the supervisor cluster crashes persistently. The issue is observed under the following conditions:
a) The portal certificate set under Administration -> System Settings -> SSL/TLS Certificate on the Avi Controller is issued by a CA that is not well-known.
b) The avi-secret within the vmware-system-ako namespace on the supervisor cluster is unavailable.
Troubleshooting Steps:
1) Verify avi-secret existence
kubectl get secret avi-secret -n vmware-system-ako
2) Inspect AKO Pod Logs:
kubectl logs <ako-pod-name> -n vmware-system-ako
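The two checks above can be combined into one helper. This is a minimal sketch rather than part of the documented procedure: the function name ako_diagnose is hypothetical, and it assumes kubectl is already configured for the supervisor cluster. It tries kubectl logs --previous first, because the log of the last terminated container is often where the crash reason appears.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether avi-secret exists and print AKO pod logs.
# Assumes kubectl is already pointed at the supervisor cluster.

AKO_NAMESPACE="vmware-system-ako"

ako_diagnose() {
  # Step 1: check that avi-secret exists in the AKO namespace.
  if kubectl get secret avi-secret -n "$AKO_NAMESPACE" >/dev/null 2>&1; then
    echo "avi-secret: present"
  else
    echo "avi-secret: missing"
  fi

  # Step 2: print logs for every pod in the namespace.
  # --previous reads the last terminated container; fall back to current logs.
  local pod
  for pod in $(kubectl get pods -n "$AKO_NAMESPACE" -o name); do
    echo "--- logs for $pod ---"
    kubectl logs "$pod" -n "$AKO_NAMESPACE" --previous 2>/dev/null \
      || kubectl logs "$pod" -n "$AKO_NAMESPACE"
  done
}

# Usage (run manually): ako_diagnose
```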
This applies to TKGS clusters deployed with NSX-T networking and VMware Avi Load Balancer with an NSX-T cloud.
The issue occurs when a certificate issued by a CA that is not well-known is used as the portal certificate on the Avi Controller.
Procedure:
1. Exporting Certificates from the Avi Controller:
2. Constructing the Certificate Chain File:
Create a text file (e.g., avi-certificate.crt) in the /tmp directory.
Arrange the copied certificates within the file according to the following formats:
a) Root Certificate Only:
-----BEGIN CERTIFICATE-----
(Root Certificate Content)
-----END CERTIFICATE-----
b) Root and Intermediate Certificates:
-----BEGIN CERTIFICATE-----
(Intermediate Certificate Content)
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
(Root Certificate Content)
-----END CERTIFICATE-----
Note: The intermediate certificate should precede the root certificate in the file.
c) Self-Signed Certificate:
-----BEGIN CERTIFICATE-----
(Self-Signed Certificate Content)
-----END CERTIFICATE-----
Note: For self-signed certificates, only the leaf certificate is required.
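The ordering rules above can be applied with a short script instead of manual copy and paste. The sketch below is illustrative only: build_chain_file is a hypothetical helper, the input file names are placeholders, and it assumes each exported PEM file ends with a newline. It concatenates the files in the order given and checks that the BEGIN and END markers balance.

```shell
#!/usr/bin/env bash
# Hypothetical helper: assemble the chain file in the order described above.
# Arguments: output file, then PEM files in the order they should appear
# (intermediate first and root last, or a single root or self-signed cert).
# Assumes every input file ends with a newline.

build_chain_file() {
  local out="$1"
  shift
  # Concatenate the PEM files in the given order.
  cat "$@" > "$out"

  # Sanity check: BEGIN and END marker counts must match and be non-zero.
  local begins ends
  begins=$(grep -c -e '-----BEGIN CERTIFICATE-----' "$out" || true)
  ends=$(grep -c -e '-----END CERTIFICATE-----' "$out" || true)
  if [ "$begins" -eq 0 ] || [ "$begins" -ne "$ends" ]; then
    echo "unexpected PEM structure in $out" >&2
    return 1
  fi
  echo "$out contains $begins certificate(s)"
}

# Usage (input file names are placeholders):
#   build_chain_file /tmp/avi-certificate.crt intermediate.pem root.pem
```

Passing the intermediate before the root on the command line reproduces format (b); passing a single file covers formats (a) and (c).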
3. Copy the file (example: avi-certificate.crt) to the /tmp directory on all three NSX-T Managers.
4. Add the new certificate to the CA store on the NSX-T Manager. Run the commands below:
a) keytool -importcert -alias avi-portal-certificate -keystore /usr/lib/jvm/jre/lib/security/cacerts -storepass changeit -file /tmp/avi-certificate.crt
If the above path is not found, please use the command below:
b) keytool -importcert -alias avi-portal-certificate -keystore /usr/java/jre/lib/security/cacerts -storepass changeit -file /tmp/avi-certificate.crt
c) sudo cp /tmp/avi-certificate.crt /usr/local/share/ca-certificates/
d) sudo update-ca-certificates
e) Complete the above changes on all three NSX-T nodes.
f) service proton restart (On the leader node)
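Steps (a) and (b) differ only in the keystore path, so the choice can be scripted. The following is a sketch under assumptions: find_cacerts and import_avi_cert are hypothetical helper names, and only the two keystore paths named above are checked. The keytool arguments are the same as in steps (a) and (b).

```shell
#!/usr/bin/env bash
# Hypothetical helper: pick the first cacerts keystore that exists.
# With no arguments it checks the two paths named in steps (a) and (b).

find_cacerts() {
  if [ "$#" -eq 0 ]; then
    set -- /usr/lib/jvm/jre/lib/security/cacerts \
           /usr/java/jre/lib/security/cacerts
  fi
  local candidate
  for candidate in "$@"; do
    if [ -f "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  echo "no cacerts keystore found" >&2
  return 1
}

# Import the chain file into whichever keystore was found.
import_avi_cert() {
  local keystore
  keystore=$(find_cacerts) || return 1
  keytool -importcert -alias avi-portal-certificate \
    -keystore "$keystore" -storepass changeit \
    -file /tmp/avi-certificate.crt
}

# Usage on each NSX-T Manager node (run manually): import_avi_cert
```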
Verification
A) Certificate Upload on NSX-T
1) List all certificates and search for the alias:
keytool -list -v -keystore /usr/java/jre/lib/security/cacerts -storepass changeit | grep -i alias
Look for the alias created earlier, i.e. avi-portal-certificate.
2) If importing a certificate into the cacerts keystore fails with the following error:
keytool error: java.lang.Exception: Certificate not imported, alias <startssl> already exists
run the command in (1) and make sure the alias you intend to create is not already listed in its output.
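Because grep -i alias matches any line containing the word, an exact comparison can be easier to read in a large keystore. The sketch below is an assumption-labelled helper, not part of the original procedure: alias_in_listing is a hypothetical name, and it relies on keytool -list -v printing one "Alias name: <alias>" line per entry.

```shell
#!/usr/bin/env bash
# Hypothetical helper: exit 0 if the given alias appears in keytool output.
# $1 = alias to look for; the output of `keytool -list -v` is read from stdin.
# The comparison is exact (apart from letter case), so a longer alias that
# merely starts with the same text does not count as a match.

alias_in_listing() {
  awk -v want="$1" '
    tolower($0) ~ /^alias name:/ {
      sub(/^[^:]*:[ \t]*/, "")            # keep only the alias value
      if (tolower($0) == tolower(want)) found = 1
    }
    END { exit found ? 0 : 1 }
  '
}

# Usage (keystore path as in step 1):
#   keytool -list -v -keystore /usr/java/jre/lib/security/cacerts \
#     -storepass changeit | alias_in_listing avi-portal-certificate \
#     && echo "alias present" || echo "alias not present"
```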
B) Verify the AKO pod is running
1) Check whether the avi-secret has been created on the Supervisor Cluster:
kubectl get secret -n vmware-system-ako
2) If the avi-secret has not been created, restart (delete) the NCP pods:
kubectl get pods -A | grep -i ncp
kubectl delete pod <ncp-pod-name> -n <namespace>
3) If the secret has been created and the AKO pod is still crashing, delete the AKO pod.
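The decision in steps 1) to 3) can be sketched as a small read-only check. This is illustrative only: ako_next_step is a hypothetical helper, it assumes kubectl is configured for the supervisor cluster, and it only prints what to do next; it does not delete any pod.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print which pod to restart, based on avi-secret.
# Read-only: it lists pods but never deletes anything.

AKO_NAMESPACE="vmware-system-ako"

ako_next_step() {
  if ! kubectl get secret avi-secret -n "$AKO_NAMESPACE" >/dev/null 2>&1; then
    # Secret missing: the NCP pods are the ones to restart (step 2).
    echo "avi-secret missing: restart the NCP pod(s) listed below"
    kubectl get pods -A | grep -i ncp || echo "no NCP pod found"
    return 0
  fi
  # Secret present: only the AKO pod may need to be deleted (step 3).
  echo "avi-secret present: if the AKO pod is still crashing, delete it"
  kubectl get pods -n "$AKO_NAMESPACE"
}

# Usage (run manually): ako_next_step
```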
C) Call the following API to get the ALB endpoint:
curl -kv https://{{nsx_ip}}/policy/api/v1/infra/sites/default/enforcement-points/alb-endpoint
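The {{nsx_ip}} placeholder must be replaced with the address of an NSX-T Manager before the call is made. A minimal sketch follows; alb_endpoint_url is a hypothetical helper, and the address and user name in the usage note are placeholders. It assumes basic authentication is used, in which case curl -u prompts for the password.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the ALB endpoint URL for a given NSX Manager.
# $1 = NSX Manager IP or FQDN (replaces the {{nsx_ip}} placeholder).

alb_endpoint_url() {
  echo "https://$1/policy/api/v1/infra/sites/default/enforcement-points/alb-endpoint"
}

# Usage (address and user are placeholders; curl prompts for the password):
#   curl -k -u admin "$(alb_endpoint_url 192.0.2.10)"
```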