Problem Statement:
The AKO (Avi Kubernetes Operator) pod running on the Supervisor cluster crashes repeatedly. The issue is correlated with the following conditions:
The portal certificate on the Avi Controller is issued by a CA that is not well known. The portal certificate is configured under Administration -> System Settings -> SSL/TLS Certificate on the Avi Controller.
The avi-secret secret in the vmware-system-ako namespace on the Supervisor cluster is unavailable.
Troubleshooting Steps:
While connected to the Supervisor cluster context, the following symptoms are observed:
kubectl get secret -n vmware-system-ako
kubectl get pods -n vmware-system-ako
kubectl logs -n vmware-system-ako <AKO pod name> -c manager
POLICY REST API failed: /api/systemconfiguration/?include_name I/O error on GET request for "https://<AVI controller IP>/api/systemconfiguration/": PKIX path building failed: java.security.cert.CertPathBuilderException: Unable to find certificate chain.; nested exception is javax.net.ssl.SSLHandshakeException: PKIX path building failed: java.security.cert.CertPathBuilderException: Unable to find certificate chain.
[ALB Controller] Caught exception during Get CA Certs : org.springframework.web.client.ResourceAccessException: I/O Error on GET request for "https://<AVI controller IP>/api/systemconfiguration/": PKIX path building failed: java.security.cert.CertPathBuilderException: Unable to find certificate chain.
1 avisession.go:668] Client error for URI: login. Error: Post "https://<AVI controller IP>/login": x509: certificate signed by unknown authority
1 avisession.go:716] Failed to invoke API. Error: Post "https://<AVI controller IP>/login": x509: certificate signed by unknown authority
1 avisession.go:383] response error: Rest request error, returning to caller: Post "https://<AVI controller IP>/login": x509: certificate signed by unknown authority
ERROR k8s/ako_init.go:384 AVI controller initialization failed with err: Rest request error, returning to caller: Post "https://<AVI controller IP>/login": x509: certificate signed by unknown authority
INFO lib/dynamic_client.go:377 init Secret not found, retrying...
FATAL lib/dynamic_client.go:374 Found new init secret, rebooting AKO
kubectl get pods -n vmware-system-nsx
kubectl logs -n vmware-system-nsx <nsx ncp pod name> -c nsx-ncp
Failed to get secret avi-init-secret
creating avi-init-secret
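The log excerpts above share two signatures: the Go-side "x509: certificate signed by unknown authority" and the Java-side "PKIX path building failed". A minimal sketch of a filter that flags either one in log text is shown below. The function name has_cert_trust_error is hypothetical and is not part of kubectl, AKO, or NSX-T.

```shell
#!/bin/sh
# Sketch: scan log text on stdin for the certificate-trust errors quoted above.
# has_cert_trust_error is a hypothetical helper name used only for illustration.
has_cert_trust_error() {
    # -E: extended regex; -q: print nothing, exit status 0 means a match was found
    grep -Eq 'x509: certificate signed by unknown authority|PKIX path building failed'
}

# Example usage (assumes kubectl access to the Supervisor cluster context):
#   kubectl logs -n vmware-system-ako <AKO pod name> -c manager | has_cert_trust_error \
#       && echo "certificate trust failure detected"
```

A match only indicates that the certificate chain was rejected; it does not by itself confirm which certificate is missing from the trust store.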
Environment:
vSphere Supervisor with NSX-T and AVI-ALB within NSX-T Cloud
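Cause:
This issue occurs when the portal certificate on the Avi Controller is issued by a CA that is not well known. The certificate chain then cannot be validated, as the PKIX and x509 errors in the logs indicate.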
Procedure:
Navigate to Templates -> Security -> SSL Certificate within the Avi Controller web interface.
Locate the SSL certificate intended for use with NSX-T.
Click the download button associated with the certificate.
Copy the certificate data (excluding the private key).
Repeat this process for any intermediate certificates in the chain.
Create a text file (e.g., avi-certificate.crt) in the /tmp directory.
Arrange the copied certificates within the file according to the following formats:
Root Certificate Only:
-----BEGIN CERTIFICATE-----
(Root Certificate Content)
-----END CERTIFICATE-----
Root and Intermediate Certificates:
-----BEGIN CERTIFICATE-----
(Intermediate Certificate Content)
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
(Root Certificate Content)
-----END CERTIFICATE-----
Note: The intermediate certificate should precede the root certificate in the file.
Self-Signed Certificate:
-----BEGIN CERTIFICATE-----
(Self-Signed Certificate Content)
-----END CERTIFICATE-----
Note: For self-signed certificates, only the leaf certificate is required.
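Before importing the file, it can help to confirm that it matches one of the layouts above: only certificate blocks, no private key, and matching BEGIN and END markers. The following is a rough sketch of such a check. The function name check_pem_bundle is hypothetical, and the check inspects markers only; it does not validate certificate contents or chain order.

```shell
#!/bin/sh
# Sketch: sanity-check a PEM bundle such as /tmp/avi-certificate.crt.
# check_pem_bundle is a hypothetical helper; it looks at PEM markers only.
check_pem_bundle() {
    bundle=$1
    if [ ! -r "$bundle" ]; then
        echo "error: cannot read $bundle" >&2
        return 1
    fi
    # The procedure says to exclude the private key from the copied data.
    if grep -q 'PRIVATE KEY-----' "$bundle"; then
        echo "error: bundle contains a private key" >&2
        return 1
    fi
    # grep -c exits non-zero when the count is 0, so ignore its exit status.
    begins=$(grep -c -e '-----BEGIN CERTIFICATE-----' "$bundle" || true)
    ends=$(grep -c -e '-----END CERTIFICATE-----' "$bundle" || true)
    if [ "$begins" -eq 0 ] || [ "$begins" -ne "$ends" ]; then
        echo "error: found $begins BEGIN and $ends END markers" >&2
        return 1
    fi
    # Print the number of certificate blocks found.
    echo "$begins"
}

# Example usage:
#   check_pem_bundle /tmp/avi-certificate.crt
```

For the root-only and self-signed layouts the printed count would be 1; for a root plus one intermediate it would be 2.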
Import the new certificate into the CA store on the NSX-T Manager by running the following commands:
keytool -importcert -alias avi-portal-certificate -keystore /usr/lib/jvm/jre/lib/security/cacerts -storepass changeit -file /tmp/avi-certificate.crt
If the keystore path above does not exist, use the following command instead:
keytool -importcert -alias avi-portal-certificate -keystore /usr/java/jre/lib/security/cacerts -storepass changeit -file /tmp/avi-certificate.crt
sudo cp /tmp/avi-certificate.crt /usr/local/share/ca-certificates/
sudo update-ca-certificates
Perform all of the above steps on each NSX-T manager node.
Once the above steps are completed on each NSX-T manager node, restart the proton service on the leader node:
service proton restart
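Because the cacerts keystore may be in either of the two locations given in the steps above, the path selection can be scripted. The sketch below picks whichever candidate exists. The function name first_existing_file is hypothetical, and the usage comment assumes it is run on an NSX-T Manager node with the certificate file already in /tmp.

```shell
#!/bin/sh
# Sketch: print the first argument that is an existing regular file.
# first_existing_file is a hypothetical helper used only for illustration.
first_existing_file() {
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    # None of the candidates exist.
    return 1
}

# Example usage (both keystore paths are taken from the procedure above):
#   keystore=$(first_existing_file \
#       /usr/lib/jvm/jre/lib/security/cacerts \
#       /usr/java/jre/lib/security/cacerts) &&
#   keytool -importcert -alias avi-portal-certificate -keystore "$keystore" \
#       -storepass changeit -file /tmp/avi-certificate.crt
```

If neither path exists the function returns a non-zero status and the keytool command is not run, which avoids creating a new keystore at an unintended location.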
A) Certificate Upload on NSX-T
keytool -list -v -keystore /usr/java/jre/lib/security/cacerts -storepass changeit | grep -i alias
keytool error: java.lang.Exception: Certificate not imported, alias <startssl> already exists
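If the import fails with the error above, the alias is already in use in the keystore. Ensure that the alias you are using is not listed in the output of the keytool -list command, and choose a different alias if it is.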
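The alias check can also be done before attempting the import. The sketch below reads keytool -list -v output on standard input and reports whether a given alias is already present. It assumes the usual "Alias name: <alias>" line format of keytool's verbose listing; the function name alias_in_use is hypothetical.

```shell
#!/bin/sh
# Sketch: succeed (exit 0) if the given alias appears in `keytool -list -v`
# output supplied on stdin. alias_in_use is a hypothetical helper name.
alias_in_use() {
    wanted=$1
    awk -v wanted="$wanted" '
        # Verbose keytool listings print one "Alias name: <alias>" line per entry.
        tolower($0) ~ /^alias name:/ {
            name = $0
            sub(/^[^:]*:[ \t]*/, "", name)
            # Compare case-insensitively as a conservative choice.
            if (tolower(name) == tolower(wanted)) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Example usage on an NSX-T Manager node:
#   keytool -list -v -keystore /usr/java/jre/lib/security/cacerts -storepass changeit \
#       | alias_in_use avi-portal-certificate && echo "alias already exists"
```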
B) Verify that the AKO pod is running
kubectl get secret -n vmware-system-ako
kubectl rollout restart deploy -n vmware-system-nsx nsx-ncp
kubectl rollout restart deploy -n vmware-system-ako
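After the restarts, the pod status can be checked by filtering the default kubectl get pods table for anything that is not in the Running state. The sketch below assumes the default column layout (NAME, READY, STATUS, RESTARTS, AGE); the function name pods_not_running is hypothetical.

```shell
#!/bin/sh
# Sketch: read `kubectl get pods` output on stdin and print the name and
# status of every pod whose STATUS column is not "Running".
# pods_not_running is a hypothetical helper used only for illustration.
pods_not_running() {
    # NR > 1 skips the header row; $1 is NAME and $3 is STATUS.
    awk 'NR > 1 && $3 != "Running" { print $1, $3 }'
}

# Example usage (assumes the Supervisor cluster context):
#   kubectl get pods -n vmware-system-ako | pods_not_running
```

Empty output suggests that all pods in the namespace report Running. Pods that have finished normally would also be listed, since their status is not Running.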
C) Run the following API call to validate the ALB endpoint:
curl -kv https://{{nsx_ip}}/policy/api/v1/infra/sites/default/enforcement-points/alb-endpoint