vmware-vapi-endpoint fails to start or crashes after upgrading to vCenter Server 6.5 Update 2
Article ID: 342879
Updated On:
Products
VMware vCenter Server
Issue/Introduction
Symptoms:
After upgrading to vCenter Server 6.5 Update 2, the vmware-vapi-endpoint fails to start or crashes.
In the endpoint.log file, you see entries similar to:
# less /var/log/vmware/vapi/endpoint/endpoint.log
Caused by: javax.net.ssl.SSLHandshakeException: com.vmware.vim.vmomi.client.exception.VlsiCertificateException: Server certificate chain is not trusted and thumbprint verification is not configured
    at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
    at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1964)
    at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:328)
    at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:322)
    at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1614)
    at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216)
    at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1052)
    at sun.security.ssl.Handshaker.process_record(Handshaker.java:987)
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1072)
    at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1413)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1397)
    at com.vmware.vim.vmomi.client.http.impl.ThumbprintTrustManager$HostnameVerifier.verify(ThumbprintTrustManager.java:420)
    ... 45 more
Caused by: com.vmware.vim.vmomi.client.exception.VlsiCertificateException: Server certificate chain is not trusted and thumbprint verification is not configured
    at com.vmware.vim.vmomi.client.http.impl.ThumbprintTrustManager.checkServerTrusted(ThumbprintTrustManager.java:206)
    at sun.security.ssl.AbstractTrustManagerWrapper.checkServerTrusted(SSLContextImpl.java:985)
    at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1596)
    ... 53 more
Caused by: com.vmware.identity.vecs.VecsGenericException: Native platform error [code: 87][Enum of entries on store 'TRUSTED_ROOT_CRLS' failed. [Server: __localhost__, User: __localuser__]]
    at com.vmware.identity.vecs.VecsEntryEnumeration.BAIL_ON_ERROR(VecsEntryEnumeration.java:108)
    at com.vmware.identity.vecs.VecsEntryEnumeration.enumEntries(VecsEntryEnumeration.java:139)
    at com.vmware.identity.vecs.VecsEntryEnumeration.fetchMoreEntries(VecsEntryEnumeration.java:122)
    at com.vmware.identity.vecs.VecsEntryEnumeration.<init>(VecsEntryEnumeration.java:36)
    at com.vmware.identity.vecs.VMwareEndpointCertificateStore.enumerateEntries(VMwareEndpointCertificateStore.java:369)
    at com.vmware.provider.VecsCertStoreEngine.engineGetCRLs(VecsCertStoreEngine.java:77)
    at java.security.cert.CertStore.getCRLs(CertStore.java:181)
    at com.vmware.vim.vmomi.client.http.impl.ThumbprintTrustManager.checkForRevocation(ThumbprintTrustManager.java:246)
    at com.vmware.vim.vmomi.client.http.impl.ThumbprintTrustManager.checkServerTrusted(ThumbprintTrustManager.java:158)
    ... 55 more
YYYY-MM-DDTHH:MM:SS.685+02:00 | INFO | state-manager1 | HealthStatusCollectorImpl | HEALTH ORANGE Failed to retrieve SSO settings from component manager.
YYYY-MM-DDTHH:MM:SS.685+02:00 | ERROR | state-manager1 | DefaultStateManager | Could not initialize endpoint runtime state.
com.vmware.vapi.endpoint.config.ConfigurationException: Failed to retrieve SSO settings.
    at com.vmware.vapi.endpoint.cis.SsoSettingsBuilder.buildInitial(SsoSettingsBuilder.java:63)
    at com.vmware.vapi.state.impl.DefaultStateManager.build(DefaultStateManager.java:354)
    at com.vmware.vapi.state.impl.DefaultStateManager$1.doInitialConfig(DefaultStateManager.java:168)
    at com.vmware.vapi.state.impl.DefaultStateManager$1.run(DefaultStateManager.java:151)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
In the vmafdd-syslog.log file, you see the same certificates being pushed to VECS repeatedly. You can verify this by running the following command:
# grep "Added cert to VECS DB" /var/log/vmware/vmafdd/vmafdd-syslog.log
YY-MM-DDTHH:MM:SS.090346+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.085596+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.089158+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.041227+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.084083+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.095645+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.087458+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.318936+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.091393+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.108070+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.082253+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.098974+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.084759+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.086880+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.092401+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.099424+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: ########################################
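To confirm that the same certificates are being added repeatedly, the log entries can be grouped by the identifier at the end of each line. The following is a minimal sketch, assuming the identifier is the last whitespace-separated field of each "Added cert to VECS DB" entry, as in the sample output above. The function name count_vecs_adds is hypothetical.

```shell
# Count how often each certificate identifier appears in
# "Added cert to VECS DB" log entries, most frequent first.
# Assumption: the identifier is the last field on each matching line.
count_vecs_adds() {
    local log="${1:-/var/log/vmware/vmafdd/vmafdd-syslog.log}"
    grep "Added cert to VECS DB" "$log" \
        | awk '{ print $NF }' \
        | sort | uniq -c | sort -rn
}
```

Running `count_vecs_adds | head` prints a count followed by the identifier on each line. A small set of identifiers with high counts suggests the same certificates are being pushed again and again.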
Note: The CRL store is filled with spurious entries and the number grows indefinitely over time. Run the following command to see the current number and to monitor growth:
# /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store TRUSTED_ROOT_CRLS --text | wc -l
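To monitor growth over time, the command above can be sampled at intervals. The following is a rough sketch that wraps the same vecs-cli invocation. The function names, the VECS_CLI variable, and the default sample count and interval are illustrative assumptions.

```shell
# Path to vecs-cli, as used in this article. Override for testing.
VECS_CLI="${VECS_CLI:-/usr/lib/vmware-vmafd/bin/vecs-cli}"

# Number of lines reported for the TRUSTED_ROOT_CRLS store.
crl_store_lines() {
    "$VECS_CLI" entry list --store TRUSTED_ROOT_CRLS --text | wc -l
}

# Print "<timestamp> <line count>" a fixed number of times.
# $1 = number of samples (default 5), $2 = seconds between samples (default 60).
watch_crl_store() {
    local samples="${1:-5}" interval="${2:-60}" i=1
    while [ "$i" -le "$samples" ]; do
        printf '%s %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$(crl_store_lines)"
        if [ "$i" -lt "$samples" ]; then sleep "$interval"; fi
        i=$((i + 1))
    done
}
```

For example, `watch_crl_store 10 300` takes ten samples five minutes apart. A count that rises from one sample to the next is consistent with the growth described in the note.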
Environment
VMware vCenter Server 6.7.x
VMware vCenter Server Appliance 6.5.x
VMware vCenter Server 7.0.x
Cause
This issue is caused by one or more corrupt CRL files in /etc/ssl/certs. To verify that you have corrupt entries, complete the following steps.
SSH to the vCenter Server Appliance.
Navigate to /etc/ssl/certs and run the following command to return the "Authority Key Identifier" for all CRLs. If the command reports a failure for a file, that file may be corrupt.
# for i in `grep -l "BEGIN X509 CRL" *`;do openssl crl -inform PEM -text -noout -in $i | grep -A 1 " Authority Key Identifier";done
Run the following command to check for corruption in CA certificates. It should return the "Subject Key Identifier" for each certificate. If the command reports a failure for a file, that file may be corrupt.
# for i in `grep -l "BEGIN CERTIFICATE" *`;do openssl x509 -in $i -noout -text | grep -A 1 "Subject Key Identifier";done
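The two loops above print identifiers for the files that parse, so a corrupt file only shows up as an openssl error mixed into the output. The following sketch instead prints only the names of files that openssl cannot load. It assumes that a non-zero openssl exit status is a reasonable indicator of a corrupt file. The function name find_unparseable_pem is hypothetical.

```shell
# List PEM files in a directory that openssl fails to parse.
# $1 = directory to scan (default /etc/ssl/certs).
# Assumption: a non-zero openssl exit status marks the file as suspect.
find_unparseable_pem() {
    local dir="${1:-/etc/ssl/certs}" f
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        if grep -q "BEGIN X509 CRL" "$f"; then
            openssl crl -inform PEM -noout -in "$f" >/dev/null 2>&1 \
                || echo "CRL  $f"
        elif grep -q "BEGIN CERTIFICATE" "$f"; then
            openssl x509 -inform PEM -noout -in "$f" >/dev/null 2>&1 \
                || echo "CERT $f"
        fi
    done
}
```

Each output line names one suspect file and whether it was treated as a CRL or a certificate. No output means every PEM file in the directory was loaded successfully. Review any listed file manually before deleting it.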
To resolve this issue, delete any corrupt files in /etc/ssl/certs and remove all entries from the CRL store so that VMDIR pushes fresh certificates down to VECS. This in turn allows the VAPI service to start successfully.
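Removing the entries from the CRL store can be scripted with vecs-cli. The following is a sketch only, not a verified procedure: it assumes that `vecs-cli entry list` prints one line per entry of the form "Alias : <alias>", and it prints what it would delete unless called with --apply. The function name purge_crl_store and the VECS_CLI variable are hypothetical. Take a backup or snapshot of the appliance before deleting anything.

```shell
# Path to vecs-cli, as used in this article. Override for testing.
VECS_CLI="${VECS_CLI:-/usr/lib/vmware-vmafd/bin/vecs-cli}"

# Delete every entry in the TRUSTED_ROOT_CRLS store.
# Without arguments this is a dry run; pass --apply to actually delete.
# Assumption: "entry list" prints lines of the form "Alias : <alias>".
purge_crl_store() {
    local mode="${1:-dry-run}" entry
    "$VECS_CLI" entry list --store TRUSTED_ROOT_CRLS \
        | awk '/^Alias/ { print $3 }' \
        | while read -r entry; do
            if [ "$mode" = "--apply" ]; then
                # </dev/null keeps the command from consuming the alias list.
                "$VECS_CLI" entry delete --store TRUSTED_ROOT_CRLS \
                    --alias "$entry" -y </dev/null
            else
                echo "would delete: $entry"
            fi
        done
}
```

Run `purge_crl_store` first and check the listed aliases, then run `purge_crl_store --apply` once the list looks correct.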