vmware-vapi-endpoint fails to start or crashes after upgrading to vCenter Server 6.5 Update 2

Article ID: 342879

Updated On: 03-31-2025

Products

VMware vCenter Server

Issue/Introduction

Symptoms:

  • After upgrading to vCenter Server 6.5 Update 2, the vmware-vapi-endpoint fails to start or crashes.
  • In the endpoint.log file, you see entries similar to:
# less /var/log/vmware/vapi/endpoint/endpoint.log

Caused by: javax.net.ssl.SSLHandshakeException: com.vmware.vim.vmomi.client.exception.VlsiCertificateException: Server certificate chain is not trusted and thumbprint verification is not configured
        at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
        at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1964)
        at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:328)
        at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:322)
        at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1614)
        at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216)
        at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1052)
        at sun.security.ssl.Handshaker.process_record(Handshaker.java:987)
        at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1072)
        at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385)
        at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1413)
        at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1397)
        at com.vmware.vim.vmomi.client.http.impl.ThumbprintTrustManager$HostnameVerifier.verify(ThumbprintTrustManager.java:420)
        ... 45 more
Caused by: com.vmware.vim.vmomi.client.exception.VlsiCertificateException: Server certificate chain is not trusted and thumbprint verification is not configured
        at com.vmware.vim.vmomi.client.http.impl.ThumbprintTrustManager.checkServerTrusted(ThumbprintTrustManager.java:206)
        at sun.security.ssl.AbstractTrustManagerWrapper.checkServerTrusted(SSLContextImpl.java:985)
        at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1596)
        ... 53 more
Caused by: com.vmware.identity.vecs.VecsGenericException: Native platform error [code: 87][Enum of entries on store 'TRUSTED_ROOT_CRLS' failed. [Server: __localhost__, User: __localuser__]]
        at com.vmware.identity.vecs.VecsEntryEnumeration.BAIL_ON_ERROR(VecsEntryEnumeration.java:108)
        at com.vmware.identity.vecs.VecsEntryEnumeration.enumEntries(VecsEntryEnumeration.java:139)
        at com.vmware.identity.vecs.VecsEntryEnumeration.fetchMoreEntries(VecsEntryEnumeration.java:122)
        at com.vmware.identity.vecs.VecsEntryEnumeration.<init>(VecsEntryEnumeration.java:36)
        at com.vmware.identity.vecs.VMwareEndpointCertificateStore.enumerateEntries(VMwareEndpointCertificateStore.java:369)
        at com.vmware.provider.VecsCertStoreEngine.engineGetCRLs(VecsCertStoreEngine.java:77)
        at java.security.cert.CertStore.getCRLs(CertStore.java:181)
        at com.vmware.vim.vmomi.client.http.impl.ThumbprintTrustManager.checkForRevocation(ThumbprintTrustManager.java:246)
        at com.vmware.vim.vmomi.client.http.impl.ThumbprintTrustManager.checkServerTrusted(ThumbprintTrustManager.java:158)
        ... 55 more
YYYY-MM-DDTHH:MM:SS.685+02:00 | INFO  | state-manager1            | HealthStatusCollectorImpl      | HEALTH ORANGE Failed to retrieve SSO settings from component manager.
YYYY-MM-DDTHH:MM:SS.685+02:00 | ERROR | state-manager1            | DefaultStateManager            | Could not initialize endpoint runtime state.
com.vmware.vapi.endpoint.config.ConfigurationException: Failed to retrieve SSO settings.
        at com.vmware.vapi.endpoint.cis.SsoSettingsBuilder.buildInitial(SsoSettingsBuilder.java:63)
        at com.vmware.vapi.state.impl.DefaultStateManager.build(DefaultStateManager.java:354)
        at com.vmware.vapi.state.impl.DefaultStateManager$1.doInitialConfig(DefaultStateManager.java:168)
        at com.vmware.vapi.state.impl.DefaultStateManager$1.run(DefaultStateManager.java:151)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
  • In the vmafdd-syslog.log file, you see the same certificates being pushed to VECS repeatedly. You can verify this by running the following command (a count variant is shown after the excerpt):

# grep "Added cert to VECS DB" /var/log/vmware/vmafdd/vmafdd-syslog.log

YY-MM-DDTHH:MM:SS.090346+02:00 notice vmafdd  t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.085596+02:00 notice vmafdd  t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.089158+02:00 notice vmafdd  t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.041227+02:00 notice vmafdd  t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.084083+02:00 notice vmafdd  t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.095645+02:00 notice vmafdd  t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.087458+02:00 notice vmafdd  t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.318936+02:00 notice vmafdd  t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.091393+02:00 notice vmafdd  t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.108070+02:00 notice vmafdd  t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.082253+02:00 notice vmafdd  t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.098974+02:00 notice vmafdd  t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.084759+02:00 notice vmafdd  t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.086880+02:00 notice vmafdd  t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.092401+02:00 notice vmafdd  t@140015749916416: Added cert to VECS DB: ########################################
YY-MM-DDTHH:MM:SS.099424+02:00 notice vmafdd  t@140015749916416: Added cert to VECS DB: ########################################
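
To gauge how often the same certificates are being re-pushed, you can, for example, count the matching log entries:

# grep -c "Added cert to VECS DB" /var/log/vmware/vmafdd/vmafdd-syslog.log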

 

Note: The TRUSTED_ROOT_CRLS store fills with spurious entries, and the number of entries grows indefinitely over time. Run the following command to see the current size and to monitor its growth:

# /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store TRUSTED_ROOT_CRLS --text | wc -l
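
The command above counts every line of the store's text output, which is a rough proxy for the number of entries. To count only the entries themselves, a variant such as the following (illustrative; it assumes each entry is listed with an Alias field, as used by the script in the Resolution section) works as well:

# /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store TRUSTED_ROOT_CRLS --text | grep -c Alias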

Environment

VMware vCenter Server 6.7.x
VMware vCenter Server Appliance 6.5.x
VMware vCenter Server 7.0.x

Cause

This issue is caused by one or more corrupt CRL files in /etc/ssl/certs. To verify whether you have corrupt entries, complete the following steps.

  • SSH to the vCenter Server Appliance.
  • Navigate to /etc/ssl/certs and run the following command to return the "Authority Key Identifier" for each CRL. If the command reports an error, you may have a corrupt entry.
# for i in `grep -l "BEGIN X509 CRL" *`;do openssl crl -inform PEM -text -noout -in $i | grep -A 1 " Authority Key Identifier";done
 
Expected output example:
 
X509v3 Authority Key Identifier:
keyid:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##
X509v3 Authority Key Identifier:
keyid:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##
X509v3 Authority Key Identifier:
keyid:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##
X509v3 Authority Key Identifier:
keyid:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##

 

  • Run the following command to check for corruption relating to the CA certificates. It should return the "Subject Key Identifier" for each certificate. If the command reports an error, you may have a corrupt entry; a sketch for pinpointing the offending files follows the expected output below.
# for i in `grep -l "BEGIN CERTIFICATE" *`;do openssl x509 -in $i -noout -text | grep -A 1 "Subject Key Identifier";done

Expected output example:
 
X509v3 Subject Key Identifier:
##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##
X509v3 Subject Key Identifier:
##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##
X509v3 Subject Key Identifier:
##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##
X509v3 Subject Key Identifier:
##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##
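
If either loop reports an error, it does not print which file failed. A minimal variation of the same openssl checks (run from /etc/ssl/certs) can be used to pinpoint the offending files:

# for i in `grep -l "BEGIN X509 CRL" *`;do openssl crl -inform PEM -noout -in $i > /dev/null 2>&1 || echo "Corrupt CRL: $i";done
# for i in `grep -l "BEGIN CERTIFICATE" *`;do openssl x509 -noout -in $i > /dev/null 2>&1 || echo "Corrupt certificate: $i";done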

Resolution

To resolve this issue, delete any corrupt files in /etc/ssl/certs and remove all entries from the CRL store so that VMDIR pushes fresh certificates down to VECS. This in turn allows the vmware-vapi-endpoint service to start successfully.

Ensure you have a valid backup or snapshot of the vCenter Server before proceeding. See Overview of Backup and Restore options in vCenter Server 6.x (2149237).

A script has been written to automate this process. 
  1. SSH to the vCenter Server Appliance. 
  2. Change directory to /tmp.
  3. Create a file for the script. For example: # vi crl-fix.sh
  4. Copy and paste the following into the file:
#!/bin/bash
# Move the existing CA certificates and CRLs out of /etc/ssl/certs
cd /etc/ssl/certs
mkdir /tmp/pems
mkdir /tmp/OLD-CRLS-CAs
mv *.pem /tmp/pems && mv *.* /tmp/OLD-CRLS-CAs
# Delete every entry from the TRUSTED_ROOT_CRLS store in VECS
h=$(/usr/lib/vmware-vmafd/bin/vecs-cli entry list --store TRUSTED_ROOT_CRLS --text | grep Alias | cut -d : -f 2)
for hh in $h;do echo "Y" | /usr/lib/vmware-vmafd/bin/vecs-cli entry delete --store TRUSTED_ROOT_CRLS --alias $hh;done
# Restore the CA certificates and recreate their hash symlinks
mv /tmp/pems/* .
for l in *.pem;do ln -s $l ${l/pem/0};done
# Restart vmafdd so that fresh certificates and CRLs are pushed to VECS
service-control --stop vmafdd && service-control --start vmafdd
  5. Save the file and change the permissions before executing the script.
# chmod +x crl-fix.sh
  6. Run the script using the following syntax.
# ./crl-fix.sh
  7. Reboot the vCenter Server Appliance.
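
After the reboot, you can confirm that the issue is resolved by, for example, checking the service status and verifying that the CRL store is no longer growing:

# service-control --status vmware-vapi-endpoint
# /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store TRUSTED_ROOT_CRLS --text | grep -c Alias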