ERROR: Failed to start services in profile ALL. RC=2, stderr=Failed to start hvc, vpxd, vpxd-svcs services.

Article ID: 403611

Products

VMware vCenter Server

Issue/Introduction

  • Option 1 of vCert.py reports an expired Root or Intermediate CA certificate in the Trusted Roots store:
...
Checking CA certificates in VMDir [by CN(id)]
-----------------------------------------------------------------
<SUBJECT_KEY_IDENTIFIER_OF_GOOD_CA>                   VALID
<SUBJECT_KEY_IDENTIFIER_OF_GOOD_CA>                   VALID
<SUBJECT_KEY_IDENTIFIER_OF_GOOD_CA>                   VALID
<SUBJECT_KEY_IDENTIFIER_OF_EXPIRED_ROOT_CA>                EXPIRED


Checking CA certificates in VECS [by Alias]
-----------------------------------------------------------------
<ALIAS_OF_GOOD_CA>                   VALID
<ALIAS_OF_GOOD_CA>                   VALID
<ALIAS_OF_GOOD_CA>                   VALID
<ALIAS_OF_EXPIRED_CA>                 EXPIRED



Checking STS Server Configuration
-----------------------------------------------------------------
Checking VECS store configuration                              OK
Checking STS ConnectionStrings                          MISCONFIG

------------------------!!! Attention !!!------------------------
 - One or more CA certificates is missing the Subject Key ID extension
 - One or more certificates are expired
 - The STS ConnectionStrings value is not set properly for an SSO
   domain with multiple Domain Controllers
  • Stopping and starting all services fails because the hvc, vpxd, and vpxd-svcs services cannot start:
root@VCENTER [~]# service-control --start --all
Service-control failed. Error: Failed to start services in profile ALL. RC=2, stderr=Failed to start hvc, vpxd, vpxd-svcs services. Error: Service crashed while starting
  • The vSphere Web Client shows "no healthy upstream"
  • vmon.log shows that the vpxd-svcs service fails to start because a certificate has expired:
[timestamp] Wa(03)+ host-####   File "/usr/lib/python3.7/http/client.py", line 1327, in _send_request
[timestamp] Wa(03)+ host-####     self.endheaders(body, encode_chunked=encode_chunked)
[timestamp] Wa(03)+ host-####   File "/usr/lib/python3.7/http/client.py", line 1276, in endheaders
[timestamp] Wa(03)+ host-####     self._send_output(message_body, encode_chunked=encode_chunked)
[timestamp] Wa(03)+ host-####   File "/usr/lib/python3.7/http/client.py", line 1036, in _send_output
[timestamp] Wa(03)+ host-####     self.send(msg)
[timestamp] Wa(03)+ host-####   File "/usr/lib/python3.7/http/client.py", line 976, in send
[timestamp] Wa(03)+ host-####     self.connect()
[timestamp] Wa(03)+ host-####   File "/usr/lib/vmware/site-packages/pyVmomi/SoapAdapter.py", line 1153, in connect
[timestamp] Wa(03)+ host-####     six.moves.http_client.HTTPSConnection.connect(self)
[timestamp] Wa(03)+ host-####   File "/usr/lib/python3.7/http/client.py", line 1451, in connect
[timestamp] Wa(03)+ host-####     server_hostname=server_hostname)
[timestamp] Wa(03)+ host-####   File "/usr/lib/python3.7/ssl.py", line 423, in wrap_socket
[timestamp] Wa(03)+ host-####     session=session
[timestamp] Wa(03)+ host-####   File "/usr/lib/python3.7/ssl.py", line 899, in _create
[timestamp] Wa(03)+ host-####     self.do_handshake()
[timestamp] Wa(03)+ host-####   File "/usr/lib/python3.7/ssl.py", line 1168, in do_handshake
[timestamp] Wa(03)+ host-####     self._sslobj.do_handshake()
[timestamp] Wa(03)+ host-#### ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1076)
[timestamp] Wa(03)+ host-####
[timestamp] Er(02) host-#### <vpxd-svcs> Service pre-start command failed with exit code 1.
[timestamp] In(05) host-#### <cis-license> Running the API Health command as user cis-license
[timestamp] In(05) host-#### <cis-license-healthcmd> Constructed command: /usr/bin/python /usr/lib/vmware-vmon/vmonApiHealthCmd.py -n cis-license -u /ls/healthstatus -t 30
[timestamp] In(05) host-#### <cis-license> Service STARTED successfully.
[timestamp] In(05) host-#### <event-pub> Constructed command: /usr/bin/python /usr/lib/vmware-vmon/vmonEventPublisher.py --eventdata cis-license,HEALTHY,UNHEALTHY,0
[timestamp] In(05) host-#### <infraprofile-prestart> Constructed command: /usr/lib/vmware-infraprofile/config/pre_start.sh
  • vpxd-svcs.log records no new entries; the last entries indicate a service crash and end in a run of "^@^@^" characters:
[timestamp] [tomcat-exec-213 [] WARN  com.vmware.cis.authorization.impl.AclPrivilegeValidator  opId=########-####-####-####-############] User <user account> does not have privileges [System.Read] on object urn%3Avmomi%3AInventoryServiceTag%##########-####-####-####-############%3AGLOBAL
[timestamp] [tomcat-exec-70 [] WARN  com.vmware.cis.authorization.impl.AclPrivilegeValidator  opId=########-####-####-####-############] User <user account> does not have privileges [System.Read] on object urn%3Avmomi%3AInventoryServiceTag%##########-####-####-####-############%3AGLOBAL
[timestamp] [tomcat-exec-197 [] WARN  com.vmware.cis.authorization.impl.AclPrivilegeValidator  opId=########-####-####-####-############] User <user account> does not have privileges [System.Read] on object urn%3Avmomi%3AInventoryServiceTag%##########-####-####-####-############%3AGLOBAL
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^ ... 
(END)
  • The output of the command /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store TRUSTED_ROOTS --text confirms that the certificate under the alias in question has expired:
...
Alias : <ALIAS_OF_EXPIRED_CA> 
Entry type :    Trusted Cert
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            AA:BB:CC:DD:EE:FF:GG:HH:II:JJ:KK:LL
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN=THE_EXPIRED_ROOT_CA
        Validity
            Not Before: [TIMESTAMP WHEN ROOT CA CERTIFICATE GOT ISSUED]
            Not After : [TIMESTAMP WHEN ROOT CA CERTIFICATE EXPIRED]
        Subject: CN=THE_EXPIRED_ROOT_CA
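
To compare expiry dates across the whole store at a glance, the listing can be reduced to one line per alias. This is a minimal sketch, assuming the --text output keeps the "Alias :" and "Not After :" labels shown above; list_alias_expiry is a hypothetical helper name, not part of any VMware tool.

```shell
# Read `vecs-cli entry list --text` output on stdin and print
# "<alias><TAB><Not After value>" for every certificate found.
# list_alias_expiry is a hypothetical helper name.
list_alias_expiry() {
  awk '
    # Remember the most recent alias, stripped of its label and padding.
    /^Alias[ \t]*:/ {
      alias = $0
      sub(/^Alias[ \t]*:[ \t]*/, "", alias)
      sub(/[ \t]+$/, "", alias)
    }
    # Each "Not After" line closes one certificate: emit alias and date.
    /Not After[ \t]*:/ {
      expiry = $0
      sub(/^.*Not After[ \t]*:[ \t]*/, "", expiry)
      sub(/[ \t]+$/, "", expiry)
      print alias "\t" expiry
    }
  '
}
```

It could be used as: /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store TRUSTED_ROOTS --text | list_alias_expiry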

Environment

7.x
8.x

Cause

vCenter validates the entire certificate chain. If any certificate in the Machine SSL certificate's chain (the root CA or an intermediate CA) has expired, the services cannot start normally.
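
One way to see which certificate in a chain has lapsed is to test each one with openssl. The sketch below assumes openssl is available on the appliance and that each certificate has been saved as its own PEM file (openssl x509 reads only the first certificate in a file); check_cert_expiry is a hypothetical helper name, not a VMware tool.

```shell
# Report, for each PEM certificate file passed as an argument, whether it
# is still within its validity period. Returns 1 if any file fails.
# check_cert_expiry is a hypothetical helper name.
check_cert_expiry() {
  local cert rc=0
  for cert in "$@"; do
    # -checkend 0 exits 0 only if the certificate has not yet expired;
    # an unreadable or missing file also yields a non-zero exit status.
    if openssl x509 -in "$cert" -noout -checkend 0 >/dev/null 2>&1; then
      echo "VALID: $cert"
    else
      echo "EXPIRED or unreadable: $cert"
      rc=1
    fi
  done
  return "$rc"
}
```

For example, check_cert_expiry /tmp/oldRoot.cer would report on a certificate previously exported with vecs-cli entry getcert.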

Resolution

Replace the expired CA certificate using the following steps:

  1. Ensure that a recent backup and snapshots of the vCenter are taken before proceeding. Refer to the "Additional Information" section below for best practices.
  2. Upload the new Root CA certificate in Base64 format to the vCenter (the commands in step 3 assume it is saved as /root/newRoot.cer).
  3. Run the following commands to unpublish the old certificate and import the new one:
    /usr/lib/vmware-vmafd/bin/vecs-cli entry getcert --store TRUSTED_ROOTS --alias <ALIAS_OF_ROOT_CA> --output /tmp/oldRoot.cer
    /usr/lib/vmware-vmafd/bin/dir-cli trustedcert unpublish --cert /tmp/oldRoot.cer
    /usr/lib/vmware-vmafd/bin/vecs-cli entry delete --store TRUSTED_ROOTS --alias <ALIAS_OF_ROOT_CA>
    /usr/lib/vmware-vmafd/bin/dir-cli trustedcert publish --cert /root/newRoot.cer
    /usr/lib/vmware-vmafd/bin/vecs-cli force-refresh
  4. Restart all services:
    service-control --stop --all
    service-control --start --all

    NOTE: If the vCenter is in Enhanced Linked Mode (ELM), the new root CA certificate is replicated across the ELM nodes. Restart services on all other vCenters to confirm that they are in sync and that their services start normally with the new root CA certificate.
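
The commands in step 3 can be wrapped in a small function so the sequence can be previewed before anything is changed. This is only a sketch: run, replace_root_ca, and the DRY_RUN variable are hypothetical names introduced here, and the function simply chains the same five commands, stopping at the first failure. The tools may still prompt interactively (for example, for SSO administrator credentials) when run for real.

```shell
VMAFD_BIN=/usr/lib/vmware-vmafd/bin

# Print the command when DRY_RUN=1; otherwise execute it.
# (The dry-run echo does not preserve shell quoting.)
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: replace_root_ca <alias> <new_cert> [old_cert_out]
# Exports the old CA, unpublishes and deletes it, publishes the new CA,
# then forces a VECS refresh, in the order given in step 3.
replace_root_ca() {
  local alias="$1" new_cert="$2" old_cert="${3:-/tmp/oldRoot.cer}"
  if [ -z "$alias" ] || [ -z "$new_cert" ]; then
    echo "usage: replace_root_ca <alias> <new_cert> [old_cert_out]" >&2
    return 2
  fi
  run "$VMAFD_BIN/vecs-cli" entry getcert --store TRUSTED_ROOTS --alias "$alias" --output "$old_cert" &&
  run "$VMAFD_BIN/dir-cli" trustedcert unpublish --cert "$old_cert" &&
  run "$VMAFD_BIN/vecs-cli" entry delete --store TRUSTED_ROOTS --alias "$alias" &&
  run "$VMAFD_BIN/dir-cli" trustedcert publish --cert "$new_cert" &&
  run "$VMAFD_BIN/vecs-cli" force-refresh
}
```

Running DRY_RUN=1 replace_root_ca "<ALIAS_OF_ROOT_CA>" /root/newRoot.cer prints the five commands without executing them; dropping DRY_RUN runs them. Services still need to be restarted afterwards as in step 4.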

Additional Information