vCenter Server services fail to start after the Machine SSL certificate's issuing CA expires

Article ID: 442815

Updated On:

Products

VMware vCenter Server

Issue/Introduction

  • The core vCenter Server service (vpxd) does not start, and services that depend on it remain stopped.
  • You open the vCenter Server URL and the page shows:
    no healthy upstream
    
  • A vCenter Server upgrade fails at Stage 2 with an error similar to:
    Failed to connect to source vCenter
    
  • When you check service status, the core service and its dependents are stopped while the identity and certificate services are running:
    service-control --status --all
    Stopped:
     vmware-vpxd vmware-vpxd-svcs vmware-vapi-endpoint vmware-content-library ...
    Running:
     vmafdd vmcad vmdird vmware-stsd lookupsvc vmware-vpostgres vmware-envoy ...
    
  • In /var/log/vmware/vpxd/vpxd.log, you see the service start and then shut down, with errors similar to:
    error vpxd[#####] [sub=SsoWrapper] [AcquireToken] AcquireToken exception: InvalidCredentialsException(Authentication failed: Invalid credentials)
    warning vpxd[#####] [sub=IO.Connection] Failed to SSL handshake; ... certificate verify failed
    error vpxd[#####] [sub=Default] Failed to start VMware VirtualCenter. Shutting down
    
  • The vCenter service certificate check reports an expired CA certificate in the TRUSTED_ROOTS store, similar to:
    warning vpxd[#####] [opID=CheckCertificateExpiry] Certificate [Subject: CN=<enterprise-CA-name>] from store TRUSTED_ROOTS will expire on YYYY-MM-DD
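
The log and certificate store symptoms above can be collected in one pass from an SSH session. The following is a minimal sketch, not an official procedure: the function names find_cert_symptoms and list_trusted_roots are hypothetical, the grep patterns are copied from the sample messages above, and the vecs-cli path is assumed to be the default location on the appliance.

```shell
#!/bin/sh
# Hypothetical helper: print vpxd.log lines that match the symptoms in this article.
# The patterns come from the sample log lines above; adjust them if your messages differ.
find_cert_symptoms() {
    # $1: path to a vpxd.log file
    grep -E 'certificate verify failed|CheckCertificateExpiry|Failed to start VMware VirtualCenter' "$1"
}

# Hypothetical helper: show alias, subject, and validity dates of every TRUSTED_ROOTS entry.
# Assumes vecs-cli is installed at its default path on the appliance.
list_trusted_roots() {
    /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store TRUSTED_ROOTS --text |
        grep -E 'Alias|Subject:|Not Before|Not After'
}

# Run only where the log and the tool exist (that is, on the vCenter Server appliance).
if [ -r /var/log/vmware/vpxd/vpxd.log ]; then
    find_cert_symptoms /var/log/vmware/vpxd/vpxd.log | tail -n 20
fi
if [ -x /usr/lib/vmware-vmafd/bin/vecs-cli ]; then
    list_trusted_roots
fi
```

Compare the "Not After" date of each CA entry with the current date. An external CA entry whose date has passed is consistent with the cause described below.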
    

Additional symptoms reported:

  • Upgrading vCenter Server from 7.0 to 8.0 keeps failing on Stage 2.
  • The vCenter web interface is unreachable and shows a proxy error.

Environment

  • VMware vCenter Server 7.0
  • VMware vCenter Server 8.0
  • vCenter Server using a custom, externally signed Machine SSL certificate

Cause

A certificate can never be trusted for longer than the certificate authority (CA) certificate that signs it. In a hybrid certificate configuration, the Machine SSL certificate that vCenter Server presents on port 443 is signed by an external CA, while the solution-user certificates remain signed by the built-in VMware Certificate Authority (VMCA).

When the external CA certificate in the VMware Endpoint Certificate Store (VECS) TRUSTED_ROOTS store expires, the Machine SSL certificate is no longer trusted, even if the Machine SSL certificate itself has not reached its own expiration date. vCenter Server services connect to each other and to the local reverse proxy on port 443 using this certificate. Once the chain is no longer trusted, the core vCenter service (vpxd) cannot complete the connections it requires at startup and shuts down. Every service that depends on the core service then remains stopped, the reverse proxy returns "no healthy upstream," and a vCenter Server upgrade cannot connect to the source vCenter.

This commonly happens after the Machine SSL certificate is renewed while the original CA root certificate is re-used. The CA root certificate has its own expiration date. Re-using a root that is near the end of its life means the renewed Machine SSL certificate loses trust when that root expires, regardless of the validity period stamped on the Machine SSL certificate itself.
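
The rule described above is that a chain stays trusted only until the earliest expiration date among its certificates. The following is a minimal sketch of that comparison, assuming openssl and GNU date are available; the function names chain_valid_until and cert_end_epoch are hypothetical, and machine_ssl.pem and root_ca.pem are placeholder file names for certificates you have exported yourself.

```shell
#!/bin/sh
# Hypothetical helper: print the smallest of the epoch timestamps given as arguments.
# A chain is trusted only until its earliest notAfter date.
chain_valid_until() {
    min=""
    for t in "$@"; do
        if [ -z "$min" ] || [ "$t" -lt "$min" ]; then
            min="$t"
        fi
    done
    printf '%s\n' "$min"
}

# Hypothetical helper: notAfter of a PEM certificate as epoch seconds.
# Assumes openssl and GNU date.
cert_end_epoch() {
    end="$(openssl x509 -noout -enddate -in "$1" | cut -d= -f2)"
    date -u -d "$end" +%s
}

# Example with placeholder file names; runs only if both files exist.
if [ -r machine_ssl.pem ] && [ -r root_ca.pem ]; then
    leaf_end="$(cert_end_epoch machine_ssl.pem)"
    root_end="$(cert_end_epoch root_ca.pem)"
    until_epoch="$(chain_valid_until "$leaf_end" "$root_end")"
    echo "Chain trusted until: $(date -u -d "@$until_epoch")"
fi
```

If the root expires before the Machine SSL certificate, the printed date is the root's expiration date, which is the point at which services would stop trusting the Machine SSL certificate.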

Resolution

Take offline snapshots of all vCenter Server nodes in the single sign-on domain before making any change.

Part 1 - Restore service with a VMware CA-signed certificate

  1. Open an SSH session to the vCenter Server as root. Start the Certificate Manager tool:
    /usr/lib/vmware-vmca/bin/certificate-manager
    
  2. Select option 3, "Replace Machine SSL certificate with VMCA Certificate," and provide the single sign-on administrator credentials (for example, administrator@vsphere.local).
  3. Confirm recovery. Run service-control --status --all and confirm critical services such as vmware-vpxd and vapi-endpoint are running, then open the vCenter Server URL and confirm the login page loads.
    • If services still do not start, run certificate-manager again and select option 8, "Reset all Certificates."
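
The service check in step 3 can be scripted. The following is a minimal sketch, assuming the service-control output has the "Stopped:" and "Running:" layout shown in the Issue/Introduction section; the function name all_running is hypothetical, and the service names are taken from that sample output, so adjust them to match what your appliance prints.

```shell
#!/bin/sh
# Hypothetical helper: read `service-control --status --all` output on stdin and
# succeed only if every service named in the arguments is listed under "Running:".
all_running() {
    running="$(awk '
        /^Running:/   { in_run = 1; next }
        /^[A-Za-z]+:/ { in_run = 0; next }
        in_run        { for (i = 1; i <= NF; i++) print $i }
    ')"
    for svc in "$@"; do
        printf '%s\n' "$running" | grep -qxF -- "$svc" || {
            echo "not running: $svc" >&2
            return 1
        }
    done
}

# Run only where service-control exists (that is, on the vCenter Server appliance).
if command -v service-control >/dev/null 2>&1; then
    service-control --status --all | all_running vmware-vpxd vmware-vapi-endpoint \
        && echo "core services are running"
fi
```

The function exits with a non-zero status and names the first missing service, which makes it usable in a simple retry loop while services are still starting.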

Part 2 - Return to your external CA certificate

  1. Obtain from your CA team or vendor a renewed, currently valid root CA certificate (and an intermediate certificate if your PKI is two-tier). The new Machine SSL certificate must be signed under that renewed root, not under the expired one.
  2. Download and run the vCert tool (see Additional Information). From the main menu select 3 (Manage certificates), then 1 (Machine SSL certificate), then the custom CA-signed option, and generate a private key and certificate signing request (CSR). Submit the CSR to your CA and request a Machine SSL certificate signed under the renewed root.
  3. Publish the renewed root (and intermediate) certificate to the trusted store: main menu 3 (Manage certificates), then 3 (CA certificates in VMware Directory), then publish.
  4. Install the new Machine SSL certificate: main menu 3 (Manage certificates), then 1 (Machine SSL certificate), then import the signed certificate and key. Provide the full chain file if prompted.
  5. Update the trust anchors and restart services: main menu 4 (Manage SSL trust anchors), then 2 (Update SSL Trust Anchors); then main menu 8 (Restart services), then 1 (Restart all VMware services).
  6. Verify with main menu 1 (Check current certificate status) and confirm there are no expired certificates and no trust problems.
  7. Retry the vCenter Server upgrade if this issue was blocking an upgrade.
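
Before importing the signed certificate in step 4, it can help to confirm outside the appliance tooling that the files from your CA form a valid chain. The following is a minimal sketch using openssl, not part of the vCert workflow; the function names count_pem_certs and verify_chain are hypothetical, and ca_chain.pem and machine_ssl.cer are placeholder file names.

```shell
#!/bin/sh
# Hypothetical helper: count the PEM certificates contained in a file.
count_pem_certs() {
    grep -c -- '-----BEGIN CERTIFICATE-----' "$1"
}

# Hypothetical helper: verify a Machine SSL certificate against a CA bundle
# (root plus any intermediates). openssl reports an error if a certificate in
# the chain has expired or if the signatures do not chain to the bundle.
verify_chain() {
    # $1: CA bundle (PEM), $2: Machine SSL certificate (PEM)
    openssl verify -CAfile "$1" "$2"
}

# Example with placeholder file names; runs only if both files exist.
if [ -r ca_chain.pem ] && [ -r machine_ssl.cer ]; then
    echo "CA certificates in bundle: $(count_pem_certs ca_chain.pem)"
    verify_chain ca_chain.pem machine_ssl.cer
fi
```

A two-tier PKI would normally produce a bundle count of 2 (intermediate and root). A verification error that mentions an expired certificate suggests the certificate was issued under the old root.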

If you also need to remove the old expired root certificates:

Once the renewed root is in place, the expired root certificates can remain without causing harm, so removal is optional. To remove them, follow the steps in the article "Verify and remove CA Certificates from the TRUSTED_ROOTS store in VECS." The expired certificates are also published in VMware Directory. Unpublish them there first; otherwise they are copied back into the TRUSTED_ROOTS store automatically.

Additional Information