vCenter Server /storage/log partition full due to vsphere-ui Java heap dumps after Certificate Replacement

Article ID: 435696

Products

VMware vCenter Server 8.0

Issue/Introduction

  • The vCenter Server Appliance (VCSA) /storage/log partition becomes 100% full in a short period.
  • The vsphere-ui service is in a crash-restart loop.
  • Users receive 503 Service Unavailable or No Healthy Upstream errors when accessing the vSphere Client.
  • Large Java heap dump files (.hprof) or core dumps accumulate under /var/log/vmware/vsphere-ui/.
  • The issue typically occurs immediately following a Custom Machine SSL Certificate replacement.
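
The first symptom can be confirmed from a shell on the appliance. The following is a minimal sketch rather than part of the documented procedure; the helper name `partition_use_pct` is an illustrative assumption, not a VMware tool.

```shell
#!/bin/bash
# Print the usage percentage (without the % sign) of the filesystem that holds a path.
# partition_use_pct is a hypothetical helper name.
partition_use_pct() {
    # df -P forces the one-line-per-filesystem POSIX format; column 5 is the capacity.
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
```

On an affected appliance, `partition_use_pct /storage/log` would be expected to print a value at or near 100.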

Environment

vCenter Server 8.x

Cause

The vsphere-ui service hangs while validating an incomplete or circular trust chain in the VMware Endpoint Certificate Store (VECS) TRUSTED_ROOTS store. This often occurs when an Intermediate Certificate Authority (CA) certificate was added without its Root CA certificate, or when duplicate alias entries cause the Java KeyStore to loop during validation.

The log file /var/log/vmware/vsphere-ui/logs/vsphere_client_virgo.log may contain the following entry: TrustManagerFactory initialization took too long

This hang holds objects in memory, eventually triggering an OutOfMemoryError and a subsequent heap dump, which exhausts the disk space on the /storage/log partition.
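
One way to look for an incomplete chain is to check whether every issuer named in the store also appears as a subject. The sketch below is illustrative only: `find_missing_issuers` is a hypothetical helper, and it assumes the certificate text on standard input contains openssl-style `Subject:` and `Issuer:` lines.

```shell
#!/bin/bash
# Read certificate text on stdin and print every Issuer that never appears as a Subject.
# Assumption: each certificate is printed with "Subject:" and "Issuer:" lines.
# find_missing_issuers is a hypothetical helper name.
find_missing_issuers() {
    awk '
        # Strip the label and surrounding whitespace, leaving only the distinguished name.
        function dn(line) {
            sub(/^[ \t]*(Subject|Issuer):[ \t]*/, "", line)
            sub(/[ \t\r]+$/, "", line)
            return line
        }
        /^[ \t]*Subject:/ { subjects[dn($0)] = 1 }
        /^[ \t]*Issuer:/  { issuers[dn($0)] = 1 }
        END {
            # A self-signed root names itself as issuer, so it is never reported.
            for (i in issuers)
                if (!(i in subjects))
                    print "missing issuer certificate: " i
        }
    ' | sort
}
```

On the appliance, the input could come from `/usr/lib/vmware-vmafd/bin/vecs-cli entry list --store TRUSTED_ROOTS --text`, assuming that output includes the Subject and Issuer lines in this form. An intermediate CA whose root is absent would then show up as a missing issuer.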

Resolution

Note: Create a fresh backup or an offline snapshot (taken in the powered-off state) of the vCenter Server Appliance before performing the steps below. If the affected vCenter Server Appliance is part of an Enhanced Linked Mode (ELM) replication group, create backups or offline snapshots of all of its replication partners as well. When restoring an ELM vCenter, all members of the ELM replication group must be restored too; otherwise there will be inconsistencies in the VMDirectory LDAP database.

To resolve this issue, perform the following steps to clear the logs and reset the certificate trust chain:

  1. Free Disk Space
    SSH into the vCenter Server Appliance as root and navigate to the vsphere-ui log directory:
    cd /var/log/vmware/vsphere-ui/


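    Delete the heap dump and core dump files. Review these commands before running them; they permanently remove every matching file under the current directory:
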
    find . -name "*.hprof" -delete
    find . -name "core.*" -delete

    Note: This will immediately free space to allow services to start.
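
Before deleting anything, it can help to see which files the two find commands would match and how much space they occupy. This is a minimal sketch under that assumption; `preview_dumps` and `dump_total_kib` are hypothetical helper names, and nothing here deletes files.

```shell
#!/bin/bash
# List heap dumps and core dumps under a directory with their sizes in KiB, largest first.
# Nothing is deleted. preview_dumps is a hypothetical helper name.
preview_dumps() {
    find "$1" -type f \( -name '*.hprof' -o -name 'core.*' \) -exec du -k {} + | sort -rn
}

# Sum the sizes reported by preview_dumps; prints 0 when no files match.
dump_total_kib() {
    preview_dumps "$1" | awk '{ total += $1 } END { print total + 0 }'
}
```

For example, `preview_dumps /var/log/vmware/vsphere-ui/` lists the candidates, and `dump_total_kib /var/log/vmware/vsphere-ui/` gives a rough figure for the space that deleting them would reclaim.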

  2. Verify the Root Cause
    Check the Virgo logs for the specific initialization hang:
    grep "TrustManagerFactory initialization took too long" /var/log/vmware/vsphere-ui/logs/vsphere_client_virgo.log
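
To see how often the hang was logged rather than just whether it appears, the matches can be counted. A small sketch, with `count_tmf_hangs` as a hypothetical helper name:

```shell
#!/bin/bash
# Print how many lines of a log file contain the TrustManagerFactory hang message.
# count_tmf_hangs is a hypothetical helper name.
count_tmf_hangs() {
    # -c prints the number of matching lines; -F treats the pattern as a fixed string.
    # grep exits non-zero when nothing matches, so "|| true" keeps the printed 0 usable.
    grep -cF "TrustManagerFactory initialization took too long" "$1" || true
}
```

Running `count_tmf_hangs /var/log/vmware/vsphere-ui/logs/vsphere_client_virgo.log` prints 0 when the message is absent, in which case the cause described in this article may not apply.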

  3. Regenerate Certificates
    Use the VMware Certificate Manager utility or the vCert script to reset the certificates to VMCA-signed defaults. This process rebuilds the TRUSTED_ROOTS store correctly.

    Using Certificate Manager:

    • Run /usr/lib/vmware-vmca/bin/certificate-manager
    • Select the option labeled "Replace Machine SSL certificate with VMCA Certificate" (Option 3 in the standard certificate-manager menu).

  4. Restart Services
    After the certificates are regenerated, restart all vCenter services:

    This command stops and then starts every vCenter service, which interrupts management access until the services are running again. Review it carefully before running.

    service-control --stop --all && service-control --start --all
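
After the restart, it is worth confirming that vsphere-ui stays up. The sketch below parses a service status listing from standard input. It assumes the listing uses section headers such as `Running:` and `Stopped:` followed by space-separated service names, which is an assumption about the output layout; `service_state` is a hypothetical helper name.

```shell
#!/bin/bash
# Read a status listing on stdin and print the section (for example "running" or
# "stopped") under which the named service appears, or "unknown" if it is not listed.
# Assumption: headers look like "Running:" and are followed by service names.
# service_state is a hypothetical helper name.
service_state() {
    awk -v svc="$1" '
        BEGIN { state = "unknown" }
        # A line holding only a word and a colon starts a new section.
        /^[A-Za-z]+:[ \t]*$/ { state = tolower($1); sub(/:$/, "", state); next }
        {
            for (i = 1; i <= NF; i++)
                if ($i == svc) { print state; found = 1; exit }
        }
        END { if (!found) print "unknown" }
    '
}
```

Under those assumptions, `service-control --status --all | service_state vsphere-ui` should print `running` once the service has recovered. Checking it again after a few minutes, together with the usage of /storage/log, helps confirm that the crash-restart loop has stopped.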