After vSAN Deployment /storage/log Disk Space full Issues caused by Infraprofile Java Heap Dump Files in vCenter Server

Article ID: 437655

Products

VMware vCenter Server

Issue/Introduction

The vCenter Server /storage/log partition fills up rapidly due to the accumulation of large Java heap dump files (java_pid*.hprof) located in the /storage/log/vmware/vmware-sps/ directory.

The VMware vSphere Profile-Driven Storage Service (SPS) intermittently crashes. The following errors are observed in /var/log/vmware/vmware-sps/sps.log:

yyyy-mm-ddThh:mm:ss.302Z [pool-14-thread-3] ERROR ... - Ignoring error: java.lang.OutOfMemoryError: Java heap space
yyyy-mm-ddThh:mm:ss.634Z [pool-4-thread-5] INFO ... [VLSI] Active thread count is: 10, Core Pool size is: 10, Queue size: 0, Time spent waiting in queue: 1742 millis | ThreadPool Starvation Alert
yyyy-mm-ddThh:mm:ss.142Z [main] INFO ... - Updating sps.properties file with Map{spbm.vlsi.threadpool.corePoolSize.auto=10} properties for vc size type SMALL

These symptoms typically appear after vSAN is deployed or the storage footprint grows significantly, either of which increases the volume of vSphere APIs for Storage Awareness (VASA) queries that SPS must handle.
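As a rough way to confirm both symptoms from an SSH session, the sketch below counts the heap dump files and the OutOfMemoryError entries. The function names report_heap_dumps and count_oom_errors are illustrative helpers, not VMware tooling; the default paths are the ones quoted in this article, and only standard utilities (du, cut, grep) are used.

```shell
#!/bin/bash
# Sketch only: helper names are illustrative, not part of any VMware tooling.

# Count java_pid*.hprof files in a directory and sum their on-disk size in KB.
report_heap_dumps() {
    local dump_dir="${1:-/storage/log/vmware/vmware-sps}"
    local count=0 total_kb=0 size_kb f
    for f in "$dump_dir"/java_pid*.hprof; do
        [ -e "$f" ] || continue          # the glob matched nothing
        size_kb=$(du -k "$f" | cut -f1)  # disk usage of this dump in KB
        total_kb=$((total_kb + size_kb))
        count=$((count + 1))
    done
    echo "heap_dumps=$count total_kb=$total_kb"
}

# Count lines in an SPS log file that mention java.lang.OutOfMemoryError.
count_oom_errors() {
    local log_file="${1:-/var/log/vmware/vmware-sps/sps.log}"
    grep -c 'java.lang.OutOfMemoryError' "$log_file"
}
```

Called without arguments on the appliance, both functions fall back to the default paths. Running df -h /storage/log alongside them shows how full the partition is.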

Environment

VMware vCenter Server 8.0.x

Cause

The vCenter Server Appliance deployment size is recorded as "Small" in the internal database. Even if the virtual machine's CPU and memory have since been manually increased, the SPS service still sizes its JVM heap and thread pool from the original deployment size. The heavy VASA query workload introduced by vSAN exceeds this allocation, and the SPS service crashes with an Out of Memory (OOM) error.
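One way to see which size type SPS applied is to read it back from sps.log. The sketch below assumes the log wording matches the "for vc size type SMALL" sample shown earlier in this article; the wording may differ between builds, and sps_size_type is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch only: assumes the "for vc size type <SIZE>" wording from the sample log line.

# Print the size type from the most recent matching line in an SPS log file.
sps_size_type() {
    local log_file="${1:-/var/log/vmware/vmware-sps/sps.log}"
    grep -o 'vc size type [A-Za-z_-][A-Za-z_-]*' "$log_file" \
        | tail -n 1 \
        | awk '{print $NF}'   # last field is the size, e.g. SMALL
}
```

If this prints SMALL on an appliance whose virtual machine was resized afterwards, that is consistent with the cause described above. Empty output only means no matching line was found in that log file.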

Resolution

Immediate Mitigation:

  1. Establish an SSH session to the vCenter Server Appliance using root credentials.

  2. Verify the current memory allocation for the SPS service: cloudvm-ram-size -l | grep vmware-sps

  3. Manually increase the memory limit (e.g., to 4000 MB) by executing: cloudvm-ram-size -C 4000 vmware-sps

  4. Restart the SPS service to apply the new memory limit: service-control --restart vmware-sps

  5. Reclaim disk space by deleting the existing OOM heap dumps: rm /storage/log/vmware/vmware-sps/java_pid*.hprof
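The mitigation steps above can be collected into a small script. This is a sketch, not a supported tool: the DRY_RUN, SPS_MEM_MB and DUMP_DIR variables and the run and mitigate_sps_oom helpers are illustrative names, and 4000 MB is only the example value from step 3. The script defaults to dry-run mode, which prints each command instead of executing it, because restarting the service and deleting files should be reviewed first.

```shell
#!/bin/bash
# Sketch only: variable and function names are illustrative.
DRY_RUN="${DRY_RUN:-1}"                                # 1 = print commands, 0 = execute
SPS_MEM_MB="${SPS_MEM_MB:-4000}"                       # example value from step 3
DUMP_DIR="${DUMP_DIR:-/storage/log/vmware/vmware-sps}"

# Execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "[dry-run] $*"
    else
        "$@"
    fi
}

mitigate_sps_oom() {
    local f
    # Step 2: show the current SPS memory allocation.
    if [ "$DRY_RUN" = "1" ]; then
        echo "[dry-run] cloudvm-ram-size -l | grep vmware-sps"
    else
        cloudvm-ram-size -l | grep vmware-sps
    fi
    # Step 3: raise the memory limit for the SPS service.
    run cloudvm-ram-size -C "$SPS_MEM_MB" vmware-sps
    # Step 4: restart SPS so the new limit takes effect.
    run service-control --restart vmware-sps
    # Step 5: delete existing heap dumps to reclaim disk space.
    for f in "$DUMP_DIR"/java_pid*.hprof; do
        [ -e "$f" ] || continue   # the glob matched nothing
        run rm -- "$f"
    done
}
```

After sourcing the script, calling mitigate_sps_oom prints the planned commands. Calling it as DRY_RUN=0 mitigate_sps_oom in a root bash session on the appliance executes them.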

Permanent Resolution:

In vSphere 8.x, there is no API to change the deployment size recorded in the database. During lifecycle operations (such as upgrades), manual memory adjustments revert to the limits of the original deployment size.

To permanently update the internal deployment size:

  1. Perform a File-Based Backup of the vCenter Server Appliance.

  2. Deploy a new vCenter Server Appliance using the File-Based Restore workflow.

  3. During the target sizing phase of the restore UI, explicitly select the appropriate larger deployment size (e.g., Large or X-Large).

Additional Information

Storage Policy Service (SPS) shows unhealthy after adding a large number of ESXi hosts with active IOFilters

Increasing the disk space for the vCenter Server Appliance