vCenter Server UI experiences intermittent crashes and generates a core.vpxd-worker dump file

Article ID: 432644

Products

VMware vCenter Server

Issue/Introduction

vCenter Server services crash intermittently, requiring a service restart or full reboot to temporarily restore access.

The log partition (/storage/log) is completely full because of excessive growth of the envoy-system-proxy logs.

The following errors are observed in /var/log/vmware/vpxd.log:

Cannot compress file: /var/log/vmware/envoy-system-proxy/envoy-access-21.log: No space left on device
Caused by: java.io.IOException: No space left on device

Environment

VMware vCenter Server 8.x

VMware vCenter Server 7.x

Cause

This issue is caused by the /storage/log partition reaching 100% capacity. This typically occurs when an external source (e.g., third-party monitoring, backup software, or API scripts) initiates an excessive number of connections to the vCenter Server.

The envoy-system-proxy component logs these connections, filling the partition faster than log rotation can compress and purge the data. When the partition fills up, log rotation fails, and core services crash because write operations to the disk cannot be completed.
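One way to confirm that the envoy-system-proxy logs are the main consumer is to rank the log subdirectories by size. The following is a minimal sketch, not part of the original procedure; the helper name largest_dirs and its default arguments are illustrative assumptions.

```shell
#!/bin/bash
# Sketch: list the subdirectories of a log root that use the most space.
# largest_dirs is a hypothetical helper, not a vCenter tool.

# Usage: largest_dirs [log-root] [how-many]
largest_dirs() {
    local root="${1:-/var/log/vmware}"
    local limit="${2:-5}"
    # du -sk prints one total (in KiB) per subdirectory; sort largest first.
    du -sk "$root"/*/ 2>/dev/null | sort -nr | head -n "$limit"
}
```

Running largest_dirs with no arguments on the appliance would be expected to show envoy-system-proxy near the top if this article's cause applies.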

Resolution

To resolve this issue, space must be freed on the affected partition, and the root cause of the excessive logging must be identified.

The following steps must be executed to clean up files and reclaim disk space:

  1. Establish an SSH connection to the vCenter Server Appliance as root.

  2. Enable access to the Bash shell, then launch it:
    • shell.set --enabled true
    • shell

  3. Verify that the /storage/log partition is at 100% utilization:
    • df -h | grep /storage/log

  4. Navigate to the envoy-system-proxy log directory:

    • cd /var/log/vmware/envoy-system-proxy/

  5. Remove older, compressed log archives to immediately free up space:

    • rm envoy-access-*.gz

  6. Start affected vCenter services to recover from the crashed state:
    • service-control --start --all
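Before running the rm command in step 5, it can help to preview what would be deleted and how full the partition is. The sketch below assumes the log directory from step 4; the LOG_DIR variable and the helper names usage_percent and list_archives are illustrative, not part of any vCenter tooling.

```shell
#!/bin/bash
# Sketch: report partition usage and preview the archives step 5 would delete.
# LOG_DIR and both helper names are illustrative assumptions.

LOG_DIR="${LOG_DIR:-/var/log/vmware/envoy-system-proxy}"

# Print the usage percentage (digits only) of the filesystem holding path $1.
usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# List compressed Envoy access log archives in directory $1,
# with their size in KiB, largest first. Nothing is deleted.
list_archives() {
    find "$1" -maxdepth 1 -type f -name 'envoy-access-*.gz' -exec du -k {} + | sort -nr
}

if [ -d "$LOG_DIR" ]; then
    echo "Usage of filesystem holding $LOG_DIR: $(usage_percent "$LOG_DIR")%"
    list_archives "$LOG_DIR"
fi
```

Only files matching envoy-access-*.gz are listed, which is the same pattern the rm command uses, so the active envoy-access.log is left out of the preview.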

If the partition continues to fill rapidly, the IP address generating the excessive API traffic must be identified.

  1. Execute the following command to count the number of connections per IP address in the current Envoy access log:

    • awk '{print $18}' /var/log/vmware/envoy-system-proxy/envoy-access.log | cut -d: -f1 | sort | uniq -c | sort -nr | head -n 10

  2. Identify the systems that own the top IP addresses and engage their administrators or vendors to reduce the polling frequency.

  3. If the traffic volume is expected and legitimate, consider expanding the /storage/log virtual disk on the vCenter Server Appliance.
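The pipeline in step 1 can be wrapped in a small function so it is easy to run against a rotated log or a different field position. This is a sketch under assumptions: top_clients is a hypothetical helper, and the default field number 18 is taken from the command above. If the access log format on a given build differs, the field number would need adjusting.

```shell
#!/bin/bash
# Sketch: count connections per client IP in an Envoy access log.
# top_clients is a hypothetical helper; field 18 mirrors the command in step 1.

# Usage: top_clients <log-file> [field-number] [how-many]
top_clients() {
    local log_file="$1"
    local field="${2:-18}"
    local limit="${3:-10}"
    # Take the chosen field, strip the ":port" suffix, then count and rank.
    awk -v f="$field" 'NF >= f { print $f }' "$log_file" \
        | cut -d: -f1 | sort | uniq -c | sort -nr | head -n "$limit"
}
```

For example, top_clients /var/log/vmware/envoy-system-proxy/envoy-access.log would reproduce the output of step 1, with each line showing a connection count followed by an IP address.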

Additional Information

Related article: The /root partition on the vCenter Server is full