Reconnecting ESXi host to vCenter fails with "Timed out waiting for vpxa to start"

Article ID: 313389

Products

VMware vSphere ESXi

Issue/Introduction

  • An ESXi host shows a status of Not Responding in the vCenter Server inventory.
  • Attempting to manually reconnect the host in vCenter fails with the following error:

A general system error occurred. Timed out waiting for vpxa to start.

  • When checking the status of the vpxa service or attempting to restart it manually via SSH, the following errors may be observed:
Unable to terminate watchdog: No running watchdog process for vpxa
sh: can't kill pid [PID]: No such process
  • In /var/run/log/vpxa.log on the ESXi host, you will see entries similar to:
error vpxa[] [Originator@6876 sub=IO.Http opID=WFU-#####] User agent failed to send request; (null), N7Vmacore15SystemExceptionE (Too many open files)
error vpxa[2100848] [Originator@6876 sub=vpxaInvtHostCnx opID=WEU-#####] Can't connect to hostd. Shutting down ...
info vpxa[2100848] [Originator@6876 sub=Default opID=WFU-#####] [Vpxa] Shutting down now
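To check a copy of vpxa.log for the entries listed above, a small scanner along the following lines may help. This is a minimal sketch, not a Broadcom tool: the function names, the signature labels, and the choice to require all three entries are assumptions made for illustration, and the patterns are taken only from the log excerpts in this article.

```python
import re

# Patterns derived from the log excerpts in this article.
# The labels on the left are hypothetical names chosen for this sketch.
SIGNATURES = {
    "fd_exhaustion": re.compile(r"Too many open files"),
    "hostd_unreachable": re.compile(r"Can't connect to hostd"),
    "vpxa_shutdown": re.compile(r"\[Vpxa\] Shutting down now"),
}


def scan_vpxa_log(lines):
    """Return {label: [1-based line numbers]} for lines matching each pattern."""
    hits = {name: [] for name in SIGNATURES}
    for number, line in enumerate(lines, start=1):
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits[name].append(number)
    return hits


def matches_known_issue(hits):
    """Assumption: treat the log as matching only if all three entries appear."""
    return all(hits[name] for name in SIGNATURES)
```

Any iterable of lines works as input, for example an open file handle on a copy of /var/run/log/vpxa.log. A match only suggests that the symptoms are similar to those described here; it does not confirm the root cause.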

Environment

VMware vSphere ESXi 8.0.x
VMware vSphere ESXi 7.x

Cause

This issue occurs when the vpxa (vCenter Agent) service on the ESXi host enters an unresponsive or hung state.
The trigger is a race condition between directory open and rename operations: when files are moved while a directory is being accessed, the two operations acquire locks in an inconsistent order.
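As a generic illustration of why lock ordering matters, the sketch below runs two threads that each need the same pair of locks. It is not ESXi code: the lock names, the helper `acquire_in_order`, and the two worker functions are hypothetical. If each thread took the locks in its own order, each could end up holding one lock while waiting for the other; acquiring them in a single global order avoids that.

```python
import threading
from contextlib import contextmanager


@contextmanager
def acquire_in_order(*locks):
    """Acquire distinct locks in one global order (by id), release in reverse."""
    ordered = sorted(locks, key=id)
    for lock in ordered:
        lock.acquire()
    try:
        yield
    finally:
        for lock in reversed(ordered):
            lock.release()


def run_demo(iterations=1000):
    dir_lock = threading.Lock()   # hypothetical lock guarding a directory
    file_lock = threading.Lock()  # hypothetical lock guarding a file entry
    counts = {"open": 0, "rename": 0}

    def open_directory():
        for _ in range(iterations):
            with acquire_in_order(dir_lock, file_lock):
                counts["open"] += 1

    def rename_file():
        for _ in range(iterations):
            # Arguments are passed in the opposite order on purpose;
            # the helper still acquires the locks in the same global order.
            with acquire_in_order(file_lock, dir_lock):
                counts["rename"] += 1

    threads = [
        threading.Thread(target=open_directory),
        threading.Thread(target=rename_file),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counts
```

Calling `run_demo()` completes with both counters equal to `iterations`, because neither thread can hold one lock while waiting on the other in the opposite order.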

Resolution

This issue is resolved in VMware ESXi 7.0 Update 3q and VMware ESXi 8.0 Update 2b.
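When auditing several hosts, the comparison against the fixed releases can be sketched as below. This is a minimal example under stated assumptions: the input format (a version string such as "7.0" plus an update label such as "Update 3q"), the function names, and the rule that update letters sort in release order are all assumptions for illustration, not a documented VMware interface.

```python
import re

# Fixed releases named in this article: version -> (update number, patch letter).
FIXED_RELEASES = {"7.0": (3, "q"), "8.0": (2, "b")}


def parse_update(label):
    """Parse labels such as 'Update 3q', 'U3q' or 'Update 3' into (3, 'q') or (3, '')."""
    match = re.search(r"(\d+)([a-z]?)\s*$", label.strip(), re.IGNORECASE)
    if not match:
        raise ValueError(f"unrecognized update label: {label!r}")
    return int(match.group(1)), match.group(2).lower()


def has_fix(version, update_label):
    """Return True if the release is at or after the fixed release for its version.

    Assumption: within one update number, patch letters sort in release order.
    """
    if version not in FIXED_RELEASES:
        raise ValueError(f"no fixed release listed for version {version!r}")
    return parse_update(update_label) >= FIXED_RELEASES[version]
```

For example, `has_fix("7.0", "Update 3p")` is false and `has_fix("7.0", "Update 3q")` is true. Always confirm the installed release against the official release notes before relying on a label comparison.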

Workaround:

  • When the `vpxa` service is completely unresponsive and cannot be restarted via the command line, the only way to recover the management connection is to reboot the ESXi host.

Note: If similar symptoms occur on a fixed or later version of ESXi, contact Broadcom Support.