vCenter Server services fail to start due to high CPU Ready (%RDY) or performance issues on the underlying ESXi host

Products

VMware vCenter Server

Issue/Introduction

After a vCenter Server Appliance (VCSA) is started or rebooted, the vCenter management interfaces remain inaccessible and critical vCenter services fail to start.

Symptoms:

The vSphere Client (HTML5 UI) and VAMI (port 5480) are completely unresponsive.
The vCenter appliance memory utilization may report at or near 100% capacity.
Restarting vCenter services or rebooting the entire appliance multiple times does not resolve the issue; services consistently time out during initialization.
The command line interface (CLI) of the vCenter appliance or the hosting ESXi host responds extremely slowly (e.g., commands hang or take an unusually long time to return output).
Running esxtop on the ESXi host that runs the vCenter VM shows a high CPU Ready value (%RDY) for the vCenter VM, typically above the critical threshold of 10%.
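
The %RDY symptom can also be checked without the interactive esxtop screen. The following is a minimal sketch, assuming esxtop batch mode (esxtop -b) writes a CSV whose header names each counter in the form \\<host>\Group Cpu(<id>:<vm name>)\% Ready. The function name rdy_from_batch and the file path /tmp/esxtop.csv are hypothetical and chosen only for illustration.

```shell
# Hypothetical helper: print the "% Ready" samples of every CPU group whose
# name contains the given string, read from an esxtop batch-mode CSV file.
# Assumes counter columns are named like
#   \\<host>\Group Cpu(<id>:<vm name>)\% Ready
# Usage: rdy_from_batch <csv_file> <vm_name_fragment>
rdy_from_batch() {
    awk -F'","' -v name="$2" '
        NR == 1 {
            want = tolower(name)
            for (i = 1; i <= NF; i++) {
                h = tolower($i)
                sub(/"$/, "", h)               # strip the quote left on the last field
                p = index(h, "group cpu(")     # ignore the host name before the group
                if (p && index(substr(h, p), want) && h ~ /% ready$/)
                    cols[i] = 1
            }
            next
        }
        {
            for (i = 1; i <= NF; i++)
                if (i in cols) { v = $i; gsub(/"/, "", v); print v }
        }
    ' "$1"
}

# On the ESXi host (not run here): capture 3 samples, 5 seconds apart.
#   esxtop -b -d 5 -n 3 > /tmp/esxtop.csv
#   rdy_from_batch /tmp/esxtop.csv vcenter
```

Each printed line is one sample of the group-level % Ready counter for a matching VM. If the header layout on a given ESXi build differs from the assumed form, the match conditions need to be adjusted.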

Environment

Product: VMware vCenter Server Appliance 7.x, 8.x, and 9.x
Product: VMware ESXi 7.x, 8.x, and 9.x
Setup: vCenter managing a multi-host cluster with shared storage.

Cause

The root cause is severe CPU resource contention or performance degradation on the specific physical ESXi host where the vCenter Server Appliance is running.

When an ESXi host is heavily overcommitted or hits internal scheduling bottlenecks, the CPU Ready (%RDY) percentage increases. A %RDY value above 10% indicates that the vCenter VM spends a significant share of its time ready to run but waiting to be scheduled onto physical CPU cores. Because the vCenter boot sequence is highly time-sensitive, this scheduling delay causes internal service components (such as vpxd, vmafdd, or scg) to exceed their hard-coded startup timeout thresholds, resulting in a cascade of failed services.
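
As a rough illustration of the 10% threshold, the sketch below assumes the %RDY figure was read from the VM's group row in esxtop, where ready time is summed across the VM's worlds and therefore grows with the vCPU count. Dividing by the number of vCPUs gives an approximate per-vCPU value to compare against the threshold. The function name check_rdy and the sample values are hypothetical.

```shell
# Hypothetical helper: normalize a group-level %RDY reading to an approximate
# per-vCPU value and compare it with the 10% threshold used in this article.
# Usage: check_rdy <group_%RDY_from_esxtop> <number_of_vCPUs>
check_rdy() {
    awk -v rdy="$1" -v vcpus="$2" 'BEGIN {
        if (vcpus <= 0) { print "invalid vCPU count"; exit 1 }
        per = rdy / vcpus
        status = (per > 10) ? "CONTENTION" : "OK"
        printf "%.2f%% per vCPU: %s\n", per, status
    }'
}

check_rdy 96.0 8    # prints: 12.00% per vCPU: CONTENTION
```

The division is only an approximation, because the group row also includes a few non-vCPU helper worlds. It is meant as a quick sanity check, not a precise measurement.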

Resolution

To resolve this issue, the vCenter Server Appliance must be migrated to a healthy, responsive ESXi host within the cluster to alleviate the CPU scheduling bottlenecks.

Step 1: Shut Down the vCenter VM

If the vCenter VM is still partially responsive via SSH, shut down the guest OS cleanly:

```
shutdown -h now
```

If the VM or host is entirely unresponsive, proceed directly to Step 2.

Step 2: Unregister vCenter from the Degraded Host

Log in to the CLI of the degraded ESXi host via SSH as root.

Find the vCenter Virtual Machine ID (Vmid):

```
vim-cmd vmsvc/getallvms | grep -i vcenter
```

Unregister the VM using its Vmid:
```
vim-cmd vmsvc/unregister <Vmid>
```

Note: If the host is completely frozen or inaccessible, skip this step.
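
The Vmid lookup in this step can be scripted. The following is a minimal sketch, assuming the vim-cmd vmsvc/getallvms output lists the Vmid in the first column and the VM name in the second. The function name vmid_by_name is hypothetical.

```shell
# Hypothetical helper: read `vim-cmd vmsvc/getallvms` output on stdin and
# print the Vmid (first column) of rows whose text contains the given name.
# Rows are kept only if the first column is numeric, which skips the header
# line and most wrapped annotation text.
vmid_by_name() {
    awk -v name="$1" '
        $1 ~ /^[0-9]+$/ && index(tolower($0), tolower(name)) { print $1 }
    '
}

# On the ESXi host (not run here): print the candidates first, and confirm
# that exactly one Vmid is returned before unregistering anything.
#   vim-cmd vmsvc/getallvms | vmid_by_name vcenter
```

Because the match is a plain substring test on the whole row, a name fragment that also appears in another VM's path or annotation returns more than one Vmid. Review the output before passing it to vim-cmd vmsvc/unregister.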

Step 3: Register vCenter on a Healthy Host

Log in to the CLI of a healthy ESXi host in the same cluster via SSH as root.
Locate the configuration (.vmx) file of the vCenter VM on your shared storage:
```
find /vmfs/volumes/ -name "*.vmx" | grep -i vcenter
```

Register the vCenter VM onto the healthy host:

```
vim-cmd solo/registervm /vmfs/volumes/<Datastore_Name>/<vCenter_Folder>/<vCenter_Name>.vmx
```
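
A mistyped datastore path is an easy error to make at this point. The sketch below is an optional pre-check, not part of the documented procedure: it confirms that the path ends in .vmx and that the file is readable from the healthy host, then prints the registration command rather than running it. The function name registervm_cmd is hypothetical.

```shell
# Hypothetical pre-check: validate a .vmx path, then print (not run) the
# registration command so it can be reviewed before execution.
# Usage: registervm_cmd <full_path_to_vmx>
registervm_cmd() {
    case "$1" in
        *.vmx) ;;
        *) echo "not a .vmx path: $1" >&2; return 1 ;;
    esac
    [ -r "$1" ] || { echo "file not readable: $1" >&2; return 1; }
    printf 'vim-cmd solo/registervm "%s"\n' "$1"
}

# On the healthy ESXi host (not run here):
#   registervm_cmd "/vmfs/volumes/<Datastore_Name>/<vCenter_Folder>/<vCenter_Name>.vmx"
```

If the file is not readable from the healthy host, the datastore is probably not mounted there, and registration would fail for the same reason.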

Step 4: Power On and Verify

Retrieve the new Vmid assigned to the vCenter on this healthy host:
```
vim-cmd vmsvc/getallvms | grep -i vcenter
```
Power on the VM:
```
vim-cmd vmsvc/power.on <New_Vmid>
```
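If power-on stalls on an "I moved it" or "I copied it" question (for example, because Step 2 was skipped), list the pending question to obtain its ID and the number of each choice, then answer with the number shown for "I moved it". Run these commands from a second SSH session if the power.on command has not returned:
```
vim-cmd vmsvc/message <New_Vmid>
vim-cmd vmsvc/message <New_Vmid> <Question_ID> <Choice_Number>
```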

Allow 10–15 minutes for the guest operating system to boot and for all vCenter services to initialize. On the healthy host, %RDY for the vCenter VM should stay at normal levels (under 1%).
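
Once the appliance is reachable over SSH, service state can be reviewed with service-control --status --all. The following is a minimal sketch, assuming that command groups service names under headings such as "Running:" and "Stopped:". The function name stopped_services is hypothetical.

```shell
# Hypothetical helper: read `service-control --status --all` output on stdin
# and print the services listed under the "Stopped:" heading, one per line.
# Assumes the output groups service names under headings that end in ":".
stopped_services() {
    awk '
        /:[ \t]*$/ { section = $1; next }   # heading line, e.g. "Running:"
        section == "Stopped:" { for (i = 1; i <= NF; i++) print $i }
    '
}

# On the vCenter appliance (not run here):
#   service-control --status --all | stopped_services
```

Some optional services may be stopped by design, so compare the result against the appliance's usual baseline rather than expecting an empty list.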