You attempt to power on a virtual machine configured with more than 60 vCPUs on an ESXi 8.0 Update 3 or vSphere 9.0 host with NVMe memory tiering enabled. The VM fails to power on and the vmware.log file shows the following error:
KLMCall_RunVCPU terminated unexpectedly
The full error sequence in vmware.log appears as:
Msg_Post: Error
[msg.log.error.unrecoverable] VMware ESX unrecoverable error: (vcpu-XX)
KLMCall_RunVCPU terminated unexpectedly
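To confirm that a given vmware.log contains this signature, a short script can scan for it. The following is a minimal sketch, not a supported tool: the function name `find_failing_vcpus` is hypothetical, and it assumes the signature line appears within two lines of the "unrecoverable error" line, as in the sequence above.

```python
import re

# Signature line taken from the error sequence in this article.
SIGNATURE = "KLMCall_RunVCPU terminated unexpectedly"
# Matches the "(vcpu-XX)" suffix on the unrecoverable-error line.
VCPU_RE = re.compile(r"unrecoverable error: \(vcpu-(\d+)\)")


def find_failing_vcpus(log_text):
    """Return vCPU indices whose unrecoverable error is followed by the signature.

    Hypothetical helper: assumes the signature appears within the next
    two lines after the "(vcpu-XX)" line.
    """
    lines = log_text.splitlines()
    hits = []
    for i, line in enumerate(lines):
        match = VCPU_RE.search(line)
        if match and any(SIGNATURE in nxt for nxt in lines[i + 1:i + 3]):
            hits.append(int(match.group(1)))
    return hits
```

Pass the contents of vmware.log as a string. An empty list means the signature was not found in the form this sketch expects.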
The failure occurs at a consistent vCPU threshold. Power-on succeeds if you reduce the vCPU count below the threshold or disable memory tiering on the VM. As a result, you cannot use the full CPU capacity of the host for large-scale VM configurations.
Additional symptoms reported:
The affected VM has memory tiering enabled (sched.mem.enableTiering = "TRUE").
This is a known issue with overhead memory allocation in the NVMe Memory Tiering feature.
Each vCPU in a virtual machine requires overhead memory for its vCPU world initialization. The Mem.VMOverheadGrowthLimit advanced setting controls how much overhead memory the system can allocate for VM operations. When memory tiering is enabled, the default value of this setting (1) does not provide sufficient headroom for VMs with high vCPU counts.
When you attempt to power on a VM with more than 60 vCPUs and memory tiering is active, the system cannot allocate enough overhead memory to initialize all the vCPU worlds. The power-on process fails at the vCPU where the overhead memory limit is exceeded, resulting in the "KLMCall_RunVCPU terminated unexpectedly" error.
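The two conditions described above (more than 60 vCPUs and memory tiering enabled) can be checked against a VM's configuration text. The following is a rough sketch under stated assumptions: it reads the standard `numvcpus` key and the `sched.mem.enableTiering` key from VMX-style text, and the helper names `parse_vmx` and `is_affected` are hypothetical. It only detects tiering that is set explicitly on the VM. A VM that inherits tiering from the host without the key is not detected.

```python
import re

# From this article: failures are reported above 60 vCPUs.
VCPU_THRESHOLD = 60

# VMX entries look like: key = "value"
_ENTRY_RE = re.compile(r'^\s*([^#\s=]+)\s*=\s*"(.*)"\s*$')


def parse_vmx(text):
    """Parse VMX-style text into a dict with lower-cased keys."""
    entries = {}
    for line in text.splitlines():
        match = _ENTRY_RE.match(line)
        if match:
            entries[match.group(1).lower()] = match.group(2)
    return entries


def is_affected(vmx_text):
    """True if the VM explicitly enables tiering and exceeds the vCPU threshold."""
    cfg = parse_vmx(vmx_text)
    vcpus = int(cfg.get("numvcpus", "1"))
    # Assumption: only an explicit per-VM "TRUE" counts as tiering enabled.
    tiering = cfg.get("sched.mem.enabletiering", "").upper() == "TRUE"
    return tiering and vcpus > VCPU_THRESHOLD
```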
A fix is planned for a future release. Until then, use one of the following workarounds.
Workaround 1: Increase the overhead memory growth limit (recommended)
This workaround allows you to keep memory tiering enabled while supporting high-vCPU VMs. The default value is 1. A value of 3 is recommended. Higher values may be appropriate for very high vCPU counts.
In the host's advanced system settings, filter for VMOverheadGrowthLimit, select Mem.VMOverheadGrowthLimit, and change the value from 1 to 3.
No host reboot is required. You can now power on the VM with memory tiering enabled.
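The same change can probably be made from the ESXi command line with `esxcli system settings advanced set`. The sketch below only builds the command so it can be reviewed before anyone runs it. It assumes the option path `/Mem/VMOverheadGrowthLimit`, which follows the usual mapping of dotted advanced setting names to esxcli paths but is not confirmed by this article. The function name `growth_limit_command` is hypothetical, and the lower bound of 1 is a guard chosen for this sketch, not a documented range.

```python
import shlex


def growth_limit_command(value=3):
    """Build an esxcli argument list that sets Mem.VMOverheadGrowthLimit.

    Assumption: the advanced option path is /Mem/VMOverheadGrowthLimit.
    """
    if not isinstance(value, int) or value < 1:
        raise ValueError("value must be an integer >= 1")
    return [
        "esxcli", "system", "settings", "advanced", "set",
        "-o", "/Mem/VMOverheadGrowthLimit",  # assumed option path
        "-i", str(value),                    # integer value to apply
    ]


if __name__ == "__main__":
    # Print the command for review instead of executing it.
    print(shlex.join(growth_limit_command()))
```

Verify the printed command against the host's own option list (`esxcli system settings advanced list`) before running it.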
Workaround 2: Disable memory tiering on the VM
Set the following advanced configuration parameter on the VM:
sched.mem.enableTiering = "FALSE"
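If you manage VM configuration as text, the parameter change can be expressed as a small rewrite of the VMX content. This is a minimal sketch, assuming VMX-style `key = "value"` lines. The helper `set_vmx_option` is hypothetical, and any direct edit of a .vmx file should be made only while the VM is powered off and after taking a backup of the file.

```python
import re


def set_vmx_option(vmx_text, key, value):
    """Return vmx_text with key set to value.

    Replaces an existing entry (case-insensitive key match) or appends
    a new line if the key is not present.
    """
    pattern = re.compile(
        r"^[ \t]*" + re.escape(key) + r"[ \t]*=.*$",
        re.IGNORECASE | re.MULTILINE,
    )
    new_line = f'{key} = "{value}"'
    if pattern.search(vmx_text):
        # A callable replacement avoids backslash escapes being interpreted.
        return pattern.sub(lambda _: new_line, vmx_text)
    separator = "" if (not vmx_text or vmx_text.endswith("\n")) else "\n"
    return vmx_text + separator + new_line + "\n"
```

For example, calling `set_vmx_option(text, "sched.mem.enableTiering", "FALSE")` returns the configuration text with tiering disabled for that VM.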
Workaround 3: Disable memory tiering at the host level
Use this approach only if memory tiering is not required for any workloads on the host.
In the host's advanced system settings, filter for memoryTiering, select VMkernel.Boot.memoryTiering, and set the value to FALSE.