VMs crashing or failing to boot due to flaws in the NUMA scheduler
There is a flaw in the NUMA initial placement algorithm that can cause all VMs powered on within a short window to be placed on the same NUMA node. When a mix of low-latency and standard VMs is present, the scheduler may underestimate the CPU utilization of the low-latency VMs, so the node appears lightly loaded. This underestimation prevents the standard VMs from being migrated to other, less-congested NUMA nodes.
This behavior will be resolved in an upcoming vSphere 9.x release. The updated NUMA placement in 9.x will consider the demand of all vCPUs, explicitly treating the demand of low-latency VMs as 100%. This ensures heavily loaded nodes are correctly identified, allowing the NUMA scheduler to quickly steer new VMs away from saturated nodes.
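To see why counting low-latency demand at 100% changes the placement decision, here is a minimal sketch. All numbers, the scoring rule, and the function names are illustrative assumptions, not the actual scheduler logic:

```python
def node_load(vms, count_low_latency_as_full=False):
    """Total demand on a node in vCPU-percent (one fully busy vCPU = 100).

    Each VM is (measured_utilization_pct, vcpu_count, is_low_latency).
    With count_low_latency_as_full=True, low-latency VMs are charged at
    100% regardless of their measured utilization (the 9.x behavior).
    """
    total = 0
    for measured_pct, vcpus, low_latency in vms:
        pct = 100 if (low_latency and count_low_latency_as_full) else measured_pct
        total += pct * vcpus
    return total

# Toy node: two low-latency VMs whose utilization is underestimated at 10%,
# plus one standard VM measured at 50%.
vms = [(10, 4, True), (10, 4, True), (50, 2, False)]

# Old behavior: the node looks almost idle, so new VMs keep landing here.
print(node_load(vms))                                   # 180 (1.8 busy vCPUs)
# Fixed behavior: low-latency demand counted at 100% flags the node as busy.
print(node_load(vms, count_low_latency_as_full=True))   # 900 (9 busy vCPUs)
```

With the old accounting the node scores well under any plausible load threshold; with the fixed accounting the same node is correctly seen as saturated.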
Workaround
Set the Numa.InitialPlacementLoadThreshold advanced option to 100 on the affected ESXi hosts:
esxcfg-advcfg -s 100 /Numa/InitialPlacementLoadThreshold
Applying via Host Profiles
This configuration can be applied uniformly across multiple hosts by setting the Numa.InitialPlacementLoadThreshold option to 100 within a Host Profile.
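Where Host Profiles are not in use, the per-host command can also be scripted. A minimal sketch, assuming SSH access as root and hypothetical hostnames; the dry-run default only prints the commands rather than executing them:

```python
import subprocess

THRESHOLD_OPTION = "/Numa/InitialPlacementLoadThreshold"

def build_command(host):
    """SSH command that sets the threshold to 100 on one ESXi host."""
    return ["ssh", f"root@{host}", "esxcfg-advcfg", "-s", "100", THRESHOLD_OPTION]

def apply_to_hosts(hosts, dry_run=True):
    """Print (dry run) or execute the setting command for each host."""
    for host in hosts:
        cmd = build_command(host)
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)

# Hostnames are placeholders -- substitute your affected ESXi hosts.
apply_to_hosts(["esxi01.example.com", "esxi02.example.com"])
```

Pass dry_run=False only after reviewing the printed commands; the setting takes effect for VMs powered on after the change.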
Verification
To verify the threshold value before or after remediation, run the following command on the ESXi host:
esxcfg-advcfg -g /Numa/InitialPlacementLoadThreshold