Rightsizing virtual machines on ESXi 8.0: vCPU, memory, and CPU topology guidance

Article ID: 438023

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

This article provides reference guidance for sizing virtual machines that run on VMware ESXi 8.0.x. It covers vCPU count, memory allocation, NUMA awareness, the Cores per Socket setting, and the Automatic vTopology feature introduced in ESXi 8.0.

The intended audience is administrators who are planning new VM deployments or reviewing existing VM configurations for performance. The article focuses on the configuration choices that affect a VM's performance on a single host. It does not cover broader topics such as CPU reservations and limits, shares, DRS rules, or latency-sensitive workload tuning.

Environment

ESXi 8.0.x

Resolution

NUMA on ESXi 8.0

Modern multi-socket servers divide memory among the processor sockets. Each socket has fast access to a local subset of memory and slower access to memory attached to other sockets. This design is called Non-Uniform Memory Access (NUMA).

ESXi keeps each VM's vCPUs and memory within a single physical NUMA node when the VM is small enough to fit. This arrangement gives the lowest memory access latency. When a VM's vCPU count exceeds the cores in one node, ESXi treats it as a wide VM and exposes a vNUMA topology to the guest, so the guest scheduler can also make NUMA-aware decisions.

Per Performance Best Practices for VMware vSphere 8.0, ESXi NUMA scheduling is enabled by default on hosts that have at least four CPU cores and at least two cores per NUMA node. For full background, see Using NUMA Systems with ESXi in the vSphere Resource Management documentation.
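
As a rough illustration of the fit-versus-wide decision described above, the following minimal Python sketch compares a proposed vCPU count against the cores available in one NUMA node. The per-node core count is an illustrative placeholder; substitute the actual value for your host (visible on the host Summary page or in esxtop NUMA statistics).

# Minimal sketch of the fit-versus-wide decision described above.
# CORES_PER_NUMA_NODE is an illustrative placeholder; use your host's real value.

CORES_PER_NUMA_NODE = 16

def numa_placement(vcpus: int) -> str:
    if vcpus <= CORES_PER_NUMA_NODE:
        return "fits in one NUMA node; vCPUs and memory can stay local"
    return "wide VM; ESXi exposes a vNUMA topology to the guest"

for vcpus in (8, 16, 24):
    print(f"{vcpus} vCPUs: {numa_placement(vcpus)}")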

Automatic vTopology in ESXi 8.0

ESXi 8.0 introduced an enhanced virtual topology feature, often called Automatic vTopology. When the feature is active, ESXi automatically selects an optimal Cores per Socket value and an optimal virtual L3 cache size at VM power-on, based on the underlying physical hardware.

Requirements:

  • The VM must be running virtual hardware version 20 or later.
  • Cores per Socket must be set to "Assigned at power on", which is the default for new HW20 VMs. Internally this is represented as NumCoresPerSocket = 0.

Automatic vTopology also includes a new virtual motherboard layout that preserves vNUMA when CPU Hot Add is enabled. This is a change from earlier vSphere releases, where activating CPU Hot Add disabled vNUMA for the VM.

For details, see Virtual Topology in ESXi 8.0 on Broadcom Tech Docs and the VMware vSphere 8.0 Virtual Topology Performance Technical White Paper.
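
As one way to check these requirements across an inventory, the following pyVmomi sketch reads each VM's hardware version and configured Cores per Socket value. It is a minimal sketch rather than an official tool; the vCenter hostname and credentials are placeholders, and a powered-on VM may report the Cores per Socket value ESXi assigned rather than 0.

# Minimal pyVmomi sketch: report hardware version and Cores per Socket so VMs
# eligible for Automatic vTopology (HW20 or later, numCoresPerSocket = 0) can
# be spotted. Hostname and credentials are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="***", sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder,
                                               [vim.VirtualMachine], True)
for vm in view.view:
    if vm.config is None:
        continue
    hw = int(vm.config.version.split("-")[1])      # "vmx-20" -> 20
    cps = vm.config.hardware.numCoresPerSocket     # 0 = "Assigned at power on"
    print(f"{vm.name}: HW{hw}, coresPerSocket={cps}, "
          f"Automatic vTopology eligible: {hw >= 20 and cps == 0}")
view.Destroy()
Disconnect(si)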

vCPU sizing

Size the vCPU count to match real workload demand. Gather data over a representative period, typically 24 hours, using either the vSphere Client performance charts (see View Performance Charts) or esxtop/resxtop (see Performance Monitoring Utilities: resxtop and esxtop).

Metrics to collect, with general guidance on what to look for; a short sketch after this list shows one way to apply the thresholds:

  • Sustained guest CPU utilization. Aim to stay at or below 80% during normal operation, with headroom for spikes. The same 80% guideline appears for host-level PCPU utilization in Performance Best Practices for VMware vSphere 8.0. Sustained saturation at or near 100% indicates undersizing and typically shows up as application slowness even when the host has spare capacity.
  • CPU ready (%RDY in esxtop, "Ready" in vSphere Client charts). As a general industry guideline, below about 5% per vCPU is benign, 5% to 10% per vCPU warrants investigation, and above 10% per vCPU usually means noticeable performance impact. Sustained high ready time generally signals CPU contention at the host level rather than undersizing of the individual VM.
  • Co-Stop (%CSTP in esxtop). Applies only to SMP (multi-vCPU) VMs. Below about 3% per vCPU is normal. Sustained higher values typically indicate the VM has more vCPUs than it can effectively use, and reducing the vCPU count usually helps.
  • Configure neither more vCPUs than the workload can effectively use, nor fewer than it needs. Per Performance Best Practices for VMware vSphere 8.0, extra vCPUs add scheduling overhead and can reduce performance on heavily loaded hosts.
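
The thresholds above can be applied as a quick screening pass over collected samples. The sketch below is illustrative only; the sample values and exact cut-offs mirror the general guidance in this section and are not hard limits.

# Sketch: screen sampled CPU counters against the per-vCPU guidance above.
# Sample values are illustrative; collect real ones from esxtop or the charts.

def assess_cpu(util_pct: float, ready_pct_per_vcpu: float,
               costop_pct_per_vcpu: float) -> list[str]:
    findings = []
    if util_pct > 80:
        findings.append("sustained utilization above 80%: likely undersized")
    if ready_pct_per_vcpu > 10:
        findings.append("ready above 10% per vCPU: noticeable host CPU contention")
    elif ready_pct_per_vcpu >= 5:
        findings.append("ready 5-10% per vCPU: investigate host CPU contention")
    if costop_pct_per_vcpu > 3:
        findings.append("co-stop above ~3% per vCPU: likely more vCPUs than needed")
    return findings or ["within the general guidance thresholds"]

print(assess_cpu(util_pct=92, ready_pct_per_vcpu=6.5, costop_pct_per_vcpu=4.2))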

Memory sizing

Size memory to match actual guest demand and to support NUMA locality. Use the same monitoring tools as for vCPU (see View Performance Charts and Performance Monitoring Utilities: resxtop and esxtop).

Metrics to collect, with general guidance on what to look for; a similar screening sketch follows this list:

  • Active memory and consumed memory. These show how much memory the guest is actually using versus how much it has been allocated. A consistently low active-to-allocated ratio suggests oversizing.
  • Balloon (MCTLSZ and MCTLTGT in esxtop). Non-zero ballooning indicates host memory pressure has caused ESXi to reclaim memory from the guest. Sustained ballooning is a signal to revisit either VM memory allocation or host memory capacity.
  • Swap activity (SWCUR, SWR/s, SWW/s in esxtop, swap-in and swap-out rates in performance charts). Non-zero swap-in or swap-out at the host level usually indicates memory contention severe enough to impact VM performance. The goal is sustained zero swap activity.
  • Guest swap (in-guest paging). Use in-guest tools, since ESXi cannot see what the guest's own swap subsystem is doing. Active in-guest paging suggests the VM itself is undersized for its workload.
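
A similar screening pass works for the memory counters. In the sketch below, the 20% active-to-allocated cut-off is an assumption chosen for illustration, not a documented threshold; the other checks follow the guidance above.

# Sketch: flag host-level memory pressure signals from the counters above.
# The 20% active/allocated cut-off is an illustrative assumption.

def assess_memory(active_gb: float, allocated_gb: float, balloon_mb: float,
                  swap_in_per_s: float, swap_out_per_s: float) -> list[str]:
    findings = []
    if allocated_gb and active_gb / allocated_gb < 0.20:
        findings.append("active memory well below allocation: candidate for downsizing")
    if balloon_mb > 0:
        findings.append("ballooning active: host memory pressure, revisit sizing or capacity")
    if swap_in_per_s > 0 or swap_out_per_s > 0:
        findings.append("host swap activity: contention severe enough to affect performance")
    return findings or ["no host-level memory pressure signals"]

print(assess_memory(active_gb=6, allocated_gb=64, balloon_mb=0,
                    swap_in_per_s=0, swap_out_per_s=0))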

NUMA-related memory placement (see the sketch after this list):

  • When the VM's vCPU count fits in one physical NUMA node, keep total memory at or below what one node provides where possible. This keeps both vCPUs and memory local and gives the lowest memory access latency.
  • For a wide VM (described below), align memory size with the same number of NUMA nodes the vCPUs span.
  • For procedural steps to change the configured memory size, see Change the Memory Configuration on Broadcom Tech Docs.
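
The placement rules above reduce to simple arithmetic against the host's per-node memory. The sketch below uses illustrative placeholder values for the host layout; substitute the real per-node core and memory figures.

# Sketch: check whether a proposed memory size can stay local to the NUMA
# node(s) the vCPUs occupy. Host figures are illustrative placeholders.
import math

CORES_PER_NUMA_NODE = 16
MEM_PER_NUMA_NODE_GB = 256

def memory_stays_local(vcpus: int, mem_gb: int) -> bool:
    nodes_spanned = max(1, math.ceil(vcpus / CORES_PER_NUMA_NODE))
    return mem_gb <= nodes_spanned * MEM_PER_NUMA_NODE_GB

print(memory_stays_local(vcpus=8,  mem_gb=192))   # True: fits in one node
print(memory_stays_local(vcpus=8,  mem_gb=384))   # False: exceeds one node
print(memory_stays_local(vcpus=24, mem_gb=384))   # True: fits across two nodes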

Cores per Socket guidance

For VMs running virtual hardware version 20 or later, leave Cores per Socket at "Assigned at power on" in nearly all cases. ESXi will pick the optimal value automatically.

The default differs by hardware version:

  • HW20 and later: "Assigned at power on" (Automatic vTopology).
  • Pre-HW20: one core per socket.

The Cores per Socket setting moved in vSphere 8.0. If you cannot find it in the expected place in the vSphere Client, see Cores per Socket setting location has been changed in vSphere 8.0 (KB 407648).

Per Performance Best Practices for VMware vSphere 8.0, beginning with vSphere 6.5 the cores-per-socket value no longer drives vNUMA topology. It only affects how the guest OS sees the virtual processor layout, which can be relevant for software licensing.

When manual Cores per Socket is appropriate

Manual configuration is appropriate only in a few specific situations:

  • Software or guest OS licensing requires a specific socket count. Examples include certain Microsoft SQL Server editions that license per socket, and older Windows Server editions that limit the number of sockets the OS can use.
  • The VM is configured with around 32 vCPUs or more and the automatic default produces suboptimal guest behavior. KB 413111, The manual configuration for cores per socket for Virtual Machine might result in reduced performance, calls out this case directly.
  • You are pinning a benchmarked configuration or following a vendor-prescribed layout for a specific application.

If you do set Cores per Socket manually, follow the rules covered in Setting corespersocket can affect guest OS topologies (KB 340277). Pick a value that produces a guest topology that maps cleanly onto the host's physical NUMA nodes. Avoid values that force the guest to see odd-sized vNUMA nodes, or that present socket boundaries the underlying physical hardware does not support.
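
As a sanity check on a manually chosen value, the sketch below applies the constraints just described: the vCPU count should divide evenly into virtual sockets, and each virtual socket should fit inside, and pack evenly into, a physical NUMA node. The host figure is an illustrative placeholder, and the divisibility checks are one reasonable interpretation of the KB 340277 rules rather than an official validator.

# Sketch: sanity-check a manual Cores per Socket choice against the host
# layout, per the rules summarized above. Host figure is a placeholder.

CORES_PER_NUMA_NODE = 16

def check_cores_per_socket(vcpus: int, cores_per_socket: int) -> list[str]:
    issues = []
    if vcpus % cores_per_socket != 0:
        issues.append("vCPU count is not a multiple of cores per socket")
    if cores_per_socket > CORES_PER_NUMA_NODE:
        issues.append("a virtual socket is larger than a physical NUMA node")
    elif CORES_PER_NUMA_NODE % cores_per_socket != 0:
        issues.append("virtual sockets do not pack evenly into a NUMA node")
    if not issues:
        issues.append(f"guest sees {vcpus // cores_per_socket} socket(s) "
                      f"x {cores_per_socket} core(s): consistent with the host")
    return issues

print(check_cores_per_socket(vcpus=16, cores_per_socket=8))
print(check_cores_per_socket(vcpus=12, cores_per_socket=5))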

Wide VM guidance

A wide VM has more vCPUs than the cores available in one physical NUMA node. For these VMs (a sketch after the list illustrates the arithmetic):

  • Size the vCPU count so the VM splits evenly across the smallest possible number of NUMA nodes.
  • Avoid odd vCPU counts that cross NUMA boundaries.
  • Keep the VM's vCPU count at or below the host's total physical core count.
  • Align memory size with the same number of NUMA nodes the vCPUs span.
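
For a quick way to work out an even layout, the following sketch rounds a requested vCPU count to one that splits evenly across the fewest NUMA nodes and caps it at the host's physical core count. The host layout values are illustrative placeholders.

# Sketch: plan an even vCPU layout for a wide VM. Host figures are
# illustrative placeholders; use your host's actual NUMA layout.
import math

CORES_PER_NUMA_NODE = 16
NUMA_NODES = 2
TOTAL_HOST_CORES = CORES_PER_NUMA_NODE * NUMA_NODES

def wide_vm_plan(requested_vcpus: int) -> tuple[int, int]:
    nodes = min(math.ceil(requested_vcpus / CORES_PER_NUMA_NODE), NUMA_NODES)
    even_vcpus = nodes * math.ceil(requested_vcpus / nodes)   # even split per node
    even_vcpus = min(even_vcpus, TOTAL_HOST_CORES)            # stay within host cores
    return nodes, even_vcpus

for requested in (20, 27, 40):
    nodes, vcpus = wide_vm_plan(requested)
    print(f"requested {requested} -> {vcpus} vCPUs across {nodes} node(s), "
          f"{vcpus // nodes} per node")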

Automated rightsizing with VCF Operations and Aria Operations

Where customers have VCF Operations (or its predecessor, Aria Operations) deployed, the built-in rightsizing feature should be the first reference for sizing decisions. The Rightsize page reviews CPU, memory, storage, and other resource allocations across the entire VM fleet, surfaces oversized and undersized VMs based on actual workload demand, and provides forward-looking projections so recommendations account for expected growth rather than historical usage alone. Resize operations can be applied directly from the page, scheduled for an off-hours window, or excluded for VMs where automatic recommendations should not apply.

This approach is preferred over a manual review for customers who have these products available, because it scales across many VMs at once and uses the platform's built-in capacity projections. The manual process described in the next section remains valuable for one-off reviews and for environments where VCF Operations or Aria Operations is not deployed.

For details, see the rightsizing documentation for VCF Operations and Aria Operations on Broadcom Tech Docs.

Reviewing an existing VM for sizing changes

The following process is appropriate when reviewing a VM that is already deployed and showing signs of being undersized, oversized, or topologically misaligned, in environments where the automated rightsizing tool described above is not in use:

  1. Capture baseline performance data over a representative 24-hour period: CPU utilization, CPU ready, co-stop, memory active, balloon, and swap-in/out.
  2. Decide whether the VM is undersized, oversized, or has a misaligned topology, using the principles in the sections above.
  3. Verify the virtual hardware version. Right-click the VM, choose Compatibility, and confirm the compatibility level is ESXi 8.0 and later (HW20) or higher, which is required for Automatic vTopology. If it is lower, plan a hardware version upgrade after taking a snapshot or backup.
  4. Power off the VM. Changes to vCPU count, memory size, or CPU topology require the VM to be powered down. CPU Hot Add can stay enabled. In vSphere 8.0 with HW20, vNUMA is preserved when CPU Hot Add is active.
  5. Right-click the VM and choose Edit Settings.
  6. Adjust the CPU value to the right-sized vCPU count, leaving Cores per Socket at "Assigned at power on" unless one of the manual configuration cases above applies.
  7. Adjust the Memory value if memory is part of the change.
  8. Click OK to save settings, then power on the VM.
  9. Re-measure with the same counters from step 1 and verify the changes had the intended effect.
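
For environments that script this kind of change, the following pyVmomi sketch mirrors steps 4 through 8. It is a minimal, hypothetical example: the vCenter address, credentials, VM name, and target sizes are placeholders, and in production a guest OS shutdown is usually preferable to a hard power-off.

# Minimal pyVmomi sketch of steps 4-8: power off, apply right-sized vCPU and
# memory values, then power back on. All names and sizes are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="***", sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder,
                                               [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == "app-vm-01")   # placeholder VM name

if vm.runtime.powerState == vim.VirtualMachinePowerState.poweredOn:
    WaitForTask(vm.PowerOffVM_Task())          # step 4 (ShutdownGuest() is gentler)

spec = vim.vm.ConfigSpec(numCPUs=8,            # step 6: right-sized vCPU count
                         memoryMB=32 * 1024)   # step 7: right-sized memory
# Cores per Socket is left untouched, so "Assigned at power on" still applies.
WaitForTask(vm.ReconfigVM_Task(spec=spec))     # apply the change
WaitForTask(vm.PowerOnVM_Task())               # step 8: power back on, re-measure

view.Destroy()
Disconnect(si)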

Additional Information