VMware Response to RSBA Mitigation Performance Questions

Article ID: 330027

Products

VMware vSphere ESXi

Issue/Introduction

This article answers recurring questions and provides additional details about the performance impact of mitigating RSBA-related vulnerabilities.


Q: Why are some VMs slower after upgrading to Linux kernel 5.19?

A: For CPUs with Return Stack Buffer Alternative (RSBA), the Linux kernel maintainers, following advice from Intel, changed the default CPU vulnerability mitigation to Indirect Branch Restricted Speculation (IBRS). IBRS has a higher performance cost than the previous default CPU vulnerability mitigations.


Q: Did VMware's July patches to vSphere cause this performance loss?

A: No. VMware's July patches mitigated RSBA-related vulnerabilities at the host level (preventing malicious VMs from attacking vSphere or other VMs) with no measurable performance impact. VMware recommends installing the July patches. The patches did not change how VMware virtualizes guest CPU mitigations. That implementation was created in 2018 and optimized to minimize hypervisor overhead.


Q: Can I reconfigure my hosts or cluster to avoid performance regressions while running with Linux default CPU mitigations?

A: A VM on a newer host CPU (see list below) not enumerating RSBA can run with significantly reduced mitigation performance costs. The host must either:

  1. not be in an Enhanced vMotion Compatibility (EVC) cluster, or
  2. be in an EVC cluster of mode Cascade Lake or Ice Lake.
(All older Intel EVC modes will cause RSBA to be advertised to VMs, resulting in greater performance costs).


Q: Does VMware issue specific recommendations for in-guest mitigation of RSBA?

A: No. VMware enables customers to implement the security policy of their choice, including the in-guest CPU vulnerability mitigation configuration. This choice is up to the customer. Different mitigations have different performance and/or security properties. Consult the OS vendor for further details.
 

vSphere-supported CPUs not enumerating RSBA                           CPUID (FMS)
Intel Xeon Gold 6200/5200 (Cascade-Lake-SP/Refresh) Series            6.55.7
Intel Xeon Platinum 8200 (Cascade-Lake-SP) Series                     6.55.7
Intel Xeon Silver 4200, Bronze 3200 (Cascade-Lake-SP/Refresh) Series  6.55.7
Intel Xeon Gold 6300/5300 (Cooper-Lake-SP) Series                     6.55.B
Intel Xeon Platinum 8300 (Cooper-Lake-SP) Series                      6.55.B
Intel Atom C3000 Series                                               6.5F
Intel Xeon Gold 6300/5300 (Ice-Lake-SP) Series                        6.6A.6
Intel Xeon Platinum 8300 (Ice-Lake-SP) Series                         6.6A.6
Intel Xeon Silver 4300 (Ice-Lake-SP) Series                           6.6A.6
Intel Xeon E-2200 (8-core) Series                                     6.9E.D
Intel Xeon E-2300 Series                                              6.A7.1
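
To compare a CPU against the list above, its family, model, and stepping can be formatted as the hexadecimal FMS string used in the CPUID column. The following is a minimal sketch, not a VMware tool: the function names are hypothetical, the set of values is copied from the list above, and parsing `/proc/cpuinfo` assumes a Linux system that reports the CPU in question (values seen inside a VM may differ from those of the physical host).

```python
from typing import Tuple

# CPUID family.model[.stepping] values (hex) copied from the list above.
# An entry without a stepping ("6.5F") is treated as matching any stepping;
# that interpretation is an assumption of this sketch.
NON_RSBA_FMS = {"6.55.7", "6.55.B", "6.5F", "6.6A.6", "6.9E.D", "6.A7.1"}


def format_fms(family: int, model: int, stepping: int) -> str:
    """Format decimal family/model/stepping as an uppercase-hex FMS string."""
    return f"{family:X}.{model:X}.{stepping:X}"


def parse_cpuinfo(text: str) -> Tuple[int, int, int]:
    """Return (family, model, stepping) of the first CPU in /proc/cpuinfo text."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between CPUs
        key = key.strip()
        # Keep only the first occurrence of each field (the first logical CPU).
        if key in ("cpu family", "model", "stepping") and key not in fields:
            fields[key] = int(value.strip())
    return fields["cpu family"], fields["model"], fields["stepping"]


def listed_as_not_enumerating_rsba(family: int, model: int, stepping: int) -> bool:
    """True if the FMS (or its family.model prefix) appears in the list above."""
    fms = format_fms(family, model, stepping)
    family_model = fms.rsplit(".", 1)[0]
    return fms in NON_RSBA_FMS or family_model in NON_RSBA_FMS
```

For example, `/proc/cpuinfo` reports `cpu family : 6`, `model : 85`, `stepping : 7` for a Cascade Lake part; `format_fms(6, 85, 7)` yields `6.55.7`, which is in the list.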


Resolution

What is RSBA?

Return Stack Buffer Alternative (RSBA) is a behavior of some Intel CPUs of the Skylake family. RSBA can enable information leakage through an attack that exploits Return Stack Buffer Underflow (RSBU). Parties concerned about running software on vulnerable CPUs can mitigate this attack, but mitigation incurs a performance penalty.

Why is RSBA noteworthy now?

In July 2022, researchers demonstrated exploitability of RSBU. Many vendors' existing mitigations of previous CPU vulnerabilities were insufficient to mitigate RSBU. Software vendors have added and/or are adding RSBU mitigation, potentially incurring performance penalties.

Mitigation of RSBA in vSphere

For VMware, mitigation of RSBA-based attacks falls into two categories:

  • Hypervisor-Specific Mitigation
  • Guest Mitigation

Hypervisor-Specific Mitigation

Mitigates leakage from the hypervisor or guest VMs into a malicious guest VM. In July 2022, VMware released patches to vSphere, documented in VMSA-2022-0020, implementing hypervisor-specific mitigation at no visible performance cost.

Guest Mitigation

Mitigates leakage between processes within the VM, or between the VM's kernel and user processes. In 2018, VMware implemented hypervisor assistance for guest mitigation by virtualizing speculative execution control mechanisms. VMware optimized this assistance to minimize hypervisor overhead. Within a VM, a guest operating system may implement whatever mitigation it chooses (or none at all) and vSphere will execute it faithfully.

Performance Concerns and Guest Mitigation

Linux kernel 5.19 implements a new default mitigation for CPU vulnerabilities. If RSBA is detected in the underlying CPU, RSBU is mitigated using IBRS. IBRS is more expensive than the previous default mitigation (retpoline), resulting in performance loss. This performance loss is specific to workload and underlying physical CPU. Performance loss is experienced whether the underlying platform is physical (bare metal) or virtual (running on a hypervisor such as vSphere).
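
To see which mitigation a Linux guest has actually selected, the kernel's status line in `/sys/devices/system/cpu/vulnerabilities/spectre_v2` can be inspected. The sketch below classifies that line as IBRS, Enhanced IBRS, or retpoline. It is illustrative only: the function names are hypothetical, and the parsing assumes the status begins with `Mitigation:` followed by the primary technique, a format whose exact wording varies between kernel versions.

```python
from pathlib import Path
from typing import Optional

# Standard Linux sysfs location for the Spectre v2 mitigation status.
SPECTRE_V2 = Path("/sys/devices/system/cpu/vulnerabilities/spectre_v2")


def read_spectre_v2_status(path: Path = SPECTRE_V2) -> Optional[str]:
    """Return the kernel's status line, or None if the file is unavailable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def classify_spectre_v2(status: str) -> str:
    """Classify a status line as 'ibrs', 'eibrs', 'retpoline', or 'other'.

    Only the first comma- or semicolon-separated item after "Mitigation:" is
    examined, because later items (such as IBRS_FW) describe secondary
    protections rather than the primary technique.
    """
    text = status.strip().lower()
    if not text.startswith("mitigation:"):
        return "other"  # e.g. "Vulnerable" or "Not affected"
    primary = text[len("mitigation:"):].split(",")[0].split(";")[0].strip()
    if "enhanced" in primary:
        return "eibrs"
    if "retpoline" in primary:
        return "retpoline"
    if "ibrs" in primary:
        return "ibrs"
    return "other"
```

A guest that reports `ibrs` after a kernel upgrade, where it previously reported `retpoline`, is consistent with the default change described above.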

Windows already mitigated existing CPU vulnerabilities using IBRS by default, so there is no new overhead for RSBU mitigation in Windows guests.

Please refer to the VMware Performance Blog for additional information. 

RSBA Performance Considerations on vSphere

RSBA is advertised to a guest VM unless either of the following is true:

  • The host CPU does not enumerate RSBA (see above table of invulnerable CPUs) and the host CPU is not in a cluster in a Skylake or earlier EVC mode.
  • The VM is configured with virtual hardware version 8 or earlier (in which case, the VM will receive no CPU mitigation virtualization at all as well as none of the benefits of newer virtual hardware). This is typically true of older, legacy VMs and virtual appliances.

VMware's patches in 2018 implemented this RSBA behavior. This was not changed by VMware's July patches.
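
The rule above can be expressed as a small decision function. This is a sketch of the logic as described in this article, not VMware code: the function name, the parameter names, and the EVC mode labels are hypothetical, and any EVC mode not named in this article is treated conservatively as one that advertises RSBA.

```python
from typing import Optional

# Hypothetical labels for the two EVC modes this article names as not
# causing RSBA to be advertised. None represents a host outside any EVC cluster.
EVC_MODES_WITHOUT_RSBA = {"cascade-lake", "ice-lake"}


def rsba_advertised(host_enumerates_rsba: bool,
                    evc_mode: Optional[str],
                    vm_hw_version: int) -> bool:
    """Return True if, per the rule above, RSBA is advertised to the VM."""
    # Virtual hardware version 8 or earlier: no CPU mitigation
    # virtualization at all, so RSBA is not advertised.
    if vm_hw_version <= 8:
        return False
    # The host CPU must not enumerate RSBA, and the cluster's EVC mode
    # (if any) must not be Skylake or earlier.
    evc_permits = evc_mode is None or evc_mode in EVC_MODES_WITHOUT_RSBA
    if not host_enumerates_rsba and evc_permits:
        return False
    return True
```

For instance, `rsba_advertised(False, "skylake", 19)` returns `True`: a CPU that does not itself enumerate RSBA still has RSBA advertised to its VMs when the cluster runs in a Skylake EVC mode, while `rsba_advertised(False, None, 19)` returns `False`.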

Options for Regaining Performance

VMware enables customers to implement the security policies of their choice within their virtual machines, and supports these mitigation options with minimal hypervisor overhead. Customers may choose whether and how to mitigate according to their policies. Operating system providers may offer mitigation alternatives with different security and performance properties.

A heterogeneous cluster of hosts (vulnerable and invulnerable) with a Skylake or older EVC mode will advertise RSBA to all VMs therein. Splitting this cluster into separate clusters (one vulnerable, one invulnerable) would allow the invulnerable cluster's VMs to run without RSBA advertised, recouping performance for those VMs.

Why is AMD not relevant to this discussion?

AMD CPUs are affected by a similar vulnerability called Branch Type Confusion (BTC). The default Linux mitigation of BTC on AMD uses different techniques than those described above, with a different performance footprint. These defaults were added to the Linux kernel before version 5.19. Because AMD CPUs are affected by BTC rather than RSBA/RSBU, the kernel change involved is a different one, and the performance effects differ, AMD is omitted from the broader discussion.