Configuring RDMA for vSAN

Article ID: 382163

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

This document provides guidance for configuring Remote Direct Memory Access (RDMA) in VMware vSAN environments.

This is intended for advisory purposes only.

Resolution

All switch configurations need to be validated with your switch vendor.

All vmnic/driver configurations need to be validated with your NIC hardware vendor.

Additional Information

Switch Configuration

Proper switch configuration is critical to the success of RDMA on vSAN. Below are key areas to address.

Always refer to vendor-specific documentation for details.

General Requirements

  1. Data Center Bridging (DCB):

    • Enable DCB to provide the lossless Ethernet that RDMA requires.
  2. Priority Flow Control (PFC):

    • Configure PFC on switches so that the traffic class carrying RDMA is paused rather than dropped under congestion. Verify that VLAN tagging is consistent across all devices in the path.
  3. Congestion Management:

    • Use switch features like Explicit Congestion Notification (ECN) where supported to manage network congestion.
  4. Firmware and Software:

    • Ensure all switches are running a firmware or software version compatible with RDMA traffic.
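
The consistency requirement in item 2 can be checked mechanically once the settings of each port in the path have been collected. The following is a minimal sketch, not a supported tool: the record layout, the device names, the VLAN ID, and the choice of priority 3 are all hypothetical, and the values would have to be gathered by hand or from vendor tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortSettings:
    """RDMA-relevant settings recorded for one port (hypothetical layout)."""
    device: str        # illustrative label, e.g. "esx01:vmnic4"
    vlan_id: int       # VLAN carrying vSAN traffic
    pfc_priority: int  # 802.1p priority (0-7) on which PFC is enabled
    ecn_enabled: bool  # whether ECN marking is enabled on this port


def find_mismatches(ports):
    """Compare every port against the first one and describe each difference."""
    if not ports:
        return []
    ref = ports[0]
    problems = []
    for port in ports[1:]:
        for field in ("vlan_id", "pfc_priority", "ecn_enabled"):
            expected, actual = getattr(ref, field), getattr(port, field)
            if expected != actual:
                problems.append(
                    f"{port.device}: {field}={actual!r}, "
                    f"expected {expected!r} (from {ref.device})"
                )
    return problems


# Example data (made up): the second host uses a different PFC priority.
path = [
    PortSettings("esx01:vmnic4", vlan_id=120, pfc_priority=3, ecn_enabled=True),
    PortSettings("tor-a:Eth1/10", vlan_id=120, pfc_priority=3, ecn_enabled=True),
    PortSettings("esx02:vmnic4", vlan_id=120, pfc_priority=4, ecn_enabled=True),
]
for line in find_mismatches(path):
    print(line)
```

An empty result only means the recorded values agree with each other. It does not replace validation by the switch vendor.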

Vendor-Specific Guidance

  • Cisco:

    • Verify switch firmware supports DCBX. For example, Nexus switches may require specific firmware versions to enable RDMA features.
  • Arista:

    • Enable DCB and PFC features. Ensure switches support QoS configurations required for RDMA traffic.
  • Mellanox:

    • Configure switches to support RoCE traffic, including PFC, DCBX, and ECN.

Caveats

Operational Challenges

  • This is not a "plug-and-play" feature. Misconfiguration, such as improper Priority Flow Control (PFC) settings, will require troubleshooting by the operational teams (not VMware Global Support).
  • If you cannot align VMware admins and networking teams operationally, consider using TCP instead of RDMA for simplicity.

Restrictions

  • Do not rely on VMware Global Support for network-specific configurations. Ensure network teams handle settings like VLAN and PFC.
  • Do not mix NIC vendors within the same cluster.
  • Do not route RDMA over Converged Ethernet (RoCE) traffic across Layer 3, and do not use LAG/LACP configurations with RDMA.

General Notes

  • Mixing vendors or using unsupported configurations can lead to unmanageable QA and operational issues.

NIC Configuration

The following are general NIC setup requirements. Refer to vendor-specific documentation for precise configuration instructions.

General Requirements

  • Enable RDMA over Converged Ethernet (RoCE).
  • Configure Priority Flow Control (PFC) and ensure it is set correctly for VLANs used by vSAN traffic.
  • Avoid mixing NIC vendors within the same vSAN cluster.
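
The single-vendor rule above is easy to audit from an inventory export. The sketch below assumes a hypothetical inventory structure (host name mapped to a list of NIC records with "name" and "vendor" keys); it is not the output format of any VMware tool.

```python
def nic_vendors_in_cluster(inventory):
    """Map each NIC vendor (lower-cased) to the sorted hosts that use it."""
    vendors = {}
    for host, nics in inventory.items():
        for nic in nics:
            key = nic["vendor"].strip().lower()
            vendors.setdefault(key, set()).add(host)
    return {vendor: sorted(hosts) for vendor, hosts in vendors.items()}


# Example data (made up): esx03 breaks the single-vendor rule.
inventory = {
    "esx01": [{"name": "vmnic4", "vendor": "Mellanox"}],
    "esx02": [{"name": "vmnic4", "vendor": "Mellanox"}],
    "esx03": [{"name": "vmnic4", "vendor": "Broadcom"}],
}
found = nic_vendors_in_cluster(inventory)
if len(found) > 1:
    print("Mixed NIC vendors in cluster:", found)
```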

Vendor-Specific Guidance

  • Broadcom:

    • Prefer modern NICs such as Thor-based adapters. Older NICs may lack the required feature set for optimal RDMA performance.
    • Enable Data Center Bridging (DCB) and configure PFC on the adapter.
  • Mellanox:

    • Enable DCBX, configure VLAN settings, and disable CEE mode.
    • Set hardware parameters to enforce PFC and DSCP values for RDMA traffic.
  • Intel:

    • Verify that RDMA functionality is supported and configure NICs according to vendor recommendations.
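
PFC is enabled per 802.1p priority (0-7), and configuration interfaces commonly express the enabled set as an 8-bit mask with one bit per priority. The sketch below shows only that arithmetic. The use of priority 3 is an assumption for illustration, and the parameter that accepts such a mask, if any, must be taken from your NIC vendor's documentation.

```python
def pfc_bitmask(priorities):
    """Return an 8-bit mask with one bit set for each PFC-enabled priority."""
    mask = 0
    for priority in priorities:
        if not 0 <= priority <= 7:
            raise ValueError(f"802.1p priority must be 0-7, got {priority}")
        mask |= 1 << priority
    return mask


# Assumed example: PFC enabled on priority 3 only.
print(f"0x{pfc_bitmask([3]):02x}")  # prints 0x08
```

Whatever value is chosen, the same priority has to be enabled on the adapters and on every switch port in the path.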