Switch Configuration
Proper switch configuration is critical to the success of RDMA on vSAN. Below are key areas to address.
Always refer to vendor-specific documentation for details.
General Requirements
- Data Center Bridging (DCB):
  - Enable DCB to provide the lossless Ethernet that RDMA requires.
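- Priority Flow Control (PFC):
  - Configure PFC on the switches so that the traffic class carrying RDMA is lossless. PFC pauses a single priority rather than the whole link.
  - Verify that VLAN tagging is consistent across all devices in the path, because the PFC priority is typically carried in the 802.1Q VLAN tag.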
- Congestion Management:
  - Use switch features like Explicit Congestion Notification (ECN) where supported to manage network congestion.
- Firmware and Software:
  - Ensure all switches are running a firmware or software version compatible with RDMA traffic.
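The consistency requirements above can be checked before rollout. The following is a minimal sketch in Python, assuming you have already collected the relevant settings from each device by hand or from your own tooling. The `HopSettings` record, the `find_path_mismatches` helper, the device names, and the example values (VLAN 100, priority 3) are all hypothetical and do not come from any vendor API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HopSettings:
    """Hypothetical record of what one device applies to vSAN RDMA traffic."""
    device: str
    vlan_id: int
    pfc_enabled: bool
    pfc_priority: int  # 802.1p priority (0-7) treated as lossless
    ecn_enabled: bool


def find_path_mismatches(path):
    """Compare every hop against the first hop and list any problems found."""
    problems = []
    if not path:
        return problems
    reference = path[0]
    for hop in path:
        if not hop.pfc_enabled:
            problems.append(f"{hop.device}: PFC is disabled")
        if hop.vlan_id != reference.vlan_id:
            problems.append(
                f"{hop.device}: VLAN {hop.vlan_id} does not match "
                f"{reference.device} (VLAN {reference.vlan_id})"
            )
        if hop.pfc_priority != reference.pfc_priority:
            problems.append(
                f"{hop.device}: PFC priority {hop.pfc_priority} does not match "
                f"{reference.device} (priority {reference.pfc_priority})"
            )
        if not hop.ecn_enabled:
            # ECN is only expected where the device supports it.
            problems.append(f"{hop.device}: ECN is off, confirm whether it is supported")
    return problems


if __name__ == "__main__":
    # Example values only; substitute the settings from your own environment.
    path = [
        HopSettings("esx01-vmnic4", 100, True, 3, True),
        HopSettings("tor-switch-1", 100, True, 3, True),
        HopSettings("tor-switch-2", 200, True, 3, False),
    ]
    for problem in find_path_mismatches(path):
        print(problem)
```

A check like this only confirms that the recorded values agree with each other. It does not confirm that the devices actually behave as configured, so vendor documentation and on-device verification still apply.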
Vendor-Specific Guidance
- Cisco:
  - Verify switch firmware supports DCBX. For example, Nexus switches may require specific firmware versions to enable RDMA features.
- Arista:
  - Enable DCB and PFC features. Ensure switches support QoS configurations required for RDMA traffic.
- Mellanox:
  - Configure switches to support RoCE traffic, including PFC, DCBX, and ECN.
Caveats
Operational Challenges
- This is not a "plug-and-play" feature. Misconfiguration, such as improper Priority Flow Control (PFC) settings, must be troubleshot by your own operational teams, not by VMware Global Support.
- If you cannot align VMware admins and networking teams operationally, consider using TCP instead of RDMA for simplicity.
Restrictions
- Do not rely on VMware Global Support for network-specific configurations. Ensure network teams handle settings like VLAN and PFC.
- Do not mix NIC vendors within the same cluster.
- Do not route RDMA over Converged Ethernet (RoCE) traffic across Layer 3, and do not use LAG/LACP configurations with RDMA.
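These restrictions lend themselves to a simple design review check. The sketch below, in Python, assumes a hand-maintained summary of the planned cluster. `ClusterDesign`, its field names, and `restriction_violations` are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ClusterDesign:
    """Hypothetical summary of a planned vSAN RDMA cluster."""
    nic_vendors_by_host: dict  # host name -> vendor of the NIC used for vSAN
    routed_layer3: bool        # True if vSAN RDMA traffic crosses a Layer 3 hop
    uses_lag_or_lacp: bool     # True if the vSAN uplinks are in a LAG/LACP bundle


def restriction_violations(design):
    """Return the restrictions from the list above that this design breaks."""
    violations = []
    vendors = sorted(set(design.nic_vendors_by_host.values()))
    if len(vendors) > 1:
        violations.append("mixed NIC vendors: " + ", ".join(vendors))
    if design.routed_layer3:
        violations.append("RoCE traffic is routed over Layer 3")
    if design.uses_lag_or_lacp:
        violations.append("LAG/LACP is configured on RDMA uplinks")
    return violations


if __name__ == "__main__":
    # Example values only.
    design = ClusterDesign(
        nic_vendors_by_host={"esx01": "Broadcom", "esx02": "Mellanox"},
        routed_layer3=False,
        uses_lag_or_lacp=True,
    )
    for violation in restriction_violations(design):
        print(violation)
```

An empty result means only that none of the listed restrictions were recorded as violated. If the list is not empty, falling back to TCP may be the simpler option.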
General Notes
- Mixing vendors or using unsupported configurations can create QA and operational problems that are difficult to manage.
NIC Configuration
The following are general NIC setup requirements. Refer to vendor-specific documentation for precise configuration instructions.
General Requirements
- Enable RDMA over Converged Ethernet (RoCE).
- Configure Priority Flow Control (PFC) and ensure it is set correctly for VLANs used by vSAN traffic.
- Avoid mixing NIC vendors within the same vSAN cluster.
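The RoCE and PFC requirements can be tracked per host with a small script. This is a minimal sketch in Python, assuming the adapter state has been gathered separately and entered by hand. `NicState`, `nic_requirement_gaps`, the host and adapter names, and VLAN 100 are hypothetical placeholders, not values read from ESXi or any vendor tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NicState:
    """Hypothetical snapshot of one adapter used for vSAN traffic."""
    host: str
    adapter: str
    roce_enabled: bool
    pfc_vlans: tuple  # VLAN IDs for which PFC is configured on the adapter


def nic_requirement_gaps(nics, vsan_vlan):
    """List adapters that lack RoCE or have no PFC on the vSAN VLAN."""
    gaps = []
    for nic in nics:
        label = f"{nic.host}/{nic.adapter}"
        if not nic.roce_enabled:
            gaps.append(f"{label}: RoCE is not enabled")
        if vsan_vlan not in nic.pfc_vlans:
            gaps.append(f"{label}: PFC is not configured for VLAN {vsan_vlan}")
    return gaps


if __name__ == "__main__":
    # Example values only.
    inventory = [
        NicState("esx01", "vmnic4", True, (100,)),
        NicState("esx02", "vmnic4", False, (200,)),
    ]
    for gap in nic_requirement_gaps(inventory, vsan_vlan=100):
        print(gap)
```

How the underlying values are actually set and read differs by NIC vendor, so the vendor documentation remains the source of truth.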
Vendor-Specific Guidance
- Broadcom:
  - Prefer modern NICs such as Thor-based adapters. Older NICs may lack the required feature set for optimal RDMA performance.
  - Enable Data Center Bridging (DCB) and configure PFC on the adapter.
- Mellanox:
  - Enable DCBX, configure VLAN settings, and disable CEE mode.
  - Set hardware parameters to enforce PFC and DSCP values for RDMA traffic.
- Intel:
  - Verify that RDMA functionality is supported and configure NICs according to vendor recommendations.