How to Enable Tx/Rx hardware flow control and Disable Priority Flow Control on ESXi host

Article ID: 417909


Products

VMware vSphere ESXi

Issue/Introduction

Hardware Flow Control:

- Hardware flow control in ESXi manages network traffic by pausing and resuming data transmission between devices to prevent packet loss from congestion.

- It is typically enabled by default and configured through the ESXi driver, though it can be controlled using command-line tools like esxcli for manual adjustments or to make settings persistent. 

- Pause Frames are related to Ethernet flow control and are used to manage the pacing of data transmission on a network segment.

- Sometimes, a sending node (ESXi host, switch, and so on) may transmit data faster than another node can accept it. In this case, the overwhelmed network node can send pause frames back to the sender, pausing the transmission of traffic for a brief period of time.

Priority Flow Control:

- Priority Flow Control (PFC, IEEE 802.1Qbb) is an extension of the standard Ethernet flow control mechanism (IEEE 802.3x) that operates on individual traffic priorities rather than on the entire network link.

- Selective Pausing: Unlike traditional flow control which pauses all traffic on a link when congestion occurs, PFC allows the ESXi host or switch to apply pause functionality to specific classes of traffic (defined by IEEE 802.1p CoS values). This means that lower-priority, non-loss-sensitive traffic can be temporarily halted without affecting high-priority, time-sensitive traffic flowing on the same physical link.

- Lossless Ethernet: The main goal of PFC is to prevent packet loss due to congestion in data center bridging (DCB) networks, which is crucial for storage protocols that assume a lossless network fabric.

- Hardware Support: PFC functionality is dependent on the network adapter hardware and driver support within ESXi. The configuration typically involves enabling Data Center Bridging Exchange (DCBX) protocols to ensure the ESXi host and the connected physical switch agree on the PFC settings. 

- For ESXi configuration, PFC usually requires explicit configuration and coordination on both the ESXi host and the connected physical switch ports to ensure a matching configuration.

- In summary, PFC in ESXi provides a granular, hardware-level mechanism to manage network congestion. It prevents congestion-related packet loss for critical workloads by selectively pausing traffic flows based on priority.

Working concept of PFC:
  • Priority tagging: Ethernet frames are tagged with a priority value, ranging from 0 (lowest) to 7 (highest).
  • Congestion detection: When a network interface's buffer reaches a certain threshold due to congestion, PFC is triggered.
  • Per-priority pausing: Instead of pausing all traffic, the interface sends a pause frame that instructs the sender to temporarily stop sending packets only of that specific priority class.
  • Resumption: When the congestion clears, the interface sends a PFC frame with a pause time of zero for that priority (or the pause timer simply expires), which allows the paused traffic to resume.
  • Lossless transport: This is crucial for applications that cannot tolerate packet loss, such as NVMe over RDMA and iSCSI storage traffic, ensuring that they have a lossless Ethernet transport. 

Environment

VMware ESXi 7.x

VMware ESXi 8.x

Resolution

** Before enabling hardware flow control, make sure that the driver and firmware versions of the Ethernet adapters on the ESXi host are listed in the Broadcom Compatibility Guide: Broadcom Compatibility Guide

To enable hardware flow control, use the following commands:

1. List pause parameters of all NICs

  esxcli network nic pauseParams list
 

2. Set pause parameters for a NIC

  esxcli network nic pauseParams set

  Cmd options:
    -a|--auto=<bool>      Enable/disable auto negotiation.
    -n|--nic-name=<str>   Name of NIC whose pause parameters should be set. (required)
    -r|--rx=<bool>        Enable/disable pause RX flow control.
    -t|--tx=<bool>        Enable/disable pause TX flow control.

  Example: esxcli network nic pauseParams set -n vmnic2 --rx=true --tx=true

* If the physical NIC is in the Down state, the pause parameters cannot be set on that interface. *
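The link-state caveat above can be folded into a small wrapper. The following is a minimal sketch, not an official procedure: the function names and the `ESXCLI` override variable are hypothetical, and it assumes that `esxcli network nic get -n <nic>` prints a line of the form `Link Status: Up` for a NIC with an active link.

```shell
#!/bin/sh
# Sketch: enable Rx/Tx pause frames on a NIC only when its link is up.
# ESXCLI is a hypothetical override so the functions can be dry-run with a stub.
ESXCLI="${ESXCLI:-esxcli}"

# Returns 0 when the NIC reports an active link.
# Assumption: "esxcli network nic get" prints a line "Link Status: Up".
nic_link_up() {
    "$ESXCLI" network nic get -n "$1" | grep -q 'Link Status: Up'
}

# Enables Rx and Tx flow control on the named NIC, then lists the
# pause parameters of all NICs so the result can be checked.
enable_flow_control() {
    nic="$1"
    if ! nic_link_up "$nic"; then
        echo "skip $nic: link is down" >&2
        return 1
    fi
    "$ESXCLI" network nic pauseParams set -n "$nic" --rx=true --tx=true &&
        "$ESXCLI" network nic pauseParams list
}
```

After loading these functions in an ESXi shell session, `enable_flow_control vmnic2` would run the same set command as the example above, but only if vmnic2 is up.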

To make the changes persistent across reboots:  

- Edit the local.sh file in the /etc/rc.local.d/ directory.

- Add the command to be executed above the line exit 0.

- For details on keeping configuration persistent across reboots, see the KB article: Persistent across reboots
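As an illustration, the lines added to local.sh might look like the sketch below. The NIC list, the `ESXCLI` override variable, and the function name are hypothetical and need to be adapted to the host; the fragment is meant to sit above the existing `exit 0` line.

```shell
# Sketch of lines to place in /etc/rc.local.d/local.sh, above "exit 0".
# ESXCLI and FLOW_CONTROL_NICS are hypothetical variables for illustration.
ESXCLI="${ESXCLI:-esxcli}"
FLOW_CONTROL_NICS="${FLOW_CONTROL_NICS:-vmnic2}"

# Re-applies Rx/Tx flow control to every NIC in the list at boot.
restore_pause_params() {
    # FLOW_CONTROL_NICS is left unquoted on purpose so it splits on spaces.
    for nic in $FLOW_CONTROL_NICS; do
        "$ESXCLI" network nic pauseParams set -n "$nic" --rx=true --tx=true
    done
}

# Only run when the esxcli binary is actually available.
if command -v "$ESXCLI" >/dev/null 2>&1; then
    restore_pause_params
fi
```

Keeping the NIC names in one variable makes it easier to extend the list later without duplicating the command line.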

 

Configuration of Priority Flow Control:

  1. Verify NIC Driver Support for PFC:
    Not all NICs and their drivers support PFC. You need to ensure your NIC and its corresponding driver on the ESXi host are capable of PFC. Refer to your NIC vendor's documentation or use ESXi commands to check driver parameters. For example, to list parameters for a specific module:
    esxcli system module parameters list --module <driver_name>
  2. Configure PFC Parameters on the ESXi Host:
    PFC is typically configured at the driver level using module parameters. The exact parameters and values depend on the NIC vendor and model. For Mellanox ConnectX-4/5 drivers, for instance, you might use:
    esxcli system module parameters set -m nmlx5_core -p "pfctx=0x08 pfcrx=0x08 trust_state=2"
    esxcli system module parameters set -m nmlx5_rdma -p "dscp_force=26"
  • pfctx and pfcrx: These parameters control the transmit and receive PFC behavior, respectively. The value is a bitmap representing the priorities (0-7) for which PFC should be enabled. For example, 0x08 (binary 00001000) enables PFC for priority 3.
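  • trust_state: This parameter controls which QoS marking of incoming packets the NIC trusts when it assigns a priority. For nmlx5_core, 1 selects the 802.1p PCP field and 2 selects DSCP; confirm the values against the driver documentation for your version.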
  • dscp_force: This parameter can be used to force a specific Differentiated Services Code Point (DSCP) value for RDMA traffic.

      If these settings are not configured on the host, if the NIC driver does not support PFC, or if the physical switch ports that the ESXi NICs connect to do not support PFC, then PFC remains disabled.

  3. Configure PFC on the Physical Switch:
    PFC configuration on the ESXi host must be consistent with the configuration on the physical network switch ports connected to the ESXi host. This includes:
  • Enabling PFC: 
    Ensure PFC is enabled on the relevant switch ports for the specific traffic classes (priorities) you intend to protect with PFC.
  • Matching DCB Modes: 
    If Data Center Bridging (DCB) is used, ensure the DCB mode (e.g., IEEE or CEE) is consistent between the ESXi host and the switch. To get the status of DCB on ESXi, use the command: esxcli network nic dcb status get -n vmnicX. 
  • Traffic Class Groups and Bandwidth Allocation: 
    Configure traffic class groups and allocate bandwidth for different traffic types (e.g., SAN traffic, management traffic) on the switch, aligning with your desired PFC strategy.
  4. Reboot the ESXi Host:
    After modifying module parameters, a reboot of the ESXi host is typically required for the changes to take effect.
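The pfctx and pfcrx bitmap described in step 2 can be derived from a list of priorities rather than worked out by hand. The sketch below uses hypothetical helper names; it only prints the resulting command and does not run it. The module name nmlx5_core and trust_state=2 are copied from the Mellanox example above and may not apply to other drivers.

```shell
#!/bin/sh
# Sketch: build the pfctx/pfcrx bitmap from a list of 802.1p priorities.
# Bit N of the bitmap enables PFC for priority N (0-7).
pfc_bitmap() {
    mask=0
    for prio in "$@"; do
        case "$prio" in
            [0-7]) mask=$(( mask | (1 << prio) )) ;;
            *) echo "invalid priority: $prio" >&2; return 1 ;;
        esac
    done
    printf '0x%02x\n' "$mask"
}

# Prints (does not run) the module parameter command for the given priorities.
# nmlx5_core and trust_state=2 come from the Mellanox example in step 2.
pfc_module_command() {
    bitmap=$(pfc_bitmap "$@") || return 1
    echo "esxcli system module parameters set -m nmlx5_core -p \"pfctx=$bitmap pfcrx=$bitmap trust_state=2\""
}
```

For example, `pfc_bitmap 3` prints 0x08, which matches the value used in step 2, and `pfc_bitmap 3 4` prints 0x18 for priorities 3 and 4 together.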
 
    Important Considerations:
  • Vendor Documentation: 
    Always consult your specific NIC and switch vendor documentation for detailed instructions and recommended configurations.
  • Consistency: 
    Ensure PFC settings are consistent across the entire path for lossless traffic, from the ESXi host to the physical switch and any other involved network devices.
  • Monitoring: 
    After configuration, monitor network performance and ensure PFC is functioning as expected to prevent packet loss in congested scenarios.
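As a starting point for the checks above, the two esxcli commands already used in this article can be wrapped for a quick review of the configured values. This is a rough sketch with hypothetical function names and a hypothetical `ESXCLI` override variable. The filter assumes the parameter names pfctx, pfcrx, and trust_state from the Mellanox example, so other drivers will need a different pattern.

```shell
#!/bin/sh
# Sketch: show the PFC-related driver parameters and the DCB status of a NIC.
# ESXCLI is a hypothetical override so the functions can be dry-run with a stub.
ESXCLI="${ESXCLI:-esxcli}"

# Lists only the PFC-related parameters configured for a driver module.
# Assumption: the parameter names are those of the nmlx5_core example.
show_pfc_params() {
    "$ESXCLI" system module parameters list --module "$1" |
        grep -E 'pfctx|pfcrx|trust_state'
}

# Prints the DCB status of a NIC so the mode can be compared with the switch.
show_dcb_status() {
    "$ESXCLI" network nic dcb status get -n "$1"
}
```

Running `show_pfc_params nmlx5_core` followed by `show_dcb_status vmnic2` (both names are examples) gives a quick view of what the host is configured to do. The switch-side counters still need to be checked with the switch vendor's tools.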

Additional Information

Reference KBs:

Configuring advanced driver module parameters in ESXi

Persistent configuration across reboots on ESXi

Flow Control setup in ESXi

Priority Flow Control not enabled due to DCB mode mismatch