Large packet loss in the guest OS using VMXNET3 in ESXi

Article ID: 324556

Updated On:

Products

VMware vSphere ESXi
VMware vCenter Server

Issue/Introduction

When using the VMXNET3 driver on a virtual machine on ESXi, you see significant packet loss during periods of very high traffic bursts. The virtual machine may even freeze entirely. Doing one of the following may resolve the issue:

  • vMotion the VM to another host
  • Disconnect and reconnect the VM's adapter

Environment

vSphere ESXi
vCenter Server

Cause

This can occur due to a lack of receive and transmit buffer space or when receiving traffic that is speed-constrained using, for example, a traffic filter.

Resolution

Investigate potential causes of significant packet loss during periods of very high traffic bursts with the VMXNET3 vNIC:

  1. Ensure that there is no traffic filtering occurring (for example, by a security product).
  2. Check the driver and firmware versions of the ESXi host's physical NICs and update them if needed. To check the versions, see "Determining Network/Storage firmware and driver version in ESXi".
  3. After eliminating the above possibilities, consider increasing the number of buffers in the guest operating system.

Determine if increasing the size of virtual NIC buffers could help solve your problem:

  1. For a given virtual machine, determine the ESXi host on which it is running, and log into the ESXi host via SSH or a KVM with root privileges.
  2. Using "VMNAME" as an example virtual machine name, run the command:  net-stats -l | grep -i VMNAME
  3. The output might look something like the following:
    • <PortNumber>       5    9    <Switch Name>    <MAC Address>    VMNAME.eth0
  4. Run a command such as the following to check the 1st ring buffer stats:
    • vsish -e get /net/portsets/<Switch Name>/ports/<PortNumber>/vmxnet3/rxSummary | grep "1st ring"
  5. The output might look something like the following:
    •    1st ring size:512
         # of times the 1st ring is full:276
  6. Another useful command is:
    • vsish -e get /net/portsets/<Switch Name>/ports/<PortNumber>/vmxnet3/rxSummary | grep "running out of buffers"
  7. The output might look something like the following:
    •      running out of buffers:3198
  8. If both the "# of times the 1st ring is full" and "running out of buffers" counters are zero, changing the ring buffer settings will not affect your symptoms.
  9. However, if either counter is greater than zero, increasing the buffer values may help solve your problem.

NOTE: The virtual NIC counters shown by the vsish command are reset when the virtual machine is migrated with vMotion or power cycled.
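
The check in steps 4 through 9 can be sketched as a small shell helper. This is a minimal sketch, not part of the original procedure: the function names rx_counter and ring_increase_may_help are hypothetical, and it assumes a POSIX shell with grep, sed, head and tr available, and that the counters appear as "label:value" lines like the sample output above.

```shell
#!/bin/sh
# Hypothetical helpers (not VMware tooling): they only parse rxSummary-style
# text supplied on stdin, in the "label:value" form shown in the steps above.

# Print the number that follows the first line containing the given label.
# $1 = label text, matched as a fixed string (for example "1st ring is full").
rx_counter() {
    grep -F "$1" | head -n 1 | sed 's/.*://' | tr -d ' \t\r'
}

# Read rxSummary text on stdin and report whether larger ring buffers
# may help (either counter above zero) or will have no effect (both zero).
ring_increase_may_help() {
    summary=$(cat)
    full=$(printf '%s\n' "$summary" | rx_counter "1st ring is full")
    starved=$(printf '%s\n' "$summary" | rx_counter "running out of buffers")
    if [ "${full:-0}" -gt 0 ] || [ "${starved:-0}" -gt 0 ]; then
        echo "may-help (1st ring full: ${full:-0}, out of buffers: ${starved:-0})"
    else
        echo "no-effect"
    fi
}

# Possible usage on the ESXi host, with the placeholders from step 3 filled in:
#   vsish -e get /net/portsets/<Switch Name>/ports/<PortNumber>/vmxnet3/rxSummary | ring_increase_may_help
```

Because the counters reset after a vMotion or power cycle, a "no-effect" result shortly after either event may only mean that no bursts have occurred yet.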

 

To increase VMXNET3 ring buffer values in a Windows guest OS:

  1. Click Start > Control Panel > Device Manager.
  2. Right-click vmxnet3 and click Properties.
  3. Click the Advanced tab.
  4. Click Small Rx Buffers and increase the value (the maximum value is 8192).
  5. Click Rx Ring #1 Size and increase the value (the maximum value is 4096).

To increase VMXNET3 ring buffer values in a Linux guest OS:

1. View the current values using the command:
ethtool -g <interface>
where <interface> is the interface name as it appears in your OS, e.g. eth0.

2. Set values using a capital -G, followed by the interface name and pairs of settings and values, for example:
ethtool -G <interface> rx 4096
ethtool -G <interface> rx 4096 rx-jumbo 4096 rx-mini 2048 tx 4096

Refer to "The output of esxtop show dropped receive packets at the virtual switch" for detailed instructions on changing these values.
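
Since the buffer values are best raised gradually, one possible approach in a Linux guest is to double the RX ring size in steps, never exceeding the pre-set maximum that ethtool reports. The following is a minimal sketch under stated assumptions: the function names rx_ring_sizes and next_rx_size are hypothetical, awk is assumed to be available, and the `ethtool -g` output is assumed to contain "Pre-set maximums:" and "Current hardware settings:" sections, each with an "RX:" line.

```shell
#!/bin/sh
# Hypothetical helpers (not part of ethtool): they parse `ethtool -g` text
# and compute the next, larger RX ring size.

# Read `ethtool -g` output on stdin and print "<preset max> <current>"
# for the RX ring.
rx_ring_sizes() {
    awk '
        /^Pre-set maximums:/          { section = "max" }
        /^Current hardware settings:/ { section = "cur" }
        /^RX:/ && section == "max"    { max = $2 }
        /^RX:/ && section == "cur"    { cur = $2 }
        END { print max, cur }
    '
}

# Print the next RX ring size: double the current value, capped at the maximum.
# $1 = preset maximum, $2 = current size
next_rx_size() {
    next=$(( $2 * 2 ))
    if [ "$next" -gt "$1" ]; then
        next=$1
    fi
    echo "$next"
}

# Possible usage in the guest (eth0 is an example interface name):
#   set -- $(ethtool -g eth0 | rx_ring_sizes)
#   ethtool -G eth0 rx "$(next_rx_size "$1" "$2")"
```

Re-check the drop counters on the host after each step and stop increasing once they no longer grow.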

 

Notes:
  • No reboot is required for these changes to take effect. However, any applications sensitive to TCP session disruption such as RDP could be disconnected. Performing these changes using a vSphere console will allow you to avoid disconnection.
  • Increasing the size of the buffers will increase memory usage on the virtual machine, which in turn increases memory usage on the host. It is important to increase ring buffer values gradually to avoid drastically increasing the memory overhead on the host and possibly causing performance issues if resources are close to capacity.
  • If this issue occurs on only 2-3 virtual machines, set the value of Small Rx Buffers and Rx Ring #1 to the maximum value. Monitor virtual machine performance to see if this resolves the issue.
  • The Small Rx Buffers and Rx Ring #1 variables affect only non-jumbo frame traffic on the adapter.
  • Windows Servers must be on NDIS 6.3 or higher to change Rx Ring #2. See "Overview of NDIS versions - Windows driver".


Additional Information