Large packet loss in the guest OS using VMXNET3 in ESXi

Article ID: 324556

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

  • When using the VMXNET3 driver on a virtual machine on ESXi, you see significant packet loss during periods of very high traffic bursts. The virtual machine may even freeze entirely. Doing one of the following may resolve the issue:
    • Migrate the VM to another host with vMotion.
    • Disconnect and reconnect the VM's network adapter.


Environment

VMware vSphere ESXi 7.x
VMware vSphere ESXi 8.x
VMware vCenter Server 7.x
VMware vCenter Server 8.x

Cause

This can occur due to a lack of receive and transmit buffer space, or when receive traffic is speed-constrained, for example by a traffic filter.

Resolution

DETERMINE IF VIRTUAL NIC BUFFERS COULD BE INCREASED:

  • 1) For a given virtual machine, determine the ESXi host on which it is running, and log in to that host as root over SSH or through a KVM console.
  • 2) Using "VMNAME" as an example virtual machine name, run the following command:  "net-stats -l | grep VMNAME"
  • 3) The output might look something like the following, where the first field is the port number and the fourth field is the portset name:

100663349 2189023:VMNAME.eth0              all(1) DvsPortset-2       3821.41  349.01 11970.00    5663.45    5.34  123.00   0.00   0.00

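  • 4) Run a command such as the following to check the 1st ring buffer statistics, substituting the portset name (DvsPortset-2) and port number (100663349) from your own output: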

vsish -e get /net/portsets/DvsPortset-2/ports/100663349/vmxnet3/rxSummary | grep "1st ring"

  • 5) The output might look something like the following:

   1st ring size:512
   # of times the 1st ring is full:276

  • 6) The following command shows the "running out of buffers" counter for the same port:

vsish -e get /net/portsets/DvsPortset-2/ports/100663349/vmxnet3/rxSummary | grep "running out of buffers"

  • 7) The output might look something like the following:

    running out of buffers:3198

  • 8) If both "# of times the 1st ring is full" and "running out of buffers" are zero, changing the ring buffer settings will have no effect on your symptoms.
  • 9) If either value is greater than zero, follow the steps below.
  • 10) Note that the virtual NIC counters shown by the vsish command are reset, and the port number (100663349 in the above example) changes, when the VM is migrated to another ESXi host with vMotion.
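
The checks in steps 2 through 9 can be sketched as a few small shell helpers. This is a minimal sketch, not a VMware-provided tool: the function names (rx_summary_path, counter_value, ring_increase_may_help) are hypothetical, it assumes that awk, grep, and head are available in the ESXi shell, and it assumes the output layouts shown in the examples above (port number in field 1, portset name in field 4, counters printed as "label:value"). The demo at the end runs on the sample line only, so it does not need an ESXi host.

```shell
#!/bin/sh
# Hypothetical helpers; names and field positions are assumptions based on
# the sample output in this article, not a documented interface.

# Build the vsish rxSummary path from ONE line of "net-stats -l" output.
# Field 1 = port number, field 4 = portset name.
rx_summary_path() {
    printf '%s\n' "$1" |
        awk '{ printf "/net/portsets/%s/ports/%s/vmxnet3/rxSummary\n", $4, $1 }'
}

# Read rxSummary text on stdin and print the number after the last ":"
# on the first line that contains the label given as $1.
counter_value() {
    grep "$1" | head -n 1 | awk -F: '{ gsub(/[^0-9]/, "", $NF); print $NF }'
}

# Succeed (exit status 0) when either counter is greater than zero,
# which is the case where larger ring buffers may help.
ring_increase_may_help() {
    full=$(printf '%s\n' "$1" | counter_value "1st ring is full")
    oob=$(printf '%s\n' "$1" | counter_value "running out of buffers")
    [ "${full:-0}" -gt 0 ] || [ "${oob:-0}" -gt 0 ]
}

# Demo using the sample line from step 3.
line='100663349 2189023:VMNAME.eth0 all(1) DvsPortset-2 3821.41 349.01'
rx_summary_path "$line"

# On a host, the pieces might be combined as follows (assumes grep matches
# exactly one vNIC line):
#   summary=$(vsish -e get "$(rx_summary_path "$(net-stats -l | grep VMNAME)")")
#   ring_increase_may_help "$summary" && echo "consider larger ring buffers"
```

If the virtual machine has more than one vNIC, grep returns several lines, and each line needs to be passed to rx_summary_path separately.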

 

To resolve the issue of significant packet loss during periods of very high traffic bursts with the VMXNET3 vNIC:

  1. Ensure that there is no traffic filtering occurring (for example, with a mail filter).
  2. Check the driver/firmware of the ESXi's physical NICs and update if needed. To check the driver/firmware, see Determining Network/Storage firmware and driver version in ESXi.
  3. After eliminating the above possibilities, slowly increase the number of buffers in the guest operating system.

To reduce burst traffic drops by adjusting the buffer settings in Windows:

  1. Click Start > Control Panel > Device Manager.
  2. Right-click vmxnet3 and click Properties.
  3. Click the Advanced tab.
  4. Click Small Rx Buffers and increase the value (The maximum value is 8192).
  5. Click Rx Ring #1 Size and increase the value (The maximum value is 4096).

Notes:
  • No reboot is required for these changes to take effect. However, any application that is sensitive to TCP session disruption can fail and may have to be restarted. This includes RDP, so it is better to make these changes from the virtual machine console.
  • This issue is seen in the Windows guest operating system with a VMXNET3 vNIC.
  • It is important to increase the value of Small Rx Buffers and Rx Ring #1 gradually to avoid drastically increasing the memory overhead on the host and possibly causing performance issues if resources are close to capacity.
  • If this issue occurs on only 2-3 virtual machines, set the value of Small Rx Buffers and Rx Ring #1 to the maximum value. Monitor virtual machine performance to see if this resolves the issue.
  • The Small Rx Buffers and Rx Ring #1 variables affect only non-jumbo frame traffic on the adapter.
  • Windows Server must be running NDIS 6.3 or higher to change Rx Ring #2. See:
    Overview of NDIS versions - Windows driver
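
One way to plan a gradual increase is to list the intermediate values before touching the adapter. The sketch below assumes that doubling is an acceptable step size, which is an illustration rather than a VMware recommendation. The function name ring_steps is hypothetical, the starting value of 512 is taken from the sample "1st ring size" output, and the adapter's Advanced tab may offer only specific values, so treat the result as a rough guide.

```shell
#!/bin/sh
# Hypothetical helper: print a doubling schedule from the current value
# up to (and capped at) the maximum. Doubling is an assumed step size.
ring_steps() {
    cur=$1
    max=$2
    while [ "$cur" -lt "$max" ]; do
        cur=$((cur * 2))
        if [ "$cur" -gt "$max" ]; then
            cur=$max
        fi
        echo "$cur"
    done
}

# Rx Ring #1 Size: from 512 toward the documented maximum of 4096.
ring_steps 512 4096
```

Apply one value at a time and recheck the vsish counters on the host before moving to the next value.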


Additional Information