Receive length errors detected on Mellanox nmlx5_core NICs.

Article ID: 401149


Products

VMware vSphere ESX 7.x
VMware vSphere ESX 8.x
VMware vSphere ESXi

Issue/Introduction

Receive length errors detected on Mellanox nmlx5_core NICs.

The vmnic is configured for jumbo frames (MTU 9000).

The vmnic statistics look similar to the following:


      Packets received: 0
      Packets sent: 0
      Bytes received: 
      Bytes sent: 0
      Receive packets dropped: 0
      Transmit packets dropped: 0
      Multicast packets received: 0
      Broadcast packets received: 
      Multicast packets sent: 0
      Broadcast packets sent: 0
      Total receive errors: 51844
      Receive length errors: 51844
      Receive over errors: 0
      Receive CRC errors: 0
      Receive frame errors: 0
      Receive FIFO errors: 0
      Receive missed errors: 0
      Total transmit errors: 0
      Transmit aborted errors: 0
      Transmit carrier errors: 0
      Transmit FIFO errors: 0
      Transmit heartbeat errors: 0
      Transmit window errors: 0

      NIC Private statistics:

      rxOutOfRangeLenPhy: 15
      rxOversizePktsPhy: 51829
      rx_8192_to_10239_bytesPhy: 1390543

 

The vmnic statistics on the host can be viewed by running the following script:  /usr/lib/vmware/vm-support/bin/nicinfo.sh

Environment

VMware vSphere ESXi 7.x
VMware vSphere ESXi 8.x

Cause

The vmnic has a non-zero value for "rxOversizePktsPhy", which contributes to the receive length error counter.

The vmnic also has a non-zero value for "rxOutOfRangeLenPhy", which contributes to the same counter. In the example above, the two values account for the whole total (51829 + 15 = 51844).

Packets larger than 8192 bytes are also being received on the vmnic port, as shown by "rx_8192_to_10239_bytesPhy". With jumbo frames enabled (MTU 9000), ESXi accepts IP packets of up to 9000 bytes, which corresponds to a maximum ICMP payload of 8972 bytes.

In this case, the physical switch port is sending packets larger than the configured MTU to the Mellanox adapter, and the NIC counts them as "Receive length errors".
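The relationship between these counters can be checked with a short script. The following is a minimal sketch in Python, assuming the statistics have been saved as text in the "name: value" layout shown in this article. The names `parse_counters` and `length_error_breakdown` are illustrative and are not part of any VMware tool.

```python
import re

# Excerpt of the example statistics from this article.
SAMPLE = """\
      Total receive errors: 51844
      Receive length errors: 51844
      NIC Private statistics:
      rxOutOfRangeLenPhy: 15
      rxOversizePktsPhy: 51829
      rx_8192_to_10239_bytesPhy: 1390543
"""

def parse_counters(text):
    """Return {counter name: int} for each 'name: number' line.

    Lines without a numeric value (headings, empty fields) are skipped.
    """
    counters = {}
    for line in text.splitlines():
        match = re.match(r"^\s*(.+?):\s*(\d+)\s*$", line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters

def length_error_breakdown(counters):
    """Compare the length error total with the two private counters."""
    oversize = counters.get("rxOversizePktsPhy", 0)
    out_of_range = counters.get("rxOutOfRangeLenPhy", 0)
    reported = counters.get("Receive length errors", 0)
    return {
        "oversize": oversize,
        "out_of_range": out_of_range,
        "reported": reported,
        # Length errors not explained by the two private counters.
        "unexplained": reported - oversize - out_of_range,
    }

if __name__ == "__main__":
    print(length_error_breakdown(parse_counters(SAMPLE)))
```

For the example values, "unexplained" is 0, meaning the oversize and out-of-range counters together account for every receive length error. Whether this holds on other driver or firmware versions is an assumption that should be verified on the host in question.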


  

Resolution

Perform a packet capture on the physical switch port to identify these packets. A packet capture taken in ESXi will not contain them, because they are dropped before they can be captured. Engage your switch vendor or your networking team for assistance.

Ensure that the MTU settings are consistent end-to-end and match the expected traffic patterns.
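When comparing MTU settings between hosts and switches, it helps to keep the different size figures apart, because switch vendors may express the limit as a frame size rather than an IP MTU. The sketch below derives the relevant sizes from an MTU value using standard header sizes (14-byte Ethernet header, 4-byte FCS, optional 4-byte 802.1Q tag, 20-byte IPv4 header without options, 8-byte ICMP header). The function name `mtu_sizes` is illustrative.

```python
ETH_HEADER = 14   # destination MAC + source MAC + EtherType
ETH_FCS = 4       # frame check sequence
VLAN_TAG = 4      # optional 802.1Q tag
IPV4_HEADER = 20  # IPv4 header without options
ICMP_HEADER = 8   # ICMP echo header

def mtu_sizes(mtu):
    """Return the size limits implied by an IP MTU, in bytes."""
    return {
        # Largest ping payload that fits without fragmentation.
        "max_icmp_payload": mtu - IPV4_HEADER - ICMP_HEADER,
        # Largest frame on the wire, without and with a VLAN tag.
        "max_frame_untagged": mtu + ETH_HEADER + ETH_FCS,
        "max_frame_tagged": mtu + ETH_HEADER + ETH_FCS + VLAN_TAG,
    }

for mtu in (1500, 9000):
    print(mtu, mtu_sizes(mtu))
```

For MTU 9000 this gives an ICMP payload of 8972 bytes and on-wire frames of 9018 bytes (9022 with a VLAN tag); for MTU 1500 the figures are 1472, 1518 and 1522. The payload values are the sizes typically passed to `vmkping -d -s <size>` when testing a path without fragmentation.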

Additional Information

The MTU for each vmnic can be observed by running the command:  esxcli network nic list

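This issue can also occur if the MTU is set to 1500 and the NIC receives frames larger than that MTU allows (for example, an ICMP payload above 1472 bytes, which exceeds 1500 bytes once the 28 bytes of IP and ICMP headers are added). The frame-size counters then show traffic in the larger size ranges, for example: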

      rx_64_bytesPhy: 22595317
      rx_65_to_127_bytesPhy: 815736056
      rx_128_to_255_bytesPhy: 227233250
      rx_256_to_511_bytesPhy: 267986423
      rx_512_to_1023_bytesPhy: 340117906
      rx_1024_to_1518_bytesPhy: 141345162
      rx_1519_to_2047_bytesPhy: 1410456575
      rx_2048_to_4095_bytesPhy: 2182
      rx_4096_to_8191_bytesPhy: 3646
      rx_8192_to_10239_bytesPhy: 85703
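The size-range counters above can be compared against the configured MTU to see which ranges can only contain oversize frames. The following is a minimal sketch, assuming the counters are available as text in the layout shown above and that a frame may carry one VLAN tag. The function name `buckets_above_mtu` is illustrative.

```python
import re

# Size-range counters from the MTU 1500 example in this article.
SAMPLE = """\
      rx_1024_to_1518_bytesPhy: 141345162
      rx_1519_to_2047_bytesPhy: 1410456575
      rx_2048_to_4095_bytesPhy: 2182
      rx_4096_to_8191_bytesPhy: 3646
      rx_8192_to_10239_bytesPhy: 85703
"""

# Matches range counters only; rx_64_bytesPhy has no range and is ignored.
BUCKET = re.compile(r"^\s*rx_(\d+)_to_(\d+)_bytesPhy:\s*(\d+)\s*$")

def buckets_above_mtu(text, mtu, vlan_tagged=True):
    """Split size ranges into definitely and possibly oversize.

    Returns (definite, partial), each a dict {(low, high): count}.
    'definite' ranges start above the largest valid frame size;
    'partial' ranges straddle it, so they mix valid and oversize frames.
    """
    # Ethernet header (14) + FCS (4) + optional 802.1Q tag (4).
    max_frame = mtu + 14 + 4 + (4 if vlan_tagged else 0)
    definite, partial = {}, {}
    for line in text.splitlines():
        match = BUCKET.match(line)
        if not match:
            continue
        low, high, count = (int(g) for g in match.groups())
        if low > max_frame:
            definite[(low, high)] = count
        elif high > max_frame:
            partial[(low, high)] = count
    return definite, partial

definite, partial = buckets_above_mtu(SAMPLE, mtu=1500)
print("definitely oversize:", sum(definite.values()))
print("mixed ranges:", partial)
```

With MTU 1500, the ranges starting at 2048 bytes and above can only hold oversize frames, while the 1519 to 2047 range also includes valid VLAN-tagged frames of up to 1522 bytes. Calling the function with `mtu=9000` places the 8192 to 10239 range in the mixed category, so a non-zero value there is not by itself evidence of oversize frames.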

These counters accumulate events caused by conditions external to the ESXi kernel; ESXi only displays what the NIC driver reports to the host. For more information, see:
https://knowledge.broadcom.com/external/article/341594/troubleshooting-nic-errors-and-other-net.html