The VMware vSphere and VMware NSX environments experienced network latency and packet loss due to a faulty or congested Cisco IOM.

Article ID: 438995

Updated On:

Products

VMware NSX

Issue/Introduction

A vSphere environment running on Cisco UCS hardware may experience significant network performance degradation.

Symptoms observed from the guest OS or application layer include:

  • High network latency and inconsistent response times.

  • Increased TCP retransmissions.

  • Intermittent or frequent packet loss.

  • Potential link-down events on physical uplinks.

Note: Within ESXi, vmnic counters and driver logs might not explicitly show packet receive drops or errors, even when network performance is significantly degraded.

Environment

VMware vSphere

VMware NSX

Cause

The issue is typically caused by a failing or congested physical Cisco I/O Module (IOM). Because the IOM sits between the ESXi host and the Fabric Interconnect, drops that occur at the IOM are often invisible to the ESXi networking stack. The ESXi host shows only the symptoms of the degraded path (latency and retransmissions); the dropped packets themselves are not logged by the network driver.

Resolution

When symptoms point to a network bottleneck but ESXi vmnic counters remain clean, troubleshooting must shift to the physical infrastructure:

  1. Review the Fabric Interconnect (FI) and IOM interface counters for CRC errors, discards, or pause frames.

  2. Check for hardware alerts or "failing" states within Cisco UCS Manager (UCSM).

  3. If no errors are visible at the vSphere layer but the symptoms persist, engage Cisco Support for a detailed analysis of the IOM and the Fabric Interconnect backplane.

  4. If the IOM is found to be faulty or unable to handle the traffic load due to hardware degradation, replace the physical module.
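
The triage logic in the steps above can be sketched in Python. This is only an illustration, not a VMware or Cisco tool: the counter names (`crc_errors`, `discards`, `pause_frames`), the function names, and the returned labels are all hypothetical, and the snapshots are assumed to have been collected by hand from the relevant interfaces at two points in time. The "host" branch is an added assumption for completeness and is not stated in this article.

```python
from typing import Dict, Tuple

# Hypothetical counter names; the real names depend on the device and CLI.
WATCHED: Tuple[str, ...] = ("crc_errors", "discards", "pause_frames")


def rising_counters(before: Dict[str, int], after: Dict[str, int],
                    watched: Tuple[str, ...] = WATCHED) -> Dict[str, int]:
    """Return the watched counters that increased between two snapshots.

    Counters missing from a snapshot are treated as 0. Negative deltas
    (for example after a counter reset) are ignored.
    """
    deltas = {}
    for name in watched:
        delta = after.get(name, 0) - before.get(name, 0)
        if delta > 0:
            deltas[name] = delta
    return deltas


def suspect_layer(vmnic_deltas: Dict[str, int],
                  fabric_deltas: Dict[str, int],
                  symptoms_present: bool) -> str:
    """Map the observations to a next step (labels are illustrative)."""
    if not symptoms_present:
        return "none"
    if vmnic_deltas:
        # Assumption: errors visible on the vmnic point at the host side first.
        return "host"
    if fabric_deltas:
        # Clean vmnic counters plus rising FI/IOM counters: physical path.
        return "physical"
    # Symptoms persist but nothing is visible: escalate to Cisco Support.
    return "escalate"


if __name__ == "__main__":
    iom_before = {"crc_errors": 10, "discards": 0, "pause_frames": 5}
    iom_after = {"crc_errors": 25, "discards": 0, "pause_frames": 5}
    iom_deltas = rising_counters(iom_before, iom_after)
    print(iom_deltas, suspect_layer({}, iom_deltas, symptoms_present=True))
```

Comparing two snapshots rather than reading absolute values matters because interface counters are cumulative; only counters that keep increasing while the symptoms are present indicate an active problem.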

Additional Information

 

  • This behavior is expected when the failure occurs at the physical switching layer (IOM/FEX) before the frames reach the ESXi host's PCI bus. Frames dropped at that layer never arrive at the host, so ESXi has nothing to count.