Network connectivity loss on ESXi hosts using Static Port Channels during silent physical switch failure

Article ID: 432404

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

  • ESXi management network and hosted virtual machines become unreachable.
  • In the vCenter Server UI, hosts may appear in a "Not Responding" state.
  • Connectivity loss is partial and often correlates with IP address parity (e.g., hosts with odd IP addresses lose connectivity, while hosts with even IP addresses remain reachable).

Cause

This issue may occur when a physical switch in the network fabric stops processing traffic but maintains an active physical port link state (link-up).

When an ESXi host is configured with a static port channel (the "Route based on IP Hash" teaming policy), it relies entirely on the physical link status for uplink failure detection. Because static port channels lack a dynamic health-checking mechanism, the ESXi host is unaware that the upstream switch has stopped forwarding traffic. The host's load balancing algorithm continues to send traffic for the source/destination IP pairs that hash to the affected uplink to the unresponsive switch, creating a traffic black hole. Pairs that hash to the healthy uplink are unaffected, which is why the loss is partial.
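The parity pattern described under Issue/Introduction can be illustrated with a small model of IP hash uplink selection. This is a minimal sketch, assuming the simplified calculation described in KB 321396 (XOR of the source and destination IPv4 addresses, modulo the number of uplinks). The function name, the IP addresses, and the choice of which uplink leads to the failed switch are hypothetical and used for illustration only.

```python
import ipaddress


def ip_hash_uplink(src_ip: str, dst_ip: str, uplink_count: int) -> int:
    """Return the uplink index chosen by a simplified IP hash model.

    Assumption: hash = (source IP XOR destination IP) mod uplink count.
    """
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    return (src ^ dst) % uplink_count


# Hypothetical scenario: two uplinks, and uplink 0 connects to the
# switch that has silently stopped forwarding while keeping link-up.
FAILED_UPLINK = 0
client = "10.0.0.1"
hosts = ["10.0.0.10", "10.0.0.11", "10.0.0.12", "10.0.0.13"]

for host in hosts:
    uplink = ip_hash_uplink(host, client, uplink_count=2)
    state = "black-holed" if uplink == FAILED_UPLINK else "reachable"
    print(f"{host} -> {client}: uplink {uplink} ({state})")
```

In this model, with two uplinks, only the lowest bit of each address decides the result, so the odd-numbered hosts and the even-numbered hosts land on different uplinks for a given peer. Which group is affected depends on the peer address and on which uplink leads to the failed switch.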

Resolution

Work with the network team to identify and resolve the underlying physical network issue.

Long-Term Prevention: To prevent silent hardware failures from causing network outages, migrate the ESXi networking configuration from Static Port Channels to Link Aggregation Control Protocol (LACP).

  1. Implement a vSphere Distributed Switch (vDS), which is required for LACP support in vSphere.

  2. Configure Link Aggregation Groups (LAGs) with LACP on the vDS and the physical switch fabric. LACP exchanges LACP Data Units (LACPDUs) to actively negotiate and monitor link health. When LACPDUs stop arriving on an uplink, that uplink is automatically removed from the active forwarding path even if the physical link remains electrically active.
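The difference between the two detection models can be sketched as follows. This is a simplified illustration, not the actual LACP state machine: it assumes the failed switch also stops sending LACPDUs, and it uses the timeout values defined by the IEEE link aggregation standard (3 seconds for the fast rate, 90 seconds for the slow rate). The class, function, and uplink names are hypothetical.

```python
from dataclasses import dataclass

# LACP timeout values per the IEEE link aggregation standard.
SHORT_TIMEOUT_S = 3.0   # fast rate: LACPDU expected every 1 second
LONG_TIMEOUT_S = 90.0   # slow rate: LACPDU expected every 30 seconds


@dataclass
class LagMember:
    name: str
    link_up: bool
    last_lacpdu_rx: float  # time (seconds) the last LACPDU was received


def static_active(members):
    """Static port channel: only the physical link state is checked."""
    return [m.name for m in members if m.link_up]


def lacp_active(members, now, timeout=LONG_TIMEOUT_S):
    """LACP: the link must be up AND an LACPDU must have arrived recently."""
    return [
        m.name
        for m in members
        if m.link_up and (now - m.last_lacpdu_rx) <= timeout
    ]


# Hypothetical state: vmnic0 faces the silently failed switch, so its
# link is still up but no LACPDU has been received for 100 seconds.
members = [
    LagMember("vmnic0", link_up=True, last_lacpdu_rx=0.0),
    LagMember("vmnic1", link_up=True, last_lacpdu_rx=95.0),
]
now = 100.0

print("static:", static_active(members))
print("lacp:  ", lacp_active(members, now))
```

Under these assumptions the static model keeps both uplinks in the forwarding path, while the LACP model keeps only the uplink that is still receiving LACPDUs.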

Additional Information

Understanding IP Hash load balancing (KB 321396)
Host requirements for link aggregation (etherchannel, port channel, or LACP) in ESXi (KB 324555)
Enable EtherChannel / Link Aggregation Control Protocol (LACP) in vSphere (KB 321425)
Network Loss on ESXi During Physical Switch Reboot or Hardware Failure Without Link State Change