Tenant Virtual Machines (VMs) may experience a loss of North-South network connectivity during physical leaf switch maintenance or upgrades. While East-West connectivity typically remains functional, traffic destined for external networks is dropped.
This issue is specifically observed on ESXi hosts that use the icen driver for Intel NICs. During a switch upgrade, the driver fails to update the link status of the physical NIC (vmnic), so the host continues to send traffic out of a vmnic whose upstream leaf switch is unavailable. The issue has been seen with icen driver 1.14.2.0 on firmware 4.50, but may not be limited to these versions.
Symptoms:
WARNING: Heap: 3645: Heap pfHeap-icen already at its maximum size. Cannot expand
WARNING: icen: icen_GetLinkStatus:1268: XXXX:XX:00.0: Failed to get link state - status: ICE_ERR_NO_MEMORY
WARNING: icen: icen_CleanControlQ:6817: XXXX:XX:00.0: Failed to allocate heap for the Admin queue event
WARNING: icen: icen_LinkEvent:2626: XXXX:XX:00.0: Failed to get LLDP status from firmware, Status: Out of memory
VMware ESXi
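The warning signatures listed under Symptoms can be searched for in a saved copy of a host log. The following is a minimal Python sketch, not an official tool: the function name find_icen_heap_warnings and the idea of passing a log file path on the command line are assumptions made for illustration, and the match strings are copied from the warnings quoted in this article.

```python
import sys

# Substrings copied from the warnings quoted in this article.
ICEN_HEAP_SIGNATURES = (
    "Heap pfHeap-icen already at its maximum size",
    "Failed to get link state - status: ICE_ERR_NO_MEMORY",
    "Failed to allocate heap for the Admin queue event",
    "Failed to get LLDP status from firmware, Status: Out of memory",
)


def find_icen_heap_warnings(log_lines):
    """Return (line_number, signature) pairs for lines containing a known signature."""
    hits = []
    for number, line in enumerate(log_lines, start=1):
        for signature in ICEN_HEAP_SIGNATURES:
            if signature in line:
                hits.append((number, signature))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # The caller supplies the path to a saved log file (hypothetical usage).
    with open(sys.argv[1], errors="replace") as handle:
        for number, signature in find_icen_heap_warnings(handle):
            print(f"line {number}: {signature}")
```

Any hit suggests the host may be affected; an empty result only means none of these exact strings appear in the file that was scanned.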
The root cause is a memory leak within the icen driver's private heap (pfHeap-icen).
When the driver reaches its maximum heap size, it can no longer allocate memory for the Admin Queue or handle Link Events. Consequently, if a physical switch goes down (e.g., for an upgrade), the driver cannot process the interrupt or status change. Because the driver never "sees" the link go down, it does not notify the ESXi stack. The host continues to pin VM traffic to the vmnic connected to the unavailable leaf switch and does not fail over to the secondary physical adapter.
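The failure chain described above can be illustrated with a toy model. This sketch is not the icen driver's code or the ESXi teaming logic: ToyNicDriver, handle_link_event, and choose_uplink are hypothetical names, and the heap is reduced to a simple counter. It only shows why a dropped link event leaves the host with a stale "link up" view.

```python
from dataclasses import dataclass


@dataclass
class ToyNicDriver:
    """Hypothetical stand-in for a NIC driver with a fixed-size private heap."""
    heap_limit: int
    heap_used: int = 0
    reported_link_up: bool = True  # the link state the host stack believes

    def handle_link_event(self, link_up: bool, event_cost: int = 1) -> bool:
        """Try to process a link event; return True if the host was notified."""
        if self.heap_used + event_cost > self.heap_limit:
            # Allocation fails: the event is dropped and the reported state goes stale.
            return False
        self.reported_link_up = link_up
        return True


def choose_uplink(active: ToyNicDriver, standby: ToyNicDriver) -> str:
    """Pick an uplink using only the link state each driver has reported."""
    if active.reported_link_up:
        return "active"
    if standby.reported_link_up:
        return "standby"
    return "none"


healthy = ToyNicDriver(heap_limit=8, heap_used=2)
leaked = ToyNicDriver(heap_limit=8, heap_used=8)  # heap exhausted by a leak
standby = ToyNicDriver(heap_limit=8)

# The upstream switch goes down on both hosts.
healthy.handle_link_event(link_up=False)
leaked.handle_link_event(link_up=False)

print(choose_uplink(healthy, standby))  # standby
print(choose_uplink(leaked, standby))   # active
```

In the second case the physical link is down, but the model still selects the original uplink because the link-down event was never recorded.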
To resolve this issue, you must address the driver-level memory management failure:
Subscribe to this knowledge article to get updates on this issue.