NetIOC change leads to restart of all uplinks attached to VDS when EDP is enabled
Article ID: 428850

Updated On:

Products

VMware NSX, VMware vSphere ESXi

Issue/Introduction

  • Network I/O Control (NetIOC) settings are being changed on an ESXi host.
  • Enhanced Data Path (EDP), often referred to as the Enhanced Networking Stack (ENS), is enabled on the ESXi host.
  • NIC flaps are observed on the ESXi host.
  • TeamPolicyUpDelay is set to a value higher than the default, for instance 1800000 milliseconds.
  • A traffic outage is observed on the ESXi host.
  • If the ESXi host runs NSX Edge nodes, these NIC flaps may cause the Edges to lose their BGP neighborships.

Environment

  • VMware ESXi 8.x
  • VMware NSX 4.x

 

Cause

A NetIOC change is a VDS-wide configuration change that causes all uplinks on the VDS to be closed (link down) and re-opened (link up) so that the new settings can be applied.

The uplinks on the VDS are closed and re-opened after a NetIOC change only when EDP is enabled.

However, this does not result in a traffic interruption in all environments. In most environments TeamPolicyUpDelay is set to a few seconds and shotgun mode is not enabled. Because the close and re-open operation is performed on the uplinks sequentially, the other uplinks in the team are still up while one uplink is being restarted.

For instance, consider a scenario where vmnic1 and vmnic2 undergo sequential 'close and re-open' operations. Ideally, by the time vmnic2 begins its restart, vmnic1 should already be back online and forwarding traffic.

However, if the TeamPolicyUpDelay parameter is set to a high value, such as 1,800,000 milliseconds (30 minutes), vmnic1 remains in a blocked state for the duration of that delay after restarting. vmnic2 is then closed while vmnic1 is still blocked, so no uplink in the team forwards traffic, and a significant traffic disruption lasts until the delay on vmnic1 expires.
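The effect of the up-delay on the sequential restart can be illustrated with a small model. This is only a sketch: the function name `outage_ms`, the 1000 ms link-down time, and the 5000 ms spacing between uplink operations are hypothetical values chosen for illustration, not figures measured on ESXi. Only the two TeamPolicyUpDelay values come from this article.

```python
def outage_ms(num_uplinks, up_delay_ms, down_ms=1000, step_ms=5000):
    """Return how long (in ms) no uplink in the team forwards traffic.

    down_ms and step_ms are assumed timings, not ESXi defaults.
    """
    # Uplink i is closed at i * step_ms. It forwards again only after the
    # link is back up (down_ms) and the up-delay has expired.
    blocked = [(i * step_ms, i * step_ms + down_ms + up_delay_ms)
               for i in range(num_uplinks)]
    # All uplinks are blocked at once only where every interval overlaps.
    latest_start = max(start for start, _ in blocked)
    earliest_end = min(end for _, end in blocked)
    return max(0, earliest_end - latest_start)


for delay in (100, 1_800_000):
    lost = outage_ms(num_uplinks=2, up_delay_ms=delay)
    print(f"up-delay {delay} ms -> outage {lost} ms ({lost / 60000:.1f} min)")
```

Under these assumed timings, a two-uplink team sees no overlap with a 100 ms delay, while a 1,800,000 ms delay leaves both uplinks blocked for almost the full 30 minutes.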

 

Resolution

The uplinks being closed and then re-opened after a NetIOC change is expected behavior on an EDP-enabled host.

Workaround

  1. Perform the NetIOC change during a planned maintenance window.
  2. Alternatively, set TeamPolicyUpDelay to the default value (100 ms) before the NetIOC change and restore the previous value afterwards. However, this may not scale in all environments.
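
The lower, change, restore sequence from the second workaround could be scripted per host. The sketch below only builds the command lines and does not run them. It assumes the setting is exposed as the advanced option `/Net/TeamPolicyUpDelay` and can be read and set with `esxcli system settings advanced`; confirm both on the target host first. The helper name `workaround_commands` is hypothetical.

```python
OPTION = "/Net/TeamPolicyUpDelay"  # assumed advanced-option path; verify on the host
DEFAULT_MS = 100                   # default value cited in this article


def workaround_commands(current_ms):
    """Return the command lines for the lower / change / restore sequence."""
    base = "esxcli system settings advanced"
    return [
        f"{base} list -o {OPTION}",                 # record the current value
        f"{base} set -o {OPTION} -i {DEFAULT_MS}",  # lower before the NetIOC change
        "# ... perform the NetIOC change here ...",
        f"{base} set -o {OPTION} -i {current_ms}",  # restore the previous value
    ]


for line in workaround_commands(1_800_000):
    print(line)
```

Printing the commands instead of executing them lets the sequence be reviewed per host before anything is changed.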