Transport Node config update or sync to ESX hosts configured with Multi TEP HA caused NetIOC reconfiguration leading to network downtime which further triggered TEP HA actions



Article ID: 382209


Updated On:

Products

VMware NSX

Issue/Introduction

When NSX Manager pushes Transport Node configuration to an ESXi cluster or to hosts where Multi TEP HA is configured and NetIOC is enabled, the push can cause network disturbance and downtime.

Symptoms observed:

  • VTEP interfaces flap on several ESX hosts
  • Several vmkpings between the TEP interfaces of ESX hosts are missed
  • Multiple ESXi hosts show "Degraded" status
  • Multiple worker nodes deployed on ESXi hosts show a "Not Ready" state

 

 

Environment

VMware NSX 4.1.2

Cause

The main cause of this issue is as follows:

Whenever the NSX Transport Node configuration is re-applied to ESXi hosts from NSX Manager, the lcore configuration is also re-applied on those hosts. This lcore reconfiguration triggers the ENS uplink detach and attach operations twice. These operations can take a few seconds, leading to network downtime. TEP HA then reacts to that downtime, causing further disturbance in the network.

The issue is observed when a Transport Node Profile (TNP) or Multi TEP HA profile is reconfigured on an NSX-T setup where the ESXi hosts are prepared with ENS and NetIOC is configured on those hosts.

Note: The configured lcores themselves may not have changed, but the lcore configuration is re-applied every time NSX Manager pushes Transport Node configuration to an ESXi cluster or host.
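
The conditions above can be summarized as a small decision check. The following Python sketch is illustrative only: the HostState record, its field names, and the returned labels are hypothetical and do not correspond to any NSX or vSphere API. It assumes you already know, per host, whether ENS, NetIOC, and Multi TEP HA are in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostState:
    """Hypothetical per-host record; field names are illustrative, not an NSX API."""
    name: str
    ens_enabled: bool      # host is prepared with ENS
    netioc_enabled: bool   # NetIOC is configured on the host
    multi_tep_ha: bool     # a Multi TEP HA profile is applied


def push_risk(host: HostState) -> str:
    """Classify what a Transport Node config push may cause on this host.

    Per the cause described above, the downtime needs ENS and NetIOC together;
    Multi TEP HA then reacts to that downtime.
    """
    if not (host.ens_enabled and host.netioc_enabled):
        return "not-affected"
    if host.multi_tep_ha:
        return "downtime-and-tep-ha-reaction"
    return "downtime"


def hosts_at_risk(hosts):
    """Return the names of hosts where a config push may cause downtime."""
    return [h.name for h in hosts if push_risk(h) != "not-affected"]
```

A check of this kind could be run against an inventory export before a TNP or Multi TEP HA profile change, to see which hosts match the conditions described in this article.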

Resolution

To avoid this network disturbance or downtime, disable NetIOC. Doing so avoids the potential network downtime caused by an NSX config update or sync on an ESXi cluster or host.

 

Additional Information

Upcoming NSX releases will address the unnecessary lcore reconfiguration that occurs when a Transport Node profile or Multi TEP HA profile is re-applied on ESXi hosts with NetIOC and ENS enabled.

That fix will cover use cases where NetIOC must be enabled together with ENS to enhance network performance.