Error: "TEP HA activated" and overlay network outage

Article ID: 427870

Updated On:

Products

VMware NSX

Issue/Introduction

After removing and re-adding a physical uplink (vmnic) to an existing NSX Transport Node (e.g., during troubleshooting or host profile remediation), overlay workloads experience a brief network outage. The NSX Manager reports the following alarm:

TEP:vmkX of VDS:<DVS-Name> at Transport node:<Node UUID>. Overlay workloads using this TEP will face network outage.
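
When this alarm has to be triaged across many hosts, it can help to pull the TEP interface, VDS name, and transport node ID out of the alarm text. The following is a minimal sketch, assuming the message follows exactly the template shown above and that the transport node ID is a UUID. The function name `parse_tep_alarm` and the sample values are hypothetical and are not part of any NSX API.

```python
import re
from typing import Dict, Optional

# Assumption: this pattern mirrors the alarm template quoted in this article.
# The node ID is assumed to be a UUID (hex digits and dashes).
ALARM_RE = re.compile(
    r"TEP:(?P<tep>\S+) of VDS:(?P<vds>.+?) at Transport node:(?P<node>[0-9A-Fa-f-]+)"
)


def parse_tep_alarm(message: str) -> Optional[Dict[str, str]]:
    """Return the TEP vmk, VDS name, and node ID, or None if the text does not match."""
    match = ALARM_RE.search(message)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Hypothetical sample message built from the template above.
    sample = (
        "TEP:vmk10 of VDS:DVS-Prod at Transport node:"
        "1b2c3d4e-0000-4000-8000-abcdefabcdef. "
        "Overlay workloads using this TEP will face network outage."
    )
    print(parse_tep_alarm(sample))
```

The extracted vmk name identifies which TEP interface on the host to check first.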

 

Environment

VMware NSX 4.x

Cause

This issue is caused by a transient control plane mismatch or a stale MAC/ARP entry on the upstream physical switch. When an active vmnic is removed from the NSX switch and then re-added, the physical switch may retain a stale MAC table entry or may not immediately resume forwarding for the TEP interface, so BFD packets are dropped. NSX interprets the missing BFD packets as a dead path and triggers a TEP HA failover.

Resolution

Reboot the affected ESXi host.

The reboot takes the link down on the connected physical switch ports, which forces the switch to flush its stale MAC/ARP entries for the host. When the host boots, the NSX host agents re-initialize and BFD sessions are re-established cleanly on all uplinks.

 

If a host reboot is not immediately possible, you can attempt to clear the transient state manually:

  1. Log in to the ESXi host via SSH as root.
  2. Restart the NSX operations agent (nsx-opsagent) to force a BFD re-sync: /etc/init.d/nsx-opsagent restart
  3. Engage the network team to manually clear the ARP/MAC address table for the specific switch port connected to the re-added vmnic.
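
After the steps above, TEP reachability can be checked from the ESXi host with vmkping. This check is not part of the original procedure. The sketch below only builds the command line, assuming an IPv4 TEP in the vxlan netstack and an overlay MTU of 1600. The function name `build_vmkping_args`, the vmk name, and the remote TEP address are hypothetical placeholders.

```python
from typing import List

# 20-byte IPv4 header + 8-byte ICMP header (assumes IPv4 TEPs).
IP_ICMP_OVERHEAD = 28


def build_vmkping_args(
    vmk: str, remote_tep_ip: str, mtu: int = 1600, count: int = 3
) -> List[str]:
    """Build a vmkping command that tests TEP-to-TEP reachability at full MTU."""
    if mtu <= IP_ICMP_OVERHEAD:
        raise ValueError("mtu must be larger than the IP and ICMP headers")
    return [
        "vmkping",
        "++netstack=vxlan",  # TEP vmkernel interfaces live in the vxlan netstack
        "-I", vmk,           # source TEP interface, e.g. the vmkX from the alarm
        "-d",                # set the don't-fragment bit
        "-s", str(mtu - IP_ICMP_OVERHEAD),  # ICMP payload that fills the MTU
        "-c", str(count),    # number of echo requests
        remote_tep_ip,
    ]


if __name__ == "__main__":
    # Hypothetical values: substitute the real TEP vmk and a remote TEP IP.
    print(" ".join(build_vmkping_args("vmk10", "192.0.2.20")))
```

Run the resulting command on the affected host once for each TEP interface. Replies without fragmentation errors suggest the physical path is forwarding again at the expected MTU.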