Title: Alarm for VTEP HA Activated.
Event ID: tep_health.tep_ha_activated
Alarm Description
In the affected host's /var/log/nsx-syslog.log, we see messages similar to the following:
Wa(180) cfgAgent[2147528]: NSX 2147528 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" s2comp="nsx-monitoring" entId="########-####-####-####-########4d11" tid="F3DF7700" level="warn" eventState="On" eventFeatureName="tep_health" eventSev="warning" eventType="faulty_tep"] TEP:vmk11 of VDS:vDS-Name at Transport node:########-####-####-####-########4d89. Overlay workloads using this TEP will face network outage. Reason: all BFD tunnels from TEP are down.
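To find these events quickly in a large log, the key="value" fields and the trailing message can be pulled apart with a few regular expressions. The following is a minimal sketch, assuming the log lines follow the layout of the sample above; the function name parse_tep_event and the returned dictionary keys are hypothetical and not part of any NSX tooling.

```python
import re

# key="value" pairs inside the [nsx@6876 ...] structured-data block
KV_PATTERN = re.compile(r'(\w+)="([^"]*)"')
# "TEP:<vmk> of VDS:<name>" in the free-text part of the message
TEP_PATTERN = re.compile(r'\bTEP:(\S+) of VDS:(\S+)')
# Trailing "Reason: ..." sentence, without the final period
REASON_PATTERN = re.compile(r'Reason:\s*(.+?)\.?\s*$')


def parse_tep_event(line):
    """Return a dict describing a tep_health event, or None if the line is not one.

    Assumes the layout shown in the sample log line; other layouts return None
    or leave individual fields set to None.
    """
    start = line.find("[nsx@")
    if start == -1:
        return None
    end = line.find("]", start)
    if end == -1:
        return None

    fields = dict(KV_PATTERN.findall(line[start:end]))
    if fields.get("eventFeatureName") != "tep_health":
        return None

    message = line[end + 1:].strip()
    tep = TEP_PATTERN.search(message)
    reason = REASON_PATTERN.search(message)
    return {
        "event_type": fields.get("eventType"),
        "event_state": fields.get("eventState"),
        "severity": fields.get("eventSev"),
        "tep": tep.group(1) if tep else None,
        "vds": tep.group(2) if tep else None,
        "reason": reason.group(1) if reason else None,
    }
```

Feeding each line of nsx-syslog.log through parse_tep_event and keeping the non-None results gives a list of affected TEPs together with the reported reason.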
VMware NSX
VMware NSX-T Data Center
Steps to resolve
For 4.1.0 and higher
Recommended Action:
If faulty_tep alarm shows reason for failure as:
After fixing the underlay issue, and once manual or automatic recovery has completed for the 'bfd down' case, check the local VTEP state with the following API:
GET: https://<nsx-manager-ip>/api/v1/transport-nodes/<node-id>/network/interfaces?source=realtime
Note: You should see local VTEP state as NORMAL.
Sample output:
{
    interfaceId: vmk10,
    linkStatus: UP,
    adminStatus: UP,
    mtu: 1600,
    interfaceAlias: [{
        broadcastAddress: 192.168.1.255,
        ipAddress: {
            ipv4: 2239043120
        },
        ipConfiguration: STATIC,
        netmask: 255.255.255.0,
        macAddress: 00:50:##:##:##:a6
    }],
    state: NORMAL
}
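The check above can also be scripted. The following is a minimal sketch using only the Python standard library. It makes several assumptions: that HTTP basic authentication is used, that the API returns valid JSON (the sample output above is shown without quotes), and that the interface objects arrive either as a single object, as a list, or as a list under a "results" key. The function names are hypothetical.

```python
import base64
import json
import ssl
import urllib.request


def interfaces_url(manager, node_id):
    """Build the realtime interface-status URL for one transport node."""
    return (f"https://{manager}/api/v1/transport-nodes/{node_id}"
            "/network/interfaces?source=realtime")


def fetch_interfaces(manager, node_id, user, password, verify_tls=True):
    """GET the interface list and return the decoded JSON payload.

    Assumption: basic authentication. Disable TLS verification only in a lab.
    """
    request = urllib.request.Request(interfaces_url(manager, node_id))
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/json")

    context = ssl.create_default_context()
    if not verify_tls:
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE

    with urllib.request.urlopen(request, context=context, timeout=30) as resp:
        return json.load(resp)


def tep_states(payload):
    """Return (interfaceId, state) pairs for every interface reporting a state.

    Assumption: the payload is one interface object, a list of them,
    or a dict holding the list under "results".
    """
    if isinstance(payload, dict) and "results" in payload:
        items = payload["results"]
    elif isinstance(payload, list):
        items = payload
    else:
        items = [payload]
    return [(item.get("interfaceId"), item["state"])
            for item in items
            if isinstance(item, dict) and "state" in item]


def all_teps_normal(payload):
    """True only if at least one state was reported and all are NORMAL."""
    states = tep_states(payload)
    return bool(states) and all(state == "NORMAL" for _, state in states)
```

A typical use would be to call fetch_interfaces with the manager address and transport node ID, then pass the result to all_teps_normal. Any interface listed by tep_states with a state other than NORMAL needs further investigation.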
Maintenance window required for remediation? Yes