VTEP faulty Alarm

Article ID: 322448

Updated On:

Products

VMware NSX

Issue/Introduction

Title: Alarm for Faulty VTEP.
Event ID: tep_health.faulty_tep
Added in release: 4.1.0/ M22
Alarm Description

Purpose: Reports a faulty VTEP, caused either by the VTEP having no IP address or by all BFD sessions from the VTEP being down.
Impact: Overlay VMs using this local VTEP experience a network outage.

Resolution:
1. If all BFD sessions from a local VTEP are down: check the underlay configuration for packet forwarding issues at the TOR and at every next hop involved in underlay routing.
2. If the local VTEP has no IP: if the provisioning type selected for the local VTEP is DHCP, check that the DHCP server is configured correctly and that its address pool is not exhausted.
3. Check for pNIC firmware issues and upgrade the pNIC firmware to the latest version.
After fixing the underlay issue, and once manual or automatic recovery is complete for the 'bfd down' case, check the local VTEP state with the following API:
GET: https://'nsx-manager-ip'/api/v1/transport-nodes/'node-id'/network/interfaces?source=realtime
The response should show the local VTEP state as NORMAL.
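The GET call above could be scripted roughly as follows. This is a minimal sketch, not an official client: it assumes HTTP basic authentication and a JSON response body. The function names `build_interfaces_request` and `fetch_interfaces`, along with the host, node ID, and credentials shown in the usage note, are hypothetical placeholders.

```python
import base64
import json
import ssl
import urllib.parse
import urllib.request


def build_interfaces_request(manager, node_id, username, password):
    """Build (but do not send) the realtime interfaces query for one transport node."""
    # Path and query string come from the article; node_id is URL-escaped defensively.
    path = "/api/v1/transport-nodes/{}/network/interfaces".format(
        urllib.parse.quote(node_id, safe="")
    )
    url = "https://{}{}?source=realtime".format(manager, path)
    # Assumption: basic authentication is acceptable in this environment.
    token = base64.b64encode("{}:{}".format(username, password).encode()).decode()
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Authorization": "Basic " + token, "Accept": "application/json"},
    )


def fetch_interfaces(request, cafile=None, timeout=10):
    """Send the request and return the decoded JSON body."""
    # cafile may point at the CA bundle that signed the manager certificate.
    context = ssl.create_default_context(cafile=cafile)
    with urllib.request.urlopen(request, timeout=timeout, context=context) as resp:
        return json.load(resp)
```

A caller might run `fetch_interfaces(build_interfaces_request("nsx.example.com", "node-1", "admin", "secret"))` after recovery and inspect the returned state field.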
Sample output:
{
    interfaceId: vmk10,
    linkStatus: UP,
    adminStatus: UP,
    mtu: 1600,
    interfaceAlias: [{
        broadcastAddress: 133.117.22.255,
        ipAddress: {
            ipv4: 2239043120
        },
        ipConfiguration: STATIC,
        netmask: 255.255.255.0,
        macAddress: 00:50:56:66:67:a6
    }],
    state: NORMAL
}
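The sample output can be interpreted with a short script like the one below. It is a sketch under stated assumptions: the interface object is taken to be a dictionary with the field names shown in the sample, and the `ipv4` value is taken to be the address packed as a 32-bit integer in network byte order, which agrees with the broadcast address in the sample. The helper names `ipv4_from_int` and `vtep_report` are hypothetical.

```python
import ipaddress


def ipv4_from_int(value):
    """Convert a 32-bit integer (network byte order, assumed) to dotted-quad form."""
    return str(ipaddress.IPv4Address(value))


def vtep_report(interface):
    """Summarize one interface object shaped like the sample output."""
    addresses = [
        ipv4_from_int(alias["ipAddress"]["ipv4"])
        for alias in interface.get("interfaceAlias", [])
        if "ipv4" in alias.get("ipAddress", {})
    ]
    return {
        "interface": interface.get("interfaceId"),
        "addresses": addresses,
        # The article states that a recovered local VTEP reports state NORMAL.
        "healthy": interface.get("state") == "NORMAL",
    }


# Values copied from the sample output in this article.
sample = {
    "interfaceId": "vmk10",
    "linkStatus": "UP",
    "adminStatus": "UP",
    "mtu": 1600,
    "interfaceAlias": [{
        "broadcastAddress": "133.117.22.255",
        "ipAddress": {"ipv4": 2239043120},
        "ipConfiguration": "STATIC",
        "netmask": "255.255.255.0",
        "macAddress": "00:50:56:66:67:a6",
    }],
    "state": "NORMAL",
}

print(vtep_report(sample))
# {'interface': 'vmk10', 'addresses': ['133.117.22.48'], 'healthy': True}
```

Here 2239043120 decodes to 133.117.22.48, which falls inside the 133.117.22.0/24 network implied by the netmask and broadcast address.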

Workaround:
Enable the VTEP HA feature to fail over VMs to a healthy VTEP.
Maintenance window required for remediation?
Yes
API reference:
https://developer.vmware.com/apis/1583/nsx-t

Environment

VMware NSX-T Data Center

Additional Information

API Guide: https://developer.vmware.com/apis/1583/nsx-t
Admin Guide: https://docs.vmware.com/en/VMware-NSX/4.1/administration/GUID-72C2F2B2-7DC6-49A2-AD74-2FBAC93E3FAC.html
