VTEP Autorecover failure Alarm

Article ID: 322536

Updated On:

Products

VMware NSX Networking

Issue/Introduction

Title: Alarm for Autorecover failure of VTEP.
Event ID: tep_health.tep_autorecover_failure
Added in release: 4.1.0/ M22
Alarm Description
  • Purpose: This alarm indicates that an autorecovery was attempted on a faulty VTEP and failed because all BFD sessions from that local VTEP are still down.
  • Impact: Overlay VMs using this local VTEP will experience a network outage.
Resolution:
1. Check the underlay configuration for packet forwarding issues at the TOR and at every next hop involved in underlay routing.
2. Check for pNIC firmware issues and upgrade the firmware to the latest version.
After fixing the underlay issue, wait for the next autorecovery attempt or invoke a manual recovery through the API:
POST https://'nsx-mgr'/policy/api/v1/infra/sites/'site-id'/enforcement-points/'enforcementpoint-id'/host-transport-nodes/'host-transport-node-id'/vteps/actions
{
  "type": "TransportNodeVTEPRecoveryRequest",
  "device_name": "vmk10"
}
Then check the local VTEP state through the API:
GET https://'nsx-manager'/api/v1/transport-nodes/'node-id'/network/interfaces?source=realtime
The local VTEP state should be NORMAL.
Sample output:
{
  "interfaceId": "vmk10",
  "linkStatus": "UP",
  "adminStatus": "UP",
  "mtu": 1600,
  "interfaceAlias": [{
    "broadcastAddress": "133.117.22.255",
    "ipAddress": {
      "ipv4": 2239043120
    },
    "ipConfiguration": "STATIC",
    "netmask": "255.255.255.0",
    "macAddress": "00:50:56:66:67:a6"
  }],
  "state": "NORMAL"
}
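
The manual recovery call and the state check above can be sketched in Python as follows. This is a minimal illustration, not an official client. The function names (build_recovery_request, vtep_state, interface_ipv4) and the example host and IDs are hypothetical. It assumes the interface object has the shape of the sample output (the real response may wrap interfaces in a list) and that the ipv4 field is a 32-bit integer form of the address. Authentication and TLS settings are environment-specific and are not shown.

```python
import ipaddress
import json
import urllib.request


def build_recovery_request(nsx_mgr, site_id, enforcementpoint_id,
                           host_transport_node_id, device_name="vmk10"):
    """Build (but do not send) the manual VTEP recovery POST from this article."""
    url = (
        f"https://{nsx_mgr}/policy/api/v1/infra/sites/{site_id}"
        f"/enforcement-points/{enforcementpoint_id}"
        f"/host-transport-nodes/{host_transport_node_id}/vteps/actions"
    )
    # Body fields are taken from the article's request example.
    body = {
        "type": "TransportNodeVTEPRecoveryRequest",
        "device_name": device_name,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def vtep_state(interface):
    """Return the state of one interface object, or UNKNOWN if absent."""
    return interface.get("state", "UNKNOWN")


def interface_ipv4(interface):
    """Decode integer ipv4 values into dotted-quad strings.

    Assumption: the integer is the 32-bit numeric form of the address.
    """
    return [
        str(ipaddress.ip_address(alias["ipAddress"]["ipv4"]))
        for alias in interface.get("interfaceAlias", [])
        if "ipv4" in alias.get("ipAddress", {})
    ]


# The sample output from this article, written as a Python dict.
SAMPLE = {
    "interfaceId": "vmk10",
    "linkStatus": "UP",
    "adminStatus": "UP",
    "mtu": 1600,
    "interfaceAlias": [{
        "broadcastAddress": "133.117.22.255",
        "ipAddress": {"ipv4": 2239043120},
        "ipConfiguration": "STATIC",
        "netmask": "255.255.255.0",
        "macAddress": "00:50:56:66:67:a6",
    }],
    "state": "NORMAL",
}

if __name__ == "__main__":
    # Placeholder host and IDs; substitute real values.
    req = build_recovery_request("nsx.example.com", "default", "default", "node-1")
    print(req.get_method(), req.full_url)
    print(vtep_state(SAMPLE))       # NORMAL
    print(interface_ipv4(SAMPLE))   # ['133.117.22.48']
```

Passing the request object to urllib.request.urlopen would send it. The decoded address 133.117.22.48 falls in the same /24 as the sample's broadcast address, which is consistent with the integer interpretation assumed here.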

Environment

VMware NSX-T Data Center

Additional Information


API Guide: https://developer.vmware.com/apis/1583/nsx-t
Admin Guide: https://docs.vmware.com/en/VMware-NSX/4.1/administration/GUID-72C2F2B2-7DC6-49A2-AD74-2FBAC93E3FAC.html