VPN session down logs, including the down reason, can be seen in /var/log/syslog*.
For versions before 9.0, the down reason is either “Peer not reachable” or “Local Endpoint not bound to interface”.
For versions 9.0 onwards, the down reason is either “Peer not responding” or “Local Endpoint not bound to interface”.
Datapath traffic over the VPN session on the Tier1 SR is impacted. Because the session is down, traffic for the subnets configured on the VPN session is affected.
Log location:
Edge Logs:
- /var/log/syslog*
- /var/log/nsx-event.log*
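The down reasons quoted above can be searched for in these files. The following is a minimal Python sketch, not an NSX tool: the helper names `find_down_reason` and `scan_logs` are hypothetical, the matching relies only on the reason strings quoted in this article (the rest of the log line format is not assumed), and treating `.gz` files as gzip-compressed rotated logs is an assumption.

```python
import glob
import gzip
from collections import Counter

# Down-reason strings as quoted in this article. Nothing else about the
# log line format is assumed.
DOWN_REASONS = (
    "Peer not reachable",                     # versions before 9.0
    "Peer not responding",                    # versions 9.0 onwards
    "Local Endpoint not bound to interface",  # all affected versions
)


def find_down_reason(line):
    """Return the first known down reason contained in the line, or None."""
    lowered = line.lower()
    for reason in DOWN_REASONS:
        if reason.lower() in lowered:
            return reason
    return None


def scan_logs(pattern="/var/log/syslog*"):
    """Count lines containing each down reason in files matching the glob."""
    counts = Counter()
    for path in sorted(glob.glob(pattern)):
        # Assumption: rotated logs ending in .gz are gzip-compressed.
        opener = gzip.open if path.endswith(".gz") else open
        try:
            with opener(path, "rt", errors="replace") as handle:
                for line in handle:
                    reason = find_down_reason(line)
                    if reason is not None:
                        counts[reason] += 1
        except (OSError, EOFError):
            # Unreadable or truncated file: skip it and continue.
            continue
    return counts
```

Calling `scan_logs()` on an Edge node returns a per-reason line count for /var/log/syslog*. Passing `"/var/log/nsx-event.log*"` scans the event log instead, on the assumption that it carries the same reason strings.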
VMware NSX
The issue is not always reproducible, but the following steps can lead to it:
- Assume Edge1 and Edge2 are in a cluster, and a Tier1 VPN service and session are configured.
- The Tier1 “Auto Allocate Edges” setting is “No”, and the Tier1 preferred Edge list contains Edge1 and Edge2.
- Tier1 is active on Edge1 and the VPN session is UP. Two cases are then possible.
Failover mode is Preemptive:
- Remove Edge1 from the preferred Edge list of the Tier1 and save the config. A failover happens and Tier1 becomes active on Edge2 with the VPN session UP.
- Add Edge1 back to the preferred Edge list and save the config. Because the mode is Preemptive, a failover happens again and Tier1 becomes active on Edge1.
- At this point the issue may occur, and the VPN session goes down with the reason “Local Endpoint not bound to interface”.
Failover mode is Non-Preemptive:
- Remove Edge1 from the preferred Edge list of the Tier1 and save the config. A failover happens and Tier1 becomes active on Edge2 with the VPN session UP.
- Add Edge1 back to the preferred Edge list and save the config. Because the mode is Non-Preemptive, no failover happens and Tier1 remains standby on Edge1.
- The issue may occur the next time a failover happens for any reason and Tier1 becomes active on Edge1. The VPN session can then go down with the reason “Local Endpoint not bound to interface”.
Workaround:
In the scenario above, remove Edge1 from the preferred Edge list of the Tier1 and add it back, making sure that Tier1 is standby on Edge1 when doing so.
Versions where this is a known issue:
Up to 9.1