This article provides troubleshooting details for IPsec VPN on VMware NSX-T. IPsec VPN services are critical for establishing secure connectivity between NSX-T environments and remote sites.
Functional Prerequisites:
High Availability (HA) Mode: The Gateway must be configured in Active/Standby HA mode. Active/Active is not supported for IPsec VPN services.
VMware NSX
Policy-Based VPNs (PBVPN) tunnel traffic based on the configured local and remote networks.
Route-Based VPNs (RBVPN) use the forwarding table to identify traffic to be sent through the IPsec tunnel. The forwarding entry can be based on static routes or learned dynamically over BGP.
Configuration: Any forwarding entry configured for a Virtual Tunnel Interface (VTI) uses the RBVPN.
Troubleshooting Focus: Check VTI IP connectivity and BGP neighbor states (if dynamic routing is configured).
To check VTI IP connectivity, ping the remote VTI IP as follows:
nsx-edge(tier0_sr[1])> ping X.X.X.X source Y.Y.Y.Y
PING X.X.X.X (169.2.2.3) from Y.Y.Y.Y: 56 data bytes
64 bytes from X.X.X.X: icmp_seq=0 ttl=64 time=5.146 ms
64 bytes from X.X.X.X: icmp_seq=1 ttl=64 time=3.964 ms
64 bytes from X.X.X.X: icmp_seq=2 ttl=64 time=3.747 ms
64 bytes from X.X.X.X: icmp_seq=3 ttl=64 time=4.235 ms
64 bytes from X.X.X.X: icmp_seq=4 ttl=64 time=3.692 ms
^C
--- X.X.X.X ping statistics ---
6 packets transmitted, 5 packets received, 16.7% packet loss
round-trip min/avg/max/stddev = 3.692/4.157/5.146/0.530 ms
nsx-edge(tier0_sr[1])>
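When the ping check is repeated across several tunnels, the summary line can be evaluated with a short script. The following is a minimal sketch in Python, assuming the summary line has the "packets transmitted, packets received, packet loss" wording shown above. The function names and the 50% loss threshold are hypothetical choices, not NSX defaults.

```python
import re

# Matches the summary line printed at the end of a ping run, for example:
# "6 packets transmitted, 5 packets received, 16.7% packet loss"
SUMMARY_RE = re.compile(
    r"(\d+) packets transmitted, (\d+) packets received, ([\d.]+)% packet loss"
)


def parse_ping_summary(output):
    """Return sent/received counts and the loss percentage from ping output."""
    match = SUMMARY_RE.search(output)
    if match is None:
        raise ValueError("no ping summary line found")
    return {
        "sent": int(match.group(1)),
        "received": int(match.group(2)),
        "loss_pct": float(match.group(3)),
    }


def vti_reachable(output, max_loss_pct=50.0):
    """Treat the remote VTI as reachable if replies arrived and loss is tolerable.

    max_loss_pct is an illustrative threshold (assumption), not an NSX setting.
    """
    summary = parse_ping_summary(output)
    return summary["received"] > 0 and summary["loss_pct"] <= max_loss_pct
```

Passing the captured CLI output to `vti_reachable` gives a quick yes/no answer; a result of `False` suggests checking the IPsec session state before looking at BGP.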
To check BGP neighbor states:
nsx-edge> get gateways
Gateway
UUID                                  VRF  Gateway-ID  Name             Type                          Ports  Neighbors
736a80e3-####-####-####-bb##########  0    0                            TUNNEL                        3      2/5000
1999abc1-####-####--9b##############  1    4           SR-T0-Server-A   SERVICE_ROUTER_TIER0          9      1/50000
1bd3cb82-####-####--82##############  3    2           DR-T0-Server-A   DISTRIBUTED_ROUTER_TIER0      6      2/50000
1298f615-####-####--11##############  5    9           SR-T1-Server-B   SERVICE_ROUTER_TIER1          7      2/50000
d955c6ca-####-####--2c##############  6    8           DR-T1-Server-B   DISTRIBUTED_ROUTER_TIER1      4      0/50000
59ad774a-####-####--6e##############  7    16          SR-VRF-test_vrf  VRF_SERVICE_ROUTER_TIER0      4      0/50000
b08bac8e-####-####--28##############  8    14          DR-VRF-test_vrf  VRF_DISTRIBUTED_ROUTER_TIER0  3      0/50000
nsx-edge> vrf 1
nsx-edge(tier0_sr[1])> get bgp neighbor summary
BFD States: NC - Not configured, DC - Disconnected
DW - Down, IN - Init, UP - Up
BGP Peer Type: * - Dynamic
BGP summary information for VRF default for address-family: ipv4Unicast
Router ID: A.A.A.A  Local AS: ####
Neighbor  AS    State  Up/DownTime  BFD  InMsgs  OutMsgs  InPfx  OutPfx
B.B.B.B   ####  Estab  4d06h46m     NC   531919  532064   5      0
C.C.C.C   ####  Estab  00:01:26     NC   366     454      7      10
nsx-edge(tier0_sr[1])>
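To spot neighbors that are not established without reading the table by eye, the summary can be parsed with a few lines of Python. This is a minimal sketch, assuming the CLI output has one neighbor per line under the "Neighbor AS State ..." header and that an established session is reported as "Estab", as in the sample above. The function names are hypothetical.

```python
COLUMNS = ["neighbor", "as", "state", "uptime", "bfd",
           "in_msgs", "out_msgs", "in_pfx", "out_pfx"]


def parse_bgp_neighbors(output):
    """Return one dict per neighbor row found after the table header."""
    neighbors = []
    in_table = False
    for line in output.splitlines():
        fields = line.split()
        if fields[:3] == ["Neighbor", "AS", "State"]:
            in_table = True  # neighbor rows start on the next line
            continue
        # Only lines with exactly nine fields are treated as neighbor rows;
        # prompts and blank lines are skipped.
        if in_table and len(fields) == len(COLUMNS):
            neighbors.append(dict(zip(COLUMNS, fields)))
    return neighbors


def not_established(neighbors):
    """List neighbor addresses whose State column is not 'Estab'."""
    return [n["neighbor"] for n in neighbors if n["state"] != "Estab"]
```

An empty list from `not_established` means every neighbor in the table is established; any address it returns is a candidate for the VTI connectivity check above.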
Each VPN session has a Download Config feature that helps with basic configuration troubleshooting. It effectively generates a "cheat sheet" of exactly what the remote device must be configured with to bring up the VPN tunnel with the NSX-T Edge.
This is particularly helpful for down reasons that stem from a configuration mismatch between the two peers.
Accessing the Feature:
Navigate to Networking > VPN > IPsec Sessions.
Select the specific VPN Session.
Click the Download Config button.
This downloads a text file containing the configuration parameters for the peer device.
Here are some specific scenarios along with checklist details to help troubleshoot.
If the session status is "Down," the IKE (Phase 1) negotiation has failed completely.
Checklist for specific down reasons:
1. Peer not responding:
Verify basic reachability (ping) between the Local Endpoint IP and the Remote Gateway IP. Ensure that UDP ports 500 (IKE) and 4500 (NAT-T) and IP protocol 50 (ESP) are open in the underlay (physical) firewalls.
2. No proposal chosen / config mismatch:
Verify that the Pre-Shared Key (PSK), IKE version (v1/v2), encryption, digest, and Diffie-Hellman (DH) group settings exactly match those on the remote VPN gateway.
3. Authentication failed:
If using NAT-T or certificates, ensure the Local ID and Remote ID settings identify the peers correctly.
For PSK-based authentication, ensure the PSK matches exactly on both sides.
4. TS unacceptable:
Check the local and remote network configuration on both sides in case of PBVPN.
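The mismatch checks in items 2 and 4 can be done side by side once the parameters from the Download Config file and from the peer device are written down. The sketch below is one way to do that in Python. The dictionary keys (`ike_version`, `encryption`, and so on) are hypothetical labels chosen for illustration, and the values are assumed to be already written in the same naming on both sides. The selector check is a strict mirror comparison, which is an assumption of this sketch and may be stricter than a given peer requires.

```python
import ipaddress


def find_mismatches(local, remote):
    """Return {parameter: (local_value, remote_value)} for each difference.

    Values are compared exactly, so the PSK stays case-sensitive.
    A parameter missing on one side is reported with None for that side.
    """
    mismatches = {}
    for key in sorted(set(local) | set(remote)):
        if local.get(key) != remote.get(key):
            mismatches[key] = (local.get(key), remote.get(key))
    return mismatches


def selectors_mirrored(local_nets, remote_nets, peer_local_nets, peer_remote_nets):
    """True if each side's local networks equal the other side's remote networks."""
    def as_set(nets):
        return {ipaddress.ip_network(n) for n in nets}

    return (as_set(local_nets) == as_set(peer_remote_nets)
            and as_set(remote_nets) == as_set(peer_local_nets))
```

For example, calling `find_mismatches` with an NSX-side dictionary containing `"encryption": "AES-256"` and a peer-side dictionary containing `"encryption": "AES-128"` returns only the `encryption` entry, which points at the "No proposal chosen" case.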
"Degraded" means that the session is up but one or more tunnels are in the "Down" state, which affects only the traffic flows that use those tunnels.
Checklist for specific down reasons:
The tunnel appears "Success/UP" in the UI, but data cannot pass.
Checklist:
1. Firewall Rules:
Check the Gateway Firewall rule configuration. Ensure traffic is allowed In/Out.
2. Routing (RBVPN):
Verify that routes exist over the VTI for the desired destination IPs.
Either static routes or BGP should be configured over the VTI.
get route
get bgp neighbor summary
Ensure the next hop for the destination is pointing to the VTI interface from the output of the get route command in the previous step.
3. MTU/MSS:
Large packets might be dropped when fragmentation is needed. Check the MTU on the uplinks and the path MTU (PMTU).
Try clamping TCP MSS on the VPN profile (e.g., 1350 bytes).
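The MSS value follows from simple arithmetic: subtract the IPsec encapsulation overhead and the inner IPv4 and TCP headers from the uplink MTU. The sketch below works this through in Python. The 110-byte overhead is an assumed conservative allowance (outer IP header, optional NAT-T UDP header, ESP header, IV, padding, and integrity check value); the real overhead depends on the negotiated algorithms, so treat the number as illustrative.

```python
IPV4_HEADER = 20  # bytes, inner IPv4 header without options
TCP_HEADER = 20   # bytes, TCP header without options


def clamped_mss(uplink_mtu, ipsec_overhead):
    """Largest TCP payload that fits in one packet after IPsec encapsulation."""
    mss = uplink_mtu - ipsec_overhead - IPV4_HEADER - TCP_HEADER
    if mss <= 0:
        raise ValueError("MTU too small for the given overhead")
    return mss


# With a 1500-byte uplink MTU and an assumed 110-byte IPsec allowance:
# 1500 - 110 - 20 - 20 = 1350
EXAMPLE_MSS = clamped_mss(1500, 110)
```

Under these assumptions the result matches the 1350-byte example above. A lower uplink MTU or a larger overhead calls for a proportionally lower clamp.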
The tunnel comes up, stays for a while, and then drops or restarts.
Checklist:
NAT configuration
HA VIP as IPsec Local Endpoint (LEP) IP
Note: These commands must be run on the NSX Edge Node where the VPN service is active.
1. General VPN Status
# Check IPsec VPN service and session status details
get ipsecvpn service
get ipsecvpn session <options>
2. Configuration Details
# Check configuration pushed to the Edge
get ipsecvpn config <options>
3. IKE (Phase 1) Diagnostics
# Check IKE Security Associations (SAs)
get ipsecvpn ikesa <options>
4. IPsec (Phase 2) Diagnostics
# Check IPsec SAs (Tunnel status) in control plane
get ipsecvpn ipsecsa <options>
5. Packet counters
# Check Packet Counters (verify if traffic is hitting the tunnel)
get ipsecvpn tunnel stats
6. IPsec SAs
# Check IPsec SAs in Datapath
get ipsecvpn sad
The primary logs for IPsec VPN troubleshooting are located on the NSX Edge Node.
Main Log File:
/var/log/syslog
Parsing Tips:
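As a generic starting point, a small filter can pull VPN-related lines out of a copy of the log. This is a minimal sketch in Python. The default keywords `ike` and `ipsec` are illustrative search terms only (an assumption), not a list of actual NSX log tags, and the function name is hypothetical.

```python
def filter_vpn_lines(lines, keywords=("ike", "ipsec")):
    """Yield log lines that contain any keyword, matched case-insensitively.

    `lines` can be any iterable of strings, such as an open file object.
    The default keywords are illustrative, not actual NSX log tags.
    """
    lowered = [k.lower() for k in keywords]
    for line in lines:
        text = line.lower()
        if any(k in text for k in lowered):
            yield line.rstrip("\n")
```

To use it, open the log file (for example /var/log/syslog or a copy extracted from a log bundle) and pass the file object to `filter_vpn_lines`, optionally with a peer IP address added to `keywords` to narrow the output to one session.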
This section outlines how Firewall and NAT configurations interact with both Policy-Based and Route-Based IPsec VPN implementations.
If a Gateway Firewall is enabled on the gateway hosting the VPN configuration, packets will only be processed for IPsec VPN if the firewall policy explicitly permits them.
Policy-Based VPN: Ensure that firewall policies do not drop IP packets that match the Local and Remote Network definitions of the IPsec VPN session.
Route-Based VPN: Ensure that traffic destined for the Remote Network is permitted. This applies when a firewall policy is active over the Route-Based VPN interface (VTI).
For an IPsec VPN to establish, the gateway firewall must allow specific control packets between the Local Endpoint IP (Source) and the Remote Endpoint IP (Destination), and vice versa.
Required Protocols: UDP Port 500 and UDP Port 4500 (for IKE and NAT-T).
Provisioning Behavior:
Default Priority (VPN over NAT): By default, VPN processing takes precedence over NAT. If outbound traffic matches both an SNAT rule and an IPsec Local/Remote network pair, the NAT rule is bypassed, and the traffic is encapsulated for the VPN.
Configuring NAT Before VPN: If your architecture requires address translation before traffic enters the IPsec tunnel, you must configure the IPsec Local networks using the post-NAT (translated) IP addresses, rather than the original source IPs.
VTI Routing: Traffic routed to a VTI (via static routes or BGP) is automatically processed for IPsec encapsulation.
VTI SNAT: If SNAT is applied to the VTI, BGP advertises the translated IP to the peer, not the original source IP.
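The default precedence for policy-based VPN can be pictured as a small decision function: a match on an IPsec Local/Remote network pair wins over an SNAT match. The following Python sketch models only that ordering as described above. It is not how the Edge datapath is implemented, and the function names and the `"ipsec"`, `"snat"`, and `"forward"` labels are hypothetical.

```python
import ipaddress


def matches_pair(src, dst, local_net, remote_net):
    """True if src is in the local network and dst is in the remote network."""
    return (ipaddress.ip_address(src) in ipaddress.ip_network(local_net)
            and ipaddress.ip_address(dst) in ipaddress.ip_network(remote_net))


def outbound_action(src, dst, vpn_pairs, snat_sources):
    """Model the default ordering: a VPN match bypasses SNAT.

    vpn_pairs    -- list of (local_network, remote_network) strings
    snat_sources -- list of source networks covered by SNAT rules
    """
    for local_net, remote_net in vpn_pairs:
        if matches_pair(src, dst, local_net, remote_net):
            return "ipsec"  # encapsulated; the SNAT rule is not applied
    for net in snat_sources:
        if ipaddress.ip_address(src) in ipaddress.ip_network(net):
            return "snat"
    return "forward"
```

With a VPN pair of 10.1.1.0/24 to 192.168.5.0/24 and an SNAT rule covering 10.1.0.0/16, traffic from 10.1.1.5 to 192.168.5.9 is classified as `"ipsec"`, while traffic from the same source to any other destination falls through to `"snat"`. If translation has to happen first, the local network in the pair would need to be the post-NAT range, as described above.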
IPsec VPN supports NAT Traversal (NAT-T), allowing VPN endpoints to communicate even if one or both reside behind a NAT device.
NAT-T Compatibility: Supported on upstream nodes (e.g. an upstream Tier-0 Gateway).
Limitation: NAT cannot be applied to VPN Endpoint IPs on the gateway terminating the IPsec session (e.g., a Tier-1 Gateway).
Workaround: Create a high-priority "No NAT" rule on the terminating gateway (Source = Local Endpoint IP, Destination = Peer IP).
Roadmap: A future release will fully automate the creation and lifecycle management of this "No NAT" rule.
If you are contacting Broadcom support about this issue, please provide the following:
NSX Edge log bundles for all Edges in the Edge Cluster containing the T0 or T1 where the IPsec VPN is configured
Ensure log date range covers the full date of the event(s) being investigated. When in doubt, retrieve logs for all time.
NSX Manager log bundles
ESXi host log bundles for all hosts where the affected Edge VMs are running
Text of any error messages seen in NSX GUI or command lines pertinent to the investigation
The configuration and logs from the device on the other end of the IPsec VPN