Unable to connect with SQL server over NSX IPsec VPN tunnel

Article ID: 419969


Products

VMware NSX

Issue/Introduction

SQL communication between the client and server over an NSX IPsec VPN fails, even though the initial TCP handshake between the two succeeds normally.

Packet captures from both the server and the client may show that initial connectivity is established. Shortly afterwards, however, a high number of retransmits are observed, indicating a loss of communication in the datapath.

The traffic between the server and the client traverses a WAN or another MTU-restricted datapath.

Interface statistics for the associated NSX Edge node and the Tier-0 (T0) and Tier-1 (T1) gateways show no drops associated with the IPsec VPN path.

 

Environment

  • NSX 4.1.x
  • NSX 4.2.x

Cause

The packets are too large to pass through the WAN or another segment of the datapath, even though the traffic passes through the IPsec VPN normally.

To determine this, take packet captures at various points on the Edge where the VPN traffic passes, and confirm both that the SQL traffic is passing normally and what size the packets are. If packets leave the NSX Tier-0 gateway larger than the MTU of the downstream path (the exact value will depend on the datapath), they will be dropped outside of NSX. Some example capture points could be:

  • Ingress to the IPsec VPN
    • This could be either the Tier-0 or Tier-1 gateway, depending on the VPN configuration
  • If the IPsec VPN is on a Tier-1 (T1), the uplink between the T1 and T0 gateways
  • The egress from the T0 gateway to the WAN or next hop in the datapath

Because IPsec encapsulates the traffic, additional filters will likely need to be applied to the capture commands to see the desired traffic, though caution must also be taken to avoid over-filtering.

In the example capture, packets from the SQL server leave the T0 gateway uplink interface toward the WAN as expected, but at a size of 1582 bytes they are too large to traverse the WAN and are dropped before reaching the client, leading to a timeout.

NOTE: Depending on the volume of traffic the capture observes, caution must be taken to prevent any impact to the Edge node itself. If additional assistance is required, please create a Broadcom Support case: Creating and managing Broadcom support request (SR) cases

The example capture used the following command, which filters for ESP (Encapsulating Security Payload) traffic and the source and destination IP addresses:

start capture interface ########-####-####-####-############ direction dual expression esp and host ###.###.###.### and host ###.###.###.###

For more information about NSX packet captures on an NSX Edge node, see the following: Troubleshooting NSX using Packet Captures

Resolution

Configure MSS Clamping on the NSX IPsec VPN tunnel, in the IPsec Session's Advanced Settings, to force a smaller packet size so that communication can succeed.

NOTE: The exact MSS Value and Direction configurations will depend on the environment. Some tuning may be required to identify the largest value that still allows traffic to flow.
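As a starting point for that tuning, the arithmetic can be sketched in Python. This is only an estimate under stated assumptions, not an NSX tool: it assumes IPv4, ESP in tunnel mode with an AES-GCM style cipher (8-byte IV, 16-byte ICV, up to 3 bytes of padding), and a path MTU of 1500 bytes. The function names are hypothetical, and the real overhead depends on the IPsec profile in use and on whether NAT traversal is active.

```python
# Rough starting point for an MSS clamp value; all sizes are in bytes.

IPV4_HEADER = 20   # IPv4 header without options (inner or outer)
TCP_HEADER = 20    # TCP header without options


def esp_tunnel_overhead(iv=8, icv=16, max_padding=3, nat_t=False):
    """Worst-case bytes added by ESP tunnel mode (defaults assume AES-GCM)."""
    outer_ip = IPV4_HEADER
    esp_header = 8                   # SPI (4) + sequence number (4)
    esp_trailer = 2                  # pad length (1) + next header (1)
    udp_encap = 8 if nat_t else 0    # UDP header added by NAT traversal
    return (outer_ip + udp_encap + esp_header + iv
            + max_padding + esp_trailer + icv)


def estimate_mss(path_mtu, overhead):
    """MSS such that the inner packet (MSS + 40) plus overhead fits path_mtu."""
    mss = path_mtu - overhead - IPV4_HEADER - TCP_HEADER
    if mss <= 0:
        raise ValueError("path MTU is too small for the assumed overhead")
    return mss


if __name__ == "__main__":
    overhead = esp_tunnel_overhead()
    print(overhead, estimate_mss(1500, overhead))
```

Treat the result as an upper bound to test from: if traffic still fails at that value, lower it in small steps until the retransmits stop.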

After doing so, confirm in the packet captures that the packet size has been lowered.
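If the capture has been exported as text, the size check can be scripted. The sketch below assumes tcpdump-style lines that carry a `length <n>` field. The helper names and the sample lines are made up for illustration (the addresses are documentation ranges) and are not actual NSX output. The limit must be expressed in the same terms the capture tool uses for its length field.

```python
import re

# Matches the "length <n>" field printed per packet in tcpdump-style text.
LENGTH_FIELD = re.compile(r"\blength (\d+)\b")


def packet_lengths(capture_text):
    """Return the last 'length' value found on each line that has one."""
    lengths = []
    for line in capture_text.splitlines():
        found = LENGTH_FIELD.findall(line)
        if found:
            lengths.append(int(found[-1]))
    return lengths


def oversized(capture_text, limit):
    """Return the packet lengths that exceed the given limit."""
    return [n for n in packet_lengths(capture_text) if n > limit]


if __name__ == "__main__":
    # Hypothetical sample lines, not real NSX capture output.
    sample = (
        "IP 192.0.2.10 > 198.51.100.20: ESP(spi=0x00000001,seq=0x1), length 1582\n"
        "IP 192.0.2.10 > 198.51.100.20: ESP(spi=0x00000001,seq=0x2), length 1400\n"
    )
    print(oversized(sample, 1500))
```

An empty list after MSS Clamping is applied suggests that no captured packet exceeds the chosen limit.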

Additional Information

See Understanding TCP MSS Clamping for more information about this setting.