IPSec VPN session is flapping.

Article ID: 398029

Products

VMware NSX

Issue/Introduction

  • At a regular interval (every 30 minutes in this example; the interval depends on the remote firewall configuration), the IPSec VPN session goes down for about a second and then comes back up.

  • The following log entries indicate that the phase-1 negotiation is successful:

    2025-03-28T13:10:54.411Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"] IKEv2 SA [Initiator, NAT-T] negotiation completed:
    2025-03-28T13:10:54.411Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"]   Local Authentication Method  : Pre-shared key
    2025-03-28T13:10:54.411Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"]   Remote Authentication Method : Pre-shared key
    2025-03-28T13:10:54.411Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"]   IKE algorithms : aes256-cbc, hmac-sha1, hmac-sha256-128
    2025-03-28T13:10:54.411Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"]   Diffie-Hellman : group 16 (4096 bits)
    2025-03-28T13:10:54.411Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"]   Local IKE peer  ##.##.##.84:4500 routing instance 2 ID ##.##.##.84 (ipv4)
    2025-03-28T13:10:54.411Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"]   Remote IKE peer ##.##.##.177:4500 routing instance 2 ID ##.##.##.2 (ipv4)
    2025-03-28T13:10:54.411Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"]   Initiator SPI ######e9 ######50 Responder SPI ######71 ######5b
    2025-03-28T13:10:54.411Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"]   Local Lifetime: 86400 second

  • The following log entries indicate that the phase-2 negotiation is also successful:

    2025-03-28T13:10:54.413Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"] IPsec SA [Initiator, NAT-T, tunnel, auto] negotiation completed:
    2025-03-28T13:10:54.413Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"]   Local IKE peer  ##.##.##.84:4500 routing instance 2 ID ##.##.##.84 (ipv4)
    2025-03-28T13:10:54.413Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"]   Remote IKE peer ##.##.##.177:4500 routing instance 2 ID ##.##.##.2 (ipv4)
    2025-03-28T13:10:54.413Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"]   Local Traffic Selector  ipv4(##.##.##.0-##.##.##.0.127)
    2025-03-28T13:10:54.413Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"]   Remote Traffic Selector ipv4(##.##.##.17)
    2025-03-28T13:10:54.413Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"]   Routing Instance  plr_sr (2)
    2025-03-28T13:10:54.414Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"]   Inbound SPI:      | Outbound SPI: | Algorithm:
    2025-03-28T13:10:54.414Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"]   ESP    [######00] | [######f6]    | aes-cbc/256 - hmac-sha256-128
    2025-03-28T13:10:54.414Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"]   Local Lifetime: 3600 seconds
    2025-03-28T13:10:54.414Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"] Fri Mar 28 2025 13:10:54: NOTICE: IPsec SA installed: esp: SPI ######00

  • After the successful phase-1 and phase-2 negotiations, the session keep-alive packets are exchanged for roughly the next 29 minutes:

    2025-03-28T13:11:06.066Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"] IKEv2 packet R(##.##.##.84:4500 <- ##.##.##.177:4500): len=   52, mID=0, HDR(############50_i, ############5b_r)
    2025-03-28T13:11:06.066Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"] IKEv2 packet S(##.##.##.84:4500 -> ##.##.##.177:4500): len=   84, mID=0, HDR(############50_i, ############5b_r)
    [....]
    2025-03-28T13:11:16.257Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"] IKEv2 packet R(##.##.##.84:4500 <- ##.##.##.177:4500): len=   52, mID=1, HDR(############50_i, ############5b_r)
    2025-03-28T13:11:16.258Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"] IKEv2 packet S(##.##.##.84:4500 -> ##.##.##.177:4500): len=   84, mID=1, HDR(############50_i, ############5b_r)
    [....]
    2025-03-28T13:40:45.823Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"] IKEv2 packet R(##.##.##.84:4500 <- ##.##.##.177:4500): len=   52, mID=176, HDR(############50_i, ############5b_r)
    2025-03-28T13:40:45.824Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"] IKEv2 packet S(##.##.##.84:4500 -> ##.##.##.177:4500): len=   84, mID=176, HDR(############50_i, ############5b_r)

  • About 30 minutes after the session is established, the edge receives a delete (DEL) packet from the remote peer asking to tear down the session:

    2025-03-28T13:40:55.604Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"] IKEv2 packet R(##.##.##.84:4500 <- ##.##.##.177:4500): len=   60, mID=177, HDR(############50_i, ############5b_r), DEL

  • The edge acknowledges and honors the request by deleting the IPSec SAs:

    2025-03-28T13:40:55.605Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"] IKEv2 packet S(##.##.##.84:4500 -> ##.##.##.177:4500): len=   84, mID=177, HDR(############50_i, ############5b_r)
    2025-03-28T13:40:55.605Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"] IPsec SA EVENT:
    2025-03-28T13:40:55.605Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"]         IPsec SA ESP Inbound SPI ######00, Outbound SPI ######f6: destroyed
    2025-03-28T13:40:55.606Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="iked-dp-control" level="INFO"] SA delete for SPIs 0x######00_i 0x######f6_o, dir 0x0, encr algo aes256-cbc
    2025-03-28T13:40:55.606Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="iked-dp-control" level="INFO"] Deleting IPSec inbound SA local=##.##.##.177, remote=##.##.##.84
    2025-03-28T13:40:55.606Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="iked-dp-control" level="INFO"] Sending SA delete for policy UUID 0x##########09 0x##############00, SPI 0x######00 IPv6 endpoint flag not set, IPv6 rule flag not set,
    2025-03-28T13:40:55.607Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="iked-dp-control" level="INFO"] Deleting IPSec outbound SA local=##.##.##.84, remote=##.##.##.177
    2025-03-28T13:40:55.607Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="iked-dp-control" level="INFO"] Sending SA delete for policy UUID 0x##########09 0x##############00, SPI 0x######f6 IPv6 endpoint flag not set, IPv6 rule flag not set,
    2025-03-28T13:40:55.608Z edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"] Fri Mar 28 2025 13:40:55: NOTICE: IPsec SA deleted: esp: SPI ######00

  • Because the edge is the initiator of the IPSec session, it initiates a new connection. The phase-1 and phase-2 negotiations succeed and keep-alive packets are exchanged for about 29 minutes, until the remote peer again sends a delete request. This cycle repeats every 30 minutes:

    edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"] IKEv2 packet S(##.##.##.84:500 -> ##.##.##.177:500): len=  734, mID=0, HDR(############89_i, 0000000000000000_r), SA, KE, Nonce, N(NAT_DETECTION_SOURCE_IP), N(NAT_DETECTION_DESTINATION_IP), N(FRAGMENTATION_SUPPORTED), Vid
    edge_fqdn NSX 3126911 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"] IKEv2 packet R(##.##.##.84:500 <- ##.##.##.177:500): len=   44, mID=0, HDR(############89_i, 0000000000000000_r), N(COOKIE)

  • The IPSec VPN local endpoint IP is the translated IP of an SNAT rule (that is, the IPSec VPN is configured to use NAT).

  • A packet capture on the edge uplink shows that, while the issue is occurring, the edge tries to re-establish the session from a random source port instead of the expected one (port 500 for IPSec VPN, or port 4500 for IPSec with NAT):

    14:10:57.056747 ##:##:##:##:##:d8 > ##:##:##:##:##:20, ethertype IPv4 (0x0800), length 776: ##.##.##.84.22306 > ##.##.##.177.isakmp: isakmp: parent_sa ikev2_init[I]
    14:10:57.062692 ##:##:##:##:##:20 > ##:##:##:##:##:d8, ethertype IPv4 (0x0800), length 86: ##.##.##.177.isakmp > ##.##.##.84.22306: isakmp: parent_sa ikev2_init[R]
    14:10:57.064504 ##:##:##:##:##:d8 > ##:##:##:##:##:20, ethertype IPv4 (0x0800), length 792: ##.##.##.84.22306 > ##.##.##.177.isakmp: isakmp: parent_sa ikev2_init[I]
    14:10:57.076604 ##:##:##:##:##:20 > ##:##:##:##:##:d8, ethertype IPv4 (0x0800), length 917: ##.##.##.177.isakmp > ##.##.##.84.22306: isakmp: parent_sa ikev2_init[R]
    14:10:57.100628 ##:##:##:##:##:d8 > ##:##:##:##:##:20, ethertype IPv4 (0x0800), length 286: ##.##.##.84.19697 > ##.##.##.177.ipsec-nat-t: NONESP-encap: isakmp: child_sa  ikev2_auth[I]
    14:10:57.105777 ##:##:##:##:##:20 > ##:##:##:##:##:d8, ethertype IPv4 (0x0800), length 286: ##.##.##.177.ipsec-nat-t > ##.##.##.84.19697: NONESP-encap: isakmp: child_sa  ikev2_auth[R]
    14:10:57.110284 ##:##:##:##:##:d8 > ##:##:##:##:##:20, ethertype IPv4 (0x0800), length 270: ##.##.##.84.19697 > ##.##.##.177.ipsec-nat-t: NONESP-encap: isakmp: child_sa  child_sa[I]
    14:10:57.115739 ##:##:##:##:##:20 > ##:##:##:##:##:d8, ethertype IPv4 (0x0800), length 286: ##.##.##.177.ipsec-nat-t > ##.##.##.84.19697: NONESP-encap: isakmp: child_sa  child_sa[R]
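The flap interval described in the log excerpts above can be measured directly from the edge log. The following is a minimal Python sketch, assuming the log lines keep the format shown above (a leading ISO 8601 timestamp and the `NOTICE: IPsec SA installed` / `NOTICE: IPsec SA deleted` messages). The function names, the `[...]` shortening of the structured-data field in the sample lines, and the comparison against the 3600-second Local Lifetime are illustrative choices and not part of any NSX tool.

```python
from datetime import datetime, timedelta

# Message fragments taken from the log excerpts in this article.
INSTALLED = "NOTICE: IPsec SA installed"
DELETED = "NOTICE: IPsec SA deleted"

# "Local Lifetime: 3600 seconds" from the phase-2 excerpt.
NEGOTIATED_LIFETIME = timedelta(seconds=3600)


def parse_ts(line):
    """Parse the leading ISO 8601 timestamp of a log line, or return None."""
    token = line.split(maxsplit=1)[0] if line.strip() else ""
    try:
        return datetime.strptime(token, "%Y-%m-%dT%H:%M:%S.%fZ")
    except ValueError:
        return None


def sa_lifetimes(lines):
    """Return how long each IPsec SA lived between install and delete."""
    lifetimes = []
    installed_at = None
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue  # skip lines without a leading timestamp
        if INSTALLED in line:
            installed_at = ts
        elif DELETED in line and installed_at is not None:
            lifetimes.append(ts - installed_at)
            installed_at = None
    return lifetimes


# Sample lines shortened from the excerpts above ("[...]" replaces the
# structured-data field for readability).
sample = [
    "2025-03-28T13:10:54.414Z edge_fqdn NSX 3126911 VPN [...] "
    "Fri Mar 28 2025 13:10:54: NOTICE: IPsec SA installed: esp: SPI ######00",
    "2025-03-28T13:40:55.608Z edge_fqdn NSX 3126911 VPN [...] "
    "Fri Mar 28 2025 13:40:55: NOTICE: IPsec SA deleted: esp: SPI ######00",
]

for lifetime in sa_lifetimes(sample):
    early = lifetime < NEGOTIATED_LIFETIME
    print(f"SA lived {lifetime}; deleted before negotiated lifetime: {early}")
```

In the excerpts above, the SA lives for roughly 30 minutes although the local lifetime is 3600 seconds, which is consistent with the remote peer deleting the SA rather than with a normal rekey.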

Environment

VMware NSX

Cause

IPSec VPN is not supported when the local endpoint IP address goes through NAT on the same logical router where the IPSec VPN session is configured.

As a result, the IKE packets undergo NAT and use a random source port to establish the session, whereas the ESP (tunnel) traffic does not undergo NAT. Because the remote endpoint does not receive the ESP packets from the random source port that was used to establish the connection, it tears down the connection.

This configuration limitation is documented in "Add an NSX IPSec VPN Service".
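The random source port described above can be spotted in a text capture. The sketch below is a minimal example, assuming tcpdump-style lines like the ones in the Issue section, where the service names `isakmp` and `ipsec-nat-t` stand for UDP ports 500 and 4500. The helper names and the regular expression are hypothetical and only cover this line format.

```python
import re

# Standard service names for the IKE ports as printed by tcpdump.
IKE_PORTS = {"isakmp": 500, "ipsec-nat-t": 4500}

# Captures the "source > destination:" pair that follows "length N: ".
ENDPOINTS = re.compile(r"length \d+: (\S+) > (\S+): ")


def port_of(endpoint):
    """Return the port of an 'address.port' endpoint as an int, or None."""
    _, _, port = endpoint.rpartition(".")
    if port in IKE_PORTS:
        return IKE_PORTS[port]
    return int(port) if port.isdigit() else None


def unexpected_ike_sources(lines):
    """Return (line, source_port) for IKE-bound packets not sent from 500/4500."""
    findings = []
    for line in lines:
        match = ENDPOINTS.search(line)
        if not match:
            continue
        src_port = port_of(match.group(1))
        dst_port = port_of(match.group(2))
        # Only packets addressed to an IKE port are of interest here.
        if dst_port in IKE_PORTS.values() and src_port not in IKE_PORTS.values():
            findings.append((line, src_port))
    return findings


# Lines copied from the capture in this article (addresses are masked there).
capture = [
    "14:10:57.056747 ##:##:##:##:##:d8 > ##:##:##:##:##:20, ethertype IPv4 (0x0800), "
    "length 776: ##.##.##.84.22306 > ##.##.##.177.isakmp: isakmp: parent_sa ikev2_init[I]",
    "14:10:57.062692 ##:##:##:##:##:20 > ##:##:##:##:##:d8, ethertype IPv4 (0x0800), "
    "length 86: ##.##.##.177.isakmp > ##.##.##.84.22306: isakmp: parent_sa ikev2_init[R]",
    "14:10:57.100628 ##:##:##:##:##:d8 > ##:##:##:##:##:20, ethertype IPv4 (0x0800), "
    "length 286: ##.##.##.84.19697 > ##.##.##.177.ipsec-nat-t: NONESP-encap: isakmp: child_sa  ikev2_auth[I]",
]

for _, port in unexpected_ike_sources(capture):
    print(f"IKE packet sent from unexpected source port {port}")
```

Replies from the remote peer are ignored because their destination is the translated port, not an IKE port; only the initiator's outbound packets are checked.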

Resolution

There are two options:

  1. Do not configure IPSec VPN on the same logical router where the local endpoint IP address goes through NAT.

  2. Create a "No NAT" rule for the local endpoint IP and configure it with a lower priority value than the existing "SNAT" rule (in NSX, a lower priority value means the rule is evaluated first), so that the IKE packets do not undergo NAT and therefore do not use a random source port to establish the session. For example, assume 1.1.1.10 is the local endpoint and 1.1.1.20 is the remote endpoint, and the existing SNAT rule is Any Source --> Any Destination --> Translate to 1.1.1.10. Create a "No NAT" rule with Source IP 1.1.1.10 --> Destination IP 1.1.1.20 --> Any Translated IP.
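The effect of rule ordering in option 2 can be illustrated with a simplified model. This is only a sketch and not how NSX implements NAT: it assumes that rules are evaluated in ascending priority value and that the first match wins. The `NatRule` class, the action labels, and the priority values 10 and 100 are hypothetical; the addresses are the example addresses from option 2.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass
class NatRule:
    name: str
    priority: int  # assumption: a lower value is evaluated first
    action: str    # illustrative labels: "SNAT" or "NO_NAT"
    source: str = "0.0.0.0/0"
    destination: str = "0.0.0.0/0"

    def matches(self, src, dst):
        """True if the flow's source and destination fall inside this rule."""
        return (ip_address(src) in ip_network(self.source)
                and ip_address(dst) in ip_network(self.destination))


def first_match(rules, src, dst):
    """Return the first rule, in ascending priority order, matching the flow."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.matches(src, dst):
            return rule
    return None


rules = [
    # Existing rule: Any Source --> Any Destination --> Translate to 1.1.1.10
    NatRule("snat-any", priority=100, action="SNAT"),
    # New rule: local endpoint --> remote endpoint, no translation
    NatRule("no-nat-vpn", priority=10, action="NO_NAT",
            source="1.1.1.10/32", destination="1.1.1.20/32"),
]

# IKE traffic between the two endpoints hits the "No NAT" rule first.
print(first_match(rules, "1.1.1.10", "1.1.1.20").name)
# Other outbound traffic still falls through to the SNAT rule.
print(first_match(rules, "192.168.1.5", "8.8.8.8").name)
```

If the "No NAT" rule had the higher priority value in this model, the catch-all SNAT rule would match first and the IKE packets would still be translated.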