Datapath service crash on Edge Transport Node

Article ID: 367309

Products

VMware NSX Networking

Issue/Introduction

  • NSX 4.1
  • IPSec VPN service is configured and in use.
  • NSX Manager may trigger the alarm "Application Crashed".
  • A new core dump file is generated on the impacted Edge node.
  • The Edge Transport Node's syslog reports a crash of the datapath service:
    2024-02-29T03:55:11.739Z <edge_name> kernel - - - [3588509.261594] grsec: Segmentation fault occurred at 0000000000000000 in /opt/vmware/nsx-edge/sbin/datapathd[dp-ipc31:10553] uid/euid:0/0 gid/egid:124/124, parent /opt/vmware/edge/dpd/entrypoint.sh[entrypoint.sh:10017] uid/euid:0/0 gid/egid:124/124
    2024-02-29T03:55:11.225Z <edge_name> NSX 878056 - [nsx@6876 comp="nsx-edge" subcomp="node-mgmt" username="root" level="INFO"] Core dump generation received by process: 10086 [dp-ipc31]
    2024-02-29T03:55:11.226Z <edge_name> NSX 878056 - [nsx@6876 comp="nsx-edge" subcomp="node-mgmt" username="root" level="WARNING"] Core file generated: /var/log/core/core.dp-ipc31.1709178911.10086.0.11.gz
  • Syslog may also report a change to an IPSec SA, similar to the logs below:
    2024-02-28T23:10:32.276Z <edge_name> NSX 11945 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"] IKEv2 packet R(<src_ip>:100 <- <dst_ip>:100): len=   52, mID=87, HDR(0743aa3c6976057b_i, 32fff19b1acce9f4_r)
    2024-02-28T23:10:32.276Z <edge_name> NSX 11945 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"] IKEv2 packet S(<src_ip>:100 -> <dst_ip>:100): len=   84, mID=87, HDR(0743aa3c6976057b_i, 32fff19b1acce9f4_r)
    2024-02-28T23:10:59.893Z <edge_name> NSX 11945 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"] IKEv2 packet R(<src_ip>:100 <- <dst_ip>:1024): len=   52, mID=95, HDR(6607c79df90b4d7e_i, d3a793696dab05c5_r)
    2024-02-28T23:10:59.894Z <edge_name> NSX 11945 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"] IKEv2 packet S(<src_ip>:100 -> <dst_ip>:1024): len=   84, mID=95, HDR(6607c79df90b4d7e_i, d3a793696dab05c5_r)
    ..
    2024-02-28T23:11:32.282Z <edge_name> NSX 10086 FABRIC [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="dp-iked" tname="dp-ipc31" level="INFO"] Update inbound SA of lrouter <lrouter UUID> (vrfid 12) session <session UUID> and SPI 0xb0604baf from "<src_ip>" proto 50 v6_subnets 0 v6_EPs 0
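
The crash signatures listed above can be searched for in an exported copy of the Edge syslog. The following is a minimal sketch, not an official tool: the function name `find_datapath_crash_events` and both regular expressions are assumptions derived only from the sample log lines in this article, and other NSX builds may format these messages differently.

```python
import re

# Patterns derived from the sample syslog lines in this article (assumption:
# the real messages keep the same wording and field order).
SEGFAULT_RE = re.compile(
    r"Segmentation fault occurred at \S+ in (?P<binary>\S*datapathd)"
    r"\[(?P<thread>[^:\]]+):(?P<pid>\d+)\]"
)
CORE_RE = re.compile(r"Core file generated: (?P<path>\S+)")


def find_datapath_crash_events(lines):
    """Return crash-related events found in an iterable of syslog lines.

    Each event is a tuple:
      ("segfault", thread_name, pid)  for a datapathd segmentation fault
      ("core_file", path)             for a generated core file
    """
    events = []
    for line in lines:
        match = SEGFAULT_RE.search(line)
        if match:
            events.append(
                ("segfault", match.group("thread"), int(match.group("pid")))
            )
            continue
        match = CORE_RE.search(line)
        if match:
            events.append(("core_file", match.group("path")))
    return events
```

For example, the function can be called with the lines of a syslog file opened via `open(path, errors="replace")`. A result that contains both a `segfault` event for a `dp-ipc` thread and a `core_file` event is consistent with the symptoms described here, but it does not confirm the root cause on its own.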

Cause

  • A change to an existing IPSec connection (e.g. a port change/update) triggers an update of the corresponding security association (SA). This update can corrupt memory on the Edge node, which leads to a crash of the datapath service.
  • The update causes a double free condition, which occurs when an existing SA is updated because of a change on the existing connection.
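
A port change of the kind mentioned above may be visible in the IKE log lines shown in the Issue/Introduction section, where the right-hand endpoint moves from port 100 to port 1024 shortly before the SA update. The sketch below flags such changes. It rests on several assumptions: the helper name `find_peer_port_changes` is hypothetical, the regular expression is derived only from the sample lines in this article, endpoints are assumed to be written as `address:port` with IPv4 addresses, and treating the right-hand endpoint as the one to track is a guess based on those samples.

```python
import re

# Matches the endpoint pair in lines such as:
#   IKEv2 packet R(<src_ip>:100 <- <dst_ip>:1024): len= 52, ...
# (assumption: this layout is stable across the logs being analysed).
IKE_PACKET_RE = re.compile(
    r"IKEv2 packet [RS]\((?P<left>\S+):(?P<left_port>\d+) (?:<-|->) "
    r"(?P<right>\S+):(?P<right_port>\d+)\)"
)


def find_peer_port_changes(lines):
    """Return (address, old_port, new_port) for each observed port change.

    The port of the right-hand endpoint is tracked per address; a change is
    reported whenever the same address appears with a different port than
    the one seen on its previous IKEv2 packet line.
    """
    last_port = {}
    changes = []
    for line in lines:
        match = IKE_PACKET_RE.search(line)
        if not match:
            continue
        address = match.group("right")
        port = int(match.group("right_port"))
        previous = last_port.get(address)
        if previous is not None and previous != port:
            changes.append((address, previous, port))
        last_port[address] = port
    return changes
```

A reported change only indicates that the condition described in this section may have been met. It should be correlated with a subsequent "Update inbound SA" message and the crash timestamps before drawing conclusions.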

Resolution

This is a known issue. A fix is planned for a future release of NSX.