Datapath service crash on Edge Transport Node

Article ID: 367309

Products

VMware NSX

Issue/Introduction

  • IPSec VPN service is configured and in use.
  • NSX Manager raises the alarm "Application Crashed".
  • A new core dump file is generated on the impacted Edge node under /var/log/core.
  • /var/log/syslog on the Edge Transport Node reports a crash of the datapath service:
    <timestamps> <edge_name> kernel - - - [3588509.261594] grsec: Segmentation fault occurred at 0000000000000000 in /opt/vmware/nsx-edge/sbin/datapathd[dp-ipc31:10553] uid/euid:0/0 gid/egid:124/124, parent /opt/vmware/edge/dpd/entrypoint.sh[entrypoint.sh:10017] uid/euid:0/0 gid/egid:124/124
    <timestamps> <edge_name> NSX 878056 - [nsx@6876 comp="nsx-edge" subcomp="node-mgmt" username="root" level="INFO"] Core dump generation received by process: 10086 [dp-ipc31]
    <timestamps> <edge_name> NSX 878056 - [nsx@6876 comp="nsx-edge" subcomp="node-mgmt" username="root" level="WARNING"] Core file generated: /var/log/core/core.dp-ipc##.###.gz
  • /var/log/syslog on the Edge node may also report an update to an IPSec SA, similar to the logs below:
    <timestamps> <edge_name> NSX 11945 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"] IKEv2 packet R(<src_ip>:100 <- <dst_ip>:100): len=   52, mID=87, HDR(0743aa3c6976057b_i, 32fff19b1acce9f4_r)
    <timestamps> <edge_name> NSX 11945 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"] IKEv2 packet S(<src_ip>:100 -> <dst_ip>:100): len=   84, mID=87, HDR(0743aa3c6976057b_i, 32fff19b1acce9f4_r)
    <timestamps> <edge_name> NSX 11945 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"] IKEv2 packet R(<src_ip>:100 <- <dst_ip>:1024): len=   52, mID=95, HDR(6607c79df90b4d7e_i, d3a793696dab05c5_r)
    <timestamps> <edge_name> NSX 11945 VPN [nsx@6876 comp="nsx-edge" subcomp="iked" s2comp="ike-stack" level="INFO"] IKEv2 packet S(<src_ip>:100 -> <dst_ip>:1024): len=   84, mID=95, HDR(6607c79df90b4d7e_i, d3a793696dab05c5_r)
    ..
    <timestamps> <edge_name> NSX 10086 FABRIC [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="dp-iked" tname="dp-ipc31" level="INFO"] Update inbound SA of lrouter <lrouter UUID> (vrfid 12) session <session UUID> and SPI 0xb0604baf from "<src_ip>" proto 50 v6_subnets 0 v6_EPs 0
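To check a saved copy of the Edge syslog for the crash signature above, a small script can search for the two key messages. This is a minimal sketch, not a VMware tool: the regular expressions are modeled only on the sample lines in this article, the function name find_datapath_crash is hypothetical, and the exact log wording may differ between NSX releases.

```python
import re

# Patterns modeled on the sample log lines in this article (assumption:
# the wording is the same in the release being checked).
SEGFAULT_RE = re.compile(
    r"grsec: Segmentation fault occurred at \S+ in "
    r"\S*datapathd\[(?P<thread>[^:\]]+):(?P<pid>\d+)\]"
)
CORE_FILE_RE = re.compile(r"Core file generated: (?P<path>\S+)")


def find_datapath_crash(lines):
    """Scan syslog lines and return (crashed_threads, core_files).

    crashed_threads: datapathd thread names taken from segfault messages.
    core_files: paths taken from "Core file generated" messages.
    """
    threads, cores = [], []
    for line in lines:
        match = SEGFAULT_RE.search(line)
        if match:
            threads.append(match.group("thread"))
        match = CORE_FILE_RE.search(line)
        if match:
            cores.append(match.group("path"))
    return threads, cores
```

The function accepts any iterable of strings, so an open file handle for a copy of /var/log/syslog can be passed directly. Empty results only mean these two patterns were not found; they do not rule out a crash logged with different wording.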


Environment

VMware NSX

Cause

  • A change to an existing IPSec connection (e.g. a port change or update) triggers an update of the corresponding security association (SA).
  • The SA update causes a double free condition, which corrupts memory on the Edge node and crashes the datapath service.
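Because the example trigger is a port change on an existing connection, it can help to look for that change in the IKEv2 packet lines shown under Issue/Introduction, where the second endpoint's port moves from 100 to 1024. The sketch below assumes the line format matches those samples and that addresses are IPv4. The function name find_port_changes is hypothetical and is not part of NSX.

```python
import re

# Matches "IKEv2 packet R(<addr>:<port> <- <addr>:<port>)" and the S(... -> ...)
# variant, as seen in the sample logs (assumption: IPv4-style addresses).
IKE_PACKET_RE = re.compile(
    r"IKEv2 packet [RS]\((?P<first>\S+):(?P<first_port>\d+) (?:<-|->) "
    r"(?P<second>\S+):(?P<second_port>\d+)\)"
)


def find_port_changes(lines):
    """Return [(address, old_port, new_port), ...] for the second endpoint.

    A tuple is recorded each time the port logged for an address differs
    from the port last seen for that same address.
    """
    last_port = {}
    changes = []
    for line in lines:
        match = IKE_PACKET_RE.search(line)
        if not match:
            continue
        addr = match.group("second")
        port = int(match.group("second_port"))
        previous = last_port.get(addr)
        if previous is not None and previous != port:
            changes.append((addr, previous, port))
        last_port[addr] = port
    return changes
```

A reported port change is only a hint that an SA update may have happened around that time. It does not confirm this issue on its own and should be read together with the crash messages and the core dump.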

Resolution

This issue is resolved in VMware NSX 4.2 and later versions, available from Broadcom downloads.
If you have difficulty finding or downloading the software, see the Download Broadcom products and software KB.