After NSX-T Edge failover, datapathd service crashes on the new Edge node


Article ID: 319061

Products

VMware NSX

Issue/Introduction

  • After an Edge node fails over, the datapathd process on the newly active Edge node crashes, and all BFD tunnels and BGP sessions go down.
  • This issue occurs with the following configuration:
    • URL Database is enabled for the Edge cluster under Security -> General Settings -> URL Database 
    • A new policy (rule section) is created under Security -> Gateway Firewall -> Gateway Specific Rules -> <Select T1 GW>
      • An 'Allow' rule is added in this new policy with the default L7 Access Profile
      • The new policy is above the default policy
  • Example of core dump logging in /var/log/syslog on Edge:

    2024-04-17T13:02:01:332Z <Edge hostname> NSX 2359547 - [nsx@6876 comp="nsx-edge" subcomp="node-mgmt" username="root" level="WARNING"] Core file generated: /var/log/core/core.dp-fp:4.1687196169.9076.0.11.gz
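
A minimal sketch for checking an Edge syslog for this symptom, assuming the message format matches the example above. The function name `find_datapath_cores` is hypothetical, and the `core.dp-` file name prefix used to recognize datapath cores is an assumption taken from that single example.

```python
import re

# Matches the "Core file generated" message shown in the example above.
CORE_RE = re.compile(r"Core file generated: (?P<path>\S+)")


def find_datapath_cores(lines):
    """Return core file paths whose file name starts with 'core.dp-'.

    The 'core.dp-' prefix is an assumption based on the one example in
    this article, not a documented naming rule.
    """
    paths = []
    for line in lines:
        match = CORE_RE.search(line)
        if not match:
            continue
        path = match.group("path")
        file_name = path.rsplit("/", 1)[-1]
        if file_name.startswith("core.dp-"):
            paths.append(path)
    return paths
```

To use it on an Edge node, pass the open file object for /var/log/syslog as `lines`; any path returned indicates that a datapath core was written.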

Cause

During an Edge failover, state is transferred to the new active Edge node. However, the dns_trans_ids are not transferred correctly along with the flows, which leaves initialization incomplete on the new active node. As a result, the datapathd service crashes.

Resolution

This issue is resolved in the following VMware NSX releases:

  • 3.2.3.1
  • 3.2.4
  • 4.1.1
  • 4.2.0
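
The releases listed above can be turned into a rough version check. This is a sketch only: the function name `has_fix` is hypothetical, and it assumes that later releases in the 3.2.x and 4.1.x lines and any release from 4.2.0 onward also carry the fix, and that release lines not listed here (for example 4.0.x) do not. Confirm against the release notes for the specific build.

```python
# Minimum fixed release per (major, minor) line, taken from this article.
FIXED_BY_LINE = {
    (3, 2): (3, 2, 3, 1),
    (4, 1): (4, 1, 1),
}
# Assumption: every release from 4.2.0 onward contains the fix.
FIRST_FIXED_OVERALL = (4, 2, 0)


def _parse(version):
    """Turn '3.2.3.1' into (3, 2, 3, 1)."""
    return tuple(int(part) for part in version.split("."))


def _pad(parts, length=4):
    """Right-pad with zeros so '3.2.4' compares as 3.2.4.0."""
    return parts + (0,) * (length - len(parts))


def has_fix(version):
    """Return True if the version is at or above a listed fixed release."""
    v = _pad(_parse(version))
    if v >= _pad(FIRST_FIXED_OVERALL):
        return True
    minimum = FIXED_BY_LINE.get(v[:2])
    # Release lines with no listed fix (assumed unfixed) return False.
    return minimum is not None and v >= _pad(minimum)
```

For example, `has_fix("3.2.3.0")` returns False and `has_fix("4.1.1")` returns True under these assumptions.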

Additional Information

Impact/Risks:
North-South traffic through the Edge is disrupted when the datapathd service crashes.