NSX Edges and Tier 0 gateway display as down on NSX version 3.2.1.2

Article ID: 312628

Updated On:

Products

VMware NSX

Issue/Introduction

Symptoms:

  • You have recently upgraded NSX from 3.1.3.7 to 3.2.1.2, or you are running version 3.2.1.2.
  • You observe that the PNIC/Bond status on the T0 gateway is down.
  • The overall status of the Edge node is down.
  • There is a disruption in North-South traffic.
  • You may see entries similar to the following in the Edge node log file /var/log/syslog:
2022-10-21T11:46:04.913Z hostname.example.com NSX 23204 ROUTING [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="lrouter" tname="dp-ipc19" level="INFO"] Update lpm tables: DR (########-####-####-####-########4210), v4: 0x7babfc1ed380, v6: (nil)

2022-10-21T11:46:04.913Z hostname.example.com NSX 23204 ROUTING [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="lrouter" tname="dp-ipc19" level="INFO"] Update lpm tables: DR (########-####-####-####-########4210), v4: 0x7babfc1ed380, v6: 0x7babfc1eb080

2022-10-21T11:46:04.662Z hostname.example.com kernel - - - [10425.332018] grsec: Segmentation fault occurred at (nil) in /opt/vmware/nsx-edge/sbin/datapathd[dp-ipc19:23607] uid/euid:0/0 gid/egid:124/124, parent /usr/bin/containerd-shim-runc-v2[containerd-shim:23181] uid/euid:0/0 gid/egid:0/0
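
To check a saved copy of the Edge node syslog for this signature, the kernel line above can be matched with a short script. The following is a minimal sketch in Python, not a VMware tool. The regular expression is derived only from the sample entry shown above, and the function name find_datapathd_segfaults is an illustrative assumption; the pattern may need adjusting if the log format differs on your build.

```python
import re

# Pattern derived from the single sample kernel entry in this article.
# It is an assumption that other occurrences follow the same layout.
SEGFAULT_RE = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<host>\S+)\s+kernel\b.*?"
    r"Segmentation fault occurred at .*? in "
    r"(?P<binary>\S+)\[(?P<thread>[^:\]]+):(?P<pid>\d+)\]"
)


def find_datapathd_segfaults(lines):
    """Return (timestamp, thread, pid) for each datapathd segfault entry."""
    hits = []
    for line in lines:
        match = SEGFAULT_RE.search(line)
        # Keep only crashes of the datapathd binary, not other processes.
        if match and match.group("binary").endswith("/datapathd"):
            hits.append(
                (match.group("timestamp"), match.group("thread"), int(match.group("pid")))
            )
    return hits
```

The function accepts any iterable of text lines, so it can be fed an open file object for a copy of /var/log/syslog. A thread name such as dp-ipc19 in the result matches the tname field of the "Update lpm tables" entries logged around the time of the crash.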



Environment

VMware NSX-T Data Center 3.x
VMware NSX-T Data Center

Cause

Invalid prefixes are being advertised from the T1 gateway to the T0 gateway as aggregate CIDRs through a route advertisement rule. The user entered an invalid network prefix, but the underlying problem is a missing validation check in Policy/Provider.
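
Until the fix is applied, prefixes can be checked before they are entered in a route advertisement rule. The following is a minimal sketch using Python's standard ipaddress module. This article does not state what made the prefix invalid, so the sketch assumes two common cases: a malformed string, and a CIDR with host bits set (for example 10.1.1.1/16 instead of 10.1.0.0/16). It is not the validation that NSX itself performs, and the function name is_valid_prefix is illustrative.

```python
import ipaddress


def is_valid_prefix(cidr: str) -> bool:
    """Return True if cidr is a well-formed network prefix with no host bits set."""
    try:
        # strict=True rejects prefixes whose host bits are set,
        # in addition to malformed addresses and out-of-range masks.
        ipaddress.ip_network(cidr.strip(), strict=True)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    for candidate in ["10.1.0.0/16", "10.1.1.1/16", "10.1.0.0/33", "2001:db8::/32"]:
        print(candidate, is_valid_prefix(candidate))
```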
 

Resolution

This issue is fixed in NSX version 3.2.3.

Workaround:
To find which route update in the lpm table is causing the segmentation fault and crashing the dataplane:
  1. Enable debug-level logging for the dataplane. The logs then show the exact prefix that triggers the crash.
  2. Once the prefix is known, search for it in the UI (elastic search) to find which T1 gateway is advertising it. Then either stop the advertisement or detach that T1 gateway from the T0 gateway.
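
If the UI search is not convenient, the same lookup can be done offline against a list of advertised prefixes that you have collected yourself. The following is a minimal sketch, assuming a hypothetical dictionary that maps each T1 gateway name to the prefixes in its route advertisement rules. The data layout, the gateway names, and the function find_advertisers are assumptions for illustration and do not come from an NSX API.

```python
import ipaddress


def _normalise(prefix: str):
    """Parse a prefix leniently; fall back to the raw string if it cannot be parsed."""
    text = prefix.strip()
    try:
        # strict=False masks off host bits, so 192.168.1.5/24 and
        # 192.168.1.0/24 compare as the same network.
        return ipaddress.ip_network(text, strict=False)
    except ValueError:
        return text


def find_advertisers(offending: str, advertised_by_t1: dict) -> list:
    """Return the names of T1 gateways whose advertised prefixes match offending."""
    target = _normalise(offending)
    matches = []
    for t1_name, prefixes in advertised_by_t1.items():
        if any(_normalise(p) == target for p in prefixes):
            matches.append(t1_name)
    return matches


if __name__ == "__main__":
    # Hypothetical inventory, e.g. copied by hand from each T1 gateway's rules.
    inventory = {
        "t1-app": ["10.1.0.0/16"],
        "t1-db": ["192.168.1.5/24", "172.16.0.0/12"],
    }
    print(find_advertisers("192.168.1.0/24", inventory))
```

Lenient parsing is used deliberately here, because the prefix being searched for is by definition one that may not pass strict validation.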