NSX-T edge nodes manual core dump generation steps for frr

Article ID: 322548

Products

VMware NSX Networking

Issue/Introduction

When the FRR service crashes on an NSX-T edge node, it should generate a core dump in:
/var/dump
However, no core dump is generated. This behavior is present in all GA versions: 3.1.x, 3.2.x, and 4.0.x.
This KB outlines the steps required to configure the FRR daemons (listed below) to generate a core dump when they crash.
FRR daemons:
  • zebra
  • ospfd
  • staticd
  • pimd
  • bgpd


Environment

VMware NSX-T Data Center

Resolution

These steps configure each FRR daemon to generate a core dump when it crashes.
As the root user on each NSX-T edge node, run the following command to find the PID of each daemon listed above:
ps -aux | grep -i frr

For example:

ps -aux | grep -i frr
...
frr       6441  0.0  0.0 892236  6156 ?        Ssl  Jan06  10:00 /usr/lib/frr/zebra -d -e8 -M fpm:netlink -Xx -A 127.0.0.1 -s 90000000
frr       6536  0.1  0.1 290584 10136 ?       Ssl  Jan06  80:10 /usr/lib/frr/bgpd -d -A 127.0.0.1
frr       6561  0.0  0.0 137112  4624 ?        Ssl  Jan06   6:27 /usr/lib/frr/ospfd -d -A 127.0.0.1
frr       6578  0.0  0.0  98036  3688 ?         Ss   Jan06   5:24 /usr/lib/frr/pimd -d -A 127.0.0.1
frr       6597  0.0  0.0  94372  2976 ?         Ss   Jan06   4:02 /usr/lib/frr/staticd -d -A 127.0.0.1
...
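To reduce the output above to just the daemon names and PIDs, it can be piped through a small filter. The following is a minimal sketch, assuming the daemon binaries live under /usr/lib/frr/ as in the example output and that the command path is the 11th column. The function name frr_pids is illustrative and not part of NSX-T.

```shell
#!/bin/sh
# Print "<daemon> <pid>" for each FRR daemon found in `ps -aux` style input.
# Assumption: the command path is field 11 and starts with /usr/lib/frr/.
frr_pids() {
    awk '$11 ~ /^\/usr\/lib\/frr\/(zebra|bgpd|ospfd|pimd|staticd)$/ {
        n = split($11, parts, "/")   # last path component is the daemon name
        print parts[n], $2           # field 2 is the PID
    }'
}

# Usage on an edge node:
#   ps -aux | frr_pids
```

The exact match on the binary path also drops the `grep -i frr` process itself, which otherwise appears in the unfiltered output.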
Run the following command once for each daemon, replacing the PID with the one found in the output above. With five daemons, the command is run five times, each time with a different PID:
prlimit --pid <pid of the FRR daemon> --core=500000000:500000000
For example:
prlimit --pid 6441 --core=500000000:500000000
prlimit --pid 6536 --core=500000000:500000000
prlimit --pid 6561 --core=500000000:500000000
prlimit --pid 6578 --core=500000000:500000000
prlimit --pid 6597 --core=500000000:500000000
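The five commands above can also be wrapped in a loop. The following is a minimal sketch, assuming the PIDs are passed as arguments. The script name, the apply_core_limit function, and the DRY_RUN variable are illustrative and not part of NSX-T.

```shell
#!/bin/sh
# Usage (hypothetical script name):
#   sh set_frr_core_limit.sh 6441 6536 6561 6578 6597
# Set DRY_RUN=1 to print the commands instead of running them.

CORE_LIMIT=500000000   # soft and hard limit in bytes, as used in this KB

apply_core_limit() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "prlimit --pid $1 --core=$CORE_LIMIT:$CORE_LIMIT"
    else
        prlimit --pid "$1" --core="$CORE_LIMIT:$CORE_LIMIT"
    fi
}

# Apply the limit to every PID given on the command line.
for pid in "$@"; do
    apply_core_limit "$pid"
done
```

Running it first with DRY_RUN=1 shows exactly which prlimit commands would be issued before anything is changed on the edge node.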
To verify that the core dump limit is now set, run the following command for each FRR daemon PID found above:
prlimit --pid <PID>
For example:
prlimit --pid 6441
RESOURCE   DESCRIPTION               SOFT           HARD        UNITS
AS         address space limit       unlimited      unlimited   bytes
CORE       max core file size        500000000      500000000   bytes
...
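As a cross-check that does not rely on prlimit, Linux exposes the same limits in /proc/<PID>/limits. The following is a minimal sketch, assuming that file is readable as root on the edge node. The script name and the core_limits function are illustrative and not part of NSX-T.

```shell
#!/bin/sh
# Usage (hypothetical script name):
#   sh check_frr_core_limit.sh 6441 6536 6561 6578 6597

# Print the soft and hard core limits from /proc/<pid>/limits style input.
# The row label "Max core file size" is four words, so the soft limit is
# field 5 and the hard limit is field 6.
core_limits() {
    awk '/^Max core file size/ { print $5, $6 }'
}

for pid in "$@"; do
    printf 'PID %s core limit (soft hard): ' "$pid"
    core_limits < "/proc/$pid/limits"
done
```

Each daemon configured with the steps above should report 500000000 for both values.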
Note: These settings are local to each edge node, so when an edge node is replaced, they must be applied again on the new node.