Core dumps are very useful for debugging a database panic or crash. It is recommended to enable core file generation on the master/coordinator and segment servers of Greenplum Database clusters.
This article covers how to configure the cluster to generate core dump files.
[gpadmin@mdw ~]$ gpssh -f hostfile ulimit -c
[sdw1] 0
[sdw2] 0
[ mdw] 0
# Core file size set to unlimited for user gpadmin
gpadmin - core unlimited
Note: Log out and back in to pick up the ulimit changes for the current login session.
[gpadmin@mdw ~]$ gpssh -f hostfile ulimit -c
[sdw1] unlimited
[sdw2] unlimited
[ mdw] unlimited
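The check above can also be scripted. The following is a minimal sketch, assuming bash; the function name core_limit_ok is hypothetical and not part of Greenplum. It succeeds only when the limit passed to it is "unlimited".

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Greenplum): succeed only when the given
# core file size limit is "unlimited".
core_limit_ok() {
    local limit=$1
    if [ "$limit" = "unlimited" ]; then
        echo "OK: core file size limit is unlimited"
    else
        echo "WARNING: core file size limit is '$limit', expected 'unlimited'" >&2
        return 1
    fi
}

# Check the limit of the current shell session; "|| true" keeps the script
# going after a warning.
core_limit_ok "$(ulimit -c)" || true
```

Run it in a fresh gpadmin login session on each host, since the value reported by "ulimit -c" reflects the session in which the command runs.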
kernel.core_uses_pid = 1
kernel.core_pattern = /<directory>/core-%e-%s-%u-%g-%p-%t
where <directory> is replaced with an appropriate location for the core files. Core files can be several GB in size.
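As an illustration, the two settings can be generated for a chosen directory with a small script. This is a sketch under stated assumptions: the function name core_sysctl_conf is hypothetical, and /var/crash and /etc/sysctl.d/corefiles.conf are the directory and file used in the example output in this article. Comments are written on their own lines because sysctl configuration files only treat lines that begin with # or ; as comments, so text after a value would become part of the value.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the core dump sysctl settings for a directory.
# The directory defaults to /var/crash when no argument is given.
core_sysctl_conf() {
    local dir=${1:-/var/crash}
    printf '# Core dump settings\n'
    printf 'kernel.core_uses_pid = 1\n'
    # "%%" prints a literal "%" so the pattern specifiers are kept as-is.
    printf 'kernel.core_pattern = %s/core-%%e-%%s-%%u-%%g-%%p-%%t\n' "$dir"
}

# Example (as root): core_sysctl_conf /var/crash > /etc/sysctl.d/corefiles.conf
core_sysctl_conf /var/crash
```

The output can be reviewed first and then redirected into the configuration file on each host.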
[root@mdw ~]$ gpssh -f hostfile chmod 1777 /var/crash
[root@mdw ~]$ source /usr/local/greenplum-db/greenplum_path.sh
Verify the settings:
[root@mdw ~]$ gpssh -f hostfile sysctl -p /etc/sysctl.d/corefiles.conf
[ mdw] kernel.core_uses_pid = 1
[ mdw] kernel.core_pattern = /var/crash/core-%e-%s-%u-%g-%p-%t
[sdw1] kernel.core_uses_pid = 1
[sdw1] kernel.core_pattern = /var/crash/core-%e-%s-%u-%g-%p-%t
[sdw2] kernel.core_uses_pid = 1
[sdw2] kernel.core_pattern = /var/crash/core-%e-%s-%u-%g-%p-%t
[root@mdw ~]$ gpssh -f hostfile sysctl kernel.core_uses_pid
[sdw2] kernel.core_uses_pid = 1
[ mdw] kernel.core_uses_pid = 1
[sdw1] kernel.core_uses_pid = 1
[root@mdw ~]$ gpssh -f hostfile sysctl kernel.core_pattern
[ mdw-lab1] kernel.core_pattern = /var/crash/core-%e-%s-%u-%g-%p-%t
[sdw2-lab1] kernel.core_pattern = /var/crash/core-%e-%s-%u-%g-%p-%t
[sdw1-lab1] kernel.core_pattern = /var/crash/core-%e-%s-%u-%g-%p-%t
See "Using systemd-coredump to debug application crashes". That documentation is written for SUSE Linux, but it also applies to other Linux distributions.
To test the configuration, run a simple process such as "sleep 600 &" in the background and kill it with "kill -11 <PID>" (signal 11 is SIGSEGV) to generate a core dump file. For example:
[gpadmin@mdw ~]$ sleep 600 &
[1] 3040
[gpadmin@mdw ~]$ kill -11 3040
[gpadmin@mdw ~]$
[1]+ Segmentation fault (core dumped) sleep 600
[gpadmin@mdw ~]$ ls -l /var/crash/core-*
-rw------- 1 gpadmin gpadmin 385024 Oct 3 13:42 /var/crash/core-sleep-11-1000-1000-3040-1727959336
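The file name encodes the fields of the core_pattern used above: %e is the executable name, %s the signal number, %u and %g the real UID and GID, %p the PID, and %t the time of the dump in seconds since the Epoch. The following sketch, assuming bash, splits such a name back into its fields; the function name parse_core_name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split a core file name produced by the pattern
# core-%e-%s-%u-%g-%p-%t into its fields.
parse_core_name() {
    local name rest
    name=$(basename "$1")
    rest=${name#core-}               # drop the literal "core-" prefix
    # Peel the five numeric fields off the right-hand side, so that an
    # executable name containing "-" is still handled correctly.
    local epoch=${rest##*-}; rest=${rest%-*}
    local pid=${rest##*-};   rest=${rest%-*}
    local gid=${rest##*-};   rest=${rest%-*}
    local uid=${rest##*-};   rest=${rest%-*}
    local sig=${rest##*-};   rest=${rest%-*}
    printf 'exe=%s signal=%s uid=%s gid=%s pid=%s time=%s\n' \
        "$rest" "$sig" "$uid" "$gid" "$pid" "$epoch"
}

parse_core_name /var/crash/core-sleep-11-1000-1000-3040-1727959336
# prints: exe=sleep signal=11 uid=1000 gid=1000 pid=3040 time=1727959336
```

Parsing from the right is a design choice: only the executable name can contain hyphens, while the remaining five fields are always numeric.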
Greenplum Database must be restarted for the ulimit changes to take effect.
Log out and back in as gpadmin first, so that the new login session picks up the updated ulimits before you restart the database.
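After the restart, one way to confirm that a running process picked up the new limit is to read its entry under /proc on Linux. This is a minimal sketch and not part of the Greenplum tooling: the function name core_soft_limit is hypothetical, and it takes the path to a /proc/<PID>/limits file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the soft "Max core file size" limit recorded in
# a /proc/<PID>/limits file (Linux only). The soft limit is the fifth
# whitespace-separated field on that line.
core_soft_limit() {
    awk '/^Max core file size/ { print $5 }' "$1"
}

# Example: inspect the current shell. For Greenplum, substitute the PID of a
# running postgres process (for instance, one found with
# "pgrep -u gpadmin postgres").
if [ -r "/proc/$$/limits" ]; then
    core_soft_limit "/proc/$$/limits"
fi
```

A value of "unlimited" indicates that the process can write a full core file. A value of 0 suggests the process was started from a session that still had the old limit.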