Here are a few symptoms to check to confirm the nature of the issue:
+ Queries are not progressing, and cancellation attempts do not complete, from the end-user perspective.
+ Database connections are not being established successfully.
+ gpstop is not progressing.
+ gpssh to all or some segment hosts is not responding.
+ Load is excessively high on all segment hosts (a quick cluster-wide check is sketched after this list).
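A minimal sketch for confirming the gpssh and load symptoms across the cluster (the hostfile path and host name are assumptions; adjust them for your environment):

# Run uptime on every host through gpssh; a hang here reproduces the gpssh symptom,
# while high load averages in the output confirm the load symptom.
gpssh -f /home/gpadmin/hostfile_all -e 'uptime'

# If gpssh itself hangs, check an individual segment host directly over ssh.
ssh sdw1 'uptime; free -g'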
All Greenplum versions.
The congestion and slowness were concluded to be a side effect of CPU time being consumed by excessive swap usage.
+ Stuck spinlocks are reported in the logs:
2024-06-13 21:30:35.569294 UTC,"read_only","coredw",p28261,th-1391740800,"xx.xx.0.11","40302",2024-06-13 20:30:54 UTC,0,con368574,cmd44,seg224,slice239,,,sx1,"PANIC","XX000","stuck spinlock (0x7f55981a300c) detected at instrument.c:398 (s_lock.c:42)",,,,,,,0,,"s_lock.c",42," Stack trace:
1 0xc015e7 postgres errstart (elog.c:557)
2 0xc0447e postgres elog_finish (elog.c:1728)
3 0xa8787e postgres <symbol not found> (s_lock.c:41)
4 0x8dd972 postgres <symbol not found> (discriminator 1)
5 0xc420e2 postgres <symbol not found> (discriminator 3)
6 0xc41ff0 postgres <symbol not found> (discriminator 3)
7 0xc427be postgres ResourceOwnerRelease (discriminator 2)
8 0x732b8c postgres <symbol not found> (xact.c:3365)
9 0x735455 postgres AbortCurrentTransaction (xact.c:3982)
10 0xa993b0 postgres PostgresMain (postgres.c:5069)
11 0x6b3553 postgres <symbol not found> (postmaster.c:4492)
12 0xa1ecb6 postgres PostmasterMain (postmaster.c:1517)
13 0x6b7431 postgres main (main.c:205)
14 0x7f55a9a67555 libc.so.6 __libc_start_main + 0xf5
15 0x6c32ac postgres <symbol not found> + 0x6c32ac
+ CPU saturation with high system (%sys) and I/O wait (%iowait) usage:
CPU %usr %nice %sys %iowait %steal %irq %soft %guest %gnice %idle
09:30:40 PM all 0.34 0.00 82.75 16.20 0.00 0.00 0.69 0.00 0.00 0.02
09:31:26 PM all 0.37 0.00 0.90 97.82 0.00 0.00 0.91 0.00 0.00 0.00
+ Excessive swap usage during congestion:
kbswpfree kbswpused %swpused kbswpcad %swpcad
09:30:40 PM 250979304 34575112 12.11 276084 0.80
09:31:26 PM 253616356 31938060 11.18 392280 1.23
09:32:26 PM 254341568 31212848 10.93 579800 1.86
09:33:26 PM 255950616 29603800 10.37 638020 2.16
+ Swap usage during normal processing:
12:01:01 AM 275034356 10520060 3.68 75528 0.72
+ Swap paging metrics (pswpin/s, pswpout/s) show swap demand (the sar commands used to collect these samples are sketched below):
pswpin/s pswpout/s
08:32:02 PM 702.19 13874.26
:
09:30:40 PM 19.25 3329.45
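The CPU, swap-space, and swap-paging samples above can be collected with sar; a minimal sketch (the 10-second interval and sample count are arbitrary choices):

# CPU utilization: %usr, %sys, %iowait per sample
sar -u 10 6

# Swap-space utilization: kbswpfree, kbswpused, %swpused
sar -S 10 6

# Swap paging rates: pswpin/s, pswpout/s
sar -W 10 6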
1. Reduce Swap Usage
+ Change the kernel parameter vm.swappiness from 10 to 1. This helps the cluster reduce the overhead of swap processing, which in turn frees CPU cycles for user workloads.
+ Create a backup copy of the existing /etc/sysctl.conf file, then set vm.swappiness = 1 in the current file on the coordinator and all segment hosts in the cluster. A host restart is not required to apply the change (a sketch follows below).
For more detailed information, refer to Overview of memory tuning best practices for Greenplum Database.
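A minimal sketch of the change on a single host, to be repeated on the coordinator and every segment host (the backup file name is arbitrary, and the sed command assumes a vm.swappiness line already exists in the file):

# Back up the existing configuration before editing it
cp /etc/sysctl.conf /etc/sysctl.conf.bak

# Change the existing vm.swappiness entry from 10 to 1
sed -i 's/^vm.swappiness.*/vm.swappiness = 1/' /etc/sysctl.conf

# Apply the new value without restarting the host, then verify it
sysctl -p
sysctl vm.swappiness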
2. Optimize memory usage. For more details, refer to Resource Queues and Memory Management; a sketch of commands to review the current memory-related settings follows.
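One way to review the memory-related settings discussed in that article (a minimal sketch; the database name is an assumption, and gp_toolkit.gp_resqueue_status applies only when Resource Queues are in use):

# Per-segment memory protection limit and default per-query memory target
psql -d postgres -c 'SHOW gp_vmem_protect_limit;'
psql -d postgres -c 'SHOW statement_mem;'

# Current resource queue status, including memory limits
psql -d postgres -c 'SELECT * FROM gp_toolkit.gp_resqueue_status;'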
3. Check whether cgroups are enabled (a sketch of the checks follows this step). See the Premature swapping while there is still plenty of pagecache to be reclaimed KB on the Red Hat web site.
If Greenplum Database is using Resource Queues, disable cgroups completely on all hosts in the cluster: run "systemctl disable cgconfig" and reboot the hosts.
If cgroups v1 is used (cgroups v2 would be preferable):
+ Set "vm.force_cgroup_v2_swappiness = 1" in /etc/sysctl.conf.
+ Check that the kernel parameter "vm.swappiness = 10" is set in /etc/sysctl.conf. (It is possible to set this to 1 if 10 seems too high, but never set it to 0, as that causes issues with the OOM killer.)
If cgroups v2 is used:
+ Check that the kernel parameter "vm.force_cgroup_v2_swappiness = 0" is set.
+ Check that the kernel parameter "vm.swappiness = 1" is set.
If Greenplum Database is using Resource Groups, ensure cgroups are configured as described in the documentation.
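A minimal sketch for checking which cgroup version is mounted and the current values of the related kernel parameters (vm.force_cgroup_v2_swappiness is a RHEL-specific tunable and may not exist on every kernel):

# "cgroup2fs" indicates cgroups v2; "tmpfs" indicates cgroups v1
stat -fc %T /sys/fs/cgroup/

# Current values of the swappiness-related kernel parameters
sysctl vm.swappiness
sysctl vm.force_cgroup_v2_swappiness 2>/dev/null || echo "vm.force_cgroup_v2_swappiness is not available on this kernel"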