gpstart/gpinitsystem failed with "could not create semaphores: No space left on device"

Article ID: 295373


Products

VMware Tanzu Greenplum

Issue/Introduction

Symptoms:

During the creation of a new database cluster with the "gpinitsystem" utility, or during a GPDB restart, the database fails to start some of the segments. A sample log from a gpstart failure shows:

20140205:23:00:05:027234 gpstart:mdw:gpadmin-[INFO]:-dumping success segments: []
20140205:23:00:05:027234 gpstart:mdw:gpadmin-[INFO]:-----------------------------------------------------
20140205:23:00:05:027234 gpstart:mdw:gpadmin-[INFO]:-DBID:2 FAILED host:'sdw5' datadir:'/data1/njonna/gp4300/gpdb_p1/gp0' with reason:
20140205:23:00:05:027234 gpstart:mdw:gpadmin-[INFO]:-DBID:3 FAILED host:'sdw5' datadir:'/data2/njonna/gp4300/gpdb_p2/gp1' with reason:
20140205:23:00:05:027234 gpstart:mdw:gpadmin-[INFO]:-DBID:4 FAILED host:'sdw6' datadir:'/data1/njonna/gp4300/gpdb_p1/gp2' with reason:
20140205:23:00:05:027234 gpstart:mdw:gpadmin-[INFO]:-DBID:5 FAILED host:'sdw6' datadir:'/data2/njonna/gp4300/gpdb_p2/gp3' with reason:
20140205:23:00:05:027234 gpstart:mdw:gpadmin-[INFO]:-DBID:9 FAILED host:'sdw8' datadir:'/data2/njonna/gp4300/gpdb_p2/gp7' with reason:
20140205:23:00:05:027234 gpstart:mdw:gpadmin-[INFO]:-DBID:8 FAILED host:'sdw8' datadir:'/data1/njonna/gp4300/gpdb_p1/gp6' with reason:
20140205:23:00:05:027234 gpstart:mdw:gpadmin-[INFO]:-DBID:6 FAILED host:'sdw7' datadir:'/data1/njonna/gp4300/gpdb_p1/gp4' with reason:
20140205:23:00:05:027234 gpstart:mdw:gpadmin-[INFO]:-DBID:7 FAILED host:'sdw7' datadir:'/data2/njonna/gp4300/gpdb_p2/gp5' with reason:

Check the segment log on one of the failed hosts:

$ ssh sdw1
$ less /data1/primary/gpdb_p1/gpseg1/pg_log/gpdb-2014-02-05_000000.csv
$ less /data1/primary/gpdb_p1/gpseg1/pg_log/startup.log

The following entries explain the startup failure:

2014-02-05 23:00:58.735743 CET,,,p5575,th-41367808,,,,0,,,seg-1,,,,,"FATAL","XX000","could not create semaphores: No space left on device (pg_sema.c:132)","Failed system call was semget(50002001, 17, 03600).","This error does not mean that you have run out of disk space.
It occurs when either the system limit for the maximum number of semaphore sets (SEMMNI), or the system wide maximum number of semaphores (SEMMNS), would be exceeded.  You need to raise the respective kernel parameter.  Alternatively, reduce PostgreSQL's consumption of semaphores by reducing its max_connections parameter (currently 250).
The PostgreSQL documentation contains more information about configuring your system for PostgreSQL.",,,,,,"InternalIpcSemaphoreCreate","pg_sema.c",132,1    0xa739de postgres errstart + 0x4ee
2    0x85c6a8 postgres PGSemaphoreCreateInitVal + 0x328
3    0x8e8635 postgres InitProcGlobal + 0x3a5
4    0x8d745c postgres CreateSharedMemoryAndSemaphores + 0x59c
5    0x87b574 postgres PostmasterMain + 0xc64
6    0x780ada postgres main + 0x4da
7    0x31ba21ecdd libc.so.6 __libc_start_main + 0xfd
8    0x47cc19 postgres  + 0x47cc19

Environment


Cause

As the error indicates, the issue is caused by improper kernel semaphore settings: there are not enough semaphores available at the operating system (OS) level.

Note: This error does not mean that you have run out of disk space.
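
On Linux, kernel.sem packs four limits in a fixed order: SEMMSL (maximum semaphores per array), SEMMNS (system-wide maximum number of semaphores), SEMOPM (maximum operations per semop call), and SEMMNI (maximum number of semaphore arrays). As a quick sketch (the values shown here are the example values from this article, not a recommendation), the live limits can be read directly:

$ cat /proc/sys/kernel/sem
250     512000  100     2048

$ ipcs -ls
------ Semaphore Limits --------
max number of arrays = 2048
max semaphores per array = 250
max semaphores system wide = 512000
max ops per semop call = 100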

Resolution

  • Refer to the Greenplum Database Installation Guide for the recommended OS kernel parameters.
  • If the values already match the Installation Guide recommendations, verify that they are not enclosed in double quotes; when the value is quoted, the kernel silently falls back to its default values.

Current (quoted, therefore ignored) setting:

$ cat /etc/sysctl.conf | grep sem
kernel.sem="250 512000 100 2048"

Suggested setting without double quotes:

$ cat /etc/sysctl.conf | grep sem
kernel.sem = 250 512000 100 2048

Verify with the "sysctl -a | grep sem" command to see the setting actually in effect, rather than checking only the /etc/sysctl.conf file:

$ sysctl -a | grep sem
kernel.sem = 250     512000     200     2048
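
After correcting /etc/sysctl.conf on every host (for example via gpssh), the new values can be loaded without a reboot and then verified; a minimal sketch:

$ sudo sysctl -p
kernel.sem = 250 512000 100 2048

$ cat /proc/sys/kernel/sem
250     512000  100     2048

Once all hosts report the corrected values, retry gpstart (or gpinitsystem).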




Update (2020-05-25):

Another possible cause is a Linux defect:

https://access.redhat.com/solutions/4968021

Any SEMMNI value (the fourth field of kernel.sem) higher than 32768 causes the kernel to fall back to the default of 128, which is not enough for Greenplum.

The evidence appears in /var/log/messages:

May 25 09:23:47 mdw systemd-sysctl[366]: Failed to write '500 2048000 200 40960' to '/proc/sys/kernel/sem': Numerical result out of range
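
When that write fails, the kernel keeps its built-in defaults. A quick check to confirm the fallback (250 32000 32 128 are the stock Linux defaults; the last field, SEMMNI=128, matches the defect description):

$ cat /proc/sys/kernel/sem
250     32000   32      128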

Resolution: Set SEMMNI to 32768 and contact RHEL Support if required.
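
A sketch of the capped setting, assuming the 500/2048000/200 values from the failed write above:

$ grep kernel.sem /etc/sysctl.conf
kernel.sem = 500 2048000 200 32768

$ sudo sysctl -p
kernel.sem = 500 2048000 200 32768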



Update (2021-09-18):

A third reason might be a kernel.sem value that is too small for the scale of the cluster. The Greenplum 5.28 installation guide advises:

kernel.sem = 500 2048000 200 4096

The problematic cluster has 48 primary segments per host. When tested in the lab, one primary segment consumes 96 semaphore arrays, so 48 segments would consume 4608. Checking the actual number used:

ipcs -s -u
[sdw2] ------ Semaphore Status --------
[sdw2] used arrays = 4734
[sdw2] allocated semaphores = 79998
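
The [sdw2] prefix above suggests the command was run across the cluster; to collect the same numbers from every segment host at once, something like this works (the hostfile path is an assumption, substitute your own):

$ gpssh -f /home/gpadmin/hostfile_exkeys 'ipcs -s -u'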

It looks like for big systems this recommendation is not enough: a host with 48 segments consumes more than 4096 arrays.
Interestingly, the arrays are not fully used: 79998 / 4734 ≈ 16.9 semaphores per array. We configured up to 4096 arrays with a maximum of 500 semaphores each, but in practice each array holds about 17 semaphores on average.

Workaround:

Option a) Keep the 2M system-wide maximum number of semaphores (SEMMNS) but increase the number of arrays (SEMMNI) and decrease the per-array size (SEMMSL):

kernel.sem = 250 2048000 200 8192

Option b) Increase both the total number of semaphores and the number of arrays:

kernel.sem = 500 4096000 200 8192
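
Either option can be sanity-checked against the observed per-segment usage before restarting. A rough sizing sketch in shell, using the lab numbers above (96 arrays and about 17 semaphores per array per primary segment are observed values, not guarantees):

# Per-host primary segment count; adjust for your cluster
segments=48
arrays_needed=$((segments * 96))      # 4608 arrays, so SEMMNI=8192 leaves headroom
sems_needed=$((arrays_needed * 17))   # ~78336 semaphores, well under SEMMNS=2048000
echo "arrays needed: $arrays_needed, semaphores needed: $sems_needed"

Remember to reload the setting with "sysctl -p" on every host after changing /etc/sysctl.conf.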