"too many open files" error in Tanzu Greenplum

Article ID: 295345

Products

VMware Tanzu Greenplum

Issue/Introduction

Symptoms:

Errors similar to the ones below are found in the database log files:

  • "could not create socket: Too many open files"
  • "could not create temporary file base/19781/pgsql_tmp/workfile_set_HashJoin_Slice-1.XXXXDtKNoF/spillfile_f341:Too many open files"
  • Errors in GPDB utility logs containing "Too many open files"

Environment


Cause

"Too many open files" errors happen when a process needs to open more files than it is allowed by the operating system. This number is controlled by the maximum number of file descriptors the process has.


The file descriptor limit for the current process can be displayed with the following commands:

[root@mdw ~]# ulimit -a | grep open
open files (-n) 524288
[root@mdw ~]# ulimit -n
524288
[root@mdw ~]#

When a process is created, it inherits the limits from its parent environment, which may differ from the current settings. In GPDB, all database-related processes inherit their limits from the postmaster. The limits currently in effect for the postmaster can be verified in the /proc file system:

[gpadmin@mdw ~]$ ps -ef | grep silent
gpadmin 50746 1 0 Jul25 ? 00:00:00 /usr/local/greenplum-db-4.3.5.2/bin/postgres -D /data/master/gp4352seg-1 -p 5432 -b 1 -z 2 --silent-mode=true -i -M master -C -1 -x 0 -E

[gpadmin@mdw ~]$ cat /proc/50746/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            10485760             unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             131072               131072               processes
Max open files            65536                65536                files
Max locked memory         32768                32768                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       385257               385257               signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
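
To see how close the postmaster is to its "Max open files" limit, the number of file descriptors it currently has open can be counted from the same /proc entry (the PID 50746 here is taken from the example above; substitute the postmaster PID on your system):

[gpadmin@mdw ~]$ ls /proc/50746/fd | wc -l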

The maximum number of file descriptors is controlled in two different ways:
 

1. /etc/security/limits.conf
 

Focus on the following lines:

* soft nofile 65536
* hard nofile 65536

2. Explicitly set the number of file descriptors using the ulimit command.

Note: This can be done from one of the scripts that run automatically at login (for example, .bashrc or .bash_profile).

[root@mdw ~]# ulimit -n 524288
[root@mdw ~]#

For more information about ulimit and the number of file descriptors, see the ulimit man page and Linux documentation.

DCAv1 originally set the maximum number of open files per process to 64K (65536). This limit proved to be too low for many GPDB workloads, so the recommendation is to increase this value to 256K (262144) or 512K (524288).


DCAv2 standardized on 512K (524288), which is the current recommendation.


DCA upgrade

Unfortunately, the DCA upgrade does not preserve this setting, because it replaces /etc/security/limits.conf with the original file from the ISO.

Resolution

First, check the configured number of file descriptors per process. If the value is lower than 512K (524288), increase the limit. The new value can be set in /etc/security/limits.conf (requires an OS restart to take effect) or in your "gpadmin" account (.bashrc or another automatically executed script).
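
As an illustration only (the user names and values below are examples and should be adjusted to your environment), the two options could look like this:

# Option 1: entries in /etc/security/limits.conf (effective for new sessions)
* soft nofile 524288
* hard nofile 524288

# Option 2: line added to gpadmin's ~/.bashrc (raises the limit for shells that start GPDB)
ulimit -n 524288

After the database is restarted, the limit actually in effect for the postmaster can be confirmed again via /proc/<postmaster pid>/limits, as shown above.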


Running out of maximum file descriptors for the system

fs.file-max is the maximum number of file descriptors (FD) enforced at the kernel level; the total number of files opened by all processes combined cannot exceed it. The ulimit is enforced at the process level and can be lower than file-max. In some scenarios, even though ulimit has been configured correctly, the system-wide limit may be lower than the total number of files opened by all processes. When a process then tries to open another file, the system-wide limit is hit and the following error message is produced:

Too many open files in system

To fix this, check the value of fs.file-max in /etc/sysctl.conf. If it is lower than the total number of files currently open on the entire system (which can be counted with lsof | wc -l), increase it with the following steps; a quick way to inspect the current usage and limit is sketched after step 2:
 

1. Edit the following line in the /etc/sysctl.conf file:


fs.file-max = value

Note: value is the new file descriptor limit that you want to set.


2. Apply the change by running the following command:


# /sbin/sysctl -p
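
The following is a minimal sketch of how the kernel-level usage and limit can be inspected before and after the change (commands only, output omitted):

# cat /proc/sys/fs/file-nr    # in-use, unused, and maximum file handles
# sysctl fs.file-max          # current kernel-level limit
# lsof | wc -l                # rough count of files currently open system-wide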

Note: Before running the DCA upgrade on DCAv1, save the contents of /etc/security/limits.conf.
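
For example (the name and location of the backup copy are arbitrary):

[root@mdw ~]# cp -p /etc/security/limits.conf /root/limits.conf.pre-dca-upgrade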