gpfdist process has very high memory usage

Article ID: 296853

Products

VMware Tanzu Greenplum

Issue/Introduction

The gpfdist process was started with the "-m" option to increase the buffer size from the default of 32KB.

[gpadmin@host01 ~]$ ps -aef | egrep '[g]pfdist'
gpadmin 31939 31908 0 Feb07 pts/1 00:06:50 gpfdist -p 8881 -d /home/datain -m 268435456

The gpfdist process frequently fails with the error:
FATAL [0:2431:0:112] out of memory when allocating buffer: 268435456 bytes

The memory usage of the gpfdist process becomes very high over time:
USER     PID   %CPU %MEM  VSZ       RSS      TTY STAT START TIME   COMMAND
gpadmin  12224 0.1  85.1  27580592  26149896 ?   S    Jan21 31:52  gpfdist -p 8881 -d /home/datain -m 268435456


Environment

Product Version: 6.21

Resolution

Each time a query accesses an external table through gpfdist, every segment connects to the gpfdist process, and gpfdist allocates a buffer of the size specified by the "-m" option for each connection.
If there are 32 segments in the cluster and gpfdist runs with "-m 268435456" (256MB), it uses 8GB of RAM (32 x 256MB) while one table is accessed. If 2 tables are accessed concurrently, 16GB of RAM is required, and so on.
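
The arithmetic above can be sketched as a small shell helper. This is only a rough estimate that assumes one buffer per connecting segment per concurrently accessed table, as described in this article; the function name estimate_gpfdist_mib is hypothetical and is not part of Greenplum.

```shell
# Rough estimate of gpfdist buffer memory, in MiB.
# Arguments (all illustrative, not a Greenplum interface):
#   $1 = number of segments connecting to gpfdist
#   $2 = buffer size in bytes (the value passed to "-m")
#   $3 = number of external tables accessed concurrently
estimate_gpfdist_mib() {
  echo $(( $1 * $2 * $3 / 1024 / 1024 ))
}

estimate_gpfdist_mib 32 268435456 1   # prints 8192  (8GB)
estimate_gpfdist_mib 32 268435456 2   # prints 16384 (16GB)
```

Comparing the result with the RAM available on the gpfdist host gives a quick indication of whether a given "-m" value is sustainable for the expected concurrency.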

When several external tables are accessed concurrently, memory usage rises rapidly.
A Unix process typically does not release freed memory back to the host until it exits. The memory is kept for reuse within the process. This can cause gpfdist to hold large amounts of RAM over time.

This is functioning as designed.

Workaround options:
1. Reduce the buffer size used by gpfdist by starting it with a smaller "-m" value.
2. Reduce the GUC gp_external_max_segs so that fewer segments connect to gpfdist for each query, which reduces the number of buffers required.
3. Increase the RAM available on the host.
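
A minimal sketch of the first two options follows. The 32MB buffer size and the value 16 are arbitrary illustrations, and the database name "mydb" and external table name "ext_sales" are hypothetical. It assumes gp_external_max_segs can be set at session level in your version, so verify this against the documentation for your release.

```shell
# Option 1: restart gpfdist with a smaller buffer (32MB here, chosen for illustration).
# The buffer must still be large enough for the widest row in the data.
gpfdist -p 8881 -d /home/datain -m 33554432 &

# Option 2: limit how many segments connect to gpfdist for this session.
# "mydb" and "ext_sales" are hypothetical names.
psql -d mydb -c "SET gp_external_max_segs = 16; SELECT count(*) FROM ext_sales;"
```

With these illustrative values, one external table scan would need roughly 16 x 32MB = 512MB of buffer memory instead of 8GB.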