The purpose of this article is to discuss different types of out of memory (OOM) messages produced during query processing in Greenplum or HAWQ, possible causes, and recommendations to address specific issues.
Below is the list of different types of out of memory messages that are found in Greenplum database segments/master logs.
First, understand the essential key terms required to solve a specific issue.
1. Virtual Memory is the amount of RAM + SWAP configured on the system.
Use the "free" Linux command to view the amount of memory available on the server.
Example output:
mdw$ free -g total used free shared buffers cached Mem: 126 120 5 0 0 115 -/+ buffers/cache: 4 121 Swap: 63 0 63
In the above output, RAM available on the system is 126 GB, SWAP configured on the system is 63 GB. So, the total Virtual Memory available on the system is 126 + 63 = 189 GB.
Note: There are other parameters which control the amount of actual virtual memory available for the applications (example: GPDB) use:
See Set the required operating system parameters for the recommended settings for these parameters.
Essentially the available virtual memory on the system for applications (ex: GPDB) based on the default values of above parameters is calculated as below:
Formula: RAM/vm.overcommit_ratio + SWAP
Example based on free output shared above: 126 / (50 / 100) + 63 = 63 + 63 = 126 GB
While configuring the application, consider the upper limit of memory as 126 GB and not 189 GB in the above examples.
2. gp_vmem_protect_limit
Sets the amount of memory (in number of MBs) that all postgres processes of an active primary segment instance can consume. If a query causes this limit to be exceeded, the memory will not be allocated and the query will fail.
Refer to Greenplum documentation to understand how to calculate this parameter for a GPDB cluster based on the available virtual memory on the servers.
3. gp_vmem_limit_per_query
This GUC is not exposed (for the customer or in documentation), use it cautiously. The purpose of this GUC is to control the amount of Virtual memory which can be consumed by any individual query. If any query hits the limit defined by gp_vmem_limit_per_query, it will be canceled.
Now let's move to review the possible causes and recommendations if a specific error message is seen.