Out of memory or VM Protect errors

Products

VMware Tanzu Greenplum Greenplum Pivotal Data Suite Non Production Edition VMware Tanzu Data Suite VMware Tanzu Data Suite

Issue/Introduction

The purpose of this article is to discuss different types of out of memory (OOM) messages produced during query processing in Greenplum or HAWQ, possible causes, and recommendations to address specific issues.

Environment

Resolution

Below is the list of different types of out of memory messages that are found in Greenplum database segments/master logs.

VM Protect failed to allocate %d bytes, %d MB available.
Per-query VM protect limit reached: current limit is %d kB, requested %d bytes, available %d MB.
VM protect failed to allocate %d bytes from the system, VM Protect %d MB available.
Failed to allocate memory under virtual memory protection.

First, understand the essential key terms required to solve a specific issue.

1. Virtual Memory is the amount of RAM + SWAP configured on the system.

Use the "free" Linux command to view the amount of memory available on the server.

Example output:

mdw$ free -g
              total           used       free     shared    buffers     cached
Mem:           126        120          5          0          0        115
-/+ buffers/cache:          4        121
Swap:           63          0         63

In the above output, RAM available on the system is 126 GB, SWAP configured on the system is 63 GB. So, the total Virtual Memory available on the system is 126 + 63 = 189 GB.

Note: There are other parameters which control the amount of actual virtual memory available for the applications (example: GPDB) use:

vm.overcommit_memory (default value: 0)
vm.overcommit_ratio (default value: 50%)

See Set the required operating system parameters for the recommended settings for these parameters.

Essentially the available virtual memory on the system for applications (ex: GPDB) based on the default values of above parameters is calculated as below:

Formula: RAM/vm.overcommit_ratio + SWAP

Example based on free output shared above: 126 / (50 / 100) + 63 = 63 + 63 = 126 GB

While configuring the application, consider the upper limit of memory as 126 GB and not 189 GB in the above examples.

2. gp_vmem_protect_limit

Sets the amount of memory (in number of MBs) that all postgres processes of an active primary segment instance can consume. If a query causes this limit to be exceeded, the memory will not be allocated and the query will fail.

Refer to Greenplum documentation to understand how to calculate this parameter for a GPDB cluster based on the available virtual memory on the servers.

3. gp_vmem_limit_per_query

This GUC is not exposed (for the customer or in documentation), use it cautiously. The purpose of this GUC is to control the amount of Virtual memory which can be consumed by any individual query. If any query hits the limit defined by gp_vmem_limit_per_query, it will be canceled.

Now let's move to review the possible causes and recommendations if a specific error message is seen.

What is "VM Protect failed to allocate %d bytes, %d MB available" error?
What are " VM protect failed to allocate %d bytes from the system, VM Protect %d MB available"errors?
Per-query VM protect limit reached: current limit is %d kB, requested %d bytes, available %d MB (Yet to update with link).
Failed to allocate memory under virtual memory protection (yet to update with link).