GemFire cluster startup fails with error "[error][gc] Failed to commit memory (Not enough space)"

Article ID: 434651

Products

VMware Tanzu Data Suite

Issue/Introduction

The error message 'Failed to commit memory (Not enough space)' at locator startup is a direct signal from the JVM that it is asking the operating system for memory that is not available.

[0.014s][error][gc] Failed to commit memory (Not enough space)
[0.015s][error][gc] Failed to commit memory (Not enough space)
[0.015s][error][gc] Forced to lower max Java heap size from 2048M(100%) to 784M(38%)
[0.015s][error][gc] Failed to allocate initial Java heap (2048M)
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
Exception occurred when starting the locator, please check the locator log for details

 

This article covers the OS-level configuration needed to ensure that sufficient memory can be allocated to GemFire at startup.

Environment

All supported GemFire versions on Linux

Resolution

Follow these steps to identify and resolve OS-level memory bottlenecks.

1. Verify Physical Memory Availability

First, ensure the host has enough physical RAM available to satisfy the -Xms (Initial Heap) request.

bash-4.4$ free -m
              total        used        free      shared  buff/cache   available
Mem:         128400       40103       84102           0        4194       8716
Swap:          5119        2457        2662

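Note: If the value in the available column is less than your GemFire heap settings, the JVM cannot commit the heap. Reduce the heap settings, free memory used by other processes on the host, or increase physical RAM.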
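The comparison in this step can be scripted. The following is a minimal sketch, not an official GemFire tool: the function name check_available and the 2048 MB heap value are illustrative assumptions, and it reads the MemAvailable field of /proc/meminfo, which is the value that free reports in its available column on current Linux distributions.

```shell
# Hypothetical helper: compare MemAvailable against a planned heap size.
# $1 = heap size in MB, $2 = meminfo file (defaults to /proc/meminfo)
check_available() {
    heap_mb=$1
    meminfo=${2:-/proc/meminfo}
    # MemAvailable is reported in kB; convert to whole MB
    avail_mb=$(awk '/^MemAvailable:/ {print int($2 / 1024)}' "$meminfo")
    if [ -z "$avail_mb" ]; then
        echo "MemAvailable not found in $meminfo" >&2
        return 2
    fi
    if [ "$avail_mb" -ge "$heap_mb" ]; then
        echo "OK: ${avail_mb} MB available, ${heap_mb} MB requested"
    else
        echo "INSUFFICIENT: ${avail_mb} MB available, ${heap_mb} MB requested"
        return 1
    fi
}

# Example: check a 2048 MB initial heap, the size shown in the error log
if [ -r /proc/meminfo ]; then
    check_available 2048 || true
fi
```

When several locators and servers share a host, pass the sum of their heap sizes rather than a single heap.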

2. Check and Set memlock Limits

The max locked memory (memlock) limit for the user running GemFire must be greater than the heap size.

Check current limit:

ulimit -l

The value is reported in kilobytes. If the output is a small value (e.g., 64), startup will fail.

How to Fix:

Edit /etc/security/limits.conf and set the value to unlimited or a value larger than your total JVM memory for the user running GemFire:

<user_name> soft memlock unlimited
<user_name> hard memlock unlimited

Note: You must log out and log back in for ulimit changes to take effect.
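After logging back in, the new limit can be compared with the heap size. The sketch below assumes the heap size is known in MB; the function name memlock_ok and the 2048 MB value are illustrative, and the only input is the output of ulimit -l (a number in kilobytes, or the word unlimited).

```shell
# Hypothetical helper: is the memlock limit larger than the heap?
# $1 = output of `ulimit -l` (kB or "unlimited"), $2 = heap size in MB
memlock_ok() {
    limit=$1
    heap_kb=$(( $2 * 1024 ))
    if [ "$limit" = "unlimited" ]; then
        return 0
    fi
    # The limit must be strictly greater than the heap size
    [ "$limit" -gt "$heap_kb" ]
}

# Example: check the current shell's limit against a 2048 MB heap
if memlock_ok "$(ulimit -l)" 2048; then
    echo "memlock limit is sufficient for a 2048 MB heap"
else
    echo "memlock limit is too low for a 2048 MB heap"
fi
```

Run the check as the same user that starts GemFire, since limits.conf entries are applied per user.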

3. Configure Huge Pages

If you are using -XX:+UseLargePages, the OS must have enough pre-allocated huge pages available. If the sum of all JVM heaps on a physical host (locators and servers) exceeds the available huge pages, the JVM fails to commit memory at startup when the -XX:+AlwaysPreTouch flag is set.

Calculate Required Huge Pages:

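To find the required number of pages, use the following formula, where Total Heap MB is the sum of the heaps of all JVMs on the host. Round the result up to a whole number of pages.

vm.nr_hugepages = (Total Heap MB / HugePage Size in MB) * 1.05

(The 1.05 factor adds a 5% safety buffer for JVM overhead.)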
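As a worked example of the formula, the sketch below computes the page count with integer arithmetic and rounds up. The function name required_hugepages and the host layout (one 2048 MB locator plus two 15360 MB servers, 32768 MB in total) are assumptions chosen for illustration.

```shell
# Hypothetical helper: huge pages needed for a given total heap, plus 5%.
# $1 = total heap of all JVMs on the host in MB, $2 = huge page size in kB
required_hugepages() {
    heap_kb=$(( $1 * 1024 ))
    hp_kb=$2
    # Pages needed for the heap itself, rounded up to a whole page
    pages=$(( (heap_kb + hp_kb - 1) / hp_kb ))
    # Apply the 1.05 factor, again rounding up
    echo $(( (pages * 105 + 99) / 100 ))
}

# Example: 32768 MB of total heap with 2048 kB huge pages
required_hugepages 32768 2048   # prints 17204
```

Here 32768 MB / 2 MB gives 16384 pages, and 16384 * 1.05 = 17203.2, which rounds up to 17204.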

Check Current OS Huge Page Status:

bash-4.4$ grep -i huge /proc/meminfo
AnonHugePages:   1165312 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:   18278
HugePages_Free:    12087
HugePages_Rsvd:      193
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:        37433344 kB

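Look for Hugepagesize (usually 2048 kB) and HugePages_Total. If HugePages_Total is less than the number you requested in vm.nr_hugepages, physical memory is too fragmented for the kernel to allocate all of the pages, and you need to reboot the machine so that they can be reserved at boot.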
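The same fields can be read programmatically. The sketch below assumes that the pages usable by a new JVM are HugePages_Free minus HugePages_Rsvd, since reserved pages are still counted as free until they are touched. The function name hugepages_usable and the required value of 17204 pages are illustrative.

```shell
# Hypothetical helper: print huge pages that are free and not yet reserved.
# $1 = meminfo file (defaults to /proc/meminfo)
hugepages_usable() {
    awk '/^HugePages_Free:/ {free = $2}
         /^HugePages_Rsvd:/ {rsvd = $2}
         END {print free - rsvd}' "${1:-/proc/meminfo}"
}

# Example: compare against an illustrative requirement of 17204 pages
required=17204
if [ -r /proc/meminfo ]; then
    usable=$(hugepages_usable)
    if [ "$usable" -ge "$required" ]; then
        echo "OK: $usable usable huge pages, $required required"
    else
        echo "SHORT: $usable usable huge pages, $required required"
    fi
fi
```

With the sample output above (12087 free, 193 reserved), the helper reports 11894 usable pages, roughly 23 GB at 2048 kB per page.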

How to Fix:

sudo nano /etc/sysctl.d/99-hugepages.conf
# Add the following line to the file, then save it:
vm.nr_hugepages = <calculated_number>
# Reload all sysctl configuration files:
sudo sysctl --system


4. Check Memory Overcommit Policy

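Set vm.overcommit_memory to 0 (heuristic overcommit, the kernel default). With a value of 2 (strict accounting), the kernel can refuse the JVM's heap reservation even when the host has free memory.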

Check setting:

cat /proc/sys/vm/overcommit_memory

How to Fix:

sudo sysctl -w vm.overcommit_memory=0
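To make the value returned by the check easier to interpret, the sketch below maps each mode to its meaning as described in the Linux kernel documentation. The function name describe_overcommit is an illustrative assumption.

```shell
# Hypothetical helper: explain a vm.overcommit_memory value.
# $1 = contents of /proc/sys/vm/overcommit_memory
describe_overcommit() {
    case "$1" in
        0) echo "0: heuristic overcommit (kernel default)" ;;
        1) echo "1: always overcommit" ;;
        2) echo "2: strict accounting, no overcommit" ;;
        *) echo "unknown mode: $1"; return 1 ;;
    esac
}

# Example: describe the current setting on this host
if [ -r /proc/sys/vm/overcommit_memory ]; then
    describe_overcommit "$(cat /proc/sys/vm/overcommit_memory)"
fi
```

A value set with sysctl -w does not survive a reboot. To keep it, add vm.overcommit_memory = 0 to a file under /etc/sysctl.d/, in the same way as the huge pages setting in step 3.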

Note on Garbage Collection (GC)

In GemFire 10+, the default GC for many configurations is ZGC, which is recommended only for heaps larger than 32 GB. ZGC attempts to reserve the entire max heap size immediately on startup, which is why a locator using ZGC is much more likely to trigger this error than one using G1GC. Because locators typically have small heaps (< 4 GB), G1GC (-XX:+UseG1GC) is recommended for them.
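One way to apply this recommendation is to pass the GC flag to the locator JVM through gfsh. The sketch below only prints the command so that it can be reviewed before it is run. The function name locator_start_cmd, the locator name locator1, and the 2g heap are illustrative assumptions, and the --initial-heap, --max-heap, and --J options should be verified against the gfsh reference for your GemFire version.

```shell
# Hypothetical helper: print a gfsh command that starts a locator with G1GC.
# $1 = locator name, $2 = heap size used for both initial and max heap
locator_start_cmd() {
    printf 'gfsh -e "start locator --name=%s --initial-heap=%s --max-heap=%s --J=-XX:+UseG1GC"\n' "$1" "$2" "$2"
}

# Example: a locator named locator1 with a 2 GB heap
locator_start_cmd locator1 2g
```

Setting the initial and max heap to the same value means the full heap is requested at startup, so any OS-level shortfall appears immediately.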