GemFire: How to troubleshoot native memory issues

Article ID: 293984

Products

VMware Tanzu GemFire, VMware Tanzu Data Intelligence, VMware Tanzu Data Suite

Issue/Introduction

This article applies to all supported VMware GemFire versions. Java applications allocate thread stacks in native memory, which lies outside the JVM heap (sized with -Xmx). Consequently, heap tuning alone has no direct effect on the operating system's ability to create new threads.

Native memory exhaustion typically occurs when the process hits OS-level limits or when the system lacks sufficient virtual memory to back new thread stacks. An application can encounter these errors even while having significant free heap space.
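To make the scale concrete, the following is a rough back-of-the-envelope sketch of how much virtual memory thread stacks can reserve outside the heap. The function name is hypothetical, and the 1024 KiB default is an assumption based on the usual HotSpot default stack size on 64-bit Linux; substitute the configured -Xss value if one is set.

```python
def stack_reservation_mib(thread_count: int, stack_size_kib: int = 1024) -> float:
    """Approximate virtual memory reserved for thread stacks, in MiB.

    stack_size_kib is the per-thread stack size (the -Xss value). The default
    of 1024 KiB is an assumption, not a value taken from a GemFire setting.
    """
    if thread_count < 0 or stack_size_kib <= 0:
        raise ValueError("thread_count must be >= 0 and stack_size_kib > 0")
    return thread_count * stack_size_kib / 1024


if __name__ == "__main__":
    # Illustrative thread counts only; use the counts observed on the member.
    for threads in (500, 2000, 8000):
        reserved = stack_reservation_mib(threads)
        print(f"{threads:>5} threads x 1024 KiB = {reserved:,.0f} MiB reserved")
```

This figure is reserved address space rather than resident memory, since stack pages are only backed by RAM or swap as they are touched. None of it is counted against -Xmx.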

Resolution

A java.lang.OutOfMemoryError with the message "unable to create new native thread" indicates that the operating system could not create a new thread for the JVM. This is a native resource issue, not a Java heap (-Xmx) issue.
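When scanning member logs, it can help to separate this error from heap exhaustion automatically. The sketch below is one possible approach, not a GemFire utility: the function name and category labels are hypothetical, and the matching is deliberately loose because some JDK releases word the message as "unable to create native thread: ..." rather than "unable to create new native thread".

```python
def classify_oom(line: str) -> str:
    """Classify one log line by the kind of OutOfMemoryError it reports.

    Returns "native-thread", "heap", "other", or "not-oom". The labels are
    illustrative and do not come from GemFire or the JDK.
    """
    if "java.lang.OutOfMemoryError" not in line:
        return "not-oom"
    lowered = line.lower()
    # Covers both known wordings of the native thread message.
    if "unable to create" in lowered and "native thread" in lowered:
        return "native-thread"
    if "java heap space" in lowered:
        return "heap"
    return "other"
```

Lines classified as "native-thread" point to the OS-level checks in this article, while "heap" lines call for conventional heap analysis instead.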

Recommended Troubleshooting & Mitigation:

  1. Increase OS Process Limits (nproc): On Linux, threads are treated as lightweight processes and count toward the per-user process limit reported by ulimit -u. If the user running the GemFire process reaches that limit, the JVM cannot start new threads.
  2. Verify Virtual Memory and Swap: Thread stacks are allocated from virtual memory. Adjusting the stack size with -Xss changes how much address space each thread's stack mapping reserves, but the stack size by itself does not prevent thread creation. Creation fails only when the system has exhausted its virtual address space or its physical backing (RAM and swap).
  3. Check for Thread Leaks: If thread counts (visible in vmStats) climb steadily without returning to baseline, the system may be leaking connections or failing to terminate tasks.
    • Action: Capture thread dumps using jstack <pid> or kill -3 <pid> (the latter sends SIGQUIT, and the JVM writes the dump to its standard output) to identify whether a specific component is hanging or accumulating threads.
  4. Balance Heap vs. Native Memory: On machines with limited physical RAM, an oversized Java Heap can leave insufficient native memory for the OS to manage thread stacks and other overhead.
    • Action: If native exhaustion persists despite high ulimit values, consider slightly reducing -Xmx to provide more "headroom" for the OS.
  5. Increase Physical RAM: If the OS is consistently hitting limits during peak traffic despite tuning, additional physical memory may be needed.
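For steps 1 and 3 above, a small script can make the numbers easier to compare across captures. The sketch below assumes a Linux host, where /proc/<pid>/status contains a "Threads:" line, and a jstack-style dump in which each thread header line begins with the quoted thread name. Both function names are hypothetical helpers, not part of GemFire or the JDK.

```python
import re
from collections import Counter


def thread_count_from_status(status_text: str) -> int:
    """Extract the thread count from the text of /proc/<pid>/status (Linux)."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1].strip())
    raise ValueError("no 'Threads:' line found in status text")


def group_dump_threads(dump_text: str) -> Counter:
    """Count threads in a thread dump, grouped by name with digits collapsed.

    Header lines are assumed to start with the thread name in double quotes.
    Digit runs are replaced with 'N' so numbered pool threads group together.
    """
    counts = Counter()
    for line in dump_text.splitlines():
        match = re.match(r'"([^"]+)"', line)
        if match:
            counts[re.sub(r"\d+", "N", match.group(1))] += 1
    return counts
```

Comparing the result of thread_count_from_status against the ulimit -u value shows how much headroom remains, keeping in mind that the limit covers all processes owned by the same user. Running group_dump_threads on dumps taken a few minutes apart and calling most_common() on each result highlights which thread family keeps growing.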