GemFire: How to troubleshoot native memory issues

Article ID: 293984

Products

VMware Tanzu GemFire

Issue/Introduction

This article applies to all supported VMware GemFire versions.

Java applications running on virtual machines (VMs) allocate thread stacks in native memory. Native memory is outside of the heap, so the -Xms and -Xmx VM arguments have no effect on it. The -Xss VM argument determines the thread stack size. The default depends on the operating system and the VM version. Typically, the default thread stack size is 512 KB on SPARC and 256 KB on Intel for 1.3 and 1.4 32-bit JVMs, 1 MB for the 64-bit SPARC 1.4 JVM, and 128 KB for 1.2 JVMs.

An application can exhaust native memory with thread allocations and still have plenty of heap space. Note that high query loads with big result sets usually need a thread stack size larger than the default 1024 KB to perform well. Sizing is therefore a balancing act: each thread stack must be large enough for the way GemFire is used, while enough native memory must remain for all of the threads. The right setting depends most of all on the queries and the number of threads, and some tuning iterations will be needed to find a configuration that avoids issues and performs well.
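
To make the trade-off concrete, the following minimal sketch multiplies a thread count by a stack size to get an upper bound on the memory reserved for thread stacks. The class name, method name, and the figure of 2000 threads are hypothetical and chosen only for illustration; the real footprint is usually lower because operating systems tend to commit stack pages lazily.

```java
public class ThreadStackEstimate {

    /** Upper bound, in bytes, on the memory reserved for thread stacks. */
    static long stackBytes(int threadCount, long stackSizeKb) {
        return (long) threadCount * stackSizeKb * 1024L;
    }

    public static void main(String[] args) {
        int threads = 2000; // hypothetical thread count, for illustration only

        // Compare a reduced stack, the 1024 KB default, and a 2 MB stack.
        for (long kb : new long[] {256, 1024, 2048}) {
            long mb = stackBytes(threads, kb) / (1024L * 1024L);
            System.out.println("-Xss" + kb + "k x " + threads + " threads = up to " + mb + " MB");
        }
    }
}
```

Under these assumptions, going from a 256 KB stack to a 2 MB stack raises the worst case eightfold for the same number of threads.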

Resolution

One sign of a native memory issue is an OutOfMemoryError with the message "unable to create new native thread", thrown either by GemFire or by the application. The error must contain the "unable to create new native thread" message and not the "Java heap space" message. See the document Managing Heap Memory for more details. An example is shown below:

[severe 2008/09/29 10:56:12.919 EDT <Message Dispatcher for 127.0.0.1:2879> tid=0x56f]
Uncaught exception in thread <Message Dispatcher for 127.0.0.1:2879>
Caused by: java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:597)

Another way to check whether there is a native memory issue is to use either the GemFire stats command or vsd to display the number of threads contained in a given GemFire statistics archive. The vmStats category shows the number of threads in the VM at any time.
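
If the statistics archive is not at hand, the thread count can also be read from inside the VM through the standard JDK management API. This is a minimal sketch that assumes the code runs in the same VM as the GemFire member or application being examined; it is not the GemFire stats command or vsd, and the class and method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class ThreadCountProbe {

    /** Number of live threads (daemon and non-daemon) in this VM right now. */
    static int liveThreads() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    /** Highest live thread count since the VM started (or since the peak was reset). */
    static int peakThreads() {
        return ManagementFactory.getThreadMXBean().getPeakThreadCount();
    }

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        System.out.println("live threads:  " + liveThreads());
        System.out.println("peak threads:  " + peakThreads());
        System.out.println("total started: " + threads.getTotalStartedThreadCount());
    }
}
```

A live count that keeps climbing over time, rather than leveling off, suggests that threads are accumulating.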

 

Thread Dump

You can dump all the live threads of a running VM using kill -3 <pid>. This does not kill the VM. Instead, it signals it to dump the current state of all of its live threads. An example is shown below:

[severe 2009/02/20 21:13:10.024 UTC libgemfire.so nid=0x40a18940] SIGQUIT received, dumping threads 
Full thread dump Java HotSpot(TM) 64-Bit Server VM (1.5.0_16-b02 mixed mode):
"Pooled Message Processor548" daemon prio=1 tid=0x00 nid=0x197d in Object.wait()
 at java.lang.Object.wait(Native Method)
 at java.lang.Object.wait(Object.java:432)
 ...
 
"ServerConnection on port 42400 Thread 262" prio=1 tid=0x00 nid=0x1829 in Object.wait()
 at java.lang.Object.wait(Native Method)
 at java.lang.Object.wait(Object.java:432)
 ...
 
"P2P message reader for server(32508):35047/56260 SHARED=false ORDERED=true" daemon prio=1 tid=0x00 nid=0x1800 runnable
 at sun.nio.ch.FileDispatcher.read0(Native Method)
 at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
 ...
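
A full dump can list thousands of threads, so it often helps to count them by name pattern to see which kind of thread is piling up. The sketch below replaces each run of digits in a thread name with "#" and tallies the results. The class and method names are hypothetical, and the main method reads the names from the current VM with Thread.getAllStackTraces() rather than parsing a dump file.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ThreadNameHistogram {

    /** Replaces every run of digits with '#' so numbered threads share one key. */
    static String normalize(String name) {
        return name.replaceAll("\\d+", "#");
    }

    /** Counts thread names after normalization, sorted by normalized name. */
    static Map<String, Integer> histogram(Collection<String> names) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : names) {
            counts.merge(normalize(name), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Collect the names of all live threads in this VM.
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        // Print one line per name pattern: count, then pattern.
        for (Map.Entry<String, Integer> e : histogram(names).entrySet()) {
            System.out.println(e.getValue() + "\t" + e.getKey());
        }
    }
}
```

With the names from the dump above, "Pooled Message Processor548" and any other numbered pooled processors would be counted together under "Pooled Message Processor#".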


In simple terms, native memory is the difference between the physical RAM on the machine and the heap size of the VM(s). The operating system also uses some of this memory. The following are some ways to eliminate a native memory issue:
 

  1. In Linux, increasing the maximum number of processes shown by the "ulimit -u" command can provide some relief. Linux creates native threads as light-weight processes, so creating a large number of threads can cause the maximum number of processes to be exceeded. This will cause the JVM to throw an "OutOfMemoryError: unable to create new native thread" exception.
  2. Reduce the thread stack size of the VM using -Xss. Something like -Xss256k or -Xss384k is sufficient in many cases, but note that queries with big result sets perform much better with a bigger thread stack. For clusters with a heavy query load and big result sets, an -Xss of 2 MB, which raises the thread stack above the 1024 KB default, usually yields better performance.
  3. Reduce the maximum heap size of the VM using -Xmx. This widens the difference between the physical RAM and the heap, leaving more native memory.
  4. If connections are leaking, the measures above will only postpone the issue. When connections keep building up, it is important to find the root cause. Connections can build up, for instance, when using a version of Hyperic that isn't supported by the version of GemFire.
  5. Add RAM to the machine.
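
The options above interact, and a rough calculation can show how. The sketch below applies the simple definition used in this article (native memory is roughly physical RAM minus heap minus what the operating system needs) and divides the result by the thread stack size to get an approximate ceiling on thread count. All names and figures are hypothetical. The estimate ignores other native memory consumers inside the JVM, and the "ulimit -u" limit may be reached before memory runs out.

```java
public class NativeMemoryBudget {

    /** Native memory left after the heap and an operating system reserve, in MB. */
    static long nativeBudgetMb(long ramMb, long heapMb, long osReserveMb) {
        return Math.max(0L, ramMb - heapMb - osReserveMb);
    }

    /** Rough upper bound on threads if the whole budget went to thread stacks. */
    static long maxThreads(long budgetMb, long stackSizeKb) {
        return budgetMb * 1024L / stackSizeKb;
    }

    public static void main(String[] args) {
        long ramMb = 16384;      // hypothetical machine with 16 GB of RAM
        long osReserveMb = 2048; // hypothetical allowance for the operating system

        // Lowering -Xmx or lowering -Xss both raise the thread ceiling.
        for (long heapMb : new long[] {12288, 8192}) {
            long budget = nativeBudgetMb(ramMb, heapMb, osReserveMb);
            for (long stackKb : new long[] {256, 1024, 2048}) {
                System.out.println("-Xmx" + heapMb + "m -Xss" + stackKb + "k: about "
                        + maxThreads(budget, stackKb) + " threads in " + budget + " MB");
            }
        }
    }
}
```

Treat the output as a starting point for tuning iterations, not as a guarantee.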