Long Java Virtual Machine (JVM) pauses can potentially cause members of the GemFire distributed system to be removed from the cluster.
Error Message:
2016-06-30T07:17:16.701-0500: 10899296.854: [GC2016-06-30T07:17:16.701-0500: 10899296.854: [ParNew (promotion failed) Desired survivor size 85878368 bytes, new threshold 1 (max 1) - age 1: 706584 bytes, 706584 total : 839908K->839926K(943744K), 0.2458850 secs]2016-06-30T07:17:16.947-0500: 10899297.100: [CMS: 22365807K->10421671K(30408704K), 31.1687460 secs] 23205326K->10421671K(31352448K), [CMS Perm : 41841K->41830K(262144K)], 31.4150540 secs] [Times: user=31.66 sys=0.00, real=31.41 secs]
Note: We are focusing on the "promotion failed" part of the message. In some cases, the symptom shows "concurrent mode failure" instead of "promotion failed".
The primary symptom, where GemFire removes an unresponsive member from the distributed system, can have many causes. In this case, it is important to use the GC logging output to identify the promotion failure.
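To spot these events without reading the entire log by hand, you can scan the GC log for the failure markers and pull out the reported pause time. The following is a minimal Java sketch; the log file path, class name, and the regular expression for the "real=" pause time are illustrative assumptions rather than GemFire tooling.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcFailureScan {
    // Matches the "real=31.41 secs" portion of a ParNew/CMS log entry.
    private static final Pattern REAL_TIME = Pattern.compile("real=([0-9.]+) secs");

    public static void main(String[] args) throws IOException {
        String gcLog = args.length > 0 ? args[0] : "gc.log"; // log path is an assumption
        Files.lines(Paths.get(gcLog))
             .filter(line -> line.contains("promotion failed")
                          || line.contains("concurrent mode failure"))
             .forEach(line -> {
                 Matcher m = REAL_TIME.matcher(line);
                 String pause = m.find() ? m.group(1) + "s" : "unknown";
                 System.out.println("Old generation failure detected, real pause = " + pause);
             });
    }
}

Run against the log entry shown above, this would report the roughly 31-second real pause alongside the "promotion failed" event.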
In the error message above, you can see the "promotion failed" component of the message, as well as the real and user time consumed by the given GC (over 31 seconds each). Such high pause times are indicative of promotion or concurrent mode failures. Various settings make fragmentation issues more likely, including the following:
Any of the following options can be used to reduce the likelihood of heap fragmentation impacting your GemFire cluster:
There are various flags that can be used for GC logging to provide more detail in the logs and help diagnose issues. These details can prove very helpful and add confidence that the issue has been diagnosed correctly. More importantly, some flags can provide early warning that fragmentation is increasing in the JVM, which can help prevent the unplanned removal of a GemFire node from the cluster. Specifically, consider incorporating the following flags into your heap/GC configuration:
-XX:PrintFLSStatistics=2 -XX:+CMSDumpAtPromotionFailure
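Because these flags are usually added through startup scripts or wrapper configuration, it can be worth confirming at runtime that they actually reached the JVM. Below is a minimal sketch using the standard RuntimeMXBean; the class name is an illustrative assumption.

import java.lang.management.ManagementFactory;
import java.util.List;

public class GcFlagCheck {
    public static void main(String[] args) {
        // The input arguments include any -XX: options passed on the java command line.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();

        boolean flsStats = jvmArgs.stream()
                                  .anyMatch(arg -> arg.startsWith("-XX:PrintFLSStatistics"));
        boolean dumpAtFailure = jvmArgs.contains("-XX:+CMSDumpAtPromotionFailure");

        System.out.println("PrintFLSStatistics enabled: " + flsStats);
        System.out.println("CMSDumpAtPromotionFailure enabled: " + dumpAtFailure);
    }
}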
When using the PrintFLSStatistics option, you will find output similar to the following in your GC log files:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 382153298
Max Chunk Size: 382064598
Number of Blocks: 28
Av. Block Size: 13648332
Tree Height: 8
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 382153298
Max Chunk Size: 382064598
Number of Blocks: 28
Av. Block Size: 13648332
Tree Height: 8
Such output, if monitored proactively, can provide insight into when your tenured heap is becoming increasingly fragmented. The warning sign is a maximum chunk size that continues to shrink toward the amount of memory that might be promoted in a single GC, which is roughly the maximum survivor space size.
If, over time, the maximum chunk size available in the tenured heap decreases to something like 10 times the maximum survivor space size, a planned event to defragment the heap may be warranted, such as a bounce of the GemFire member during a planned maintenance window.
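One way to act on that rule of thumb is to scan the GC log for the Max Chunk Size values emitted by PrintFLSStatistics and flag any value that falls below ten times the maximum survivor space size. The following is a minimal sketch; the log path is an assumption, the survivor size is taken from the example log above, and in practice you would derive it from your own heap settings and feed the warning into your monitoring system.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FragmentationWatch {
    // "Max Chunk Size" lines emitted by -XX:PrintFLSStatistics=2 (whitespace may vary).
    private static final Pattern MAX_CHUNK = Pattern.compile("Max\\s+Chunk\\s+Size:\\s*(\\d+)");

    public static void main(String[] args) throws IOException {
        String gcLog = args.length > 0 ? args[0] : "gc.log";  // log path is an assumption
        long maxSurvivorBytes = 85_878_368L;                   // survivor size from the example log above
        long warnThreshold = 10 * maxSurvivorBytes;            // the 10x rule of thumb described here

        Files.lines(Paths.get(gcLog))
             .map(MAX_CHUNK::matcher)
             .filter(Matcher::find)
             .mapToLong(m -> Long.parseLong(m.group(1)))
             .filter(chunk -> chunk < warnThreshold)
             .forEach(chunk -> System.out.println(
                 "WARNING: tenured max chunk size " + chunk
                 + " bytes is below 10x the survivor size; consider a planned bounce."));
    }
}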
Of course, all of these recommendations require testing in your development and lab environments prior to use in production.