GemFire: JVM Pauses and Cluster Degradation Triggered by Thread Monitoring in GemFire with Azul Zulu

Article ID: 396045

Updated On:

Products

VMware Tanzu Data Suite

Issue/Introduction

A GemFire cluster experiences JVM pauses across multiple nodes, but the paused members are not removed from the cluster.

Environment

GemFire 10.1.x with Azul Zulu (an OpenJDK-based distribution)

Cause

When a thread becomes stuck or takes an extended time to complete (e.g., during authentication or classloading), GemFire’s Thread Monitor may activate to collect diagnostic data. In version 10.1, this includes capturing thread lock information if the JVM vendor is Azul. While optimized products like Azul Zing handle this efficiently, Azul Zulu (an OpenJDK-based distribution) performs this task through safepoint operations that can trigger significant JVM pauses. These pauses may cascade across the cluster, degrading overall performance and availability.
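To see the cost difference on a given JVM, a thread dump with and without lock details can be timed. The sketch below is illustrative only: this article does not show GemFire's internal Thread Monitor code, so the use of the standard `java.lang.management.ThreadMXBean` API here is an assumption, and the class name `ThreadDumpSketch` is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not GemFire code: compares a plain thread dump
// with one that also collects lock ownership details.
final class ThreadDumpSketch {

    /** Dumps all live threads, optionally including lock ownership details. */
    static ThreadInfo[] dump(boolean showLocks) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // Request lock details only when the JVM reports support for them;
        // otherwise dumpAllThreads would throw UnsupportedOperationException.
        boolean monitors = showLocks && bean.isObjectMonitorUsageSupported();
        boolean synchronizers = showLocks && bean.isSynchronizerUsageSupported();
        return bean.dumpAllThreads(monitors, synchronizers);
    }

    /** Wall-clock duration of a single dump, in nanoseconds. */
    static long timeDumpNanos(boolean showLocks) {
        long start = System.nanoTime();
        dump(showLocks);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // The vendor string is what a vendor-specific code path would typically inspect.
        System.out.println("java.vendor = " + System.getProperty("java.vendor"));
        System.out.println("dump without locks: " + timeDumpNanos(false) + " ns");
        System.out.println("dump with locks:    " + timeDumpNanos(true) + " ns");
    }
}
```

A single run in a small test JVM may show little difference; the gap tends to matter on busy members with many threads and held locks, so any measurement is best taken under representative load.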

Resolution

To mitigate this, disable thread lock collection on all nodes running Azul Zulu by adding the following JVM property:

 
-Dgemfire.threadmonitor.showLocks=false

This reverts the behavior to a less intrusive monitoring mode, consistent with earlier GemFire versions.
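After restarting a member, it can be useful to confirm that the property actually reached the JVM. The following is a minimal sketch, assuming the check runs inside the member's own JVM (for example from a diagnostic function); the class name `ShowLocksFlagCheck` is hypothetical and is not part of GemFire.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical diagnostic helper, not part of GemFire.
final class ShowLocksFlagCheck {

    // The exact JVM argument recommended in this article.
    static final String FLAG = "-Dgemfire.threadmonitor.showLocks=false";

    /** Returns true if the given JVM argument list contains the recommended flag. */
    static boolean isLockCollectionDisabled(List<String> jvmArgs) {
        return jvmArgs.contains(FLAG);
    }

    public static void main(String[] args) {
        // Arguments passed to this JVM at startup (excluding main-method arguments).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        if (isLockCollectionDisabled(jvmArgs)) {
            System.out.println("Thread lock collection is disabled on this JVM.");
        } else {
            System.out.println("Flag not found; expected JVM argument: " + FLAG);
        }
    }
}
```

The check is an exact string match on the startup arguments, so it only detects the property when it was passed on the command line in exactly this form.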

This behavior will be fixed in a future GemFire 10 release.