A GemFire cluster experiences JVM pauses across multiple nodes, but the affected members are not removed from the cluster.
GemFire 10.1.x running on Azul Zulu (an OpenJDK-based distribution) or Azul Zing
When a thread becomes stuck or takes an extended time to complete (e.g., during authentication or classloading), GemFire’s Thread Monitor may activate to collect diagnostic data. In version 10.1, this includes capturing thread lock information if the JVM vendor is Azul. Capturing lock information triggers safepoint operations that can cause significant JVM pauses, which in turn degrade overall performance and cluster stability.
This can occur even with the significant improvements in Azul Zing.
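To get a feel for why lock collection is more intrusive, the following minimal sketch times a JDK thread dump with and without lock details using the standard `ThreadMXBean.dumpAllThreads` call. It is not GemFire's Thread Monitor code, and it is only an assumption that the Thread Monitor gathers lock data through a comparable mechanism; the class name `ThreadDumpCost` is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Illustrative only: compares a plain thread dump with one that also
// collects monitor and synchronizer (lock) details. Not GemFire code.
final class ThreadDumpCost {

    // Take a dump of all live threads, optionally including lock details.
    static ThreadInfo[] dump(boolean withLocks) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Only request lock details the JVM says it supports; otherwise
        // dumpAllThreads throws UnsupportedOperationException.
        boolean monitors = withLocks && mx.isObjectMonitorUsageSupported();
        boolean synchronizers = withLocks && mx.isSynchronizerUsageSupported();
        return mx.dumpAllThreads(monitors, synchronizers);
    }

    // Elapsed wall-clock time of a single dump, in nanoseconds.
    static long timeNanos(boolean withLocks) {
        long start = System.nanoTime();
        dump(withLocks);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // A few warm-up rounds so the first measurement is not dominated
        // by class loading and JIT compilation.
        for (int i = 0; i < 5; i++) {
            dump(false);
            dump(true);
        }
        System.out.printf("without locks: %d us%n", timeNanos(false) / 1_000);
        System.out.printf("with locks:    %d us%n", timeNanos(true) / 1_000);
    }
}
```

Running this inside a small test JVM will usually show little difference; the gap tends to matter on members with many threads and large heaps, so treat the numbers as indicative rather than as a benchmark.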
1. Upgrade to GemFire 10.2.0, which significantly reduces the Thread Monitor's impact, as described below.
2. If upgrading to GemFire 10.2.0 does not resolve the issue, disable thread lock collection on all nodes running Azul Zulu or Zing by adding the JVM property -Dgemfire.threadmonitor.showLocks=false. This reverts the behavior to a less intrusive monitoring mode, consistent with earlier GemFire versions.
3. The most conservative approach is to disable thread monitoring completely.
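For step 2, it can help to confirm that the property actually reached each member's JVM. The sketch below is one possible check, to be run inside the member process (for example from startup code): it inspects the JVM's input arguments for the flag and prints the JVM vendor. The class name `ShowLocksCheck` is hypothetical, and matching the vendor string on "Azul" is an assumption, not a description of how GemFire detects the vendor.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Illustrative helper: reports whether the showLocks workaround is present
// on the JVM command line. Not part of GemFire.
final class ShowLocksCheck {

    // The exact flag named in step 2 of this article.
    static final String FLAG = "-Dgemfire.threadmonitor.showLocks=false";

    // True if the flag appears verbatim among the JVM arguments.
    // The match is exact and case-sensitive, so variants such as "=FALSE"
    // are deliberately not recognized.
    static boolean lockCollectionDisabled(List<String> jvmArgs) {
        return jvmArgs.stream().anyMatch(arg -> arg.trim().equals(FLAG));
    }

    // Assumption: Azul builds report a java.vendor value containing "Azul".
    static boolean looksLikeAzul(String vendor) {
        return vendor != null && vendor.contains("Azul");
    }

    public static void main(String[] args) {
        List<String> jvmArgs =
                ManagementFactory.getRuntimeMXBean().getInputArguments();
        String vendor = System.getProperty("java.vendor", "");
        System.out.println("java.vendor         : " + vendor);
        System.out.println("looks like Azul     : " + looksLikeAzul(vendor));
        System.out.println("showLocks=false set : " + lockCollectionDisabled(jvmArgs));
    }
}
```

When members are started with gfsh, JVM properties are normally passed through the `--J` option, for example `start server --name=server1 --J=-Dgemfire.threadmonitor.showLocks=false`, where `server1` is a placeholder member name.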