GemFire: "Query execution canceled due to memory threshold crossed" when using ZGC as Garbage Collector

Article ID: 409659

Products

VMware Tanzu GemFire

Issue/Introduction

When using ZGC as the garbage collector, users may encounter the following exception during query execution:

 
Exception: Query execution canceled due to memory threshold crossed in system, memory used: xxx bytes.
org.apache.geode.cache.query.QueryExecutionLowMemoryException: Query execution canceled due to memory threshold crossed in system, memory used: xxx bytes.
	at org.apache.geode.cache.query.internal.DefaultQueryService.newQuery(DefaultQueryService.java:144)
	at org.apache.geode.internal.cache.tier.sockets.command.Query651.cmdExecute(Query651.java:116)
	at org.apache.geode.internal.cache.tier.sockets.BaseCommand.execute(BaseCommand.java:193)
	at org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMessage(ServerConnection.java:901)
	at org.apache.geode.internal.cache.tier.sockets.ServerConnection.doOneMessage(ServerConnection.java:1113)
	at org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1394)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.lambda$initializeServerConnectionThreadPool$3(AcceptorImpl.java:707)
	at org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:124)
	at java.base/java.lang.Thread.run(Thread.java:840)

 

  • Monitoring shows that the Resource Manager's critical threshold is reported as crossed, based on tenured heap usage.

  • At the same time, overall heap utilization remains low.

  • This discrepancy occurs only when ZGC is in use, since ZGC does not follow traditional “tenured” memory semantics.
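
To observe this kind of discrepancy on a running JVM, per-pool and overall heap usage can be printed side by side with the standard java.lang.management API. The following is a minimal diagnostic sketch, not the Resource Manager's own calculation. The class name HeapPoolReport and the helper usedFraction are hypothetical, and the heap pool names reported depend on the collector and JDK version.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical diagnostic helper; reports on the JVM it runs in.
class HeapPoolReport {

    // Fraction of the maximum that is in use, or -1.0 when the maximum is undefined.
    static double usedFraction(long used, long max) {
        return max > 0 ? (double) used / max : -1.0;
    }

    public static void main(String[] args) {
        // Overall heap usage as seen by the JVM.
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.printf("overall heap: used=%d max=%d fraction=%.3f%n",
                heap.getUsed(), heap.getMax(),
                usedFraction(heap.getUsed(), heap.getMax()));

        // Usage of each individual heap pool (names differ per collector).
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip metaspace, code cache, and other non-heap pools
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            System.out.printf("pool '%s': used=%d max=%d fraction=%.3f%n",
                    pool.getName(), usage.getUsed(), usage.getMax(),
                    usedFraction(usage.getUsed(), usage.getMax()));
        }
    }
}
```

Because the report covers only the JVM it runs in, it would need to execute inside the GemFire server process to be meaningful for this issue.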


 

Environment

  • Versions affected: GemFire releases earlier than 9.15.13, 10.0.5, and 10.1.2 in their respective release lines
  • Garbage Collector: ZGC enabled

  • Configuration: Resource Manager heap thresholds (such as the critical threshold) are set

Cause

The issue is caused by the way the Resource Manager calculates memory usage against its thresholds when ZGC is used.

  • With ZGC, “tenured heap” metrics are not a reliable indicator of memory pressure.

  • The Resource Manager may incorrectly assume a critical threshold has been crossed, which triggers query cancellation.

  • This behavior was identified as a product defect (GEM-7732).

Resolution

Workaround

Configure the JVM option -XX:SoftMaxHeapSize on the server side to provide a more accurate signal of memory usage under ZGC.
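
After restarting a server with -XX:SoftMaxHeapSize=<size>, it can be useful to confirm that the JVM actually picked up the setting. The sketch below reads the option through the HotSpot diagnostic MXBean and lists the active collectors. It assumes a HotSpot-based JDK; the class name VmOptionCheck and the method describe are hypothetical, and no particular size value is implied by this article.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper; inspects the JVM it runs in (assumes a HotSpot-based JDK).
class VmOptionCheck {

    // Returns "name=value" for a VM option, or a note when the option is unknown to this JVM.
    static String describe(String optionName) {
        HotSpotDiagnosticMXBean hotspot =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        if (hotspot == null) {
            return optionName + " is not available on this JVM";
        }
        try {
            return optionName + "=" + hotspot.getVMOption(optionName).getValue();
        } catch (IllegalArgumentException e) {
            // Thrown when the JVM has no option with this name (for example, an older JDK).
            return optionName + " is not available on this JVM";
        }
    }

    public static void main(String[] args) {
        // Collector names show whether ZGC is in use.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("collector: " + gc.getName());
        }
        System.out.println(describe("SoftMaxHeapSize"));
    }
}
```

As with any in-process check, this reports on the JVM that executes it, so it would have to run inside the server process to verify the server's configuration.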

Permanent Fix

Upgrade to a version where the defect is resolved:

  • 9.15.13

  • 10.0.5

  • 10.1.2 and later

These versions include the fix for GEM-7732, ensuring ZGC behavior is properly handled by the Resource Manager.
