Processes failing with Timeout: Pool empty. Unable to fetch a connection - busy:20

Article ID: 397024

Products

Clarity PPM SaaS, Clarity PPM On Premise

Issue/Introduction

Custom processes are stuck in Clarity, and multiple process failures appear with this message in the process logs:

Unable to retrieve Configuration Details: org.apache.commons.jelly.JellyTagException: null:30:65: <sql:query> Unable to get connection, DataSource invalid: "[Custom script execution pool-62-thread-14] Timeout: Pool empty. Unable to fetch a connection in 30 seconds, none available[size:20; busy:20; idle:0; lastwait:30000]."
 
Some process instances complete successfully while others do not. Many processes remain in the RUNNING state without completing and eventually fail.
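
When scanning process logs for this failure, it can help to pull the bracketed pool status out of the message. The following is a minimal sketch, not a Clarity utility: the function name `parse_pool_status` and the `exhausted` flag are hypothetical, and `lastwait` is assumed to be in milliseconds because 30000 matches the "30 seconds" in the same message.

```python
import re
from typing import Dict, Optional

# Matches the status suffix seen in the error, e.g.
# [size:20; busy:20; idle:0; lastwait:30000]
_STATUS = re.compile(r"\[size:(\d+); busy:(\d+); idle:(\d+); lastwait:(\d+)\]")


def parse_pool_status(log_line: str) -> Optional[Dict[str, object]]:
    """Return the pool counters found in log_line, or None if absent."""
    match = _STATUS.search(log_line)
    if match is None:
        return None
    size, busy, idle, lastwait = (int(value) for value in match.groups())
    return {
        "size": size,
        "busy": busy,
        "idle": idle,
        "lastwait_ms": lastwait,  # assumed unit: milliseconds
        # Hypothetical flag: every connection is in use and none is idle.
        "exhausted": busy >= size and idle == 0,
    }
```

For the message in this article the counters come back as size 20, busy 20, idle 0, which is the signature of a fully occupied pool.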

Environment

All Supported Clarity Releases

Cause

  • This may start happening after deploying a new version of a Clarity process with an updated SQL query
  • Example cause: The query is not optimized and runs for 1-2 hours, keeping all 20 of the process engine's GEL connections busy and creating a bottleneck
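
The mechanism described above can be illustrated with a toy model. This is a sketch under stated assumptions, not the pool Clarity actually uses: `TinyPool`, `borrow`, `give_back`, and `demo` are hypothetical names, and the wait is scaled down from 30 seconds to a fraction of a second. Once every slot is held by a long-running query, the next caller waits for the timeout and then fails.

```python
import threading


class TinyPool:
    """Hypothetical fixed-size pool used only to illustrate exhaustion."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.busy = 0
        self._slots = threading.Semaphore(size)
        self._lock = threading.Lock()

    def borrow(self, timeout_s: float) -> None:
        # Wait up to timeout_s for a free slot, then give up.
        if not self._slots.acquire(timeout=timeout_s):
            raise TimeoutError(
                f"Pool empty [size:{self.size}; busy:{self.busy}; "
                f"idle:{self.size - self.busy}]"
            )
        with self._lock:
            self.busy += 1

    def give_back(self) -> None:
        with self._lock:
            self.busy -= 1
        self._slots.release()


def demo(size: int = 20, timeout_s: float = 0.05) -> str:
    pool = TinyPool(size)
    # Simulate `size` slow queries that never return their connection.
    for _ in range(size):
        pool.borrow(timeout_s)
    try:
        pool.borrow(timeout_s)  # the next process step has nothing to use
    except TimeoutError as exc:
        return str(exc)
    return "borrowed"
```

In this model a single `give_back()` is enough for the next `borrow` to succeed, which is consistent with the guidance that the real fix is to stop connections from being held for hours rather than to restart the service.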

Resolution

  • Take a heap dump and thread dumps of the BG service for troubleshooting, and provide them to Broadcom Support for assistance
  • A BG restart may help for some time; however, if the underlying process issue is not addressed, the problem may recur
  • Work with your DBA to check the database for slow-running queries and investigate the root cause
  • If you have just deployed a new process or a new version of a process, revert the changes or put the process on Hold
  • Work with your DBA to identify whether there was any row lock contention or a similar database bottleneck

If row lock contention is identified, use the following best-practice guidance:

  • Batching: Process the updates in smaller batches (100-300 instances at a time) to avoid hitting the database with all updates at once
  • REST API: Transition from direct updates on database tables to using the REST API for updates. This is considered a safer, more throttled way to update.
  • Workflow Redesign: Use a single master process that loops through all instances for the update, rather than spinning off an individual subprocess for each. This reduces the number of concurrent process instances and the risk of running into bottlenecks
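
The batching and single-master-loop guidance above can be sketched as follows. This is an illustration only: `batches`, `run_master_loop`, and the `update_one` callback are hypothetical names, and the callback stands in for whatever performs one update (for example a REST call), which this article does not specify. The default batch size of 200 is simply a value inside the recommended 100-300 range.

```python
import time
from typing import Callable, Iterator, List, Sequence


def batches(ids: Sequence[int], batch_size: int = 200) -> Iterator[List[int]]:
    """Yield consecutive slices of ids, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(ids), batch_size):
        yield list(ids[start:start + batch_size])


def run_master_loop(
    ids: Sequence[int],
    update_one: Callable[[int], None],  # hypothetical per-instance update
    batch_size: int = 200,
    pause_s: float = 0.0,
) -> int:
    """Update every instance sequentially from one loop; return batch count."""
    batch_count = 0
    for batch in batches(ids, batch_size):
        for instance_id in batch:
            update_one(instance_id)
        batch_count += 1
        if pause_s > 0:
            time.sleep(pause_s)  # optional breathing room between batches
    return batch_count
```

Because one loop performs the updates one after another, only one caller is active at any moment, instead of one subprocess per instance competing for the 20 connections.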