Greenplum deadlock detection with resource queue

Article ID: 296047

Products

VMware Tanzu Greenplum

Issue/Introduction

This article explains how Greenplum Database (GPDB) detects deadlock when the resource queue is involved.


Environment


Cause

Resolution

GPDB deadlock detection with resource queue

GPDB triggers a deadlock check once a process has waited for deadlock_timeout for a lock held by another process. It builds a "wait-for" graph of the processes involved. If the graph contains a cycle, GPDB declares a deadlock.
 

During deadlock detection, resource queue slot locks are also taken into account. A transaction that is pending in a resource queue is treated as waiting for any of the transactions currently running in that queue. As a result, some non-genuine deadlocks can be detected, as in the example below:


For example, there are four concurrent transactions in one resource queue, RQ1, which has a concurrency limit of three. T1, T2, and T3 are running and T4 is pending. However, T4 already held a lock on table Table1 before it became pending in the resource queue. When T3 runs a statement that requests a lock on Table1, the resulting wait-for graph has these edges: T3 waits for T4 (table lock on Table1), and T4 waits for T1, T2, and T3 (resource queue slot).

Note: T3 and T4 wait for each other in a cycle, so GPDB reports a deadlock. However, this is not a genuine deadlock: the cycle is broken as soon as either T1 or T2 finishes and frees a slot for T4.
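The example above can be sketched as a small cycle check on a wait-for graph. This is an illustrative Python sketch, not GPDB's actual detector: the function name `has_cycle` and the dictionary representation of the graph are assumptions made for this example.

```python
from typing import Dict, Set


def has_cycle(wait_for: Dict[str, Set[str]]) -> bool:
    """Return True if the directed wait-for graph contains a cycle.

    wait_for maps a transaction to the set of transactions it waits for.
    (Hypothetical representation, chosen for illustration only.)
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    nodes = set(wait_for) | {t for targets in wait_for.values() for t in targets}
    color = {node: WHITE for node in nodes}

    def visit(node: str) -> bool:
        color[node] = GREY
        for nxt in wait_for.get(node, ()):
            if color[nxt] == GREY:  # back edge: we returned to the current path
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in nodes)


# T3 waits for T4's lock on Table1; T4 waits for a slot held by T1, T2, or T3.
blocked = {"T3": {"T4"}, "T4": {"T1", "T2", "T3"}}
print(has_cycle(blocked))  # True: T3 -> T4 -> T3

# Once T1 finishes, T4 obtains the free slot and no longer waits for anything.
after_t1_finishes = {"T3": {"T4"}}
print(has_cycle(after_t1_finishes))  # False
```

The second graph shows why the reported deadlock is not genuine: nothing in the cycle has to be cancelled for it to disappear, because T1 or T2 finishing removes T4's outgoing edges.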


Best practices for 4.3.x customers with Resource Queue deadlock issues

To reduce such non-genuine deadlocks, increase the deadlock_timeout GUC. With a longer timeout, Greenplum waits longer before it runs the deadlock check, so some non-genuine deadlocks clear on their own before the check runs.

For example, the default value of this GUC is 1 second. If T1 and T2 each run for 10 seconds, T3 and T4 wait for their locks for longer than 1 second and are reported as deadlocked. Increasing the value to 1 minute lets the transactions continue: T1 or T2 releases its resource queue slot to T4 before the lock wait reaches deadlock_timeout.
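The timing in this example can be written as a one-line comparison. This is a simplified sketch under stated assumptions: the lock wait starts at time zero, the first slot is released after `slot_free_after_s` seconds, and the function name and parameters are hypothetical rather than part of GPDB.

```python
def reports_false_deadlock(slot_free_after_s: float, deadlock_timeout_s: float) -> bool:
    """Return True if the deadlock check runs while T4 is still waiting for a slot.

    Simplification: the check runs once, deadlock_timeout_s seconds after the
    lock wait begins, and the cycle exists until the first slot is released.
    """
    return slot_free_after_s > deadlock_timeout_s


# Default deadlock_timeout of 1 second, T1/T2 run for 10 seconds.
print(reports_false_deadlock(10, 1))   # True: the check sees the T3/T4 cycle

# deadlock_timeout raised to 1 minute.
print(reports_false_deadlock(10, 60))  # False: T4 has its slot before the check
```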


In short, deadlock_timeout is not intended to reduce real deadlocks in the system. However, increasing it does help avoid some cases in which a wait for a resource queue slot is reported as a deadlock. Far fewer resource queue deadlocks should be seen after the GUC is increased.
 

Real deadlocks and deadlock_timeout

In v4.x, deadlocks involving the resource queue cannot be avoided entirely, because the resource queue uses statement-level concurrency control. In some cases, a real deadlock can occur with the resource queue.

Note: Setting a large value for deadlock_timeout has a drawback. Real deadlocks are not detected and broken in a timely manner, because the deadlock check runs only after the longer timeout has elapsed.

Resource queue logic will be redesigned for v5.x. This will help in scenarios where a deadlock involves the resource queue.