Stalled or Hung gprestore artifact collection



Article ID: 399007


Updated On:

Products

VMware Tanzu Data Suite
VMware Tanzu Greenplum
VMware Tanzu Greenplum / Gemfire

Issue/Introduction

If a gprestore job appears to stall or hang, gather the following artifacts while the job is still in progress.

Environment

GPDB 6.28.2 and later

Resolution

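1. Identify long-running restore COPY jobs. Run the following across the segment hosts (<hostfile> is a file listing them):

gpssh -f <hostfile> -e 'ps -ef | grep con<session_id_of_hanging_copy>'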
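
A plain grep for con<session_id> can also match longer session IDs that share the same prefix (con12 matches con123). The following is a minimal sketch of a stricter filter, assuming the session ID appears in the process title as a separate con<id> token and that idle backends carry the word "idle"; the function name filter_session is hypothetical and not part of any Greenplum tool.

```shell
#!/usr/bin/env bash
# filter_session: read `ps -ef` output on stdin and keep only the lines
# that belong to session ID $1. Hypothetical helper, for illustration only.
filter_session() {
    local sid="$1"
    # -w matches con<sid> as a whole word, so con12 does not match con123.
    # The second grep drops the grep command itself and idle backends.
    grep -w "con${sid}" | grep -v -e 'grep' -e 'idle'
}

# Example (replace 12345 with the session ID of the hanging COPY):
# ps -ef | filter_session 12345
```

The remaining lines are the non-idle processes for that session; the PID in the second column of `ps -ef` output is what the later collection steps need.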

2. Use gpmt analyze_session to collect the core, strace, and pstack of the stalled or hung process.

gpmt analyze_session -session <session_id_of_hanging_copy>

-> This should collect the core, strace, and pstack of the running process. The job may be hanging on many segment hosts; pick just one.
-> If gpmt did not collect them for some reason, manually collect the pstack (5 pstacks, 1 second apart), core (also pack it with packcore), lsof, and strace of one of the hanging processes (a non-idle process that has been running for a long time) on one of the hosts.
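
The manual fallback can be sketched roughly as below. This is an illustration only, assuming pstack, lsof, strace, gcore, and timeout are installed on the segment host; the function name collect_pstacks, the PSTACK_CMD override, the PID 12345, and the output directory are all hypothetical. The packcore step is left out because its invocation is not given in this article.

```shell
#!/usr/bin/env bash
# collect_pstacks: take COUNT stack snapshots of PID, INTERVAL seconds apart,
# and write each one to its own file under OUTDIR.
# PSTACK_CMD is a hypothetical override, useful for dry runs.
collect_pstacks() {
    local pid="$1" outdir="$2" count="${3:-5}" interval="${4:-1}"
    local pstack_cmd="${PSTACK_CMD:-pstack}"
    local i
    mkdir -p "$outdir"
    for i in $(seq 1 "$count"); do
        "$pstack_cmd" "$pid" > "$outdir/pstack.$pid.$i.txt" 2>&1
        # Sleep between snapshots, but not after the last one.
        if [ "$i" -lt "$count" ]; then sleep "$interval"; fi
    done
}

# Example (commented out; replace 12345 with the PID of the hanging process):
# collect_pstacks 12345 /tmp/gprestore_hang
# lsof -p 12345 > /tmp/gprestore_hang/lsof.12345.txt
# timeout 30 strace -f -tt -p 12345 -o /tmp/gprestore_hang/strace.12345.txt
# gcore -o /tmp/gprestore_hang/core 12345
```

Several snapshots taken a second apart show whether the stack changes over time, which a single pstack cannot.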

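3. Capture the output of lsof -E on the host where the hanging process runs.

4. Capture the output of ps -aef --forest on the same host.

5. Collect the segment and master server pg_log files covering the full lifecycle of the COPY.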
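
One way to pick out the relevant log files is by modification time. The sketch below assumes GNU find and GNU tar, and that pg_log sits directly under the data directory; the function name logs_since, the start time, and the archive path are hypothetical placeholders.

```shell
#!/usr/bin/env bash
# logs_since: list the pg_log files under DATA_DIR that were modified
# after START (a timestamp string understood by GNU find's -newermt).
logs_since() {
    local data_dir="$1" start="$2"
    find "$data_dir/pg_log" -type f -newermt "$start"
}

# Example (commented out; use the time the restore started, or earlier):
# logs_since "$MASTER_DATA_DIRECTORY" '<restore_start_time>' \
#     | tar -czf /tmp/master_pg_log.tar.gz -T -
```

Filtering on modification time also keeps a log file that was opened before the restore started but was still being written during it. Run the same command against a segment data directory to gather the segment logs.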