COPY command for gprestore is causing out of memory errors

Article ID: 296820


Products

VMware Tanzu Greenplum

Issue/Introduction

Unable to restore data from a backup due to a memory error. The gprestore utility reports "ERROR: Canceling query because of high VMEM usage":
20230301:13:18:11 gprestore:gpadmin:eric_gp-mdw:044034-[DEBUG]:-COPY public.table01(col01, col02, col03) FROM PROGRAM 'cat /gpbackup__20230217164706_pipe_44034_77412 | cat -' WITH CSV DELIMITER ',' ON SEGMENT;
20230301:13:41:12 gprestore:gpadmin:eric_gp-mdw:044034-[DEBUG]:-Restored data to table public.table01 from file
20230301:13:41:12 gprestore:gpadmin:eric_gp-mdw:044034-[ERROR]:-Error loading data into table public.table01: COPY table01, line 78480700: "2021-07-12 00:00:00+01,week ,190,190,18202,234,20,190,6700,,,2021-07-18 18:14:56.468897+01,f,45360,4...": ERROR: Canceling query because of high VMEM usage. Used: 24356MB, available 2948MB, red zone: 27000MB (runaway_cleaner.c:202)  (seg220 172.19.1.11:6004 pid=92768) (runaway_cleaner.c:202) (SQLSTATE XX000)
20230301:13:41:12 gprestore:gpadmin:eric_gp-mdw:044034-[ERROR]:-Encountered 1 error(s) during table data restore; see log file /home/gpadmin/gpAdminLogs/gprestore_20230301.log for a list of table errors.
20230301:13:41:12 gprestore:gpadmin:eric_gp-mdw:044034-[INFO]:-Data restore completed with failures
20230301:13:41:12 gprestore:gpadmin:eric_gp-mdw:044034-[INFO]:-Restoring post-data metadata

This issue can occur when restoring a partitioned table in which one or more partitions are append-optimized (AO) and one or more partitions are heap tables. The problem lies in the COPY command logic itself, so it can also occur with a COPY that is not part of a gprestore.
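The following is a minimal sketch of a table layout that can trigger this condition: a range-partitioned table in which one partition is append-optimized and the other is heap. The table name, column names, and data types are illustrative only and are not taken from the affected system.

-- Hypothetical example: names and types are illustrative only
CREATE TABLE public.example_mixed (
    col01 timestamptz,
    col02 text,
    col03 int
)
DISTRIBUTED BY (col03)
PARTITION BY RANGE (col01)
(
    -- append-optimized (AO) partition
    PARTITION p2021 START (date '2021-01-01') END (date '2022-01-01')
        WITH (appendonly=true),
    -- heap partition (default storage)
    PARTITION p2022 START (date '2022-01-01') END (date '2023-01-01')
);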
 

The COPY command uses the following logic when restoring data:

  • Heap partitions use batch inserts: tuples are accumulated in an in-memory batch buffer until a threshold is reached (for example, nTotalBufferedTuples = 1000), at which point the batch is flushed to disk and nTotalBufferedTuples is reset to 0.

  • AO partitions insert one tuple at a time and are expected to clean up per-tuple memory after every insert, because on a purely AO partitioned table nTotalBufferedTuples is always 0.

  • Both the per-batch and per-tuple memory cleanups therefore depend on the same condition: nTotalBufferedTuples == 0.

  • When a partitioned table contains both heap and AO partitions, the heap batch inserts may not reach the flush threshold, leaving nTotalBufferedTuples != 0. Subsequent AO tuple inserts then accumulate per-tuple memory allocations, because the per-tuple cleanup is skipped while nTotalBufferedTuples != 0, and VMEM usage grows until the query is cancelled. A catalog query to check whether a table mixes heap and AO partitions is sketched after this list.
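To confirm whether a partitioned table mixes heap and AO leaf partitions, a query along the following lines can be used. This is a sketch assuming the Greenplum 6 catalogs (pg_partitions, pg_class.relstorage) and the table name from the log above; relstorage is 'h' for heap and 'a' or 'c' for append-optimized.

-- Sketch: list the storage type of each leaf partition of public.table01
SELECT p.partitiontablename,
       c.relstorage
FROM   pg_partitions p
JOIN   pg_class      c ON c.relname = p.partitiontablename
JOIN   pg_namespace  n ON n.oid = c.relnamespace
                      AND n.nspname = p.partitionschemaname
WHERE  p.schemaname = 'public'
  AND  p.tablename  = 'table01';

A result containing both 'h' rows and 'a' or 'c' rows indicates the mixed heap/AO layout described above.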

 


Environment

Product Version: 6.20

Resolution

Workaround

Create a table with the same columns as the original table, but meeting one of the following conditions:
  1. It has no partitions
  2. All of its partitions are AO
  3. All of its partitions are heap
Then restore the data to the newly created table (see the sketch below).
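For example, a sketch of option 1 (same columns, no partitions) might look like the following; the new table name and the distribution key are assumptions for illustration.

-- Hypothetical: non-partitioned table with the same column layout as the original
CREATE TABLE public.table01_restore (LIKE public.table01)
DISTRIBUTED BY (col03);

For options 2 and 3, create the table with its original partition definition but make every partition append-optimized (WITH (appendonly=true)) or every partition heap, respectively.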

Fix

A fix is available in Greenplum 6.24.1 and later.