20230301:13:18:11 gprestore:gpadmin:eric_gp-mdw:044034-[DEBUG]:-COPY public.table01(col01, col02, col03) FROM PROGRAM 'cat /gpbackup__20230217164706_pipe_44034_77412 | cat -' WITH CSV DELIMITER ',' ON SEGMENT;
20230301:13:41:12 gprestore:gpadmin:eric_gp-mdw:044034-[DEBUG]:-Restored data to table public.table01 from file
20230301:13:41:12 gprestore:gpadmin:eric_gp-mdw:044034-[ERROR]:-Error loading data into table public.table01: COPY table01, line 78480700: "2021-07-12 00:00:00+01,week ,190,190,18202,234,20,190,6700,,,2021-07-18 18:14:56.468897+01,f,45360,4...": ERROR: Canceling query because of high VMEM usage. Used: 24356MB, available 2948MB, red zone: 27000MB (runaway_cleaner.c:202) (seg220 172.19.1.11:6004 pid=92768) (runaway_cleaner.c:202) (SQLSTATE XX000)
20230301:13:41:12 gprestore:gpadmin:eric_gp-mdw:044034-[ERROR]:-Encountered 1 error(s) during table data restore; see log file /home/gpadmin/gpAdminLogs/gprestore_20230301.log for a list of table errors.
20230301:13:41:12 gprestore:gpadmin:eric_gp-mdw:044034-[INFO]:-Data restore completed with failures
20230301:13:41:12 gprestore:gpadmin:eric_gp-mdw:044034-[INFO]:-Restoring post-data metadata
This issue can be caused by restoring a partitioned table in which one or more partitions are append-optimized (AO) AND one or more partitions are heap tables (for example, a table where some partitions were created WITH (appendonly=true) and others are plain heap). The problem is in the COPY command logic itself, so it can also occur with a COPY that is not part of a gprestore.
The following is the code logic used by the COPY command when restoring data:
A heap partition uses batched inserts: tuples are accumulated in an in-memory batch buffer until a threshold is reached (for example, nTotalBufferedTuples = 1000); the batch is then flushed to disk and nTotalBufferedTuples is reset to 0, as simulated in the sketch below.
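To make the batching concrete, here is a minimal, self-contained C simulation of the heap flush cycle. The threshold constant, the row count, and the counters are illustrative stand-ins, not the actual Greenplum source; only the role of nTotalBufferedTuples mirrors the description above.

#include <stdio.h>

#define MAX_BUFFERED_TUPLES 1000   /* illustrative flush threshold */

int main(void)
{
    int nTotalBufferedTuples = 0;
    int flushes = 0;

    /* Simulate copying 2500 rows into a heap partition. */
    for (int row = 0; row < 2500; row++)
    {
        nTotalBufferedTuples++;        /* tuple added to the batch buffer */

        if (nTotalBufferedTuples >= MAX_BUFFERED_TUPLES)
        {
            flushes++;                 /* batch written out to disk */
            nTotalBufferedTuples = 0;  /* buffer is empty again */
        }
    }

    /* 2500 rows => 2 flushes, with 500 tuples left in the buffer. */
    printf("flushes=%d, leftover buffered tuples=%d\n",
           flushes, nTotalBufferedTuples);
    return 0;
}

Whenever the number of heap rows is not an exact multiple of the threshold, the loop ends with nTotalBufferedTuples != 0, which is exactly the precondition for the leak described next.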
An AO partition performs single-tuple inserts and is expected to clean up the per-tuple memory after every insert, because nTotalBufferedTuples is always 0 on a purely AO partitioned table.
Both the per-batch and per-tuple memory cleanups above are therefore gated on the same condition: nTotalBufferedTuples == 0.
When a partitioned table contains both heap and AO partitions, a run of heap-partition batch inserts can end without reaching the flush threshold, leaving nTotalBufferedTuples != 0. Subsequent AO tuple inserts then accumulate per-tuple memory allocations, because the per-tuple cleanup never gets a chance to run while nTotalBufferedTuples != 0. That accumulation is what eventually trips the VMEM red zone seen in the log above; the simulation below illustrates the effect.