In the event of a segment failure, customers may notice a large number of orphaned files in the (ex. /data/primary/gpseg19/base/39393 ) directory. This will cause the impacted segment to have higher disk usage than the remaining segments. In GPDB versions 6.25.x and earlier, customers had to manually identify orphaned files and delete the files. Deleting these files can be fatal if the customer accidentally removes the wrong data file(s). To streamline the process of identifying and removing orphaned files, the gp_check_functions utility was introduced in GPDB versions 6.26.x and later.
===========================================================
Orphaned files in /base directory:
[20240308 16:56 7419 gpadmin@myprod-sdw-20:/gpdata/primary/gpseg19/base/3699698]$ ls -l 6023012* -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 22:53 6023012 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 22:54 6023012.1 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:00 6023012.10 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:47 6023012.100 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:47 6023012.101 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:48 6023012.102 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:48 6023012.103 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:49 6023012.104 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:49 6023012.105 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:49 6023012.106 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:50 6023012.107 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:50 6023012.108 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:50 6023012.109 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:01 6023012.11 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:51 6023012.110 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:51 6023012.111 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:51 6023012.112 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:52 6023012.113 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:52 6023012.114 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:52 6023012.115 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:53 6023012.116 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:53 6023012.117 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:54 6023012.118 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:54 6023012.119 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:02 6023012.12 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:55 6023012.120 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:55 6023012.121 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:56 6023012.122 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:56 6023012.123 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:57 6023012.124 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:57 6023012.125 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:57 6023012.126 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:58 6023012.127 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:58 6023012.128 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:59 6023012.129 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:02 6023012.13 -rw------- 1 gpadmin gpadmin 1073741824 Nov 27 23:59 6023012.130 -rw------- 1 gpadmin gpadmin 1073741824 Nov 28 00:00 6023012.131 -rw------- 1 gpadmin gpadmin 1073741824 Nov 28 00:00 6023012.132 -rw------- 1 gpadmin gpadmin 1073741824 Nov 28 00:01 6023012.133 -rw------- 1 gpadmin gpadmin 1073741824 Nov 28 00:01 6023012.134 -rw------- 1 gpadmin gpadmin 1073741824 Nov 28 00:01 6023012.135 -rw------- 1 gpadmin gpadmin 1073741824 Nov 28 00:02 6023012.136
===============================================================
Customer tried to find what table these files map but didn't find any active table these files map to:
prdwh=# select gp_segment_id, oid, relname, relfilenode from gp_dist_random('pg_class') where relfilenode::text like '%6023012%';
gp_segment_id | oid | relname | relfilenode
---------------+-----+---------+-------------
(0 rows)
Customer then logged directly into the database on the segment to see if the segment level catalog had anything:
PGOPTIONS='-c gp_session_role=utility' psql -p 6000 -d dbname
dbname=# select gp_segment_id, oid, relname, relfilenode from gp_dist_random('pg_class') where relfilenode::text like '%6023012%';
gp_segment_id | oid | relname | relfilenode
---------------+-----+---------+-------------
(0 rows)
Greenplum Database 6.25.x and earlier.
If a customer is running 6.25.x or older, recommendation is to upgrade to GPDB 6.26.x or later to utilize the gp_check_orphaned_files utility to identify the orphaned files.
The gp_check_orphaned_files view scans the default and user-defined tablespaces for orphaned data files. Greenplum Database considers normal data files, files with an underscore (_) in the name, and extended numbered files (files that contain a .<N> in the name) in this check. gp_check_orphaned_files gathers results from the Greenplum Database master and all segments.
Reference:
gp_check_functions
gp_move_ophaned_files