Script - Checking the workfile, spill file, and processing skew usage

search cancel

Script - Checking the workfile, spill file, and processing skew usage

book

Article ID: 297003

calendar_today

Updated On:

Products

VMware Tanzu Greenplum

Issue/Introduction

This article discusses ways to check workfiles, spillfiles, and processing skew usage for Greenplum Database (GPDB).

The queries use the gp_toolkit schema.

Environment

Product Version: 6.x and above

Resolution

The values of the workfile limits control the amount or size of the spill files.

SELECT name, setting
FROM pg_settings 
WHERE name like 'gp_workfile_limit%';

Check the process information with the highest amount of spill files or processing skew.

SELECT datname database_name,
       pid process_id,
       sess_id session_id,
       sum(size)/1024::float total_size_kb,
       sum(numfiles) total_num_files
FROM  gp_toolkit.gp_workfile_usage_per_query
GROUP BY 1,2,3
ORDER BY 4 DESC;

Check the query that is executed by the process. Supply the processID and sessionID of interest from the first query.

SELECT distinct datname database_name,
       pid process_id,
       sess_id session_id,
       substring(query from 1 for 300) as query
FROM  gp_toolkit.gp_workfile_usage_per_query
WHERE pid=<process_id>
AND sess_id=<session_id>
ORDER BY 2;

Check that the total amount of workfiles generated and the total size per segment by the process. Supply the processID and sessionID of interest from the first query.

SELECT datname database_name,
       pid process_id,
       sess_id session_id,
       segid segment_id,
       command_cnt command_num,
       optype operator_type,
       slice executing_slice,
       size/1024 total_size_kb,
       numfiles total_num_files
FROM gp_toolkit.gp_workfile_entries
WHERE pid=<Process ID>
AND sess_id=<Session ID>
ORDER BY 8 DESC;

Additional Information

For more information on the tables or columns used in the queries please refer to the documentation.

Feedback

thumb_up Yes

thumb_down No