Checking for data skew with the view gp_toolkit.gp_skew_coefficients can be quite slow, because the view calculates skew from the row count of each table, as described in the article linked here. As a result, the larger the relations in the database, or the more of them there are, the longer the check takes to complete.
For GPDB Version 4.x or 5.x
This method uses a view that examines the file sizes of each table, segment by segment. The output includes only those tables where at least one segment holds more than 20% more bytes than expected.
Two variants of the same script are provided. The first gives a quick summary of the OS file size distribution; the second provides more detailed information.
Script 1
1. Execute the following script in your database to create the user-defined function (UDF) "fn_create_db_files()":
CREATE OR REPLACE FUNCTION fn_create_db_files() RETURNS void AS
$$
DECLARE
    v_function_name text := 'fn_create_db_files';
    v_location int;
    v_sql text;
    v_db_oid text;
    v_num_segments numeric;
    v_skew_amount numeric;
    v_version text;
BEGIN
    v_location := 1000;
    SELECT oid INTO v_db_oid
    FROM pg_database
    WHERE datname = current_database();

    v_location := 2000;
    v_sql := 'DROP VIEW IF EXISTS vw_file_skew';
    v_location := 2100;
    EXECUTE v_sql;

    v_location := 2200;
    v_sql := 'DROP EXTERNAL TABLE IF EXISTS db_files';
    v_location := 2300;
    EXECUTE v_sql;

    v_location := 3000;
    v_sql := 'CREATE EXTERNAL WEB TABLE db_files ' ||
             '(segment_id int, relfilenode text, filename text, ' ||
             'size numeric) ' ||
             'execute E''ls -l $GP_SEG_DATADIR/base/' || v_db_oid || ' | ' ||
             'grep gpadmin | ' ||
             E'awk {''''print ENVIRON["GP_SEGMENT_ID"] "\\t" $9 "\\t" ' ||
             'ENVIRON["GP_SEG_DATADIR"] "/' || v_db_oid ||
             E'/" $9 "\\t" $5''''}'' on all ' ||
             'format ''text''';
    v_location := 3100;
    EXECUTE v_sql;

    v_location := 4000;
    SELECT count(*) INTO v_num_segments
    FROM gp_segment_configuration
    WHERE preferred_role = 'p' AND content >= 0;

    v_location := 4100;
    v_skew_amount := 1.2*(1/v_num_segments);

    v_location := 4200;
    SELECT CASE WHEN POSITION('Greenplum Database 5' in v) > 0 THEN 'GPDB5'
                WHEN POSITION('Greenplum Database 4' in v) > 0 THEN 'GPDB4'
                ELSE 'other'
           END INTO v_version
    FROM version() AS v;

    v_sql := 'CREATE OR REPLACE VIEW vw_file_skew AS ' ||
             'SELECT schema_name, ' ||
             'table_name, ' ||
             'max(size)/sum(size) as largest_segment_percentage, ' ||
             'sum(size) as total_size ' ||
             'FROM ( ' ||
             'SELECT n.nspname AS schema_name, ' ||
             ' c.relname AS table_name, ' ||
             ' sum(db.size) as size ' ||
             ' FROM db_files db ';

    IF v_version = 'GPDB4' THEN
        v_sql := v_sql || ' JOIN pg_class c ON ';
    ELSE
        v_sql := v_sql || ' JOIN gp_dist_random(''pg_class'') c ON ';
    END IF;

    v_sql := v_sql ||
             ' (split_part(db.relfilenode, ''.'', 1))::text = c.relfilenode::text ' ||
             ' JOIN pg_namespace n ON c.relnamespace = n.oid ' ||
             ' WHERE c.relkind = ''r'' and c.relstorage not in (''x'', ''v'', ''f'')' ||
             ' GROUP BY n.nspname, c.relname, db.segment_id ' ||
             ') as sub ' ||
             'GROUP BY schema_name, table_name ' ||
             'HAVING sum(size) > 0 and max(size)/sum(size) > ' || v_skew_amount::text || ' ' ||
             'ORDER BY largest_segment_percentage DESC, schema_name, ' ||
             'table_name';

    v_location := 4300;
    EXECUTE v_sql;

EXCEPTION
    WHEN OTHERS THEN
        RAISE EXCEPTION '(%:%:%)', v_function_name, v_location, sqlerrm;
END;
$$ language plpgsql;
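The external web table above shells out to `ls -l` on each segment's data directory, filters to files owned by gpadmin, and uses awk to emit the segment ID, relfilenode, file path, and byte size as tab-separated columns. A minimal Python sketch of that same parse, using a made-up `ls -l` line and made-up data-directory and database OID values:

```python
import os

def parse_ls_line(line, segment_id, datadir, db_oid):
    """Mimic the awk step: return (segment_id, relfilenode, full path, size)
    for one line of `ls -l` output ($5 is the size, $9 is the file name)."""
    fields = line.split()
    size, name = int(fields[4]), fields[8]
    return (segment_id, name, os.path.join(datadir, "base", db_oid, name), size)

# Illustrative `ls -l` line for relfilenode 214664 (all values are made up)
sample = "-rw------- 1 gpadmin gpadmin 2745303040 Jan 01 12:00 214664.1"
row = parse_ls_line(sample, segment_id=1,
                    datadir="/data/primary/gpseg1", db_oid="16384")
print(row)  # → (1, '214664.1', '/data/primary/gpseg1/base/16384/214664.1', 2745303040)
```

Note that relfilenodes over 1 GB are split into numbered files such as `214664.1`, which is why the view later strips the suffix with `split_part(relfilenode, '.', 1)` before joining to pg_class.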
2. Execute the function to create the necessary objects in your database:
select * from fn_create_db_files();
3. Select from the view to see which tables are skewed:
SELECT * FROM vw_file_skew ORDER BY 3 DESC;
Check the value of largest_segment_percentage in the view. The closer the value is to 1, the more data one segment holds compared to the rest of the segments in the cluster. For example, largest_segment_percentage = 0.83 means that 83% of the table's size resides on a single segment.
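The flagging threshold in the view comes from v_skew_amount := 1.2*(1/v_num_segments): with a perfectly even distribution each segment holds 1/N of the bytes, so a table is reported only when its largest segment holds at least 20% more than that even share. A small worked sketch of that arithmetic (the segment count and per-segment sizes are made up):

```python
def skew_threshold(num_segments):
    # Even distribution puts 1/N of the bytes on each segment;
    # flag tables whose largest segment exceeds that share by 20%.
    return 1.2 * (1.0 / num_segments)

def largest_segment_percentage(seg_sizes):
    return max(seg_sizes) / sum(seg_sizes)

# Hypothetical 4-segment cluster: the even share is 0.25, the threshold 0.30
sizes = [100, 110, 830, 95]          # bytes of one table on each segment
pct = largest_segment_percentage(sizes)
print(round(pct, 2), pct > skew_threshold(4))  # → 0.73 True
```

Here one segment holds 73% of the table, well above the 30% threshold, so the table would appear in vw_file_skew.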
Script 2
This is the same check as Script 1, but it reports more information on the table distribution (based on the tables' OS file sizes).
1. Create the function:
CREATE OR REPLACE FUNCTION public.fn_get_skew(
    out schema_name varchar,
    out table_name varchar,
    out pTableName varchar,
    out total_size_GB numeric(15,2),
    out seg_min_size_GB numeric(15,2),
    out seg_max_size_GB numeric(15,2),
    out seg_avg_size_GB numeric(15,2),
    out seg_gap_min_max_percent numeric(6,2),
    out seg_gap_min_max_GB numeric(15,2),
    out nb_empty_seg int)
RETURNS SETOF record AS
$$
DECLARE
    v_function_name text := 'fn_get_skew';
    v_location int;
    v_sql text;
    v_db_oid text;
    v_num_segments numeric;
    v_skew_amount numeric;
    v_res record;
BEGIN
    v_location := 1000;
    SELECT oid INTO v_db_oid
    FROM pg_database
    WHERE datname = current_database();

    v_location := 2200;
    v_sql := 'DROP EXTERNAL TABLE IF EXISTS public.db_files_ext';
    v_location := 2300;
    EXECUTE v_sql;

    v_location := 3000;
    v_sql := 'CREATE EXTERNAL WEB TABLE public.db_files_ext ' ||
             '(segment_id int, relfilenode text, filename text, ' ||
             'size numeric) ' ||
             'execute E''ls -l $GP_SEG_DATADIR/base/' || v_db_oid || ' | ' ||
             'grep gpadmin | ' ||
             E'awk {''''print ENVIRON["GP_SEGMENT_ID"] "\\t" $9 "\\t" ' ||
             'ENVIRON["GP_SEG_DATADIR"] "/' || v_db_oid ||
             E'/" $9 "\\t" $5''''}'' on all ' ||
             'format ''text''';
    v_location := 3100;
    EXECUTE v_sql;

    v_location := 4000;
    for v_res in (
        select sub.vschema_name,
               sub.vtable_name,
               (sum(sub.size)/(1024^3))::numeric(15,2) AS vtotal_size_GB,
               --Size on segments
               (min(sub.size)/(1024^3))::numeric(15,2) as vseg_min_size_GB,
               (max(sub.size)/(1024^3))::numeric(15,2) as vseg_max_size_GB,
               (avg(sub.size)/(1024^3))::numeric(15,2) as vseg_avg_size_GB,
               --Percentage of gap between smaller segment and bigger segment
               (100*(max(sub.size) - min(sub.size))/greatest(max(sub.size),1))::numeric(6,2) as vseg_gap_min_max_percent,
               ((max(sub.size) - min(sub.size))/(1024^3))::numeric(15,2) as vseg_gap_min_max_GB,
               count(sub.size) filter (where sub.size = 0) as vnb_empty_seg
        from (
            SELECT n.nspname AS vschema_name,
                   c.relname AS vtable_name,
                   db.segment_id,
                   sum(db.size) AS size
            FROM ONLY public.db_files_ext db
                 JOIN pg_class c ON split_part(db.relfilenode, '.'::text, 1) = c.relfilenode::text
                 JOIN pg_namespace n ON c.relnamespace = n.oid
            WHERE c.relkind = 'r'::"char"
              and n.nspname not in ('pg_catalog','information_schema','gp_toolkit')
              and not n.nspname like 'pg_temp%'
            GROUP BY n.nspname, c.relname, db.segment_id
        ) sub
        group by 1,2
        --Extract only table bigger than 1 GB
        -- and with a skew greater than 20%
        /*having sum(sub.size)/(1024^3) > 1
          and (100*(max(sub.size) - min(sub.size))/greatest(max(sub.size),1))::numeric(6,2) > 20
        order by 1,2,3
        limit 100*/
    ) loop
        schema_name = v_res.vschema_name;
        table_name = v_res.vtable_name;
        total_size_GB = v_res.vtotal_size_GB;
        seg_min_size_GB = v_res.vseg_min_size_GB;
        seg_max_size_GB = v_res.vseg_max_size_GB;
        seg_avg_size_GB = v_res.vseg_avg_size_GB;
        seg_gap_min_max_percent = v_res.vseg_gap_min_max_percent;
        seg_gap_min_max_GB = v_res.vseg_gap_min_max_GB;
        nb_empty_seg = v_res.vnb_empty_seg;
        return next;
    end loop;

    v_location := 4100;
    v_sql := 'DROP EXTERNAL TABLE IF EXISTS public.db_files_ext';
    v_location := 4200;
    EXECUTE v_sql;

    return;

EXCEPTION
    WHEN OTHERS THEN
        RAISE EXCEPTION '(%:%:%)', v_function_name, v_location, sqlerrm;
END;
$$ language plpgsql;
2. Use the following SQL to get the table distribution (based on the tables' OS file sizes):
select * from public.fn_get_skew();
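Script 2's seg_gap_min_max_percent column is 100*(max - min)/max over a table's per-segment sizes, with greatest(max, 1) guarding against division by zero for empty tables, and nb_empty_seg counts segments holding zero bytes. A sketch of those two metrics (the per-segment sizes are made up):

```python
def gap_percent(seg_sizes):
    # 100 * (max - min) / max, with the greatest(max, 1) guard
    return 100.0 * (max(seg_sizes) - min(seg_sizes)) / max(max(seg_sizes), 1)

def empty_segments(seg_sizes):
    return sum(1 for s in seg_sizes if s == 0)

sizes = [0, 4 * 1024**3, 4 * 1024**3, 0]   # two empty segments, two with 4 GB
print(gap_percent(sizes), empty_segments(sizes))  # → 100.0 2
```

A gap of 0% means the smallest and largest segments are identical in size; a gap near 100% means at least one segment holds almost none of the table.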
For GPDB Version 6.x
Starting with GPDB 6, the relfilenode is no longer the same across segments. As a result, the scripts above may produce errors. If you are using GPDB 6 or later, we recommend the alternative script below to extract the skew at the OS filesystem level.
1. Create the function:
DROP FUNCTION IF EXISTS public.greenplum_check_skew();

CREATE FUNCTION public.greenplum_check_skew()
RETURNS TABLE (
    relation text,
    vtotal_size_gb numeric,
    vseg_min_size_gb numeric,
    vseg_max_size_gb numeric,
    vseg_avg_size_gb numeric,
    vseg_gap_min_max_percent numeric,
    vseg_gap_min_max_gb numeric,
    vnb_empty_seg bigint)
LANGUAGE plpgsql
AS $$
DECLARE
    v_function_name text := 'greenplum_check_skew';
    v_location int;
    v_sql text;
    v_db_oid text;
BEGIN
    /* The function checks the skew on a Greenplum table via OS file size.
     * The table "greenplum_get_refilenodes" collects all the relfilenodes found in the database from all segments.
     * The external table "greenplum_get_db_file_ext" collects all the files in the base directory.
     * The view "greenplum_get_file_statistics" combines the relfilenodes and displays the size of the relation by segment.
     * The view "greenplum_get_skew_report" provides a high-level overview of the skew.
     */

    -- Set the client min messages to just warning
    v_location := 1000;
    SET client_min_messages TO WARNING;

    -- Get the database oid
    v_location := 2000;
    SELECT d.oid INTO v_db_oid
    FROM pg_database d
    WHERE datname = current_database();

    -- Drop the temp table if it exists
    v_location := 3000;
    v_sql := 'DROP TABLE IF EXISTS public.greenplum_get_refilenodes CASCADE';
    v_location := 3100;
    EXECUTE v_sql;

    -- Temp table to temporarily store the relfile records
    v_location := 4000;
    v_sql := 'CREATE TABLE public.greenplum_get_refilenodes ('
             ' segment_id int,'
             ' o oid,'
             ' relname name,'
             ' relnamespace oid,'
             ' relkind char,'
             ' relfilenode bigint'
             ')';
    v_location := 4100;
    EXECUTE v_sql;

    -- Store all the data related to the relfilenodes from all
    -- the segments into the temp table
    v_location := 5000;
    v_sql := 'INSERT INTO public.greenplum_get_refilenodes SELECT '
             ' s.gp_segment_id segment_id, '
             ' s.oid o, '
             ' s.relname, '
             ' s.relnamespace,'
             ' s.relkind,'
             ' s.relfilenode '
             'FROM '
             ' gp_dist_random(''pg_class'') s ' -- all segments
             'UNION '
             ' SELECT '
             ' m.gp_segment_id segment_id, '
             ' m.oid o, '
             ' m.relname, '
             ' m.relnamespace,'
             ' m.relkind,'
             ' m.relfilenode '
             'FROM '
             ' pg_class m '; -- relfiles from master
    v_location := 5100;
    EXECUTE v_sql;

    -- Drop the external table if it exists
    v_location := 6000;
    v_sql := 'DROP EXTERNAL WEB TABLE IF EXISTS public.greenplum_get_db_file_ext';
    v_location := 6100;
    EXECUTE v_sql;

    -- Create an external table that runs a shell script to extract all
    -- the files in the base directory
    v_location := 7000;
    v_sql := 'CREATE EXTERNAL WEB TABLE public.greenplum_get_db_file_ext ' ||
             '(segment_id int, relfilenode text, filename text, ' ||
             'size numeric) ' ||
             'execute E''ls -l $GP_SEG_DATADIR/base/' || v_db_oid || ' | ' ||
             'grep gpadmin | ' ||
             E'awk {''''print ENVIRON["GP_SEGMENT_ID"] "\\t" $9 "\\t" ' ||
             'ENVIRON["GP_SEG_DATADIR"] "/base/' || v_db_oid ||
             E'/" $9 "\\t" $5''''}'' on all ' ||
             'format ''text''';
    v_location := 7100;
    EXECUTE v_sql;

    -- Drop the datafile statistics view if it exists
    v_location := 8000;
    v_sql := 'DROP VIEW IF EXISTS public.greenplum_get_file_statistics';
    v_location := 8100;
    EXECUTE v_sql;

    -- Create a view to get all the datafile statistics
    v_location := 9000;
    v_sql := 'CREATE VIEW public.greenplum_get_file_statistics AS '
             'SELECT '
             ' n.nspname || ''.'' || c.relname relation, '
             ' osf.segment_id, '
             ' split_part(osf.relfilenode, ''.'' :: text, 1) relfilenode, '
             ' c.relkind, '
             ' sum(osf.size) size '
             'FROM '
             ' public.greenplum_get_db_file_ext osf '
             ' JOIN public.greenplum_get_refilenodes c ON ('
             ' c.segment_id = osf.segment_id '
             ' AND split_part(osf.relfilenode, ''.'' :: text, 1) = c.relfilenode :: text'
             ' ) '
             ' JOIN pg_namespace n ON c.relnamespace = n.oid '
             'WHERE '
             ' osf.relfilenode ~ ''(\d+(?:\.\d+)?)'' '
             -- ' AND c.relkind = ''r'' :: char '
             ' AND n.nspname not in ('
             ' ''pg_catalog'', '
             ' ''information_schema'', '
             ' ''gp_toolkit'' '
             ' ) '
             ' AND not n.nspname like ''pg_temp%'' '
             ' GROUP BY 1,2,3,4';
    v_location := 9100;
    EXECUTE v_sql;

    -- Drop the skew report view if it exists
    v_location := 10000;
    v_sql := 'DROP VIEW IF EXISTS public.greenplum_get_skew_report';
    v_location := 10100;
    EXECUTE v_sql;

    -- Create a view to get all the table skew statistics
    v_location := 11000;
    v_sql := 'CREATE VIEW public.greenplum_get_skew_report AS '
             'SELECT '
             ' sub.relation relation,'
             ' (sum(sub.size)/(1024^3))::numeric(15,2) AS vtotal_size_GB,'
             --Size on segments
             ' (min(sub.size)/(1024^3))::numeric(15,2) AS vseg_min_size_GB,'
             ' (max(sub.size)/(1024^3))::numeric(15,2) AS vseg_max_size_GB,'
             ' (avg(sub.size)/(1024^3))::numeric(15,2) AS vseg_avg_size_GB,'
             --Percentage of gap between smaller segment and bigger segment
             ' (100*(max(sub.size) - min(sub.size))/greatest(max(sub.size),1))::numeric(6,2) AS vseg_gap_min_max_percent,'
             ' ((max(sub.size) - min(sub.size))/(1024^3))::numeric(15,2) AS vseg_gap_min_max_GB,'
             ' count(sub.size) filter (where sub.size = 0) AS vnb_empty_seg '
             'FROM '
             ' public.greenplum_get_file_statistics sub'
             ' GROUP BY 1';
    v_location := 11100;
    EXECUTE v_sql;

    -- Return the data back
    RETURN query (SELECT * FROM public.greenplum_get_skew_report a);

-- Throw the exception wherever it encounters one
EXCEPTION
    WHEN OTHERS THEN
        RAISE EXCEPTION '(%:%:%)', v_function_name, v_location, sqlerrm;
END;
$$;
2. Execute the function to generate the skew report:
SELECT * FROM public.greenplum_check_skew();
3. For details on table size by segment, use this view:
SELECT * FROM public.greenplum_get_file_statistics;
How to Read the Output
gpadmin=# SELECT * FROM public.greenplum_check_skew();
               relation                | vtotal_size_gb | vseg_min_size_gb | vseg_max_size_gb | vseg_avg_size_gb | vseg_gap_min_max_percent | vseg_gap_min_max_gb | vnb_empty_seg
---------------------------------------+----------------+------------------+------------------+------------------+--------------------------+---------------------+---------------
 ....
 public.skewed_table                   |           2.56 |             0.00 |             2.56 |             1.28 |                   100.00 |                2.56 |             0
 ....
gpadmin=# select * FROM public.greenplum_get_file_statistics where relation='public.skewed_table';
      relation       | segment_id | relfilenode | relkind |    size
---------------------+------------+-------------+---------+------------
 public.skewed_table |          1 |      214664 | r       | 2745303040
 public.skewed_table |          0 |      214664 | r       |      98304
(2 rows)
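In the example above, segment 1 holds 2,745,303,040 bytes of public.skewed_table while segment 0 holds only 98,304 bytes, which is why the skew report shows a min-max gap that rounds to 100.00%. Reproducing the report's arithmetic on those two sizes:

```python
sizes = [2745303040, 98304]   # per-segment bytes from the example output

# Total size in GB, and gap percentage, as computed by the report views
total_gb = round(sum(sizes) / 1024**3, 2)
gap_pct = round(100 * (max(sizes) - min(sizes)) / max(sizes), 2)

print(total_gb, gap_pct)  # → 2.56 100.0
```

Virtually all of the table sits on one segment, so this table badly needs a better distribution key.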
NOTE:
While the method described above (calculating data skew from OS file size) is useful, it has one significant limitation: table bloat.
When DML operations are performed on a table, freed space is not always returned to the OS, so a skew calculation based on OS file size also counts the space occupied by table bloat.
To understand this issue in more detail, we recommend reviewing the information on "Bloat in HEAP Tables" and the various options for removing bloat from a table.