The view gp_toolkit.gp_skew_coefficients is used to show data distribution skew and this article describes how Greenplum calculates the value of the column "skccoeff" on gp_toolkit.gp_skew_coefficients view.
According to the Greenplum Administrator Guide, the column "skccoeff" is "The coefficient of variation (CV), which is calculated as the standard deviation divided by the average." In addition, the "standard deviation" and "average" are based on (select gp_segment_id,count(*) from group by gp_segment_id).
For example, The "skccoeff" on gp_toolkit.gp_skew_coefficients for testtable is:
gpadmin=# Select * from gp_toolkit.gp_skew_coefficients where skcrelname='testtable'. skcoid | skcnamespace | skcrelname | skccoeff --------+--------------+------------+----------------------- 252064 | public | testtable | 624.49979983984029000
And, that SQL that is used to calculate the column "skccoeff" value is:
gpadmin=# Select STDDEV(cnt)/avg(cnt)*100 from ( select gp_segment_id segid,count(*) cnt from testtable group by gp_segment_id union all ( select tmp2.content segid,0 cnt from gp_segment_configuration tmp2 where tmp2.content not in (select gp_segment_id segid from testtable group by gp_segment_id) and tmp2.role='p' and tmp2.content<>-1 ) ) tmpc; ?column? ---------------------- 624.4997998398402900
Pivotal Greenplum Database (GPDB) all versions