In Greenplum 5.x, when an external table is created with "DISTRIBUTED BY" option, the distribution columns are now displayed in the "\d+" output for the table. For example:
gpadmin=# CREATE writable EXTERNAL TABLE ext_tab(id int, name varchar) LOCATION ('gpfdist://<ip:port>/1.dat') FORMAT 'TEXT' (DELIMITER '|' NULL ' ') DISTRIBUTED BY(name);
CREATE EXTERNAL TABLE
Time: 12.337 ms
gpadmin=# \d+ ext_tab
External table "public.ext_tab"
Column | Type | Modifiers | Storage | Description
--------+-------------------+-----------+----------+-------------
id | integer | | plain |
name | character varying | | extended |
Type: writable
Encoding: UTF8
Format type: text
Format options: delimiter '|' null ' ' escape '\'
External options: {}
External location: gpfdist://<ip:port>/1.dat
Execute on: all segments
There is no "Distributed by" output.
This article explains how to find the "distributed by" columns for the table.
gpadmin=# select oid from pg_class where relname = 'ext_tab'; oid ------- 32786 (1 row)
2. Then check the column number of the distribution key:
gpadmin=# select attrnums from gp_distribution_policy where localoid = 32786;
attrnums
----------
{2}
(1 row)
3. The above {2} indicates the redistribution column is the second column of the external table. This could be a list of columns. To get the name of the column(s):
gpadmin=# select attname from pg_attribute where attrelid=32786 and attnum in (2); attname --------- name (1 row)
Note: In Greenplum 6.x, the output from "\d+" for the table displays the "Distributed by" information.