How to know the distribution column(s) in an writable external table in Greenplum 5.x
search cancel

How to know the distribution column(s) in an writable external table in Greenplum 5.x

book

Article ID: 296601

calendar_today

Updated On:

Products

VMware Tanzu Greenplum

Issue/Introduction

In Greenplum 5.x, when an external table is created with "DISTRIBUTED BY" option, the distribution columns are now displayed in the "\d+" output for the table.  For example:

gpadmin=# CREATE writable EXTERNAL TABLE ext_tab(id int, name varchar) LOCATION ('gpfdist://<ip:port>/1.dat') FORMAT 'TEXT' (DELIMITER '|' NULL ' ') DISTRIBUTED BY(name);
CREATE EXTERNAL TABLE
Time: 12.337 ms
gpadmin=# \d+ ext_tab
                 External table "public.ext_tab"
 Column |       Type        | Modifiers | Storage  | Description 
--------+-------------------+-----------+----------+-------------
 id     | integer           |           | plain    | 
 name   | character varying |           | extended | 
Type: writable
Encoding: UTF8
Format type: text
Format options: delimiter '|' null ' ' escape '\'
External options: {}
External location: gpfdist://<ip:port>/1.dat
Execute on: all segments

There is no "Distributed by" output.
This article explains how to find the "distributed by" columns for the table.


Environment

Product Version: 5.29

Resolution

1. Firstly, get the oid of the table:
gpadmin=# select oid from pg_class where relname = 'ext_tab';
  oid  
-------
 32786
(1 row)

2. Then check the column number of the distribution key:

gpadmin=# select attrnums from gp_distribution_policy where localoid = 32786;
 attrnums 
----------
 {2}
(1 row)

3. The above {2} indicates the redistribution column is the second column of the external table. This could be a list of columns. To get the name of the column(s):

gpadmin=# select attname from pg_attribute where attrelid=32786 and attnum in (2);
 attname 
---------
 name
(1 row)




Note: In Greenplum 6.x, the output from "\d+" for the table displays the "Distributed by" information.