When a new table is created without a distribution policy, Greenplum Database (gpdb) does not automatically pick a column as the distribution key. As a result, every such table is distributed randomly.
This happens because the GUC gp_create_table_random_default_distribution is set to ON in your cluster.
Set gp_create_table_random_default_distribution to OFF so that a table created without a DISTRIBUTED BY clause automatically picks a column for distributing its data.
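One way to apply the change can be sketched as follows. This is a minimal sketch, assuming the parameter can be set at session and database level in your Greenplum release; the database name flightdata is taken from the example session in this article.

```sql
-- Check the current value of the parameter.
SHOW gp_create_table_random_default_distribution;

-- Turn it off for the current session only.
SET gp_create_table_random_default_distribution = off;

-- Assumption: persist the setting for new sessions on one database.
ALTER DATABASE flightdata SET gp_create_table_random_default_distribution = off;
```

For a cluster-wide change, the gpconfig utility is the usual route (for instance, gpconfig -c gp_create_table_random_default_distribution -v off followed by gpstop -u to reload the configuration). Check the documentation for your release to confirm that a reload is sufficient.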
For example,
1. Creating a table with the parameter OFF.
flightdata=# show gp_create_table_random_default_distribution;
 gp_create_table_random_default_distribution
---------------------------------------------
 off
(1 row)

flightdata=# create table p1 ( a int , b int );
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'a' as the Greenplum Database data distribution key for this table.
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
flightdata=# \d p1
      Table "public.p1"
 Column |  Type   | Modifiers
--------+---------+-----------
 a      | integer |
 b      | integer |
Distributed by: (a)
2. With the parameter turned ON, any new table created without a DISTRIBUTED BY clause gets a random distribution.
flightdata=# set gp_create_table_random_default_distribution=on;
SET
flightdata=# create table p2 ( a int , b int );
NOTICE:  Using default RANDOM distribution since no distribution was specified.
HINT:  Consider including the 'DISTRIBUTED BY' clause to determine the distribution of rows.
CREATE TABLE
flightdata=# \d p2
      Table "public.p2"
 Column |  Type   | Modifiers
--------+---------+-----------
 a      | integer |
 b      | integer |
Distributed randomly
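
Independently of this parameter, the distribution key can be stated explicitly, which avoids relying on the default altogether. The following is a minimal sketch: the table name p3 is hypothetical, and p2 is the randomly distributed table created in the example above.

```sql
-- Explicit distribution key: the value of the GUC no longer matters for this table.
CREATE TABLE p3 ( a int, b int ) DISTRIBUTED BY (a);

-- Change an existing randomly distributed table to hash distribution on column a.
ALTER TABLE p2 SET DISTRIBUTED BY (a);
```

Note that ALTER TABLE ... SET DISTRIBUTED BY redistributes the table's rows across the segments, so it can be expensive on large tables.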