New Tables are Created with Random Distribution


Article ID: 296012


Products

VMware Tanzu Greenplum

Issue/Introduction

Symptoms:

When a table is created without a distribution policy (no DISTRIBUTED BY clause), Greenplum Database does not pick the first column as the distribution column. As a result, every new table is distributed randomly.

 

Environment


Cause

This happens because the GUC (server configuration parameter) gp_create_table_random_default_distribution is set to on in your cluster.

 

Resolution

Set the parameter gp_create_table_random_default_distribution to off so that a table created without a DISTRIBUTED BY clause automatically gets a distribution column.
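
One way to make the setting persist beyond a single session is to attach it to a database or role so that new sessions inherit it. This is a sketch under assumptions, not the procedure documented in this article. The database name flightdata comes from the examples in this article; the role name etl_user is hypothetical.

```sql
-- Check the value in effect for the current session.
SHOW gp_create_table_random_default_distribution;

-- Change it for the current session only.
SET gp_create_table_random_default_distribution = off;

-- Assumed approach: store a per-database default for new sessions.
ALTER DATABASE flightdata SET gp_create_table_random_default_distribution = off;

-- Assumed approach: store a per-role default (etl_user is a hypothetical role).
ALTER ROLE etl_user SET gp_create_table_random_default_distribution = off;
```

Per-database and per-role defaults apply only to sessions started after the change, so reconnect and run SHOW again to confirm the value.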

The following psql sessions show the behavior with the parameter off and on.

1. Creating a table with the parameter OFF.

flightdata=# show gp_create_table_random_default_distribution;
 gp_create_table_random_default_distribution
---------------------------------------------
 off
(1 row)


flightdata=# create table p1 ( a int , b int );
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'a' as the Greenplum Database data distribution key for this table.
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE

flightdata=# \d p1
      Table "public.p1"
 Column |  Type   | Modifiers
--------+---------+-----------
 a      | integer |
 b      | integer |
Distributed by: (a)

2. Turning the parameter ON causes any new table without a DISTRIBUTED BY clause to be created with random distribution.

flightdata=# set gp_create_table_random_default_distribution=on;
SET

flightdata=# create table p2 ( a int , b int );
NOTICE:  Using default RANDOM distribution since no distribution was specified.
HINT:  Consider including the 'DISTRIBUTED BY' clause to determine the distribution of rows.
CREATE TABLE

flightdata=# \d p2
      Table "public.p2"
 Column |  Type   | Modifiers
--------+---------+-----------
 a      | integer |
 b      | integer |
Distributed randomly
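
Tables created while the parameter was on keep their random distribution after it is turned off. The following is a minimal sketch of redistributing such a table and of avoiding the default altogether. It assumes column a is a reasonable distribution key for p2; the table name p3 is hypothetical.

```sql
-- Redistribute the existing randomly distributed table on column a.
ALTER TABLE p2 SET DISTRIBUTED BY (a);

-- Declare the policy explicitly so the result does not depend on the parameter.
CREATE TABLE p3 (a int, b int) DISTRIBUTED BY (a);

-- Random distribution can still be requested on purpose:
-- CREATE TABLE p3 (a int, b int) DISTRIBUTED RANDOMLY;
```

ALTER TABLE ... SET DISTRIBUTED BY moves rows between segments, so it can take a while on a large table. Running \d p2 afterwards should show the new distribution policy.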