This article explains how to assign a default value to a column that does not have values on the input file.
The table "test" has a column "c" which has the default value of "0" if no records are assigned to the column.
flightdata=# \d test Table "public.test" Column | Type | Modifiers --------+---------+----------- a | text | b | text | c | integer | default 0 Distributed by: (a)Load the data. Observe that the last row has NULL value in last column.
[gpadmin@gpdbsne tmp]$ cat gpload.data a|b|1 c|d|2 e|f|3 g|h|Load the data with gpload using the following command:
[gpadmin@gpdbsne tmp]$ cat /tmp/gpload.paramters VERSION: 1.0.0.1 DATABASE: flightdata USER: gpadmin HOST: gpdbsne PORT: 3014 GPLOAD: INPUT: - SOURCE: LOCAL_HOSTNAME: - gpdbsne PORT: 8081 FILE: - /tmp/gpload.data - COLUMNS: - a: text - b: text - c: int - FORMAT: text - DELIMITER: '|' - NULL_AS: '' OUTPUT: - TABLE: public.test - MODE: insert [gpadmin@gpdbsne tmp]$ gpload -f /tmp/gpload.paramters 2016-03-09 12:05:05|INFO|gpload session started 2016-03-09 12:05:05 2016-03-09 12:05:05|INFO|started gpfdist -p 8081 -P 8082 -f "/tmp/gpload.data" -t 30 2016-03-09 12:05:05|INFO|running time: 0.14 seconds 2016-03-09 12:05:06|INFO|rows Inserted = 4 2016-03-09 12:05:06|INFO|rows Updated = 0 2016-03-09 12:05:06|INFO|data formatting errors = 0 2016-03-09 12:05:06|INFO|gpload succeededObserve that he column is not populated with the default value:
flightdata=# select * from test; a | b | c ---+---+--- a | b | 1 c | d | 2 e | f | 3 g | h | (4 rows)
flightdata=# truncate table test ; TRUNCATE TABLE flightdata=# \q2. Modify the YAML file (gpload control file) and use the "MAPPING" keyword, to assign values to the column if found.
[gpadmin@gpdbsne tmp]$ cat /tmp/gpload.paramters VERSION: 1.0.0.1 DATABASE: flightdata USER: gpadmin HOST: gpdbsne PORT: 3014 GPLOAD: INPUT: - SOURCE: LOCAL_HOSTNAME: - gpdbsne PORT: 8081 FILE: - /tmp/gpload.data - COLUMNS: - a: text - b: text - c: int - FORMAT: text - DELIMITER: '|' - NULL_AS: '' OUTPUT: - MAPPING : a : a b : b c : 'case coalesce(c,0) when 0::numeric then 0::numeric else c end' - TABLE: public.test - MODE: insert3. Execute the gpload command:
[gpadmin@gpdbsne tmp]$ gpload -f /tmp/gpload.paramters 2016-03-09 12:07:43|INFO|gpload session started 2016-03-09 12:07:43 2016-03-09 12:07:43|INFO|started gpfdist -p 8081 -P 8082 -f "/tmp/gpload.data" -t 30 2016-03-09 12:07:43|INFO|running time: 0.20 seconds 2016-03-09 12:07:43|INFO|rows Inserted = 4 2016-03-09 12:07:43|INFO|rows Updated = 0 2016-03-09 12:07:43|INFO|data formatting errors = 0 2016-03-09 12:07:43|INFO|gpload succeeded [gpadmin@gpdbsne tmp]$ psql psql (8.2.15) Type "help" for help.4. The data loaded is now formatted correctly, as shown below:
flightdata=# select * from test; a | b | c ---+---+--- a | b | 1 c | d | 2 e | f | 3 g | h | 0 (4 rows)