The embedded function array_agg() in GPDB only perform aggregation on a one-dimensional array. This article introduces a way to aggregate the data of several columns into one column of a two-dimensional array.
In order to aggregate multiple columns of one table to one column of a two-dimensional array, a new AGGREGATE type needs be defined.
CREATE AGGREGATE array_agg_mult (anyarray) ( SFUNC = array_cat ,STYPE = anyarray ,INITCOND ='{}');
Then use array_agg_mult(array[[col,...]]) in a SQL statement to aggregate data of multiple columns to a 2-dimensional array.
Note: Please test this function thoroughly before implementing this solution on the production system.
Example:
This example demonstrates how to perform a 2-dimensional array aggregation with a user-defined AGGREGATE type.
Create table t1 with several columns. Insert some rows of data into t1. Keep the value of the first column "id" identical among all rows for the aggregation test.
apj_4400=# create table t1(id int, c1 text, c2 text, c3 text) distributed by (id); CREATE TABLE apj_4400=# \d t1 Table "public.t1" Column | Type | Modifiers --------+---------+----------- id | integer | c1 | text | c2 | text | c3 | text | Distributed by: (id) apj_4400=# select * from t1; id | c1 | c2 | c3 ----+--------+--------+-------- 1 | col1-3 | col2-3 | col3-3 1 | col1-2 | col2-2 | col3-2 1 | col1-1 | col2-1 | col3-1 (3 rows)
Create another table t2. Column "carray" is defined as a type of 2-dimensional array text:
apj_4400=# create table t2(id int, cArray text[][]) distributed by (id); CREATE TABLE apj_4400=# \d t2 Table "public.t2" Column | Type | Modifiers --------+---------+----------- id | integer | carray | text[] | Distributed by: (id)
Define AGGREGATE.
apj_4400=# CREATE AGGREGATE array_agg_mult (anyarray) ( SFUNC = array_cat , STYPE = anyarray , INITCOND ='{}'); CREATE AGGREGATE;
Insert data into table t2 by selecting data from t1 with array_agg_mult([[...]]) to aggregate columns "c1, c2, c3" of t1 to "carray" of t2 with the 2-dimensional array.
apj_4400=# insert into t2 select id ,array_agg_mult(array[[c1, c2, c3]]) from t1 group by id; INSERT 0 1 apj_4400=# select * from t2; id | carray ----+------------------------------------------------------------------------ 1 | {{col1-3,col2-3,col3-3},{col1-2,col2-2,col3-2},{col1-1,col2-1,col3-1}} (1 row) apj_4400=# select carray[1][2] from t2; carray -------- col2-3 (1 row)