Window functions provide the ability to perform calculations across sets of rows that are related to the current query row. If you want to use FIRST_VALUE window function ORDER BY some column with NULLs, then NULLs are always LAST when we expect the NULL as FIRST.
In Greenplum (GPDB), when FIRST_VALUE window function is used, then it will not take NULL as the FIRST value.
Example:
The source data is like below:
[source data] Table Name: Employees dept_id salary ------------- 10 1000 10 1000 10 2000 10 3000 20 5000 20 6000 20 NULL a1=# SELECT dept_id, salary, FIRST_VALUE(salary) OVER(ORDER BY salary) first_sal FROM Employees; DEPT_ID SALARY FIRST_SAL 20 NULL NULL 10 1000 NULL 10 1000 NULL 10 2000 NULL 10 3000 NULL 20 5000 NULL 20 6000 NULL a1=# SELECT dept_id, salary, FIRST_VALUE(salary) OVER(PARTITION BY dept_id ORDER BY salary) group_first FROM Employees; DEPT_ID SALARY GROUP_FIRST 10 1000 1000 10 1000 1000 10 2000 1000 10 3000 1000 20 NULL NULL 20 5000 NULL 20 6000 NULL
In Greenplum, we will have to modify the query as below:
a1=# SELECT dept_id, salary, FIRST_VALUE(salary) OVER(ORDER BY (salary is not null),salary) first_sal FROM Employees; dept_id | salary | first_sal ---------+--------+----------- 20 | | 10 | 1000 | 10 | 1000 | 10 | 2000 | 10 | 3000 | 20 | 5000 | 20 | 6000 | (7 rows) a1=# SELECT dept_id, salary , FIRST_VALUE(salary) OVER(PARTITION BY dept_id ORDER BY (salary is not null),salary) group_first FROM Employees; dept_id | salary | group_first ---------+--------+------------- 10 | 1000 | 1000 10 | 1000 | 1000 10 | 2000 | 1000 10 | 3000 | 1000 20 | | 20 | 5000 | 20 | 6000 | (7 rows)
If you are concerned about displaying NULL in the output, modify the query using COALESCE function. Below is an example to get the same output:
a1=# SELECT dept_id, coalesce(salary::text, 'NULL'), coalesce((FIRST_VALUE(salary) OVER(ORDER BY (salary is not null),salary))::text, 'NULL') as first_sal FROM Employees; dept_id | coalesce | first_sal ---------+----------+----------- 20 | NULL | NULL 10 | 1000 | NULL 10 | 1000 | NULL 10 | 2000 | NULL 10 | 3000 | NULL 20 | 5000 | NULL 20 | 6000 | NULL (7 rows) a1=# SELECT dept_id, coalesce(salary::text,'NULL') , coalesce((FIRST_VALUE(salary) OVER(PARTITION BY dept_id ORDER BY (salary is not null),salary))::text, 'NULL') group_first FROM Employees; dept_id | coalesce | group_first ---------+----------+------------- 20 | NULL | NULL 20 | 5000 | NULL 20 | 6000 | NULL 10 | 1000 | 1000 10 | 1000 | 1000 10 | 2000 | 1000 10 | 3000 | 1000 (7 rows)
Pivotal Greenplum Database (GPDB) all versions