A basic "SELECT *" with an "ORDER BY" clause runs longer than expected, exceeding the throughput that the Greenplum cluster's hardware, node count, and segment count would suggest.
While a well-scaled Greenplum Database cluster should significantly improve runtime performance, especially for basic SQL, a common contributor to overall execution time is the rate at which the client fetches the result set.
To assess the impact of result-set fetching, compare the EXPLAIN ANALYZE output with the actual execution time observed from the client application.
The following example demonstrates the performance impact at different returned-row counts:
testdb=# CREATE TABLE test AS SELECT generate_series(1, 1000000);
NOTICE: Table doesn't have 'DISTRIBUTED BY' clause. Creating a NULL policy entry.
SELECT 1000000
testdb=#
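The NOTICE above can be avoided by declaring an explicit distribution key in the CTAS statement; a minimal sketch (the column alias "n" is an assumption, since the original statement leaves the generated column unnamed):

```sql
-- Same test data, but with an explicit distribution key so Greenplum
-- hashes rows across segments instead of creating a NULL policy entry.
-- The alias "n" is illustrative; any column name works.
CREATE TABLE test AS
SELECT generate_series(1, 1000000) AS n
DISTRIBUTED BY (n);
```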
The following execution time results from running a basic "SELECT *" with "ORDER BY" and fetching all one million rows in the client application:
testdb=# \timing
Timing is on.
testdb=# \o /dev/null
testdb=# select * from test order by 1;
Time: 9328.913 ms
testdb=#
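When the client must consume the full result set, the fetch cost can also be spread out and measured in batches through a cursor, separating the per-batch fetch rate from server-side execution; a minimal sketch (the cursor name "c" and the batch size are arbitrary):

```sql
-- Fetch the same sorted result set in fixed-size batches, which lets the
-- client time each fetch independently of query execution on the segments.
BEGIN;
DECLARE c CURSOR FOR SELECT * FROM test ORDER BY 1;
FETCH 10000 FROM c;   -- repeat until FETCH returns fewer rows than requested
CLOSE c;
COMMIT;
```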
The following execution time results from eliminating both the result-set fetching overhead and the overhead of moving a high row count through the Gather Motion (actual time=2022.187..2022.187 rows=1):
testdb=# explain analyze select * from test order by 1 fetch first 1 row only;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..658.75 rows=1 width=4) (actual time=2022.237..2022.261 rows=1 loops=1)
-> Gather Motion 2:1 (slice1; segments: 2) (cost=0.00..658.75 rows=1 width=4) (actual time=2022.187..2022.187 rows=1 loops=1)
Merge Key: generate_series
-> Limit (cost=0.00..658.75 rows=1 width=4) (actual time=2019.755..2019.803 rows=1 loops=1)
-> Sort (cost=0.00..658.75 rows=500000 width=4) (actual time=2019.651..2019.651 rows=1 loops=1)
Sort Key: generate_series
Sort Method: top-N heapsort Memory: 18kB
-> Seq Scan on test (cost=0.00..440.35 rows=500000 width=4) (actual time=0.157..717.498 rows=500129 loops=1)
Planning time: 56.815 ms
(slice0) Executor memory: 59K bytes.
* (slice1) Executor memory: 58K bytes avg x 2 workers, 58K bytes max (seg0). Work_mem: 33K bytes max, 11723K bytes wanted.
Memory used: 128000kB
Memory wanted: 12122kB
Optimizer: Pivotal Optimizer (GPORCA)
Execution time: 2026.356 ms
(15 rows)
The following execution time results from eliminating only the result-set fetching overhead, with no row limit. The higher actual time in the Gather Motion (actual time=2662.428..6631.003 rows=1000000) is a direct consequence of moving the full row count to the master:
testdb=# explain analyze select * from test order by 1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Gather Motion 2:1 (slice1; segments: 2) (cost=0.00..676.71 rows=1000000 width=4) (actual time=2662.428..6631.003 rows=1000000 loops=1)
Merge Key: generate_series
-> Sort (cost=0.00..658.75 rows=500000 width=4) (actual time=2642.417..3070.718 rows=500129 loops=1)
Sort Key: generate_series
Sort Method: quicksort Memory: 57330kB
-> Seq Scan on test (cost=0.00..440.35 rows=500000 width=4) (actual time=0.134..581.057 rows=500129 loops=1)
Planning time: 31.944 ms
(slice0) Executor memory: 183K bytes.
(slice1) Executor memory: 28706K bytes avg x 2 workers, 28706K bytes max (seg0). Work_mem: 28665K bytes max.
Memory used: 128000kB
Optimizer: Pivotal Optimizer (GPORCA)
Execution time: 6940.240 ms
(12 rows)