This article applies to all versions of Greenplum Database (GPDB.
This article covers some frequently asked performance related questions.
A. if you are using a remote client, check that your connection to the Greenplum cluster is still good. This can be done by running the SQL locally to the GP cluster.
B. Check that the system tables and user tables involved are not bloated or skewed. Refer to Greenplum gp_toolkit documentation for instructions how to do this.
C. Check with your DBA that the Greenplum interconnect is still performing correctly.
This is done by checking for dropped packets on the interconnect interfaces "netstat -i" and by running gpcheckperf. It is also possible that a segment is experiencing hardware problems. Hardware problems can be found in the output of dmesg or in /var/log/messages* files.
A. You can turn on timing per session before you run your SQL query with the "\timing on" command.
B. You can run explain analyze against your SQL statement to retrieve the timing.
When the session starts in master and segments, what the child processes in segments is identified with master session_id connection string (con+sess_id).
For example:
gpdb=# select * from pg_stat_activity; datid | datname | procpid | sess_id |.. .. -------+---------+---------+---------+ 16986 | gpdb | 18162 | 76134 | .. ..
In all segments, the child processes for session 76134:
[gpadmin@########]/export/home/gpadmin/gp40>gpssh -f hostfile /usr/ucb/ps -auxww | egrep con76134 [########] gpadmin 18162 1.7 6.0386000124480 ? S 09:57:55 0:04 postgres: port 4000, gpadmin gpdb [local] con76134 [local] cmd3 CREATE DATABASE [########] gpadmin 18625 0.3 2.726056455932 ? S 10:01:56 0:01 postgres: port 40000, gpadmin gpdb 10.5.202.12(18864) con76134 seg0 cmd4 MPPEXEC UTILITY [########] gpadmin 18669 0.0 0.1 3624 752 pts/2 S 10:02:36 0:00 egrep con76134 [########] gpadmin 22289 0.8 9.4531860196404 ? S 09:36:20 0:05 postgres: port 40000, gpadmin gpdb 10.5.202.12(18866) con76134 seg1 cmd4 MPPEXEC UTILITY
:
4. How to check if my session queries are running or waiting for locks?
Check the "waiting" column in pg_stat_activity and the "granted" column in pg_locks for any object level locks or refer to Locks - How to detect blocking and waiting queries on Tanzu Greenplum.
5. What kind of locks should we focus on MPP system when the system is slow/hung?
Locks that are held for a very long time and over multiple other queries are waiting for that lock also.
6. How do I monitor user activity history in Greenplum database?
Use Greenplum Command Center (GPCC), which has GUI to monitor and query performance history.
7. What is Greenplum Command Center and how to install it?
Its a monitoring tool that collects statistics on system and query performance. It also builds historical data.
Please refer to Greenplum command center documentation for steps to install and use.
8. What is an orphan process?
Orphan processes are Postgres/Greenplum processes that exists in some segments, but have no corresponding processes on the coordinator/master host. There is no information about its gang members (or child process as explained in question 3 above) on any other segments.
9. Will these orphan processes cause a performance issue?
Yes, they can if the process is holding locks on a relation on that segment. The subsequent queries are waiting for this lock to be released.
10. Is there a way to detect if the orphan process is running on my database?
Refer to Script for detecting orphan process and locks in Tanzu Greenplum
11. What is the "Out of Memory / VM Protect Limit" errors?
Please refer How to identify out of memory (OOM) errors.
12. How do I understand which process/query is consuming most of the segment memory?
Execute gpmemwatcher utility to track the memory consumption of the queries on the segments.
Review the output once an OOM is observed to identify possible queries or processes using the gpmemreport utility.
13. How do I determine if my table in a query needs a analyze?
See the gpstatscheck for 6.x / gpstatscheck for 7.x utility for more information.
14. How to determine the disk/network speed between servers?
GPDB has a utility called "gpcheckperf" that can help to determine the speed of the Disk/Network/Memory bandwidth.
15. What are the different Query tuning parameters in GPDB?
Please check Query Profiling for the different parameters that are involved in changing/altering the behaviour of the optimizer.
16. What is table bloat and what are the different options available to remove table bloat?
Please refer this How to deal with Bloat in Heap tables in VMware Tanzu Greenplum to understand what table bloat means and how to remove bloat.
Note: For more detailed information please refer to the Greenplum Documentation.