FAQ - Greenplum Administration for DBA Part II (Admin)
search cancel

FAQ - Greenplum Administration for DBA Part II (Admin)

book

Article ID: 295165

calendar_today

Updated On:

Products

VMware Tanzu Greenplum

Issue/Introduction


This article pertains to all versions of Pivotal Greenplum Database (GPDB).
This article covers some frequently asked Admin DBA related questions.


Resolution

1. gpstart failed what should I do?

Check gpstart logfile using the following command:

~gpadmin/gpAdminLogs/gpstart_yyyymmdd.log

Look in $MASTER_DATA_DIRECTORY/pg_log/startup.log at the pg startup log file for more details. Also look at the last master log generated after the startup command was issued.

If the segments failed to startup the log, use the following command:

~gpadmin/gpAdminLogs/gpstart_yyyymmdd.log

This command lists the failed segments. Navigate to the host/segment directory to check the logs for more details on the failure.
 

2. Why do we need gpstop -m and gpstart -m? How to find the uptime of the Greenplum database?

The gpstart -m command allows a user to start the master only and none of the data segments. It is used primarily by support to get system level information/configuration. An end user would not regularly or normally use it.

To find the uptime of the GPDB use the following command:

select now() - pg_postmaster_start_time() ;

 

3. What is the procedure to get rid of mirror segments?

There are no utilities available to remove mirrors from Greenplum. The cluster will need to reinitialize.

 

4. How to run gpcheckcat?

The gpcheckcat tool is used to check catalog inconsistencies between the master and segments. It can be found in the $GPHOME/bin/lib directory:

Usage: gpcheckcat [dbname]

        -?
        -B parallel: number of worker threads
        -g dir     : generate SQL to rectify catalog corruption, put it in dir
        -h host    : DB host name
        -p port    : DB port number
        -P passwd  : DB password
        -o         : check OID consistency
        -U uname   : DB User Name
        -v         : verbose

Example: gpcheckcat gpdb > gpcheckcat_gpdb_logfile.log

 

5. What is gpdetective and how do I run it in Greenplum?

The gpdetective utility collects information from a running Greenplum Database system and creates a bzip2-compressed tar output file. This output file helps with the diagnosis of Greenplum Database errors or system failures. For more details check help.

gpdetective --help

 

6. How to delete a standby?

To remove the currently configured standby master host from your Greenplum Database system, run the following command in the master only:

# gpinitstandby -r

 

7. How to re-sync a standby?

Use this command only if a standby master is already configured. Use this method when attempting to resynchronize the data between the primary and backup master host. The Greenplum system catalog tables will not be updated.

# gpinitstandby -n (resynchronize)

 


8. How to recover an invalid segment?

Use the gprecoverseg tool, which recognizes which segments need recovery and will initialize recovery.

Note: Without the "-F" option, the change tracking log will be sent and applied to the mirror. With the "-F" option, the entire data directory will be resynced.


9. How to add mirrors to the array?

The gpaddmirrors utility configures mirror segment instances for an existing Greenplum Database system that was initially configured with primary segment instances only.

For more details check help:

# gpaddmirrors --help

 

10. How to see primary to mirror mapping?

From the database catalog, following query list configuration on content ID, figure out the primary and mirror for each content.

gpdb=# select * from gp_configuration order by content. 

Note: Starting from GPDB 4.x, the gp_segment_configuration table is used.

gpdb=# select * from gp_segment_configuration order by dbid; 


11. How to start/stop db in admin mode?

  • Admin mode - The gpstart with option (-R) stands for Admin mode or a restricted mode. Only superusers can connect to the database when database opened using this option.
  • Utility mode - Utility mode allows you to connect to only individual segments when started using gpstart -m
PGOPTIONS='-c gp_session_role=utility' psql 

 

12. How to run gpcheckperf IO/netperf?

Create a directory where space is available and common in all hosts.

  • For network I/O test for each nic card:
gpcheckperf -f seg_host_file_nic-1 -r N -d /data/gpcheckperf > seg_host_file_nic_1.out
gpcheckperf -f seg_host_file_nic-2 -r N -d /data/gpcheckperf > seg_host_file_nic_2.out 
  •  For disk I/O:
gpcheckperf -f seg_host_file_nic-1 -r ds -D -d /data/gpdb_p1 -d  /data/gpdb_p2 -d /data/gpdb_m1 -d  /data/gpdb_m2 


13. How to update postgresql.conf and reload it?

In GP 4.0 version, check gpconfig utility to change postgres.conf parameters.

 

14. How to manage pg_hba.conf?

The pg_hba.conf file of the master instance controls client access and authentication to the Greenplum system. 


15. How to add a new user to the database?

Use "createuser" utility to create users. See "createuser --help" for more details.

Another option is to use SQL commands in the psql prompt to create users.

CREATE USER or ROLE <ROLE_NAME> .... 

 

16. How to create a password-free trusted env b/w the all the segment hosts?

Use gpssh-exkeys:

gpssh-exkeys -h hostname1 -h hostname2 ..  -h hostnameN 

 

17. How to check db version and version at init db?

To check the version:

psql> select version(); 

or

postgres --gp-version 

To check the gp version at install:

psql> select * from gp_version_at_initdb; 

 

18. How to see the value of GUC?

Connect the GPDB database using psql query catalog or use the show parameter command.

gpdb# select name,setting from pg_settings where name='GUC'; 

or

gpdb# show <GUC_NAME>; 

 

19. What is the location of pg_hba/logfile/master_data_directory?

The location of the Master directory is cd $MASTER_DATA_DIRECTORY, pg_hba.conf, postgres.conf and other GPDB internal directories. The location of Master database logfiles is cd $MASTER_DATA_DIRECTORY/pg_logs.

 

20. How to find errors/fatal from log files?

grep for ERRORS, FATAL, SIGSEGV in the pg_logs directory.


21. What is a vacuum and when should I run this?

VACUUM reclaims storage occupied by deleted tuples. In normal GPDB operation, tuples that are deleted or obsoleted by an update are not physically removed from their table. They remain present on disk until a VACUUM is done. Therefore, it is necessary to do VACUUM periodically, especially on the frequently-updated table.


22. What is the difference between vacuum and vacuum full?

Unless you need to return space to the OS so that other tables or other parts of the system can use that space, you should use VACUUM instead of VACUUM FULL.


VACUUM FULL is only needed when a table that is mostly dead rows. That is, the vast majority of the contents of the table have been deleted. Even then, VACUUM FULL should be used when disk space is urgently needed or the table will never again grow to its past size. Do not use it for table optimization or periodic maintenance as it is counterproductive.


Check the guide for more information.

 

23. What is Analyze and how frequency should I run this?

ANALYZE collects statistics about the contents of tables in the database, and stores the results in the system table pg_statistic. Subsequently, the query planner uses these statistics to help determine the most efficient execution plans for queries.


It is a good idea to run ANALYZE periodically, or just after making major changes in the contents of a table. Accurate statistics will help the query planner to choose the most appropriate query plan, thereby improving the speed of query processing. A common strategy is to run VACUUM and ANALYZE once a day during a low-usage time of day.

 

24. What is resource queues?

Resource queues are used to manager Greenplum database workload management. All user/queries can be prioritized using Resource queues. Refer Admin guide for more details.

 

25. What is gp_toolkit?

The gp_toolkit is a database schema. It has many tables, views, and functions to better manage Greenplum Database when the DB is up. In 3.x earlier versions, it was referred to as gp_jetpack.

Additional Information