The minirepro utility is a smaller version of the gpsd (Greenplum Schema Dump) utility that collects the schema and statistics information for reproducing issues. gpsd retrieves the complete database metadata dump along with the statistics information for all relations.
Please note that the minirepro will collect pg_statistic
entries for the tables involved in the query. These entries may have sampling values of user data. It is therefore recommended to sanitize the output file for these entries and remove them if user data is not supposed to be shared.
Minirepro makes this easy by only extracting the information needed for the affected query and not for the complete database. It reads the query and gets DDL and statistics information for the relations involved. Usage examples are shown below:
[gpadmin@mdw]$ /usr/local/GP-4.3.7.2/bin/minirepro --version minirepro 1.0 [gpadmin@mdw]$ /usr/local/GP-4.3.7.2/bin/minirepro --help Usage: minirepro [options] Options: --version show program's version number and exit -?, --help Show this help message and exit -h HOST, --host=HOST Specify a remote host -p PORT, --port=PORT Specify a port other than 5432 -U USER, --user=USER Database user to connect with.This is mandatory as of now -q QUERY_FILE file name that contains the query -f OUTPUT_FILE minirepro output file name. Must be absolute path Example: minirepro database-name -q sql-file-name -f /tmp/output-file-name -U gpadmin
Note: This utility is made available for v4.3.7.0 and above. The query file should have a single valid formatted query.
An example for the database db01
is shown below:
gpadmin@mdw]$ minirepro db01 -q query.sql -f /data/oufile_msq -U gpadmin Connecting to database: host=mdw, port=5432, user=gpadmin, db=db01 ... Extracting metadata from query file query.sql ... Invoking pg_dump to dump DDL ... pg_dump -h mdw -p 5432 -U gpadmin -sxO db01 -t 'schema01.table01|table02|table03|table04' -f /tmp/20160419101152/xxxxx.dp.sql pg_dump -h mdw -p 5432 -U gpadmin -sxO db01 -t 'schema02.tablename' -f /tmp/20160419101152/yyyy.dp.sql Writing schema DDLs ... Writing table & view DDLs ... Writing table statistics ... Writing column statistics ... Attaching raw query text ... --- MiniRepro completed! ---
Below is a list of possible errors that occur while running this utility:
Error 1
Error while running gp_toolkit.gp_dump_query_oids(text). Please make sure the function is installed and the query file contains single valid query. error 'ERROR: Cannot parse query. Please make sure the input contains a single valid query. (gpoptutils.c:53)
Please make sure the input contains a single valid query (gpoptutils.c:53
).
Solution
Make sure that the query file used with -q
flag has no format/indent errors. Manual syntax and a semantic check is required. As mentioned in the error, there must be only one query in the file.
Error 2
Error while running gp_toolkit.gp_dump_query_oids(text). Please make sure the function is installed and the query file contains single valid query.
Solution
The tool needs to be upgraded to the most recent version. The functions need to be installed using the script file located at:
cp $GPHOME/share/postgresql/gp_toolkit.sql gp_toolkit_2.sql
Modify the gp_toolkit_2.sql
using vi
:
1. Remove the following statement in the file. This should be in the beginning of the file..
Begin;
2. Save the file.
Interactively run this file in the database using the following command:
prod_db=# \i gp_toolkit_2.sql ( or ) psql -f gp_toolkit_2.sql <dbname>
When running the gp_toolkit_2.sql
, the following errors will be produced:
relation "gp_pgdatabase_invalid" already exists
or
function "gp_skew_idle_fraction" already exists with same argument types
These errors can be safely ignored.
Once done, check the minirepro again.
Possible Error 3
raise ValueError(errmsg("Extra data", s, end, len(s))) ValueError: Extra data: line 2 column 1 - line 3 column 1 (char 134 - 150)
Solution
This issue is produced when timing has been enabled on the system. Edit the .psqlrc
file to remove the "\timing
" option and save the file. Try minirepro again.
Possible Error 4
$ minirepro db01 -q query.sql -f ./outfile -U gpadmin Connecting to database: host=mdw, port=5432, user=gpadmin, db=db01 ... Traceback (most recent call last): File "/opt/gpdb/greenplum-db/./bin/minirepro", line 413, in <module> main() File "/opt/gpdb/greenplum-db/./bin/minirepro", line 329, in main conn = pgdb.connect(connectionString) File "/opt/gpdb/greenplum-db/lib/python/pygresql/pgdb.py", line 483, in connect dbtty, dbuser, dbpasswd) pg.InternalError: fe_sendauth: no password supplied
Solution: Environment doesn't have gpadmin account. Check with customer to see what is the default admin role they use and use it with the -U
option instead of "gpadmin
"