gp_toolkit.gp_log_database
:
gpadmin=# select * from gp_toolkit.gp_log_database limit 100; ERROR: missing data for column "loguser" (seg35 slice1 sdw5:1028 pid=753186) DETAIL: External table __gp_log_segment_ext, line 12417992 of execute:cat $GP_SEG_DATADIR/pg_log/*.csv: "total 1091860"
This error indicates there is something unusual with the format or contents of some of the .csv files under the pg_log
directory of the segment that reported the error.
In this particular case, there was a file named "s gpdb-2014-07-24_000000.csv
" and its contents were not pg_log
files but it had been named as such.
The resolution was to rename the file name to a non csv extension.
Other useful troubleshooting steps:
The table gp_toolkit.gp_log_database
is referencing the external table gp_toolkit.__gp_log_segment_ext
, which reads all .csv files under $GP_SEG_DATADIR/pg_log/
as a single file, so the line number reported on the error is cumulative of all the .csv files together as one single file.
In order to determine what file is the conflicting one, you can run the following command under the pg_log
directory of the segment reporting the error to see a list of the line numbers:
[gpadmin@mdw_lab2 pg_log]$ wc -l *.csv 137 gpdb-2019-10-20_203215.csv 168 gpdb-2019-10-20_203258.csv 11259 gpdb-2019-10-20_203306.csv 1324 gpdb-2019-10-21_000000.csv 405 gpdb-2019-10-22_000000.csv 13293 total
Then work out what file contains the line reported on the error. For example for the above output, if the line reported is 2500, we know it will be in file gpdb-2019-10-20_203306.csv
. If we rename it to .txt or other extension, the query should now succeed.
In order to determine what is wrong with the file, we can create the following test external table based on gp_toolkit.__gp_log_segment_ext
but pointing to the specific segment and file:
CREATE EXTERNAL WEB TABLE test_pivotal ( logtime timestamp with time zone, loguser text, logdatabase text, logpid text, logthread text, loghost text, logport text, logsessiontime timestamp with time zone, logtransaction integer, logsession text, logcmdcount text, logsegment text, logslice text, logdistxact text, loglocalxact text, logsubxact text, logseverity text, logstate text, logmessage text, logdetail text, loghint text, logquery text, logquerypos integer, logcontext text, logdebug text, logcursorpos integer, logfunction text, logfile text, logline integer, logstack text ) EXECUTE E'cat $GP_SEG_DATADIR/pg_log/gpdb-*.txt' ON SEGMENT 35 FORMAT 'csv' (delimiter E',' null E'' escape E'"' quote E'"') ENCODING 'UTF8';
Note: We have specified segment 35 and gpdb-*.txt as the file name (we would have previously renamed the conflictive log file into .txt)
When querying the table, it will now point to the exact line number having the issue, so we can open the file and examine it.