The gpcc_table_info_history table stores wrong a daily snapshot of statistics about tables
search cancel

The gpcc_table_info_history table stores wrong a daily snapshot of statistics about tables

book

Article ID: 297055

calendar_today

Updated On:

Products

VMware Tanzu Greenplum

Issue/Introduction

There are requirements to find the amount of data inserted, updated or deleted on a table over the last 24 hours and the following query could find some statistics similarly as expected.

gpperfmon=# SELECT table_name, n_tup_ins, last_ins, n_tup_del, last_ins, n_tup_upd, last_upd, size, last_size_ts FROM gpmetrics.gpcc_table_info_history WHERE table_name = '<table name>'and ctime BETWEEN NOW() - INTERVAL '24 HOURS' AND NOW();"

However both of our GPDB/GPCC cluster has the data being collected at 13:00 each day as below query output since the GPDB 6.x was initially deployed in contrast with explanation that table is updated each day "before midnight each night" at described at the following official manual, https://docs.vmware.com/en/VMware-Tanzu-Greenplum-Command-Center/6.8/tanzu-greenplum-command-center/GUID-topics-ref-gpmetrics.html#gpcc_table_info ​​​

gpperfmon=# SELECT distinct ctime FROM gpmetrics.gpcc_table_info_history ORDER BY ctime;
        ctime
---------------------
 2022-04-24 13:00:00
 2022-04-25 13:00:00
 2022-04-26 13:00:00
 2022-04-27 13:00:00
 2022-04-28 13:00:00
 2022-04-29 13:00:00
 2022-04-30 13:00:00
 2022-05-01 13:00:00
 2022-05-02 13:00:00
 2022-05-03 13:00:00
~~ snip
As it's the issue of time difference, timezone and locate settings has been checked with the following commands. But, it looks the outcomes are all good for local region.
$ psql -c "select now()"
              now
-------------------------------
 2023-05-22 12:10:23.187025+10
(1 row)


$ psql -c "show timezone"
      TimeZone
--------------------
 Australia/Brisbane
(1 row)

$ gpssh -f $HOME/hosts.all -e "timedatectl"
~~ snip
[sdw2]       DST active: n/a
[sdw1] timedatectl
[sdw1]       Local time: Mon 2023-05-22 12:11:32 AEST
[sdw1]   Universal time: Mon 2023-05-22 02:11:32 UTC
[sdw1]         RTC time: Mon 2023-05-22 02:11:32
[sdw1]        Time zone: Australia/Brisbane (AEST, +1000)
[sdw1]      NTP enabled: yes
[sdw1] NTP synchronized: yes
[sdw1]  RTC in local TZ: no
[sdw1]       DST active: n/a

testdb | 5432 | 6.18.1 $$ gpssh -f $HOME/hosts.all -e "ls -l /etc/localtime"
[sdw2] ls -l /etc/localtime
[sdw2] lrwxrwxrwx. 1 root root 40 Feb  7  2020 /etc/localtime -> ../usr/share/zoneinfo/Australia/Brisbane
~~ snip

Why it is collected at 13:00 every day on GPDB clusters and not 23:55 and what's the resolution for it?


Environment

Product Version: 6.18
OS: RHEL or CentOS 7.x

Resolution

it's known bug and fixed in GPCC 6.9.0 and higher versions. After GPCC 6.9.0, the history data would be saved at 03:00 AM every day.