gppkg remove and install are failing

Products

VMware Tanzu Greenplum
VMware Tanzu Greenplum / Gemfire
VMware Tanzu Data Suite

Issue/Introduction

Removing a gppkg from a Greenplum cluster, or adding one to it, fails with the messages below:

[gpadmin@emea-cdw bin]$ gppkg -r MetricsCollector
20251124:10:57:40:862614 gppkg:emea-cdw:gpadmin-[INFO]:-Starting gppkg with args: -r MetricsCollector
20251124:10:57:40:862614 gppkg:emea-cdw:gpadmin-[INFO]:-Uninstalling package MetricsCollector-6.15.0_gp_6.30.0-rhel8-x86_64.gppkg
20251124:10:57:40:862614 gppkg:emea-cdw:gpadmin-[INFO]:-Validating rpm uninstallation cmdStr='rpm --test -e MetricsCollector-6.15.0-6.30.0 --dbpath /usr/local/greenplum-db-6.30.0/share/packages/database'
20251124:10:57:41:862614 gppkg:emea-cdw:gpadmin-[CRITICAL]:-gppkg failed. (Reason='') exiting...

[gpadmin@emea-cdw bin]$ gppkg -i /usr/local/greenplum-cc-6.15.0/gppkg/MetricsCollector-6.15.0_gp_6.30.0-rhel8-x86_64.gppkg 
20251124:10:57:55:862648 gppkg:emea-cdw:gpadmin-[INFO]:-Starting gppkg with args: -i /usr/local/greenplum-cc-6.15.0/gppkg/MetricsCollector-6.15.0_gp_6.30.0-rhel8-x86_64.gppkg
20251124:10:57:55:862648 gppkg:emea-cdw:gpadmin-[INFO]:-Installing package MetricsCollector-6.15.0_gp_6.30.0-rhel8-x86_64.gppkg
20251124:10:57:55:862648 gppkg:emea-cdw:gpadmin-[INFO]:-Validating rpm installation cmdStr='rpm --test -i /usr/local/greenplum-db-6.30.0/.tmp/MetricsCollector-6.15.0-6.30.0.x86_64.rpm --dbpath /usr/local/greenplum-db-6.30.0/share/packages/database --prefix /usr/local/greenplum-db-6.30.0'
20251124:10:57:55:862648 gppkg:emea-cdw:gpadmin-[CRITICAL]:-gppkg failed. (Reason='MetricsCollector-6.15.0_gp_6.30.0-rhel8-x86_64.gppkg is already installed.') exiting...

Cause

There are two known causes for this error:

The gp_segment_configuration table contains two different "hostname" values for the same host.
The coordinator and standby coordinator are running on hosts that are also running segments.

Resolution

Solution 1

Check the gp_segment_configuration table and confirm that the "hostname" value for each host is correct.

The "hostname" column should match the output of the "hostname" command when it is run on that host. For example:

# Run "hostname" on all hosts
$ gpssh -f /tmp/hostfile
=> hostname
[ cdw] emea-cdw
[sdw1] emea-sdw1
[sdw2] emea-sdw2
=>

Then check the "hostname" entries in gp_segment_configuration:

gpadmin=# select dbid,content,hostname from gp_segment_configuration order by content,dbid;
 dbid | content | hostname  
------+---------+-----------
    1 |      -1 | emea-cdw
   10 |      -1 | sdw1
    2 |       0 | emea-sdw1
    6 |       0 | emea-sdw2
    3 |       1 | emea-sdw1
    7 |       1 | emea-sdw2
    4 |       2 | emea-sdw2
    8 |       2 | emea-sdw1
    5 |       3 | emea-sdw2
    9 |       3 | emea-sdw1
(10 rows)

In the above example, dbid=10 has the incorrect hostname "sdw1". It should be "emea-sdw1".
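The comparison above can also be scripted. The following is a minimal sketch, not part of Greenplum: find_bad_hostnames is a hypothetical helper, and the file names and the "postgres" database name in the usage comments are assumptions. It expects one real hostname per line in the first file and "dbid|hostname" rows in the second, which is the format psql produces with the -A and -t options.

```shell
# Hypothetical helper: list catalog rows whose hostname is not a real hostname.
#   $1: file with one real hostname per line (the output of "hostname" on each host)
#   $2: file with "dbid|hostname" rows taken from gp_segment_configuration
find_bad_hostnames() {
    awk -F'|' '
        # First file: remember every real hostname.
        FILENAME == ARGV[1] { real[$0] = 1; next }
        # Second file: print rows whose hostname was not seen in the first file.
        !($2 in real)       { print $1 "|" $2 }
    ' "$1" "$2"
}

# Example usage (file names and database name are assumptions):
#   psql -At -c "select dbid, hostname from gp_segment_configuration order by dbid" postgres > /tmp/catalog_hostnames
#   find_bad_hostnames /tmp/real_hostnames /tmp/catalog_hostnames
```

With the example data shown above, the helper would print "10|sdw1". No output means every "hostname" value in the catalog matches a real hostname.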

To correct the hostname, run the following inside a transaction:

gpadmin=# BEGIN;
BEGIN
gpadmin=# set allow_system_table_mods=true;
SET
gpadmin=# update gp_segment_configuration set hostname='<HOSTNAME>' where dbid=<DBID>;  -- replace <HOSTNAME> and <DBID> with appropriate values.
UPDATE 1
gpadmin=# -- re-run the above SELECT to check all is OK and then COMMIT the changes, or ROLLBACK if a mistake was made.

Solution 2

If both the coordinator host and the standby coordinator host also run segments, and the Greenplum version is below 6.31.2, raise a ticket with Broadcom support and refer to this article.

The code fix for this issue is included in Greenplum 6.31.2 and later.
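To check whether the second cause applies, it can help to list the hosts that run both a coordinator instance (content = -1) and at least one segment (content >= 0). The following is a minimal sketch under stated assumptions: hosts_with_coordinator_and_segments is a hypothetical helper, the file name and the "postgres" database name in the usage comments are assumptions, and the input is expected to hold "content|hostname" rows as produced by psql with the -A and -t options.

```shell
# Hypothetical helper: print hosts that run both a coordinator or standby
# coordinator instance (content = -1) and at least one segment (content >= 0).
#   $1: file with "content|hostname" rows taken from gp_segment_configuration
hosts_with_coordinator_and_segments() {
    awk -F'|' '
        $1 == -1 { coord[$2] = 1 }   # coordinator or standby coordinator
        $1 >= 0  { seg[$2] = 1 }     # primary or mirror segment
        END { for (h in coord) if (h in seg) print h }
    ' "$1" | sort
}

# Example usage (file name and database name are assumptions):
#   psql -At -c "select content, hostname from gp_segment_configuration" postgres > /tmp/content_hostnames
#   hosts_with_coordinator_and_segments /tmp/content_hostnames
```

If the output lists both the coordinator host and the standby coordinator host, the condition described in Solution 2 is met. Run the check only after the "hostname" values have been verified as described in Solution 1, because it groups rows by that column.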