Clone-db errand fails in verifying checksum for UAA and TELEMETRY in VMware Tanzu Kubernetes Grid Integrated Edition (TKGI)

Article ID: 298615

Products

VMware Tanzu Kubernetes Grid Integrated Edition

Issue/Introduction

When you upgrade VMware Tanzu Kubernetes Grid Integrated Edition (TKGI) to 1.7 with the clone-db errand turned on, the errand fails while verifying the checksum for UAA and TELEMETRY:
Migrating the data...
CHECKING IF CLONE ALREADY HAPPENED
CHECKING FOR MAINTENANCE USER
CHECKING IF FRESH INSTALL
INSERT MAINTENANCE RECORD BEFORE STARTING CLONE
SET DB TO READ ONLY ON NEW DB
PERFORMING COPY FOR PKS DATABASE
PERFORMING COPY FOR UAA
PERFORMING COPY FOR TELEMETRY AND BILLING
PERFORMING CLONE FOR PKS DATABASE
PERFORMING CLONE FOR UAA DATABASE
PERFORMING CLONE FOR TELEMETRY, BILLING DATABASES
RECORD PKS TABLES FROM OLD DB FOR VERIFICATION LATER
RECORD PKS TABLES FROM NEW DB FOR VERIFICATION
VERIFY CHECKSUM FOR PKS DATABASE
VERIFY UAA.GROUP_MEMBERSHIP
VERIFY UAA.GROUPS
CONVERTING A LIST OF GROUP IDS SEPARATED BY NEWLINE INTO A STRING WITH DOUBLE QUOTED IDS TO BE USED in IN CLAUSE('id1', 'id2'...)
VERIFY CHECKSUM FOR UAA AND TELEMETRY
Database cloning failed; checksums for UAA and/or Telemetry do not match!

This issue only happens with upgrade paths that started with VMware Tanzu Kubernetes Grid Integrated Edition (TKGI) 1.1. It is caused by a column order change in the external_group_mapping table in the UAA database.
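
To illustrate why a column order change alone is enough to fail the verification, here is a minimal sketch that compares MD5 digests of two dump files the way the errand's check does. The helper name checksums_match, the table name t, and the sample rows are hypothetical and only serve the illustration; they are not taken from the clone-db script.

```shell
# Hypothetical helper: succeed only if both files have the same MD5 digest.
checksums_match() {
  sum_a=$(md5sum "$1" | cut -d' ' -f1)
  sum_b=$(md5sum "$2" | cut -d' ' -f1)
  [ "$sum_a" = "$sum_b" ]
}

# Two made-up dump lines: same row and values, last two columns swapped.
demo_dir=$(mktemp -d)
printf '%s\n' "REPLACE INTO \`t\` (\`origin\`, \`id\`, \`identity_zone_id\`) VALUES ('ldap',1,'uaa');" > "$demo_dir/cloned.sql"
printf '%s\n' "REPLACE INTO \`t\` (\`origin\`, \`identity_zone_id\`, \`id\`) VALUES ('ldap','uaa',1);" > "$demo_dir/original.sql"

if checksums_match "$demo_dir/cloned.sql" "$demo_dir/original.sql"; then
  echo "checksums match"
else
  echo "checksums differ"
fi
```

Because the digest is computed over the dump text rather than over the rows themselves, the two files are reported as different even though they describe the same data.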

You can verify this by checking the difference between cloned_mysql_backup.sql and original_verify_mysql_backup.sql, which are located at /var/vcap/bosh:
# diff cloned_mysql_backup.sql original_verify_mysql_backup.sql

36c36
< REPLACE INTO `external_group_mapping` (`group_id`, `external_group`, `added`, `origin`, `id`, `identity_zone_id`) VALUES ('7128cfd4-ad3a-4a4e-a8a1-9fd87d6615bb','cn=test_org,ou=people,o=springsource,o=org','2018-02-21 19:58:03','ldap',1,'uaa'),('e36e739f-8d22-49e9-9e7c-fdd74a75c2c2','cn=cf_nonprod_admin_dl,ou=posse,ou=engineering,ou=core,dc=prod,dc=travp,dc=net','2018-07-23 16:10:13','ldap',2,'uaa'),('652091fd-3bc9-4ebb-be2a-3464e29b1872','cn=cf_nonprod_auditors_dl,ou=posse,ou=engineering,ou=core,dc=prod,dc=travp,dc=net','2020-04-29 14:30:35','ldap',4,'uaa');
---
> REPLACE INTO `external_group_mapping` (`group_id`, `external_group`, `added`, `origin`, `identity_zone_id`, `id`) VALUES ('7128cfd4-ad3a-4a4e-a8a1-9fd87d6615bb','cn=test_org,ou=people,o=springsource,o=org','2018-02-21 19:58:03','ldap','uaa',1),('e36e739f-8d22-49e9-9e7c-fdd74a75c2c2','cn=cf_nonprod_admin_dl,ou=posse,ou=engineering,ou=core,dc=prod,dc=travp,dc=net','2018-07-23 16:10:13','ldap','uaa',2),('652091fd-3bc9-4ebb-be2a-3464e29b1872','cn=cf_nonprod_auditors_dl,ou=posse,ou=engineering,ou=core,dc=prod,dc=travp,dc=net','2020-04-29 14:30:35','ldap','uaa',4);
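
For long dump lines such as the ones above, it can be easier to compare only the column lists. The following is a rough sketch, assuming each dump contains a line that starts with REPLACE INTO followed by the backtick-quoted table name, as in the diff output. The helper names column_list and same_columns_reordered are hypothetical.

```shell
# Hypothetical helper: print, one per line and in dump order, the column
# names of the first REPLACE INTO statement for table $2 in dump file $1.
column_list() {
  grep -m1 "^REPLACE INTO \`$2\` (" "$1" \
    | sed -E 's/^REPLACE INTO `[^`]+` \(([^)]*)\) VALUES.*/\1/' \
    | tr -d '` ' | tr ',' '\n'
}

# Hypothetical helper: succeed only if table $3 has the same set of
# columns in dump files $1 and $2, but listed in a different order.
same_columns_reordered() {
  cols_a=$(column_list "$1" "$3")
  cols_b=$(column_list "$2" "$3")
  [ -n "$cols_a" ] && [ "$cols_a" != "$cols_b" ] &&
    [ "$(printf '%s\n' "$cols_a" | sort)" = "$(printf '%s\n' "$cols_b" | sort)" ]
}
```

From /var/vcap/bosh, it could be called as: same_columns_reordered cloned_mysql_backup.sql original_verify_mysql_backup.sql external_group_mapping && echo "only the column order differs". This checks the column lists only, not the row values.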


Environment

Product Version: 1.7

Resolution

Workaround

As a workaround, edit the clone-db script so that it does not exit when the checksum verification for UAA and TELEMETRY fails.

1. Run the command: bosh -d <pks-deployment> ssh pks-db-vm

2. Run the command: cd /var/vcap/packages/clone-db/bin

3. [Optional] Make a backup of clone-db in case you run into issues, with the command: cp clone-db clone-db-backup. To revert to the original script later, run: cp clone-db-backup clone-db

4. Edit clone-db and go to line 201. Comment out exit 1 as shown below and save the file.
original_db_checksum=($(md5sum "${ORIGINAL_VERIFY_BACKUP_FILENAME}.sql"))
clone_db_checksum=($(md5sum "${CLONED_BACKUP_FILENAME}.sql"))
if [[ $original_db_checksum != $clone_db_checksum ]]; then
    echo "Database cloning failed; checksums for UAA and/or Telemetry do not match!"
    # exit 1
else
    rm -f "${CLONED_BACKUP_FILENAME}.sql" "${ORIGINAL_BACKUP_FILENAME}.sql" "${ORIGINAL_VERIFY_BACKUP_FILENAME}.sql" "PKS_${ORIGINAL_BACKUP_FILENAME}.sql" "UAA_${ORIGINAL_BACKUP_FILENAME}.sql"
fi

5. Click Apply Changes to continue upgrading.
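
Steps 3 and 4 can also be done without opening an editor. The following is a minimal sketch, assuming the exit 1 statement really is on the line you pass in (line 201 in this article; it may differ in other builds). The helper name comment_out_exit is hypothetical, and the function refuses to change anything if the target line is not a bare exit 1.

```shell
# Hypothetical helper: back up script $1 as "$1-backup", then comment out
# "exit 1" on line $2, but only if that line holds nothing except "exit 1".
comment_out_exit() {
  script="$1"; line_no="$2"
  if ! sed -n "${line_no}p" "$script" | grep -Eq '^[[:space:]]*exit 1[[:space:]]*$'; then
    echo "line ${line_no} of ${script} is not 'exit 1'; edit it by hand" >&2
    return 1
  fi
  cp -p "$script" "${script}-backup" || return 1
  # Write through a temporary file, then overwrite the script in place
  # so that it keeps its original permissions.
  tmp=$(mktemp) || return 1
  sed "${line_no}s/exit 1/# exit 1/" "$script" > "$tmp" && cat "$tmp" > "$script"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

On the pks-db VM it could be called as: comment_out_exit /var/vcap/packages/clone-db/bin/clone-db 201. Checking the line first avoids silently editing the wrong statement if the script layout differs from the one shown in step 4.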