Cluster startup fails with CRC verification error on the primary database dump

Article ID: 223640

Products

CA Privileged Access Manager (PAM)

Issue/Introduction

On the cluster startup details page, every node in the primary site except the first reports the error "SEVERE: CRC verification on the primary database dump failed. Please stop the cluster and retry". For the secondary site leaders, the download is either still in progress or shows the same error.

Cause

The first node in the primary site (the master node) is missing a metafile that contains the sha sum of the database backup. After downloading the backup, each of the other nodes calculates its own sha sum and compares it to the one the master calculated, to confirm that the file was not tampered with during transfer. This check fails because the sha sum cannot be retrieved from the master.
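The comparison described above can be illustrated with a minimal sketch. This is not PAM's actual implementation: the hash algorithm (SHA-256 here), the function names `sha256_of_file` and `verify_dump`, and the way the master's value is passed in are all assumptions made for illustration. It only shows why a missing sha sum on the master makes the check fail even when the downloaded file is intact.

```python
import hashlib
import hmac
from pathlib import Path
from typing import Optional


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_dump(dump_path: Path, master_digest: Optional[str]) -> bool:
    """Compare the locally computed digest with the one from the master.

    master_digest is None when the master has no metafile to serve
    (hypothetical representation of the situation in this article).
    """
    if master_digest is None:
        # Nothing to compare against, so the check cannot succeed.
        return False
    local_digest = sha256_of_file(dump_path)
    return hmac.compare_digest(local_digest, master_digest.strip().lower())
```

In this sketch a node that downloaded a perfectly good dump still gets `False` from `verify_dump` when the master's digest is unavailable, which corresponds to the symptom described in this article.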

A likely reason for the missing sha file is that the cluster was turned on prematurely, before cluster shutdown had completed. As of September 2021 there is a known issue where the "TURN CLUSTER ON" button on the Configuration > Clustering page becomes active shortly after the "TURN CLUSTER OFF" action is started, long before the cluster is in fact off.

Environment

PAM 3.4 multi-site cluster

Resolution

In this state, individual node or site synchronization is unlikely to work. Wait for the current (failing) cluster startup to finish. If you have the SSH debug patch installed, enable remote debugging services from the Configuration > Diagnostics > System page, in case Support needs to get involved. Then turn the cluster off from the first primary site node. Make sure this completes, i.e. the "Turning cluster off" message disappears and the UI goes back to a white background. Log out and log in again to confirm that login works and that the cluster status page shows the cluster as off and ready to be turned on. Then turn it on.

If the cluster does not turn off gracefully, please raise a case with PAM Support.