Error PAM-CMN-5199 "The cluster configuration has been changed on XXXX" when adding a node to a CA PAM cluster

Article ID: 226572


Products

CA Privileged Access Manager (PAM)

Issue/Introduction

After a node is removed from a cluster site for maintenance, it cannot be dynamically added back. The following message is displayed:

Error: PAM-CMN-5198: Failed to join the cluster. PAM-CMN-5199: The cluster configuration has been changed on XXXX. Please re-download and try again.

Here, cluster member XXXX may not even belong to the cluster site the node is being added to; it can be a member of a completely different site.

Environment

CA PAM 3.3.X, 3.4.X, 4.X

Cause

This error has multiple possible causes. This article describes two of them.

When a node starts joining a cluster, it queries every other node in the cluster (in its own site and in other sites) to obtain their configuration.

This query can fail against any cluster node, either because there is a genuine configuration mismatch or because of a communications problem between that node and the node being added. In either case the error above is displayed.

  • Configuration mismatch

The cluster configuration is stored in the file /var/uag/config/failover.cfg, which describes all the nodes, their sites and their IP addresses. One or more of the other cluster nodes may hold an incorrect failover.cfg, particularly if the node was removed earlier or if a previous attempt to add it failed. For instance, the file may already list the node being added as part of the cluster, or it may contain some other mismatch.

Under these circumstances, when the new node is added to the cluster, its failover.cfg is compared with those of the other nodes. The comparison fails against any node holding different information, so the join process fails.

This is sometimes easy to confirm by looking at the cluster view on the node named in the error message and comparing it with the view on the primary, or with the view on the node being added once the cluster configuration has been loaded there.

Even if no difference is visible at first sight, the files may still differ. Support can help determine whether this is the case by checking the failover.cfg files on the different nodes directly.
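Where copies of failover.cfg from two nodes are available (for example, extracted by Support), the comparison described above can be approximated with a plain text diff. The following is a minimal sketch, not the check PAM itself performs: the function name and the sample contents are hypothetical, and the files are treated as opaque text because their internal format is not described in this article.

```python
import difflib


def diff_cluster_configs(reference_text, other_text,
                         reference_name="reference", other_name="other"):
    """Return unified-diff lines between two failover.cfg contents.

    An empty list means the two texts are identical line by line.
    """
    reference_lines = reference_text.splitlines()
    other_lines = other_text.splitlines()
    return list(difflib.unified_diff(
        reference_lines, other_lines,
        fromfile=reference_name, tofile=other_name, lineterm=""))


# Hypothetical placeholder contents; the real file format is not assumed.
primary_copy = "entry-for-node-A\nentry-for-node-B"
suspect_copy = "entry-for-node-A\nentry-for-node-B\nentry-for-node-C"

for line in diff_cluster_configs(primary_copy, suspect_copy,
                                 "primary", "suspect"):
    print(line)
```

Lines prefixed with `+` or `-` in the output mark entries present on only one side, such as a stale entry for the node being added.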

  • Communications problems between nodes

The error can also result from a failure to communicate between the member being added and the peer named in the message. If communications to one of the other nodes are blocked for any reason, this message is displayed. As part of the joining process, the failover.cfg file must be retrieved from the remote system so that it can be compared with the local one. This requires communication over port 8443 between both members. If that is not working, a message similar to the following appears in the php_error.log file of the member being added:

[ 10:31:28 09/20/21 ] [ error ] [Request-614862f558238]:  CURL request to scheme=https&host=XXX.XXX.XXX.XXX&port=8443&path=%2Fajax_cmd.php&query=cmd%3DACTACT%26cmdtype%3DGETCONFS returned error (7):  Failed to connect to XXX.XXX.XXX.XXX port 8443: Connection refused [ /var/www/htdocs/uag/hconfig/functions/failover_functions.php : 59 ]
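When php_error.log is long, lines like the one above can be picked out programmatically. The sketch below assumes the log entries follow the layout of the sample line; the regular expression and the function name are illustrative, not part of the product. Error code 7 is libcurl's "couldn't connect" code, which is consistent with a blocked or closed port.

```python
import re
from urllib.parse import parse_qs

# Pattern modelled on the sample log line; other layouts are not handled.
CURL_ERROR = re.compile(
    r"CURL request to (?P<target>\S+) returned error \((?P<code>\d+)\):"
    r"\s*(?P<message>.*?)(?:\s*\[ /.*)?$")


def find_curl_failures(log_text):
    """Return one dict per failed CURL request found in the log text."""
    failures = []
    for line in log_text.splitlines():
        match = CURL_ERROR.search(line)
        if not match:
            continue
        # The target is logged as an '&'-separated key=value string.
        target = parse_qs(match.group("target"))
        failures.append({
            "host": target.get("host", [""])[0],
            "port": target.get("port", [""])[0],
            "code": int(match.group("code")),
            "message": match.group("message"),
        })
    return failures


sample = (
    "[ 10:31:28 09/20/21 ] [ error ] [Request-614862f558238]:  CURL request to "
    "scheme=https&host=XXX.XXX.XXX.XXX&port=8443&path=%2Fajax_cmd.php"
    "&query=cmd%3DACTACT%26cmdtype%3DGETCONFS returned error (7):  "
    "Failed to connect to XXX.XXX.XXX.XXX port 8443: Connection refused "
    "[ /var/www/htdocs/uag/hconfig/functions/failover_functions.php : 59 ]"
)
for failure in find_curl_failures(sample):
    print(failure["host"], failure["port"], failure["code"], failure["message"])
```

The extracted host identifies the peer that could not be reached, which is the node to check for firewall or routing problems.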

Resolution

  • Configuration mismatch

If the mismatch is on a node that is not the primary replication lead, it cannot be corrected without expelling that node from the cluster. If the mismatch is on the replication lead, you can correct the cluster configuration there by removing any spurious data, including the IP address and name of the node being added.

If the error message refers to a secondary node, or to a member of the primary site that is not the replication lead, the best option is to expel the node named in the error message from the cluster and add it back so that it picks up the correct cluster configuration.

Afterwards, open the Cluster interface on that node and confirm that its information is now consistent with the rest of the cluster nodes and with the node being added.

Once the information is consistent in all cluster members and sites, repeat the node addition process.

  • Communications problems between nodes

Make sure port 8443 is open in both directions between the different cluster sites, as specified in the documentation:

https://techdocs.broadcom.com/us/en/symantec-security-software/identity-security/privileged-access-manager/4-0-1/deploying/set-up-a-cluster/cluster-deployment-requirements.html

See the section "TCP/Clustered appliances" in that document.
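Reachability of port 8443 can be spot-checked with a simple TCP connection attempt. This is a rough sketch under stated assumptions: it is meant to be run from a workstation or server in the same network segment as one of the nodes, so it only approximates the node-to-node path, and the function names and any addresses passed to them are hypothetical. It tests only that a TCP connection can be opened, not that the cluster request itself succeeds.

```python
import socket


def can_connect(host, port=8443, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_peers(peers, port=8443, timeout=5.0):
    """Map each peer address to the result of a connection attempt."""
    return {peer: can_connect(peer, port, timeout) for peer in peers}


# Example call (addresses are placeholders for the real cluster members):
#   check_peers(["192.0.2.10", "192.0.2.20"])
```

A False result for a peer points to a firewall rule, routing issue or stopped service on the path to that member. Because the requirement applies in both directions, the check should be repeated from the other site as well.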