Cluster cannot be created in CA PAM

Article ID: 205387

Products

CA Privileged Access Manager (PAM)

Issue/Introduction

Adding a node to a cluster consistently fails with the following error: "Error restarting cluster. Please refer to appliance logs". If the node is in a secondary site, or in a primary site with more than two nodes, its replication status shows "Timeout" and the node is inaccessible.

The aactrl entries for the affected node show the following messages:

11/09/20 12:33:41 - aactrl.sh:  Requesting a full database dump
11/09/20 12:33:42 - aactrl.sh:  Waiting for dump to be ready

11/09/20 12:33:47 - aactrl.sh:  Requesting if the database is ready return error:  Please check your Synchronization settings and try again., aborting ...!
11/09/20 12:33:47 - Syncing with the master database failed.

The database dump is requested, but the request fails almost immediately. The same messages appear whether the node is in a primary or a secondary site.

The session logs contain messages like the following:
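To make "almost immediately" concrete, the following minimal Python sketch measures the time between the dump request and the synchronization failure in the aactrl entries shown above. It is an illustration only, not a PAM tool: the function names are hypothetical, and the timestamp layout (month/day/year) is an assumption based on the sample entries.

```python
from datetime import datetime

# Assumed timestamp layout of the aactrl entries (month/day/year is a guess).
STAMP_FORMAT = "%m/%d/%y %H:%M:%S"

def parse_entry(line):
    """Split an aactrl entry into (timestamp, message); None if it does not parse."""
    stamp, separator, message = line.partition(" - ")
    if not separator:
        return None
    try:
        return datetime.strptime(stamp.strip(), STAMP_FORMAT), message.strip()
    except ValueError:
        return None

def seconds_until_sync_failure(lines):
    """Seconds between the dump request and the sync failure, or None if not found."""
    requested = None
    for line in lines:
        entry = parse_entry(line)
        if entry is None:
            continue
        stamp, message = entry
        if "Requesting a full database dump" in message:
            requested = stamp
        elif "Syncing with the master database failed" in message and requested:
            return (stamp - requested).total_seconds()
    return None

entries = [
    "11/09/20 12:33:41 - aactrl.sh:  Requesting a full database dump",
    "11/09/20 12:33:42 - aactrl.sh:  Waiting for dump to be ready",
    "11/09/20 12:33:47 - aactrl.sh:  Requesting if the database is ready return error:  Please check your Synchronization settings and try again., aborting ...!",
    "11/09/20 12:33:47 - Syncing with the master database failed.",
]
print(seconds_until_sync_failure(entries))  # prints 6.0
```

A gap of only a few seconds suggests that the request was rejected outright rather than that a real dump timed out.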

2020/11/09 12:34:02,system,alert, --, --, --, --, --, --,10.10.10.10, --, --, --, --,"PAM-CMN-1417: PAM appliance (0.0.0.0) attempted to perform cluster operation, but is not part of the cluster list.",0, --,,0

 

Environment

Release : 3.3.X and 3.4.X

Component : PRIVILEGED ACCESS MANAGEMENT

Cause

This is a communications error caused by a mismatch between the network settings of the node and those of the rest of the cluster.

For instance, consider two nodes, node A and node B.

Node A is configured with one NIC: GB1, with IP address 10.10.10.10/24 and gateway 10.10.10.1.

Node B is configured with two NICs: GB1, with IP address 192.168.10.100/24 and gateway 192.168.10.1, and GB2, with IP address 10.10.10.10/24 and gateway 10.10.10.1. This cluster member is configured to use GB2 for the cluster, which is in the same subnet as node A.

This can break cluster communication: the database may be requested through the subnet intended for cluster traffic, 10.10.10.0/24, while the data may come back through the other defined interface.

In that case, the session logs show messages like the one indicated previously:

2020/11/09 12:34:02,system,alert, --, --, --, --, --, --,10.10.10.10,10.10.10.10, --, --, --, --,"PAM-CMN-1417: PAM appliance (0.0.0.0) attempted to perform cluster operation, but is not part of the cluster list.",0, --,,0

The (0.0.0.0) address appears in the message because the cluster cannot identify the origin of the cluster packets coming from the node being added.
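As an illustration, the following Python sketch scans exported session log lines for PAM-CMN-1417 entries and extracts the appliance address reported in each one. It assumes the log has been exported as comma-separated text in the layout of the sample line above; the function name is hypothetical and not part of PAM.

```python
import csv
import re

# Matches the address reported inside the PAM-CMN-1417 message text.
CLUSTER_ERROR = re.compile(r"PAM-CMN-1417: PAM appliance \(([^)]*)\)")

def find_cluster_rejections(lines):
    """Return (timestamp, reported_address) for each PAM-CMN-1417 entry."""
    hits = []
    # skipinitialspace lets fields such as " --" parse cleanly.
    for row in csv.reader(lines, skipinitialspace=True):
        if not row:
            continue
        for field in row:
            match = CLUSTER_ERROR.search(field)
            if match:
                hits.append((row[0], match.group(1)))
                break
    return hits

sample = [
    '2020/11/09 12:34:02,system,alert, --, --, --, --, --, --,10.10.10.10,10.10.10.10, --, --, --, --,"PAM-CMN-1417: PAM appliance (0.0.0.0) attempted to perform cluster operation, but is not part of the cluster list.",0, --,,0',
]
print(find_cluster_rejections(sample))  # prints [('2020/11/09 12:34:02', '0.0.0.0')]
```

A reported address of 0.0.0.0 points to the interface mismatch described in this section.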

 

Resolution

Configure the cluster to use the same interface on every node, and make sure that cluster traffic always goes through that interface. In the previous example, make sure that GB1 on both nodes has an IP address in the 10.10.10.0/24 subnet and that both nodes are configured to use GB1 as the cluster interface. The other interface, GB2, can be added afterwards, once it is confirmed that communication between the nodes follows the correct path.
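The check described above can be sketched in Python as follows. This is a planning aid under stated assumptions, not something PAM provides: the function name and the input dictionary are hypothetical, the values must be copied by hand from each node's network configuration, and the node B address 10.10.10.20 is invented for the example.

```python
import ipaddress

def check_cluster_interfaces(nodes):
    """Return a list of problems found in the cluster interface choices.

    nodes maps a node name to (cluster_interface_name, "address/prefix").
    """
    problems = []
    # Every node should use the same interface name for the cluster.
    names = sorted({iface for iface, _ in nodes.values()})
    if len(names) > 1:
        problems.append("different cluster interfaces: " + ", ".join(names))
    # Every cluster address should belong to the same subnet.
    networks = sorted(
        {str(ipaddress.ip_interface(cidr).network) for _, cidr in nodes.values()}
    )
    if len(networks) > 1:
        problems.append("different cluster subnets: " + ", ".join(networks))
    return problems

# Hypothetical values: node B's address is made up for illustration.
before = {"node A": ("GB1", "10.10.10.10/24"), "node B": ("GB2", "10.10.10.20/24")}
after = {"node A": ("GB1", "10.10.10.10/24"), "node B": ("GB1", "10.10.10.20/24")}

print(check_cluster_interfaces(before))  # prints ['different cluster interfaces: GB1, GB2']
print(check_cluster_interfaces(after))   # prints []
```

An empty list only means that the planned settings are consistent. It does not confirm that traffic between the nodes actually follows the intended path.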