Setting Up a Multisite PAM Cluster Site for Disaster Recovery

Article ID: 215717

Updated On:

Products

CA Privileged Access Manager (PAM)

Issue/Introduction

CA PAM is a multisite clustered application. It can fail over service from problematic nodes within each site, but there is no automatic failover from one site to another unless you use a third-party load balancer. Consider your specific requirements and expected configuration when designing the architecture so that it delivers the results you want.

Resolution

Always use three or more nodes in the primary site for a fault-tolerant cluster. If you configure your cluster with only two nodes, you have a processing cluster in which the loss of either node makes the entire cluster inaccessible and requires manual intervention. This is documented in the online manuals; see the page Primary Site Fault Tolerance for more information.
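The node-count guidance above can be illustrated with a short sketch. It assumes a simple majority rule (more than half of the primary-site nodes must be reachable), which is consistent with the two-node behavior described here but is not taken from the product documentation. The function names `has_quorum` and `tolerated_failures` are hypothetical and are not part of CA PAM.

```python
# Illustrative model only: assumes the primary site stays available while a
# strict majority of its nodes is reachable. This rule is an assumption that
# matches the two-node behavior described above, not a documented CA PAM API.

def has_quorum(total_nodes: int, reachable_nodes: int) -> bool:
    """Return True if more than half of the primary-site nodes are reachable."""
    return reachable_nodes > total_nodes / 2


def tolerated_failures(total_nodes: int) -> int:
    """Largest number of nodes that can be lost while a majority remains."""
    return (total_nodes - 1) // 2


if __name__ == "__main__":
    for nodes in (2, 3, 5):
        print(f"{nodes} nodes: can lose {tolerated_failures(nodes)} node(s)")
```

Under this assumed rule, a two-node primary site tolerates no node loss, while a three-node primary site tolerates the loss of one node.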

To handle an outage of an entire site, configure a CA PAM secondary site for disaster recovery. The secondary site can consist of a single node that receives updates from the primary site and can be started manually as the master server if the entire production site becomes inaccessible; see the topic Site Promotion Using Replication Analysis on the page Cluster Synchronization, Promotion, and Recovery. Having at least one node in an offsite location is important to ensure quick recovery from certain types of outages. If you choose to place more than one node in the disaster recovery location, consider making that location the CA PAM primary site and having end users connect only to the secondary site for normal daily operations. This idea is based on the following considerations:

  1. From the end user's perspective, connecting to a secondary site is no different from connecting to a primary site. Switching end users to secondary site nodes would have no impact on their daily usage.
  2. The primary site handles all administrative and scheduled tasks. If you have secondary site nodes that are not being used, it makes sense to move end-user access to those nodes, freeing up resources on the primary nodes currently used for access. CA PAM administrators would still use the primary site, so its nodes would also remain in regular use. For details and a graphic, see the page How to Set Up a Cluster in the manual.
  3. Adding or removing secondary site nodes has little impact on overall cluster stability, so maintenance work on a secondary node has less impact than the same work on a primary node. You can also easily increase or reduce the number of secondary site nodes to manage overall end-user utilization.
  4. If you have access to a third-party load balancer, you could automate failover from a problematic secondary site to the DR site when the DR site is the CA PAM primary site. You might not be able to do this if the primary site is the one with the issue, because the secondary site relies on the primary site being fully functional, whereas the primary site does not require the secondary sites to stay functional (see the Site Promotion Using Replication Analysis topic referenced above).
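The asymmetry described in item 4 can be sketched as a small decision function of the kind a load balancer health check might drive. This is a hypothetical illustration, not a CA PAM or load balancer API: the names `Action` and `choose_action` are invented for this example, and how site health is actually determined is left out as an assumption.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical outcomes for routing end-user traffic."""
    ROUTE_TO_SECONDARY = "route end users to the secondary site"
    ROUTE_TO_PRIMARY = "fail over end users to the primary (DR) site"
    MANUAL_PROMOTION = "no automatic failover; manual site promotion needed"


def choose_action(primary_healthy: bool, secondary_healthy: bool) -> Action:
    """Pick a routing action, assuming the DR location hosts the primary site.

    The secondary site depends on a fully functional primary site, so an
    unhealthy primary is treated here as a case for manual intervention.
    """
    if not primary_healthy:
        # Assumption: automated failover may not be possible in this case.
        return Action.MANUAL_PROMOTION
    if secondary_healthy:
        # Normal daily operation: end users connect to the secondary site.
        return Action.ROUTE_TO_SECONDARY
    # Secondary site has a problem, primary is fine: fall back to the primary.
    return Action.ROUTE_TO_PRIMARY


if __name__ == "__main__":
    print(choose_action(primary_healthy=True, secondary_healthy=False).value)
```

The sketch treats any primary-site problem as requiring manual promotion, which is a conservative reading of "might not be able to do this"; an actual deployment may behave differently.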

In short, the health of the overall cluster depends on the health of the primary site. If you have only a single CA PAM appliance set aside for disaster recovery, changing the architecture this way may not be cost-effective. If you have several nodes provisioned solely for disaster recovery, it may be beneficial to change your architecture to distribute the load. Before making any such change, evaluate your specific environment for any additional concerns the change may raise.

Additional Information

See also the documentation page Restore the Database to a New Appliance.