How does ITPAM behave in a cluster configuration

Article ID: 275976

Updated On:

Products

CA Process Automation Base

Issue/Introduction

In general, how does ITPAM operate in a clustered architecture?

Environment

ITPAM 4.3 and 4.4

All Supported Operating Systems

Resolution

In ITPAM cluster recovery, each process as a whole is assigned to a single node, which orchestrates its execution operator by operator. The execution status is updated in the database at every step.

If a node goes down while executing a process, another active cluster node picks up that process as part of recovery and resumes it from the point where the original node went down. Operators that have already completed are not executed again; execution starts from the operator that is in the running state.

As a result, an operator that was in the middle of execution is executed again during recovery. The exception is the few operators that can enter a waiting state, such as the delay and assign user task operators: these are not executed again but are resumed from their waiting state.
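
The recovery rules described above can be pictured with a small simulation. This is only an illustrative sketch, not ITPAM code: the state names, the `Operator` class, and the `recover` function are hypothetical and exist only to show which operators are skipped, re-executed, or resumed.

```python
from dataclasses import dataclass

# Hypothetical state names for this sketch; not ITPAM's actual identifiers.
COMPLETED, RUNNING, WAITING, PENDING = "completed", "running", "waiting", "pending"


@dataclass
class Operator:
    name: str
    state: str = PENDING
    executions: int = 0  # how many times the operator's work has been started


def recover(operators):
    """Return the action a surviving node takes for each operator (assumed model)."""
    actions = []
    for op in operators:
        if op.state == COMPLETED:
            # Already recorded as done in the database: never run again.
            actions.append((op.name, "skip"))
        elif op.state == WAITING:
            # Delay / user-task style operators: pick the wait back up.
            actions.append((op.name, "resume-wait"))
        elif op.state == RUNNING:
            # Was mid-execution when the node went down: run it again.
            op.executions += 1
            actions.append((op.name, "re-execute"))
        else:
            # Not started yet: runs normally once the process reaches it.
            actions.append((op.name, "execute-later"))
    return actions


ops = [
    Operator("read_input", COMPLETED, executions=1),
    Operator("write_db", RUNNING, executions=1),
    Operator("notify", PENDING),
]
actions = recover(ops)
```

In this model the `write_db` operator ends up with two executions, which is the situation Scenario #1 describes.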

Please refer to the following scenarios:

Scenario #1:

An ITPAM process contains an operator that writes data to a database. In the middle of the write, the ITPAM node that was executing the process becomes unavailable. What happens?

The other ITPAM node will not restart the whole process; it will execute the operator that was in the running state when the first node went down. Database queries may therefore be executed a second time if the operator running them was in the running state when the node went down.
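
To make the effect of a re-executed query concrete, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. Nothing in it is ITPAM-specific: the table names, the `run_id` key, and both write functions are assumptions made for illustration, and it assumes the first node's write had already been committed before the node went down.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: one with no key, one keyed on a per-run identifier.
conn.execute("CREATE TABLE plain_log (run_id TEXT, payload TEXT)")
conn.execute("CREATE TABLE keyed_log (run_id TEXT PRIMARY KEY, payload TEXT)")


def write_plain(run_id, payload):
    # A plain INSERT: running it twice stores the row twice.
    conn.execute("INSERT INTO plain_log VALUES (?, ?)", (run_id, payload))


def write_keyed(run_id, payload):
    # INSERT OR IGNORE on a keyed table: a repeat of the same run_id is a no-op.
    conn.execute("INSERT OR IGNORE INTO keyed_log VALUES (?, ?)", (run_id, payload))


for write in (write_plain, write_keyed):
    write("run-1", "data")  # executed by the first node
    write("run-1", "data")  # executed again by the recovering node

plain_rows = conn.execute("SELECT COUNT(*) FROM plain_log").fetchone()[0]
keyed_rows = conn.execute("SELECT COUNT(*) FROM keyed_log").fetchone()[0]
```

Here `plain_rows` is 2 and `keyed_rows` is 1. Whether a particular database operator tolerates being run twice depends on the queries it contains, so this is one possible way to reason about the risk rather than a recommendation from the article.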

Scenario #2:

An ITPAM process contains operators that depend on a manual action by an analyst. In the middle of execution, one of the ITPAM nodes becomes unavailable. What happens?

If the manual action is a response to a task, node #2 will not re-trigger that task, because the operator was in a waiting state when the node went down. Once the other node picks up the process, it resumes the waiting operator, and the operator completes when it receives input from the user (i.e. when the task is replied to).

Additional Information

Install a Cluster Node for an Orchestrator