Replication Job Timeouts and Back off and retry of Replication Jobs


Article ID: 180316


Updated On:

Products

Management Platform (Formerly known as Notification Server)

Issue/Introduction

 

Resolution

Replication Job Timeouts 

When hierarchy replication rules run, they are split into replication jobs. Each replication job times out and is aborted once its timeout limit is reached.

There are two types of timeout, global and per individual rule:

- The global timeout exists within the CoreSettings.config file

The setting is called ‘ReplicationMaxJobTimeout’ and has a default value of ‘172800’ seconds (48 hours).

- The individual rule timeout exists within the XML of each replication rule

The XML contains the tag <maximumRunTime>172800</maximumRunTime>, which also defaults to 48 hours.

Note: If the two timeouts are configured with different values, the lower value is enforced.

 - Using Chevron, the XML tag value for an individual replication rule can be modified and the rule re-imported with the desired timeout value.
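
The rule that the lower of the two timeouts is enforced can be illustrated with a short Python sketch. This is not product code: the function names, the wrapping <rule> element, and the fallback to the default when the tag is missing are all assumptions made for illustration. Only the setting name, the tag name, and the default value come from this article.

```python
import xml.etree.ElementTree as ET

# 48 hours in seconds, the default quoted for both timeout types.
DEFAULT_TIMEOUT_SECONDS = 172800


def rule_timeout_from_xml(fragment: str) -> int:
    """Read the <maximumRunTime> value (seconds) from an XML fragment."""
    root = ET.fromstring(fragment)
    if root.tag == "maximumRunTime":
        node = root
    else:
        node = root.find(".//maximumRunTime")
    if node is None or node.text is None:
        # Assumption for this sketch: fall back to the documented default.
        return DEFAULT_TIMEOUT_SECONDS
    return int(node.text.strip())


def effective_timeout(global_timeout: int, rule_timeout: int) -> int:
    """Return the timeout that is enforced: the lower of the two values."""
    return min(global_timeout, rule_timeout)


# Hypothetical rule fragment; real replication rule XML has more structure.
rule_xml = "<rule><maximumRunTime>86400</maximumRunTime></rule>"

enforced = effective_timeout(DEFAULT_TIMEOUT_SECONDS, rule_timeout_from_xml(rule_xml))
print(f"{enforced} seconds ({enforced / 3600:g} hours)")  # 86400 seconds (24 hours)
```

In this example the rule's 24-hour value is lower than the 48-hour global value, so the rule's value is the one enforced.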
 
Back off and retry of Replication Jobs
 

When a replication job fails to run for any reason, it enters a back off and retry mode. The job retries after 1 minute, then 2 minutes, then 4 minutes and so on, doubling the previous retry interval up to the maximum back off time of 1024 minutes. This incremental back off is controlled by the core setting "ReplicationJobBlackoutMultiplier".

If the replication job has been retrying for 24 hours, or reaches the 1024 minute retry attempt, the replication is aborted.
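
The doubling schedule described above can be worked through with a short Python sketch. It is a simplified model rather than the product's implementation: the function name and parameters are hypothetical, and the multiplier of 2 is assumed from the "doubling" described in this article rather than read from "ReplicationJobBlackoutMultiplier".

```python
from itertools import accumulate


def backoff_schedule(first_delay_minutes: int = 1,
                     multiplier: int = 2,
                     max_delay_minutes: int = 1024) -> list[int]:
    """Return the retry delays (minutes) up to and including the maximum."""
    if first_delay_minutes < 1 or multiplier < 2:
        raise ValueError("delay must be >= 1 and multiplier must be >= 2")
    delays = []
    delay = first_delay_minutes
    while delay <= max_delay_minutes:
        delays.append(delay)
        delay *= multiplier  # each delay is the previous one times the multiplier
    return delays


delays = backoff_schedule()
elapsed = list(accumulate(delays))  # running total of minutes spent waiting

for attempt, (wait, total) in enumerate(zip(delays, elapsed), start=1):
    print(f"retry {attempt:2d}: wait {wait:4d} min, cumulative {total:4d} min")
```

Under these assumptions there are 11 delays, from 1 to 1024 minutes, and the waits add up to 2047 minutes. The waits up to and including the 512 minute delay total 1023 minutes, which is below 24 hours (1440 minutes). How the product measures the 24 hour limit against these waits is not stated in this article.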