To resolve the issue, correct the underlying network problem that is causing the replication delays between the Identity Manager nodes.
Workaround:
To work around the issue, increase the auto_recovery_replication_delay_threshold setting:
1. Find the typical replication delays on this setup:
cd /var/log/pgService
zgrep "SECONDARY_1_REPLICATION_DELAY" * | cut -d '=' -f2 | sort -rn | uniq | head -n50
# Similarly for SECONDARY_2
zgrep "SECONDARY_2_REPLICATION_DELAY" * | cut -d '=' -f2 | sort -rn | uniq | head -n50
2. Based on the highest delays reported above, configure the new threshold in /usr/local/etc/lcm-pgpool.conf.
For example:
auto_recovery_replication_delay_threshold=100000
Note: 100000 is only a reference value. Set the actual threshold based on the maximum values seen in the output of the commands in step 1.
3. Perform this change on all the Identity Manager nodes in the cluster.
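
As a convenience, the largest delay from step 1 can be extracted with a small helper instead of being read off the top of the sorted list. The following is only a sketch: the function name max_delay is hypothetical, and it assumes that the values following '=' in the log lines are non-negative integers, one per line.

```shell
# Hypothetical helper (not part of the product): reads candidate delay values
# from standard input, one per line, and prints the largest one.
# Assumption: valid values are non-negative integers; other lines are ignored.
max_delay() {
  awk '$1 ~ /^[0-9]+$/ { if ($1 + 0 > max) max = $1 + 0 }
       END { printf "%d\n", max }'
}
```

On a node, it could be fed from the same pipeline as in step 1, for example: zgrep -h "SECONDARY_1_REPLICATION_DELAY" /var/log/pgService/* | cut -d '=' -f2 | max_delay. The printed value is the observed maximum only; how much margin to add when choosing the threshold is left to the administrator.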