Why is the default disaster.switch.time as long as 10 minutes?
Release : 4.3
The default switch time is 10 minutes so that the passive PAM node(s) can be sure the current active node(s) are in real trouble before taking over the PAM requests.
If you set this to a shorter time (such as 1 minute, which is the minimum you can set), the following could happen: the passive nodes check the active nodes and do not get a response within 1 minute because the active nodes are busy handling requests. The passive nodes then assume the active nodes are down and take over the requests. At that moment a lot of requests may be queued, so the new active nodes can be slow to respond as well, and the other nodes then try to take over from them. The result is nodes taking over from one another in a loop.
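The effect of the switch time can be illustrated with a small sketch. This is a toy model, not PAM's actual failover logic: the function name, the response delays, and the use of seconds are all assumptions made for illustration. It counts how many times a passive node would decide to take over, given how long a busy but healthy active node takes to answer each check.

```python
def count_takeovers(response_delays_s, switch_time_s):
    """Count checks where the passive node would take over.

    Toy model (assumption): a takeover is triggered whenever the active
    node stays silent for at least the switch time.
    """
    takeovers = 0
    for delay in response_delays_s:
        if delay >= switch_time_s:
            takeovers += 1
    return takeovers


# Hypothetical response delays, in seconds, of an active node that is
# busy handling requests but has not actually failed.
busy_delays = [20, 90, 45, 150, 30]

# Hypothetical delay of a node that really is down.
outage_delays = [900]

print("1 minute switch time, busy node:", count_takeovers(busy_delays, 60))
print("10 minute switch time, busy node:", count_takeovers(busy_delays, 600))
print("10 minute switch time, real outage:", count_takeovers(outage_delays, 600))
```

In this model the 1 minute setting triggers takeovers even though the active node is only slow, while the 10 minute setting ignores the slow responses and still reacts to the real outage.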
In practice, 10 minutes is the right setting: if the passive nodes do not get a response from the active nodes within 10 minutes, the active nodes are in real trouble, and the passive nodes should take over and become the new active nodes.