A customer deployed redundant policy servers, but failback does not work properly.
The sequence of events is as follows.
1. Stop the primary policy server.
2. Send requests at 50 threads per second; all of them fail over to the secondary policy server.
3. Start the primary policy server.
4. Send requests at 50 threads per second again; about 10 of the threads do not fail back right away.
5. After about 10 minutes, all threads have failed back.
Why does it take some time for all threads to fail back completely?
This issue is related to the Apache prefork mode.
In prefork mode, the Apache web server creates many single-threaded child processes, whereas in worker mode each child process serves requests with multiple threads.
For this reason, some of the existing child processes that established a connection to the secondary policy server remain connected to it for a while,
even after the primary policy server has started.
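
The behavior described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the actual agent or policy server logic: the `Child` class, the `reconnect_at` field, and the numbers (50 children, 10 of them holding their old connection for up to 10 minutes) are hypothetical and were chosen only to mirror the observation in the sequence above.

```python
from dataclasses import dataclass


@dataclass
class Child:
    """One prefork child process holding its own policy server connection."""
    server: str        # "primary" or "secondary"
    reconnect_at: int  # hypothetical minute at which the old connection ends


def simulate(children, minutes):
    """Return how many children are still on the secondary at each minute."""
    history = []
    for minute in range(minutes + 1):
        for child in children:
            # Assumption: a child only switches back to the primary once its
            # existing connection to the secondary has ended.
            if child.server == "secondary" and minute >= child.reconnect_at:
                child.server = "primary"
        history.append(sum(c.server == "secondary" for c in children))
    return history


# Assumed scenario: 40 children reconnect right away, 10 keep their old
# connection and drop it one by one over the next 10 minutes.
children = [Child("secondary", 0) for _ in range(40)]
children += [Child("secondary", m) for m in range(1, 11)]

history = simulate(children, 10)
print(history)
```

In this model the count of children still using the secondary server starts at 10 and only reaches 0 once the last old connection has ended, which is why failback appears gradual rather than immediate.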
Workaround: apply either one of the following.
1. Change the Apache mode from prefork to worker.
2. Set the 'KeepAgentConnections' registry key on the policy server to '0' or '2'.
0 - After the TTL expires, the Policy Server closes the agent connection by sending a TCP FIN packet to the Agent.
2 - After the TTL expires, the Policy Server closes the agent connection by sending a TCP RST packet to the Agent.