Hi,
We are facing an intermittent connectivity issue with our agents during deployments. When checked, the agents show as online in Agent Management under ROC -> Administration.
Release : 6.7
Component : CA RELEASE AUTOMATION ACTION PACK
The most common cause of this type of intermittent issue is either a network outage or a misconfiguration of capacity and warn-capacity on the NES.
A network outage, if any, is beyond the scope of this document, as it needs to be checked and fixed by the infrastructure team at your end. This article covers the possible misconfiguration of capacity and warn-capacity.
For more details, please refer to the document Execution Server High Availability and Scalability, linked in the additional references below. This article aims to give a more intuitive basis for making the right choice.
Scenario:
The guide Execution Server High Availability and Scalability describes a scenario with 3 NESs and 2000 agents in total, and therefore estimates warn-capacity as 1/3 of 2000, rounded to 650/666. Here we consider a case with 1200 agents in total, where 40% of them, i.e. 480 agents, have a single parent NES. In this case, a capacity of 1000 with a warn-capacity of 650 is not a good choice.
The reason is that a warn-capacity of 650 leaves room for only 170 additional agents (650 - 480) to connect if another NES fails. Once an NES crosses its warn-capacity, agents start seeking another supernode, existing connections are closed to accommodate new ones, new connections are rejected, and connections that have not been used for a long time are closed.
Please note that the agent status check is a periodic check that runs every x minutes, whereas the connectivity check during a deployment is made in real time. An agent that was available at the last periodic check may therefore be offline at deployment time because of the scenarios mentioned above.
In such a scenario, we recommend estimating warn-capacity based on your understanding of your environment and NES topology. For example, a warn-capacity of 800/900 makes more sense in the scenario above, because the NES can then still accommodate 320/420 more agents if another NES fails.
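
The arithmetic behind these figures can be reproduced with a short sketch. This is only an illustration of the estimate used in this article, not product code: the function name `headroom` and the variable names are hypothetical, and the assumption is that the 480 single-parent agents are all connected to the NES being sized.

```python
# Illustrative only: names below are hypothetical and not part of the product.

def headroom(warn_capacity: int, connected_agents: int) -> int:
    """Agents an NES can still accept before it reaches warn-capacity."""
    return max(warn_capacity - connected_agents, 0)

TOTAL_AGENTS = 1200
SINGLE_PARENT_PERCENT = 40

# Agents that have only this NES as their parent (assumed always connected to it).
connected = TOTAL_AGENTS * SINGLE_PARENT_PERCENT // 100

# Compare the candidate warn-capacity values discussed in the scenario.
results = {wc: headroom(wc, connected) for wc in (650, 800, 900)}

for wc, spare in results.items():
    print(f"warn-capacity {wc}: room for {spare} more agents")
```

With the scenario's numbers this gives 170, 320 and 420 spare connections for warn-capacity values of 650, 800 and 900 respectively. Substituting your own agent counts shows how much failover room a given warn-capacity leaves.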