Agent connectivity check fails with the error 'Connectivity Check failed. reason: Could not find accessible execution server for agent'.
The NIMI.LOG on the Execution Server shows the following error lines:
2014-08-19 17:51:14,385 [KeepAliveWorker-4] DEBUG (KeepAliveManagerImpl.java:283) - Not updating [NodeInfo{nodeId=nid:<Node ID>, hostname='<Hostname>', addresses=[/10.x.x.x:6600], nodeType=NODE, version=4.5.1.59}] - this supernode is over max capacity
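To confirm how often this condition is logged and which agents are affected, the log can be scanned for the message above. The following is a minimal Python sketch, not a supported tool. The regular expression is an assumption derived from the single sample line shown here, and the function names are hypothetical.

```python
import re

# Assumption: every over-capacity line follows the layout of the sample above
# (timestamp first, a hostname='...' field, then the fixed message text).
OVER_CAPACITY = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) .*"
    r"hostname='(?P<hostname>[^']*)'.*this supernode is over max capacity"
)


def find_over_capacity(lines):
    """Return (timestamp, hostname) pairs for every over-capacity log line."""
    hits = []
    for line in lines:
        match = OVER_CAPACITY.search(line)
        if match:
            hits.append((match.group("timestamp"), match.group("hostname")))
    return hits


def scan_log(path):
    """Scan a log file on disk; undecodable bytes are replaced, not fatal."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_over_capacity(handle)
```

Calling scan_log with the path to the log file on the Execution Server returns one entry per rejected keep-alive update. Many entries spread across many hostnames suggest the server is at capacity rather than a single agent being misconfigured.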
CA Release Automation - All versions
This error occurs when the number of agents connected to the Execution Server (NES) has reached its maximum capacity.
An Execution Server is configured to serve a specific number of agents, defined in the nimi_config.xml configuration file (default: 200).
If more agents than the defined number try to connect to an Execution Server, several issues can occur.
One of them is that agents lose connectivity to the server at random times, which can cause jobs to fail at any step of process execution.
The currently configured number of agents (default: 200) can be found in <RA_EXECUTION_SERVER_INSTALL_DIR>/conf/nimi_config.xml:
<server>
<capacity>200</capacity> <!-- how much nodes to accept-->
<warn-capacity>200</warn-capacity> <!-- over this limit new connecting nodes will be asked to seek another supernode-->
</server>
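The configured values can also be read programmatically, for example when auditing several Execution Servers. The following is a minimal Python sketch using only the standard library. It assumes the <server> element contains <capacity> and <warn-capacity> as direct children, as in the fragment above. Where <server> sits within the full file is not known here, so the code searches the whole tree. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def read_capacity(root):
    """Return (capacity, warn_capacity) from the first matching <server> element."""
    # root.iter() also yields the root itself, so this works whether
    # <server> is the document root or nested somewhere below it.
    for server in root.iter("server"):
        capacity = server.findtext("capacity")
        warn = server.findtext("warn-capacity")
        if capacity is not None and warn is not None:
            return int(capacity.strip()), int(warn.strip())
    raise ValueError("no <server> element with <capacity> and <warn-capacity> found")


def read_capacity_from_file(path):
    """Parse nimi_config.xml at the given path and return its capacity settings."""
    return read_capacity(ET.parse(path).getroot())
```

This sketch only reads the file. It does not modify it, so it is safe to run while jobs are active.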
1. Increase the number of agents that can be handled by an Execution Server by modifying nimi_config.xml.
Example of increasing the capacity to 250:
<server>
<capacity>250</capacity> <!-- how much nodes to accept-->
<warn-capacity>250</warn-capacity> <!-- over this limit new connecting nodes will be asked to seek another supernode-->
</server>
2. Restart the Nolio services on the Execution Server.
Important Note: Make sure that there are no jobs running on the Execution Server before making the change.
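If the change in step 1 has to be applied on several servers, the edit can be scripted. The following is a minimal Python sketch. It replaces the values as text so that the comments in the file are preserved. It assumes the file contains exactly one <capacity> and one <warn-capacity> element, and it sets both to the same value, as in the example above. The function name is hypothetical. Back up the file first, and restart the services afterwards as described in step 2.

```python
import re

# Matches <capacity>N</capacity> and <warn-capacity>N</warn-capacity>.
# The leading "<" keeps <capacity> from matching inside <warn-capacity>.
_CAPACITY_TAG = re.compile(r"(<(capacity|warn-capacity)>)\s*\d+\s*(</\2>)")


def set_capacity(xml_text, new_capacity):
    """Return xml_text with <capacity> and <warn-capacity> set to new_capacity."""
    if new_capacity <= 0:
        raise ValueError("capacity must be a positive integer")
    updated, count = _CAPACITY_TAG.subn(rf"\g<1>{new_capacity}\g<3>", xml_text)
    if count != 2:
        # Assumption violated: refuse to guess which elements to change.
        raise ValueError(f"expected 2 elements to update, found {count}")
    return updated
```

The function works on the file contents as a string, so the caller decides how to read and write the file. Raising an error when the element count is unexpected avoids silently producing a half-edited configuration.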
Things to consider before increasing the number of agents:
- Current RA JVM heap size definitions
- RAM on the Execution Server
- Number of processors on the Execution Server
Note: It is highly recommended to add a new Execution Server and load balance the number of agents attached to each Execution Server, if the maximum capacity (default: 200) has been reached.
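As a rough planning aid for this recommendation, the number of Execution Servers needed for a given agent count can be estimated with simple arithmetic. The following is a minimal Python sketch. It assumes every server uses the same capacity (200 by default, as stated above) and that agents are spread as evenly as possible. The function names are hypothetical, and the result says nothing about heap size, RAM, or processor requirements.

```python
def servers_needed(agent_count, capacity_per_server=200):
    """Return the minimum number of Execution Servers for agent_count agents."""
    if agent_count < 0 or capacity_per_server <= 0:
        raise ValueError("agent_count must be >= 0 and capacity_per_server > 0")
    # Ceiling division using integers only.
    return -(-agent_count // capacity_per_server)


def balanced_distribution(agent_count, capacity_per_server=200):
    """Return a list with the number of agents to attach to each server."""
    servers = servers_needed(agent_count, capacity_per_server)
    if servers == 0:
        return []
    base, extra = divmod(agent_count, servers)
    # The first `extra` servers take one additional agent each.
    return [base + 1 if index < extra else base for index in range(servers)]
```

For example, 450 agents at the default capacity need three Execution Servers with 150 agents each, which leaves headroom on every server instead of filling two servers to the limit.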