While running a deployment the following errors were returned:
Release : 6.6
Component : CA RELEASE AUTOMATION CORE
When an Execution Server starts a job on an agent, it first tries to PING the agent (see Additional Information for log messages). The errors above indicate that the Execution Server sent the PING message to the Agent but did not get a response.
There are three options to resolve this problem:
Increasing the timeout used by the connectivity check service
Addressing this problem using options #1 or #2 is ideal. However, if these options are not possible then the timeout used by the connectivity check service can be increased.
The connectivity check service sends ping messages to all the nodes connected to the supernode (NES) to confirm that each agent is actively connected and responding. The timeout for these messages can be changed in the execution-servlet.xml file. The timeout is set in the following bean, where the default value is 35 seconds:
//beans/bean[@id='
Note:
Troubleshooting
If the PING message sent by the NES is not received by the agent, the underlying problem is most likely a network issue. If additional analysis is needed, the following should be gathered after the problem has been reproduced:
When this type of error occurs, consider using the NES's JMX to manually PING the agent machine in question. This lets you reproduce message-sending problems without starting deployments. If you can reproduce the problem this way, it gives you an opportunity to enable network tracing, reproduce the problem again, and capture data. To do this:
If the PING gets a response, the result will show that the message was sent. If it does not get a response, it will eventually return a timeout error.
Example Messages
See below for example messages that you should see exchanged between the NES and the Agent upon a successful PING.
Note:
Execution Server:
2020-12-07 15:09:50,397 [JobExecutorThread-6] DEBUG (com.nolio.nimi.appmsg.durability.DurableCommunicationApi:144) - Got new message: es_<nes_nodeId>_160693912061731:payload=[ID:7c6007d8441a5800_1e@es_<nes_nodeId>, from:es_<nes_nodeId>, to:PING@<agent_nodeId>- PING]
Agent Server:
2020-12-07 15:09:50,503 [New I/O server worker #1-2] DEBUG (com.nolio.nimi.appmsg.durability.DurableCommunicationApi:233) - Received shipping: es_<nes_nodeId>_160693912061731:payload=[ID:7c6007d8441a5800_1e@es_<nes_nodeId>, from:es_<nes_nodeId>, to:PING@<agent_nodeId>- PING]
Agent Server:
2020-12-07 15:09:50,506 [Communication Msg Processor-2] DEBUG (com.nolio.nimi.appmsg.durability.DurableCommunicationApi:155) - Got new message: <agent_nodeId>_160737107556804:payload=[ID:d3b1d9cb9d5b800_3@<agent_nodeId>, from:<agent_nodeId>, to:MESSAGE_RESPONSE_SERVICE@es_<nes_nodeId>- [Response for message: 7c6007d8441a5800_1e@es_<nes_nodeId>]]
Execution Server:
2020-12-07 15:09:50,509 [New I/O client worker #1-1] DEBUG (com.nolio.nimi.appmsg.durability.DurableCommunicationApi:222) - Received shipping: <agent_nodeId>_160737107556804:payload=[ID:d3b1d9cb9d5b800_3@<agent_nodeId>, from:<agent_nodeId>, to:MESSAGE_RESPONSE_SERVICE@es_<nes_nodeId>- [Response for message: 7c6007d8441a5800_1e@es_<nes_nodeId>]]
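When reviewing logs by hand, it can help to pair each PING with its response automatically. The following is a minimal Python sketch, not part of the product. It assumes DEBUG logging is enabled and that the log lines follow the same layout as the example messages above. The function name find_unanswered_pings and both regular expressions are illustrative and were derived only from those sample lines; log file names and locations are not assumed and must be passed on the command line.

```python
import re
import sys

# Outgoing PING payload, e.g.
#   payload=[ID:<msgId>@es_<node>, from:es_<node>, to:PING@<agent>- PING]
# Pattern inferred from the sample log lines only (assumption).
PING_RE = re.compile(r"payload=\[ID:(?P<id>[^,\]]+), from:[^,]+, to:PING@")

# Response payload, e.g. [Response for message: <msgId>@es_<node>]
RESPONSE_RE = re.compile(r"\[Response for message: (?P<id>[^\]]+)\]")


def find_unanswered_pings(lines):
    """Return (timestamp, message_id) for each PING with no logged response."""
    sent = {}         # message id -> timestamp of the first line mentioning it
    answered = set()  # message ids that appear in a response line
    for line in lines:
        ping = PING_RE.search(line)
        if ping:
            # The first 23 characters hold the timestamp (yyyy-MM-dd HH:mm:ss,SSS).
            sent.setdefault(ping.group("id"), line[:23])
        response = RESPONSE_RE.search(line)
        if response:
            answered.add(response.group("id"))
    return [(ts, msg_id) for msg_id, ts in sent.items() if msg_id not in answered]


if __name__ == "__main__":
    # Usage: python ping_check.py <log file> [<log file> ...]
    all_lines = []
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as handle:
            all_lines.extend(handle)
    for timestamp, message_id in find_unanswered_pings(all_lines):
        print(f"{timestamp} PING {message_id} has no logged response")
```

Passing both the Execution Server log and the agent log in one run shows whether a PING was sent but never answered. A PING that appears only in the Execution Server log, with no matching "Received shipping" line on the agent, is consistent with the message never reaching the agent.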