DX APM 11.1.3 Agents disconnecting and reconnecting continuously

Article ID: 141967

Products

CA Application Performance Management Agent (APM / Wily / Introscope)
CA Application Performance Management (APM / Wily / Introscope)
INTROSCOPE
DX Application Performance Management

Issue/Introduction

Agents continuously disconnect from and reconnect to the cluster environment.

- On the Agent side, the connection drops because the socket is closed. The IntroscopeAgent.log file (with DEBUG enabled) reports the following:
[DEBUG] [IntroscopeAgent.IsengardMessaging] WebSocket socket channel closed.                                                                          
[DEBUG] [IntroscopeAgent.IncomingMessageDeliveryTask] Unexpected end of stream while reading from node at: Socket Transport connected with websocket: wss://apmservices-xxxxx:443
java.io.EOFException                                                                                                                                                                                
        at java.io.DataInputStream.readInt(DataInputStream.java:392)                                                                                                                                
        at com.wily.isengard.postofficehub.link.v1.IsengardObjectInputStream.readInt(IsengardObjectInputStream.java:1312)                                                                           
        at com.wily.isengard.postofficehub.link.v1.IsengardObjectInputStream.setUpStartObjectGraph(IsengardObjectInputStream.java:459)                                                              
        at com.wily.isengard.postofficehub.link.v1.IsengardObjectInputStream.readObject(IsengardObjectInputStream.java:257)                                                                         
        at com.wily.isengard.postofficehub.link.v1.IncomingMessageDeliveryTask.deliverNextMessage(IncomingMessageDeliveryTask.java:76)                                                              
        at com.wily.isengard.postofficehub.link.v1.IncomingRouteConnector.receiveIncomingMessages(IncomingRouteConnector.java:170)                                                                  
        at com.wily.isengard.postofficehub.link.v1.IncomingRouteConnector.doTask(IncomingRouteConnector.java:89)                                                                 
        at com.wily.isengard.util.thread.AThreadedExecutable.run(AThreadedExecutable.java:192)                                                                                                      
        at java.lang.Thread.run(Thread.java:748)                                                                                                                                                    
[VERBOSE] [IntroscopeAgent.PostOfficeHub] Disconnected From: Node=Server, Address=apmservices-xxxx:443, Type=socket                    

- On the Enterprise Manager side, the connection is also lost abruptly. The IntroscopeEnterpriseManager.log file (with DEBUG enabled) reports the following:
[INFO] [PO Route Down Executor] [Manager] Lost connection at: Node=Agent_10310, Address=/xx.xxx.x.xxx:xxxxx, Type=socket
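
To gauge how often the disconnections occur, the log lines above can be counted with a short script. The following is a minimal sketch, not a supported tool: the function and pattern names are hypothetical, and the regular expressions are derived only from the two log excerpts shown here, so other message variants in real logs would not be matched.

```python
import re
from collections import Counter

# Patterns derived from the log excerpts in this article (an assumption:
# real logs may contain other variants that these do not match).
AGENT_DISCONNECT = re.compile(
    r"\[IntroscopeAgent\.PostOfficeHub\] Disconnected From: .*Address=([^,\s]+)"
)
EM_LOST = re.compile(r"Lost connection at: Node=([^,\s]+)")


def count_disconnects(lines, pattern):
    """Count lines matching the pattern, keyed by the captured address or node."""
    counts = Counter()
    for line in lines:
        match = pattern.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Example use against an agent log (the file name is the one used in this article):
#   with open("IntroscopeAgent.log", errors="replace") as handle:
#       print(count_disconnects(handle, AGENT_DISCONNECT))
```

Use AGENT_DISCONNECT with IntroscopeAgent.log and EM_LOST with IntroscopeEnterpriseManager.log. A count that keeps growing over a short period is consistent with the symptom described in this article.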

 

Cause

This is a known issue with ingress-nginx: it reloads automatically whenever a service or pod restarts, and each reload drops the agents' open connections. In the example below, the "doi-dspintegrator" pod is restarting repeatedly:

# kubectl get pods -n dxi
NAME                                READY   STATUS             RESTARTS   AGE
doi-dspcasa1-xxxxx-xxx              1/1     Running            0          11d
doi-dspintegrator-xxxxx-xxx         0/1     CrashLoopBackOff   1655       11d
doi-genericapiconnector-xxxxx-xxx   1/1     Running            0          11d

 

For more information about this known ingress-nginx issue, see:

https://github.com/kubernetes/ingress-nginx/issues/2985
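
To check whether any pods are restarting in your own environment, the pod listing can be filtered programmatically. The following is a minimal sketch under stated assumptions: it relies on the standard JSON output of `kubectl get pods -o json` (the `containerStatuses`, `restartCount`, and `state.waiting.reason` fields), the `dxi` namespace is taken from the example above, and the function names are hypothetical.

```python
import json
import subprocess


def restarting_pods(pod_list, min_restarts=1):
    """Return (name, restarts, waiting_reasons) for pods with restarted containers.

    pod_list is the parsed JSON document produced by `kubectl get pods -o json`.
    """
    results = []
    for pod in pod_list.get("items", []):
        statuses = pod.get("status", {}).get("containerStatuses", [])
        restarts = sum(status.get("restartCount", 0) for status in statuses)
        reasons = [
            status["state"]["waiting"].get("reason", "")
            for status in statuses
            if "waiting" in status.get("state", {})
        ]
        if restarts >= min_restarts:
            results.append((pod["metadata"]["name"], restarts, ",".join(reasons)))
    # Most frequently restarted pods first
    return sorted(results, key=lambda item: item[1], reverse=True)


def fetch_pods(namespace="dxi"):
    """Run kubectl and parse its JSON output; needs kubectl access to the cluster."""
    completed = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(completed.stdout)
```

Calling `restarting_pods(fetch_pods())` on a cluster in the state shown above should list the "doi-dspintegrator" pod with its restart count and the CrashLoopBackOff reason, while pods with zero restarts are omitted.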

 

 

Environment

APM 11.1.3

Resolution

Use either of the following options to address this issue:

a) Use HTTPS rather than WSS for the agent connection to the EM.
The ingress nginx reload does not affect HTTPS, because cloudgw keeps a session and assigns it to each new connection based on the session cookie. HTTPS communication therefore keeps working even when individual connections are closed. The same should apply on the agent side.

- Open the IntroscopeAgent.profile file (located in the "Agent_home\config" directory)
- Change the agentManager.url.1 property to use HTTPS instead of WSS, for example from:
agentManager.url.1=wss://apmservices-xxx:443
to
agentManager.url.1=https://apmservices-xxx:443
- Save the IntroscopeAgent.profile
- Restart the Agent
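
When many agents need the same change, the edit in the steps above can be scripted. The following is a minimal sketch, not a supported tool: the function names and the `.bak` backup suffix are hypothetical, and it assumes the property is written on a single line in the form shown above. It only rewrites uncommented `agentManager.url.N` lines that start with `wss://`; the Agent still has to be restarted afterwards.

```python
import re
import shutil
from pathlib import Path

# Matches uncommented lines such as "agentManager.url.1=wss://host:443".
_WSS_URL = re.compile(
    r"^([ \t]*agentManager\.url\.\d+[ \t]*=[ \t]*)wss://", re.MULTILINE
)


def switch_to_https(profile_text):
    """Return the profile text with agentManager.url.N changed from wss:// to https://."""
    return _WSS_URL.sub(r"\1https://", profile_text)


def update_profile(path):
    """Rewrite the profile in place, keeping a .bak copy; return True if it changed."""
    profile = Path(path)
    # latin-1 maps every byte to one character, so the file round-trips unchanged
    # apart from the edited lines (line endings are preserved as well).
    original = profile.read_bytes().decode("latin-1")
    updated = switch_to_https(original)
    if updated == original:
        return False
    shutil.copy2(profile, profile.with_name(profile.name + ".bak"))
    profile.write_bytes(updated.encode("latin-1"))
    return True
```

For example, `update_profile(r"Agent_home\config\IntroscopeAgent.profile")` would apply the change to one agent, where the path is a placeholder for the actual agent installation directory.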

b) Fix the service/pod restarts. Once the services and pods stop restarting, the disconnections should no longer occur.