Collectors Dropping All Agents Irregularly

Article ID: 105772


Products

CA Application Performance Management Agent (APM / Wily / Introscope) INTROSCOPE

Issue/Introduction

Irregularly, but roughly once a week, one of the Collectors loses all of its agents. It is not the same Collector that drops agents every time.

Environment

CA APM 10.5.2

Cause

Log analysis shows errors related to 'nio' in the stack traces of the outgoing delivery threads:

6/20/18 08:41:27.899 PM MDT [DEBUG] [Outgoing Delivery 1] [Manager.OutgoingMessageDeliveryNioTask] Caught exception writing to connected hub
java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
at com.wily.isengard.postofficehub.link.v1.server.ByteBufferOutputStream.writeToChannel(ByteBufferOutputStream.java:229)
at com.wily.isengard.postofficehub.link.v1.server.ByteBufferOutputStream.writeTo(ByteBufferOutputStream.java:158)
at com.wily.isengard.postofficehub.link.v1.IsengardObjectOutputStream.writeToDataOutput(IsengardObjectOutputStream.java:532)
at com.wily.isengard.postofficehub.link.v1.server.OutgoingMessageDeliveryNioTask.writeToDataOutput(OutgoingMessageDeliveryNioTask.java:255)
at com.wily.isengard.postofficehub.link.v1.server.OutgoingMessageDeliveryNioTask.deliverNextMessageInternal(OutgoingMessageDeliveryNioTask.java:169)
at com.wily.isengard.postofficehub.link.v1.server.OutgoingMessageDeliveryNioTask.deliverNextMessage(OutgoingMessageDeliveryNioTask.java:119)
at com.wily.isengard.postofficehub.link.v1.server.OutgoingMessageDelivererNio.run(OutgoingMessageDelivererNio.java:138)
at com.wily.util.concurrent.SetExecutor.doWork(SetExecutor.java:224)
at com.wily.util.concurrent.SetExecutor.access$0(SetExecutor.java:178)
at com.wily.util.concurrent.SetExecutor$WorkerRequest.run(SetExecutor.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Resolution

Please try disabling 'nio' on all the Collectors and the MOM, then restart the cluster.
To disable 'nio', add the following property to IntroscopeEnterpriseManager.properties:

transport.enable.nio=false

Disabling NIO switches the EM back to the classic (blocking) socket operations; there is no loss of functionality.
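
For reference, a minimal sketch of how the entry might look in IntroscopeEnterpriseManager.properties (typically located under the EM installation's config directory; the file location and the comment wording are illustrative assumptions, not taken from product documentation):

# Illustrative entry: disable Java NIO for the EM transport so it falls
# back to classic blocking socket I/O (assumed placement; property order
# in a Java properties file does not matter)
transport.enable.nio=false

Make the same change on the MOM and on every Collector before restarting the cluster.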

A similar issue was fixed in 10.5.2 HF#9 (DE248777).