CA Application Performance Management Agent (APM / Wily / Introscope)
Issue/Introduction
Irregularly, roughly once a week, one of the Collectors loses all of its agents. It is not the same Collector that drops agents each time.
Environment
CA APM 10.5.2
Cause
Log analysis shows that the errors relate to NIO in the outgoing delivery threads' stack traces:
6/20/18 08:41:27.899 PM MDT [DEBUG] [Outgoing Delivery 1] [Manager.OutgoingMessageDeliveryNioTask] Caught exception writing to connected hub
java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
    at sun.nio.ch.IOUtil.write(IOUtil.java:65)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
    at com.wily.isengard.postofficehub.link.v1.server.ByteBufferOutputStream.writeToChannel(ByteBufferOutputStream.java:229)
    at com.wily.isengard.postofficehub.link.v1.server.ByteBufferOutputStream.writeTo(ByteBufferOutputStream.java:158)
    at com.wily.isengard.postofficehub.link.v1.IsengardObjectOutputStream.writeToDataOutput(IsengardObjectOutputStream.java:532)
    at com.wily.isengard.postofficehub.link.v1.server.OutgoingMessageDeliveryNioTask.writeToDataOutput(OutgoingMessageDeliveryNioTask.java:255)
    at com.wily.isengard.postofficehub.link.v1.server.OutgoingMessageDeliveryNioTask.deliverNextMessageInternal(OutgoingMessageDeliveryNioTask.java:169)
    at com.wily.isengard.postofficehub.link.v1.server.OutgoingMessageDeliveryNioTask.deliverNextMessage(OutgoingMessageDeliveryNioTask.java:119)
    at com.wily.isengard.postofficehub.link.v1.server.OutgoingMessageDelivererNio.run(OutgoingMessageDelivererNio.java:138)
    at com.wily.util.concurrent.SetExecutor.doWork(SetExecutor.java:224)
    at com.wily.util.concurrent.SetExecutor.access$0(SetExecutor.java:178)
    at com.wily.util.concurrent.SetExecutor$WorkerRequest.run(SetExecutor.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
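The "Broken pipe" here is not specific to APM: any NIO SocketChannel.write() fails this way once the remote end has dropped the connection. The following standalone sketch (illustrative only, not CA code; the class name, port choice, and buffer size are arbitrary) reproduces the same exception by writing to a peer that closes immediately. The exact message is OS-dependent; on Linux/Unix it is typically "Broken pipe", as in the EM log above.

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class BrokenPipeDemo {
    public static void main(String[] args) throws Exception {
        // A throwaway server that accepts one connection and closes it
        // at once, standing in for the remote side that went away.
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("localhost", 0));
        int port = ((InetSocketAddress) server.getLocalAddress()).getPort();
        new Thread(() -> {
            try {
                server.accept().close(); // peer disconnects immediately
            } catch (IOException ignored) {
            }
        }).start();

        SocketChannel client = SocketChannel.open(new InetSocketAddress("localhost", port));
        ByteBuffer buf = ByteBuffer.allocate(64 * 1024);
        try {
            // The first writes may still succeed (the OS buffers them), but
            // once the kernel sees the peer's reset, write() throws
            // java.io.IOException: Broken pipe -- the same error as above.
            while (true) {
                buf.clear();
                client.write(buf);
            }
        } catch (IOException e) {
            System.out.println("write failed: " + e);
        }
    }
}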
Resolution
Please try disabling NIO on the MOM and all Collectors, then restart the cluster. To disable NIO, add the following property to IntroscopeEnterpriseManager.properties on each Enterprise Manager:
transport.enable.nio=false
Disabling NIO switches the transport back to the classic blocking socket operations; there is no loss of functionality.
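For context, here is a minimal sketch (illustrative only, not the Enterprise Manager's actual transport code; both method names are made up) of what the switch means: with NIO enabled, writes go through a SocketChannel, where a non-blocking write may be partial and needs retry bookkeeping, while the classic path writes through a blocking java.net.Socket stream that either completes or throws.

import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

public class TransportSketch {
    // NIO-style delivery: a non-blocking write may send only part of the
    // buffer, so the caller loops until it drains (real code would park
    // the channel on a Selector instead of spinning).
    static void nioWrite(SocketChannel channel, ByteBuffer buf) throws IOException {
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
    }

    // Classic blocking delivery: the stream write blocks until the whole
    // array is handed to the OS, or throws an IOException.
    static void classicWrite(Socket socket, byte[] data) throws IOException {
        OutputStream out = socket.getOutputStream();
        out.write(data);
        out.flush();
    }
}

Either path fails the same way once the peer disconnects; the difference is only in how writes are scheduled, which is why turning NIO off does not cost any functionality.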
A similar issue was fixed in 10.5.2 HF#9 (DE248777).