A Red Hat patch results in cluster slowness and connection issues. The logs show the following:
7/29/19 06:15:16.708 AM EDT [WARN] [PO:main Mailman 6] [Manager] Waited 2000 ms But did not receive the response for the message com.wily.isengard.messageprimitives.service.MessageServiceCallMessage: {com.wily.introscope.spec.agent.beans.autotracing.IAutoTracingTriggerService.clearAllAutoTracingTriggers, v1, []} from address Server.main:64640 to service address Agent_62649.main:258 from thread PO:main Mailman 6 -- We will not wait any longer
7/29/19 06:15:16.708 AM EDT [ERROR] [PO:main Mailman 6] [Manager.BaselineEngine.AutoTrace] AgentThresholdDeliveryService: unable to get trigger service for agent: SuperDomain|lfoo33|Tomcat|BI4_App [state=Disconnected, ipAddress=10.10.10.10, socketType=default, okToDisconnect=true, okToUnmount=true, okToAutoUnmount=true, supportsShutoff=true, shutoff=false, supportsTransactionTracing=false, supportedTransactionTracingFilterTypes=(0,1,2,3,4,5,6,7, dynamicInstrumentationFlags=0]
com.wily.isengard.messageprimitives.TimeoutConnectionException: Service call to host {Unknown} timed out after 2000 ms com.wily.isengard.messageprimitives.service.MessageServiceCallMessage: {com.wily.introscope.spec.agent.beans.autotracing.IAutoTracingTriggerService.clearAllAutoTracingTriggers, v1, []} threadname PO:main Mailman 6
at com.wily.isengard.messageprimitives.service.MessageServiceClient.blockOnResponse(MessageServiceClient.java:282)
at com.wily.isengard.messageprimitives.service.MessageServiceClient.sendRequest(MessageServiceClient.java:163)
at com.wily.isengard.messageprimitives.service.MessageServiceClient.invoke(MessageServiceClient.java:356)
at com.sun.proxy.$Proxy228.clearAllAutoTracingTriggers(Unknown Source)
at com.ca.apm.baseline.thresholds.AgentThresholdDeliveryServiceImpl.agentAdded(AgentThresholdDeliveryServiceImpl.java:245)
at com.ca.apm.baseline.thresholds.AgentThresholdDeliveryServiceImpl$AutoTraceTriggerQuery.dataAdded(AgentThresholdDeliveryServiceImpl.java:669)
at com.wily.isengard.ongoingquery.AbstractQueryServiceManager$NotifyAdded.run(AbstractQueryServiceManager.java:370)
at com.wily.isengard.ongoingquery.QueryServiceManager2$1.execute(QueryServiceManager2.java:46)
at com.wily.isengard.ongoingquery.QueryServiceManager2.runNotification(QueryServiceManager2.java:85)
at com.wily.isengard.ongoingquery.AbstractQueryServiceManager.stateAdded(AbstractQueryServiceManager.java:201)
at com.wily.isengard.registry.server.RegistryService.addIfNotExists(RegistryService.java:110)
at com.wily.isengard.registry.server.RegistryService.addEntry(RegistryService.java:229)
at sun.reflect.GeneratedMethodAccessor535.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.wily.isengard.messageprimitives.MethodCallUtilities.callInterface(MethodCallUtilities.java:75)
at com.wily.isengard.messageprimitives.MethodCallUtilities.callInterface(MethodCallUtilities.java:29)
at com.wily.isengard.messageprimitives.service.MessageService.attemptMethodCall(MessageService.java:183)
at com.wily.isengard.messageprimitives.service.MessageService.handleMethodCallMessage(MessageService.java:135)
at com.wily.isengard.messageprimitives.service.MessageService.receiveMessage(MessageService.java:161)
at com.wily.isengard.postoffice.Mailbox.handleMessage(Mailbox.java:252)
at com.wily.isengard.postoffice.PostOffice.deliverInternal(PostOffice.java:531)
at com.wily.isengard.postoffice.PostOffice.access$2(PostOffice.java:477)
at com.wily.isengard.postoffice.PostOffice$DeliveryItem.run(PostOffice.java:868)
at com.wily.EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(PooledExecutor.java:728)
at java.lang.Thread.run(Thread.java:745)
7/29/19 06:15:31.597 AM EDT [WARN] [Dispatcher 1] [Manager] Outgoing message queue is not moving. Terminating connection: Node=Agent_62509, Address=gogol.foo.com/10.10.10.10:8176, Type=socket
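To gauge how often these symptoms occur, the log can be scanned for the three message fragments shown above. The following is a minimal sketch, not a supported tool: the label names and the `count_symptoms` function are hypothetical, and the log file path must be supplied by the caller because no default location is assumed here.

```python
import re
import sys
from collections import Counter

# Message fragments copied from the log excerpt above.
# The dictionary keys are illustrative labels, not product terminology.
SYMPTOMS = {
    "service_call_timeout": re.compile(
        r"Waited \d+ ms But did not receive the response"),
    "trigger_service_unavailable": re.compile(
        r"unable to get trigger service for agent"),
    "stalled_outgoing_queue": re.compile(
        r"Outgoing message queue is not moving"),
}


def count_symptoms(lines):
    """Count how many lines match each symptom pattern."""
    counts = Counter()
    for line in lines:
        for label, pattern in SYMPTOMS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python scan_log.py <path-to-log-file>
    with open(sys.argv[1], errors="replace") as handle:
        for label, total in count_symptoms(handle).items():
            print(f"{label}: {total}")
```

A steadily growing count of stalled outgoing queues under load would be consistent with the behavior described in this article, but it does not by itself confirm the cause.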
Any supported APM release running on Red Hat with RHSA-2019:1481 deployed.
The Red Hat patch affected TCP settings. The issue is a load-related timeout of fragmented TCP packets. See https://access.redhat.com/errata/RHSA-2019:1481 for details.
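One way to look for fragment reassembly trouble on a Linux host is to read the IP reassembly counters that the kernel exposes in /proc/net/snmp. The sketch below is an assumption about what is useful to inspect; this article and the Red Hat advisory do not name these counters as the diagnostic, and the function names are hypothetical.

```python
import os


def parse_ip_counters(snmp_text):
    """Parse the 'Ip:' header/value line pair of /proc/net/snmp-style text."""
    ip_lines = [line.split() for line in snmp_text.splitlines()
                if line.startswith("Ip:")]
    if len(ip_lines) < 2:
        return {}
    names, values = ip_lines[0][1:], ip_lines[1][1:]
    return {name: int(value) for name, value in zip(names, values)}


def reassembly_summary(counters):
    """Pick out the reassembly counters; missing ones are reported as None."""
    keys = ("ReasmReqds", "ReasmOKs", "ReasmFails")
    return {key: counters.get(key) for key in keys}


if __name__ == "__main__" and os.path.exists("/proc/net/snmp"):
    with open("/proc/net/snmp") as handle:
        print(reassembly_summary(parse_ip_counters(handle.read())))
```

Sampling the counters twice while the cluster is under load and comparing the ReasmFails values shows whether reassembly failures are increasing during the slowdown.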
Red Hat is still investigating the issue as of July 2019 and has provided workaround advice in https://access.redhat.com/solutions/4302501. Some of that advice consists of tuning parameters that may only partially mitigate the problem. At this time, the only known full solution is to revert the upgrade contained in kernel-3.10.0-957.21.3.el7 to an earlier version, as recommended in the Red Hat solution article.
Please contact Red Hat Support immediately for the most current workarounds and solutions for this issue.
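Before reverting, it can help to confirm that a host is actually running the kernel named above. This is a minimal sketch under stated assumptions: it compares the running kernel release with the version string from the package name kernel-3.10.0-957.21.3.el7, it assumes the release string may carry an architecture suffix such as .x86_64, and it says nothing about other kernel versions, which this article does not cover. The function name is hypothetical.

```python
import platform

# Version taken from the package name kernel-3.10.0-957.21.3.el7 in this article.
AFFECTED_KERNEL = "3.10.0-957.21.3.el7"


def is_affected_kernel(release, affected=AFFECTED_KERNEL):
    """Return True if the release string is the kernel named in this article.

    Assumption: the running release equals the version string, optionally
    followed by an architecture suffix (for example '.x86_64').
    """
    return release == affected or release.startswith(affected + ".")


if __name__ == "__main__":
    release = platform.release()  # same value that `uname -r` reports
    if is_affected_kernel(release):
        print(f"{release}: matches the kernel named in this article")
    else:
        print(f"{release}: does not match {AFFECTED_KERNEL}")
```

A non-matching result only means the host is not on that exact kernel build. Red Hat Support remains the authority on which versions are affected.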
If you have a Red Hat Linux subscription, please see https://access.redhat.com/solutions/4302501 for more details.