An APM environment is experiencing frequent agent disconnections when using an HTTP(S) tunneling connection to any port. These disconnections do not appear to be related to performance or load balancing.
A potential cause of this issue was introduced in Oracle JVM 1.8 Update 20 and is present in later updates. In that update, Oracle changed the concurrency classes used by Axis, which resulted in changes to the collection sorting implementation. As of this release, these classes require synchronization when they are invoked concurrently. (Before the release, Axis used the collection API in its web service implementation without synchronization.)
This Axis implementation can potentially trigger a concurrent modification exception. This, in turn, may produce a race condition that can cause HTTP/HTTPS disconnections under stress conditions. Within APM, this issue predominantly affects connections between agent and collector, but it can manifest itself in any connection that uses HTTP/HTTPS tunneling (for example, WebView).
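The failure mode described above can be illustrated outside of APM with a minimal sketch. It does not use Axis or any APM code; the class and method names are hypothetical. It assumes a JVM at 1.8 Update 20 or later, where Collections.sort on an ArrayList is seen by an open iterator as a modification of the list. On earlier updates the first method is expected to return false instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical demo class; not part of Axis or APM.
class SortDuringIterationDemo {

    // Sorts a list while an iterator over the same list is still open.
    // Returns true if the iterator reports a ConcurrentModificationException.
    static boolean sortDuringIterationFails() {
        List<Integer> shared = new ArrayList<>(Arrays.asList(3, 1, 2));
        try {
            for (Integer ignored : shared) {
                // No synchronization and no snapshot: the open iterator
                // notices the sort on its next step.
                Collections.sort(shared);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // One possible guard: take a snapshot under a lock, iterate over the
    // snapshot, and sort the shared list only while holding the same lock.
    static List<Integer> iterateOverSnapshot() {
        List<Integer> shared = new ArrayList<>(Arrays.asList(3, 1, 2));
        Object lock = new Object();
        List<Integer> snapshot;
        synchronized (lock) {
            snapshot = new ArrayList<>(shared);
        }
        List<Integer> seen = new ArrayList<>();
        for (Integer value : snapshot) {
            synchronized (lock) {
                Collections.sort(shared); // does not disturb the snapshot iterator
            }
            seen.add(value);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("unguarded sort failed: " + sortDuringIterationFails());
        System.out.println("values seen via snapshot: " + iterateOverSnapshot());
    }
}
```

The sketch is single-threaded so that it behaves the same way on every run. In a live system the same exception would only surface intermittently, when one thread sorts while another iterates, which is consistent with disconnections that appear mainly under stress.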
Currently, this issue can be worked around by downgrading to Oracle JVM 1.8 Update 11. (Note: Any version below Update 20, which does not include the Axis update, will work.) In a future APM release, CA will also update the APM code with a version of the Axis classes that resolves these issues.
This issue was detected in an APM 10.2 environment, but it may impact all APM 10.x releases.
Download JVM 1.8 Update 11 from Oracle, then update the EM startup script to use the JRE from Oracle JVM 1.8 Update 11.
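After the startup script has been changed, it can be useful to confirm which JRE is actually being picked up. The following is a minimal sketch of such a check, not a CA-provided tool. The class and method names are hypothetical, and it assumes the java.version system property uses the usual Java 8 form, such as 1.8.0_11.

```java
// Hypothetical helper; not part of APM.
class JvmUpdateCheck {

    // Returns the update number from a Java 8 version string such as
    // "1.8.0_11", 0 for a plain "1.8.0", or -1 if the string is not a
    // Java 8 version in that format.
    static int java8Update(String version) {
        if (version == null || !version.startsWith("1.8.0")) {
            return -1;
        }
        int underscore = version.indexOf('_');
        if (underscore < 0) {
            return 0; // no update suffix present
        }
        String rest = version.substring(underscore + 1);
        int end = 0;
        // Read leading digits only, so a suffix such as "-b12" is ignored.
        while (end < rest.length() && Character.isDigit(rest.charAt(end))) {
            end++;
        }
        if (end == 0) {
            return -1;
        }
        return Integer.parseInt(rest.substring(0, end));
    }

    // True only for Java 8 builds below Update 20, the range that the
    // workaround targets.
    static boolean belowUpdate20(String version) {
        int update = java8Update(version);
        return update >= 0 && update < 20;
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        String verdict = belowUpdate20(version) ? "is below" : "is not below";
        System.out.println(version + " " + verdict + " Java 8 Update 20");
    }
}
```

Running the class with the java executable from the JRE that the EM startup script points to shows whether that JRE falls in the range the workaround calls for.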