Webview CPU spike causing issues with users using APM


Article ID: 205641



Products

CA Application Performance Management Agent (APM / Wily / Introscope); CA Application Performance Management (APM / Wily / Introscope); INTROSCOPE; DX Application Performance Management

Issue/Introduction

Users of APM have recently started complaining that the tool hangs and fails to display metric data. A very large CPU spike in the Webview Java process occurs at the same time, and the following errors appear in the Webview logs during the spike. What is happening in APM that causes this error?

[ERROR] [WebServer.AAsyncMessagePipeEndpoint] Caught exception while handling stream message. java.lang.NullPointerException
[ERROR] [WebServer.AAsyncMessagePipeEndpoint] Caught exception while handling stream message. java.lang.NullPointerException
[ERROR] [WebServer.AAsyncMessagePipeEndpoint] Caught exception while handling stream message. java.lang.NullPointerException

Environment

Release : 10.7.0

Component : APM Agents

Resolution

The logs confirmed that this Collector was being disconnected from the MOM every 5 minutes:

[INFO] [PO:main Mailman 4] [Manager.SessionBean] MOM Introscope Enterprise Manager connected: Node=Workstation_96, Address=xxx, Type=socket
[INFO] [PO:main Mailman 8] [Manager.SessionBean] MOM Introscope Enterprise Manager connected: Node=Workstation_97, Address=xxx, Type=socket
[INFO] [PO:main Mailman 6] [Manager.SessionBean] MOM Introscope Enterprise Manager connected: Node=Workstation_98, Address=xxx, Type=socket
...

 

The MOM log shows that it is disconnecting the Collector because of clock skew:

[WARN] [Collector [email protected]] [Manager.Cluster] Collector clock is too far skewed from MOM. Collector clock is skewed from MOM clock by 3,005 ms. The maximum allowed skew is 3,000 ms. Please change the system clock on the collector EM.

[WARN] [Collector [email protected]] [Manager.Cluster] Collector clock is too far skewed from MOM. Collector clock is skewed from MOM clock by 3,016 ms. The maximum allowed skew is 3,000 ms. Please change the system clock on the collector EM.
[WARN] [Collector [email protected]] [Manager.Cluster] Collector clock is too far skewed from MOM. Collector clock is skewed from MOM clock by 3,029 ms. The maximum allowed skew is 3,000 ms. Please change the system clock on the collector EM.
...
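To see how far over the limit the Collector is drifting, the skew values can be pulled out of the MOM log with a short script. The following is a minimal sketch in Python, assuming the warning text matches the lines quoted above. The function name `parse_skew_warnings` is illustrative and not part of APM, and the sample lines are abbreviated copies of the warnings above.

```python
import re

# Pattern derived from the MOM warning text quoted above.
# The millisecond values use a comma as the thousands separator.
SKEW_RE = re.compile(
    r"Collector clock is skewed from MOM clock by (-?[\d,]+) ms\. "
    r"The maximum allowed skew is ([\d,]+) ms\."
)


def parse_skew_warnings(lines):
    """Return a list of (skew_ms, max_allowed_ms) for each skew warning found."""
    results = []
    for line in lines:
        match = SKEW_RE.search(line)
        if match:
            skew_ms = int(match.group(1).replace(",", ""))
            limit_ms = int(match.group(2).replace(",", ""))
            results.append((skew_ms, limit_ms))
    return results


# Abbreviated sample lines (logger prefix shortened for illustration).
sample = [
    "[WARN] [Manager.Cluster] Collector clock is too far skewed from MOM. "
    "Collector clock is skewed from MOM clock by 3,005 ms. "
    "The maximum allowed skew is 3,000 ms.",
    "[WARN] [Manager.Cluster] Collector clock is too far skewed from MOM. "
    "Collector clock is skewed from MOM clock by 3,016 ms. "
    "The maximum allowed skew is 3,000 ms.",
    "[INFO] [Manager.SessionBean] unrelated line",
    "[WARN] [Manager.Cluster] Collector clock is too far skewed from MOM. "
    "Collector clock is skewed from MOM clock by 3,029 ms. "
    "The maximum allowed skew is 3,000 ms.",
]

for skew_ms, limit_ms in parse_skew_warnings(sample):
    print(f"skew={skew_ms} ms, limit={limit_ms} ms, over by {skew_ms - limit_ms} ms")
```

In the excerpt above the skew is only a few milliseconds over the 3,000 ms limit and grows with each warning, which is consistent with a clock that is slowly drifting rather than one that was set incorrectly.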

 

In short, the MOM keeps disconnecting the Collector and then trying to sync the clock again.

 

Because this occurs at the same time every night (from around 6 or 7 pm until 10 pm), the likely cause is that the VM was busy with something that affected the system clock. Check whether the Collector VM has a clock synchronization mechanism properly in place and whether any maintenance tasks are scheduled around that time.

Additional Information

/usr/sbin/ntpd was found not to be running on the Collector, so the customer restarted the service. After that:

The Webview Java process showed no CPU spikes the following night, and no clock skew issues occurred on the Collector after the ntpd service was restarted.
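A check for a stopped ntpd, like the one that exposed this problem, can be scripted so it is caught earlier. The following is a minimal sketch in Python, assuming a Linux Collector host where `/proc/<pid>/comm` is available. The function names and the `proc_root` parameter are illustrative, and the process name `ntpd` is taken from the path above.

```python
import os


def running_process_names(proc_root="/proc"):
    """Collect the command names of running processes from /proc/<pid>/comm."""
    names = set()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are process IDs
        try:
            with open(os.path.join(proc_root, entry, "comm")) as fh:
                # The kernel truncates comm to 15 characters; "ntpd" fits.
                names.add(fh.read().strip())
        except OSError:
            continue  # process exited or is not readable
    return names


def is_running(name, proc_root="/proc"):
    """Return True if a process with exactly this command name is running."""
    return name in running_process_names(proc_root)


if os.path.isdir("/proc"):
    state = "running" if is_running("ntpd") else "NOT running"
    print(f"ntpd is {state}")
```

A running ntpd does not by itself prove that the clock is synchronized, so a check like this complements the MOM clock skew warnings rather than replacing them.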