APM Introscope Java Agent - Troubleshooting and Best Practices

Article ID: 145354

Products

CA Application Performance Management (APM / Wily / Introscope)

Issue/Introduction

The following is a high-level list of techniques and suggestions for troubleshooting these common Introscope Java Agent performance and configuration issues:

- Java Agent Installation problems
- Instrumentation not working
- Agent overhead - high CPU and memory consumption
- Application slow response time and/or not working

Environment

- APM Introscope EM 10.x

- APM Agents 10.x, 11.x, Agents from SaaS platform

Resolution


Agent Metrics not visible in Investigator

Suggestions:

1. Check the agent clamp. In the Metric Browser, expand the branch:

 Custom Metric Host (virtual)
   - Custom Metric Process (virtual)
      - Custom Metric Agent (virtual)([email protected])(SuperDomain)
         - Agents
            - Host
               - Process
                   - AgentName

Look at the value of the "is Clamped" metric; it should be 0.

2. If you are using an Agent package from SaaS to connect to Introscope EM 10.7.x, update the IntroscopeAgent profile as follows:

a) Unset agentManager.credential by leaving its value empty:
agentManager.credential=

b) Add the following hidden property:
introscope.agent.connection.compatibility.version=10.7

For more information refer to: https://ca-broadcom.wolkenservicedesk.com/external/article?articleId=142917
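
The two profile changes in step 2 can be checked mechanically before restarting the agent. The following is a minimal sketch, not part of the product: `parse_profile` and `check_saas_compat` are hypothetical helper names, and the parser only handles simple `key=value` lines (no line continuations).

```python
def parse_profile(text):
    """Parse simple key=value lines; ignore blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_saas_compat(text):
    """Return a list of problems for a SaaS agent connecting to EM 10.7.x."""
    props = parse_profile(text)
    problems = []
    # Step a): the credential must be present but empty (or absent).
    if props.get("agentManager.credential", "") != "":
        problems.append("agentManager.credential must be empty")
    # Step b): the hidden compatibility property must be set to 10.7.
    if props.get("introscope.agent.connection.compatibility.version") != "10.7":
        problems.append(
            "introscope.agent.connection.compatibility.version must be 10.7")
    return problems


sample = """\
agentManager.credential=
introscope.agent.connection.compatibility.version=10.7
"""
print(check_saas_compat(sample))
```

An empty list means both settings match the values given above.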




JVM Hangs, Crashes, High CPU, OutOfMemory

Suggestions:

1. If you are using the IBM J9 JVM, use AgentNoRedefNoRetrans.jar and IntroscopeAgent.NoRedef.profile instead of the default agent jar and profile.

2. In case of high CPU, manually disable the Platform Monitor:
Move all the libIntroscope<OS>Stats.* and introscope<OS>Stats.jar files that correspond to your platform to a directory outside the agent directory.
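
The file move in step 2 could be scripted along the following lines. This is a sketch under stated assumptions: `move_platform_monitor_files` is a hypothetical helper, the `<OS>` part of the file names is passed in by the caller (the value `LinuxIntelAmd64` in the comment is only an example), and the files are searched for anywhere under the agent directory because their exact location is not stated here.

```python
import shutil
from pathlib import Path


def move_platform_monitor_files(agent_home, backup_dir, os_name):
    """Move libIntroscope<OS>Stats.* and introscope<OS>Stats.jar out of agent_home.

    Returns the names of the files that were moved.
    """
    agent_home = Path(agent_home).resolve()
    backup_dir = Path(backup_dir).resolve()
    # The target must be outside the agent directory, otherwise the agent
    # could still find and load the Platform Monitor files.
    if backup_dir == agent_home or agent_home in backup_dir.parents:
        raise ValueError("backup_dir must be outside the agent directory")
    backup_dir.mkdir(parents=True, exist_ok=True)

    patterns = [f"libIntroscope{os_name}Stats.*", f"introscope{os_name}Stats.jar"]
    moved = []
    for pattern in patterns:
        # sorted() materializes the matches before any file is moved.
        for path in sorted(agent_home.rglob(pattern)):
            if path.is_file():
                shutil.move(str(path), str(backup_dir / path.name))
                moved.append(path.name)
    return moved


# Example call (paths and OS token are illustrative only):
# move_platform_monitor_files("/opt/wily", "/opt/wily-disabled", "LinuxIntelAmd64")
```

Stop the JVM before moving the files, and keep the backup directory so the Platform Monitor can be restored later.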

3. If WebSphere is in use and the JVM is unable to start, check whether the appserver log reports a NoClassDefFoundError such as the one below. If it does, restart the appserver with the JVM option -Xshareclasses:none

java.lang.NoClassDefFoundError: com.wily.introscope.agent.blame.VirtualStack$TransactionCache (initialization failure)
at java.lang.J9VMInternals.initialize(J9VMInternals.java:140)
at com.wily.introscope.agent.blame.VirtualStack$VirtualStackCursor.<init>(VirtualStack.java:694)
at com.wily.introscope.agent.blame.VirtualStack$1.initialValue(VirtualStack.java:723)
at com.wily.introscope.agent.blame.VirtualStack$1.initialValue(VirtualStack.java:721)

For more information refer to: https://ca-broadcom.wolkenservicedesk.com/external/article?articleId=93003
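
Before changing the JVM options in step 3, the appserver log can be scanned for the error shown above. This is a minimal sketch; `has_virtualstack_init_failure` is a hypothetical helper name, and locating the log file is left to the caller because it depends on the WebSphere installation.

```python
import re

# The exception class and the agent class named in the stack trace above.
VIRTUALSTACK_ERROR = re.compile(
    r"java\.lang\.NoClassDefFoundError: "
    r"com\.wily\.introscope\.agent\.blame\.VirtualStack\$TransactionCache"
)


def has_virtualstack_init_failure(log_text):
    """Return True if the log text contains the VirtualStack NoClassDefFoundError."""
    return VIRTUALSTACK_ERROR.search(log_text) is not None


sample = (
    "java.lang.NoClassDefFoundError: "
    "com.wily.introscope.agent.blame.VirtualStack$TransactionCache "
    "(initialization failure)"
)
print(has_virtualstack_init_failure(sample))
```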

4. If you are using Java 11, you must use the Agent packages from SaaS.

For more information refer to: https://ca-broadcom.wolkenservicedesk.com/external/article?articleId=129997

5. Find out whether the problem is related to the instrumentation.
Open the IntroscopeAgent profile, set introscope.autoprobe.enable=false, and restart the JVM.

If the problem no longer occurs, set introscope.autoprobe.enable=true, disable all the features below, and check whether the problem returns.
Then re-enable the features one by one to identify the root cause.
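
The one-at-a-time procedure in step 5 can be sketched as a simple loop. In the sketch below, `find_culprit` is a hypothetical helper and `problem_reproduces` is a stand-in for the manual work of editing the profile, restarting the JVM, and observing the application; the feature labels are illustrative names for the items that follow.

```python
def find_culprit(features, problem_reproduces):
    """Re-enable features one at a time, keeping earlier ones enabled.

    Returns the first feature whose re-enabling makes the problem reappear,
    or None if the problem never reappears or occurs with everything disabled.
    """
    enabled = []
    if problem_reproduces(list(enabled)):
        # The problem occurs even with every feature disabled, so none of
        # the listed features is the sole cause.
        return None
    for feature in features:
        enabled.append(feature)
        if problem_reproduces(list(enabled)):
            return feature
    return None


# Illustrative run: pretend the problem appears once socket tracing is on.
features = ["Smart Instrumentation", "JMX collection", "Socket tracing", "SQL metrics"]
print(find_culprit(features, lambda enabled: "Socket tracing" in enabled))
```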

a) Set introscope.agent.reduceAgentMemoryOverhead=true

b) Disable Smart Instrumentation:
introscope.agent.deep.entrypoint.enabled=false
introscope.agent.deep.instrumentation.enabled=false
introscope.agent.deep.trace.enabled=false
introscope.agent.deep.errorsnapshot.enable=false
introscope.agent.deep.automatic.trace.crossprocess.enabled=false
introscope.agent.automatic.backend.detection.enabled=false

IMPORTANT: delete the AutoPersist.pbd file from the config/hotdeploy folder.
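
Applying a group of property changes such as the Smart Instrumentation settings in b) can be scripted rather than edited by hand. The following sketch assumes the profile uses plain `key=value` lines; `apply_overrides` and `SMART_INSTRUMENTATION_OFF` are hypothetical names, and the property names and values are copied from the list above.

```python
SMART_INSTRUMENTATION_OFF = {
    "introscope.agent.deep.entrypoint.enabled": "false",
    "introscope.agent.deep.instrumentation.enabled": "false",
    "introscope.agent.deep.trace.enabled": "false",
    "introscope.agent.deep.errorsnapshot.enable": "false",
    "introscope.agent.deep.automatic.trace.crossprocess.enabled": "false",
    "introscope.agent.automatic.backend.detection.enabled": "false",
}


def apply_overrides(profile_text, overrides):
    """Rewrite every active key=value line for an overridden key; append missing keys."""
    seen = set()
    out = []
    for line in profile_text.splitlines():
        stripped = line.strip()
        key = stripped.partition("=")[0].strip()
        is_active = bool(stripped) and not stripped.startswith("#") and "=" in stripped
        if is_active and key in overrides:
            out.append(f"{key}={overrides[key]}")
            seen.add(key)
        else:
            out.append(line)  # comments, blanks, and unrelated properties
    for key, value in overrides.items():
        if key not in seen:
            out.append(f"{key}={value}")
    return "\n".join(out) + "\n"


profile = "# sample profile\nintroscope.agent.deep.trace.enabled=true\n"
print(apply_overrides(profile, SMART_INSTRUMENTATION_OFF))
```

Work on a copy of the profile and keep the original, so each group of settings can be reverted while narrowing down the root cause.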

c) The 10.7 Agent includes a new default-micro.pbl and IntroscopeAgentMicro.profile designed for appservers running on container platforms:

https://techdocs.broadcom.com/content/broadcom/techdocs/us/en/ca-enterprise-software/it-operations-management/application-performance-management/10-7/implementing-agents/java-agent/install-the-java-agent/configure-application-server-to-use-the-java-agent/microservices.html

Alternatively, you can manually apply the settings below:

introscope.agent.reduceAgentMemoryOverhead=true
introscope.agent.sqlagent.sql.turnoffmetrics=true
introscope.agent.transactiontracer.sampling.enabled=false
introscope.agent.remoteagentdynamicinstrumentation.enabled=true
introscope.agent.appmap.enabled=false
introscope.agent.bizRecording.enabled=false
introscope.agent.external.biz.enabled=false

d) If you are using Agent version 9.5 or earlier and you are experiencing very long delays on application startup, the cause might be the large number of classes loaded by the application.
To resolve the problem, set introscope.autoprobe.deepinheritance.enabled=false
Disabling deep inheritance falls back to the pre-9.0 behavior (only 2 levels of class inheritance).

In version 9.5.5 and later, the deep inheritance mechanism has been improved and shuts down automatically to prevent this issue.
Additional clamp properties have been added to better control this condition: introscope.autoprobe.deepinheritance.auto.turnoff.*
When deep inheritance has been turned off automatically, the agent log file contains entries such as:

[INFO] [IntroscopeAgent.DeepInheritanceHelper] Deep inheritance configuration: max total time spent (ms) 120000 ; max interval time spent (ms): 12000 ; requests in interval: 100
[INFO] [IntroscopeAgent.DeepInheritanceHelper] Deep inheritance is automatically turned off due to large total delay. total time spent (ms) 120001 ; total requests: 161027255 ; interval time spent (ms): 1 ; interval requests: 95
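
To confirm from a large agent log whether deep inheritance was turned off automatically, the entry above can be searched for and its totals extracted. This is a sketch based only on the sample entry shown here; `find_deep_inheritance_turnoffs` is a hypothetical helper, and the pattern may need adjusting if other agent versions format the message differently.

```python
import re

# Matches the "automatically turned off" entry and captures the two totals.
TURNED_OFF = re.compile(
    r"Deep inheritance is automatically turned off.*?"
    r"total time spent \(ms\):? (?P<total_ms>\d+) ; "
    r"total requests: (?P<total_requests>\d+)"
)


def find_deep_inheritance_turnoffs(log_text):
    """Return one dict per turn-off entry found in the agent log text."""
    events = []
    for line in log_text.splitlines():
        match = TURNED_OFF.search(line)
        if match:
            events.append({name: int(value)
                           for name, value in match.groupdict().items()})
    return events


sample = (
    "[INFO] [IntroscopeAgent.DeepInheritanceHelper] Deep inheritance is "
    "automatically turned off due to large total delay. total time spent (ms) "
    "120001 ; total requests: 161027255 ; interval time spent (ms): 1 ; "
    "interval requests: 95"
)
print(find_deep_inheritance_turnoffs(sample))
```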

e) Disable JMX collection: set introscope.agent.jmx.enable=false

f) Switch from <appserver>-full.pbl to <appserver>-typical.pbl in the introscope.autoprobe.directivesFile property of IntroscopeAgent.profile.
The full PBL provides greater visibility but incurs higher overhead.

g) Disable socket instrumentation:

- Turn off the socket tracers in the toggles-typical.pbd or toggles-full.pbd file by commenting out the lines below:

#TurnOn: ManagedSocketTracing
#TurnOn: SocketTracing
#TurnOn: NIOSocketTracing
#TurnOn: NIOSocketSummaryTracing
#TurnOn: NIOSelectorTracing
#TurnOn: NIODatagramTracing
#TurnOn: NIODatagramSummaryTracing

- Alternatively, reduce the socket clamp. Starting with Agent version 9.5.5, a new socket clamp applies to all of the socket types above (IOSockets, NIOSockets, NIODatagram, and Managed Sockets):
com.wily.introscope.agent.socket.clamp.level=100 (this property is not hot-deployable)
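
The first option in g), commenting out the socket `TurnOn` directives, could be automated as follows. This is a sketch: `turn_off_flags` and `SOCKET_FLAGS` are hypothetical names, the flag names are the ones listed above, and the function works on the text of a toggles PBD file that the caller has already read.

```python
import re

SOCKET_FLAGS = {
    "ManagedSocketTracing",
    "SocketTracing",
    "NIOSocketTracing",
    "NIOSocketSummaryTracing",
    "NIOSelectorTracing",
    "NIODatagramTracing",
    "NIODatagramSummaryTracing",
}

# An active (not yet commented) directive of the form "TurnOn: <Flag>".
TURN_ON = re.compile(r"\s*TurnOn:\s*(\S+)\s*$")


def turn_off_flags(pbd_text, flags):
    """Comment out active TurnOn directives whose flag is in `flags`."""
    out = []
    for line in pbd_text.splitlines():
        match = TURN_ON.match(line)
        if match and match.group(1) in flags:
            line = "#" + line
        out.append(line)
    return "\n".join(out) + "\n"


sample = "TurnOn: SocketTracing\nTurnOn: HTTPServletTracing\n#TurnOn: NIOSocketTracing\n"
print(turn_off_flags(sample, SOCKET_FLAGS))
```

Directives for other flags and lines that are already commented out are left untouched.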

h) If you notice a high number of SQL metrics, disable or reduce SQL instrumentation:

- Reduce the length of SQL statements. The default maximum length captured by the agent is 999. You can modify this by adding the following line to the IntroscopeAgent.profile file:
introscope.agent.sqlagent.sql.maxlength=

- Set the following properties:
introscope.agent.sqlagent.sql.turnoffmetrics=true
introscope.agent.sqlagent.sql.artonly=true
introscope.agent.sqlagent.sql.turnofftrace=true

- Turn off SQL statement metrics if the application generates large numbers of unique SQL statements, causing a metric explosion. When you turn off SQL statement metric reporting, the backend or top-level JDBC metrics are not lost:

Open the sqlagent.pbd file and locate the SQL statement trace directives. For example:
TraceOneMethodWithParametersIfFlagged: SQLAgentStatements executeQuery(Ljava/lang/String;)Ljava/sql/ResultSet; DbCommandTracer "Backends|{database}|SQL|{commandtype}|Query|{sql}"

Remove {sql} from the trace directives you want to turn off. For example:

TraceOneMethodWithParametersIfFlagged: SQLAgentStatements executeQuery(Ljava/lang/String;)Ljava/sql/ResultSet; DbCommandTracer "Backends|{database}|SQL|{commandtype}|Query"
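
Removing `{sql}` from many directives by hand is error-prone, so the edit described above can be sketched as a small text transformation. `strip_sql_segment` is a hypothetical helper; it assumes `{sql}` is the last segment of the quoted metric name, as in the example directive, and it only touches active `DbCommandTracer` lines.

```python
def strip_sql_segment(pbd_text):
    """Drop the trailing |{sql} segment from active DbCommandTracer directives."""
    out = []
    for line in pbd_text.splitlines():
        is_comment = line.lstrip().startswith("#")
        if not is_comment and "DbCommandTracer" in line:
            # '...|Query|{sql}"' becomes '...|Query"'
            line = line.replace('|{sql}"', '"')
        out.append(line)
    return "\n".join(out) + "\n"


directive = (
    "TraceOneMethodWithParametersIfFlagged: SQLAgentStatements "
    "executeQuery(Ljava/lang/String;)Ljava/sql/ResultSet; DbCommandTracer "
    '"Backends|{database}|SQL|{commandtype}|Query|{sql}"'
)
print(strip_sql_segment(directive))
```

Back up sqlagent.pbd before writing the result, so individual directives can be restored if per-statement metrics are needed again.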

i) Disable any custom PBD files.

This helps confirm that the issue was not introduced by a custom PBD.

j) Reduce or disable the Transaction Trace feature:

- Long periods of automatic collection of downstream traces, triggered by the tail filter, can cause large numbers of unwanted traces to be sent to the Enterprise Manager.
To disable propagation, set introscope.agent.transactiontracer.tailfilterPropagate.enable=false on the Agent.
You must restart the managed application.

- Turn off SOA boundary tracing. Depending on the implementation, it can generate too many traces, or traces that are too large, affecting both the Agent and the EM (crash/OOM).
Set com.wily.introscope.agent.transactiontrace.boundaryTracing.enable=false

- Alternatively, as a test, you can disable the entire Transaction Trace feature globally.
Set com.wily.introscope.agent.blame.transaction.doTransactionTrace=false
This is a hot property, so there is no need to restart the appserver. It is valid for both .NET and Java agents.

k) Disable the BrowserAgent extension:

introscope.agent.browseragent.autoInjectionEnabled=false
introscope.agent.browseragent.responseCookieEnabled=false
- Remove browseragent.pbd from introscope.autoprobe.directivesFile, or comment it out in <appserver>-typical.pbl or <appserver>-full.pbl
- Move BrowserAgentExt.jar out of the /core/ext directory

l) Disable header insertion.
SOAP header insertion is enabled by default; however, some applications can be impacted by it. In that case, you can disable header insertion as below:

com.wily.introscope.agent.soapheaderread.enabled=false
com.wily.introscope.agent.soapheaderinsertion.enabled=false
com.wily.introscope.agent.httpheaderread.enabled=false
com.wily.introscope.agent.httpheaderinsertion.enabled=false

- If the application then works as expected, you can try re-enabling only HTTP header insertion:

com.wily.introscope.agent.httpheaderread.enabled=true
com.wily.introscope.agent.httpheaderinsertion.enabled=true

- If the above does not disable header insertion, the correlation ID was likely inserted by a correlation tracer that does not honor the agent properties for disabling SOAP/HTTP header insertion. In that case, try the additional manual changes below:

NOTE: make sure to back up the mentioned PBD files first.

Option 1) Open the <agent-home>/core/config/httpcorrelation.pbd

Comment out "TurnOn: HTTPCorrelation"

Option 2) Open the <agent-home>/core/config/apm-common.pbd

Comment out the lines below (shown here already commented out):

#TraceComplexMethodsIfFlagged: SessionBeanTracing CorPreTracer "CorId"
#TraceComplexMethodsIfFlagged: SessionBean3Tracing CorPreTracer "CorId"
#TraceComplexMethodsIfFlagged: EntityBeanTracing CorPreTracer "CorId"
#TraceComplexMethodsIfFlagged: EntityBean3Tracing CorPreTracer "CorId"
#TraceComplexMethodsIfFlagged: MessageDrivenBeanTracing CorPreTracer "CorId"
#TraceComplexMethodsIfFlagged: MessageDrivenBean3Tracing CorPreTracer "CorId"

#TraceOneMethodIfFlagged: HttpServletTypeSpecificServiceTracing service(Ljavax/servlet/http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;)V CorPreTracer "CorId"
#TraceOneMethodIfFlagged: HTTPServletTracing service(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;)V CorPreTracer "CorId"

Option 3) If the correlation ID is still being inserted in the header, then open <agent-home>/core/config/j2ee.pbd

Comment out the lines below (shown here already commented out):

#TraceComplexMethodsIfFlagged: EJB2StubTracing CorTracer "CorId"
#TraceComplexMethodsIfFlagged: EJB3StubTracing CorTracer "CorId"
#TraceComplexMethodsIfFlagged: SessionBeanTracing CorTracer "CorId"
#TraceComplexMethodsIfFlagged: SessionBean3Tracing CorTracer "CorId"
#TraceComplexMethodsIfFlagged: EntityBeanTracing CorTracer "CorId"
#TraceComplexMethodsIfFlagged: EntityBean3Tracing CorTracer "CorId"
#TraceComplexMethodsIfFlagged: MessageDrivenBeanTracing CorTracer "CorId"
#TraceComplexMethodsIfFlagged: MessageDrivenBean3Tracing CorTracer "CorId"

6. (Recommended) If possible, upgrade to the latest Agent version and hotfixes.

For more information refer to: https://ca-broadcom.wolkenservicedesk.com/external/article?articleId=136720


What to collect if the problem persists

1. Zipped content of AGENT_HOME/logs (DEBUG logs are required)
2. IntroscopeAgent.profile
3. For OOM/high CPU situations, generate a series of 5 thread dumps on the application server, spaced 5-10 seconds apart.
4. Appserver logs
5. App server config or startup script files.
6. Core dump, if applicable.
7. In case of OOM, collect a heap dump. An additional JVM switch is required for this.
For the Sun JVM, add the following JVM switch: -XX:+HeapDumpOnOutOfMemoryError
8. Enable the GC log. Additional JVM switches are required for this.
For the Sun JVM, add the following JVM switches: -Xloggc:<filename>.log -XX:+PrintGCDetails
9. If the problem is related to class loading, enable verbose class loading by adding this Java option: -verbose:class
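
The series of thread dumps requested in item 3 can be collected with a short script. The sketch below assumes a HotSpot-based JDK with the `jstack` tool on the PATH; on IBM J9 a different mechanism is typically used to produce a javacore, so treat this as illustrative only. `collect_thread_dumps` is a hypothetical helper, and the `run` and `sleep` parameters exist so the function can be exercised without a live JVM.

```python
import subprocess
import time


def collect_thread_dumps(pid, count=5, interval_s=5,
                         run=subprocess.run, sleep=time.sleep):
    """Run `jstack <pid>` `count` times, `interval_s` seconds apart.

    Returns the captured dump texts in the order they were taken.
    """
    dumps = []
    for i in range(count):
        result = run(["jstack", str(pid)],
                     capture_output=True, text=True, check=True)
        dumps.append(result.stdout)
        if i < count - 1:  # no need to wait after the last dump
            sleep(interval_s)
    return dumps


# Example call (the PID is illustrative only):
# for n, dump in enumerate(collect_thread_dumps(12345), start=1):
#     with open(f"threaddump_{n}.txt", "w") as fh:
#         fh.write(dump)
```

Take the dumps while the high CPU or memory condition is actually occurring; dumps taken after the JVM has recovered are of little diagnostic value.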

Additional Information

https://techdocs.broadcom.com/content/broadcom/techdocs/us/en/ca-enterprise-software/it-operations-management/application-performance-management/10-7/troubleshooting/agent-troubleshooting.html