The following is a high-level list of techniques and suggestions for troubleshooting the common Introscope Java Agent performance and configuration issues listed below:
- Java Agent Installation problems
- Instrumentation not working
- Agent overhead - high CPU and memory consumption
- Application slow response time and/or not working
-DX APM SaaS
-DX APM On-premise 10.x, 2x
SUGGESTIONS:
1. Check whether a clamp has been reached on the Agent or EM side:
a) To check the EM clamps, open the Metric Browser and expand the branch:
Custom Metric Host (virtual) | Custom Metric Process (virtual) | Custom Metric Agent (virtual)(collector_host@port)(SuperDomain) | Enterprise manager | Connections
Look at the values for:
- "EM Historical Metric Clamped"
- "EM Live Metric Clamped"
The above metrics should all be 0.
b) To check the Agent clamp, expand the branch:
Custom Metric Host (virtual) | Custom Metric Process (virtual) | Custom Metric Agent (virtual)(collector_host@port)(SuperDomain) | Agents | Host | Process | <AgentName>
Look at the value of the "is Clamped" metric. It should be 0.
Recommendation:
For SaaS: contact Broadcom Support for assistance with increasing the SaaS APM EM clamps.
For 10.8:
- Open EM_HOME\config\apm-events-thresholds-config.xml
- Increase the clamps below as needed:
introscope.enterprisemanager.metrics.historical.limit
introscope.enterprisemanager.metrics.live.limit
These are hot properties; there is no need to restart the EM.
NOTE: For more troubleshooting on the APM EM server side, refer to https://knowledge.broadcom.com/external/article/93176/apm-introscope-enterprise-manager-troub.html
2. Are you using a SaaS Agent to connect to Introscope EM 10.8?
If so, update the IntroscopeAgent profile as follows:
a) Unset agentManager.credential:
agentManager.credential=
b) Add the hidden property:
introscope.agent.connection.compatibility.version=10.7
For more information refer to: https://ca-broadcom.wolkenservicedesk.com/external/article?articleId=142917
3. Check the Agent log and confirm that the Agent is able to connect to the EM.
Below is an example of an Agent log entry indicating that the Agent is not allowed to connect to the 10.8 EM (note Active = "false" at the end of the entry):
[INFO] [IntroscopeAgent.IsengardServerConnectionManager] Connected controllable Agent to the Introscope Enterprise Manager at <your-EM>:5001,com.wily.isengard.postofficehub.link.net.DefaultSocketFactory. Host = "WIN", Process = "WAS9", Agent Name = "Weblogic", Active = "false".
4. If you are not able to see all the JMX metrics, the JMX clamp may have been reached. Look for an entry such as the following in the Agent log:
[WARN] [IntroscopeAgent.Clamp] JMX metric clamp has been hit, no additional JMX metrics will be reported
Recommendation: Increase the value of introscope.agent.remotejmx.clamp
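As a rough aid for spotting clamp messages like the one above, the sketch below scans Agent log lines for WARN entries that mention a clamp. The function name and the regular expression are illustrative assumptions derived from the log samples in this article; they are not part of the product.

```python
import re

# Matches WARN entries whose message mentions "clamp" (case-insensitive).
# The pattern is an assumption based on the log samples in this article.
CLAMP_WARNING = re.compile(
    r"\[WARN\]\s+\[(?P<logger>[^\]]+)\]\s+(?P<message>.*clamp.*)",
    re.IGNORECASE,
)


def find_clamp_warnings(lines):
    """Return (logger, message) tuples for every WARN line mentioning a clamp."""
    hits = []
    for line in lines:
        match = CLAMP_WARNING.search(line)
        if match:
            hits.append((match.group("logger"), match.group("message").strip()))
    return hits


if __name__ == "__main__":
    sample = [
        "[INFO] [IntroscopeAgent.Agent] PMI data collection activated",
        "[WARN] [IntroscopeAgent.Clamp] JMX metric clamp has been hit, "
        "no additional JMX metrics will be reported",
    ]
    for logger, message in find_clamp_warnings(sample):
        print(f"{logger}: {message}")
```

In practice the lines would come from the Agent log file, for example by passing an open file object to find_clamp_warnings.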
SUGGESTIONS:
1. If you are using the IBM J9 JVM, it is recommended to use AgentNoRedefNoRetrans.jar and IntroscopeAgent.NoRedef.profile instead.
2. If the WebSphere application server is in use and the JVM is unable to start up:
Check whether the appserver log reports a NoClassDefFoundError such as the one below. If so, restart the appserver with the JVM option -Xshareclasses:none
java.lang.NoClassDefFoundError: com.wily.introscope.agent.blame.VirtualStack$TransactionCache (initialization failure)
at java.lang.J9VMInternals.initialize(J9VMInternals.java:140)
at com.wily.introscope.agent.blame.VirtualStack$VirtualStackCursor.<init>(VirtualStack.java:694)
at com.wily.introscope.agent.blame.VirtualStack$1.initialValue(VirtualStack.java:723)
at com.wily.introscope.agent.blame.VirtualStack$1.initialValue(VirtualStack.java:721)
For more information refer to: https://ca-broadcom.wolkenservicedesk.com/external/article?articleId=93003
3. Common errors:
a) [WARN] [IntroscopeAgent.DnsServiceExecutor] Failed to obtain host name in 200 ms. Consider changing property introscope.agent.dns.lookup.max.wait.in.milliseconds
Recommendation: Disable DNS lookup:
https://knowledge.broadcom.com/external/article/198498/java-agent-dns-lookup-may-be-causing-per.html
b) [WARN] [IntroscopeAgent.Agent] Absolute metric clamp of 100 reached, no more new SQL metrics can be created. All metrics will be reported under the default bucket.
Recommendation: Review https://knowledge.broadcom.com/external/article/195060/no-more-new-sql-metrics-can-be-created.html
Although you can increase the clamp introscope.agent.sqlagent.statement.clamp, this message indicates a high number of SQL metrics. Try to refine and reduce the SQL metrics collection; refer to the section below on how to disable or reduce SQL instrumentation in order to isolate the issue.
4. Find out whether the problem is related to the instrumentation.
Open the IntroscopeAgent profile and set introscope.autoprobe.enable=false. You need to restart the JVM.
If the problem no longer occurs, set introscope.autoprobe.enable=true, disable all of the features below, and check whether the problem persists.
Then enable the features one by one in order to identify the root cause:
a) Set introscope.agent.reduceAgentMemoryOverhead=true
b) Disable Smart Instrumentation:
introscope.agent.deep.entrypoint.enabled=false
introscope.agent.deep.instrumentation.enabled=false
introscope.agent.deep.trace.enabled=false
introscope.agent.deep.errorsnapshot.enable=false
introscope.agent.deep.automatic.trace.crossprocess.enabled=false
introscope.agent.automatic.backend.detection.enabled=false
IMPORTANT: Delete the AutoPersist.pbd file from the config/hotdeploy folder.
c) The 10.7 and 10.8 Agents include a new default-micro.pbl and IntroscopeAgentMicro.profile intended for appservers running on container platforms:
https://techdocs.broadcom.com/us/en/ca-enterprise-software/it-operations-management/application-performance-management/10-8.html
Alternatively, you can manually disable the features below:
introscope.agent.reduceAgentMemoryOverhead=true
introscope.agent.sqlagent.sql.turnoffmetrics=true
introscope.agent.transactiontracer.sampling.enabled=false
introscope.agent.remoteagentdynamicinstrumentation.enabled=true
introscope.agent.appmap.enabled=false
introscope.agent.bizRecording.enabled=false
introscope.agent.external.biz.enabled=false
d) If you are using Agent version 9.5 or earlier and you are experiencing very long delays on application startup, this might be due to the large number of classes loaded by the application.
To resolve the problem, set introscope.autoprobe.deepinheritance.enabled=false
Disabling deep inheritance falls back to the pre-9.0 behavior (only 2 levels of class inheritance).
In versions 9.5.5 and later, the deep inheritance mechanism has been improved and shuts down automatically to prevent this issue.
Additional clamp properties have been added to better control this condition: introscope.autoprobe.deepinheritance.auto.turnoff.*
When deep inheritance has been turned off automatically, the agent log file contains entries such as:
[INFO] [IntroscopeAgent.DeepInheritanceHelper] Deep inheritance configuration: max total time spent (ms) 120000 ; max interval time spent (ms): 12000 ; requests in interval: 100
[INFO] [IntroscopeAgent.DeepInheritanceHelper] Deep inheritance is automatically turned off due to large total delay. total time spent (ms) 120001 ; total requests: 161027255 ; interval time spent (ms): 1 ; interval requests: 95
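To check quickly whether an Agent log contains the automatic turn-off message shown above, something like the following sketch could be used. The pattern is derived only from the sample entry in this article, so other agent versions may word the message differently; the function name is hypothetical.

```python
import re

# Pattern derived from the sample log entry in this article (assumption):
# other agent versions may format the message differently.
TURNED_OFF = re.compile(
    r"Deep inheritance is automatically turned off.*?"
    r"total time spent \(ms\) (?P<total_ms>\d+)\s*;\s*"
    r"total requests: (?P<total_requests>\d+)"
)


def parse_deep_inheritance_turnoff(line):
    """Return total time and request count if the line reports an automatic
    deep inheritance turn-off, otherwise None."""
    match = TURNED_OFF.search(line)
    if match is None:
        return None
    return {
        "total_ms": int(match.group("total_ms")),
        "total_requests": int(match.group("total_requests")),
    }
```

A non-None result for any line means deep inheritance was turned off at least once during that JVM run.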
e) Disable JMX collection: set introscope.agent.jmx.enable=false
f) If you are using 10.x, switch from <appserver>-full.pbl to <appserver>-typical.pbl in the introscope.autoprobe.directivesFile property of the IntroscopeAgent.profile.
The full .pbl provides greater visibility but incurs higher overhead.
g) Disable socket instrumentation:
If you are using 10.x: Turn off the socket tracers in the toggles-typical.pbd or toggles-full.pbd file as below:
#TurnOn: ManagedSocketTracing
#TurnOn: SocketTracing
#TurnOn: NIOSocketTracing
#TurnOn: NIOSocketSummaryTracing
#TurnOn: NIOSelectorTracing
#TurnOn: NIODatagramTracing
#TurnOn: NIODatagramSummaryTracing
If you are using the APM SaaS Agent: Turn off the socket tracers in the acc-master-toggles.pbd file as below:
#TurnOn: ManagedSocketTracing
#TurnOn: SocketTracing
#TurnOn: NIOSocketTracing
#TurnOn: NIOSocketSummaryTracing
#TurnOn: NIOSelectorTracing
#TurnOn: NIODatagramTracing
#TurnOn: NIODatagramSummaryTracing
NOTE: This depends on the agent package. For example, if you are using the WebSphere NoRedef Agent, only TurnOn: ManagedSocketTracing will be available.
- Alternatively, reduce the socket clamp. Starting from Agent version 9.5.5, there is a new socket clamp that applies to all of the socket types above (IOSockets, NIOSockets, NIODatagrams and Managed Sockets):
com.wily.introscope.agent.socket.clamp.level=100 (this is not a hot property)
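Commenting out the socket tracer toggles by hand is straightforward, but for several agents it may be convenient to script it. The sketch below is one possible approach, not a Broadcom tool: it prefixes the TurnOn lines for the flags listed above with "#". The function name is hypothetical, and the pbd file should be backed up before any edited text is written back.

```python
# Socket tracer flags taken from the toggle lists in this article.
SOCKET_TRACER_FLAGS = frozenset({
    "ManagedSocketTracing",
    "SocketTracing",
    "NIOSocketTracing",
    "NIOSocketSummaryTracing",
    "NIOSelectorTracing",
    "NIODatagramTracing",
    "NIODatagramSummaryTracing",
})


def comment_out_turnon(pbd_text, flags=SOCKET_TRACER_FLAGS):
    """Prefix 'TurnOn: <flag>' lines with '#' for the given flags.

    Lines that are already commented out, or that turn on other flags,
    are returned unchanged.
    """
    out = []
    for line in pbd_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("TurnOn:"):
            flag = stripped[len("TurnOn:"):].strip()
            if flag in flags:
                line = "#" + line
        out.append(line)
    return "\n".join(out)
```

The function works on text only, so the caller decides which toggles file to read and whether to write the result back.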
h) If you notice a high number of SQL metrics, disable or reduce SQL instrumentation:
- Reduce the length of SQL statements. The default maximum length captured by the agent is 999. You can modify this by adding the following line to the IntroscopeAgent.profile file:
introscope.agent.sqlagent.sql.maxlength=
- Set the following properties:
introscope.agent.sqlagent.sql.turnoffmetrics=true
introscope.agent.sqlagent.sql.artonly=true
introscope.agent.sqlagent.sql.turnofftrace=true
- Turn off SQL statement metrics if the application generates large numbers of unique SQL statements, causing a metric explosion. When you turn off SQL statement metric reporting, the backend or top-level JDBC metrics are not lost:
Open the sqlagent.pbd file and locate the SQL statements. For example:
TraceOneMethodWithParametersIfFlagged: SQLAgentStatements executeQuery(Ljava/lang/String;)Ljava/sql/ResultSet; DbCommandTracer "Backends|{database}|SQL|{commandtype}|Query|{sql}"
Remove {sql} from the trace directives you want to turn off. For example:
TraceOneMethodWithParametersIfFlagged: SQLAgentStatements executeQuery(Ljava/lang/String;)Ljava/sql/ResultSet; DbCommandTracer "Backends|{database}|SQL|{commandtype}|Query"
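The same edit can be sketched in code. The helper below (a hypothetical name, not part of the agent) removes the trailing |{sql} segment from every active directive in a pbd text and leaves commented lines alone. It assumes that the metric name is the last quoted token on the line, as in the example above.

```python
SQL_SUFFIX = '|{sql}"'


def drop_sql_segment(pbd_text):
    """Remove the trailing '|{sql}' segment from active trace directives.

    Assumes the metric name is the last quoted token on the line, as in
    the sqlagent.pbd example in this article.
    """
    out = []
    for line in pbd_text.splitlines():
        active = not line.lstrip().startswith("#")
        if active and line.rstrip().endswith(SQL_SUFFIX):
            line = line.rstrip()[: -len(SQL_SUFFIX)] + '"'
        out.append(line)
    return "\n".join(out)
```

Applying it to the whole file turns off statement-level metrics for all such directives, so restrict the input to the directives you actually want to change.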
i) Disable any custom pbd.
This helps confirm that the issue has not been introduced by a custom pbd.
j) Reduce or disable the Transaction Trace feature:
- Long periods of automatic transaction trace collection, with the tail filter propagated to downstream agents, can cause large numbers of unwanted traces to be sent to the Enterprise Manager.
To disable propagation, set introscope.agent.transactiontracer.tailfilterPropagate.enable=false on the Agent. You must restart the managed application.
- Turn off SOA boundary tracing. Depending on the implementation, it can generate too many traces, or very large traces, affecting both the Agent and the EM (crash/OOM).
Set com.wily.introscope.agent.transactiontrace.boundaryTracing.enable=false
- Alternatively, as a test, you can disable the entire trace feature globally.
Set com.wily.introscope.agent.blame.transaction.doTransactionTrace=false
This is a hot property, so there is no need to restart the appserver. It is valid for both .NET and Java agents.
k) Disable the BrowserAgent extension:
introscope.agent.browseragent.autoInjectionEnabled=false
introscope.agent.browseragent.responseCookieEnabled=false
- Remove browseragent.pbd from introscope.autoprobe.directivesFile, or comment it out in <appserver>-typical.pbl or <appserver>-full.pbl
- Move BrowserAgentExt.jar out of the /core/ext directory
l) Disable header insertion.
SOAP header insertion is enabled by default. However, some applications can be affected by it. In that case, you can disable header insertion as below:
com.wily.introscope.agent.soapheaderread.enabled=false
com.wily.introscope.agent.soapheaderinsertion.enabled=false
com.wily.introscope.agent.httpheaderread.enabled=false
com.wily.introscope.agent.httpheaderinsertion.enabled=false
- If the application works as expected, you can try to enable http header insertion instead:
com.wily.introscope.agent.httpheaderread.enabled=true
com.wily.introscope.agent.httpheaderinsertion.enabled=true
- If the above does not disable header insertion, the correlation ID was likely inserted by a correlation tracer that does not support the agent properties for disabling SOAP/HTTP header insertion. In that case, you can try the additional manual changes below.
NOTE: Make sure to back up the mentioned pbd files first.
Option 1) Open <agent-home>/core/config/httpcorrelation.pbd
Comment out "TurnOn: HTTPCorrelation"
Option 2) Open <agent-home>/core/config/apm-common.pbd
Comment out the lines below:
#TraceComplexMethodsIfFlagged: SessionBeanTracing CorPreTracer "CorId"
#TraceComplexMethodsIfFlagged: SessionBean3Tracing CorPreTracer "CorId"
#TraceComplexMethodsIfFlagged: EntityBeanTracing CorPreTracer "CorId"
#TraceComplexMethodsIfFlagged: EntityBean3Tracing CorPreTracer "CorId"
#TraceComplexMethodsIfFlagged: MessageDrivenBeanTracing CorPreTracer "CorId"
#TraceComplexMethodsIfFlagged: MessageDrivenBean3Tracing CorPreTracer "CorId"
#TraceOneMethodIfFlagged: HttpServletTypeSpecificServiceTracing service(Ljavax/servlet/http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;)V CorPreTracer "CorId"
#TraceOneMethodIfFlagged: HTTPServletTracing service(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;)V CorPreTracer "CorId"
Option 3) If the correlation ID is still being inserted in the header, then open <agent-home>/core/config/j2ee.pbd
Comment out the below lines:
#TraceComplexMethodsIfFlagged: EJB2StubTracing CorTracer "CorId"
#TraceComplexMethodsIfFlagged: EJB3StubTracing CorTracer "CorId"
#TraceComplexMethodsIfFlagged: SessionBeanTracing CorTracer "CorId"
#TraceComplexMethodsIfFlagged: SessionBean3Tracing CorTracer "CorId"
#TraceComplexMethodsIfFlagged: EntityBeanTracing CorTracer "CorId"
#TraceComplexMethodsIfFlagged: EntityBean3Tracing CorTracer "CorId"
#TraceComplexMethodsIfFlagged: MessageDrivenBeanTracing CorTracer "CorId"
#TraceComplexMethodsIfFlagged: MessageDrivenBean3Tracing CorTracer "CorId"
5. Upgrade to the latest Agent version: https://techdocs.broadcom.com/us/en/ca-enterprise-software/it-operations-management/dx-apm-agents/SaaS/release-notes.html
APM SaaS agents are backward compatible with APM on-premise releases (10.8 and 2x): https://knowledge.broadcom.com/external/article/142917/how-to-configure-dxi-apm-saas-agents-or.html
SUGGESTIONS:
1. We do not perform any calculation; we only report the values provided by the WebSphere API.
2. The agent is designed to support any PMI module. You have to enable the modules in the agent profile as follows:
introscope.agent.pmi.enable.New\ Module=true
The name of the module is case sensitive. If the name of the module contains a space, the space has to be escaped with a backslash "\". For details, refer to the documentation.
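The escaping rule above can be illustrated with a small helper. This is a sketch only; the function name is hypothetical, and it handles just the space escaping described here, not other characters that may need escaping in a properties file.

```python
def pmi_enable_property(module_name, enabled=True):
    """Build an introscope.agent.pmi.enable.<module> profile line.

    Spaces in the module name are escaped with a backslash; the case of
    the name is preserved because module names are case sensitive.
    """
    escaped = module_name.replace(" ", "\\ ")
    value = "true" if enabled else "false"
    return f"introscope.agent.pmi.enable.{escaped}={value}"


if __name__ == "__main__":
    print(pmi_enable_property("New Module"))
    print(pmi_enable_property("wlmModule", enabled=False))
```

The first call produces the same line as the example in the profile snippet above.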
3. Verify that the counters have been enabled in the WebSphere Tivoli PMI viewer.
You can also check pmi-config.xml on the WAS side to verify that the counters have been enabled. For example:
<pmimodules xmi:id="PMIModule_120" moduleName="transactionModule" type="transactionModule" enable="4,16,14" />
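To list what each module has enabled without reading the XML by eye, a sketch along these lines may help. It assumes that the enable attribute is a comma-separated list of counter IDs and that the file declares the xmi namespace on its root element; the wrapper element in the sample is made up for illustration and does not reproduce the real pmi-config.xml root.

```python
import xml.etree.ElementTree as ET


def enabled_counters(xml_text):
    """Map each pmimodules moduleName to its list of enabled counter IDs.

    Assumes 'enable' holds a comma-separated list of counter IDs.
    """
    root = ET.fromstring(xml_text)
    result = {}
    for module in root.iter("pmimodules"):
        raw = module.get("enable", "")
        result[module.get("moduleName")] = [c.strip() for c in raw.split(",") if c.strip()]
    return result


# Hypothetical wrapper element; only the pmimodules line comes from this article.
SAMPLE = """<root xmlns:xmi="http://www.omg.org/XMI">
  <pmimodules xmi:id="PMIModule_120" moduleName="transactionModule"
              type="transactionModule" enable="4,16,14" />
</root>"""

if __name__ == "__main__":
    for name, counters in enabled_counters(SAMPLE).items():
        print(name, counters)
```

A module that maps to an empty list has no counters enabled, which is worth checking before looking at the Agent side.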
4. In the agent logs, check that PMI collection is enabled and working as expected. Make sure that the respective PMI groups have been enabled in the Agent profile:
11/28/22 03:42:22 PM GMT [INFO] [IntroscopeAgent.Properties] introscope.agent.pmi.enable=true
11/28/22 03:42:22 PM GMT [INFO] [IntroscopeAgent.Properties] introscope.agent.pmi.enable.SCAStats.RootGroup=false
11/28/22 03:42:22 PM GMT [INFO] [IntroscopeAgent.Properties] introscope.agent.pmi.enable.SIB Service=true
11/28/22 03:42:22 PM GMT [INFO] [IntroscopeAgent.Properties] introscope.agent.pmi.enable.StatGroup.SIBService=true
...
11/28/22 03:42:22 PM GMT [INFO] [IntroscopeAgent.Properties] introscope.agent.pmi.enable.wlmModule=false
11/28/22 03:42:22 PM GMT [INFO] [IntroscopeAgent.Properties] introscope.agent.pmi.enable.wsgwModule=false
11/28/22 03:42:22 PM GMT [INFO] [IntroscopeAgent.Properties] introscope.agent.pmi.filter.objref=false
11/28/22 03:43:10 PM GMT [INFO] [IntroscopeAgent.Agent] Will start PMI WebAppSupport in 5 seconds
11/28/22 03:43:15 PM GMT [INFO] [IntroscopeAgent.Agent] Starting PMI WebAppSupport
11/28/22 03:43:15 PM GMT [INFO] [IntroscopeAgent.Agent] PMI data collection activated
5. Enable DEBUG logging to confirm that the agent is able to collect the respective PMI information.
A "Got null" string indicates a problem collecting information for that counter.
12/01/22 05:59:53 AM GMT [DEBUG] [IntroscopeAgent.Agent] Got webAppModule/ PmiRegistry: webAppModule
12/01/22 05:59:53 AM GMT [DEBUG] [IntroscopeAgent.Agent] Got servletSessionsModule/ PmiRegistry: servletSessionsModule
12/01/22 06:00:00 AM GMT [DEBUG] [IntroscopeAgent.Agent] Got StatGroup.SIBService/ PmiRegistry: StatGroup.SIBService
12/01/22 06:00:00 AM GMT [DEBUG] [IntroscopeAgent.Agent] Got null for SIB Service./ from PmiRegistry
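DEBUG logs can be large, so it may help to extract only the "Got null" entries. The sketch below does that with a pattern based solely on the sample lines above; the function name is hypothetical and the message format may vary between agent versions.

```python
import re

# Based on the sample DEBUG line in this article (assumption about the format).
GOT_NULL = re.compile(r"\[DEBUG\].*Got null for (?P<name>.+?)/? from PmiRegistry")


def pmi_modules_returning_null(lines):
    """Return the module names for which the agent logged 'Got null'."""
    names = []
    for line in lines:
        match = GOT_NULL.search(line)
        if match:
            names.append(match.group("name"))
    return names
```

Each returned name can then be compared against the introscope.agent.pmi.enable.* entries in the profile and the module names known to WebSphere.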
1. DEBUG agent logs from AGENT_HOME/logs
To enable DEBUG logging, set log4j.logger.IntroscopeAgent=DEBUG,console (to redirect the output to a file, set log4j.logger.IntroscopeAgent=DEBUG, logfile)
IMPORTANT: Starting from Agent version 23.5, log4j has been replaced by the logback framework. To run the Agent in debug mode, set introscope.agent.log.level.root=DEBUG. For more information, refer to: https://techdocs.broadcom.com/us/en/ca-enterprise-software/it-operations-management/dx-apm-agents/SaaS/java-agent/configure-java-monitoring/java-agent-logging.html
2. IntroscopeAgent.profile
3. For OOM or high CPU situations, generate a series of 5 thread dumps on the application server, spaced 30 seconds apart.
4. Appserver logs
5. App server config or startup script files.
6. Core dump, if applicable.
7. In case of OOM, collect a heap dump. Additional JVM switches are required for this.
For the Sun JVM, add the following JVM switch: -XX:+HeapDumpOnOutOfMemoryError
8. Enable GC logging. Additional JVM switches are required for this.
For the Sun JVM, add the following JVM switches: -Xloggc:<filename>.log -XX:+PrintGCDetails (on Java 9 and later, use -Xlog:gc*:file=<filename>.log instead)
9. If the problem is related to class loading:
a) Enable verbose class loading by adding the JVM argument: -verbose:class
b) Add introscope.autoprobe.logclassdetails.enabled=set to the Agent profile and collect the Autoprobe.log
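For the thread dump step in the list above (5 dumps, 30 seconds apart), the collection can be scripted. The sketch below assumes a HotSpot JDK with the jstack tool on the PATH; on other JVMs, such as IBM J9, a different mechanism (for example kill -3 on UNIX) is typically used instead. The function name and parameters are illustrative.

```python
import subprocess
import time


def collect_thread_dumps(pid, count=5, interval_seconds=30,
                         run=subprocess.run, sleep=time.sleep):
    """Collect `count` thread dumps from the JVM with the given pid.

    Assumes the JDK `jstack` tool is available on the PATH. The `run` and
    `sleep` parameters exist only so the function can be exercised without
    a real JVM.
    """
    dumps = []
    for i in range(count):
        result = run(["jstack", str(pid)], capture_output=True, text=True, check=True)
        dumps.append(result.stdout)
        if i < count - 1:
            # Wait between dumps, but not after the last one.
            sleep(interval_seconds)
    return dumps
```

Each element of the returned list is the text of one dump and can be written to its own file before being attached to the support case.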