APM Introscope .NET Agent - Troubleshooting and Best Practices
Article ID: 111638


Products

  • DX Application Performance Management
  • CA Application Performance Management (APM / Wily / Introscope)
  • DX APM SaaS

Issue/Introduction

The following is a high-level list of techniques and suggestions to employ when troubleshooting the .NET Agent issues below:

1. Metrics not visible in metric browser
2. Common error messages
3. Common installation / configuration issues
4. .NET app crashes, slowness, high CPU, OutOfMemory

Environment

  • APM .NET Agent 10.x, 2x
  • APM .NET Agent SaaS

Resolution

A) Metrics not visible in Metric Browser


SUGGESTIONS:

1. Check whether a metric clamp has been reached on the Agent or EM side:

a) To check the EM clamps, open the Metric Browser and expand the branch:


Custom Metric Host (virtual) | Custom Metric Process (virtual) | Custom Metric Agent (virtual)(collector_host@port)(SuperDomain) | Enterprise manager | Connections


Look at the values for:

  - "EM Historical Metric Clamped"
  - "EM Live Metric Clamped"


Both of the above metrics should be 0.


b) To check the Agent clamp, expand the branch:

Custom Metric Host (virtual) | Custom Metric Process (virtual) | Custom Metric Agent (virtual)(collector_host@port)(SuperDomain) | Agents | Host | Process | <AgentName>

Look at the value of the "is Clamped" metric; it should be 0.

 

Recommendation:

For SaaS: contact Broadcom Support for assistance increasing the SaaS APM EM clamps

For 10.8:

- Open the EM_HOME\config\apm-events-thresholds-config.xml

- Increase the below clamps as needed:

introscope.enterprisemanager.metrics.historical.limit
introscope.enterprisemanager.metrics.live.limit

These are hot properties; there is no need to restart the EM.
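In apm-events-thresholds-config.xml each clamp is an XML entry; a minimal sketch of what the two entries typically look like (the threshold values and descriptions shown here are illustrative only, so start from the defaults already present in your file):

<clamp id="introscope.enterprisemanager.metrics.live.limit">
    <description>Maximum number of live metrics the EM accepts</description>
    <threshold value="500000"/>
</clamp>
<clamp id="introscope.enterprisemanager.metrics.historical.limit">
    <description>Maximum number of historical metrics the EM accepts</description>
    <threshold value="1200000"/>
</clamp>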

NOTE: For more troubleshooting on APM EM server side refer to https://knowledge.broadcom.com/external/article/93176/apm-introscope-enterprise-manager-troub.html

 


2. Are you using a SaaS Agent to connect to an Introscope EM 10.8?

Make sure to update the IntroscopeAgent.profile as below:

a) unset agentManager.credential as below:
agentManager.credential=

b) Add hidden property
introscope.agent.connection.compatibility.version=10.7

For more information refer to: https://ca-broadcom.wolkenservicedesk.com/external/article?articleId=142917

3. Check the Agent log and confirm the Agent is able to connect to the EM

Below is an example of an Agent log entry indicating that the Agent is not allowed to connect to the 10.8 EM (note Active = "false"):

[INFO] [IntroscopeAgent.IsengardServerConnectionManager] Connected controllable Agent to the Introscope Enterprise Manager at <your-EM>:5001,com.wily.isengard.postofficehub.link.net.DefaultSocketFactory. Host = "WIN", Process = "WAS9", Agent Name = "Weblogic", Active = "false". 

 

 

B) Common Error Messages

In the IntroscopeAgent log

"[ERROR] [IntroscopeAgent.Agent] System.ComponentModel.Win32Exception (0x80004005): Access is denied"
   at System.Diagnostics.ProcessManager.OpenThread(Int32 threadId, Int32 access)
   at System.Diagnostics.ProcessThread.GetThreadTimes()
   at System.Diagnostics.ProcessThread.get_TotalProcessorTime()
   at com.wily.introscope.agent.feature.CpuTimingFeature.PutStartDataPoint(InvocationData data)

Suggestions:

1) Set introscope.agent.cpumethodtimebreakdown.enable=false in the IntroscopeAgent.profile and restart IIS or the .NET application.

2) Try commenting out all the corresponding SetTracerParameter lines in required.pbd and MVC.pbd that set "cputiming true", e.g.:

in required.pbd:

#SetTracerParameter: PageInfoTracer cputiming true

in MVC.pbd:
#SetTracerParameter: DotNetMVCFrontendMarker cputiming true
#SetTracerParameter: DotNetMVCControllerAggregator cputiming true
#SetTracerParameter: ASPNETCoreAsyncFrontendTracer cputiming true
#SetTracerParameter: MVCControllerBlamePointTracer cputiming true
#SetTracerParameter: MVCControllerActionBlamePointTracer cputiming true
#SetTracerParameter: DotNetMVCFrontendMarkerAsync cputiming true

Back up the original IntroscopeAgent.profile, required.pbd, and MVC.pbd before making these test configuration changes, then restart the monitored .NET app.

" java.net.SocketException: Socket is closed,  java.net.SocketTimeoutException: Read timed out"

 

Possible root cause: network or firewall issues 

 

In the Windows Event Viewer > Application log


"Failed to CoCreate profiler” Profiler CLSID: '{D6E0BA92-3BC3-45ff-B9CC-B4B5AB7190BC}

The above message indicates that another .NET profiler is already installed. Only one .NET profiler can run at a time; see item h) under Common Installation / Configuration issues below for how to identify the conflicting profiler.

"System.InvalidProgramException: Common Language Runtime detected an invalid program"  "…cannot be activated due to an exception during compilation".

Disable "WCFRuntimeTracing" in the webservices.pbd as below:

#TurnOn: WCFRuntimeTracing

Then restart IIS  or the monitored standalone .NET application.

Faulting application name: w3wp.exe, version: 8.5.9600.16384, time stamp: 0x5215df96
Faulting module name: KERNELBASE.dll, version: 6.3.9600.17415, time stamp: 0x54505737
Exception code: 0xe0434352
Fault offset: 0x0000000000008b9c
Faulting process id: 0x47e0
Faulting application start time: 0x01d138b4c21a8ef9
Faulting application path: c:\windows\system32\inetsrv\w3wp.exe
Faulting module path: C:\Windows\system32\KERNELBASE.dll
Report Id: 1f2322d9-a4a8-11e5-80f8-005056b772c9

1. Open the AGENT_HOME\toggles-typical.pbd.

2. Disable socket instrumentation as below.

#TurnOn: SocketTracing

3. Restart IIS

 

C) Common Installation / Configuration issues

a) Verify that the below environment variables exist


Open Command Prompt as Administrator, run: set
The output should be similar to the one below:
 
Cor_Enable_Profiling=0x1
COR_PROFILER={5F048FC6-251C-4684-8CCA-76047B02AC98}
com.wily.introscope.agentprofile=<install_dir>\wily\IntroscopeAgent.profile
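If the full set output is long, you can filter it for the relevant entries; a quick sketch using standard Command Prompt tools:

set | findstr /i "cor_ wily"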

 

b) Make sure permissions on the AGENT_HOME have been set accordingly


If you are instrumenting a .NET standalone app, make sure to run:
<AGENT_HOME>\wily\wilypermission.exe <AGENT_HOME>\wily <your application>
 
For example:
<AGENT_HOME>\wily\wilypermission.exe <AGENT_HOME>\wily mytestapp.exe

 

c) Verify that the CA .NET agent has been attached to the monitored .NET process (for example, w3wp.exe for IIS)

- Generate activity on the .NET application, then verify that the agent has been attached to the .NET process

- Run:

tasklist /m wily*

Here is the output you should see:
 
Image Name                     PID Modules
========================= ======== ============================================
PerfMonCollectorAgent.exe     1300 wily.NativeProfiler.dll, wily.Agent.dll
w3wp.exe                      4000 wily.NativeProfiler.dll, wily.Agent.dll,
                                   wily.WebServicesAgent.ext.dll

 

d) Make sure the agent has been configured to connect to the right Introscope EM server name and port


Open the IntroscopeAgent.profile

Verify the property:

agentManager.url.1=<your EM>:<EM port>
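For example, assuming a hypothetical EM host apm-em.example.com listening on the default agent port 5001:

agentManager.url.1=apm-em.example.com:5001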

 

e) Disable generic.agent.trigger

Uncomment the property:

introscope.nativeprofiler.generic.agent.trigger.enabled=false

 


NOTE: There is no impact if the agent is already up and running with this property set to false. The property is only needed because, in some cases, the agent does not find its designated entry point to start; enabling it lets the agent hunt for the earliest safe method to use as an entry point.

 

f) Check that all the Agent logs have been created successfully

The below logs should be created for each instrumented application:
- AutoProbe.*.log
- IntroscopeAgent.*.log
- nativeprofiler_*.log


g) If your application is still not visible from the Workstation or WebView, try configuring the agent to monitor all .NET applications


This test will help you confirm whether the issue is related to the application.
Open the IntroscopeAgent.profile and comment out the "introscope.agent.dotnet.monitorApplications" property as below:

#introscope.agent.dotnet.monitorApplications=

 

h) Check for possible .NET profiler conflict

Only one profiler can run at a time; the Broadcom APM .NET Agent profiler GUID is {5F048FC6-251C-4684-8CCA-76047B02AC98}.

How to verify this condition:

- Open "Command Prompt" as Administrator
- Run the command:  REG QUERY HKLM /f "COR_PROFILER" /s 

Or redirect the output to a file:  REG QUERY HKLM /f "COR_PROFILER" /s > apm_netagent_regquery_corprofiler.txt
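The command lists every registry key that contains a COR_PROFILER value. A purely illustrative extract showing a conflicting registration (the key paths, value types, and GUIDs will vary depending on the product involved) could look like:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W3SVC
    Environment    REG_MULTI_SZ    COR_ENABLE_PROFILING=1\0COR_PROFILER={AD5651A8-B5C8-46ca-A11B-E82AEC2B8E78}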

If the COR_PROFILER value points to a different GUID, as in the illustrative output above, another profiler (AVIcode, SCOM, or the Microsoft Monitoring Agent) is preventing the .NET Agent from running as expected.

SCOM GUID = AD5651A8-B5C8-46ca-A11B-E82AEC2B8E78

To resolve this issue, uninstall the "Microsoft Monitoring Agent" from Add/Remove Programs on the affected server(s).

 

D) .NET app crashes, slowness, high CPU, OutOfMemory

 

You can try to isolate the issue by adjusting the Agent configuration as below:

TEST#1: Find out if the issue is related to the Agent instrumentation

- Open the IntroscopeAgent.profile and set introscope.autoprobe.enable=false
- Stop the Windows Perfmon collector service.
- Restart IIS or the .NET instrumented application.
 
If the problem still occurs, contact Broadcom Support.
If the problem does not persist, proceed with TEST#2.

TEST#2: Disable SOAP/HTTP header injection

Header insertion is enabled by default; however, some applications can be impacted by it.

In this case you can disable header insertion as below:

a) Re-enable instrumentation: set introscope.autoprobe.enable=true
 
b) Open webservices.pbd and disable the WebServicesCorrelationTracing group as below:

#TurnOn: WebServicesCorrelationTracing

c) Open MVC.pbd and disable the GenericClientCorrelationTracing group as below:

#TurnOn: GenericClientCorrelationTracing

Save the files

d) Restart IIS or the .NET instrumented application.
 
If the problem no longer occurs, header injection was the cause and you can leave these groups disabled; if the problem persists, proceed with TEST#3.

 

TEST#3: Disable the entire Webservices and MVC instrumentation

Open webservices.pbd and MVC.pbd and disable all the lines that start with TurnOn as below:

#TurnOn: WebServicesProducerTracing
#TurnOn: WebServicesClientTracing
#TurnOn: WCFServerFaultTracing
#TurnOn: WCFClientFaultTracing
.....

Save the files

Restart IIS or the .NET instrumented application.
 
If the problem no longer occurs, re-enable the above tracer groups one by one until you identify the one causing the issue.
If the problem persists, proceed with TEST#4.

TEST#4: Limit the number of monitored applications; try to monitor only specific applications by updating the "introscope.agent.dotnet.monitorApplications" property


For example, to limit the agent to monitoring IIS and a DummyWinApp.exe application, update the monitorApplications property as below:
introscope.agent.dotnet.monitorApplications=w3wp.exe,aspnet_wp.exe,DummyWinApp.exe

 

TEST#5: Reduce Perfmon metric collection in the IntroscopeAgent.profile to prevent CPU overhead / spikes

1. Stop the PerfMonCollectorAgent service.

If the problem does not persist, re-enable the PerfMonCollectorAgent service and try to reduce perfmon data collection as below:


2.Open the IntroscopeAgent.profile, set:

introscope.agent.perfmon.category.browseEnabled=false
Purpose: to confirm whether disabling perfmon category browsing significantly reduces the CPU overhead.

introscope.agent.perfmon.metric.pollIntervalInSeconds=150
Purpose: to confirm whether the CPU spikes are related to the polling interval and the number of metrics. After increasing the value from 15 to 150, check whether the CPU spikes now occur roughly every 150 seconds.

introscope.agent.perfmon.metric.filterPattern=|Processor|*|*,Process|{osprocessname}|*,|ASP.NET*|*
Purpose: to reduce the number of perfmon metrics being gathered. Depending on your application load, the default setting can instruct the agent to collect a very large number of metrics. The default is: introscope.agent.perfmon.metric.filterPattern=|Processor|*|*,|.NET Data Provider*|*|*,|.NET CLR*|{osprocessname}|*,|.NET CLR Data|*|*,|Process|{osprocessname}|*,|ASP.NET*|*

3. Start the PerfMonCollectorAgent service and verify whether the issue persists.

TEST#6: Try to disable the Agent transaction traces feature:

com.wily.introscope.agent.blame.transaction.doTransactionTrace=false

This is a hot property; no restart is required.

 

 

What to collect if the problem persists

Enable DEBUG logging:

- Open the logging.config.xml
- Change the logging level from INFO to DEBUG:

<level value="DEBUG"/>

- Restart IIS or the .NET app
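For reference, the agent's logging.config.xml is log4net-style and the level element sits under the root logger; a minimal sketch of the relevant fragment (the appender name "logfile" and the surrounding layout are assumptions and may differ in your file):

<root>
    <level value="DEBUG"/>
    <appender-ref ref="logfile"/>
</root>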

NOTE: if you cannot restart IIS, open the IntroscopeAgent.profile, make a change, and save the file; this forces the update to logging.config.xml to be taken into account.

If IIS or the .NET app is crashing:

- open the IntroscopeAgent.profile
- set introscope.nativeprofiler.logBytecode=true and introscope.nativeprofiler.logAllMethodsNoticed=true


Try to reproduce the issue and collect the below information

  • Install logs and Agent logs (if any, AGENT_HOME/wily/logs)
  • AGENT_HOME/wily/IntroscopeAgent.profile
  • The result of "systeminfo" command.
  • The result of "set" command.
  • Exercise the application, then run "tasklist /m" and send the output.
  • Screenshot of the C:\windows\assembly folder, listing the wily.*.dll files
  • Screenshot of application events from Windows Event viewer
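As a convenience, most of the text outputs above can be captured from an elevated Command Prompt in one pass; a sketch (the output file names are only suggestions, and the directory listing can stand in for the assembly-folder screenshot):

systeminfo > systeminfo.txt
set > env_vars.txt
tasklist /m > tasklist_modules.txt
dir /s /b C:\Windows\assembly | findstr /i wily > wily_gac.txt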

If IIS or the .NET app is crashing, collect a dump. There are multiple ways to capture a dump of a .NET process.

Option 1: Bring up Task Manager, find the .NET process with the memory issue, then right-click the process and select the Create Dump File option in its context menu.

Option 2 (RECOMMENDED): Create a dump file when a process crashes, as documented here: Microsoft .NET collecting dumps

Essentially this requires the below system variables, set via Control Panel -> System -> Advanced System Settings -> Environment Variables:

COMPlus_DbgEnableMiniDump value 1
COMPlus_DbgMiniDumpType value 1
COMPlus_DbgMiniDumpName value C:\TT\coredump (set this to a location of your choice; ensure sufficient free space, as the files can be very large)
COMPlus_CreateDumpDiagnostics value 1
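If you prefer the command line to the Control Panel dialog, the same system-level variables can be set from an elevated Command Prompt with setx (a sketch; the dump path is only an example, and services such as IIS pick up the new variables only after they are restarted):

setx COMPlus_DbgEnableMiniDump 1 /M
setx COMPlus_DbgMiniDumpType 1 /M
setx COMPlus_DbgMiniDumpName C:\TT\coredump /M
setx COMPlus_CreateDumpDiagnostics 1 /M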

Then, when a .NET process crashes, it will write the dump file to the location defined above.

As this can create a lot of data, once you have captured a dump of the failure, disable dump creation by setting the variable COMPlus_DbgEnableMiniDump back to 0.

The above assumes, of course, that a process crash occurs when the failure happens.

Ideally, perform the above in a test environment, assuming the problem is reproducible there.

Additional Information