
APM Introscope .NET Agent - Troubleshooting and Best Practices


Article ID: 111638


Products

CA Application Performance Management Agent (APM / Wily / Introscope) INTROSCOPE

Issue/Introduction

The following is a high-level list of techniques and suggestions to employ when troubleshooting these common Introscope .NET Agent performance and configuration issues:

- .NET Agent installation problems
- Instrumentation not working
- .NET application crashes, breaks, or stops responding
- Agent overhead: high CPU and memory consumption
- Slow application response time
 

Environment

APM 10.x, 21.x

Resolution

1) Verify that the correct version of wily.Agent.dll has been registered in the GAC (c:\windows\assembly)

For example, after installing .NET Agent version 10.5.2, wily.Agent.dll should be listed there with the matching version.

If it is not listed, register it manually by dragging and dropping AGENT_HOME\wily\bin\wily.Agent.dll into C:\Windows\assembly.
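
If drag and drop is not available on your Windows version, a minimal alternative (assuming the Windows SDK's gacutil.exe is installed and AGENT_HOME is your actual install directory) is:

REM List any wily.Agent assembly already registered in the GAC
gacutil /l wily.Agent

REM Register the agent assembly manually from an elevated Command Prompt
gacutil /i "<AGENT_HOME>\wily\bin\wily.Agent.dll"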

 

2) Verify that the environment variables below exist


Open a Command Prompt as Administrator and run: set
The output should include entries similar to the ones below:
 
Cor_Enable_Profiling=0x1
COR_PROFILER={5F048FC6-251C-4684-8CCA-76047B02AC98}
com.wily.introscope.agentprofile=<install_dir>\wily\IntroscopeAgent.profile
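
To avoid reading through the full output of set, you can filter for just the profiler-related variables, for example:

REM Show only the profiler-related environment variables
set | findstr /i "Cor_Enable_Profiling COR_PROFILER com.wily"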
 

 

3) Make sure permissions on the AGENT_HOME directory have been set correctly


If you are instrumenting a .NET standalone app, make sure to run:
<AGENT_HOME>\wily\wilypermission.exe <AGENT_HOME>\wily <your application>
 
For example:
<AGENT_HOME>\wily\wilypermission.exe <AGENT_HOME>\wily mytestapp.exe



4) Verify that the CA .NET agent has been attached to the monitored .NET process (for example, w3wp.exe for IIS)

- Generate some activity on the .NET application, then verify that the agent has been attached to the .NET process

- Run:

tasklist /m wily*

Here is the output you should see:
 
Image Name                     PID Modules
========================= ======== ============================================
PerfMonCollectorAgent.exe     1300 wily.NativeProfiler.dll, wily.Agent.dll
w3wp.exe                      4000 wily.NativeProfiler.dll, wily.Agent.dll,
                                   wily.WebServicesAgent.ext.dll

 

5) Make sure the agent has been configured to connect to the correct Introscope EM server name and port


Open the IntroscopeAgent.profile

Verify the property:

agentManager.url.1=<your EM>:<EM port>
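
For example, assuming a hypothetical EM host named myem.example.com listening on the default agent port 5001:

agentManager.url.1=myem.example.com:5001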

 

6) Disable generic.agent.trigger

Uncomment the property:

introscope.nativeprofiler.generic.agent.trigger.enabled=false
 

NOTE: There is no impact if the agent is up and running with this property set to false. The property is only needed when the agent cannot find its designated entry point to start; enabling it allows the agent to hunt for the earliest safe method to use as an entry point.

 

7) Check that all the Agent logs have been created successfully


The below logs should be created for each instrumented application:
- AutoProbe.*.log
- IntroscopeAgent.*.log
- nativeprofiler_*.log
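
A quick way to confirm they are being written is to list the logs directory sorted by date (assuming the default location under AGENT_HOME):

dir /od "<AGENT_HOME>\wily\logs"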
 


8) If your application is still not visible from the Workstation or Webview, try configuring the agent to monitor all .NET applications


This test will help you confirm if the issue is related to the application.
Open the IntroscopeAgent.profile and comment out the "introscope.agent.dotnet.monitorApplications" property as below:

#introscope.agent.dotnet.monitorApplications=


9) You can see "PerfMonCollectorAgent.exe" in the Investigator but the .NET agent is missing


Check whether another .NET profiler is installed on the Windows server(s)

IMPORTANT: Only 1 .NET profiler can run at a time.

The GUID of the  CA APM .NET Agent is {5F048FC6-251C-4684-8CCA-76047B02AC98}

 

- Open "Command Prompt" as Administrator,

- Run the command:  REG QUERY HKLM /f "COR_PROFILER" /s 

 

Or redirect the output to a file:  REG QUERY HKLM /f "COR_PROFILER" /s > apm_netagent_regquery_corprofiler.txt
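
The key path shown in the output depends on how the profiler was registered (system-wide or per-service environment). An illustrative healthy result, containing only the CA GUID, might look like:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Environment
    COR_PROFILER    REG_SZ    {5F048FC6-251C-4684-8CCA-76047B02AC98}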

If the query returns a different GUID, another profiler (for example AVIcode, SCOM, or the Microsoft Monitoring Agent) is preventing the .NET Agent from running.

SCOM GUID = AD5651A8-B5C8-46ca-A11B-E82AEC2B8E78

To resolve this issue, uninstall the "Microsoft Monitoring Agent" from Add/Remove Programs on the affected server(s).

 

10) Check for possible errors in the Windows Event Viewer > Application log


a) "Failed to CoCreate profiler" or "The profiler was loaded successfully.  Profiler CLSID: '{D6E0BA92-3BC3-45ff-B9CC-B4B5AB7190BC}'"

These messages indicate that another .NET profiler is already installed (the CLSID shown is not the CA APM GUID). Only one .NET profiler can run at a time; see the previous point.

b) "System.InvalidProgramException: Common Language Runtime detected an invalid program"  "…cannot be activated due to an exception during compilation".

Disable “WCFRuntimeTracing” in the webservices.pbd as below:

#TurnOn: WCFRuntimeTracing


Then restart IIS  or the monitored standalone .NET application.
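
For example, IIS can be restarted from an elevated Command Prompt (the optional /noforce switch waits for worker processes to finish instead of terminating them):

iisreset /noforce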

 

11) If you are using Ninject, manually update the .NET application to use the UseReflectionBasedInjection Ninject setting


Make sure the reflection setting is added to every kernel initialization call.

For example:
public NinjectDependencyResolver()
{
    kernel = new StandardKernel(new NinjectSettings { UseReflectionBasedInjection = true });
    AddBindings();
}
..
For more information see:
http://stackoverflow.com/questions/5772860/exception-when-using-ninject
http://stackoverflow.com/questions/11989300/running-ninject-3-on-medium-trust-level
http://stackoverflow.com/questions/19770350/invalidprogramexception-profiling-a-webforms-app-using-ninject

For exact details on how to enable UseReflectionBasedInjection, contact Ninject support or the Ninject community: http://www.ninject.org/community.html

12) If the agent is successfully reporting metrics to the Investigator but you do not see all the expected metrics, check for possible metric clamps


a) From the Workstation or Webview, check whether the agent metric clamp has been reached by expanding the branch:

 Custom Metric Host (virtual)
   - Custom Metric Process (virtual)
      - Custom Metric Agent (virtual)([email protected])(SuperDomain)
         - Agents
            - Host
               - Process
                   - AgentName

Look at the value of the "is Clamped" metric; it should return zero (0).

b) Enable verbose logging and check whether the perfmon metric clamp has been reached.

- Open the logging.config.xml and enable verbose logging as below:
<level value="VERBOSE"/>
- Restart IIS or the standalone .NET process. If you cannot restart the .NET process immediately, you can force the agent to recognize this change by saving the IntroscopeAgent.profile.

- If the perfmon metric clamp has been reached, you will see the following message in the agent log:

[VERBOSE] [IntroscopeAgent.PerfMonService] Metric limit of xx has been reached

- Uncomment the below property and increase the value as needed, then restart the "CA APM PerfMon Collector Service".

#introscope.agent.perfmon.metric.limit=1000
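
Assuming the service display name is exactly "CA APM PerfMon Collector Service" (verify it in services.msc), the restart can be done from an elevated Command Prompt:

net stop "CA APM PerfMon Collector Service"
net start "CA APM PerfMon Collector Service"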

13) In case of high CPU/memory usage, slow response time, or the application not working after the agent has been installed

You can try to isolate the issue by adjusting the Agent configuration as below:

TEST#1: Find out if the issue is related to the Agent instrumentation

- Open the IntroscopeAgent.profile and set introscope.autoprobe.enable=false
- Stop the Windows PerfMon collector service.
- Restart IIS or the instrumented .NET application.
 
If the problem still occurs, contact Broadcom Support.
If the problem does not persist, proceed with TEST#2.

TEST#2: Disable socket, MVC, and web services instrumentation

a) Re-enable instrumentation:
set introscope.autoprobe.enable=true
 
b) Disable Socket instrumentation in the toggles-typical or toggles-full.pbd as below:
#TurnOn: SocketTracing
 
c) Open webservices.pbd and MVC.pbd and disable the instrumentation by commenting out ALL lines that start with TurnOn, for example:

#TurnOn: WebServicesProducerTracing
#TurnOn: WebServicesClientTracing
#TurnOn: WebServicesCorrelationTracing
#TurnOn: WCFServerFaultTracing
#TurnOn: WCFClientFaultTracing
.....

Save the file

Restart IIS or the .NET instrumented application.
 
If the problem no longer occurs, re-enable the tracer groups one by one until you identify the root cause.
If the problem persists, proceed with TEST#3.

TEST#3: Limit the number of monitored applications; try to monitor only specific applications by updating the "introscope.agent.dotnet.monitorApplications" property


For example, to limit the agent to monitoring IIS and a DummyWinApp.exe application, update the monitorApplications property as below:
introscope.agent.dotnet.monitorApplications=w3wp.exe,aspnet_wp.exe,DummyWinApp.exe

 

TEST#4: If your application is using SQL Server, you can try to disable SQL instrumentation

introscope.agent.sqlagent.sql.turnoffmetrics=true
introscope.agent.sqlagent.sql.artonly=true
introscope.agent.sqlagent.sql.turnofftrace=true

 

TEST#5: Try to disable the Agent Traces feature:

com.wily.introscope.agent.blame.transaction.doTransactionTrace=false

This is a hot property; the change takes effect without restarting the monitored application.

 

TEST#6: Disable or reduce perfmon metric collection in the IntroscopeAgent.profile to prevent CPU overhead/spikes

1. Stop the PerfMonCollectorAgent service.

If the problem does not persist, start the PerfMonCollectorAgent service again and try to reduce perfmon data collection as below.


2. Open the IntroscopeAgent.profile and set:

introscope.agent.perfmon.category.browseEnabled=false
Purpose: to confirm whether disabling perfmon category browsing would significantly reduce the CPU overhead

introscope.agent.perfmon.metric.pollIntervalInSeconds=150
Purpose: to confirm whether the CPU spikes are related to the polling interval and the number of metrics; after increasing the value from 15 to 150, check whether the spikes now occur roughly every 150 seconds

introscope.agent.perfmon.metric.filterPattern=|Processor|*|*,|Process|{osprocessname}|*,|ASP.NET*|*
Purpose: you might need to reduce the number of perfmon metrics being gathered. Depending on your application load, the default setting could instruct the agent to collect a huge number of metrics. The default is: introscope.agent.perfmon.metric.filterPattern=|Processor|*|*,|.NET Data Provider*|*|*,|.NET CLR*|{osprocessname}|*,|.NET CLR Data|*|*,|Process|{osprocessname}|*,|ASP.NET*|*

3. Start the PerfMonCollectorAgent service and verify whether the issue persists.

 

What to collect if the problem persists


Enable DEBUG logging in the logging.config.xml and save the IntroscopeAgent.profile so the change in the xml is taken into account.
If the application is crashing, set introscope.nativeprofiler.logBytecode=true and introscope.nativeprofiler.logAllMethodsNoticed=true

Try to reproduce the issue and collect the following information:

  • Install logs and Agent logs (if any, AGENT_HOME/wily/logs)
  • AGENT_HOME/wily/IntroscopeAgent.profile
  • The result of the "systeminfo" command.
  • The result of the "set" command.
  • Exercise the application, then run "tasklist /m", send the output.
  • Screenshot of the C:\Windows\assembly folder, listing the wily.*.dll assemblies
  • Screenshot of application events from Windows Event viewer
  • Collect dump files:

    - If the issue is related to high CPU/memory, a crash, or a hung situation, use the Debug Diagnostics Tool from Microsoft to capture a user dump, which contains both heap and thread snapshots. The following KB article has both a download link and usage instructions: http://support.microsoft.com/kb/2580960

    - There are multiple ways to capture dumps of a .NET process. One simple way is to bring up Task Manager, find the .NET process with the memory issue, then right-click the process and select the "Create dump file" option in its context menu.

- There is also a feature in Windows .NET that creates a dump file when a process crashes, documented here: https://docs.microsoft.com/en-us/dotnet/core/diagnostics/dumps

Essentially it just requires setting some system environment variables via Control Panel -> System -> Advanced system settings -> Environment Variables:

COMPlus_DbgEnableMiniDump      value 1
COMPlus_DbgMiniDumpType        value 1
COMPlus_DbgMiniDumpName        value C:\TT\coredump  (set this to a location of your choice;
                                                      ensure sufficient free space, as the files
                                                      can be very large)
COMPlus_CreateDumpDiagnostics  value 1
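
As an alternative to the Control Panel dialog, the same variables can be set machine-wide from an elevated Command Prompt with setx (the dump path below is only an example; the monitored process or service must be restarted before it picks up the new values):

setx COMPlus_DbgEnableMiniDump 1 /M
setx COMPlus_DbgMiniDumpType 1 /M
setx COMPlus_DbgMiniDumpName C:\TT\coredump /M
setx COMPlus_CreateDumpDiagnostics 1 /M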


Then, when a .NET process crashes, it will write the dump file to the location defined above.

As this can create a lot of data, we suggest that once you have captured a dump of the failure, you disable dump creation by setting the COMPlus_DbgEnableMiniDump variable back to 0.

The above assumes, of course, that a process crash occurs when the failure happens.

Ideally, we recommend performing the above in a test environment, assuming the problem is reproducible there.
