This is the approach that some of us in Support use to solve APM problems, particularly integration issues. It was also documented in an APM blog post.
Step 0: Do the Pre-work
Before looking at any logs, you should already know the following:
- Which components make up the product functionality in question, such as an integration?
- What function does each component perform?
- Which files need to be configured for the integration to work?
- How will I know that the function/integration is working?
- Which logs are available for each component? Do they have a debug mode?
- On which server does each component reside in my environment?
To find the answers to these questions, look for the following in the APM documentation:
- A components list
- An architectural diagram
- A workflow diagram or description
- Screenshots of the integration functionality working.
- A list of log and configuration file names and locations.
A good example of a document that contains all of these is the CA SiteMinder Application Server Agents Guide.
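The pre-work answers can be kept as a small inventory so that gaps are visible before an issue occurs. The following is a minimal sketch, not part of any APM tooling: the `Component` class, the `unanswered` helper, and the server and file names are all hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Component:
    """One row of the pre-work inventory; each field answers a Step 0 question."""
    name: str
    function: str
    server: str
    config_files: List[str] = field(default_factory=list)
    log_files: List[str] = field(default_factory=list)
    has_debug_mode: Optional[bool] = None  # None means "not yet checked"


def unanswered(component: Component) -> List[str]:
    """Return the names of the pre-work fields that are still empty."""
    gaps = []
    if not component.function:
        gaps.append("function")
    if not component.server:
        gaps.append("server")
    if not component.config_files:
        gaps.append("config_files")
    if not component.log_files:
        gaps.append("log_files")
    if component.has_debug_mode is None:
        gaps.append("has_debug_mode")
    return gaps


# Hypothetical entry: the server and log names are placeholders, not real APM paths.
agent = Component(
    name="Introscope Agent",
    function="Matches APM CE transaction definitions against its ruleset",
    server="app-server-01",
    log_files=["agent.log"],
)
print(unanswered(agent))
```

Any field reported by `unanswered` is a question to resolve from the documentation before troubleshooting starts.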
Step 1: Troubleshooting (In the fire.)
The inevitable issue has occurred, and it is time to show how the pre-work above helps to resolve it. Two different approaches can be used together:
- Visual Inspection
- Functional Workflow
Start with the visual inspection. With certain functionality, what is and is not appearing can be an important clue about what to do next. For example, check:
- What is appearing in the Investigator.
- What is appearing in the APM CE defect.
If something is not appearing, the cause is probably a configuration issue or a threshold that has not been exceeded (such as the percentage of slow time).
After completing the visual inspection approach, you can then look at the workflow diagram or description and determine the following:
- Which was the last step successfully completed?
- What is the next step in the functionality/integration workflow?
- Which components are involved in those two steps?
By knowing which components are involved, you can avoid looking needlessly at other components.
For example, the section "CA APM problem resolution triage overview" in the APM Configuration and Administration Guide shows that a Transaction Trace start request is made as part of a new Incident. Suppose the request is made, but the agent is not matching the transaction definition. Knowing the components involved, you would ignore all others, such as the TIM and the database.
Your focus would then be on why the Component (APM Introscope Agent) was not successful in performing this Function (matching an APM CE transaction against the Introscope Agent ruleset). The investigation would start with an Introscope Agent log in debug mode and a screenshot of the APM CE transaction definition. Narrowing the scope this way saves much troubleshooting time.
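The functional workflow approach can be sketched in a few lines: given an ordered list of workflow steps and the number of steps confirmed to have completed, return only the components behind the last successful step and the next one. This is an illustrative sketch. The `Step` type, the `components_to_inspect` function, and the step-to-component mapping are assumptions made for the example; the real workflow must come from the product documentation.

```python
from typing import List, NamedTuple, Set


class Step(NamedTuple):
    description: str
    component: str


def components_to_inspect(workflow: List[Step], completed: int) -> Set[str]:
    """Return the components behind the last successful step and the next step.

    `completed` is the number of workflow steps confirmed to have succeeded.
    """
    last_ok = workflow[completed - 1] if completed > 0 else None
    next_step = workflow[completed] if completed < len(workflow) else None
    return {s.component for s in (last_ok, next_step) if s is not None}


# Simplified, assumed workflow; take the actual steps and their owning
# components from the workflow diagram in the documentation.
workflow = [
    Step("New incident is created", "APM CE"),
    Step("Transaction Trace start request is made", "Enterprise Manager"),
    Step("Transaction definition is matched against the agent ruleset", "Introscope Agent"),
    Step("Transaction trace is collected", "Introscope Agent"),
]

# The first two steps were confirmed; the match in step three is failing.
focus = components_to_inspect(workflow, completed=2)
ignore = {s.component for s in workflow} - focus
print(sorted(focus), sorted(ignore))
```

Everything in `ignore` can be set aside until the components in `focus` have been ruled out.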
Step 2: Post-Resolution
Build yourself a wiki to keep track of each time you use this approach, and add further questions to the pre-work, visual inspection, and functional workflow steps as needed.
https://communities.ca.com/community/ca-apm/blog/2015/08/16/apm-blog-problem-solving-six-ways-to-avoid-the-wrong-path -- Problem Solving Six Ways to Avoid the Wrong Path
https://communities.ca.com/thread/119900520 -- A General Approach to Problem Solve APM Integration Issues
https://communities.ca.com/message/241725272#241725272 -- Cascading Problems