Example of increased Smartstor Duration due to disk I/O contention related to Transaction Trace activity

Article ID: 48708

Updated On:

Products

APP PERF MANAGEMENT, CA Application Performance Management Agent (APM / Wily / Introscope), CUSTOMER EXPERIENCE MANAGER, INTROSCOPE

Issue/Introduction

Description:

Enterprise Manager data processing has to complete in under 7.5 seconds; otherwise, there will be issues with the way data is displayed and stored.

If the Harvest Duration, the Smartstor Duration, or both are taking too long, you may start to see gaps in data where the Enterprise Manager delays data processing to keep up.

The whole Harvest and Smartstor process involves collecting data from agents and calculators, updating alert statuses, returning query data, and then writing that data to disk.

The Enterprise Manager waits for the writing of data to be completed before harvesting again. This is why a slow Smartstor write delays the next harvest, and why the two durations have to be considered together.

One way that the Smartstor Duration can be affected is if there are other concurrent disk operations, for example writing transaction trace data to disk.

Solution:

To see if you are affected by this particular issue, review the perflog.txt.

An easy way to review perflog data is to change the file extension to .csv and open the file in a spreadsheet viewer.

The columns that interest us are titled:

  • Performance.Smartstor.Duration (Smartstor Duration)
  • Performance.Transactions.Data.Insertion.Time.Per.Interval (Trace Duration)

Note this sample perflog data from an actual Enterprise Manager collector:

<Please see attached file for image>

Figure 1

The Smartstor Duration column has values of 466, 11326, 6013, 8470, 407, 5067, 4454, 241

The Trace Duration column has values of 44, 9693, 4219, 4730, 75, 4405, 3871, 61

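We can see spikes in Smartstor Duration at the same points as spikes in Trace Duration. For example, the second interval shows 11326 for Smartstor Duration alongside 9693 for Trace Duration, while the quiet intervals (466, 407, 241) line up with low Trace Duration values (44, 75, 61).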

The writing of trace data to disk is not part of the Enterprise Manager's harvest and Smartstor process, so this can only be an example of disk contention.
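The same comparison can be made programmatically. The following is a minimal Python sketch, not part of the product, that uses the eight sample values listed above. The function names and the spike threshold of 1000 are illustrative assumptions, not recommended values. The loader assumes that the renamed perflog file parses as a comma-separated file with a header row containing the two column titles shown earlier, which may not hold for every release.

```python
import csv
import math

# Column titles as given in this article.
SMARTSTOR_COL = "Performance.Smartstor.Duration"
TRACE_COL = "Performance.Transactions.Data.Insertion.Time.Per.Interval"

# Sample values from Figure 1.
SAMPLE_SMARTSTOR = [466, 11326, 6013, 8470, 407, 5067, 4454, 241]
SAMPLE_TRACE = [44, 9693, 4219, 4730, 75, 4405, 3871, 61]


def load_columns(csv_lines):
    """Read both duration columns from an open file or any iterable of lines.

    Assumption: comma-separated data with a header row.
    """
    smartstor, trace = [], []
    for row in csv.DictReader(csv_lines):
        smartstor.append(float(row[SMARTSTOR_COL]))
        trace.append(float(row[TRACE_COL]))
    return smartstor, trace


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant column has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


def shared_spikes(smartstor, trace, threshold):
    """Indexes of intervals where both durations exceed the threshold."""
    return [
        i
        for i, (s, t) in enumerate(zip(smartstor, trace))
        if s > threshold and t > threshold
    ]


# Hypothetical threshold, chosen only to separate quiet and busy intervals.
spikes = shared_spikes(SAMPLE_SMARTSTOR, SAMPLE_TRACE, threshold=1000)
correlation = pearson(SAMPLE_SMARTSTOR, SAMPLE_TRACE)
```

A correlation close to 1 together with a list of shared spike intervals is consistent with the pattern described here, but it is only an indicator and does not by itself prove disk contention.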

Suggested solutions:

  • 1) The APM Performance and Sizing Guide says that use of a dedicated I/O controller for the Smartstor disk is mandatory. If you are seeing this sort of pattern in the perflog, there is a good chance your EM is not configured optimally. To fix the problem, locate a disk on an I/O controller separate from the one used by the EM application itself (shown here as [My_Remote_Disk]) and set the following properties to point to it:

introscope.enterprisemanager.smartstor.directory=[My_Remote_Disk]/data

introscope.enterprisemanager.smartstor.directory.archive=[My_Remote_Disk]/data/archive

Once a dedicated disk is specified for Smartstor, you must tell the EM about it by setting the "dedicatedcontroller" property to true:

introscope.enterprisemanager.smartstor.dedicatedcontroller=true

If this property is set to false while a separate disk is used for Smartstor, some minor improvement will be achieved, but it is the combination of these three properties, used together and correctly, that informs the EM to use its optimization methods to best advantage.

This means that if you specify a disk located on an independent I/O controller, always set the "dedicatedcontroller" property to true. If you specify a disk that is not on an independent I/O controller, always set the "dedicatedcontroller" property to false.

Bear in mind that all sizing recommendations that we list are halved where a dedicated controller is not used.

  • 2) Look at ways to reduce trace activity.

Traces are started:

  • a) Automatically, as sampled traces.

By default, 1 trace is taken from each agent every 2 minutes. Consider extending the sampling interval to every 30 minutes.

The properties in IntroscopeEnterpriseManager.properties must be uncommented to override the values on the agent:

#introscope.agent.transactiontracer.sampling.perinterval.count=1

#introscope.agent.transactiontracer.sampling.interval.seconds=1800

You can set the sampling.perinterval.count to 0 to stop sampling if you wish.
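To get a feel for how much the sampling interval matters, the arithmetic can be sketched in Python. This is a rough estimate only: the function name and the agent count of 100 are hypothetical, and the 120-second default is taken from the "every 2 minutes" figure above.

```python
def sampled_traces_per_hour(agent_count, per_interval_count, interval_seconds):
    """Estimate how many sampled traces all agents generate per hour."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return agent_count * per_interval_count * 3600 / interval_seconds


# Hypothetical environment with 100 agents.
default_rate = sampled_traces_per_hour(100, 1, 120)    # 1 trace every 2 minutes
reduced_rate = sampled_traces_per_hour(100, 1, 1800)   # 1 trace every 30 minutes
```

Under these assumptions, moving from a 2-minute to a 30-minute interval cuts the sampled trace volume by a factor of 15.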

  • b) Manually, either from the Workstation UI or the CLW (Command Line Workstation).
  • c) As part of the CEM integration on Slow Time defects, a transaction trace will be run.

Check in the CEM UI under Setup > Introscope Settings for the Transaction Trace settings. If you are seeing this problem, try to configure a short session for the trace, 5 minutes at most. As for the trace threshold, you have to consider the link between the threshold you have set for Slow Time defects and the percentage you set here.

If the Slow Time threshold is set at 5 seconds and you have a transaction trace time threshold of 20%, you will trace transactions longer than 1 second in duration (20% of 5 seconds). Try not to set the value too low, or you will have a flood of transaction traces.
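The relationship between the two settings is a simple percentage, shown here as a small Python sketch. The function name is hypothetical and the calculation follows the example in this article rather than any documented CEM formula.

```python
def trace_threshold_seconds(slow_time_threshold_seconds, trace_threshold_percent):
    """Duration above which a transaction trace is taken, per the example above."""
    return slow_time_threshold_seconds * trace_threshold_percent / 100


# The example from this article: 5-second Slow Time threshold, 20% trace threshold.
effective_threshold = trace_threshold_seconds(5, 20)
```

Trying a few combinations this way before changing the CEM settings can show quickly whether a given percentage would catch far more transactions than intended.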

You should also review how long you need to keep trace data. By default this is set to 14 days, but reducing it to 7 days in IntroscopeEnterpriseManager.properties has helped in this situation:

introscope.enterprisemanager.transactionevents.storage.max.data.age=7

Environment

Release:
Component: APMINT

Attachments

1558721444841000048708_sktwi1f5rjvs16w2c.gif