How does CDM collect read/write latency for disks on Windows?

Article ID: 140965

Products

DX Infrastructure Management NIMSOFT PROBES

Issue/Introduction

We missed a SAN issue on a production server because UIM was not raising alarms about the latency going up for SAN disks on a server. 

Once we were aware of the issue, we checked the metrics in another tool (Redgate) and were immediately able to spot the latency rising, while the graph in UIM stayed unrealistically flat.

Please tell us: 

a) How does the cdm probe collect this metric on a Windows server? Please be specific. If it uses perfmon counters, let us know which counters.

b) We understand from the CDM documentation, that the values for the latency metrics are aggregated over all disks. Please explain how this aggregate is being built. Be specific please. Is it the "sum" or the "average" or "median"? 

 

Environment

Release : 9.2.0

Component : UIM - CDM WITH IOSTAT

Resolution

A review of the probe code found that the metrics below are collected only at the server level, not at the disk level. Each of these metrics creates a target that matches the server, not an individual disk.

QOS_DISK_READ_THROUGHPUT,

QOS_DISK_WRITE_THROUGHPUT,

QOS_DISK_TOTAL_THROUGHPUT,

QOS_DISK_LATENCY,

QOS_DISK_READ_LATENCY,

QOS_DISK_WRITE_LATENCY

Other metrics, such as disk usage, free space, and size, are collected at the individual disk level.

CDM uses Windows perfmon counters for physical disk latency. CDM does not aggregate any value over all disks itself; it reads the disk latency value at the server level from the performance counter. The aggregation over all disks appears to be done by the Windows performance counter itself, which the following Microsoft article supports:

https://blogs.technet.microsoft.com/askcore/2012/02/07/measuring-disk-latency-with-windows-performance-monitor-perfmon/

What counters in Windows Performance Monitor show the physical disk latency?

“Physical disk performance object -> Avg. Disk sec/Read counter” – Shows the average read latency.

“Physical disk performance object -> Avg. Disk sec/Write counter” – Shows the average write latency.

“Physical disk performance object -> Avg. Disk sec/Transfer counter” – Shows the combined averages for both read and writes.

The “_Total” instance is an average of the latencies for all physical disks in the computer.

Each other instance represents an individual Physical Disk.
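To cross-check what UIM reports, the same counters can be sampled directly on the server. The sketch below only builds the counter paths and a `typeperf` command line; the helper names `counter_path` and `typeperf_args` are hypothetical, and it is an assumption here that comparing the “_Total” instance with the per-disk instances (`*`) is the useful check. It does not reflect how the cdm probe is implemented internally.

```python
# Counter names are taken from the list above; everything else is illustrative.
LATENCY_COUNTERS = {
    "read": "Avg. Disk sec/Read",
    "write": "Avg. Disk sec/Write",
    "transfer": "Avg. Disk sec/Transfer",
}


def counter_path(kind, instance="_Total"):
    """Build a perfmon counter path for the PhysicalDisk object.

    instance="_Total" selects the aggregate over all physical disks,
    instance="*" selects every instance, including each individual disk.
    """
    return "\\PhysicalDisk({})\\{}".format(instance, LATENCY_COUNTERS[kind])


def typeperf_args(paths, samples=1):
    """Return an argument list for the Windows typeperf tool (-sc = sample count)."""
    return ["typeperf", *paths, "-sc", str(samples)]


if __name__ == "__main__":
    # Print one command for the aggregate and one for the per-disk view.
    for instance in ("_Total", "*"):
        paths = [counter_path(kind, instance) for kind in LATENCY_COUNTERS]
        print(" ".join('"{}"'.format(a) if " " in a else a
                       for a in typeperf_args(paths, samples=5)))
```

Running the printed commands in a Windows command prompt shows the aggregate and the per-disk values side by side, which makes it easy to see whether a single disk deviates from the server-level figure.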

The questions raised in the issue are answered below:

a) How does the cdm probe collect this metric on a Windows server? Please be specific. If it uses perfmon counters, let us know which counters.

Yes, CDM uses Windows perfmon counters for physical disk latency.

b) We understand from the CDM documentation, that the values for the latency metrics are aggregated over all disks. Please explain how this aggregate is being built. Be specific please. Is it the "sum" or the "average" or "median"?

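CDM does not aggregate values over all disks. It reads the disk latency value at the server level from the performance counter. The aggregation over all disks appears to be done by the Windows performance counter itself; per the Microsoft article linked above, the “_Total” instance is an average of the latencies for all physical disks.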
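A server-level average can stay low while one disk degrades, which may explain a flat graph. The sketch below uses made-up per-disk read latencies and assumes the “_Total” instance is a plain arithmetic mean, as the Microsoft article describes; the disk names and values are hypothetical and not taken from the affected server.

```python
from statistics import mean

# Hypothetical per-disk values in seconds, the unit of "Avg. Disk sec/Read".
per_disk = {
    "0 C:": 0.002,
    "1 D:": 0.002,
    "2 E:": 0.002,
    "3 F:": 0.050,  # the one slow (for example SAN-backed) disk
}

# Server-level value, assuming _Total is a simple mean over all disks.
total = mean(list(per_disk.values()))

# The worst single disk, which the average partly hides.
worst_disk, worst = max(per_disk.items(), key=lambda kv: kv[1])

print(f"_Total (mean of all disks): {total * 1000:.1f} ms")
print(f"worst disk {worst_disk}: {worst * 1000:.1f} ms")
```

With these numbers the server-level value is 14 ms while the slow disk is at 50 ms, so an alarm threshold tuned for per-disk latency may never trigger on the aggregate.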