cdm probe bug applying messages incorrectly when monitoring pagefile usage

Article ID: 201915

Products

DX Unified Infrastructure Management (Nimsoft / UIM)

Issue/Introduction

There seems to be a defect when monitoring Pagefile usage % on Windows systems with the cdm probe. First, there are 2 places where the documentation gives conflicting information:

The cdm probe release notes (from v3.31) say that pagefile usage is monitored using Swap Memory: "Windows: Swap memory usage now reflects the pagefile usage."
This knowledge article says that pagefile usage is monitored using Total Memory: https://knowledge.broadcom.com/external/article?articleId=9140

Setting Swap Memory thresholds on Windows systems appears to be the correct way to monitor Pagefile Usage %.

However, there seems to be a bug with messages being applied incorrectly to threshold settings/breaches.

With the cdm Pagefile message settings configured, the PagefileError message should be applied, but the alarms show that it is not.

As mentioned, the Swap Memory thresholds do seem to monitor Pagefile Usage for Windows, but the SwapError and SwapWarning messages are still applied to these threshold breaches, instead of the desired/configured PagefileError/Warning message.

Environment

Component : UIM - CDM WITH IOSTAT

- UIM v 20.10
- Hub v 9.30
- Robot v 9.30HF1
- cdm probe v 6.50

Cause

- The alarm message uses the term 'swap' on both Windows and UNIX/Linux, even though on Windows the value being reported is pagefile usage.

Resolution

In our cdm help doc, you can see some of the following information:

In the Release Notes, as of Jan. 2007, cdm v3.31, "Windows: Swap memory usage now reflects the pagefile usage."

cdm Release Notes
https://techdocs.broadcom.com/us/en/ca-enterprise-software/it-operations-management/ca-unified-infrastructure-management-probes/GA/alphabetical-probe-articles/cdm-cpu-disk-memory-performance-monitoring/cdm-cpu-disk-memory-performance-monitoring-release-notes.html

Swap versus Paging
https://stackoverflow.com/questions/1688962/whats-the-difference-between-operating-system-swap-and-page

Swapping and paging are similar concepts. With paging, the (physical) memory is divided into small blocks called "frames", and the (logical) memory of each program is divided into blocks called "pages". Pages and frames have the same size; each page is then mapped to a frame. This mapping is performed via page tables. Paging solves fragmentation problems that were present with earlier memory-management schemes.

With swapping, parts of memory which are not in use are written to disk; this enables one to run several programs whose total memory consumption is greater than the amount of physical memory. When a program makes a request for a part of memory that was written to the disk, that part has to be loaded into memory. To make room for it, another part has to be written to the disk (effectively the two parts swap places - hence the name). This "extension" of physical memory is generally known as "virtual memory".

Modern systems use both paging and swapping, and pages are what is being swapped in and out of memory. Different terms for pretty much the same thing. They both refer to an area of virtual memory that is (usually) stored on the hard drive.

UNIX/Linux systems administrators call it "swap": reserved space on the hard drive that the system uses when physical memory (RAM) is full. In UNIX/Linux, swap space is generally a separate partition used for virtual memory. It contains pages, which are blocks of memory that can be exchanged in and out of real memory.

Windows admins call it the Pagefile. On Windows it is typically a file stored somewhere on the OS's filesystem.

In the cdm Memory usage section, select one of the following memory categories to specify the threshold:

M: allows you to specify the threshold values for total memory usage.
S: allows you to specify the threshold values for swap memory usage.
P: allows you to specify the threshold values for physical memory usage.

High: specifies the memory usage above which the probe generates a higher severity alarm.
Low: specifies the memory usage above which the probe generates a lower severity alarm.
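
As an illustration of how the High and Low thresholds relate to the two alarm severities, here is a minimal Python sketch. It is not the probe's code: the function name `classify_usage` is hypothetical, and the strict "greater than" comparison is an assumption based on the alarm wording ("which is above the error threshold").

```python
from typing import Optional


def classify_usage(usage_pct: float, low: float, high: float) -> Optional[str]:
    """Return which threshold a usage value breaches, or None if neither.

    Assumption: a breach means the value is strictly above the threshold.
    """
    if usage_pct > high:
        return "high"  # higher severity alarm (e.g. the error message)
    if usage_pct > low:
        return "low"   # lower severity alarm (e.g. the warning message)
    return None        # below both thresholds, no alarm


# Example: 6.56% usage checked against Low = 1% and High = 2%
print(classify_usage(6.56, low=1.0, high=2.0))
```

Checking the High threshold first means a value that exceeds both thresholds is reported only at the higher severity.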

cdm Paging activity section:

High: specifies the number of paging data operations per second above which the probe generates a higher severity alarm.
Low: specifies the number of paging data operations per second above which the probe generates a lower severity alarm.

Swap Memory Usage: enables you to generate QoS messages for the space on the disk used for the swap file in Kbytes.
Swap Memory in %: enables you to generate QoS messages for the space on the disk used for the swap file in %.

Out of the box, the swap alarm message will generate for both Windows and UNIX/Linux machines.

Historically speaking, 'Swap'/'Swap space' is a term used mainly on UNIX/Linux systems, and 'Paging' or 'Pagefile' is used on Windows.

When you enable the cdm SwapError and SwapWarning alarms, you can check the sample value/values (averaged) against the actual Paging File on Windows via the Windows Performance counter (Paging File->% Usage). Description: The amount of the Page File instance in use in percent.

Note that if you use 1 sample, the value will be the same as what the perfmon counter displays. If you use multiple samples, e.g., 5, it will be the average of those samples. Even on a Windows system, the alarm message generated for swap memory usage, e.g., "Average (1 samples) swap memory usage is now 6.56%, which is above the error threshold (2%)", is really reporting the Windows 'Paging File->% Usage' perfmon counter value, not swap, so the message should reflect that fact.
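
The effect of the sample count can be sketched in a few lines of Python. This is only an illustration under assumptions: the class name `SampleWindow` is hypothetical, and treating the samples as a sliding window of the most recent readings is an assumption about how the probe averages, not something the cdm documentation confirms.

```python
from collections import deque


class SampleWindow:
    """Keeps the most recent `size` readings and reports their average."""

    def __init__(self, size: int) -> None:
        # deque with maxlen drops the oldest reading automatically
        self.values = deque(maxlen=size)

    def add(self, usage_pct: float) -> float:
        """Record one reading and return the current average."""
        self.values.append(usage_pct)
        return sum(self.values) / len(self.values)


# With 1 sample, the reported value is the raw reading itself.
one = SampleWindow(size=1)
print(one.add(6.56))

# With 5 samples, the reported value is the average of the readings.
five = SampleWindow(size=5)
for reading in (2.0, 4.0, 6.0, 8.0, 10.0):
    averaged = five.add(reading)
print(averaged)
```

When comparing against perfmon, a single-sample configuration is the easiest to verify because the alarm value and the 'Paging File->% Usage' counter should match directly.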

WORKAROUND:

In cdm, create a custom message under Setup->General->Message definitions: right-click and create a new message, for example "CustomPagingError", with message text like so:

Average ($value_number samples) pagefile usage is now $value$unit, which is above the error threshold ($value_limit$unit)

with severity of Major.

Click Ok to save it in the cdm probe.

Then click on the Custom tab in cdm, right-click, and select "New Memory Profile..."

Name: Paging
Select the Swap memory Tab
Select High and set the major message to "CustomPagingError".

Click Ok. (Note: you can follow this same process for the Low threshold if desired.)

Next, edit the cdm probe via Raw Configure and, under memory -> alarm, set action = no for both swap error and swap warning to disable the default swap messages.

From that point on, you will only get the custom alarm message, e.g.,

"Average (1 samples) pagefile usage is now 7.50%, which is above the error threshold (1%)"
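
To show how the variables in the custom message text expand into the alarm above, here is a small Python sketch. It uses Python's `string.Template` as a stand-in for the probe's own substitution, which happens to accept the same `$name` syntax. The function name `render_alarm` and the number formatting (two decimals for the value, shortest form for the limit) are assumptions for illustration only.

```python
from string import Template

# Message text copied from the CustomPagingError definition above.
CUSTOM_PAGING_ERROR = Template(
    "Average ($value_number samples) pagefile usage is now $value$unit, "
    "which is above the error threshold ($value_limit$unit)"
)


def render_alarm(samples: int, value: float, limit: float, unit: str = "%") -> str:
    """Fill in the message variables; the number formatting is an assumption."""
    return CUSTOM_PAGING_ERROR.substitute(
        value_number=samples,
        value=f"{value:.2f}",
        unit=unit,
        value_limit=f"{limit:g}",
    )


print(render_alarm(samples=1, value=7.5, limit=1))
```

Because the message is only a text template, changing "swap memory" to "pagefile" in the definition changes the alarm wording without affecting which counter is measured.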