Polling timeout error Events raised in Performance Management

Article ID: 142755

Products

  • CA Infrastructure Management
  • CA Performance Management - Usage and Administration
  • DX NetOps

Issue/Introduction

Performance Management raises Events on devices when polling is reduced due to timeouts.

The same Events can also be seen in Spectrum as Alarms.

In both instances the Event Message states:

  • Polling has been temporarily reduced due to prior timeouts.

These Events are synchronized to Spectrum via the integration between products.

Cause

Devices polled by the Performance Management Data Aggregator are not responding to poll requests before the requests time out.

Environment

All supported Performance Management releases

Resolution

The correct solution is to resolve the problem that causes the device to time out during poll requests. The steps to determine and set the correct polling configuration for devices showing these Events are found in the Polling Sensitive and Critical Devices Without a Performance Impact section of the documentation.

An alternative, though less desirable, solution is to disable the Events so they are no longer raised, using the steps below. Before disabling them, review the limitations noted in the Additional Information section. These Events are important and indicate a problem in the network.

To stop these Events from being raised, use the REST API to disable the Event Rule.

  • Open a REST client.
  • Set Content-Type=application/xml

First run a GET against the following URL, where DA is the Data Aggregator host:

  • DA:8581/rest/eventrules

Find the Event Rule with the Name:

  • <Name>Polling Safety Valve</Name>

Note the ID of that rule, found above the Name value. Run a new GET against the URL:

  • DA:8581/rest/eventrules/<ID>
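The lookup above can also be scripted. The following is a minimal Python sketch, not a supported tool. It assumes the list response contains <EventRule> elements shaped like the single-rule output (an <ID> child and an <Item>/<Name> child) with no XML namespace on the element names. The <EventRuleList> wrapper and the second rule in SAMPLE are hypothetical and exist only to make the example runnable without a Data Aggregator.

```python
import xml.etree.ElementTree as ET


def find_rule_ids(xml_text, rule_name):
    """Return the <ID> text of every <EventRule> whose <Item>/<Name> equals rule_name."""
    root = ET.fromstring(xml_text)
    ids = []
    # iter() also yields the root element, so a single-rule response works too.
    for rule in root.iter("EventRule"):
        if rule.findtext("Item/Name") == rule_name:
            ids.append(rule.findtext("ID"))
    return ids


# Hypothetical sample response; the wrapper element name is an assumption.
SAMPLE = """<EventRuleList>
  <EventRule version="1.0.0">
    <ID>459</ID>
    <Item version="1.0.0"><Name>Some Other Rule</Name></Item>
  </EventRule>
  <EventRule version="1.0.0">
    <ID>460</ID>
    <Item version="1.0.0"><Name>Polling Safety Valve</Name></Item>
  </EventRule>
</EventRuleList>"""

print(find_rule_ids(SAMPLE, "Polling Safety Valve"))
```

In practice, xml_text would be the body returned by the GET against DA:8581/rest/eventrules. If the real response uses namespaced element names, the tag lookups need to be adjusted accordingly.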

The following is an example of the default Event Rule, taken from a lab system.

<EventRule version="1.0.0">
  <ID>460</ID>
  <Enabled>true</Enabled>
  <MetricFamily>{http://im.ca.com/normalizer}NormalizedDevicePollingStatistics</MetricFamily>
  <Window>300</Window>
  <ClearConditionList>
    <ClearCondition>
      <ClearOperator>LESS_THAN</ClearOperator>
      <ClearValue>1</ClearValue>
    </ClearCondition>
  </ClearConditionList>
  <Duration>300</Duration>
  <Severity>MAJOR</Severity>
  <AggregateToDevice>false</AggregateToDevice>
  <ViolationConditionList>
    <ViolationCondition>
      <ViolationConditionType>CONSTANT</ViolationConditionType>
      <ViolationPerformanceMetric>IsPollingStoppedDueToPriorTimeouts</ViolationPerformanceMetric>
      <ViolationOperator>GREATER_THAN_OR_EQUAL_TO</ViolationOperator>
      <ViolationValue>1</ViolationValue>
    </ViolationCondition>
  </ViolationConditionList>
  <Item version="1.0.0">
    <Name>Polling Safety Valve</Name>
    <Description>Polling stopped due to timeouts</Description>
    <CreateTime>Mon Sep 30 12:12:36 2019 -0400</CreateTime>
  </Item>
</EventRule>

To disable the rule, change the <Enabled> value from true to false:

  • Set REST client to issue a PUT.
  • Set the URL to:
    • DA:8581/rest/eventrules/<ID>
  • Add the following to the body section:

<EventRule version="1.0.0">
<Enabled>false</Enabled>
</EventRule>

  • Run the PUT request and confirm a 200 Success response is received.
  • Refresh the GET against the rule ID URL and confirm the <Enabled> value is set to false.
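For those who prefer a script to a REST client, the PUT and the follow-up check can be sketched in Python with only the standard library. This is an illustration under stated assumptions, not a supported tool: the http:// scheme, the helper names build_disable_request and is_enabled, and the rule ID 460 (taken from the lab example) are all assumptions. The request is only built here, not sent, so the sketch runs without a Data Aggregator.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Body that flips the rule to disabled, matching the PUT body described above.
DISABLE_BODY = (
    b'<EventRule version="1.0.0">\n'
    b"<Enabled>false</Enabled>\n"
    b"</EventRule>\n"
)


def build_disable_request(base_url, rule_id):
    """Build (but do not send) the PUT request that disables an Event Rule."""
    url = f"{base_url}/rest/eventrules/{rule_id}"
    return urllib.request.Request(
        url,
        data=DISABLE_BODY,
        method="PUT",
        headers={"Content-Type": "application/xml"},
    )


def is_enabled(rule_xml):
    """Return True if the rule XML from a GET reports <Enabled>true</Enabled>."""
    value = ET.fromstring(rule_xml).findtext("Enabled", default="")
    return value.strip().lower() == "true"


# Assumed scheme and lab rule ID; replace DA with the Data Aggregator host.
req = build_disable_request("http://DA:8581", "460")
print(req.get_method(), req.full_url)

# To actually send the request against a reachable Data Aggregator:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```

After sending the PUT, passing the body of a fresh GET against the rule ID URL to is_enabled should return False if the change took effect.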

Once this is completed, no new instances of this Event are raised in Performance Management, so none are sent to Spectrum to raise Alarms.

Additional Information