
No Reachability Threshold Alarms


Article ID: 252553



Products

CA Performance Management - Usage and Administration

Issue/Introduction

An alarm threshold was defined for Reachability / Average Response (ms) with Threshold Violation > 10 and Clear < 9, but no event appears in the Event Display.

 

Environment

Release : 21.2

Cause

A large number of events remain unresolved in the Event Manager (EM) database:

mysql> select count(*) from em.unresolved_event_items\G
*************************** 1. row ***************************
count(*): 395182
1 row in set (0.93 sec)
 
After performing an EM Full Resynchronization, the count has not decreased:
 
mysql> select count(*) from em.unresolved_event_items\G
*************************** 1. row ***************************
count(*): 395464
1 row in set (0.03 sec)

 

DM log error message:
INFO   | jvm 1    | 2022/09/07 00:03:56 | ERROR | Thread-5024              | 2022-09-07 00:03:56,708 | com.ca.im.portal.dm.inventory.Inventory2WSImpl
INFO   | jvm 1    | 2022/09/07 00:03:56 |       | gatherDataThread (InvWS_cac3340f898a45f0894a82149542317e)
INFO   | jvm 1    | 2022/09/07 00:03:56 | com.ca.im.portal.dm.inventory.InventoryTimeoutException: null
INFO   | jvm 1    | 2022/09/07 00:03:56 |  at com.ca.im.portal.dm.inventory.Inventory2WSImpl.gatherData(Inventory2WSImpl.java:875) [classes/:?]
INFO   | jvm 1    | 2022/09/07 00:03:56 |  at com.ca.im.portal.dm.inventory.Inventory2WSImpl.run(Inventory2WSImpl.java:847) [classes/:?]
INFO   | jvm 1    | 2022/09/07 00:03:56 |  at java.lang.Thread.run(Unknown Source) [?:?]

Resolution

Stop all PC services:

service caperfcenter_devicemanager stop
service caperfcenter_console stop
service caperfcenter_sso stop
service caperfcenter_eventmanager stop

Start the SSO service:
service caperfcenter_sso start

Wait one minute, then start the event manager and device manager:
service caperfcenter_eventmanager start
service caperfcenter_devicemanager start

Wait one minute, then start the console service:
service caperfcenter_console start
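
The stop/start order above could be scripted along the following lines. This is only a sketch, not a vendor-supplied tool: the function names and the DRY_RUN and WAIT_SECONDS variables are assumptions introduced here for illustration. With DRY_RUN=1 (the default in this sketch) the commands are printed but not executed.

```shell
#!/bin/sh
# Sketch: restart the PC services in the order listed in this article.
# DRY_RUN and WAIT_SECONDS are hypothetical helper variables, not product settings.
DRY_RUN="${DRY_RUN:-1}"              # 1 = print commands only, 0 = execute them
WAIT_SECONDS="${WAIT_SECONDS:-60}"   # the "wait one minute" pause between phases

run() {
    # Print the command; execute it only when DRY_RUN is 0.
    echo "$*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

pause() {
    # Wait between start phases; skipped in dry-run mode.
    if [ "$DRY_RUN" = "0" ]; then
        sleep "$WAIT_SECONDS"
    fi
}

restart_pc_services() {
    # Stop order as given in the article.
    for svc in devicemanager console sso eventmanager; do
        run service "caperfcenter_${svc}" stop
    done
    # Start SSO first, wait, then Event Manager and Device Manager, wait, then console.
    run service caperfcenter_sso start
    pause
    run service caperfcenter_eventmanager start
    run service caperfcenter_devicemanager start
    pause
    run service caperfcenter_console start
}

restart_pc_services
```

Review the printed commands first, then run the script with DRY_RUN=0 as a user allowed to manage the services.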

Perform EM Full Resynchronization

After the inventory sync completes, check em.unresolved_event_items to confirm that the count is going down. Repeat the following query several times:
mysql> select count(*) from em.unresolved_event_items;
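
Repeating the count query by hand can be replaced by a small sampling loop. The sketch below is an illustration only: it assumes the mysql client can connect without extra options (for example through an option file), and the function names get_unresolved_count, sample_counts, and is_decreasing are hypothetical helpers, not part of the product.

```shell
# Sketch: sample the unresolved-event count and report the trend.
# Assumes the mysql client can connect without extra options;
# add host/user/password options as required in your environment.

get_unresolved_count() {
    # -N suppresses the column header, -B gives plain batch output.
    mysql -N -B -e 'select count(*) from em.unresolved_event_items;'
}

is_decreasing() {
    # Succeeds when the last sample is lower than the first one.
    first="$1"
    for last in "$@"; do :; done
    [ "$last" -lt "$first" ]
}

sample_counts() {
    # Usage: sample_counts <samples> <interval_seconds>
    # Prints the sampled counts on one line, separated by spaces.
    samples="$1"; interval="$2"; counts=""
    i=0
    while [ "$i" -lt "$samples" ]; do
        counts="$counts $(get_unresolved_count)"
        i=$((i + 1))
        [ "$i" -lt "$samples" ] && sleep "$interval"
    done
    echo $counts
}
```

For example, `is_decreasing $(sample_counts 5 60) && echo "count is going down"` takes five samples one minute apart. With the two counts shown in the Cause section (395182, then 395464), `is_decreasing` would fail, which matches the problem state.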

Additional Information

Troubleshoot:

Example for the DA device with ItemID 328481:

Vertica query

select alarm_id, item_id, start_time, to_timestamp(start_time) from dauser.alarm where item_id = 328481 and alarm_id = 5765149;


alarm_id             | 5765149
dcm_id               | 0
pollgroup_id         | 0
item_id              | 328481
description          | A Threshold Violation event has been raised. (Profile Name: Latency, Rule Name: Latency)
description_extended | {DbColumnName: im_AvgResponse,Duration: 30,Window: 30,Rule Type: Constant,Exceed Operator: >,Exceed Threshold: 0.0,Clear Operator: <,Clear Threshold: 1.0,Max Polled Value: 0.0,Metric: {http://im.ca.com/normalizer}NormalizedReachabilityInfo.AvgResponse,MetricFamily: {http://im.ca.com/normalizer}NormalizedReachabilityInfo }
rule_id              | 4820190
severity             | 1
start_time           | 1664997300
clear_time           | 0
clear_reason_code    | 0
alarm_type           | Threshold
profile_name         | Latencia
profile_id           | 4820191

mysql> select *, from_unixtime(OccurredOn) from em.event_properties where Name='_Alarm_ID' and value=5765149;

+------------+------------+-----------+------+---------+-----------+---------------------------+
| NPCEventID | OccurredOn | Name      | Type | Value   | CultureID | from_unixtime(OccurredOn) |
+------------+------------+-----------+------+---------+-----------+---------------------------+
|   11054549 | 1664997300 | _Alarm_ID |    0 | 5765149 |           | 2022-10-05 19:15:00       |
+------------+------------+-----------+------+---------+-----------+---------------------------+
1 row in set (48.44 sec)

mysql> select *, from_unixtime(OccurredOn) from em.events where NPCEventID=11054549\G

*************************** 1. row ***************************
               NPCEventID: 11054549
               ProducerID: 3
             LocalEventID: ce781707-6d76-49ad-ac2a-b6ad9782da730
                   TypeID: 29
                SubTypeID: 30
                   MetaID: 22
                 Category: 4
               OccurredOn: 1664997300
                    State: 0
              Description: A Threshold Violation event has been raised. (Profile Name: Latency, Rule Name: Latency)
                   ItemID: 0
                 ItemType: NULL
              ItemSubType: NULL
             ParentItemID: NULL
                 Severity: 1
       ThresholdProfileID: 4820191
        ThresholdFolderID: 4820188
               ReceivedOn: 1664997422
from_unixtime(OccurredOn): 2022-10-05 19:15:00
1 row in set (0.00 sec)
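
The OccurredOn and ReceivedOn epoch values in this row can be compared directly to see how long the event took to reach the Event Manager. The sketch below assumes GNU date (the `-d @EPOCH` form); the helper name epoch_to_utc is hypothetical, and the two values are copied from the row above.

```shell
# Sketch: convert event epoch values and compute the receive delay.
# Assumes GNU date; values are taken from the em.events row above.
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

occurred_on=1664997300
received_on=1664997422

echo "OccurredOn: $(epoch_to_utc "$occurred_on") UTC"
echo "ReceivedOn: $(epoch_to_utc "$received_on") UTC"
echo "Delay: $((received_on - occurred_on)) seconds"
```

The UTC conversion of OccurredOn matches the from_unixtime value shown in the query output, and the delay works out to 122 seconds, so the event itself was received promptly; the problem lies in resolving its item, not in delivery.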

mysql> select *, from_unixtime(OccurredOn) from em.unresolved_event_items where NPCEventID=11054549\G
*************************** 1. row ***************************
               NPCEventID: 11054549
               OccurredOn: 1664997300
                NPCItemID: NULL
              ItemIsLocal:
                ItemIndex: 0
             ItemTypeName: device
          ItemSubTypeName: router
              LocalItemID: 328481
               ProducerID: 3
from_unixtime(OccurredOn): 2022-10-05 19:15:00
1 row in set (0.00 sec)

mysql> select count(*) from em.unresolved_event_items\G
*************************** 1. row ***************************
count(*): 395182
1 row in set (0.93 sec)
 
Related article: Article ID 213940 - Looking for an alarm from Vertica to CAPC