CA SOI - Event / Alarm Storm in eHealth caused Alert counts to be out in SOI

Article ID: 112280

Products

CA Service Operations Insight (SOI)

Issue/Introduction



A customer experienced an "event storm" in eHealth. Because an eHealth limit was reached during the storm, many events were never cleared, which left the alert counts in SOI out of sync with eHealth.

Can the two systems be brought back into sync manually, for example by a restart? If so, what steps are required?

Environment

Release:
Component: CI0002

Resolution

Alarm/alert storms are a known occurrence with products such as Spectrum, UIM, and eHealth. UIM has prevention measures that can detect and suppress potential alarm storms. For eHealth, raise a new ticket with the eHealth team to ask for best practices or guides that prevent this from recurring.

SOI can handle a large number of alarms. However, if a flood of alerts arrives within just a few seconds, the SOI Manager goes into a hung state and stops processing those alerts. An Idea is open in the SOI Community for this scenario, and voting for it is recommended:

SOI Alarm functionality to detect Alarm Storms - https://communities.ca.com/ideas/235736086-soi-alarm-functionality-to-detect-alarm-storms 
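
To illustrate what "detecting an alarm storm" could mean in practice, the following is a minimal sketch of a sliding-window rate check. It is not part of SOI, UIM, or eHealth and does not use any of their APIs; the class name `StormDetector` and the threshold and window values are assumptions chosen purely for illustration.

```python
from collections import deque


class StormDetector:
    """Hypothetical sliding-window detector (illustration only, not a product API)."""

    def __init__(self, max_alerts, window_seconds):
        self.max_alerts = max_alerts          # assumed threshold
        self.window_seconds = window_seconds  # assumed window length in seconds
        self._timestamps = deque()

    def observe(self, timestamp):
        """Record one alert; return True if the window now holds more than max_alerts."""
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop alerts that have fallen out of the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_alerts
```

A caller would pass each alert's arrival time (in seconds) to `observe` and could hold back or de-duplicate alerts while it returns `True`. Suitable thresholds depend on the environment and would need to be tuned.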

To resolve the problem, clear the duplicate alerts created by the flood on the eHealth side, and then recycle (restart) the Connector services. After this, the alarms should be synchronized and the problem on the SOI side resolved.