I have several applications that depend on one another, and I would like to know if there is a method to correlate alarms between them.
For example:
In this scenario, I would like to know if there is a way to relate or identify that the alerts in Applications 1 and 2 are caused by the issue in Application 3.
Does a mechanism or feature like this already exist? If not, would it be possible to implement logic to handle this type of dependency correlation? I am looking for something similar to the services in OI, but focused on alerts and email notifications rather than SLOs or performance metrics.
DX O2 SaaS
You can do this by creating a custom situation. After logging in to the SaaS tenant, go to Settings >> Custom Situation Definitions.
1- Ensure Dependencies are Mapped (Topology)
The system must know that the applications depend on one another. If you are using APM agents, these dependencies are usually discovered automatically (e.g., App 1 calling App 3's API). If they are custom external applications, you can build the relationship manually using the platform's Service Modeler, or ingest the topology via the REST API/RESTMon, so that the edges (connections) between the applications are established.
2- Enable Alarm Clustering
Ensure that machine-learning-based Alarm Clustering (Situations) is active. The engine evaluates the timeframe (alarms occurring at the same time) and the topology (the dependencies) to cluster related alarms together.
3- Create an Action Policy for Notifications
This is the most critical step for your email requirement. Instead of setting up a standard notification policy that emails you for every critical alarm, configure an Action Policy that triggers based on the AIOps correlation. You can set the policy to send an email only when a Situation is created, or only for alarms tagged as the Root Cause.
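To illustrate the idea behind the steps above, here is a minimal Python sketch of correlating alarms by topology and time. It is not the product's clustering engine or API. The names `DEPENDS_ON`, `Alarm`, `find_root_causes`, and the 300-second window are assumptions made only for this example. An alarm is treated as a symptom when one of its direct or indirect dependencies also raised an alarm within the window; whatever remains is a root-cause candidate and is the only alarm you would notify on.

```python
from dataclasses import dataclass

# Hypothetical topology: each application maps to the applications it depends on.
DEPENDS_ON = {
    "App1": {"App3"},
    "App2": {"App3"},
    "App3": set(),
}


@dataclass(frozen=True)
class Alarm:
    app: str
    timestamp: float  # seconds since epoch
    message: str


def transitive_dependencies(app, depends_on):
    """Return every application that `app` depends on, directly or indirectly."""
    seen = set()
    stack = list(depends_on.get(app, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(depends_on.get(current, ()))
    seen.discard(app)  # ignore self-references caused by dependency cycles
    return seen


def find_root_causes(alarms, depends_on, window_seconds=300.0):
    """Return alarms that are not explained by an alarm on an upstream dependency."""
    roots = []
    for alarm in alarms:
        upstream = transitive_dependencies(alarm.app, depends_on)
        explained = any(
            other.app in upstream
            and abs(other.timestamp - alarm.timestamp) <= window_seconds
            for other in alarms
        )
        if not explained:
            roots.append(alarm)
    return roots


if __name__ == "__main__":
    alarms = [
        Alarm("App1", 1000.0, "Response time high"),
        Alarm("App2", 1010.0, "Error rate high"),
        Alarm("App3", 990.0, "Service unavailable"),
    ]
    for root in find_root_causes(alarms, DEPENDS_ON):
        # In a real setup this is where a single email would be sent.
        print(f"Notify on root cause: {root.app} ({root.message})")
```

With the sample data, only the App3 alarm is reported, because the App1 and App2 alarms fall within the window of an alarm on their dependency. Note that in this sketch, applications that depend on each other in a cycle and alarm together would all be treated as symptoms, so a real implementation would need a tie-breaking rule.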