APM SaaS - The New "Agent States" and "ConnectionState" supportability Metrics
Article ID: 269887

Products

CA Application Performance Management SaaS, DX APM SaaS, DX Application Performance Management

Issue/Introduction

The "ConnectionStatus" metric has been replaced by the new "Agent States" and "ConnectionState" Agent Health metrics.

 

To summarize the challenges with the old ConnectionStatus metric:

1. A 2-minute delay in reporting a disconnect caused by an agent shutdown
2. A 20-minute delay in reporting a disconnect caused by an interruption
3. The metric path of ConnectionStatus changes with the agent's collector, which impedes alert definitions
4. An agent switching between collectors temporarily has two states, Disconnected on the old collector and Connected on the new one, which also impedes alert definitions

For more details, refer to: https://knowledge.broadcom.com/external/article/224265

Environment

DX APM SaaS only

Resolution

Starting with APM SaaS 2023.5.1.21 (released last August), a new internal calculator emits the following new Agent Health metrics:

a) Overall agent status:

"SuperDomain|Custom Metric Host (Virtual)|Custom Metric Process (Virtual)|Custom Metric Agent (Virtual)|Agents|Agent States"
  • NoData - 0
  • Unstable - 2 : the agent is connected but the connection is poor. The agent and collectors regularly ping each other and measure
    the latency. If the latency is high, or varies too much between calls, the connection is considered unstable.
  • Alive - 4 : the agent is connected and alive metrics are being received
  • Shutdown - 5 : the agent has disconnected gracefully, e.g. the process hosting the agent was stopped
  • Reconnected - 6 : the agent has reconnected to another collector
  • Interrupted - 7 : the agent has not disconnected but APM is not receiving any metrics. This could be due to a network outage.
  • Aged Out - 8 : the agent has not reported for 24 hours

NOTES:

- The legacy values 1 (CONNECTED) and 3 (DISCONNECTED) are not used; they are remapped to 4 (ALIVE) or 5 (STOPPED, listed above as Shutdown).

- The difference between STOPPED (5) and INTERRUPTED (7) is whether the agent disconnected or not. In both states no metrics are being received from the agent, but only STOPPED follows a graceful disconnect.
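
The state values and notes above can be captured in a small lookup for scripts that consume these metrics. The following is only a minimal sketch in Python: the names AgentState, LEGACY_REMAP, decode_state and is_unplanned_outage are hypothetical and not part of APM, and the pairing of legacy value 1 with ALIVE and legacy value 3 with STOPPED is an assumption inferred from the note above.

```python
from enum import IntEnum


class AgentState(IntEnum):
    """Numeric values of the Agent Health metrics, as listed in this article."""
    NO_DATA = 0
    UNSTABLE = 2
    ALIVE = 4
    SHUTDOWN = 5       # referred to as STOPPED in the notes
    RECONNECTED = 6
    INTERRUPTED = 7
    AGED_OUT = 8


# Assumption: legacy 1 (CONNECTED) maps to ALIVE and legacy 3 (DISCONNECTED)
# maps to SHUTDOWN. The article only says they are remapped to 4 or 5.
LEGACY_REMAP = {1: AgentState.ALIVE, 3: AgentState.SHUTDOWN}


def decode_state(value: int) -> AgentState:
    """Translate a raw metric value into a named state."""
    if value in LEGACY_REMAP:
        return LEGACY_REMAP[value]
    return AgentState(value)  # raises ValueError for values not listed above


def is_unplanned_outage(state: AgentState) -> bool:
    """True when metrics stopped without a graceful disconnect (Interrupted)."""
    return state is AgentState.INTERRUPTED
```

With such a helper, an alerting script could treat Interrupted (7) differently from Shutdown (5), since only the latter follows a graceful disconnect.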

 
b) For individual agent states:
 
"SuperDomain|Custom Metric Host (Virtual)|Custom Metric Process (Virtual)|Custom Metric Agent (Virtual)|Agents|<Host>|<Process>|<Agent>:ConnectionState"

Where <Host>, <Process> and <Agent> are the agent's host name, process name and agent name

Possible values: *same as above*
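
To illustrate how the per-agent metric path is composed, the sketch below fills the <Host>, <Process> and <Agent> placeholders for one agent. It is written in Python purely as an example: the function name connection_state_metric and the sample names host01, Tomcat and Agent1 are hypothetical, and the check that rejects names containing "|" is an assumption made to keep the path unambiguous, not a documented APM rule.

```python
# Common prefix of the new Agent Health metrics, copied from this article.
AGENTS_PREFIX = (
    "SuperDomain|Custom Metric Host (Virtual)|Custom Metric Process (Virtual)"
    "|Custom Metric Agent (Virtual)|Agents"
)


def connection_state_metric(host: str, process: str, agent: str) -> str:
    """Build the ConnectionState metric path for one agent."""
    for part in (host, process, agent):
        # Assumption: empty names or names containing the path separator
        # would make the resulting path ambiguous, so they are rejected.
        if not part or "|" in part:
            raise ValueError(f"invalid path segment: {part!r}")
    return f"{AGENTS_PREFIX}|{host}|{process}|{agent}:ConnectionState"


print(connection_state_metric("host01", "Tomcat", "Agent1"))
```

Because the path no longer includes the collector, the same path identifies the agent regardless of which collector it is connected to.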


c) New calculator processing time: 

SuperDomain|Custom Metric Host (Virtual)|Custom Metric Process (Virtual)|Custom Metric Agent (Virtual)|Agents:ConnectionState processing time (in ms)

 

IMPORTANT: These new metrics are not enabled by default in your tenant(s). Contact Broadcom Support for assistance and reference Enhancement # US899146.

 

Additional Information

APM OnPrem - The "AgentConnectivity" javascript calculator, a solution to the Agent "ConnectionStatus" limitations
https://knowledge.broadcom.com/external/article/224265