Using the "AgentConnectivity" javascript calculator to resolve limitations in the Agent "ConnectionStatus" metric

Article ID: 224265

Updated On:

Products

DX Application Performance Management

CA Application Performance Management (APM / Wily / Introscope)

Issue/Introduction

This article is about identifying, in a timely manner, when agents are stopped versus when they disconnect and then reconnect due to network/connectivity issues. That is, differentiating between:

An agent being orderly stopped (and then disconnecting) and then later restarted (and connecting again). Let's call this orderly shutdown.

An agent disconnecting due to network interruption and then reconnecting without the agent having stopped. Let's call this connection interruption.

There are currently no means in the EM or Agent to explicitly differentiate between orderly shutdown - which is most often intentional - and connection interruption - which is most often an issue.

Challenges with current ConnectionStatus metric

The current ConnectionStatus metric poses several challenges:

-There is a 2 min delay in reporting a disconnect that is due to an agent shutdown.
-There is a 20 min delay in reporting a disconnect that is due to an interruption.
-The metric path of the Connection Status changes with the agent's collector, which impedes alert definitions.
-An agent switching its connection between collectors temporarily has two states: Disconnected with the old collector and Connected with the new one, which impedes alert definitions.

This calculator resolves these challenges:

-Connection State has a stable metric path under the virtual agent | Agents
-Consolidates an agent's Connection Status into a single Connection State
-States 1 through 3 correspond between Connection Status and Connection State
-State 4, Alive, indicates that the agent is connected and alive metrics are being reported by the agent
-States 5 through 8 are new, allowing identification of Shutdown (5), Reconnection (6), Interruption (7), and Age Out (8)
-Agent disconnection/reconnection scenarios.
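
The consolidation idea can be illustrated with a minimal sketch. This is not the logic of the attached calculator. It assumes that ConnectionStatus uses 1 for connected (a value this article does not list), and the input shape and the names `STATUS` and `consolidateStatus` are hypothetical.

```javascript
// Assumed ConnectionStatus values; 1 = connected is NOT stated in this article.
const STATUS = Object.freeze({ NO_DATA: 0, CONNECTED: 1, UNSTABLE: 2, DISCONNECTED: 3 });

// Hypothetical input: one entry per collector that reports a ConnectionStatus
// for the same agent, e.g. { collector: "old", status: 3 }.
function consolidateStatus(perCollector) {
  const seen = new Set(perCollector.map((entry) => entry.status));
  // Prefer the most "connected" value, so a stale Disconnected entry on the
  // old collector does not mask the live connection on the new collector.
  for (const status of [STATUS.CONNECTED, STATUS.UNSTABLE, STATUS.DISCONNECTED]) {
    if (seen.has(status)) {
      return status;
    }
  }
  // No collector reported anything usable.
  return STATUS.NO_DATA;
}
```

With an agent that has just moved collectors, `consolidateStatus([{ collector: "old", status: 3 }, { collector: "new", status: 1 }])` yields a single value instead of two conflicting ones, which is what makes one alert definition per agent possible.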

Environment

  • DX APM 10.*
  • DX APM SaaS
  • DX O2 24*

Resolution

Deploy the "AgentConnectivity" Javascript calculator (v1.5.3)


INSTALLATION STEPS:


For APM 2x:
1. Download attached AgentConnectivity.txt
2. Rename it as .js, for example AgentConnectivity.js
3. Log in to DX SaaS
4. Go to APM > Settings > Javascript Extensions
5. Click "Create New Extension"
6. Follow the steps as documented in Configure JavaScript Extensions

For APM 10x:
1. Download attached AgentConnectivity.txt
2. Rename it as .js, for example AgentConnectivity.js
3. Copy the JavaScript text file into the <EM_Home>/scripts directory.

For more information refer to Using JavaScript Calculators

For APM SaaS:
Refer to APM SaaS - The New "Agent States" and "ConnectionState" Supportability Metrics


Explanation of the Javascript customization:

Disconnection and Reconnection Determination

The default ConnectionStatus metric is not sufficient on its own to determine an agent's Connection State. Instead, the calculator relies on a few “alive metrics” that stop reporting immediately when an agent disconnects: % Time Spent in GC, % CPU Utilization (Host), and Bytes In Use. These are used in conjunction with the ConnectionStatus metric to determine the agent’s Connection State:

-NoData - 0 - same as Connection Status
-Unstable - 2 - same as Connection Status
-Disconnected - 3 - same as Connection Status
-Alive - 4 - The agent is connected and alive metrics are being received
-Shutdown - 5 - The agent has been shut down in an orderly manner
-Reconnected - 6 - The agent has reconnected to another collector
-Interrupted - 7 - The agent was stopped in a disorderly manner or its connection was interrupted
-Aged Out - 8 - The agent hasn't reported for 24 hours
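
When consuming the Connection State value in custom scripts or reports, a small lookup can help. The sketch below only restates the codes listed above. The names `CONNECTION_STATES`, `describeConnectionState`, and `isLikelyProblem` are hypothetical and are not part of the attached calculator.

```javascript
// State codes and names exactly as listed in this article.
const CONNECTION_STATES = Object.freeze({
  0: "NoData",
  2: "Unstable",
  3: "Disconnected",
  4: "Alive",
  5: "Shutdown",
  6: "Reconnected",
  7: "Interrupted",
  8: "Aged Out",
});

// Hypothetical helper: translate a numeric Connection State into its name.
function describeConnectionState(value) {
  const name = CONNECTION_STATES[value];
  return name === undefined ? "Unknown (" + value + ")" : name;
}

// Hypothetical helper: flag the state this article describes as most often
// an issue (Interrupted), as opposed to an intentional orderly shutdown.
function isLikelyProblem(value) {
  return value === 7;
}
```

An alert definition would typically key on Interrupted (7) rather than Shutdown (5), since the article treats an orderly shutdown as most often intentional.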

Additional Information

Attachments

1685969865137__AgentConnectivity.txt