VMware Aria Operations for Networks does not raise the collector-down alert until 30 minutes after a collector outage.
VMware vRealize Network Insight 6.9
Aria Operations for Networks 6.10.0
Aria Operations for Networks 6.11.0
Aria Operations for Networks 6.12.0
Aria Operations for Networks 6.12.1
Aria Operations for Networks 6.13.0
Aria Operations for Networks 6.14.0
The delay is inherent to the current design of Aria Operations for Networks (AON). The following sequence of events explains the behavior:
Health Status Communication: The collector periodically sends its health status to the platform using an RPC channel.
Health Tracking: The platform records the last health update timestamp using a key-value store.
Health Check Mechanism: If no health status update is received within 30 minutes, the platform marks the collector as unhealthy.
Alert Generation: The MgmtResourceHealthChecker module uses this health check to trigger the collector-down alert in the vRNI UI.
Because the platform marks a collector as unhealthy only after 30 minutes without a health status update, the alert cannot be raised any sooner after an outage.
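The health check described above can be sketched as follows. This is a minimal illustration, not AON's actual code: the function names, the in-memory dictionary standing in for the key-value store, and the treatment of collectors that have never reported are all assumptions. Only the 30-minute threshold comes from the behavior described in this article.

```python
from datetime import datetime, timedelta

# Threshold taken from this article; every name below is hypothetical.
HEALTH_TIMEOUT = timedelta(minutes=30)

# Stand-in for the platform's key-value store: collector id -> last update time.
last_health_update = {}


def record_health_update(collector_id, now):
    """Store the time of the latest health status received from a collector."""
    last_health_update[collector_id] = now


def is_unhealthy(collector_id, now):
    """Return True once more than HEALTH_TIMEOUT has passed without an update."""
    last_seen = last_health_update.get(collector_id)
    if last_seen is None:
        # Assumption: a collector that never reported is treated as unhealthy.
        return True
    return now - last_seen > HEALTH_TIMEOUT


def check_collectors(now):
    """Return the ids of collectors that would trigger an alert at time `now`."""
    return sorted(cid for cid in last_health_update if is_unhealthy(cid, now))


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, 12, 0)
    record_health_update("collector-1", t0)  # last update before the outage
    print(check_collectors(t0 + timedelta(minutes=29)))  # no alert yet
    print(check_collectors(t0 + timedelta(minutes=31)))  # alert condition met
```

In this sketch the collector stops reporting at `t0`, yet the check at 29 minutes still returns an empty list. The collector is flagged only once the last recorded timestamp is more than 30 minutes old, which mirrors the delay described in this article.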
At this time, there is no workaround to reduce the 30-minute delay, because the interval is hard-coded into the system's health-checking mechanism. Broadcom has identified this as an area for improvement and plans to address it in a future release of vRNI.