Fault Tolerant Data Collectors - Functionality

Article ID: 257077

Products

CA Performance Management - Usage and Administration, DX NetOps

Issue/Introduction

I need to know if, after a poller comes back online, the cluster goes back to its original state. Based on my observations, the answer appears to be no.

For example, in a 3-node cluster (A, B, and C) where A and B are active and C is the "standby":

  • B goes down, so C becomes active, restoring the 2-active-node cluster.
  • However, when B comes back online, it appears to stay in standby and does not return to the active state.

This results in A and C being active, with B as the standby. Ideally, the cluster would revert to its original state (A and B active, and C standby). Is what I am observing expected behavior, meaning the cluster will not automatically return to its original state?

Environment

DX NetOps Performance Management 22.2.x and 23.3.x

Resolution

Failover Data Collectors (DCs) should not be categorized as primary, secondary, or tertiary. Instead, they should be viewed as a pool of DCs capable of polling the same devices.

Expected Behavior

In a scenario where you have three collectors (A, B, and C):

  • Initial State: A and B are active, and C is in standby.
  • Failover Event: If B goes down, C becomes active, maintaining a 2-active-node configuration.
  • Restoration: When B is restored, it transitions to the standby state, while C remains active.

This behavior is by design and reflects the system's normal functionality. Once a collector becomes active due to a failover event, it remains active even after the originally active collector is restored.
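
The sequence above can be illustrated with a small simulation. This is a minimal sketch and not the product's implementation: the `CollectorPool` class, its method names, and the `target_active` parameter are hypothetical. The rule that a restored collector fills a vacant active slot when one exists is an assumption that this article does not state.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CollectorPool:
    """Toy model of a fault-tolerant DC pool (hypothetical, for illustration only)."""
    target_active: int                                  # number of collectors that should be polling
    active: List[str] = field(default_factory=list)
    standby: List[str] = field(default_factory=list)
    down: List[str] = field(default_factory=list)

    def fail(self, name: str) -> None:
        """Mark a collector as down and promote standbys to refill the active set."""
        if name in self.active:
            self.active.remove(name)
            self.down.append(name)
            # Promote standby collectors until the active count is restored.
            while len(self.active) < self.target_active and self.standby:
                self.active.append(self.standby.pop(0))
        elif name in self.standby:
            self.standby.remove(name)
            self.down.append(name)

    def restore(self, name: str) -> None:
        """Bring a collector back online. It does not reclaim its previous role."""
        if name not in self.down:
            return
        self.down.remove(name)
        if len(self.active) < self.target_active:
            # Assumption: a vacant active slot is filled by the restored collector.
            self.active.append(name)
        else:
            # Matches the article: the restored collector joins the pool as standby.
            self.standby.append(name)


pool = CollectorPool(target_active=2, active=["A", "B"], standby=["C"])

pool.fail("B")
print(pool.active, pool.standby)   # ['A', 'C'] []

pool.restore("B")
print(pool.active, pool.standby)   # ['A', 'C'] ['B']
```

In this model no collector has a "home" role: `restore` only checks whether the pool currently needs another active member, which is why B ends up in standby while C keeps polling.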

Key Points

  • Collector Pool: All DCs should be considered as part of a collective pool rather than having fixed primary, secondary, or tertiary roles.
  • Failover and Restoration: The failover process dynamically assigns active and standby roles based on availability and ensures continuous polling capability.
  • System Behavior: It is expected and normal for the restored collector (B) to remain in standby while the failover collector (C) remains active.

Conclusion

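Failover DCs operate as a flexible pool, not as collectors with fixed roles. The cluster does not automatically return to its original active/standby assignment after a failed collector is restored, and polling continues without interruption through both the failover and the restoration.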

Additional Information

Please see the product documentation for further information.