The Highly Available (HA) NACs experienced a failover. The secondary NAC became the Master NAC but failed to connect to the execution servers. Why?
To determine whether this scenario matches what you're seeing, please see the section "Two NACs Active At Same Time" in the "Additional Information" area at the bottom of the article.
Release : 6.7.0.b124
Component : CA RELEASE AUTOMATION RELEASE OPERATIONS CENTER
Database: MSSQL 2019 with Always On Availability Group
The root cause is still unclear. Review the Resolution section for recommendations.
While investigating this issue we found that:
Diagnosing and resolving the issue are two different things. If you need to diagnose the root cause, please review and capture the data outlined in the "Diagnosing" section in the Additional Information area (below) before applying the resolution steps outlined below.
To resolve the problem:
Diagnosing
Information from each of the MSSQL Servers participating in Always On Availability Group:
Information from both NACs participating in HA setup:
Information from Load Balancer monitoring/testing the datamanagement/availability URL of the mgmt servers:
Two NACs Active At Same Time
Nolio supports Active/Passive High Availability NACs, not Active/Active.
In this specific scenario, the information from the database and the information from the NACs are equally important. Having only one will likely be insufficient because of the nature of the problem: both NACs are running as the active/master NAC. Both are needed because the primary and secondary NACs each query the database (every 1 second) to determine whether they are the master NAC. Each NAC does this by comparing the id in the master_nac table to the conf/nacNodeId value on its local NAC server. If the values are the same, it continues running as the active NAC. If they are different, it either shuts down or starts the "master application context dm", depending on whether it was the master NAC. This is how each NAC understands which role it is supposed to play in the active/passive HA setup that Nolio offers.
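The role check described above can be sketched roughly as follows. This is only an illustration of the comparison logic, not Nolio's actual implementation: the function name, the `Action` enum, and the way the two ids are passed in are all hypothetical. It also assumes one reading of the behaviour, namely that a passive NAC whose id matches the database starts the master context, and that a passive NAC whose id does not match simply stays passive.

```python
from enum import Enum


class Action(Enum):
    # Hypothetical labels for the outcomes described in the text.
    STAY_ACTIVE = "continue running as the active NAC"
    STAY_PASSIVE = "remain passive"
    STEP_DOWN = "shut down the master role"
    TAKE_OVER = "start the master application context dm"


def decide_action(db_master_id: str, local_node_id: str, currently_master: bool) -> Action:
    """Decide what a NAC should do on one polling cycle.

    db_master_id  -- id read from the master_nac table (assumed to be a string)
    local_node_id -- value read from conf/nacNodeId on this NAC (assumed to be a string)
    """
    # The database says this NAC is the master only if the ids match.
    master_per_db = db_master_id.strip() == local_node_id.strip()

    if master_per_db:
        return Action.STAY_ACTIVE if currently_master else Action.TAKE_OVER
    return Action.STEP_DOWN if currently_master else Action.STAY_PASSIVE
```

For example, `decide_action("node-a", "node-b", currently_master=True)` returns `Action.STEP_DOWN`, because the database names a different node as master. The ids shown are made up.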
If the NAC was the master and detects that its role has changed to passive, it will log these messages:
If the NAC was passive and its role has changed to master, it will log these messages:
Since both NACs should be using the same DB, there should never be a time when the NACs are out of sync about which role they are playing. Any time one NAC logs either set of messages, the other NAC should log the other set. If it does not, that indicates the NACs are getting different results from the database and that both NACs are trying to behave as the active/master NAC at the same time (which is unsupported).
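As a rough illustration of why data from both NACs is needed, the sketch below takes the master id that each NAC last read from the database and reports which NACs would consider themselves the master. The function name, the input shape, and the node ids are hypothetical and are not part of the product. With a consistent database view, exactly one NAC should be returned.

```python
def nacs_claiming_master(observed: dict[str, str]) -> list[str]:
    """Return the NACs that would treat themselves as the master.

    observed maps each NAC's local node id (its conf/nacNodeId value) to the
    master id that NAC last read from the master_nac table.
    """
    # A NAC claims the master role when the id it read equals its own id.
    return sorted(node for node, master_id in observed.items() if node == master_id)


# Both NACs read a different master id: each one believes it is the master.
split_view = {"node-a": "node-a", "node-b": "node-b"}
# Both NACs read the same master id: only node-b is the master.
consistent_view = {"node-a": "node-b", "node-b": "node-b"}

both_active = len(nacs_claiming_master(split_view)) > 1
```

If more than one NAC is returned, the NACs are seeing different values from the database, which corresponds to the unsupported situation where both act as the active/master NAC.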