Symptoms:
- There is a Tier 0 Gateway configured in Active/Standby (A/S) mode.
- Preemptive mode is enabled.
- A failover was attempted by editing the Gateway and changing the preferred edge node.
- The attempt failed and the failover did not occur.
- In /var/log/proton/nsxapi.log on the NSX-T Manager, we can see that the Gateway was updated:
2022-10-21T01:55:05.664Z INFO http-nio-127.0.0.1-7440-exec-34 LogicalRouterServiceImpl 4414 ROUTING [nsx@6876 comp="nsx-manager" level="INFO" reqId="9bdd2986-####-####-####-##########30" subcomp="manager" username="admin"] Invoking entity listener with UPDATE for LogicalRouter/eba53a2a-####-####-####-##########f7
2022-10-21T01:55:06.231Z INFO http-nio-127.0.0.1-7440-exec-34 LogicalRouterServiceImpl 4414 ROUTING [nsx@6876 comp="nsx-manager" level="INFO" reqId="9bdd2986-####-####-####-##########30" subcomp="manager" username="admin"] Persisted configuration update for logical router eba53a2a-####-####-####-##########f7 of type TIER0
- Checking /var/log/proton/nsxapi.log on each of the three NSX-T Managers, we see that two managers received the work item, but one did not:
2022-10-21T01:55:06.428Z INFO with operation UPDATE, adding work-item WorkItem{identifier=LogicalRouter/eba53a2a-####-####-####-##########f7, Timestamp{epoch=12666, address=5362095164}} for processing
2022-10-21T01:55:06.428Z INFO policyProviderTaskScheduler-1 WorkerShardManager 4567 POLICY [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] canCurrentNodeProcess : false, for worker : LogicalRouterWorker, for workItem : WorkItem{identifier=LogicalRouter/eba53a2a-####-####-####-##########f7, Timestamp{epoch=12666, address=5362095164}}
/var/log/proton/nsxapi.log
2022-10-21T01:55:06.428Z ERROR org.corfudb.runtime.collections.streaming.StreamPollingScheduler-worker-2 ResumeStreamListener 1758 SYSTEM [nsx@6876 comp="nsx-manager" errorCode="MP4" level="ERROR" subcomp="manager"] Failed to re-subscribe [tag:worker_framework] nsx$[null]. Listener is NOT SUBSCRIBED yet! lastProcessedTs:epoch: 746
/var/log/syslog
2022-10-21T01:55:06.428Z nsx-mgt-03 NSX 1758 SYSTEM [nsx@6876 comp="nsx-manager" errorCode="MP4" level="ERROR" subcomp="manager"] Failed to re-subscribe [tag:worker_framework] nsx$[null]. Listener is NOT SUBSCRIBED yet! lastProcessedTs:epoch: 746#012sequence: 1086122231#012, retry 1/20
Note: The above is a sample entry from the two managers that received the work item.
Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
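Note: To compare the three managers more quickly, the log signatures shown above can be searched for with a small script. The following is only a minimal sketch, not a supported tool. It assumes the message text matches the sample excerpts, and the names `PATTERNS`, `classify_lines`, and `scan_file` are hypothetical helpers introduced for illustration.

```python
import re

# Signatures taken from the sample log excerpts above. The exact wording
# may differ between NSX-T versions; adjust the patterns if nothing matches.
PATTERNS = {
    "gateway_update": re.compile(r"Invoking entity listener with UPDATE for LogicalRouter/"),
    "work_item_added": re.compile(r"adding work-item WorkItem\{identifier=LogicalRouter/"),
    "cannot_process": re.compile(r"canCurrentNodeProcess : false"),
    "resubscribe_failed": re.compile(r"Failed to re-subscribe \[tag:worker_framework\]"),
}


def classify_lines(lines):
    """Group log lines by which of the known signatures they contain."""
    hits = {name: [] for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits[name].append(line.rstrip("\n"))
    return hits


def scan_file(path):
    """Scan one log file, e.g. a copy of /var/log/proton/nsxapi.log."""
    with open(path, errors="replace") as handle:
        return classify_lines(handle)
```

For example, calling `scan_file("/var/log/proton/nsxapi.log")` on each manager (or on a copy of each manager's log) and comparing the number of entries per key indicates which manager logged the work item and which logged the re-subscribe error.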