If both the old edge node and the new edge node are manually placed into maintenance mode before performing the 'Replace Edge Cluster Member' action, a network outage can result.
When these nodes are manually taken out of maintenance mode, all Service Routers (SRs) may unexpectedly become active on the old edge node, because that node is still a member of the edge cluster.
The /var/log/syslog.log on the old standby node confirms that the SRs transition to an active state immediately after the node exits maintenance mode:
<DATE_TIME> <HOSTNAME> NSX #### - [nsx@#### comp="nsx-edge" subcomp="node-mgmt" username="root" level="INFO"] Updating maintenance mode to False
<DATE_TIME> <HOSTNAME> NSX # ROUTING [nsx@#### comp="nsx-edge" subcomp="rcpm" s2comp="rcpm-db" level="INFO"] EdgeClusterConfig Message:
<DATE_TIME> <HOSTNAME> NSX # FABRIC [nsx@6876 comp="nsx-edge" subcomp="nsxa" s2comp="svcrt-fsm" level="INFO"] <UUID> transit from state Down to Standby event Node Up
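To check a larger syslog for the same pattern, the search can be scripted. The following is a minimal sketch, not an NSX tool: it assumes the log lines follow the format of the excerpt above, and the function name and regular expression are illustrative only. It returns every SR state transition logged after the "Updating maintenance mode to False" message.

```python
import re

# Marker text taken from the log excerpt above.
MAINT_EXIT = "Updating maintenance mode to False"

# Assumed message shape: "<UUID> transit from state <old> to <new> event <event>".
TRANSITION = re.compile(r"(\S+) transit from state (\w+) to (\w+) event (.+?)\s*$")


def sr_transitions_after_maintenance_exit(lines):
    """Return (sr_id, old_state, new_state, event) tuples that appear
    after the node leaves maintenance mode (hypothetical helper)."""
    seen_exit = False
    results = []
    for line in lines:
        if MAINT_EXIT in line:
            # From here on, state transitions are of interest.
            seen_exit = True
            continue
        if not seen_exit:
            continue
        match = TRANSITION.search(line)
        if match:
            results.append(match.groups())
    return results
```

For example, on the old edge node the function could be called with the open log file: `sr_transitions_after_maintenance_exit(open("/var/log/syslog.log", errors="replace"))`. Transitions logged before the maintenance mode exit are ignored.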
VMware NSX
Place only the old edge node into maintenance mode, as described in the official documentation below.