VMs configured in High Availability setup lose connectivity after failback of primary node.

Article ID: 400000

Products

VMware NSX
VMware vSphere ESXi

Issue/Introduction

  • There are two VMs configured in a High Availability (HA) setup (active-standby), connected to NSX overlay segments, with a virtual IP (VIP) assigned.
  • In this case, the issue was observed for DRA VMs.
  • During failover, when the primary VM is rebooted or shut down, the VIP is successfully taken over by the secondary VM, ensuring continued accessibility to the VIP.
  • However, the issue arises during failback, when the primary VM comes back online.
  • Although the VIP is reassigned to the primary VM, it becomes unresponsive after the transition.
  • Connectivity to the VIP is restored only after a delay of ~10 minutes.
  • The IP Discovery profile configuration required for VMs in an HA setup has been validated. Reference: Configure and apply NSX-T Segment IP discovery profile when using high availability (HA) for Virtual Machines.
  • Trust on First Use (TOFU) has been disabled and the ARP binding limit has been increased to 2 to accommodate the VIP-MAC address binding (a sketch of the equivalent Policy API change appears at the end of this section).
  • Even with this configuration, connectivity is only restored after an interval of ~10 minutes.
  • When both VMs are in the powered-on state:
    • The ARP table on the logical switch (on the ESXi host) reports the VIP bound to the MAC address of the primary VM, as expected.

ESXi> get logical-switch <VNI> arp-table
             Logical Switch ARP Table
--------------------------------------------------

                    LCP Entry
==================================================
        IP                 MAC

    10.xxx.xxx.10      00:50:56:##:##:4f (primary VM IP and MAC address)
    10.xxx.xxx.20      00:50:56:##:##:54 (secondary VM IP and MAC address)
    10.xxx.xxx.30      00:50:56:##:##:4f (VIP bound to the primary VM MAC)

  • When the primary VM is shut down and the secondary VM takes over the VIP:
    • The ARP table on the logical switch (on the ESXi host) is updated with the latest IP-MAC address binding and reports the VIP bound to the MAC address of the secondary VM.

ESXi> get logical-switch <VNI> arp-table
             Logical Switch ARP Table
--------------------------------------------------

                    LCP Entry
==================================================
        IP                 MAC

    10.xxx.xxx.20      00:50:56:##:##:54
    10.xxx.xxx.30      00:50:56:##:##:54 (VIP bound to the secondary VM MAC)

  • The primary VM is back in the powered-on state:
    • The VIP is assigned back to the primary VM, which has been validated from inside the guest OS.
    • After the primary VM is back online, traffic does not return to its vNIC.
    • After an interval of ~10 minutes (the ARP binding timeout), an ARP refresh is triggered by the VDR port to update the entries, and traffic then resumes on the primary VM as expected.
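
For reference, the TOFU and ARP binding limit settings described above live in the segment's IP Discovery profile. The following is a minimal sketch of that change via the NSX Policy API, not a verbatim configuration: the manager address, credentials, and profile ID are placeholders, and the arp_nd_binding_timeout shown is the default of 10 minutes, which lines up with the observed recovery delay.

# Placeholders: <nsx-manager>, <password>, and the profile ID "ha-ip-discovery-profile"
curl -k -u 'admin:<password>' -X PATCH \
  'https://<nsx-manager>/policy/api/v1/infra/ip-discovery-profiles/ha-ip-discovery-profile' \
  -H 'Content-Type: application/json' \
  -d '{
        "tofu_enabled": false,
        "arp_nd_binding_timeout": 10,
        "ip_v4_discovery_options": {
          "arp_snooping_config": {
            "arp_snooping_enabled": true,
            "arp_binding_limit": 2
          }
        }
      }'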

Environment

VMware NSX
VMware vSphere ESXi

Cause

  • The local logical-switch ARP table on the ESXi host does not reflect the primary VM’s MAC address for the VIP. It still shows the VIP bound to the secondary VM’s MAC, even after the failback of the VIP has completed.

ESXi> get logical-switch <VNI> arp-table
             Logical Switch ARP Table
--------------------------------------------------

                    LCP Entry
==================================================
        IP                 MAC

    10.xxx.xxx.10      00:50:56:##:##:4f
    10.xxx.xxx.20      00:50:56:##:##:54
    10.xxx.xxx.30      00:50:56:##:##:54 (VIP still bound to the secondary VM MAC)

  • GARP (gratuitous ARP) packets are required to update the ARP tables across the infrastructure so that traffic is routed to the correct ports after a failover.
  • To further investigate the delay in the ARP table update, packet captures on the VDR port help isolate the issue.
  • The following behavior was observed after performing packet captures with:

pktcap-uw --switchport <vDR_port_ID> --dir 2 --ng -o /path_to_datastore/filename.pcapng
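
To isolate the GARPs in the resulting capture, the file can be copied off the host and filtered on the gratuitous-ARP flag. This is a sketch assuming a workstation with tshark and the capture file produced by the command above:

# List gratuitous ARP frames with timestamp, sender MAC, and sender IP
tshark -r filename.pcapng -Y "arp.isgratuitous == 1" \
  -T fields -e frame.time -e arp.src.hw_mac -e arp.src.proto_ipv4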

  • When the primary VM went down, the secondary VM sent GARP packets updating the MAC address on the VDR port. This updated the VIP-MAC address binding in the table.

  • When the primary VM came back on the network, it sent a GARP updating the VIP-MAC address binding.

  • This should have updated the ARP table entry and moved the traffic back to the primary VM. However, immediately after this GARP was sent by the primary VM, another GARP was sent by the secondary VM. This reverted the ARP binding to the secondary VM, even though the primary was back on the network.

  • The GARP packet sent by the secondary VM overrode the previous update, causing the ARP table to incorrectly direct traffic to the secondary VM, as validated from the above command output.
  • After the ARP refresh, the correct MAC was advertised again by the primary VM, directing traffic back to the primary VM.
  • This indicates an issue at the guest/application layer of the VMs, where the secondary VM continues to assert ownership of the VIP even after the primary has resumed, delaying traffic restoration and causing connectivity loss after failback.

Resolution

Please reach out to the Guest OS/application team to troubleshoot why the GARP packets were sent by the secondary VM post-failback of the primary VM, and how to prevent the secondary VM from sending the GARP that overwrites the VIP-MAC address binding.
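
As an interim validation step (not a fix), a gratuitous ARP can be sent manually from inside the primary VM's guest OS to confirm that the logical-switch ARP table picks up the correct binding once the primary reasserts the VIP. This is a sketch assuming a Linux guest with iputils arping; the interface name and VIP are placeholders:

# Send 3 unsolicited (gratuitous) ARP replies for the VIP; eth0 is a placeholder
arping -U -c 3 -I eth0 10.xxx.xxx.30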

Additional Information