VRRP between WatchGuard VMs in the NSX environment - connectivity issues when standby VRRP VM is turned ON



Article ID: 398176


Updated On:

Products

VMware NSX

Issue/Introduction

- Two WatchGuard VMs run VRRP in Active/Standby mode on both the internal VRRP network (NSX segment) and the WAN network (NSX segment). In this scenario, VRRP uses unicast to negotiate the Active/Standby status.

- With only the Active WatchGuard VM powered on, internet connectivity works. As soon as the Standby WatchGuard VM is powered on, internet traffic stops.

- A traceroute was run in both the working and the non-working state. In the working state, the traceroute completes.

- In the non-working state, the traceroute fails at the NSX segment.

- Next, a continuous ping was run from the source VM to a public DNS IP on the internet. Packets were captured on the ESXi host at the source VM's vNIC (VnicTx and VnicRx), on the Edge VMs hosting the Tier-0, and on the uplinks connected to the physical environment.

- The captures show the ping traffic (ICMP requests) going out correctly: Source VM --> Active WatchGuard VM --> Tier-0 (Edge cluster) --> BGP to ToR. The ICMP replies come back over BGP to the other Edge in the Edge cluster, which sends them over the TEP tunnel to a different ESXi host where the Standby WatchGuard VM runs. There the traffic is delivered to the Standby WatchGuard VM, which drops the packets because it is in Standby state.

- The VTEP table on one Edge shows the VRRP floating IP and virtual MAC pointing to Host 1, while the VTEP table on the second Edge shows the same VRRP floating IP and virtual MAC pointing to Host 2.
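
The VTEP table comparison described above can be sketched in Python. This is only an illustration: it assumes the MAC-to-TEP entries from each Edge have already been copied into dictionaries by hand, and the Edge names, MAC addresses, and TEP IPs below are hypothetical placeholders, not values from this case.

```python
from collections import defaultdict


def find_conflicting_macs(tables):
    """Return MACs that different Edges have learned behind different TEPs.

    tables: {edge_name: {mac: tep_ip}} (hypothetical, manually collected data)
    result: {mac: {tep_ip: [edge_name, ...]}} for MACs seen behind >1 TEP
    """
    seen = defaultdict(lambda: defaultdict(list))
    for edge, table in tables.items():
        for mac, tep in table.items():
            # Normalize case so the same MAC always compares equal.
            seen[mac.lower()][tep].append(edge)
    return {
        mac: {tep: sorted(edges) for tep, edges in teps.items()}
        for mac, teps in seen.items()
        if len(teps) > 1
    }


# Hypothetical entries: the virtual MAC is learned behind two different hosts.
tables = {
    "edge-1": {"00:00:5e:00:01:0a": "192.0.2.11", "00:50:56:aa:bb:01": "192.0.2.11"},
    "edge-2": {"00:00:5E:00:01:0A": "192.0.2.12", "00:50:56:aa:bb:01": "192.0.2.11"},
}

conflicts = find_conflicting_macs(tables)
for mac, teps in conflicts.items():
    print(f"{mac} is learned behind multiple TEPs: {teps}")
```

A MAC reported here matches the symptom in this article: return traffic may be tunneled to whichever host the receiving Edge has learned, including the host running the Standby VM.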

Environment

VMware NSX

Cause

- When VRRP is set up, the floating MAC and IP should be used only by the Active WatchGuard VM's WAN interface. Here, ARP is also sent by the Standby WatchGuard VM, which creates a duplicate MAC/IP situation. Traffic that reaches the Standby WatchGuard VM is dropped by it.

- A capture at the vNIC of the Standby WatchGuard VM on the ESXi host shows ARP packets from the Standby VM carrying the same floating IP and MAC. These duplicates cause the internet connectivity issue.
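
The check performed on the Standby VM's capture can be sketched as follows. This is a minimal example, assuming raw untagged Ethernet frames have already been extracted from the capture file (reading the pcap itself is not shown). The floating IP and virtual MAC are hypothetical placeholders, and the helper names are illustrative only.

```python
import socket
import struct

ETHERTYPE_ARP = 0x0806


def fmt_mac(raw: bytes) -> str:
    """Format 6 raw bytes as a lowercase colon-separated MAC string."""
    return ":".join(f"{b:02x}" for b in raw)


def parse_arp(frame: bytes):
    """Return the ARP fields of an untagged Ethernet frame, or None if not IPv4 ARP."""
    if len(frame) < 42:  # 14-byte Ethernet header + 28-byte ARP payload
        return None
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_ARP:
        return None
    htype, ptype, hlen, plen, oper = struct.unpack("!HHBBH", frame[14:22])
    if (htype, ptype, hlen, plen) != (1, 0x0800, 6, 4):
        return None
    return {
        "eth_src": fmt_mac(frame[6:12]),
        "op": oper,  # 1 = request, 2 = reply
        "sender_mac": fmt_mac(frame[22:28]),
        "sender_ip": socket.inet_ntoa(frame[28:32]),
        "target_ip": socket.inet_ntoa(frame[38:42]),
    }


def claims_virtual_address(arp, floating_ip, virtual_mac):
    """True if the ARP sender fields advertise the floating IP with the virtual MAC."""
    return arp["sender_ip"] == floating_ip and arp["sender_mac"] == virtual_mac.lower()


# Hypothetical values; substitute the ones configured for the VRRP group.
FLOATING_IP = "203.0.113.10"
VIRTUAL_MAC = "00:00:5e:00:01:0a"

# Build a sample gratuitous ARP request such as the Standby VM might emit.
vmac = bytes.fromhex(VIRTUAL_MAC.replace(":", ""))
ip = socket.inet_aton(FLOATING_IP)
sample = (
    b"\xff" * 6 + vmac + struct.pack("!H", ETHERTYPE_ARP)
    + struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    + vmac + ip + b"\x00" * 6 + ip
)

arp = parse_arp(sample)
print(arp, claims_virtual_address(arp, FLOATING_IP, VIRTUAL_MAC))
```

Any frame from the Standby VM's vNIC capture for which `claims_virtual_address` returns true corresponds to the duplicate advertisement described above.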

Resolution

Contact WatchGuard (third-party) support to have this behavior fixed, so that the Standby WatchGuard VM does not send ARP with the same virtual MAC and floating IP.