VMware NSX Distributed Firewall blocking traffic after changing the IP discovery profile

Article ID: 372146

Updated On:

Products

VMware NSX

Issue/Introduction

  • Multiple IPs are assigned to a single vNIC (for example, on a load balancer). The vNIC belongs to a tag-based dynamic group whose IP membership is populated through IP discovery and which is used by the Distributed Firewall.
  • A traffic outage is observed after the IP discovery profile is changed from one profile with TOFU (Trust On First Use) enabled and a binding limit greater than 1 to another profile that also has TOFU enabled and a binding limit greater than 1.
  • The impacted IPs are no longer present in the group used by the Distributed Firewall.
  • The impacted IPs appear in the discovered bindings list but not in the realized bindings list.
  • Any IP associated with a vNIC that is continuously passing traffic does not recover from the issue.
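
The comparison in the fourth symptom above can be sketched as a simple set difference. This is a minimal illustration only: it assumes the discovered and realized bindings have already been exported as plain lists of IP strings, and the function name `missing_from_realized` is hypothetical, not part of any NSX tooling.

```python
import ipaddress


def missing_from_realized(discovered, realized):
    """Return discovered IPs that are absent from the realized bindings.

    Both arguments are plain lists of IP address strings (an assumed export
    format). The result is sorted by IP version, then numerically.
    """
    realized_set = set(realized)
    missing = set(discovered) - realized_set

    def sort_key(ip):
        addr = ipaddress.ip_address(ip)
        return (addr.version, int(addr))

    return sorted(missing, key=sort_key)


# Example data (made up for illustration).
discovered = ["10.0.0.12", "10.0.0.10", "10.0.0.11"]
realized = ["10.0.0.12"]
print(missing_from_realized(discovered, realized))
```

Any IP returned by this check is a candidate for having dropped out of the group used by the Distributed Firewall.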

Cause

When the IP discovery profile is changed, NSX first applies and realizes the default profile (TOFU enabled, binding limit of 1) before it applies the new profile. During this brief realization of the default profile, the binding limit of 1 may cause all bindings discovered before the change, except the most recently realized binding, to be removed from the realized bindings list.
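
The effect of this two-step workflow can be illustrated with a toy model. It is a sketch of the behavior described above, not NSX's actual implementation; the function names and the oldest-first list representation are assumptions made for the example.

```python
def apply_binding_limit(realized, limit):
    """Keep only the `limit` most recently realized bindings.

    `realized` is ordered oldest-first (an assumption of this toy model).
    """
    if limit < 1:
        raise ValueError("binding limit must be at least 1")
    return realized[-limit:]


def change_profile(realized, new_limit, default_limit=1):
    """Model a profile change that briefly realizes the default profile."""
    # Step 1: the default profile (binding limit of 1) is realized first.
    after_default = apply_binding_limit(realized, default_limit)
    # Step 2: the new profile is applied. A higher limit does not bring
    # back bindings that were already removed in step 1.
    return apply_binding_limit(after_default, new_limit)


before = ["10.0.0.10", "10.0.0.11", "10.0.0.12"]
after = change_profile(before, new_limit=128)
print(after)
```

In this model only the latest binding survives the change, even though the new profile allows far more than one binding.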

Impacted IPs that continuously pass traffic never reach the binding timeout, because the timeout only starts counting when the vNIC is idle. As a result, these IPs do not recover on their own.
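
A toy idle-timer model shows why continuous traffic prevents the timeout from ever being reached. This is an illustrative sketch under assumed semantics (the timer restarts on every traffic event, times are in minutes); the function name `binding_expiry` is hypothetical.

```python
def binding_expiry(traffic_times, timeout, observe_until):
    """Return the minute at which the binding would time out, or None.

    The idle timer restarts at every traffic event. None means the timeout
    is never reached within the observation window.
    """
    last_seen = 0
    for t in sorted(traffic_times):
        if t > observe_until:
            break
        if t - last_seen >= timeout:
            # The vNIC was idle long enough before this event.
            return last_seen + timeout
        last_seen = t
    expiry = last_seen + timeout
    return expiry if expiry <= observe_until else None


# Traffic every minute for an hour: the timer keeps restarting.
busy = binding_expiry(range(0, 61), timeout=10, observe_until=60)
# Traffic only at minutes 0 and 5: the timer runs out at minute 15.
idle = binding_expiry([0, 5], timeout=10, observe_until=60)
print(busy, idle)
```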

Resolution

This is a known issue affecting VMware NSX. There is currently no resolution.

Workarounds

  1. After changing the IP discovery profile, briefly disable TOFU on the new profile in the GUI to force realization of the bindings. A possible downside is that bindings that are currently inactive are cleared and require a fresh ARP to re-bind the IPs.
  2. Configure the new profile with TOFU disabled, a binding limit of 256, and a binding timeout above 20 minutes. A possible downside is increased group churn, as IPs time out and are re-added depending on traffic patterns.
  3. Configure IP sets instead of relying on IP discovery.
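
The settings in workaround 2 can be captured as a small pre-change check. This is a sketch only: the `DiscoveryProfileSettings` class and its field names are hypothetical and do not reflect the NSX API schema; the values would need to be filled in from wherever the profile is actually defined.

```python
from dataclasses import dataclass


@dataclass
class DiscoveryProfileSettings:
    """Hypothetical container for the three settings named in workaround 2."""
    tofu_enabled: bool
    binding_limit: int
    binding_timeout_min: int


def workaround2_problems(profile):
    """Return a list of deviations from the workaround 2 settings."""
    problems = []
    if profile.tofu_enabled:
        problems.append("TOFU should be disabled")
    if profile.binding_limit != 256:
        problems.append("binding limit should be 256")
    if profile.binding_timeout_min <= 20:
        problems.append("binding timeout should be above 20 minutes")
    return problems


candidate = DiscoveryProfileSettings(
    tofu_enabled=False, binding_limit=256, binding_timeout_min=30
)
print(workaround2_problems(candidate))
```

An empty list means the candidate settings match workaround 2 as written.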