NSX Distributed and Gateway Firewalls drop VM traffic during vMotion or Storage vMotion when IP discovery is through VM Tools

Article ID: 319137

Updated On:

Products

VMware NSX

Issue/Introduction

  • During vMotion or Storage vMotion of a Virtual Machine, its IP address is incorrectly dropped from the Firewall address set (addrset). Flows then no longer match their specific rules and may hit the default rule instead.
  • The VM's IP address was changed at some point.
  • Firewall rule source or destination is defined by Group.
  • Group Membership is not defined by explicit IP address; it is defined by another criterion (e.g. Virtual Machine or Segment).

  • To verify this behavior on the Gateway Firewall, run the commands below on the Edge hosting the relevant T0 or T1 SR:
    • To obtain the UUID of the T0 or T1 uplink SR interface, run:
      get logical-routers
      vrf <SR VRF#>
      get interfaces
      exit    (To leave VRF and return to Edge shell)
    • To identify the relevant addrset for the expected Allow rule, run:
      get firewall <uplink interface UUID> ruleset rules
       
    • Check addrset membership before, during, and after Storage vMotion:
      get firewall <uplink interface UUID> addrset name <addrset>
  • Depending on which issue is hit, the IP of the VM being relocated is dropped from the address set either for the full duration of Storage vMotion, or for around 15 seconds during vMotion or Storage vMotion.
    • The IP Discovery profile on the Segment likely has the default ARP Binding Limit of 1, and the IP of the affected interface has changed at some point.
    • In the Manager view of the UI, the port under Networking > Logical Switches > Ports > Address Bindings > Realized Bindings shows the affected IP with a Discovery Type of VM_TOOLS. If this window is refreshed continually, the IP address learned by VM_TOOLS disappears from Realized Bindings during the outage.
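
The before/during/after check above can be turned into a measured outage window. The following is a minimal sketch, not an NSX tool: it assumes the address set members have already been recorded by hand at known times (for example, by re-running the addrset command and noting the output), and the function name, snapshot format, and sample IP are hypothetical.

```python
from typing import Iterable, List, Optional, Set, Tuple

Snapshot = Tuple[float, Set[str]]  # (timestamp in seconds, IPs seen in the addrset)


def missing_windows(snapshots: Iterable[Snapshot], ip: str) -> List[Tuple[float, Optional[float]]]:
    """Return the intervals during which `ip` was absent from the address set.

    Each interval is (first snapshot where the IP was missing,
    first later snapshot where it was present again). The end is None
    if the IP was still missing in the last snapshot.
    Snapshots are assumed to be sorted by timestamp.
    """
    windows: List[Tuple[float, Optional[float]]] = []
    start: Optional[float] = None
    for timestamp, members in snapshots:
        if ip not in members:
            if start is None:
                start = timestamp  # IP just disappeared
        elif start is not None:
            windows.append((start, timestamp))  # IP came back
            start = None
    if start is not None:
        windows.append((start, None))  # never came back within the capture
    return windows


# Hypothetical capture: the VM's IP is missing at t=5 and t=10, back at t=20.
capture = [
    (0.0, {"192.0.2.10"}),
    (5.0, set()),
    (10.0, set()),
    (20.0, {"192.0.2.10"}),
]
print(missing_windows(capture, "192.0.2.10"))
```

The resolution of the result is limited by how often the snapshots were taken, so a short drop can be missed entirely if the polling interval is longer than the drop.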

Environment

VMware NSX 4.x
VMware NSX-T Data Center 3.x

Cause

For the outage that lasts the full duration of Storage vMotion: the VM Tools status changes to stopped during Storage vMotion. IP addresses discovered by VM Tools are then dropped from the Group's Effective Members and from the Firewall address set, so traffic no longer matches the expected rule.

Resolution

The issue with an outage for the full duration of Storage vMotion is resolved in VMware NSX 4.1.0. 
The ~15-30 second outage during vMotion is resolved in VMware NSX 4.1.1, which adds vMotion support for VM Tools-based IP discovery. The ESXi version must be at least 8.0 GA to use this feature.
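
The version requirements above can be expressed as a small check. This is an illustrative sketch only: the function names and result keys are hypothetical, and versions are assumed to be plain dotted numeric strings such as "4.1.1" or "8.0" (build numbers and suffixes like "GA" are not handled).

```python
from typing import Dict, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '4.1' into (4, 1, 0) so that short and long forms compare equally."""
    parts = [int(p) for p in text.split(".")]
    while len(parts) < 3:
        parts.append(0)  # pad to at least major.minor.patch
    return tuple(parts)


def resolved_issues(nsx_version: str, esxi_version: str) -> Dict[str, bool]:
    """Report which of the two outages described in this article are fixed."""
    nsx = parse_version(nsx_version)
    esxi = parse_version(esxi_version)
    return {
        # Outage for the full duration of Storage vMotion: fixed in NSX 4.1.0.
        "storage_vmotion_full_duration_outage": nsx >= (4, 1, 0),
        # Short outage during vMotion: fixed in NSX 4.1.1, requires ESXi 8.0 or later.
        "vmotion_short_outage": nsx >= (4, 1, 1) and esxi >= (8, 0, 0),
    }


print(resolved_issues("4.1.0", "7.0.3"))
```

With NSX 4.1.0 on ESXi 7.0.3, only the full-duration Storage vMotion outage is reported as resolved; the short outage additionally needs NSX 4.1.1 and ESXi 8.0.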

 

To remove stale entries from the binding list before vMotion, follow the steps below:

  • Navigate to the VM’s Segment:

    • Go to the segment where the VM is connected.
    • Click on the Segment Profiles tab and confirm that it is using the default-ip-discovery-profile (or a TOFU-enabled custom profile).
  • View Interfaces:

    • Click on the "1" in blue under Ports/Interfaces.
    • This displays the current interfaces associated with the selected VM (filter by VM name if there are several ports).
  • View Address Bindings:

    • Click on the ">" icon to expand the Address Bindings section.
  • Review Realized Bindings:

    • Click on Realized Bindings to see a list of all discovered IP addresses.
    • Locate the old IP address that needs removal.
  • Move the Old IP to Ignored Bindings:

    • Select the old IP address and click Copy to Ignore Bindings.
    • This action moves the old IP from Realized Bindings to Ignored Bindings, so only the new IP will be actively associated with the VM in Realized Bindings.
  • Permanently Remove the Ignored Binding:

    • Click on the "1" under Ignored Bindings and select Edit.
    • Choose Delete to permanently remove the old IP from the Ignored Bindings list.
    • Now, only the new IP address remains associated with the NSGroup, and the old IP is fully removed.
  • Perform desired vMotion or Storage vMotion
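
When a port has many Realized Bindings, the "locate the old IP address" step above can be sketched as a simple comparison. This is an assumption-laden illustration rather than an NSX API call: the binding list is assumed to have been copied out of the UI by hand, and the dictionary keys, function name, and sample addresses are hypothetical.

```python
from typing import Dict, Iterable, List


def stale_bindings(realized: Iterable[Dict[str, str]], current_ips: Iterable[str]) -> List[Dict[str, str]]:
    """Return the realized bindings whose IP is no longer configured on the VM.

    These are the candidates to move to Ignored Bindings and then delete.
    """
    current = set(current_ips)
    return [binding for binding in realized if binding["ip"] not in current]


# Hypothetical data: the VM was re-addressed from 192.0.2.10 to 192.0.2.20.
realized = [
    {"ip": "192.0.2.10", "discovery_type": "VM_TOOLS"},
    {"ip": "192.0.2.20", "discovery_type": "VM_TOOLS"},
]
print(stale_bindings(realized, ["192.0.2.20"]))
```

The list of current IPs should come from the guest itself, since the bindings shown in the UI are exactly what is being verified.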

Workaround:
Rely on DHCP snooping or VM Tools-based IP discovery. Both are enabled by default, but they assume that the VM obtains its IP address from DHCP and that VM Tools is installed in the VM.

Additional Information

Impact/Risks:
Outage of the VM's traffic for the duration of Storage vMotion, or for around 15 seconds during vMotion or Storage vMotion.