DFW rule(s) not working as expected after VM Hardware change or vMotion
Article ID: 326334
Products
VMware NSX
Issue/Introduction
Symptoms:
DFW rule(s) not working as expected after live VM Hardware change (vCPU hot add for example) or vMotion.
ESXi host logs (vmkernel.log) display message(s) similar to:
2019-11-06T15:15:57.970Z cpu46:66391)Importing nic-12345-eth0-vmware-sfw.2, Version 500
2019-11-06T15:15:57.972Z cpu46:66391)Importing succeeded
Note: filter versions 5 and 6 are also impacted.
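A log collected from the host can be scanned for this message. The following is a minimal sketch, assuming the message always has the form "Importing <filter>, Version <n>" as in the sample above; the function name and the regular expression are illustrative and not part of any NSX tooling.

```python
import re

# Filter versions the article lists as affected.
AFFECTED_VERSIONS = {5, 6, 500}

# Assumed message format, taken from the vmkernel.log sample in this article.
IMPORT_RE = re.compile(r"Importing (\S+), Version (\d+)")


def find_affected_imports(log_text):
    """Return (filter_name, version) pairs for imports that used an affected version."""
    hits = []
    for match in IMPORT_RE.finditer(log_text):
        name, version = match.group(1), int(match.group(2))
        if version in AFFECTED_VERSIONS:
            hits.append((name, version))
    return hits


sample = (
    "2019-11-06T15:15:57.970Z cpu46:66391)Importing nic-12345-eth0-vmware-sfw.2, Version 500\n"
    "2019-11-06T15:15:57.972Z cpu46:66391)Importing succeeded\n"
)
print(find_affected_imports(sample))  # [('nic-12345-eth0-vmware-sfw.2', 500)]
```

An empty result means no import with an affected version was logged in the text that was scanned; it does not prove that the host is unaffected.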
An affected DFW rule lists more than 7 ports as source or destination.
Example:
- Rule 1010 lists 13 ports and is affected by the issue:
#vsipioctl getrules -f nic-12345-eth0-vmware-sfw.2 | grep "rule 1010"
rule 1010 at 8 inout protocol tcp from addrset addrset-10 to addrset addrset-20 port {1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1012, 1013} accept;
- Rule 1011 covers many ports but lists only 3, so it is not affected by the issue:
#vsipioctl getrules -f nic-12345-eth0-vmware-sfw.2 | grep "rule 1011"
rule 1011 at 8 inout protocol tcp from addrset addrset-10 to addrset addrset-20 port {1001-1013, 2001} accept;
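To check many rules at once, the getrules output can be saved to a file and screened offline. The sketch below assumes each rule is printed on one line in the format shown above, with a single port set per rule. It counts a range such as 1001-1013 as two listed values, which is an inference from the article counting {1001-1013, 2001} as 3 listed ports. All names are hypothetical.

```python
import re

MAX_LISTED = 7  # per this article, more than 7 listed ports puts a rule at risk

# Assumed line format, taken from the getrules samples in this article.
RULE_RE = re.compile(r"rule (\d+) at \d+ .*?port \{([^}]*)\}")


def count_listed(port_set):
    """Count listed values; a range counts as two (inferred from the article)."""
    total = 0
    for entry in port_set.split(","):
        entry = entry.strip()
        if entry:
            total += 2 if "-" in entry else 1
    return total


def rules_at_risk(getrules_output):
    """Return {rule_id: listed_count} for rules listing more than MAX_LISTED values."""
    flagged = {}
    for rule_id, port_set in RULE_RE.findall(getrules_output):
        listed = count_listed(port_set)
        if listed > MAX_LISTED:
            flagged[int(rule_id)] = listed
    return flagged


sample = (
    "rule 1010 at 8 inout protocol tcp from addrset addrset-10 to addrset addrset-20 "
    "port {1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1012, 1013} accept;\n"
    "rule 1011 at 8 inout protocol tcp from addrset addrset-10 to addrset addrset-20 "
    "port {1001-1013, 2001} accept;\n"
)
print(rules_at_risk(sample))  # {1010: 13}
```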
Environment
VMware NSX-T Data Center
Cause
The issue starts after a filter export/import operation is triggered. An export/import operation is triggered when a VM is vMotioned or when a live hardware change is performed (vCPU hot add, for example). Prior to NSX-T 2.5, some DFW filter versions have an issue where the export/import operation carries over only the first 7 ports of the rule, so the rule matches only those ports afterwards.
As an example:
Rule prior to the export/import:
#vsipioctl getrules -f nic-12345-eth0-vmware-sfw.2 | grep "rule 1010"
rule 1010 at 8 inout protocol tcp from addrset addrset-10 to addrset addrset-20 port {1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1012, 1013} accept;
And after the export/import:
#vsipioctl getrules -f nic-12345-eth0-vmware-sfw.2 | grep "rule 1010"
rule 1010 at 8 inout protocol tcp from addrset addrset-10 to addrset addrset-20 port {1001, 1002, 1003, 1004, 1005, 1006, 1007} accept;
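If the rule output was captured before and after a vMotion or hardware change, the two lines can be compared to see which ports were lost. This is a minimal sketch, assuming the one-line rule format shown in the samples above; the helper names are hypothetical.

```python
import re

# Assumed port-set format, taken from the getrules samples in this article.
PORT_SET_RE = re.compile(r"port \{([^}]*)\}")


def listed_ports(rule_line):
    """Return the port entries of a rule line as strings, in listed order."""
    match = PORT_SET_RE.search(rule_line)
    if match is None:
        return []
    return [p.strip() for p in match.group(1).split(",") if p.strip()]


def dropped_ports(before, after):
    """Return entries present in the 'before' rule but missing from the 'after' rule."""
    remaining = set(listed_ports(after))
    return [p for p in listed_ports(before) if p not in remaining]


before = ("rule 1010 at 8 inout protocol tcp from addrset addrset-10 to addrset addrset-20 "
          "port {1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1012, 1013} accept;")
after = ("rule 1010 at 8 inout protocol tcp from addrset addrset-10 to addrset addrset-20 "
         "port {1001, 1002, 1003, 1004, 1005, 1006, 1007} accept;")
print(dropped_ports(before, after))
# ['1008', '1009', '1010', '1011', '1012', '1013']
```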
This issue occurs only when a filter export/import operation is triggered. When the filter is programmed from the NSX Manager, this issue will not occur.
Resolution
This issue is resolved in NSX-T 2.5.0.
Workaround: As a temporary workaround, modify any DFW rule. This causes the NSX Manager to push the DFW configuration to all the ESXi hosts, and the DFW filter is then programmed correctly.
As a long-term workaround, if you do not want to upgrade, the following options are available:
Option 1: redesign DFW rules
Split each affected DFW rule into multiple rules with a maximum of 7 source or destination ports each.
Example:
Initial rule:
rule 1010 at 8 inout protocol tcp from addrset addrset-10 to addrset addrset-20 port {1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1012, 1013} accept;
After splitting the DFW Service into two DFW Services, where the first service includes 7 ports and the second service includes the remaining ports:
rule 1010 at 8 inout protocol tcp from addrset addrset-10 to addrset addrset-20 port {1001, 1002, 1003, 1004, 1005, 1006, 1007} accept;
rule 1010 at 9 inout protocol tcp from addrset addrset-10 to addrset addrset-20 port {1008, 1009, 1010, 1011, 1012, 1013} accept;
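The grouping used in Option 1 can be computed for any port list. The sketch below is illustrative only: it handles individually listed ports (not ranges), and the function name is hypothetical. It produces the port groups; creating the corresponding services and rules in NSX is still a manual step.

```python
def split_ports(ports, max_per_rule=7):
    """Split a list of individual ports into consecutive groups of at most max_per_rule."""
    if max_per_rule < 1:
        raise ValueError("max_per_rule must be at least 1")
    return [ports[i:i + max_per_rule] for i in range(0, len(ports), max_per_rule)]


groups = split_ports(list(range(1001, 1014)))  # ports 1001 through 1013
for group in groups:
    print(group)
# [1001, 1002, 1003, 1004, 1005, 1006, 1007]
# [1008, 1009, 1010, 1011, 1012, 1013]
```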
Option 2: upgrade DFW filter version (only for NSX-T 2.4.x)
The issue described in this article is present in DFW filter versions 5, 6 and 500. The issue is not present in filter version 1000, introduced in NSX-T 2.4.x.
In an NSX-T 2.4.x environment that has been upgraded from NSX-T 2.3.x, VMs may still use the affected filter versions. To move the VMs to filter version 1000, the following options are available:
1. Change the DFW filter version using the following commands:
a. Find the DFW filter for the VM:
#summarize-dvfilter
Example:
world 611217 vmm0:MY_TEST_VM vcUuid:'50 08 ef 5b 94 e7 89 96-89 35 3f 66 94 66 7f 28'
port 50331664 UPSA - VM C.eth0
vNic slot 2
name:nic-12345-eth0-vmware-sfw.2 <<-- this is the DFW filter
(output omitted)
b. Change the DFW filter version to 1000:
#vsipioctl setexportversion -f {filtername} -e 1000
2. Power off and power on the VMs that use a filter version earlier than 1000.
3. Add the VMs that use a filter version earlier than 1000 to the DFW exclusion list, then remove them from it.
In the NSX-T UI go to Advanced Network & Security > Security > Distributed Firewall > Exclusion List
Note: this workaround is disruptive as flows may be dropped when the VM is added back to the DFW exclusion list.
4. Disable and re-enable DFW globally.
In the NSX-T UI go to Advanced Network & Security > Security > Distributed Firewall > Settings
Note: this workaround is disruptive as flows may be dropped when DFW is re-enabled.
To verify the DFW filter version, follow the steps below:
1. Find the DFW filter for the VM:
#summarize-dvfilter
Example:
world 611217 vmm0:MY_TEST_VM vcUuid:'50 08 ef 5b 94 e7 89 96-89 35 3f 66 94 66 7f 28'
port 50331664 UPSA - VM C.eth0
vNic slot 2
name:nic-12345-eth0-vmware-sfw.2 <<-- this is the DFW filter
agentName: vmware-sfw
state: IOChain Attached
vmState: Attached
(output omitted)
2. Run the following vsipioctl command:
#vsipioctl getfilterstat -f {filter-sfw.2}
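On a host with many VMs, the filter names can be pulled out of saved summarize-dvfilter output and turned into the command for step 2. The sketch below assumes the "name:" field looks like the sample above and that DFW filter names end in vmware-sfw followed by a slot number. It only builds command strings for review and does not run anything on the host; the function names are hypothetical.

```python
import re

# Assumed field format, taken from the summarize-dvfilter sample in this article.
FILTER_NAME_RE = re.compile(r"name:\s*(\S*vmware-sfw\.\d+)")


def dfw_filter_names(summarize_output):
    """Return the DFW filter names found in summarize-dvfilter output."""
    return FILTER_NAME_RE.findall(summarize_output)


def getfilterstat_command(filter_name):
    """Build the verification command from step 2 for one filter."""
    return f"vsipioctl getfilterstat -f {filter_name}"


sample = (
    "world 611217 vmm0:MY_TEST_VM vcUuid:'50 08 ef 5b 94 e7 89 96-89 35 3f 66 94 66 7f 28'\n"
    " port 50331664 UPSA - VM C.eth0\n"
    "  vNic slot 2\n"
    "   name:nic-12345-eth0-vmware-sfw.2\n"
    "   agentName: vmware-sfw\n"
)
for name in dfw_filter_names(sample):
    print(getfilterstat_command(name))
# vsipioctl getfilterstat -f nic-12345-eth0-vmware-sfw.2
```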