Symptoms:
Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
VMware NSX for vSphere 6.3.x
VMware NSX for vSphere 6.4.x
A high volume of container updates locks the entire data path, resulting in high latency. This is typically seen in VDI environments, where VMs are constantly added, removed, and power cycled, leading to frequent container updates.
This issue is resolved in VMware NSX Data Center for vSphere 6.4.7. Upgrade to the latest version of NSX for vSphere, following the document Download Broadcom products and software.
Workaround:
To work around this issue, enable Global Containers via an API call.
First, obtain the current configuration:
GET https://<NSX-Manager-IP>/api/4.0/firewall/config/globalconfiguration
<?xml version="1.0" encoding="UTF-8"?>
<globalConfiguration>
<layer3RuleOptimize>false</layer3RuleOptimize>
<layer2RuleOptimize>false</layer2RuleOptimize>
<tcpStrictOption>false</tcpStrictOption>
<enableGlobalContainers>false</enableGlobalContainers>
<autoDraftDisabled>true</autoDraftDisabled>
</globalConfiguration>
Set <enableGlobalContainers> to true and send the updated configuration:
PUT https://<NSX-Manager-IP>/api/4.0/firewall/config/globalconfiguration
<?xml version="1.0" encoding="UTF-8"?>
<globalConfiguration>
<layer3RuleOptimize>false</layer3RuleOptimize>
<layer2RuleOptimize>false</layer2RuleOptimize>
<tcpStrictOption>false</tcpStrictOption>
<enableGlobalContainers>true</enableGlobalContainers>
<autoDraftDisabled>true</autoDraftDisabled>
</globalConfiguration>
Confirm the change is in place by running the GET command again:
GET https://<NSX-Manager-IP>/api/4.0/firewall/config/globalconfiguration
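The GET, edit, PUT sequence above can be sketched in Python. This is a minimal illustration and not an official client: the function names are hypothetical, HTTP basic authentication is an assumption, and the request is only built, not sent. The endpoint path and the XML element names are the ones shown above.

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET

# Path taken from the GET/PUT calls shown above.
CONFIG_PATH = "/api/4.0/firewall/config/globalconfiguration"


def enable_global_containers(config_xml: str) -> str:
    """Return the configuration XML with enableGlobalContainers set to true.

    All other elements are passed through unchanged.
    """
    root = ET.fromstring(config_xml.encode("utf-8"))
    if root.tag != "globalConfiguration":
        raise ValueError(f"unexpected root element: {root.tag}")
    node = root.find("enableGlobalContainers")
    if node is None:
        # Do not guess at the schema; stop if the element is missing.
        raise ValueError("enableGlobalContainers element not found")
    node.text = "true"
    return ET.tostring(root, encoding="unicode")


def build_put_request(host: str, user: str, password: str, body_xml: str):
    """Build (but do not send) the PUT request. Basic auth is an assumption."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"https://{host}{CONFIG_PATH}",
        data=body_xml.encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/xml",
        },
    )
```

The body passed to enable_global_containers would be the response of the initial GET. Sending the request (for example with urllib.request.urlopen) and handling certificates are left out because they depend on the environment.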
Impact/Risks:
Do not enable Global Containers if Spoofguard is in use. This can cause vNIC disconnects during vMotion, resulting in failed migrations.
DFW rules have two components: the rule itself, which specifies the 5-tuple and action, and the address sets, which are specified as part of the SRC/DST.
A rule has the form: from SRC A to DST B, Service C, Action, Applied To D. Here SRC A and DST B can be address sets of the form
SRC A {
Address A1
CIDR A2/Mask
Address A3
...
}
and Applied To D is a set of vNIC UUIDs to which the rule is applied, of the form
appliedTo {
vNIC1 UUID
vNIC2 UUID
..
}
Because the Applied To field is optional, a rule without it is applied to every filter (vNIC) in the cluster. Rules are programmed only on the filters named in Applied To, so each filter can end up with a unique set of rules. Along with the rules, the corresponding address sets (SRC A, DST B, and so on) consumed by each rule are programmed. The rules themselves are typically limited per filter (a few hundred to the low thousands), and because each filter can have unique rules, they consume a finite amount of memory on the system (limited to a maximum of 3 GB).
The consolidation ratio (number of VMs per host) is typically around 30-60 but can reach a few hundred in cases such as VDI. As a result, the rules and address sets are replicated across each filter. Address sets often contain a large number of addresses, which causes memory bloat as well as configuration churn. Rules typically do not change often in a datacenter, but dynamic address sets, which are populated as VMs are added or removed (including powered on or off), lead to constant configuration churn and address sets of unbounded size.
Because address sets are also configured per filter, every time an address set changes, a config cycle has to be performed on each filter where it is consumed, which interrupts the datapath. Unlike the custom rules per filter, the address set information is the same across all filters. As an optimization, enabling Global (Shared) Address Sets keeps only one copy of the address set in the DFW engine, so the configuration is done once per update rather than once per filter. Because there is only one copy, memory consumption also drops substantially. Instead of holding its own copy, each filter simply points to the shared address set.
With this enabled, the following is shown on the host:
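The difference between per-filter and shared address sets can be illustrated with a simple counting model. This is a rough sketch under stated assumptions, not a description of the DFW engine internals: it assumes every filter on the host consumes the same address set, and the function names and figures (200 filters, 5,000 addresses, 50 updates) are hypothetical.

```python
def per_filter_cost(num_filters: int, addrset_entries: int, updates: int):
    """Each filter holds its own copy, so every update touches every filter."""
    stored_entries = num_filters * addrset_entries
    config_cycles = num_filters * updates
    return stored_entries, config_cycles


def shared_cost(num_filters: int, addrset_entries: int, updates: int):
    """One shared copy; filters only point to it, so updates are applied once."""
    stored_entries = addrset_entries
    config_cycles = updates
    return stored_entries, config_cycles


# Hypothetical VDI-style host: 200 filters, one 5,000-address set, 50 updates.
print(per_filter_cost(200, 5000, 50))  # (1000000, 10000)
print(shared_cost(200, 5000, 50))      # (5000, 50)
```

In this model both the stored entries and the number of config cycles shrink by a factor equal to the number of filters, which matches the reasoning that the saving grows with the consolidation ratio.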
vsipioctl getaddrsets -f <filtername>
global addrset <<===
addrset ip-spoofguard-sfw {
# generation number: 0
# realization time : 2020-01-23T06:54:09
ip 175.20.0.194,
ip 2001:3002::250:56ff:febb:b430,
}
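To check this output across many filters, the text can be parsed with a short script. The sketch below is based only on the excerpt above, so the real vsipioctl output may contain other line types that it ignores. The function name is hypothetical, and it treats a line starting with "global addrset" as the marker for shared address sets.

```python
import re


def parse_addrsets(output: str):
    """Return (is_global, {addrset name: [ip, ...]}) from getaddrsets output."""
    is_global = False
    addrsets = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("global addrset"):
            # Marker line seen in the excerpt when shared address sets are on.
            is_global = True
            continue
        match = re.match(r"addrset\s+(\S+)\s*\{", line)
        if match:
            current = match.group(1)
            addrsets[current] = []
        elif line.startswith("ip ") and current is not None:
            # Drop the "ip " prefix and the trailing comma.
            addrsets[current].append(line[3:].rstrip(",").strip())
        elif line == "}":
            current = None
    return is_global, addrsets
```

Given the excerpt above as input, the function reports the global marker as present and returns one address set, ip-spoofguard-sfw, with its two addresses.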