ESXi host prepared for NSX-T may experience a PSOD under high DFW rule churn due to a race condition in the cleanup event of dependent address sets
Article ID: 325139
Updated On:
Products
VMware NSX
Issue/Introduction
Symptoms:
ESXi hosts crash with a PSOD
Continuous DFW rule churn in the environment (rules being added, deleted, or modified)
Environment
VMware NSX-T Data Center 2.5.x
VMware NSX-T Data Center 3.x
VMware NSX-T Data Center
Cause
When DFW applies multiple concurrent transactions to the DFW filters on a host, a race condition in the code can corrupt a data structure, which causes the PSOD.
The race is triggered when the rulesets on the vNIC filters are updated (DFW rule add / delete / modify), or when a VM is migrated with vMotion from one host to another and its filter rulesets are updated afterwards. In both cases the older rulesets are cleaned up. If this cleanup runs concurrently with another update to the filters, the data structure is corrupted, leading to a PSOD.
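The general pattern behind this kind of race can be illustrated with a small model. The following is a minimal Python sketch, not the NSX or ESXi implementation: the class names `AddrSetTable` and `VnicFilter`, the reference-counting scheme, and the address set names are all hypothetical. It shows one way concurrent ruleset updates and the cleanup of older rulesets can be serialized so that shared address sets are never released twice or while still in use.

```python
import threading


class AddrSetTable:
    """Hypothetical table of reference-counted address sets shared by rulesets."""

    def __init__(self):
        self._lock = threading.Lock()
        self.refs = {}  # address set name -> number of rulesets using it

    def acquire(self, names):
        with self._lock:
            for n in names:
                self.refs[n] = self.refs.get(n, 0) + 1

    def release(self, names):
        with self._lock:
            for n in names:
                self.refs[n] -= 1
                if self.refs[n] == 0:
                    del self.refs[n]  # free only when no ruleset depends on it


class VnicFilter:
    """Hypothetical filter holding the address sets of its active ruleset."""

    def __init__(self, table):
        self._lock = threading.Lock()
        self._table = table
        self.ruleset = ()

    def apply_ruleset(self, addrsets):
        new = tuple(addrsets)
        self._table.acquire(new)  # take references before publishing
        with self._lock:
            # Atomic swap: exactly one caller receives each old ruleset,
            # so its cleanup happens exactly once.
            old, self.ruleset = self.ruleset, new
        self._table.release(old)  # cleanup of the older ruleset


def run_churn(workers=4, rounds=200):
    """Simulate rule churn: several threads updating one filter concurrently."""
    table = AddrSetTable()
    flt = VnicFilter(table)

    def worker(i):
        for r in range(rounds):
            flt.apply_ruleset(("web", "db") if (i + r) % 2 else ("web", "app"))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return flt, table
```

In this model, removing the locks would let two updates clean up the same old ruleset or modify the reference counts at the same time, which is the kind of corruption the Cause section describes.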
Resolution
This issue is resolved in:
VMware NSX-T Data Center 2.5.2.1 Express Patch, available at VMware Downloads
VMware NSX-T Data Center 3.0.2, available at VMware Downloads
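When auditing several environments, the comparison against the fixed releases can be scripted. The sketch below is a hypothetical helper in Python, not a VMware tool. It assumes that version strings are dot-separated integers and that later builds within the same 2.5 or 3.0 release line also carry the fix, which this article does not state explicitly. For release lines not listed above it returns `None` rather than guessing.

```python
# Fixed releases named in this article, keyed by (major, minor) release line.
FIXED = {(2, 5): (2, 5, 2, 1), (3, 0): (3, 0, 2)}


def parse(version):
    """Turn a dotted version string such as '2.5.2.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def has_fix(version):
    """Return True/False for listed release lines, None if the line is unknown."""
    v = parse(version)
    fixed = FIXED.get(v[:2])
    if fixed is None:
        return None  # release line not covered by this article
    # Assumption: builds at or above the fixed release on the same line include the fix.
    return v >= fixed
```

For example, `has_fix("2.5.1")` returns `False` and `has_fix("3.0.2")` returns `True`. A `None` result means the release notes for that version should be checked directly.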