ESXi host prepared for NSX-T may experience a PSOD under high DFW rule churn due to a race condition in the cleanup event of dependent address sets

Article ID: 325139

Updated On:

Products

VMware NSX Networking

Issue/Introduction

Symptoms:
  • ESXi hosts crash with a PSOD
  • DFW rules in the environment are continuously being added, modified, or deleted (high rule churn)


Environment

VMware NSX-T Data Center 2.5.x
VMware NSX-T Data Center 3.x
VMware NSX-T Data Center

Cause

  1. When DFW applies multiple transactions to the DFW filters on a host, a race condition in the code can corrupt a data structure, and that corruption causes the PSOD.
  2. The race is triggered when the DFW rule configuration churns (rules present on the vNIC filters are added, deleted, or modified), or when a VM is vMotioned from one host to another and its filters receive a ruleset update after the vMotion. In both cases the cleanup of the older rulesets runs concurrently with the update, and the resulting corruption leads to the PSOD.
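
The class of bug described above can be illustrated with a minimal Python sketch. It is not the NSX or ESXi kernel code, and the names AddrSetTable, attach, cleanup, churn, and run_demo are hypothetical. It models address sets as reference-counted entries shared by rulesets, and shows how serializing the cleanup of an older ruleset and the installation of a new one under a single lock keeps the shared structure consistent during rule churn.

```python
import threading


class AddrSetTable:
    """Toy table of reference-counted address sets shared by rulesets."""

    def __init__(self):
        self._lock = threading.Lock()
        self._refs = {}  # address-set name -> number of rulesets using it

    def attach(self, names):
        # Called when a new ruleset is installed on a filter.
        with self._lock:
            for name in names:
                self._refs[name] = self._refs.get(name, 0) + 1

    def cleanup(self, names):
        # Called when an older ruleset is retired. The decrement and the
        # removal happen under the same lock as attach(), so a concurrent
        # update never observes a half-removed entry.
        with self._lock:
            for name in names:
                self._refs[name] -= 1
                if self._refs[name] == 0:
                    del self._refs[name]

    def snapshot(self):
        with self._lock:
            return dict(self._refs)


def churn(table, names, rounds):
    # Simulates rule churn: install a new ruleset, then retire the old one.
    for _ in range(rounds):
        table.attach(names)
        table.cleanup(names)


def run_demo(workers=8, rounds=2000):
    table = AddrSetTable()
    table.attach(["web", "db"])  # baseline ruleset that stays installed
    threads = [
        threading.Thread(target=churn, args=(table, ["web", "db"], rounds))
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return table.snapshot()
```

With the lock in place, run_demo() ends with exactly one reference per address set (the baseline ruleset). Removing the lock turns the decrement-then-delete step into an unprotected check-then-act sequence, which is the general pattern that allows a concurrent update to collide with a cleanup.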

Resolution

This issue is resolved in:
VMware NSX-T Data Center 2.5.2.1 Express Patch, available at VMware Downloads
VMware NSX-T Data Center 3.0.2, available at VMware Downloads

Workaround:
Reboot the ESXi host

Additional Information

Impact/Risks:
Affects VMware NSX-T Data Center:

2.5.0, 2.5.1, and 2.5.2 
3.0.0 and 3.0.1
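
To check an inventory of NSX-T versions against the releases listed above, a small helper along the following lines may be useful. This is a sketch in Python with hypothetical function names. It treats every build from the first affected release up to (but not including) the fixed release as affected, which is an assumption that goes slightly beyond the explicit version list in this article.

```python
def parse_version(text):
    # "3.0.1" -> (3, 0, 1); extra dotted components are kept as integers.
    return tuple(int(part) for part in text.strip().split("."))


# (first affected release, first fixed release) per release line,
# taken from the Impact/Risks and Resolution sections of this article.
AFFECTED_RANGES = [
    ((2, 5, 0), (2, 5, 2, 1)),
    ((3, 0, 0), (3, 0, 2)),
]


def is_affected(version_text):
    # Tuple comparison is lexicographic, so (2, 5, 2) < (2, 5, 2, 1).
    version = parse_version(version_text)
    return any(low <= version < high for low, high in AFFECTED_RANGES)
```

For example, is_affected("2.5.2") is True, while is_affected("2.5.2.1") and is_affected("3.0.2") are False.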