ESXi Host PSOD Due to a dvfilter Race Condition That Causes dvfilter to Call the Agent's Restore Function Twice

Article ID: 409331

Products

VMware vDefend Firewall

Issue/Introduction

Customers may observe a PSOD (Purple Screen of Death) on ESXi hosts under specific conditions involving the dvfilter component.

The PSOD occurs when the dvfilter state restoration process is invoked twice because of a race condition. During port enablement, packets that arrive before the initial state restoration completes trigger a second restoration call for the same filter, which eventually causes the PSOD.

Environment

VMware ESXi 8.0.x

VMware NSX

Cause

A race condition in dvfilter is the root cause.

  • When a port is enabled, dvfilter initiates state restoration via an asynchronous helper task.

  • If packets are received before the first restore completes, a second restore request is queued.

  • This results in duplicate state restoration calls, leading to heap corruption and a PSOD.

The issue is fixed in ESXi 9.1 and in ESXi 8.0.3.0 P07.
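
The sequence described in this section can be illustrated with a small, deterministic simulation. This is only a sketch and not dvfilter source code: the class `FilterSim`, the `helper_queue`, the `restore_queued` flag, and the filter name are all hypothetical names chosen for illustration. The guard shown is one generic way to make a queued request idempotent, and it is not claimed to be how the actual fix was implemented.

```python
from collections import deque


class FilterSim:
    """Hypothetical stand-in for a filter; not a real dvfilter structure."""

    def __init__(self, name, guard_duplicates):
        self.name = name
        self.guard_duplicates = guard_duplicates
        self.restored = False        # set only when a restore has finished
        self.restore_queued = False  # set as soon as a restore is queued
        self.restore_calls = 0       # how often the agent's restore ran
        self.helper_queue = deque()  # models the asynchronous helper task

    def request_restore(self):
        # Assumed guard: skip the request if one is already pending or done.
        if self.guard_duplicates and self.restore_queued:
            return
        self.restore_queued = True
        self.helper_queue.append(self._agent_restore)

    def enable_port(self):
        # Port enablement queues the initial state restoration.
        self.request_restore()

    def packet_arrives(self):
        # A packet that sees an unrestored filter asks for a restore.
        if not self.restored:
            self.request_restore()

    def _agent_restore(self):
        self.restore_calls += 1
        self.restored = True

    def run_helper(self):
        # The helper runs later and drains everything that was queued.
        while self.helper_queue:
            self.helper_queue.popleft()()


def simulate(guard_duplicates):
    f = FilterSim("nic-example-eth0-vmware-sfw.2", guard_duplicates)
    f.enable_port()      # first restore request is queued
    f.packet_arrives()   # packet arrives before the first restore completes
    f.run_helper()       # queued restores now execute
    return f.restore_calls
```

Without the guard, `simulate(False)` runs the restore function twice, which mirrors the duplicate call described above. With the guard, `simulate(True)` runs it once, because the second request is dropped at queue time rather than after the first restore completes.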

Log Excerpts:

Backtrace at PSOD (pointers/addresses and exact file paths/line numbers are masked; the call flow is kept intact):

(gdb) where
#0  DLM_free (msp=0x************, mem=<optimized out>, allowTrim=1) at …/dlmalloc.c:****
#1  0x**************** in Heap_Free (heap=0x************, mem=<optimized out>) at …/heap.c:****
#2  0x**************** in vmk_HeapFree (heap=<optimized out>, mem=<optimized out>) at …/vmkapi_heap.c:****
#3  0x**************** in VSIPFreeFromHeapWithoutAccounting (heapID=<optimized out>, data=<optimized out>) at …/vsip_util.c:****
#4  0x**************** in uma_zfree_arg (z=0x************, item=item@entry=0x************, arg=arg@entry=0x0) at …/glue.c:****
#5  0x**************** in uma_zfree (item=0x************, zone=<optimized out>) at …/uma.h:****
#6  pfr_destroy_ktable (kif=kif@entry=0x************, kt=kt@entry=0x************, flushflags=7, set=PFR_SET_ACTIVE) at …/pf_table.c:****
#7  0x**************** in pfr_setflags_ktable (…) at …/pf_table.c:****
#8  0x**************** in pfr_detach_table (…) at …/pf_table.c:****
#9  0x**************** in pf_tbladdr_remove (…) at …/pf.c:****
#10 0x**************** in pf_rm_rule (…) at …/pf_ioctl.c:****
#11 0x**************** in pf_commit_rules (…) at …/pf_ioctl.c:****
#12 0x**************** in pfioctl (…) at …/pf_ioctl.c:****
#13 0x**************** in VSIPCommitTransaction (…) at …/msg2pf.c:****
#14 0x**************** in PFImportSingleRulesetTLV (…) at …/migrate.c:****
#15 0x**************** in PFImportRulesTLV (…) at …/migrate.c:****
#16 0x**************** in PFImportStateTLV (…) at …/migrate.c:****
#17 0x**************** in PFImportState (…) at …/migrate.c:****
#18 0x**************** in VSIPDVFRestoreState (…) at …/vsip_dvfilter.c:****

In the host's vmkernel logs:

Filter Creation

Creating filter, expect restore
Filter nic-2786293-eth0-vmware-sfw.2 created
Registered filter nic-2786293-eth0-vmware-sfw.2

First State Restoration (Successful)

Restore state called for filter nic-2786293-eth0-vmware-sfw.2
Importing nic-2786293-eth0-vmware-sfw.2
Importing succeeded
Filter creation report: source = Import

Second State Restoration for the same filter causes a PSOD

Restore state called for filter nic-2786293-eth0-vmware-sfw.2
Importing nic-2786293-eth0-vmware-sfw.2
Unconfigured filter nic-2786293-eth0-vmware-sfw.2
--- PSOD Triggered ---
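
When reviewing a vmkernel log for this pattern, a small script can flag filters that show more than one "Restore state called" message after a single creation. The following is a heuristic sketch, not an official diagnostic tool: the function name `find_repeated_restores` is hypothetical, and the regular expressions assume the message text appears exactly as in the excerpts above (real log lines carry additional prefixes such as timestamps, which the substring search tolerates).

```python
import re

# Message formats taken from the log excerpts in this article.
CREATED_RE = re.compile(r"Filter (\S+) created")
RESTORE_RE = re.compile(r"Restore state called for filter (\S+)")


def find_repeated_restores(log_lines):
    """Return {filter_name: restore_count} for filters restored more than
    once since their most recent 'created' message."""
    since_create = {}
    flagged = {}
    for line in log_lines:
        created = CREATED_RE.search(line)
        if created:
            # A new creation starts a fresh count for that filter.
            since_create[created.group(1)] = 0
            continue
        restored = RESTORE_RE.search(line)
        if restored:
            name = restored.group(1)
            since_create[name] = since_create.get(name, 0) + 1
            if since_create[name] > 1:
                flagged[name] = since_create[name]
    return flagged
```

Passing the lines of a vmkernel log, for example `find_repeated_restores(open(path))` with `path` pointing at a copy of the log, returns an empty dictionary when no filter shows a repeated restore. A non-empty result is only an indicator and should be confirmed against the backtrace.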

Resolution

  • The issue is fixed in ESXi 9.1 and ESXi 8.0.3.0 P07.
  • No workaround is available. Upgrading to a fixed build is required.