ESXi host fails with PSOD (purple screen of death) in NSX Environment

Article ID: 306803

Products

VMware vSphere ESXi
VMware NSX for vSphere

Issue/Introduction

An ESXi host in an NSX environment fails with a purple screen of death (PSOD) and a backtrace similar to the following:

2016-10-15T02:44:04.401Z cpu24:33537)World: 9729: PRDA 0x418046000000 ss 0x0 ds 0x4018 es 0x4018 fs 0x0 gs 0x0

2016-10-15T02:44:04.401Z cpu24:33537)World: 9731: TR 0x4000 GDT 0xfffffffffc60a000 (0xffff) IDT 0xfffffffffc608000 (0xffff)

2016-10-15T02:44:04.401Z cpu24:33537)World: 9732: CR0 0x80050033 CR3 0x80ed16f000 CR4 0x52668

2016-10-15T02:44:04.438Z cpu24:33537)Backtrace for current CPU #24, worldID=33537, rbp=0x43a641cc47d0

2016-10-15T02:44:04.438Z cpu24:33537)0x43919809b4a0:[0x41800a325397][email protected]#v2_3_0_0+0x3b stack:

2016-10-15T02:44:04.438Z cpu24:33537)0x43919809b4d0:[0x41800994e998]IOChain_Resume@vmkernel#nover+0x210 stack: 0x4303e66bf658, 0x4391980

2016-10-15T02:44:04.438Z cpu24:33537)0x43919809b570:[0x41800993226e]PortOutput@vmkernel#nover+0xae stack: 0x417fc9a7f280, 0x43a641cc47d0

2016-10-15T02:44:04.438Z cpu24:33537)0x43919809b5b0:[0x418009fff5ec]EtherswitchForwardLeafPorts@<None>#<None>+0x244 stack: 0x54a33c00, 0

2016-10-15T02:44:04.438Z cpu24:33537)0x43919809b620:[0x41800a000778]EtherswitchPortDispatch@<None>#<None>+0x604 stack: 0x4303e6659f00, 0

2016-10-15T02:44:04.438Z cpu24:33537)0x43919809b810:[0x4180099324d3]Port_InputResume@vmkernel#nover+0x17b stack: 0x41800a000174, 0x41800

2016-10-15T02:44:04.438Z cpu24:33537)0x43919809b870:[0x4180099dd0f0]vmk_PortInput@vmkernel#nover+0x80 stack: 0x0, 0x4303e6648200, 0x4391

2016-10-15T02:44:04.438Z cpu24:33537)0x43919809b8a0:[0x41800a545a4f]VdrTxDispatchQueueDrain@<None>#<None>+0xbf stack: 0x0, 0x80000080000

2016-10-15T02:44:04.438Z cpu24:33537)0x43919809b950:[0x41800a53589d]VdrProcessPackets@<None>#<None>+0x161 stack: 0x0, 0x418000000000, 0x

2016-10-15T02:44:04.438Z cpu24:33537)0x43919809bb80:[0x41800994e981]IOChain_Resume@vmkernel#nover+0x1f9 stack: 0x4303e6648798, 0x4391980

2016-10-15T02:44:04.438Z cpu24:33537)0x43919809bc20:[0x41800993226e]PortOutput@vmkernel#nover+0xae stack: 0x43a6549dc0c0, 0x43a641d30828

2016-10-15T02:44:04.438Z cpu24:33537)0x43919809bc60:[0x418009fff2f2]EtherswitchForwardLeafPortsQuick@<None>#<None>+0x136 stack: 0x439198

2016-10-15T02:44:04.438Z cpu24:33537)0x43919809bcb0:[0x41800a000760]EtherswitchPortDispatch@<None>#<None>+0x5ec stack: 0x4303e6648200, 0

2016-10-15T02:44:04.438Z cpu24:33537)0x43919809bea0:[0x4180099324d3]Port_InputResume@vmkernel#nover+0x17b stack: 0x41800a000174, 0x41800

2016-10-15T02:44:04.438Z cpu24:33537)0x43919809bf00:[0x41800991e7e5]Net_AcceptRxList@vmkernel#nover+0x115 stack: 0x4303e66586c0, 0x1, 0x

2016-10-15T02:44:04.438Z cpu24:33537)0x43919809bf60:[0x4180099d9db7]NetPollWorldCallback@vmkernel#nover+0x12f stack: 0x14, 0x418009fc06d

2016-10-15T02:44:04.438Z cpu24:33537)0x43919809bfd0:[0x418009a14a3e]CpuSched_StartWorld@vmkernel#nover+0xa2 stack: 0x0, 0x0, 0x0, 0x0, 0

2016-10-15T02:44:04.456Z cpu24:33537)^[[45m^[[33;1mVMware ESXi 6.0.0 [Releasebuild-3620759 x86_64]^[[0m

#PF Exception 14 in world 33537:vmnic0-pollW IP 0x41800a325397 addr 0x43070000005a

PTEs:0x8000000100009023;0x800000016a328063;0x0;

The PSOD can occur while dvfilter is in use. When the IOChain callback for dvfilter is installed during PortDoEnable, the filter may not yet be ready, which creates a race condition. The function that inserts the IOChain already detects this race, but one of its callers does not verify that the filter is ready.
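The pattern described above, a callee that detects the not-ready state and a caller that does not check for it, can be illustrated with a small Python sketch. This is not VMkernel code: the names `Filter`, `insert_io_chain`, `enable_port_unchecked`, and `enable_port_checked` are hypothetical, and the exact failure path inside ESXi is an assumption made only for illustration.

```python
class Filter:
    """Hypothetical stand-in for a filter attached to a port."""

    def __init__(self):
        self.ready = False
        self.callback = None  # only set once the filter is ready


def insert_io_chain(flt):
    """Install the callback, refusing while the filter is not ready."""
    if not flt.ready:
        return False  # the inserting function catches the race
    flt.callback = lambda pkt: pkt
    return True


def enable_port_unchecked(flt):
    """Caller that ignores the result and assumes the callback exists."""
    insert_io_chain(flt)
    # Fails with TypeError when the filter was not ready (callback is None).
    return flt.callback("pkt")


def enable_port_checked(flt):
    """Caller that verifies readiness before using the callback."""
    if not insert_io_chain(flt):
        return None  # filter not ready yet; nothing is dispatched
    return flt.callback("pkt")
```

In this sketch the unchecked caller fails only when it runs before the filter becomes ready, which is why the problem appears intermittently rather than on every port enable.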



Environment

VMware vSphere ESXi 6.0
VMware NSX for vSphere 6.4.x

Resolution

This is a known issue in ESXi 6.0 Update 2 (Build 3620759) and ESXi 6.0 Patch 3 (Build 4192238).

To resolve this issue, upgrade to ESXi 6.0 Express Patch 7 (Build 4510822) or later.
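As a rough aid, the build numbers above can be compared against the release banner that appears in the PSOD log (for example `VMware ESXi 6.0.0 [Releasebuild-3620759 x86_64]`). The following Python sketch assumes that banner format; the function name `classify_banner` and the returned labels are hypothetical. Builds that this article does not mention are reported as "not stated" rather than guessed at.

```python
import re

# Build numbers taken from this article.
KNOWN_AFFECTED_BUILDS = {3620759, 4192238}  # 6.0 Update 2, 6.0 Patch 3
FIRST_FIXED_BUILD = 4510822                 # 6.0 Express Patch 7

# Assumed banner format, as seen in the PSOD log excerpt.
BANNER_RE = re.compile(r"VMware ESXi (\d+\.\d+\.\d+) \[Releasebuild-(\d+)")


def classify_banner(banner):
    """Classify an ESXi release banner against the builds in this article."""
    match = BANNER_RE.search(banner)
    if match is None:
        return "unrecognized"
    version, build = match.group(1), int(match.group(2))
    if not version.startswith("6.0."):
        return "not covered"      # article only discusses ESXi 6.0
    if build in KNOWN_AFFECTED_BUILDS:
        return "known affected"
    if build >= FIRST_FIXED_BUILD:
        return "fixed"
    return "not stated"           # earlier or intermediate 6.0 builds


print(classify_banner("VMware ESXi 6.0.0 [Releasebuild-3620759 x86_64]"))
```

The check is deliberately conservative: it treats only the two builds named in the Resolution section as affected and only builds at or above Express Patch 7 as fixed.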