ESXi host fails with purple diagnostic screen in NSX environment

Article ID: 327887

Updated On:

Products

VMware

Issue/Introduction



Symptoms:
A VMware ESXi 5.5.x host fails with a purple diagnostic screen in an NSX for vSphere environment.
Note: For additional symptoms and log entries, see the Additional Information section.


Cause

This issue occurs in the vdrb kernel module, which uses a counter to reference packet dispatch queues. The counter is incremented each time the module processes packets forwarded between the DLR and an L2 bridge.
Because the counter is signed, it eventually overflows and becomes negative. A negative value points to invalid memory and causes the host to crash.
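
The overflow described above can be illustrated with a minimal sketch. This is not the vdrb source code: the queue count, the 32-bit counter width, the use of a remainder to pick a queue, and the names `NUM_QUEUES`, `signed_queue_index`, and `unsigned_queue_index` are all assumptions made for illustration. It only shows how a signed counter that wraps around can yield a negative index, while an unsigned counter of the same width stays in range.

```python
import ctypes

NUM_QUEUES = 4  # hypothetical number of dispatch queues (assumption)


def signed_queue_index(counter: int) -> int:
    """Pick a queue from a counter stored as a 32-bit signed integer.

    Mimics C semantics: the counter wraps to a negative value after
    2**31 - 1, and C's % operator truncates toward zero, so the
    resulting index can be negative.
    """
    signed = ctypes.c_int32(counter).value  # wrap to 32-bit two's complement
    remainder = abs(signed) % NUM_QUEUES
    return -remainder if signed < 0 else remainder


def unsigned_queue_index(counter: int) -> int:
    """Same selection with a 32-bit unsigned counter: always 0..NUM_QUEUES-1."""
    return ctypes.c_uint32(counter).value % NUM_QUEUES


if __name__ == "__main__":
    last_good = 2**31 - 1       # largest value a 32-bit signed counter holds
    after_wrap = last_good + 2  # two increments later, the counter has wrapped
    print(signed_queue_index(last_good))     # valid index
    print(signed_queue_index(after_wrap))    # negative index, outside the array
    print(unsigned_queue_index(after_wrap))  # still a valid index
```

In a kernel module written in C, a negative index of this kind would address memory before the start of the queue array, which is consistent with the page fault shown in the Additional Information section.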

Resolution

This issue is resolved in VMware NSX for vSphere 6.2.4, available at VMware Downloads.


Additional Information

You experience these additional symptoms:
  • In the ESXi host console, you see entries similar to:
2016-06-27T10:15:26.129Z cpu29:32797)@BlueScreen: #PF Exception 14 in world 32797:idle29 IP 0x418020822659 addr 0x58534e35
PTEs:0x113f03b027;0x10f359d027;0x0;
2016-06-27T10:15:26.129Z cpu29:32797)Code start: 0x418020600000 VMK uptime: 39:21:13:02.659
2016-06-27T10:15:26.129Z cpu29:32797)0x41238075d188:[0x418020822659]vmk_PktListAppend@vmkernel#nover+0xd stack: 0x411325ae8c00
2016-06-27T10:15:26.130Z cpu29:32797)0x41238075d238:[0x4180216ee72a]VdrTxToVswitch@<None>#<None>+0x18e stack: 0x41238075d268
2016-06-27T10:15:26.130Z cpu29:32797)0x41238075d2d8:[0x418020792712]IOChain_Resume@vmkernel#nover+0x17a stack: 0x400000a
2016-06-27T10:15:26.130Z cpu29:32797)0x41238075d348:[0x41802077ab4b]Port_InputResume@vmkernel#nover+0xc3 stack: 0x410a497546e0
2016-06-27T10:15:26.131Z cpu29:32797)0x41238075d378:[0x41802081dda4]vmk_PortInput@vmkernel#nover+0x98 stack: 0x41368ba5e500
2016-06-27T10:15:26.131Z cpu29:32797)0x41238075d438:[0x4180216fd432]VdrTxDispatchQueueDrain@<None>#<None>+0xca stack: 0x2c0176bfca
2016-06-27T10:15:26.132Z cpu29:32797)0x41238075d678:[0x4180216ee4bf]VdrProcessPackets@<None>#<None>+0x14f stack: 0xef
2016-06-27T10:15:26.132Z cpu29:32797)0x41238075d718:[0x418020792712]IOChain_Resume@vmkernel#nover+0x17a stack: 0x4180207971b7
2016-06-27T10:15:26.132Z cpu29:32797)0x41238075d768:[0x41802077a212]PortOutput@vmkernel#nover+0x136 stack: 0x4108dbd56ef0
2016-06-27T10:15:26.133Z cpu29:32797)0x41238075d7c8:[0x418020e77f50]EtherswitchForwardLeafPortsQuick@<None>#<None>+0x4c stack: 0x4112acb
2016-06-27T10:15:26.133Z cpu29:32797)0x41238075d9e8:[0x418020e78f51]EtherswitchPortDispatch@<None>#<None>+0xe29 stack: 0x412400000018
2016-06-27T10:15:26.134Z cpu29:32797)0x41238075da58:[0x41802077ac17]Port_InputResume@vmkernel#nover+0x18f stack: 0x4109227881c0
2016-06-27T10:15:26.134Z cpu29:32797)0x41238075daa8:[0x41802077bf09]Port_Input_Committed@vmkernel#nover+0x25 stack: 0x1a7d26000
2016-06-27T10:15:26.135Z cpu29:32797)0x41238075db38:[0x4180207ccc75]Vmxnet3VMKDevTQDoTx@vmkernel#nover+0x28d stack: 0x2
2016-06-27T10:15:26.135Z cpu29:32797)0x41238075dba8:[0x4180207d0b63]Vmxnet3VMKDev_AsyncTx@vmkernel#nover+0xa3 stack: 0x1e923141dde1b6
2016-06-27T10:15:26.135Z cpu29:32797)0x41238075dbe8:[0x4180207afa9b]NetWorldletPerDevCB@vmkernel#nover+0x77 stack: 0x418047400000
2016-06-27T10:15:26.136Z cpu29:32797)0x41238075dd48:[0x4180206eb7b7]WorldletProcessQueue@vmkernel#nover+0xcf stack: 0x1
2016-06-27T10:15:26.136Z cpu29:32797)0x41238075dd88:[0x4180206ec27c]WorldletBHHandler@vmkernel#nover+0x54 stack: 0x1e9231420a4893
2016-06-27T10:15:26.137Z cpu29:32797)0x41238075ddf8:[0x41802062e659]BH_Check@vmkernel#nover+0xc9 stack: 0xffffffff00000000
2016-06-27T10:15:26.137Z cpu29:32797)0x41238075de68:[0x41802085093d]CpuSchedIdleLoopInt@vmkernel#nover+0x391 stack: 0x41238075de98
2016-06-27T10:15:26.138Z cpu29:32797)0x41238075de78:[0x418020858b3a]CpuSched_IdleLoop@vmkernel#nover+0x16 stack: 0x2
2016-06-27T10:15:26.138Z cpu29:32797)0x41238075de98:[0x418020666910]Init_SlaveIdle@vmkernel#nover+0x78 stack: 0x0
2016-06-27T10:15:26.138Z cpu29:32797)0x41238075dfe8:[0x4180209ac5f2]SMPSlaveIdle@vmkernel#nover+0x29e stack: 0x0

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
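
To compare a captured backtrace with the example above, the function names can be extracted from the log text. The following is a rough sketch, not a supported diagnostic tool: the regular expression is inferred only from the sample lines in this article, and the names `FRAME_RE`, `backtrace_functions`, and `looks_like_vdr_crash` are hypothetical. A match suggests the log resembles this issue; it does not confirm the root cause.

```python
import re

# Matches backtrace frames of the form seen in the example, such as:
#   ...)0x41238075d238:[0x4180216ee72a]VdrTxToVswitch@<None>#<None>+0x18e stack: 0x...
# The pattern is an assumption based on the sample log only.
FRAME_RE = re.compile(r"\)0x[0-9a-f]+:\[0x[0-9a-f]+\](?P<func>[^@\s]+)@")


def backtrace_functions(log_text: str) -> list[str]:
    """Return the function names from backtrace frames, in log order."""
    return [m.group("func") for m in FRAME_RE.finditer(log_text)]


def looks_like_vdr_crash(log_text: str) -> bool:
    """Heuristic: a page fault (#PF Exception 14) with Vdr* frames on the stack."""
    has_page_fault = "#PF Exception 14" in log_text
    has_vdr_frame = any(f.startswith("Vdr") for f in backtrace_functions(log_text))
    return has_page_fault and has_vdr_frame


if __name__ == "__main__":
    sample = (
        "2016-06-27T10:15:26.129Z cpu29:32797)@BlueScreen: #PF Exception 14 in world "
        "32797:idle29 IP 0x418020822659 addr 0x58534e35\n"
        "2016-06-27T10:15:26.129Z cpu29:32797)0x41238075d188:[0x418020822659]"
        "vmk_PktListAppend@vmkernel#nover+0xd stack: 0x411325ae8c00\n"
        "2016-06-27T10:15:26.130Z cpu29:32797)0x41238075d238:[0x4180216ee72a]"
        "VdrTxToVswitch@<None>#<None>+0x18e stack: 0x41238075d268\n"
    )
    print(backtrace_functions(sample))
    print(looks_like_vdr_crash(sample))
```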