Understanding Exception 6 purple diagnostic screens

Products

VMware vSphere ESXi

Issue/Introduction

Notes:

If you encounter a purple diagnostic screen that does not contain either of the messages shown under Symptoms, see Interpreting an ESX host purple diagnostic screen (1004250) to determine the type of exception condition encountered.
If you have encountered an Invalid Opcode message but have not experienced a purple diagnostic screen, see Understanding Invalid Opcode messages (1003278).

Symptoms:

The purple diagnostic screen contains one of these messages, depending on the product version that you are using:

VMware ESX 4.0, VMware ESX 4.1

VMware ESX [Releasebuild-164009 X86_64]
#UD Exception(6) in world 4234:hxge @ 0x418001086455
frame=0x4100c0457db8 ip=0x418001086455 cr2=0x0 cr3=0x7f59a000
err=6 rflags=0x10202 cr4=0x16c
rax=0x6c rbx=0xb2 rcx=0xc0457ca3 rdx=0x6c
rbp=0x4100c0457ec0 rsi=0x4100c0457d30 rdi=0x0 r8=0x10
r9=0x4100c0457d50 r10=0x0 r11=0x0 r12=0x4100b6005570
r13=0x0 r14=0xffff r15=0x2964a5
0:4096/console 1:4097/idle1 2:4098/idle2 3:4099/idle3 *4:4234/hxge 5:4101/idle5 6:4102/idle6 7:4103/idle7 8:4104/idle8 9:4105/idle9 10:4106/idle10 11:4107/idle11 12:4108/idle12 13:4109/idle13 14:4110/idle14 15:4111/idle15
@BlueScreen: #UD Exception(6) in world 4234:hxge @ 0x418001086455
Code starts at 0x418000c00000
0x4100c0457ec0:[0x418001086455]reclaim_tx_ring+0x5c stack: 0x400
0x4100c0457f60:[0x418001087f7b]reclaim_thread+0x6a stack: 0x41000000001e
0x4100c0457fa0:[0x418000fb4fc2]LinuxStartFunc+0x51 stack: 0x1e
0x4100c0457ff0:[0x418000c96cff]vmkWorldFunc+0x52 stack: 0x0
0x4100c0457ff8:[0x0]Unknown stack: 0x0
VMK uptime: 0:07:32:07.408 TSC: 68550556360975
FSbase (0x0) GSbase (0x0) kernelGSbase (0x0)

VMware ESX 3.5

VMware ESX Server [Releasebuild-163429]
Exception type 6 in world 1047:helper1-2 @ 0x8b35a9
frame=0x3a5fe24 ip=0x8b35a9 cr2=0x0 cr3=0x3f6a3000 cr4=0x16c
es=0x674028 ds=0x2854028 fs=0x2884028 gs=0x4028
eax=0x0 ebx=0x2255100 ecx=0x41502000 edx=0x0
ebp=0x3d641078 esi=0x3d6011f4 edi=0x3d6011e8 err=0 eflags=0x10246
0:1024/console 1:1025/idle1 2:1026/idle2 *3:1047/helper1-2 4:1028/idle4 5:1029/idle5 6:1030/idle6 7:1031/idle7 8:1032/idle8 9:1033/idle9 10:1034/idle10 11:1035/idle11 12:1036/idle12 13:1037/idle13 14:1038/idle14 15:1039/idle15
@BlueScreen: Exception type 6 in world 1047:helper1-2 @ 0x8b35a9
0x3d641078:[0x8b35a9]aacraid_esx30+0x75a8 stack: 0x0, 0x0, 0x0
VMK uptime: 0:10:48:56.626 TSC: 72683955512532
0:10:46:53.446 cpu0:1047)NMI: 1625: Faulting eip:esp [0x6402a7:0x3a5fe2c]
0:10:48:53.446 cpu1:1025)Heartbeat: 470: PCPU 0 didn't have a heartbeat for 180 seconds. *may* be locked up
0:10:48:53.446 cpu0:1047)NMI: 1625: Faulting eip:esp [0x6402ab:0x3a5fe2c]
Starting coredump to disk Starting coredump to disk Dumping using slot 1 of 1... using slot 1 of 1... log
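When the screen text has been captured (for example, transcribed from a screenshot), the exception number and the faulting world can be pulled out of either message format with a short script. The following is a minimal Python sketch, not a VMware tool: the function name `parse_exception_header` and the returned dictionary keys are hypothetical, and the pattern is based only on the two sample messages in this article, so other builds may format the line differently.

```python
import re

# Matches both header styles shown in this article:
#   "#UD Exception(6) in world 4234:hxge @ 0x418001086455"   (ESX 4.x sample)
#   "Exception type 6 in world 1047:helper1-2 @ 0x8b35a9"    (ESX 3.5 sample)
HEADER_RE = re.compile(
    r"(?:#(?P<mnemonic>\w+) Exception\((?P<num_new>\d+)\)"
    r"|Exception type (?P<num_old>\d+))"
    r" in world (?P<world_id>\d+):(?P<world_name>\S+)"
    r" @ (?P<ip>0x[0-9a-fA-F]+)"
)


def parse_exception_header(screen_text):
    """Return exception number, world, and instruction pointer, or None."""
    match = HEADER_RE.search(screen_text)
    if match is None:
        return None
    number = match.group("num_new") or match.group("num_old")
    return {
        "exception": int(number),
        "world_id": int(match.group("world_id")),
        "world_name": match.group("world_name"),
        "ip": int(match.group("ip"), 16),
    }


if __name__ == "__main__":
    sample = "#UD Exception(6) in world 4234:hxge @ 0x418001086455"
    print(parse_exception_header(sample))
```

A result whose `exception` value is 6 corresponds to the condition described in this article; a `None` result means the text does not match either sample format.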

Environment

VMware ESXi 4.0.x Installable
VMware ESXi 3.5.x Embedded
VMware ESX 4.1.x
VMware ESXi 4.1.x Embedded
VMware ESX Server 3.0.x
VMware ESXi 3.5.x Installable
VMware ESX Server 3.5.x
VMware ESXi 4.0.x Embedded
VMware ESXi 4.1.x Installable
VMware ESX 4.0.x

Resolution

Overview

An Operation Code (Opcode) is the part of a machine language instruction that specifies the operation to be performed, for example add, subtract, copy, or compare.

The specification and format of an Opcode are defined in the instruction set architecture of the processor (CPU) or of a component of the processor.

Another part of the machine language instruction is the Operand, which specifies the data that needs to be manipulated.

For example, an instruction for a CPU can be written to add (the opcode) a number (an operand) and another number (another operand) together.

A complete instruction requires other elements as well, such as CPU registers and memory addresses. Details on these additional elements fall outside the scope of this article.
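The relationship between an opcode and its operands, and what happens when the opcode is not defined, can be illustrated with a toy decoder. This is a minimal Python sketch and not a model of any real CPU or of ESX: the opcode values, the `execute` function, and the `InvalidOpcodeError` class are all invented for illustration.

```python
"""Toy illustration of opcodes and operands (not a real instruction set)."""

# Hypothetical opcode table: numeric opcode -> (mnemonic, operation).
OPCODES = {
    0x01: ("add", lambda a, b: a + b),
    0x02: ("sub", lambda a, b: a - b),
}


class InvalidOpcodeError(Exception):
    """Raised when the opcode is not defined in the toy instruction set."""


def execute(opcode, operand_a, operand_b):
    """Look up the opcode and apply its operation to the two operands."""
    try:
        _mnemonic, operation = OPCODES[opcode]
    except KeyError:
        # On a real x86 CPU, an undefined opcode raises #UD (exception 6).
        raise InvalidOpcodeError(f"undefined opcode 0x{opcode:02x}") from None
    return operation(operand_a, operand_b)


if __name__ == "__main__":
    print(execute(0x01, 2, 3))  # add (the opcode) applied to 2 and 3
```

Calling `execute` with a value that is not in the table, such as `0xFF`, raises `InvalidOpcodeError`, which is the toy counterpart of the condition this article describes.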

Exception 6: Invalid Opcode

In ESX, the #UD Exception (6) or Exception Type 6 purple diagnostic screen results from a CPU attempting to execute an undefined or invalid instruction (Opcode). Possible causes include, but are not limited to:

Driver errors:
- A device driver may have mishandled instructions involving a problematic peripheral device.
- A device may not be fully supported by the device driver.
- The device driver may have software issues.
Firmware errors: A device may have malfunctioning or unsupported firmware which results in incorrect hardware initialization or utilization. This may potentially manifest as an error in the device driver.
Hardware failure: A hardware component has encountered an exceptional condition and has failed, either partially or completely. This may include system memory, CPUs, the system motherboard, or a peripheral.
Memory corruption:
- Any of the causes above, including software issues, can result in memory corruption. Instructions involving corrupted memory regions can result in several exception types, including Exception 6.
- Memory corruption may also result in Exception 13 or Exception 14 purple diagnostic screens. For more information, see Understanding Exception 13 and Exception 14 purple diagnostic screen events (1020181).
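When narrowing down which of these causes applies, it can help to list the symbols in the backtrace, because the top frame names the code that was executing when the exception was raised. The following is a minimal Python sketch under stated assumptions: the helper name `backtrace_symbols` is hypothetical, and the pattern is derived only from the sample backtrace lines in this article. The top symbol is a starting point for investigation, not proof of the root cause, since memory corruption or a hardware fault can surface in unrelated code.

```python
import re

# Matches backtrace entries such as:
#   0x4100c0457ec0:[0x418001086455]reclaim_tx_ring+0x5c stack: 0x400
#   0x4100c0457ff8:[0x0]Unknown stack: 0x0
FRAME_RE = re.compile(
    r"(?P<frame>0x[0-9a-fA-F]+):\[(?P<addr>0x[0-9a-fA-F]+)\]"
    r"(?P<symbol>[^\s+]+)(?:\+(?P<offset>0x[0-9a-fA-F]+))?\s+stack:"
)


def backtrace_symbols(screen_text):
    """Return (symbol, offset) pairs in the order they appear on screen."""
    return [
        (match.group("symbol"), match.group("offset"))
        for match in FRAME_RE.finditer(screen_text)
    ]


if __name__ == "__main__":
    sample = (
        "0x4100c0457ec0:[0x418001086455]reclaim_tx_ring+0x5c stack: 0x400 "
        "0x4100c0457f60:[0x418001087f7b]reclaim_thread+0x6a stack: 0x41000000001e"
    )
    for symbol, offset in backtrace_symbols(sample):
        print(symbol, offset)
```

In the ESX 4.x sample in this article, the first symbol is `reclaim_tx_ring` and the faulting world is `hxge`, which points the investigation towards that driver.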

Steps

1. Take a screenshot from your System Management Interface (such as iLO, iLOM, DRAC, RSA, or an IP KVM), and wait for the server to generate a core dump.

   Note: The core dump process can take up to an hour to complete. Its completion should be indicated by a Disk dump successful message towards the end of the diagnostic screen's printed output.
2. Reboot the server.
3. Collect diagnostic information from the affected VMware ESX host and submit a support request. For more information, see Collecting diagnostic information for VMware products (1008524).

Additional Information

For more information on the analysis and interpretation of an ESX host purple diagnostic screen, see: Interpreting an ESX host purple diagnostic screen (1004250).

Note: If you encounter an error that has not been documented within the knowledge base, collect diagnostic information from the ESX host and submit a support request.

Understanding Invalid Opcode messages
Interpreting an ESX/ESXi host purple diagnostic screen
Collecting diagnostic information for VMware products