When an Oops or panic occurs in the service console of a VMware ESX host, a purple screen fault is generated.
Note: If the advanced setting Misc.PsodOnCosPanic is set to zero (0), a purple screen fault does not occur. Ensure that this is not the case, because the purple screen information is necessary to diagnose any issues the host is experiencing. Also ensure that Misc.CosCoreFile is set correctly so that a core dump for the service console is generated as well.
A purple screen caused by a service console fault contains two main components: the VMkernel purple screen output, followed by the service console Linux kernel output. For more information on decoding a VMkernel purple screen, see Interpreting an ESX host purple diagnostic screen (1004250).
The contents of the purple screen in this example are:
VMware ESX Server [Releasebuild-64607]
Oops
frame=0x1f16d34 ip=0xc022e995 cr2=0x100 cr3=0x13401000 cr4=0x6f0
es=0x68 ds=0xc02a0068 fs=0x0 gs=0x0
eax=0x0 ebx=0x0 ecx=0x1 edx=0x800
ebp=0x0 esi=0x0 edi=0xc03a7b20 err=0 eflags=0x0
*0:1024/console 1:1025/idle1 2:1026/idle2 3:1027/idle3
4:1028/idle4 5:1029/idle5 6:1030/idle6 7:1031/idle7
0x0:[0xc022e995]blk_dev+0xbd98d934 stack: 0x0, 0x0, 0x0
VMK uptime: 0:00:02:17.807 TSC: 343459198808
0:00:02:11.319 cpu0:1024)Host: 4781: COS Error: Oops
Starting coredump to disk Starting coredump to disk Dumping using slot 1 of 1... using slot 1 of 1... log
Stack trace from cos log:
<4>EIP: 0060:[<c022e995>] Tainted: P
<4>EFLAGS: 00010246
<4>
<4>EIP is at sr_finish [kernel] 0xa5 (2.4.21-47.0.1.ELvmnix/i686)
<4>eax: 00000000 ebx: 00000000 ecx: 00000001 edx: 00000800
<4>esi: 00000000 edi: c03a7b20 ebp: 00000000 esp: c204fd70
<4>ds: 0068 cs: 0060 es: 0068 ss: 0068
<4>Process esxcfg-rescan (pid: 2997, stackpage=c204f000)
<4>Stack: c045ca80 00000400 00000000 00000000 80000000 00000282 00000001 c03a7ac0
<4> c9fc6e00 c022ca10 00000003 c021cb94 c02d6fa8 00000000 00000000 00000004
<4> c204fdd0 c204fdd4 c204fdd8 c8c4ae00 c204fddc 00000000 00000004 c204fddc
<4>Call Trace: [<c022ca10>] sd_attach [kernel] 0x0 (0xc204fd94)
<4>[<c021cb94>] scan_scsis [kernel] 0x3d4 (0xc204fd9c)
<4>[<c0123b79>] printk [kernel] 0x149 (0xc204feb8)
<4>[<c0212b44>] proc_scsi_gen_write [kernel] 0x624 (0xc204feec)
<4>[<c0168ffe>] locate_fd [kernel] 0xae (0xc204ff40)
<4>[<c0180130>] proc_file_write [kernel] 0x40 (0xc204ff80)
<4>[<c0158a73>] sys_write [kernel] 0xa3 (0xc204ff94)
<4>[<c02a406f>] no_timing [kernel] 0x7 (0xc204ffc0)
<4>[<c02a002b>] zlib_tr_flush_block [kernel] 0x3b (0xc204ffe0)
<4>
<4>Code: 89 90 00 01 00 00 a1 80 9f 4b c0 80 4c 18 12 01 a1 80 9f 4b
<4>
<4>
<4>dell_rbu 0xd2188060 -s .data 0xd2189dcc -s .bss 0xd2189e00
<4>ppdev 0xd2185060 -s .data 0xd2186b80 -s .bss 0xd2186c00
<4>parport 0xd217a060 -s .data 0xd2183540 -s .bss 0xd2183880
<4>ipmi_devintf0xd2160060 -s .data 0xd21614e0 -s .bss 0xd2161580
<4>ipmi_si_drv0xd2171060 -s .data 0xd2177f00 -s .bss 0xd21780c0
<4>ipmi_msghandler0xd2168060 -s .data 0xd216f170 -s .bss 0xd216f1e0
<4>ipt_REJECT0xd2165060 -s .data 0xd21662c0 -s .bss 0xd2166320
The service console panic output starts at this line:
Stack trace from cos log:
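If the purple screen text has been captured to a file, the two components can be separated at this marker. The following is a minimal sketch, assuming the screen contents are available as a plain string; the function name split_purple_screen is hypothetical and not part of any VMware tool.

```python
COS_MARKER = "Stack trace from cos log:"

def split_purple_screen(text):
    """Return (vmkernel_part, cos_part), split at the service console marker."""
    head, sep, tail = text.partition(COS_MARKER)
    if not sep:
        # Marker not found: treat everything as VMkernel output.
        return text, ""
    return head.rstrip(), tail.lstrip("\n")

# A few lines taken from the example screen above.
sample = "\n".join([
    "0:00:02:11.319 cpu0:1024)Host: 4781: COS Error: Oops",
    "Stack trace from cos log:",
    "<4>EIP: 0060:[<c022e995>] Tainted: P",
    "<4>EFLAGS: 00010246",
])

vmk_part, cos_part = split_purple_screen(sample)
print(cos_part.splitlines()[0])
```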
The first important piece of information is the EIP, which shows where in the Linux kernel the fault was caught (or triggered). In this example, the function running in the Linux kernel at the time was sr_finish. This function is used in the processing of storage information.
<4>EIP: 0060:[<c022e995>] Tainted: P
<4>EFLAGS: 00010246
<4>
<4>EIP is at sr_finish [kernel] 0xa5 (2.4.21-47.0.1.ELvmnix/i686)
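The EIP lines can be pulled apart programmatically, which is useful when comparing several purple screens. The sketch below assumes the line layout shown in this example; the regular expressions and the parse_eip helper are illustrative only. Note that the address it extracts matches the ip= value in the VMkernel portion of the screen.

```python
import re

# Patterns written against the two EIP lines in this example (assumed layout).
EIP_ADDR = re.compile(r"EIP:\s+([0-9a-f]{4}):\[<([0-9a-f]+)>\]\s+(.*)")
EIP_SYM = re.compile(r"EIP is at (\S+) \[(\S+)\] (0x[0-9a-f]+) \((.+)\)")

def parse_eip(lines):
    """Collect segment, address, taint flags, function and offset from EIP lines."""
    info = {}
    for line in lines:
        line = re.sub(r"^<\d>", "", line)  # drop the printk log-level prefix
        m = EIP_ADDR.search(line)
        if m:
            info["segment"] = m.group(1)
            info["address"] = int(m.group(2), 16)
            info["taint"] = m.group(3).strip()
        m = EIP_SYM.search(line)
        if m:
            info["function"] = m.group(1)
            info["module"] = m.group(2)
            info["offset"] = int(m.group(3), 16)
            info["kernel"] = m.group(4)
    return info

eip = parse_eip([
    "<4>EIP: 0060:[<c022e995>] Tainted: P",
    "<4>EFLAGS: 00010246",
    "<4>",
    "<4>EIP is at sr_finish [kernel] 0xa5 (2.4.21-47.0.1.ELvmnix/i686)",
])
print(eip["function"], hex(eip["address"]), hex(eip["offset"]))
```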
The next lines are the register dump. This section shows each register and its contents at the time of the fault:
<4>eax: 00000000 ebx: 00000000 ecx: 00000001 edx: 00000800
<4>esi: 00000000 edi: c03a7b20 ebp: 00000000 esp: c204fd70
<4>ds: 0068 cs: 0060 es: 0068 ss: 0068
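For scripted analysis, the register dump can be read into a dictionary. This is a minimal sketch that assumes the "name: hexvalue" layout shown above; parse_registers is a hypothetical helper name.

```python
import re

# A register entry is a short lowercase name, a colon, and a hex value.
REGISTER = re.compile(r"\b([a-z]{2,3}):\s*([0-9a-f]+)")

def parse_registers(lines):
    """Map each register name to its integer value."""
    regs = {}
    for line in lines:
        for name, value in REGISTER.findall(line):
            regs[name] = int(value, 16)
    return regs

regs = parse_registers([
    "<4>eax: 00000000 ebx: 00000000 ecx: 00000001 edx: 00000800",
    "<4>esi: 00000000 edi: c03a7b20 ebp: 00000000 esp: c204fd70",
    "<4>ds: 0068 cs: 0060 es: 0068 ss: 0068",
])
print(hex(regs["edi"]), hex(regs["esp"]))
```

The values agree with the eax, ebx, ecx, edx, and edi fields printed in the VMkernel portion of the same screen.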
This line is very important because it shows the process that was running at the time of the fault. In this case, a storage rescan (esxcfg-rescan) was being performed:
<4>Process esxcfg-rescan (pid: 2997, stackpage=c204f000)
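The process name and PID can be extracted the same way. The pattern below is a sketch written against the line in this example and may need adjusting for other kernel builds; parse_process is a hypothetical helper.

```python
import re

PROCESS_LINE = re.compile(
    r"Process (\S+) \(pid: (\d+), stackpage=([0-9a-f]+)\)"
)

def parse_process(line):
    """Return the process name, PID and stack page, or None if no match."""
    m = PROCESS_LINE.search(line)
    if m is None:
        return None
    return {
        "name": m.group(1),
        "pid": int(m.group(2)),
        "stackpage": int(m.group(3), 16),
    }

proc = parse_process("<4>Process esxcfg-rescan (pid: 2997, stackpage=c204f000)")
print(proc["name"], proc["pid"])
```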
These lines contain the content of the stack:
<4>Stack: c045ca80 00000400 00000000 00000000 80000000 00000282 00000001 c03a7ac0
<4> c9fc6e00 c022ca10 00000003 c021cb94 c02d6fa8 00000000 00000000 00000004
<4> c204fdd0 c204fdd4 c204fdd8 c8c4ae00 c204fddc 00000000 00000004 c204fddc
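Some of the words on the stack are return addresses that also appear in the call trace. The sketch below, using the values from this example, reads the stack words and reports which ones match call trace addresses; the two addresses in KNOWN_FRAMES are copied by hand from the call trace lines for sd_attach and scan_scsis, and the helper name is hypothetical.

```python
import re

STACK_WORD = re.compile(r"\b[0-9a-f]{8}\b")

def parse_stack(lines):
    """Return every 8-digit hex word in the stack dump, in order."""
    words = []
    for line in lines:
        words.extend(int(w, 16) for w in STACK_WORD.findall(line))
    return words

# Addresses taken from the call trace in this example.
KNOWN_FRAMES = {0xc022ca10: "sd_attach", 0xc021cb94: "scan_scsis"}

stack = parse_stack([
    "<4>Stack: c045ca80 00000400 00000000 00000000 80000000 00000282 00000001 c03a7ac0",
    "<4> c9fc6e00 c022ca10 00000003 c021cb94 c02d6fa8 00000000 00000000 00000004",
    "<4> c204fdd0 c204fdd4 c204fdd8 c8c4ae00 c204fddc 00000000 00000004 c204fddc",
])

matches = [w for w in stack if w in KNOWN_FRAMES]
for word in matches:
    print(hex(word), KNOWN_FRAMES[word])
```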
These lines are the call trace, which shows what the Linux kernel was doing prior to the failure. Use this information to help you diagnose any issues. In this example, SCSI scanning was in progress:
<4>Call Trace: [<c022ca10>] sd_attach [kernel] 0x0 (0xc204fd94)
<4>[<c021cb94>] scan_scsis [kernel] 0x3d4 (0xc204fd9c)
<4>[<c0123b79>] printk [kernel] 0x149 (0xc204feb8)
<4>[<c0212b44>] proc_scsi_gen_write [kernel] 0x624 (0xc204feec)
<4>[<c0168ffe>] locate_fd [kernel] 0xae (0xc204ff40)
<4>[<c0180130>] proc_file_write [kernel] 0x40 (0xc204ff80)
<4>[<c0158a73>] sys_write [kernel] 0xa3 (0xc204ff94)
<4>[<c02a406f>] no_timing [kernel] 0x7 (0xc204ffc0)
<4>[<c02a002b>] zlib_tr_flush_block [kernel] 0x3b (0xc204ffe0)
<4>
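To compare call traces across several incidents, each entry can be reduced to its function name and offset. This is a minimal sketch that assumes the entry layout shown above (address, function, module, offset, stack location); the names used are illustrative.

```python
import re

# One call trace entry: [<address>] function [module] offset (stack location)
FRAME = re.compile(
    r"\[<([0-9a-f]+)>\] (\S+) \[(\S+)\] (0x[0-9a-f]+) \((0x[0-9a-f]+)\)"
)

def parse_call_trace(lines):
    """Return a list of frames in the order they appear in the trace."""
    frames = []
    for line in lines:
        m = FRAME.search(line)
        if m:
            frames.append({
                "address": int(m.group(1), 16),
                "function": m.group(2),
                "module": m.group(3),
                "offset": int(m.group(4), 16),
                "stack_location": int(m.group(5), 16),
            })
    return frames

frames = parse_call_trace([
    "<4>Call Trace: [<c022ca10>] sd_attach [kernel] 0x0 (0xc204fd94)",
    "<4>[<c021cb94>] scan_scsis [kernel] 0x3d4 (0xc204fd9c)",
    "<4>[<c0158a73>] sys_write [kernel] 0xa3 (0xc204ff94)",
    "<4>",
])
print(" <- ".join(f["function"] for f in frames))
```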
This line is the machine code that was running on the CPU at the time of the fault:
<4>Code: 89 90 00 01 00 00 a1 80 9f 4b c0 80 4c 18 12 01 a1 80 9f 4b
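The code bytes can be tied back to the register dump. The sketch below assumes that the bytes begin at the faulting EIP, which is how 2.4-series kernels usually print this line, and it decodes only one instruction form (opcode 0x89 with a 32-bit displacement). It is not a general disassembler, and the helper names are hypothetical.

```python
import struct

REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def parse_code_line(line):
    """Convert the hex byte list after 'Code:' into a bytes object."""
    hex_part = line.split("Code:", 1)[1]
    return bytes(int(token, 16) for token in hex_part.split())

def decode_mov_store(code):
    """Decode 'mov [base+disp32], src' (opcode 0x89, ModRM mod=2) only."""
    if code[0] != 0x89:
        return None
    modrm = code[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 2 or rm == 4:  # rm == 4 would need a SIB byte; not handled here
        return None
    (disp,) = struct.unpack_from("<i", code, 2)  # little-endian displacement
    return REG32[reg], REG32[rm], disp

code = parse_code_line(
    "<4>Code: 89 90 00 01 00 00 a1 80 9f 4b c0 80 4c 18 12 01 a1 80 9f 4b"
)
src, base, disp = decode_mov_store(code)
eax = 0x0  # value of eax from the register dump in this example
print(f"mov [{base}+{disp:#x}], {src}")
print("target address:", hex(eax + disp))
```

Under that assumption, the first instruction stores edx at eax+0x100. With eax equal to zero, the target address is 0x100, which matches the cr2 value in the VMkernel portion of the screen and is consistent with a NULL pointer dereference.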
This is a list of the kernel modules loaded:
<4>dell_rbu 0xd2188060 -s .data 0xd2189dcc -s .bss 0xd2189e00
<4>ppdev 0xd2185060 -s .data 0xd2186b80 -s .bss 0xd2186c00
<4>parport 0xd217a060 -s .data 0xd2183540 -s .bss 0xd2183880
<4>ipmi_devintf0xd2160060 -s .data 0xd21614e0 -s .bss 0xd2161580
<4>ipmi_si_drv0xd2171060 -s .data 0xd2177f00 -s .bss 0xd21780c0
<4>ipmi_msghandler0xd2168060 -s .data 0xd216f170 -s .bss 0xd216f1e0
<4>ipt_REJECT0xd2165060 -s .data 0xd21662c0 -s .bss 0xd2166320
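The module list can be parsed into names and load addresses, for example to check whether a third-party module was loaded. The sketch below assumes the layout shown above and tolerates entries where a long module name runs directly into its address; parse_modules is a hypothetical helper.

```python
import re

# name, load address, then the .data and .bss section addresses.
MODULE = re.compile(
    r"^(?:<\d>)?(\S+?)\s*(0x[0-9a-f]+) -s \.data (0x[0-9a-f]+) -s \.bss (0x[0-9a-f]+)"
)

def parse_modules(lines):
    """Return one dict per module with its name and section addresses."""
    modules = []
    for line in lines:
        m = MODULE.match(line)
        if m:
            modules.append({
                "name": m.group(1),
                "text": int(m.group(2), 16),
                "data": int(m.group(3), 16),
                "bss": int(m.group(4), 16),
            })
    return modules

modules = parse_modules([
    "<4>dell_rbu 0xd2188060 -s .data 0xd2189dcc -s .bss 0xd2189e00",
    "<4>ipmi_devintf0xd2160060 -s .data 0xd21614e0 -s .bss 0xd2161580",
    "<4>ipt_REJECT0xd2165060 -s .data 0xd21662c0 -s .bss 0xd2166320",
])
print([m["name"] for m in modules])
```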
Note: If you need more assistance diagnosing your purple screen error: