Purple diagnostic screen with entries VmMemPf_UserWorldPageFault and UserMem_HandleMapFault on ESXi 6.0 host

Article ID: 317982

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:

  • ESXi 6.0 host fails with a purple diagnostic screen.
     
  • The purple diagnostic screen contains entries similar to:
     

    0xnnnnnnnn :[0xnnnnnnnn ]Code start: 0x41801e400000 VMK uptime: 10:01:15:26.076 <YYYY-MM-DD><time>068Z cpu44:36887)Saved backtrace from: pcpu 38 Heartbeat NMI
    0xnnnnnnnn :[0xnnnnnnnn ]LogWarning@vmkernel#nover+0x25 stack: 0x43918761be68
    0xnnnnnnnn :[0xnnnnnnnn ]_Warning@vmkernel#nover+0x50 stack: 0x43918761bbb0
    0xnnnnnnnn :[0xnnnnnnnn ]VmMemPf@vmkernel#nover+0x3d9 stack: 0x7b6cc43bc8552
    0xnnnnnnnn :[0xnnnnnnnn ]VmMemPf_UserWorldPageFault@vmkernel#nover+0xf9 stack: 0x4308a5e8c760
    0xnnnnnnnn :[0xnnnnnnnn ]UserMem_HandleMapFault@<None>#<None>+0x1042 stack: 0x43918761be00
    0xnnnnnnnn :[0xnnnnnnnn ]User_Exception@<None>#<None>+0x126 stack: 0x341d1c30
    0xnnnnnnnn :[0xnnnnnnnn ]Int14_PF@vmkernel#nover+0x17f stack: 0x3ffde3f21b0
    0xnnnnnnnn :[0xnnnnnnnn ]gate_entry_@vmkernel#nover+0x0 stack: 0x0
    0xnnnnnnnn :[0xnnnnnnnn ]base fs=0x0 gs=0x41804b000000 Kgs=0x0 <YYYY-MM-DD><time>013Z cpu38:37100)NMI: 681: NMI IPI recvd. We Halt. eip(base):ebp:cs [0x6199d(0x41801e400000):0xbad0025:0x4010](Src0x1, CPU38)
    0xnnnnnnnn :[0xnnnnnnnn ]NMI: 709: NMI IPI received. Was eip(base):ebp:cs [0xd2eab(0x41801e400000):0x430419269320:0x4010](Src 0x1, CPU38)
    0xnnnnnnnn :[0xnnnnnnnn ]NMI: 681: NMI IPI recvd. We Halt. eip(base):ebp:cs [0x6199d(0x41801e400000):0xbad0025:0x4010](Src0x1, CPU38)
    0xnnnnnnnn :[0xnnnnnnnn ]NMI: 709: NMI IPI received. Was eip(base):ebp:cs [0xd2eab(0x41801e400000):0x430419269320:0x4010](Src 0x1, CPU38)
    0xnnnnnnnn :[0xnnnnnnnn ]Backtrace for current CPU #44, worldID=36887, rbp=0x0
    0xnnnnnnnn :[0xnnnnnnnn ]PanicvPanicInt@vmkernel#nover+0x37e stack: 0x439180b9bcc8, 0x43008d0
    0xnnnnnnnn :[0xnnnnnnnn ]Panic_WithBacktrace@vmkernel#nover+0x56 stack: 0x439180b9bd30, 0x439
    0xnnnnnnnn :[0xnnnnnnnn ]Heartbeat_DetectCPULockups@vmkernel#nover+0x4f7 stack: 0x3877a7100,
    0xnnnnnnnn :[0xnnnnnnnn ]Timer_BHHandler@vmkernel#nover+0xea stack: 0x4390d1e58318, 0x0, 0x7b
    0xnnnnnnnn :[0xnnnnnnnn ]BH_DrainAndDisableInterrupts@vmkernel#nover+0x78 stack: 0xef, 0x4301
    0xnnnnnnnn :[0xnnnnnnnn ]IDT_IntrHandler@vmkernel#nover+0x1c6 stack: 0x22, 0x246, 0xfffffffff
    0xnnnnnnnn :[0xnnnnnnnn ]gate_entry_@vmkernel#nover+0x0 stack: 0x0, 0x41003f6f40c0, 0x0, 0x0,
    0xnnnnnnnn :[0xnnnnnnnn ]Interrupts_SetFlags@vmkernel#nover+0x4 stack: 0xfffffffffc607a38, 0x
    0xnnnnnnnn :[0xnnnnnnnn ]VMMVMKCall_Call@vmkernel#nover+0xf0 stack: 0x41801e4ab724, 0x0, 0x40
    0xnnnnnnnn :[0xnnnnnnnn ]vmkernel 0x0 .data 0x0 .bss 0x0

  • In the VMkernel log files under /var/log/, entries similar to the following are seen:

    0xnnnnnnnn :[0xnnnnnnnn ] WARNING: VmMemPf: vm 498729: 654: COW copy failed: pgNum=0x23de9, mpn=0x3fffffffffESC
    0xnnnnnnnn :[0xnnnnnnnn ] WARNING: VmMemPf: vm 498729: 654: COW copy failed: pgNum=0x23de9, mpn=0x3fffffffffESC
    0xnnnnnnnn :[0xnnnnnnnn ] WARNING: VmMemPf: vm 498729: 654: COW copy failed: pgNum=0x23de9, mpn=0x3fffffffffESC

    Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on the environment.
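
    To check whether a host shows these symptoms, the log and backtrace text can be scanned for the strings quoted above. The following Python sketch is illustrative only: the function names are hypothetical, the warning pattern is inferred from the single example line in this article, and the symbol list is taken from the backtrace shown above. Field layouts may differ on other builds.

```python
import re
from collections import Counter

# Pattern inferred from the example warning in this article (an assumption,
# not an official log format specification).
COW_WARNING = re.compile(
    r"WARNING: VmMemPf: vm (?P<vm>\d+): \d+: COW copy failed: "
    r"pgNum=(?P<pg>0x[0-9a-fA-F]+)"
)

# Function names that appear in the purple-screen backtrace in this article.
SIGNATURE_SYMBOLS = (
    "VmMemPf_UserWorldPageFault",
    "UserMem_HandleMapFault",
    "Heartbeat_DetectCPULockups",
)


def count_cow_failures(lines):
    """Count 'COW copy failed' warnings per VM id in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = COW_WARNING.search(line)
        if match:
            counts[match.group("vm")] += 1
    return counts


def matches_signature(backtrace_text):
    """Return True if every signature symbol occurs in the backtrace text."""
    return all(symbol in backtrace_text for symbol in SIGNATURE_SYMBOLS)
```

    A large per-VM count from count_cow_failures over a short stretch of log is consistent with the log spew described in this article, but it does not confirm the root cause on its own.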
     



Environment

VMware vSphere ESXi 6.0

Cause

This issue occurs when the P2M (Physical to Machine) buffer overflows and the operation is retried. The retries generate a very high volume of warnings in the VMkernel log, which causes CPU lockups.

Resolution

This issue is resolved in ESXi 6.0 Update 1b, available at the Broadcom Support Portal.
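
To judge whether a host already runs a build at or after the fix, the version banner printed by the vmware -v command can be compared against the Update 1b build number. The Python sketch below assumes the banner has the form "VMware ESXi 6.0.0 build-NNNNNNN". The function names are hypothetical, and the fixed_build argument is deliberately left to the caller: take the actual Update 1b build number from the release notes rather than from this example.

```python
import re

# Assumed banner format: "VMware ESXi <major>.<minor>.<patch> build-<number>"
VERSION_RE = re.compile(r"VMware ESXi (\d+)\.(\d+)\.(\d+) build-(\d+)")


def parse_esxi_version(banner):
    """Return ((major, minor, patch), build) or None if the banner does not match."""
    match = VERSION_RE.search(banner)
    if match is None:
        return None
    major, minor, patch, build = (int(group) for group in match.groups())
    return (major, minor, patch), build


def has_fix(banner, fixed_build):
    """Return True if an ESXi 6.0 host's build is at or after fixed_build.

    fixed_build must be supplied by the caller from the release notes.
    Raises ValueError for unparseable banners or non-6.0 hosts, because
    this article covers ESXi 6.0 only.
    """
    parsed = parse_esxi_version(banner)
    if parsed is None:
        raise ValueError("unrecognized version banner: %r" % banner)
    version, build = parsed
    if version[:2] != (6, 0):
        raise ValueError("this check applies to ESXi 6.0 only")
    return build >= fixed_build
```

For example, has_fix(banner, fixed_build) returns False when the host's build number is lower than the supplied Update 1b build, which indicates that the host still needs the update.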