ESXi 6.x PSOD with the error "nv_interrupt_handler"

Article ID: 339978

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:
  • The ESXi 6.x host crashes with a purple screen of death (PSOD)
  • An NVIDIA card is used for SVGA (Super Video Graphics Array)
  • The PSOD backtrace contains entries similar to:

2017-05-02T07:44:40.876Z cpu8:16845109)NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context

2017-05-02T07:44:44.176Z cpu10:16791233)WARNING: Heartbeat: 796: PCPU 8 didn't have a heartbeat for 8 seconds; *may* be locked up.

2017-05-02T07:44:44.176Z cpu8:16845109)ALERT: NMI: 709: NMI IPI received. Was eip(base):ebp:cs [0xd51463(0x418021c00000):0x439149a9ba88:0x4010](Src 0x1, CPU8)

2017-05-02T07:44:44.176Z cpu8:16845109)0x439149a9ba88:[0x418022951463]_nv017268rm@<None>#<None>+0x2f stack: 0x0

2017-05-02T07:44:44.176Z cpu8:16845109)0x439149a9ba98:[0x4180226893af]_nv011967rm@<None>#<None>+0x143 stack: 0x0

2017-05-02T07:44:44.177Z cpu8:16845109)0x439149a9bc68:[0x4180227ea5ca]_nv018771rm@<None>#<None>+0x1a stack: 0x439149a9bcc8

2017-05-02T07:44:44.177Z cpu8:16845109)0x439149a9bc88:[0x4180227ea525]_nv018777rm@<None>#<None>+0x1c5 stack: 0x439149a9bcc8

2017-05-02T07:44:44.177Z cpu8:16845109)0x439149a9be48:[0x41802297f1e8]nv_interrupt_handler@<None>#<None>+0x184 stack: 0x43010447a390

2017-05-02T07:44:44.177Z cpu8:16845109)0x439149a9be88:[0x418021c588e9]IntrCookieBH@vmkernel#nover+0x299 stack: 0x0

2017-05-02T07:44:44.177Z cpu8:16845109)0x439149a9bf28:[0x418021c32bfe]BH_DrainAndDisableInterrupts@vmkernel#nover+0xe2 stack: 0xffffffff00

2017-05-02T07:44:44.177Z cpu8:16845109)0x439149a9bfb8:[0x418021cabc66]VMMVMKCall_Call@vmkernel#nover+0x176 stack: 0x418021cab778

2017-05-02T07:45:25.234Z cpu1:16681040)@BlueScreen: PCPU 8: no heartbeat (3/3 IPIs received)

2017-05-02T07:45:25.234Z cpu1:16681040)Code start: 0x418021c00000 VMK uptime: 30:21:41:35.441

2017-05-02T07:45:25.234Z cpu1:16681040)Saved backtrace from: pcpu 8 Heartbeat NMI

2017-05-02T07:45:25.234Z cpu1:16681040)0x439149a9b580:[0x418022980697]os_acquire_spinlock@<None>#<None>+0x13 stack: 0x418021c7533c

2017-05-02T07:45:25.237Z cpu1:16681040)0x439149a9bcd8:[0x418022726d82]_nv003406rm@<None>#<None>+0x196 stack: 0x4304f7dfee28

2017-05-02T07:45:25.237Z cpu1:16681040)0x439149a9bd48:[0x41802271e922]_nv015412rm@<None>#<None>+0x52 stack: 0x439149a9be38

2017-05-02T07:45:25.238Z cpu1:16681040)0x439149a9bd68:[0x41802295390c]_nv001099rm@<None>#<None>+0x108 stack: 0x0

2017-05-02T07:45:25.238Z cpu1:16681040)0x439149a9be48:[0x41802297f1e8]nv_interrupt_handler@<None>#<None>+0x184 stack: 0x43010447a390

2017-05-02T07:45:25.238Z cpu1:16681040)0x439149a9be88:[0x418021c588e9]IntrCookieBH@vmkernel#nover+0x299 stack: 0x0

2017-05-02T07:45:25.238Z cpu1:16681040)0x439149a9bf28:[0x418021c32bfe]BH_DrainAndDisableInterrupts@vmkernel#nover+0xe2 stack: 0xffffffff00

2017-05-02T07:45:25.239Z cpu1:16681040)0x439149a9bfb8:[0x418021cabc66]VMMVMKCall_Call@vmkernel#nover+0x176 stack: 0x418021cab778

2017-05-02T07:45:25.245Z cpu1:16681040)base fs=0x0 gs=0x418040400000 Kgs=0x0

2017-05-02T07:45:25.177Z cpu8:16845109)NMI: 681: NMI IPI recvd. We Halt. eip(base):ebp:cs [0xd80697(0x418021c00000):0x439149a9b5d8:0x4010](Src0x1, CPU8)

2017-05-02T07:44:57.177Z cpu8:16845109)NMI: 709: NMI IPI received. Was eip(base):ebp:cs [0xd51463(0x418021c00000):0x439149a9ba18:0x4010](Src 0x1, CPU8)

2017-05-02T07:44:44.176Z cpu8:16845109)NMI: 709: NMI IPI received. Was eip(base):ebp:cs [0xd51463(0x418021c00000):0x439149a9ba88:0x4010](Src 0x1, CPU8)

2017-05-02T07:45:25.249Z cpu1:16681040)Backtrace for current CPU #1, worldID=16681040, rbp=0x0
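
The NMI lines in the excerpt report the interrupted instruction as "eip(base)". The sketch below reads the first value as an offset from the base, which is an assumption, although the sum matches the address of the first backtrace frame and the base equals the "Code start" value in the log. The helper names nmi_eip and frame_address are hypothetical and are not part of any VMware or NVIDIA tool.

```python
import re

# Two lines copied from the log excerpt in the Symptoms section.
NMI_LINE = (
    "2017-05-02T07:44:44.176Z cpu8:16845109)ALERT: NMI: 709: NMI IPI received. "
    "Was eip(base):ebp:cs [0xd51463(0x418021c00000):0x439149a9ba88:0x4010](Src 0x1, CPU8)"
)
FRAME_LINE = (
    "2017-05-02T07:44:44.176Z cpu8:16845109)0x439149a9ba88:"
    "[0x418022951463]_nv017268rm@<None>#<None>+0x2f stack: 0x0"
)


def nmi_eip(line):
    """Return base + offset from an 'eip(base)' NMI line, or None if absent.

    Assumption: the value before the parentheses is an offset relative
    to the base address inside the parentheses.
    """
    m = re.search(r"\[(0x[0-9a-f]+)\((0x[0-9a-f]+)\):", line)
    if m is None:
        return None
    offset = int(m.group(1), 16)
    base = int(m.group(2), 16)
    return base + offset


def frame_address(line):
    """Return the bracketed code address of a backtrace frame, or None."""
    m = re.search(r":\[(0x[0-9a-f]+)\]", line)
    return int(m.group(1), 16) if m else None


if __name__ == "__main__":
    print(hex(nmi_eip(NMI_LINE)))
    print(hex(frame_address(FRAME_LINE)))
```

Under that reading, the NMI at 07:44:44 interrupted PCPU 8 at the same address as the topmost frame, _nv017268rm, inside the NVIDIA driver.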



Environment

VMware vSphere ESXi 6.0
VMware vSphere ESXi 6.5
VMware vSphere ESXi 6.7

Cause

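The issue is caused by a call sequence in the NVIDIA GPU driver. In the saved backtrace, PCPU 8 is inside nv_interrupt_handler with os_acquire_spinlock as the topmost frame, and it stops sending heartbeats; the host then halts with "no heartbeat".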
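
To compare a host's PSOD output with this article, the call sequence can be read from the backtrace lines. The following is a minimal sketch, not a VMware or NVIDIA tool: the function names call_sequence and matches_signature are hypothetical, and the frame pattern is inferred only from the log lines in the Symptoms section.

```python
import re

# Matches frames such as:
#   [0x41802297f1e8]nv_interrupt_handler@<None>#<None>+0x184
# Pattern inferred from the excerpt in this article (an assumption).
FRAME_RE = re.compile(r"\[0x[0-9a-f]+\]([^@\s]+)@(\S+?)\+0x[0-9a-f]+")

# A few lines copied from the Symptoms section.
SAMPLE = """\
2017-05-02T07:45:25.234Z cpu1:16681040)@BlueScreen: PCPU 8: no heartbeat (3/3 IPIs received)
2017-05-02T07:45:25.234Z cpu1:16681040)0x439149a9b580:[0x418022980697]os_acquire_spinlock@<None>#<None>+0x13 stack: 0x418021c7533c
2017-05-02T07:45:25.238Z cpu1:16681040)0x439149a9be48:[0x41802297f1e8]nv_interrupt_handler@<None>#<None>+0x184 stack: 0x43010447a390
2017-05-02T07:45:25.238Z cpu1:16681040)0x439149a9be88:[0x418021c588e9]IntrCookieBH@vmkernel#nover+0x299 stack: 0x0
"""


def call_sequence(log_text):
    """Return (function, module) pairs in the order the frames appear."""
    return FRAME_RE.findall(log_text)


def matches_signature(log_text):
    """True if the text has a 'no heartbeat' PSOD and an nv_interrupt_handler frame."""
    has_bluescreen = (
        re.search(r"@BlueScreen: PCPU \d+: no heartbeat", log_text) is not None
    )
    has_handler = any(
        fn == "nv_interrupt_handler" for fn, _ in call_sequence(log_text)
    )
    return has_bluescreen and has_handler


if __name__ == "__main__":
    for fn, module in call_sequence(SAMPLE):
        print(fn, module)
    print(matches_signature(SAMPLE))
```

A match only indicates that the backtrace resembles the one in this article; it does not identify the root cause.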

Resolution

Contact NVIDIA for further assistance.

Additional Information

ESXi host crashes with PSOD nv_interrupt_handler