NMI IPI: Panic requested by another PCPU. PC 0x42002ed7ef99, SP 0x45399191be10 (Src 0x4, CPU63)

Article ID: 390617

Updated On:

Products

VMware vSphere ESXi
VMware vSphere ESXi 7.0
VMware vSphere ESXi 8.0

Issue/Introduction

ESXi hosts in an NSX environment experience a purple diagnostic screen (PSOD):

VMWARE ESXi 8.0.3 [Releasebuild-24505383] x86_64

NMI IPI: Panic requested by another PCPU. PC @ 0x42002e7ef99, SP 0x45399191be10 (Src 0x4, CPU63)
cr0=0x8001003d cr2=0x7fb04c014000 cr3=0x682c0000 cr4=0x14216c
FMS=06/55/7 uCode=0x5003707
*PCPU63:2097714/PSEventHelper
PCPU 0: S5VVVVSVVSVVSVVSVVSVVVVS5VVS0VVVVVVVVVVVVVVSSSSVVSU5VVSUVUVVSSU5
PCPU 64: VVSVVSUVUVVS5SU
Code start: 0x42002e8c0000 VMWK uptime: [TIME]
Saved backtrace from pcpu 63 SpinLock spin out NMI
0x45399191be10: [0x42002e7ef990]Ref_CountBlock@vmkernel #nover+0x35 stack: 0x43300d6019c0
0x45399191be20: [0x42002e74c25b]Port_AcquireExcl@vmkernel #nover+0x1fc stack: 0x0
0x45399191be70: [0x420030085f5b][email protected]#[VERSION]+0x8 stack: 0x43300d602144
base fs=0x0 gs=0x42004fcd0000 Kgs=0x0
5 other PCPUs are in panic.
cpu60: 2098700| NMI IPI: PC 0x42002e7efba, SP 0x4539aa39be20 (Src 0x4, CPU60)
cpu60: 2098700| NMI IPI: PC 0x42002e7ef8f, SP 0x4539aa39be20 (Src 0x4, CPU60)
cpu60: 2098700| NMI IPI: PC 0x42002e7efc0, SP 0x4539aa39be20 (Src 0x4, CPU60)
cpu60: 2098700| NMI IPI: PC 0x42002e7efc0, SP 0x4539aa39be20 (Src 0x4, CPU60)
cpu60: 2098700| NMI IPI: PC 0x42002e7efc0, SP 0x4539aa39be20 (Src 0x4, CPU60)
cpu60: 2098700| NMI IPI: PC 0x42002e7efc0, SP 0x4539aa39be20 (Src 0x4, CPU60)
cpu1: 2098668| NMI IPI: PC 0x42002e7ef8f, SP 0x4539a9c1be50 (Src 0x4, CPU1)
cpu63: 2097714| NMI IPI: PC 0x42002e7efc0, SP 0x45399191be10 (Src 0x4, CPU63)
cpu63: 2097714| NMI IPI: PC 0x42002e7efc0, SP 0x45399191be10 (Src 0x4, CPU63)
cpu60: 2098700| NMI IPI: PC 0x42002e7efc0, SP 0x4539aa39be20 (Src 0x4, CPU60) No port for remote debugger.

The logEFI.log file located in /var/run/log shows:

Panic from another cpu (cpu 46, world 2098651): ip=0x42002e287ff7 randomOff=0x2ec00000: Spin count exceeded - possible deadlock with PCPU 63Halting PCPU 46.
Panic from another cpu (cpu 60, world 2098700): ip=0x42002e2d7a2d randomOff=0x2ec00000: NMI IPI: Panic requested by another PCPU. PC 0x42002e7efc0, SP 0x4539aa39be20 (Src 0x4, CPU60)Halting PCPU 60.
Panic from another cpu (cpu 77, world 2098659): ip=0x42002e2c87f7 randomOff=0x2ec00000:Spin count exceeded - possible deadlock with PCPU 60Halting PCPU 77.
Panic from another cpu (cpu 3, world 2098262): ip=0x42002e2c87f7 randomOff=0x2ec00000:Spin count exceeded - possible deadlock with PCPU 63Halting PCPU 3.
Panic from another cpu (cpu 22, world 2100186): ip=0x42002e2c87f7 randomOff=0x2ec00000:Spin count exceeded - possible deadlock with PCPU 60Halting 0x48f103-08t19:32NMI IPI: Panic requested by another PCPU. PC 0x42002e7ef99, SP 0x45399191be10 (Src 0x4, CPU63)
cr0=0x8001003d cr2=0x7fb04c014000 cr3=0x682c0000 cr4=0x14216c
FMS=06/55/7 uCode=0x5003707
*PCPU63:2097714/PSEventHelper
PCPU 0: S5VVVVSVVSVVSVVSVVSVVVVS5VVS0VVVVVVVVVVVVVVSSSSVVSU5VVSUVUVVSSU5
PCPU 64: VVSVVSUVUVVS5SU
cpu63:2097714| Code start: 0x42002e8c0000 VMWK uptime: 4:03:41.59.510
cpu63:2097714| Saved backtrace from pcpu 63 SpinLock spin out NMI
cpu63:2097714| 0x45399191be10:[0x42002e7ef990]Ref_CountBlock@vmkernel #nover+0x35 stack: 0x43300d6019c0
cpu63:2097714| 0x45399191be20:[0x42002e74c25b]Port_AcquireExcl@vmkernel #nover+0x1fc stack: 0x0
cpu63:2097714| 0x45399191be70:[0x420030085f5b][email protected]#[VERSION]+0x8 stack: 0x43300d602144
cpu63:2097714| base fs=0x0 gs=0x42004fcd0000 Kgs=0x0
5 other PCPUs are in panic.
cpu60: 2098700| NMI IPI: PC 0x42002e7efba, SP 0x4539aa39be20 (Src 0x4, CPU60)
cpu60: 2098700| NMI IPI: PC 0x42002e7ef8f, SP 0x4539aa39be20 (Src 0x4, CPU60)
cpu60: 2098700| NMI IPI: PC 0x42002e7efc0, SP 0x4539aa39be20 (Src 0x4, CPU60)
cpu60: 2098700| NMI IPI: PC 0x42002e7efc0, SP 0x4539aa39be20 (Src 0x4, CPU60)
cpu60: 2098700| NMI IPI: PC 0x42002e7efc0, SP 0x4539aa39be20 (Src 0x4, CPU60)
cpu60: 2098700| NMI IPI: PC 0x42002e7efc0, SP 0x4539aa39be20 (Src 0x4, CPU60)
cpu1: 2098668| NMI IPI: PC 0x42002e7ef8f, SP 0x4539a9c1be50 (Src 0x4, CPU1)
cpu63: 2097714| NMI IPI: PC 0x42002e7efc0, SP 0x45399191be10 (Src 0x4, CPU63)
cpu63: 2097714| NMI IPI: PC 0x42002e7efc0, SP 0x45399191be10 (Src 0x4, CPU63)
cpu60: 2098700| NMI IPI: PC 0x42002e7efc0, SP 0x4539aa39be20 (Src 0x4, CPU60) No port for remote debugger.
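When triaging a log like the one above, it can help to tally which PCPUs the "Spin count exceeded" entries name as the suspected lock holder. The following is a minimal sketch, not a VMware tool: the function name `spin_out_entries` and the regular expression are assumptions based only on the message format shown in this article, and the sample text is copied from the excerpt above.

```python
import re
from collections import Counter

# Excerpt copied from the log shown in this article. Entries run together
# without newlines, so the pattern must not rely on line boundaries.
SAMPLE = (
    "Panic from another cpu (cpu 46, world 2098651): ip=0x42002e287ff7 "
    "randomOff=0x2ec00000: Spin count exceeded - possible deadlock with PCPU 63"
    "Halting PCPU 46.Panic from another cpu (cpu 60, world 2098700): "
    "ip=0x42002e2d7a2d randomOff=0x2ec00000: NMI IPI: Panic requested by "
    "another PCPU. PC 0x42002e7efc0, SP 0x4539aa39be20 (Src 0x4, CPU60)"
    "Halting PCPU 60.Panic from another cpu (cpu 77, world 2098659): "
    "ip=0x42002e2c87f7 randomOff=0x2ec00000:Spin count exceeded - possible "
    "deadlock with PCPU 60Halting PCPU 77.Panic from another cpu (cpu 3, "
    "world 2098262): ip=0x42002e2c87f7 randomOff=0x2ec00000:Spin count "
    "exceeded - possible deadlock with PCPU 63Halting PCPU 3."
)

# Assumed pattern: matches only the "Spin count exceeded" entries and captures
# the waiting cpu, its world ID, and the PCPU it suspects of holding the lock.
# \s+ is used between words so that wrapped lines still match.
SPIN_OUT = re.compile(
    r"\(cpu\s+(\d+),\s+world\s+(\d+)\):\s+ip=\S+\s+randomOff=0x[0-9a-f]+:\s*"
    r"Spin\s+count\s+exceeded\s+-\s+possible\s+deadlock\s+with\s+PCPU\s+(\d+)"
)

def spin_out_entries(text):
    """Return (waiting_cpu, world_id, suspected_pcpu) for each spin-out entry."""
    return [tuple(int(g) for g in m.groups()) for m in SPIN_OUT.finditer(text)]

entries = spin_out_entries(SAMPLE)
suspects = Counter(suspected for _, _, suspected in entries)
print(suspects.most_common())
```

In this excerpt the waiting CPUs point at PCPU 63 and PCPU 60, the same two PCPUs that appear in the NMI IPI lines of the PSOD.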

Environment

ESXi 7.x
ESXi 8.x

Cause

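A lock rank-order violation exists between two code paths. 'VSwitch_FCLookupEntryInitCB' attempts to acquire the MAC-learning lock while holding the port lock, and 'Ens_InvalidateFlows' attempts to acquire the port non-exclusive lock while holding the MAC-learning lock. Because the two paths take the same locks in opposite order, each can end up waiting on a lock the other holds, which matches the "Spin count exceeded - possible deadlock" messages in the log.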
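The general idea behind a lock rank-order check can be sketched as follows. This is an illustration only and not ESXi code: the `RankedLock` class, the numeric ranks, and the `path_a`/`path_b` functions are hypothetical, and the ranks are chosen arbitrarily, since this article does not state which acquisition order ESXi treats as the valid one.

```python
import threading

class RankedLock:
    """A lock with a numeric rank; a thread must acquire locks in increasing rank."""

    _held = threading.local()  # per-thread stack of locks currently held

    def __init__(self, name, rank):
        self.name = name
        self.rank = rank
        self._lock = threading.Lock()

    def __enter__(self):
        stack = getattr(self._held, "stack", None)
        if stack is None:
            stack = self._held.stack = []
        # Refuse an out-of-order acquisition instead of risking a deadlock.
        if stack and stack[-1].rank >= self.rank:
            top = stack[-1]
            raise RuntimeError(
                f"rank violation: {self.name} (rank {self.rank}) requested "
                f"while holding {top.name} (rank {top.rank})"
            )
        self._lock.acquire()
        stack.append(self)
        return self

    def __exit__(self, exc_type, exc, tb):
        self._held.stack.pop()
        self._lock.release()
        return False  # do not suppress exceptions

# Hypothetical ranks: assumed here only to make the example concrete.
port_lock = RankedLock("port", rank=1)
mac_learning_lock = RankedLock("mac_learning", rank=2)

def path_a():
    """Takes the port lock, then the MAC-learning lock (allowed by these ranks)."""
    with port_lock:
        with mac_learning_lock:
            return "ok"

def path_b():
    """Takes the same two locks in the opposite order (rejected by these ranks)."""
    with mac_learning_lock:
        with port_lock:
            return "ok"
```

Calling `path_a()` succeeds, while `path_b()` raises `RuntimeError` before it can block. Without such a check, two threads running the two paths concurrently could each hold one lock and spin indefinitely on the other.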

Resolution