ESXi PSOD with SpinCount locks



Article ID: 392817


Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

An ESXi host can fail with a PSOD caused by "LockCheckSpinCountInt" events when vMotion runs on the default TCP/IP stack instead of a dedicated, separate vMotion stack.

 

The PSOD backtrace contains the following events:

#0  LockCheckSpinCountInt
#1  Lock_CheckSpinCount
#2  SP_WaitReadLock
#3  SPAcqWriteLockWork
#4  SP_AcqWriteLockWithRA
#5  tcp_usr_shutdown
#6  soshutdown
#7  vmk_shutdown
#8  Net_ShutdownSocket
#9  UserSocketInetShutdown
#10 UserSocket_Shutdown
#11 LinuxSocket_Shutdown64
#12 User_LinuxSyscallHandler
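
To compare a captured backtrace against this signature, a small script can check that the frame names appear in the same order. The following is a minimal sketch, not a VMware tool: the function name matches_signature and the idea of passing the backtrace as plain text are assumptions made for illustration, and the sample line format in the usage note (a frame name followed by "@vmkernel") is hypothetical.

```python
import re

# Frame names copied from the backtrace in this article, ordered #0 to #12.
SIGNATURE_FRAMES = [
    "LockCheckSpinCountInt",
    "Lock_CheckSpinCount",
    "SP_WaitReadLock",
    "SPAcqWriteLockWork",
    "SP_AcqWriteLockWithRA",
    "tcp_usr_shutdown",
    "soshutdown",
    "vmk_shutdown",
    "Net_ShutdownSocket",
    "UserSocketInetShutdown",
    "UserSocket_Shutdown",
    "LinuxSocket_Shutdown64",
    "User_LinuxSyscallHandler",
]


def matches_signature(backtrace_text: str) -> bool:
    """Return True if every signature frame occurs, in order, in the text.

    Hypothetical helper: it only compares function names and ignores
    frame numbers, addresses, and offsets.
    """
    position = 0
    for frame in SIGNATURE_FRAMES:
        # Match the name as a whole identifier so that a longer symbol
        # containing the same characters is not counted as a hit.
        pattern = re.compile(
            r"(?<![A-Za-z0-9_])" + re.escape(frame) + r"(?![A-Za-z0-9_])"
        )
        match = pattern.search(backtrace_text, position)
        if match is None:
            return False
        # Continue searching after this frame to enforce ordering.
        position = match.end()
    return True
```

For example, calling matches_signature on text where each line looks like "#5 tcp_usr_shutdown@vmkernel" for all thirteen frames returns True, while a backtrace that lacks tcp_usr_shutdown returns False. A match only suggests the same code path; it does not by itself confirm the cause described below.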

Environment

ESXi 7.0

Cause

This is caused by the lack of a dedicated TCP/IP stack for vMotion.

When vMotion uses the "default" TCP/IP stack, a large number of concurrent migrations can cause tcbinfo lock contention.

For example:

4 vmks on the default TCP/IP stack would result in the following:

  • 8 parallel vMotion sessions for each network stream helper
  • This totals 64 vMotion processes
  • For TCP/IP this results in 16 network receive processes (4 NICs * 4 processes per NIC)
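
The arithmetic in this example can be sketched as follows. All input figures come from the list above; the function names are hypothetical, and the stream helper count is not stated in this article, so the code only derives the value implied by the quoted totals (64 / 8) rather than asserting how ESXi allocates helpers.

```python
# Figures quoted in the example above.
VMK_COUNT = 4
SESSIONS_PER_STREAM_HELPER = 8
TOTAL_VMOTION_PROCESSES = 64
RECEIVE_PROCESSES_PER_NIC = 4


def receive_processes(nic_count: int, per_nic: int = RECEIVE_PROCESSES_PER_NIC) -> int:
    """Network receive processes: NICs multiplied by processes per NIC."""
    return nic_count * per_nic


def implied_stream_helpers(total_processes: int, sessions_per_helper: int) -> int:
    """Stream helper count implied by the totals (an inference, not a documented value)."""
    return total_processes // sessions_per_helper


helpers = implied_stream_helpers(TOTAL_VMOTION_PROCESSES, SESSIONS_PER_STREAM_HELPER)
print("network receive processes:", receive_processes(VMK_COUNT))
print("implied stream helpers:", helpers)
print("implied stream helpers per vmk:", helpers // VMK_COUNT)
```

With the figures from the example, this gives 16 receive processes and an implied 8 stream helpers (2 per vmk) sharing the same default stack.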

This volume of concurrent processes, combined with the large amount of data in transit, causes lock contention on tcbinfo because too many vMotion threads are reading from and writing to the network socket buffer at the same time.

The host then fails with a PSOD.

 

 

Resolution

Reduce the number of concurrent migrations using the settings and steps in the KB below:

Resource Manager Settings for Managing Migrations

This should reduce the total number of vMotion threads hitting the network socket buffer at the same time and avoid the tcbinfo lock contention.

 

You can also reconfigure the host networking so that vMotion uses a dedicated TCP/IP stack instead of the default stack:

Dedicated vMotion Stack 

Networking Best Practices