vCenter Server vpxd services crash and service restart required to regain production


Article ID: 416955



Products

VMware vCenter Server

Issue/Introduction

  • vCenter Server services may crash at random and become unavailable. A reboot or full service restart restores production for a time, until the issue recurs.
  • When the services crash, /var/log/vmware/vpxd/vpxd.log may show entries similar to the following:

####-##-##T##:##:##.###+##:## error vpxd[#####] [Originator@6876 sub=Memory checker] Current value 11726448 exceeds hard limit 11682816. Shutting down process.
####-##-##T##:##:##.###+##:## panic vpxd[#####] [Originator@6876 sub=Default]
-->
--> Panic: Memory exceeds hard limit. Panic
--> Backtrace:
--> [backtrace begin] product: VMware VirtualCenter, version: 8.0.3, build: build-24305161, tag: vpxd, cpu: x86_64, os: linux, buildType: release
--> backtrace[00] libvmacore.so[0x00531DC5]
--> backtrace[01] libvmacore.so[0x0042182A]: Vmacore::System::Stacktrace::CaptureFullWork(unsigned int)
--> backtrace[02] libvmacore.so[0x00434009]: Vmacore::System::SystemFactory::CreateBacktrace(Vmacore::Ref<Vmacore::System::Backtrace>&)
--> backtrace[03] libvmacore.so[0x0050A989]
--> backtrace[04] libvmacore.so[0x0050AAA1]: Vmacore::PanicExit(char const*)
--> backtrace[05] libvmacore.so[0x0042154C]: Vmacore::System::ResourceChecker::DoCheck()
--> backtrace[06] libvmacore.so[0x00385107]
--> backtrace[07] libvmacore.so[0x0037EC04]
--> backtrace[08] libvmacore.so[0x00384517]
--> backtrace[09] libvmacore.so[0x00510FBB]
--> backtrace[10] libpthread.so.0[0x00008EB0]
--> backtrace[11] libc.so.6[0x000FFADF]
--> backtrace[12] (no module)
--> [backtrace end]

 

  • /var/log/vmware/vmware-sps/sps.log may show entries similar to the following:

####-##-##T##:##:##.###+##:## [pool-3-thread-19] INFO  opId= com.vmware.vim.storage.common.task.CustomThreadPoolExecutor - [VLSI-client] Active thread count is: 20, Core Pool size is: 20, Queue size: 11, Time spent waiting in queue: 4 millis | ThreadPool Starvation Alert

 

  • vpxd may also log a large number of createContainerView events for SPS tasks. These views are all closed, but the volume indicates a high level of SPS activity:

grep vim.view.ViewManager.createContainerView vpxd-*.log | grep BEGIN | awk '{print$16}' | sort | uniq -c | sort -nr | head


   8649 55555####-####-####-####-############(55555####-####-####-####-############)
   2902 #########-####-####-####-############(#########-####-####-####-############)
   

To identify the client responsible for these calls, run the command below against the ID from the output above:

find -iname "vpxd-profiler*" -type f -exec grep -H "55555####-####-####-####-############" {} \; | grep "ClientIP" | head -n 5
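
The two log checks described above can be wrapped in small helpers to get a quick count of each symptom. This is a minimal sketch rather than a VMware-supplied tool: the function names `count_memory_panics` and `count_starvation_alerts` are hypothetical, and the match strings are taken from the sample log lines in this article, so they may need adjusting if the log format differs.

```shell
#!/bin/bash
# Hypothetical helpers; the match strings come from the sample log lines in this article.

# Count vpxd log lines where the memory checker reports the hard limit exceeded.
count_memory_panics() {
    [ "$#" -gt 0 ] || { echo "usage: count_memory_panics LOGFILE..." >&2; return 2; }
    # grep -c exits non-zero when the count is 0, so mask that with "|| true".
    cat -- "$@" | grep -c 'Memory checker.*exceeds hard limit' || true
}

# Count SPS log lines flagged with a thread pool starvation alert.
count_starvation_alerts() {
    [ "$#" -gt 0 ] || { echo "usage: count_starvation_alerts LOGFILE..." >&2; return 2; }
    cat -- "$@" | grep -c 'ThreadPool Starvation Alert' || true
}

# Example usage on the appliance (paths as given in this article):
#   count_memory_panics /var/log/vmware/vpxd/vpxd.log
#   count_starvation_alerts /var/log/vmware/vmware-sps/sps.log
```

A non-zero result from both helpers is consistent with the symptoms described here, but it does not confirm the cause on its own.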

Environment

vCenter Server 8.0 

Cause

 The core pool size for SMS defaults to 20. High SPS activity in the environment exhausts this pool and causes thread pool starvation, which drives vpxd memory usage past its hard limit and crashes the service.
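
To see how often the pool is actually saturated, the starvation alert lines can be parsed for their thread counts. The sketch below assumes the field layout of the sample sps.log line shown earlier ("Active thread count is: N, Core Pool size is: N, Queue size: N"). The function name `count_saturated_samples` is hypothetical, and treating "active threads at or above the core pool size with work queued" as saturation is an illustrative definition, not one taken from VMware documentation.

```shell
#!/bin/bash
# Hypothetical helper: count starvation alert lines where every core thread is busy
# (active >= core pool size) and requests are still waiting in the queue.
count_saturated_samples() {
    # sed extracts "active core queue" from each alert line; awk counts saturated ones.
    sed -n 's/.*Active thread count is: \([0-9]*\), Core Pool size is: \([0-9]*\), Queue size: \([0-9]*\),.*ThreadPool Starvation Alert.*/\1 \2 \3/p' "$@" |
        awk '$1 >= $2 && $3 > 0 { saturated++ } END { printf "%d\n", saturated }'
}

# Example usage on the appliance (path as given in this article):
#   count_saturated_samples /var/log/vmware/vmware-sps/sps.log
```

A count that keeps growing while the core pool size stays at 20 matches the cause described above.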

Resolution

Note: Take a snapshot or backup of vCenter Server before making any changes. In an Enhanced Linked Mode environment, take offline snapshots.

 
1. Log in to vCenter Server through SSH as root and run shell to enter the Bash shell

2. Run the command below to open the sms.properties file for editing

vi /usr/lib/vmware-vpx/sps/conf/sms.properties

 
3. Increase the value of sms.threadpool.corePoolSize (the first line below) from 20 to 30

sms.threadpool.corePoolSize=30
sms.threadpool.maxPoolSize=500
sms.threadpool.keepAlive=120
sms.threadpool.queueSize=2

4. Save file 

Press "Esc"

Type  ":wq!"

Press "Enter" 

  
5. Restart the vCenter Server services so that the change takes effect

service-control --stop --all && service-control --start --all 

6. Monitor the environment to confirm that the issue does not recur
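
As an alternative to editing the file interactively in vi (steps 2 to 4), the same change can be sketched as a small shell function. This is an illustrative sketch under assumptions, not a VMware-provided procedure: the function name `set_core_pool_size` and the `.bak` backup suffix are hypothetical, and the file copy it keeps does not replace the snapshot or backup noted at the start of this section.

```shell
#!/bin/bash
# Hypothetical helper: set sms.threadpool.corePoolSize in a properties file,
# keeping a copy of the original next to it as FILE.bak.
set_core_pool_size() {
    local file=$1 size=$2

    # Accept only a non-empty, all-digit value.
    case $size in
        ''|*[!0-9]*) echo "size must be a positive integer" >&2; return 1 ;;
    esac

    # Refuse to continue if the key is not present, rather than guessing where to add it.
    grep -q '^sms\.threadpool\.corePoolSize=' "$file" || {
        echo "sms.threadpool.corePoolSize not found in $file" >&2
        return 1
    }

    # Copy the original, then rewrite the file in place from the copy so that
    # the file keeps its existing ownership and permissions.
    cp -p "$file" "$file.bak" || return 1
    sed "s/^sms\.threadpool\.corePoolSize=.*/sms.threadpool.corePoolSize=$size/" "$file.bak" > "$file"
}

# Example usage on the appliance (path as given in step 2):
#   set_core_pool_size /usr/lib/vmware-vpx/sps/conf/sms.properties 30
#   grep '^sms.threadpool.corePoolSize' /usr/lib/vmware-vpx/sps/conf/sms.properties
```

The service restart in step 5 is still required afterwards for the new value to be picked up.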