VPXD crashes due to race condition of host member runtime in DVS
Article ID: 379923
Products
VMware vCenter Server 7.0
VMware vCenter Server 8.0
Issue/Introduction
The vpxd service on vCenter Server crashes shortly after it starts.
A core.vpxd-worker file is generated after each crash.
The call stack in the core file may be similar to the following trace:
(gdb) bt
#0 0x00007f57dfa9c041 in raise () from /lib/libc.so.6
#1 0x00007f57dfa85536 in abort () from /lib/libc.so.6
#2 0x00007f57e5a7e3c0 in Vmacore::System::SignalTerminateHandler (info=0x7f56de09b1f0, ctx=0x7f56de09b0c0) at bora/vim/lib/vmacore/posix/defSigHandlers.cpp:62
#3 <signal handler called>
#4 0x00005643861db305 in PersistStrTab::Insert (str=<error reading variable: Cannot access memory at address 0x0>, this=0x7f56de09bee8) at bora/vim/lib/public/stringTable/StringTable.h:54
#5 PersistStrTab::Insert (str=..., this=0x7f56de09bee8) at bora/vim/lib/public/stringTable/StringTable.h:55
#6 Vpxd::VmDvs::InitializeHostMemberStatus (this=this@entry=0x7f56d92196f0, checkState=checkState@entry=0x7f56de09bd10) at bora/vpx/vpxd/vmcheck/vmDvs.cpp:204
#7 0x00005643861dc475 in Vpxd::VmDvs::VmDvs (this=0x7f56d92196f0, dvs=<optimized out>, checkState=0x7f56de09bd10) at bora/vpx/vpxd/vmcheck/vmDvs.cpp:97
#8 0x00005643861c5d8d in Vpxd::CompatCheckState::CreateDvs (this=this@entry=0x7f56de09bd10, datacenter=datacenter@entry=0x7f576c01d7e0, uuid=...) at bora/vim/lib/public/vmacore/Ref.h:239
#9 0x00005643861e1b2b in Vpxd::VmHost::VmHost (this=this@entry=0x7f57cc946070, host=host@entry=0x7f56f08c6a70, checkState=checkState@entry=0x7f56de09bd10, isXvc=isXvc@entry=false) at bora/vim/lib/public/vmacore/Ref.h:239
#10 0x00005643861c3e7a in Vpxd::CompatCheckState::CreateHost (this=this@entry=0x7f56de09bd10, host=0x7f56f08c6a70, isXvc=isXvc@entry=false) at bora/vpx/vpxd/vmcheck/compatCheckState.cpp:83
#11 0x00005643861f03b8 in Vpxd::ConstructVmHosts (hosts=..., checkState=checkState@entry=0x7f56de09bd10, isXvc=isXvc@entry=false, hostStates=...) at bora/vpx/vpxd/vmcheck/vmTestDriver.cpp:157
#12 0x00005643861f1911 in Vpxd::FastVmTestDriver (hosts=..., vms=..., setType=setType@entry=HOST_SET_FOR_VMOTION, testOptions=testOptions@entry=0x7f56de09bcd8, testFamily=testFamily@entry=Vpxd::VMTESTFAMILY_PROV, opType=opType@entry=Vpxd::VmOperation::relocate, checkState=..., compatible=<optimized out>, dasCompatible=<optimized out>) at bora/vpx/vpxd/vmcheck/vmTestDriver.cpp:203
#13 0x00005643861c6a60 in Vpxd::MoVmCompatChecker::ComputeCompatSetWithDrmReason (vms=..., allHosts=..., type=type@entry=HOST_SET_FOR_VMOTION, strictDrsCheck=strictDrsCheck@entry=true, fromHa=fromHa@entry=false, drmReason=drmReason@entry=kUnspecified, result=...) at bora/vpx/vpxd/vmcheck/moVmCompatChecker.cpp:603
#14 0x00005643861c6aec in Vpxd::MoVmCompatChecker::ComputeCompatSet (vms=..., allHosts=..., type=type@entry=HOST_SET_FOR_VMOTION, strictDrsCheck=strictDrsCheck@entry=true, fromHa=fromHa@entry=false, result=...) at bora/vpx/vpxd/vmcheck/moVmCompatChecker.cpp:479
#15 0x00005643864b58cf in DrsDumpWriter::GetVMSnapshot[abi:cxx11](std::vector<Vmacore::Ref<VmMo>, std::allocator<Vmacore::Ref<VmMo> > > const&, std::vector<Vmacore::Ref<HostMo>, std::allocator<Vmacore::Ref<HostMo> > > const&) (vms=..., hosts=...) at bora/vpx/drs/interface/drsDump.cpp:243
#16 0x00005643864b86b1 in DrsDumpWriter::DumpClusterSnapshot (cluster=<optimized out>) at bora/vpx/drs/interface/drsDump.cpp:318
#17 0x0000564386411aa0 in operator() (__closure=0x7f56f124e970) at bora/vim/lib/public/vmacore/Ref.h:239
#18 std::__invoke_impl<void, CdrsLoadBalancer::DoAsynchronousClusterDump()::<lambda()>&> (__f=...) at external/cayman_esx_toolchain_gcc12/usr/bin/../lib/gcc/x86_64-vmk-linux-gnu/12.1.0/../../../../x86_64-vmk-linux-gnu/include/c++/12.1.0/bits/invoke.h:61
#19 std::__invoke_r<void, CdrsLoadBalancer::DoAsynchronousClusterDump()::<lambda()>&> (__fn=...) at external/cayman_esx_toolchain_gcc12/usr/bin/../lib/gcc/x86_64-vmk-linux-gnu/12.1.0/../../../../x86_64-vmk-linux-gnu/include/c++/12.1.0/bits/invoke.h:111
#20 std::_Function_handler<void(), CdrsLoadBalancer::DoAsynchronousClusterDump()::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at external/cayman_esx_toolchain_gcc12/usr/bin/../lib/gcc/x86_64-vmk-linux-gnu/12.1.0/../../../../x86_64-vmk-linux-gnu/include/c++/12.1.0/bits/std_function.h:290
#21 0x000056438655287d in VpxUtil_InvokeWithOpId (opID=..., funcName=funcName@entry=0x56438468db0a "ClusterSnapshot", functor=...) at bora/vpx/common/vpxAppsUtil.cpp:395
#22 0x000056438655293c in VpxUtil_InvokeWrapper (opID=..., funcName=0x56438468db0a "ClusterSnapshot", functor=..., doNotCatchVmacoreException=<optimized out>) at bora/vpx/common/vpxAppsUtil.cpp:420
#23 0x00007f57e5918be6 in Vmacore::System::ThreadPoolFair::InvokeItem(std::function<void ()>&) const (this=<optimized out>, item=...) at bora/vim/lib/vmacore/asio/ThreadPoolFair.cpp:641
#24 0x00007f57e591e4f9 in Vmacore::System::ThreadPoolFair::RunWorkerThread (this=0x564388247780) at bora/vim/lib/vmacore/asio/ThreadPoolFair.cpp:1298
#25 0x00007f57e5ab8093 in std::function<void ()>::operator()() const (this=0x7f5728808728) at external/cayman_esx_toolchain_gcc12/usr/bin/../lib/gcc/x86_64-vmk-linux-gnu/12.1.0/../../../../x86_64-vmk-linux-gnu/include/c++/12.1.0/bits/std_function.h:591
#26 Vmacore::System::ThreadPosix::ThreadBegin (data=0x7f5728808720) at bora/vim/lib/vmacore/posix/thread.cpp:122
#27 0x00007f57dfc2deae in start_thread () from /lib/libpthread.so.0
#28 0x00007f57dfb5ce2f in clone () from /lib/libc.so.6
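When comparing a backtrace from your own core file against the trace above, a small script can make the check less error-prone. The following is a minimal sketch, not an official diagnostic: it assumes the backtrace has been saved as plain text, and the choice of the three frames used as the "signature" (taken from frames #4, #6 and #8 above) is an assumption of this example. The helper names `frame_functions` and `matches_signature` are hypothetical.

```python
import re

# Frames copied from the backtrace in this article. Treating these three
# frames as the signature is an assumption of this sketch.
SIGNATURE_FRAMES = (
    "PersistStrTab::Insert",
    "Vpxd::VmDvs::InitializeHostMemberStatus",
    "Vpxd::CompatCheckState::CreateDvs",
)


def frame_functions(backtrace: str) -> list[str]:
    """Return an approximate function name for each '#N ...' frame, in order."""
    funcs = []
    # Split on frame markers such as '#0 ' or '#12 ', whether the frames are
    # on separate lines or run together on one line.
    for chunk in re.split(r"(?:^|\s)#\d+\s+", backtrace)[1:]:
        # Drop the optional '0x... in ' address prefix.
        chunk = re.sub(r"^0x[0-9a-fA-F]+ in ", "", chunk)
        # Approximation: the name is the text before the first ' ('.
        funcs.append(chunk.split(" (", 1)[0].strip())
    return funcs


def matches_signature(backtrace: str) -> bool:
    """True if all signature frames appear in the backtrace, in order."""
    funcs = iter(frame_functions(backtrace))
    # Consuming one shared iterator enforces the ordering of the frames.
    return all(any(f == want for f in funcs) for want in SIGNATURE_FRAMES)
```

For example, `matches_signature(open("bt.txt").read())` would report whether a saved backtrace (the file name `bt.txt` is illustrative) contains the same sequence of frames. A match only suggests the same crash path; it does not by itself confirm the root cause.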
Environment
VMware vCenter Server 7.0.x
VMware vCenter Server 8.0.x
Cause
When multiple ESXi hosts join a new DVS, a race condition in the vCenter host sync handler can prevent the hosts' new DVS member status runtime from being saved. The backtrace above shows vpxd then failing in Vpxd::VmDvs::InitializeHostMemberStatus.
Resolution
This is a known issue that is fixed in vCenter Server 8.0 U3b.
To work around the issue:
Connect to the vCenter Server using SSH.
Disconnect all hosts connected to vCenter in vPostgres: