ESXi host not responding to vCenter server due to vmx process crash.

Article ID: 322085

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:
  • The ESXi host is not responding to vCenter Server.
     
  • The vpxa process on the ESXi host shuts itself down and restarts automatically shortly afterward.
     
  • A core dump file of the vmx process is created in the VM folder.
     
  • In the vmware.log under the VM directory, you see entries similar to:

    2017-05-11T08:38:18.497Z| svga| I125: MKS-RenderMain: Stopping MKSBasicOps
    2017-05-11T08:38:18.498Z| svga| W115: GLWindow: Unable to reserve host GPU resources
    2017-05-11T08:38:18.498Z| vmx| I125: Msg_Post: Information
    2017-05-11T08:38:18.498Z| vmx| I125: [msg.mks.noGPUResourceFallback] Hardware GPU resources are not available. The virtual machine will use software rendering.
    2017-05-11T08:38:18.498Z| vmx| I125: ----------------------------------------
    2017-05-11T08:38:18.499Z| svga| I125: MKS-SWP: plugin started - llvmpipe (LLVM 3.3, 256 bits)
    2017-05-11T08:38:18.500Z| svga| I125: Started Shim3D
    2017-05-11T08:38:18.500Z| svga| I125: MKS-RenderMain: Starting SWRenderer
    2017-05-11T08:38:18.506Z| svga| I125: Stopped Shim3D
    2017-05-11T08:38:18.506Z| svga| I125: MKS-SWP: plugin stopped
    2017-05-11T08:38:18.506Z| svga| I125: MKS-RenderMain: Stopping SWRenderer
    2017-05-11T08:38:18.506Z| svga| I125: MKS-RenderMain: Starting MKSBasicOps
    2017-05-11T08:38:18.512Z| svga| I125: MKS-RenderMain: Stopping MKSBasicOps
    2017-05-11T08:38:18.513Z| svga| W115: GLWindow: Unable to reserve host GPU resources
    2017-05-11T08:38:18.514Z| svga| I125: MKS-SWP: plugin started - llvmpipe (LLVM 3.3, 256 bits)
    2017-05-11T08:38:18.514Z| svga| I125: Started Shim3D
    2017-05-11T08:38:18.514Z| svga| I125: MKS-RenderMain: Starting SWRenderer
    2017-05-11T08:38:19.024Z| vmx| I125: Vigor_MessageRevoke: message 'msg.mks.noGPUResourceFallback' (seq 17226643) is revoked
    2017-05-11T08:38:19.025Z| vmx| I125: Msg_Post: Information
    2017-05-11T08:38:19.025Z| vmx| I125: [***] ************
    2017-05-11T08:38:19.025Z| vmx| I125: ----------------------------------------
    2017-05-11T08:38:19.035Z| vmx| I125: Vigor_MessageRevoke: message '***' (seq 17226644) is revoked
    2017-05-11T08:38:19Z[+0.000]| vmx| W115: Caught signal 11 -- tid 2946475 (addr 51CB83E0)


    where the areas marked *** in the log messages above are garbled strings whose content varies from case to case.

 

  • In the hostd log under /var/run/log directory, you see entries similar to:

    2017-05-11T08:38:19.024Z info hostd[60980B70] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 124098 : Message on <VM_name> on <ESXi_name> in ha-datacenter: Hardware GPU resources are not available. The virtual machine will use software rendering.
    ..
    2017-05-11T08:38:19.035Z info hostd[60980B70] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 124100 : Message on <VM_name> on <ESXi_name> in ha-datacenter: ***
    ..
    2017-05-11T08:38:21.173Z info hostd[611C4B70] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 124104 : An application (/bin/vmx) running on ESXi host has crashed (1 time(s) so far). A core file might have been created at /vmfs/volumes/bae09318-1962d078/<VM_name>/vmx-zdump.000.

    As in vmware.log, the *** in the hostd log above stands for garbled characters.
     
  • In the vpxa log under /var/run/log directory, you see entries similar to:

    2017-05-11T08:38:19.757Z error vpxa[FFEB3B70] [Originator@6876 sub=vmomi.soapStub[1] opID=WFU-76958b15] Error deserializating SOAP response body:
    --> Error returned by expat parser: not well-formed (invalid token)
    -->
    --> while parsing serialized value of type string
    --> at line 7, column 626
    -->
    --> while parsing property "fullFormattedMessage" of static type string
    -->
    --> while parsing serialized DataObject of type vim.event.VmMessageEvent
    --> at line 7, column
    -->
    --> while parsing return value of type vim.event.Event[], version vim.version.version
    --> at line 7, column 0
    -->
    --> while parsing SOAP body
    --> at line 6, column 0
    -->
    --> while parsing SOAP envelope
    --> at line 2, column 0
    -->
    --> while parsing HTTP response for method readNext
    --> on object of type vim.event.EventHistoryCollector
    --> at line 1, column 0
    2017-05-11T08:38:19.758Z error vpxa[FFEB3B70] [Originator@6876 sub=vpxaVmomi opID=WFU-76958b15] [VpxaClientAdapter::InvokeCommon] Got exception while invoking readNext on vim.event.EventHistoryCollector:session[52524d42-09c9-7049-b373-1909e8ca4582]5284e251-e630-1e82-5dff-df13ab09949a: 'N7Vmacore4Soap24InvalidResponseExceptionE(
    --> Error returned by expat parser: not well-formed (invalid token)
    -->
    --> while parsing serialized value of type string
    --> at line 7, column 626
    -->
    --> while parsing property "fullFormattedMessage" of static type string
    -->
    --> while parsing serialized DataObject of type vim.event.VmMessageEvent
    --> at line 7, column 42
    -->
    --> while parsing return value of type vim.event.Event[], version vim.version.version10
    --> at line 7, column 0
    -->
    --> while parsing SOAP body
    --> at line 6, column 0
    -->
    --> while parsing SOAP envelope
    --> at line 2, column 0
    -->
    --> while parsing HTTP response for method readNext
    --> on object of type vim.event.EventHistoryCollector
    --> at line 1, column 0)'
    2017-05-11T08:38:19.758Z error vpxa[FFEB3B70] [Originator@6876 sub=VpxaHalCnxHostagent opID=WFU-76958b15] [WaitForUpdatesDone] Got error while processing updates: N5Vmomi5Fault17HostCommunication9ExceptionE(vmodl.fault.HostCommunication)
    2017-05-11T08:38:19.758Z error vpxa[FFEB3B70] [Originator@6876 sub=VpxaHalCnxHostagent opID=WFU-76958b15] [WaitForUpdatesDone] Fatal error while listening-for/processing updates from hostd.
    --> N5Vmomi5Fault17HostCommunication9ExceptionE(vmodl.fault.HostCommunication)
    2017-05-11T08:38:19.763Z info vpxa[FFEB3B70] [Originator@6876 sub=vpxaInvtHost opID=WFU-76958b15] [VpxaInvtHost] ServerId has been changed from 54132 to 0
    2017-05-11T08:38:19.763Z error vpxa[FFEB3B70] [Originator@6876 sub=vpxaInvtHostCnx opID=WFU-76958b15] [VpxaInvtHost] Can't connect to hostd/serverd. Shutting down...
    2017-05-11T08:38:19.763Z info vpxa[FFEB3B70] [Originator@6876 sub=Default opID=WFU-76958b15] [Vpxa] Shutting down now
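
The log signatures listed above can be checked in one pass. The following is a minimal shell sketch, not a VMware-provided tool. The function names check_signatures and probe are hypothetical, the search strings are copied from the excerpts above, and the sketch assumes the hostd and vpxa logs are named hostd.log and vpxa.log under /var/run/log. The vmware.log path is a placeholder for the affected VM's directory.

```shell
#!/bin/sh
# Sketch only: report which of the log signatures quoted in this article
# are present in a given set of log files.

# probe LABEL FILE FIXED_STRING
# Prints "found: LABEL" or "missing: LABEL" and records any miss.
probe() {
    if grep -F -q -- "$3" "$2" 2>/dev/null; then
        echo "found: $1"
    else
        echo "missing: $1"
        missing=1
    fi
}

# check_signatures VMWARE_LOG HOSTD_LOG VPXA_LOG
# Returns 0 only if all four signatures are present.
check_signatures() {
    missing=0
    probe "GPU fallback message (vmware.log)" "$1" "msg.mks.noGPUResourceFallback"
    probe "vmx signal 11 (vmware.log)" "$1" "Caught signal 11"
    probe "vmx crash event (hostd.log)" "$2" "An application (/bin/vmx) running on ESXi host has crashed"
    probe "SOAP deserialization error (vpxa.log)" "$3" "Error returned by expat parser: not well-formed (invalid token)"
    return "$missing"
}

# Example call (replace the placeholders with the real VM directory):
#   check_signatures "/vmfs/volumes/<datastore>/<VM_name>/vmware.log" \
#       /var/run/log/hostd.log /var/run/log/vpxa.log
```

A host that matches all four signatures is consistent with the issue described here, but the match is only a hint: each string can also appear for unrelated reasons.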


Environment

VMware vSphere ESXi 6.0
VMware vSphere ESXi 6.5

Cause

The vmx process crashes while processing the message event "[msg.mks.noGPUResourceFallback] Hardware GPU resources are not available. The virtual machine will use software rendering". Just before the crash, the vmx process passes a message event containing an invalid string to hostd. hostd then cannot build a well-formed response to the GetChange request from vpxa, so vpxa fails to deserialize the SOAP response body and shuts itself down.

As a result, the ESXi host stops responding to heartbeats from vCenter Server.
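
To locate the event that carried the invalid string, one option is to search the hostd log for event lines containing bytes outside printable ASCII. This is a rough sketch under stated assumptions: the function name find_garbled_events is hypothetical, the sketch assumes the garbled text shows up as control or non-ASCII bytes, and it will also flag legitimate non-ASCII (for example localized) messages.

```shell
#!/bin/sh
# Sketch only: list hostd event lines that contain non-printable or
# non-ASCII bytes, which is how a garbled message string may appear.

# find_garbled_events HOSTD_LOG
# Prints matching lines prefixed with their line number.
# Exit status is 0 if at least one line matched, 1 if none did.
find_garbled_events() {
    # LC_ALL=C makes the bracket classes byte-oriented (ASCII only).
    # -a treats the input as text even if it contains odd bytes.
    LC_ALL=C grep -a -n 'Vimsvc\.ha-eventmgr.*[^[:print:][:space:]]' "$1"
}

# Example call:
#   find_garbled_events /var/run/log/hostd.log
```

The timestamp of a flagged event can then be compared with the time of the expat parser error in the vpxa log.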

Resolution

This is a known issue affecting ESXi 6.0.

This issue is resolved in VMware vSphere ESXi 6.5 P02 (ESXi650-201712001).

To download, go to Customer Connect Downloads.
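
Before planning the patch, it can help to confirm which build a host is running. The sketch below parses the output of vmware -v (for example "VMware ESXi 6.5.0 build-NNNNNNN") and compares the build number with a minimum supplied by the caller. The function names are hypothetical, and the minimum build number is deliberately left as a parameter: take it from the release notes of the patch named above rather than from this sketch.

```shell
#!/bin/sh
# Sketch only: compare the running ESXi build with a caller-supplied minimum.

# build_from_version_string "VMware ESXi 6.5.0 build-NNNNNNN"
# Prints the numeric build, or nothing if no build number is found.
build_from_version_string() {
    printf '%s\n' "$1" | sed -n 's/.*build-\([0-9][0-9]*\).*/\1/p'
}

# host_build_at_least MIN_BUILD [VERSION_STRING]
# Returns 0 if the build is >= MIN_BUILD. VERSION_STRING defaults to the
# output of `vmware -v`; passing it explicitly allows offline checks.
host_build_at_least() {
    min_build=$1
    version_string=${2:-$(vmware -v)}
    build=$(build_from_version_string "$version_string")
    [ -n "$build" ] && [ "$build" -ge "$min_build" ]
}

# Example call (PATCH_BUILD is a placeholder taken from the release notes):
#   host_build_at_least "$PATCH_BUILD" && echo "host is at or above the patch build"
```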

 


Workaround:

To work around the issue, restart the ESXi host daemon (hostd) and the vCenter Agent (vpxa) services using these commands:

/etc/init.d/hostd restart

/etc/init.d/vpxa restart
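
The two restarts can be wrapped in a small helper that stops at the first failure and then prints the state of each service. This is a sketch, not a supported script: the function name restart_mgmt_agents and its optional directory argument are hypothetical (the argument exists only so the helper can be tried against stand-in scripts), and it assumes the init scripts accept a status action in addition to restart.

```shell
#!/bin/sh
# Sketch only: restart hostd, then vpxa, then print the state of both.

# restart_mgmt_agents [INIT_DIR]
# INIT_DIR defaults to /etc/init.d.
restart_mgmt_agents() {
    init_dir=${1:-/etc/init.d}

    # Restart in the same order as the commands above: hostd first.
    for svc in hostd vpxa; do
        echo "Restarting $svc..."
        if ! "$init_dir/$svc" restart; then
            echo "restart of $svc failed" >&2
            return 1
        fi
    done

    # Report the resulting state of each service.
    for svc in hostd vpxa; do
        "$init_dir/$svc" status
    done
}

# Example call on the ESXi host:
#   restart_mgmt_agents
```

Once both services are running again, the host should reconnect to vCenter Server. The restart does not remove the underlying vmx defect.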


Additional Information



Japanese-language version of this article: "ESXi host does not respond to vCenter Server due to a vmx process crash"