[VMC] VMs running Hardware V10 with VMware Tools 11.0.5 are seen HA restarting

Article ID: 334974

Updated On:

Products

VMware Cloud on AWS
VMware Cloud on Dell EMC

Issue/Introduction

This article describes a known scenario in which VMs running Hardware V10 with VMware Tools 11.0.5 are repeatedly restarted by vSphere HA.

Symptoms:

  • The affected workload VM is configured with Hardware V10 and is running VMware Tools 11.0.5.
  • The VM is restarted by vSphere HA, seemingly at random. The underlying host is healthy, and workload VMs with a newer Hardware version or a newer VMware Tools version are not affected.
  • The VM has a SATA device and/or Controller device attached, even if that device is in the disconnected state.
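The three symptom conditions above can be combined into a single check when reviewing an exported VM inventory. The following is a minimal sketch only: the `VmInfo` record, its field names, the `vmx-10` spelling of the hardware version, and the idea of detecting the device by looking for "sata" in a device label are all assumptions made for illustration and are not taken from any VMware API.

```python
from dataclasses import dataclass, field


@dataclass
class VmInfo:
    """Hypothetical inventory record; the field names are illustrative only."""
    name: str
    hardware_version: str            # assumed format, e.g. "vmx-10"
    tools_version: str               # assumed format, e.g. "11.0.5"
    device_labels: list = field(default_factory=list)


def matches_symptoms(vm: VmInfo) -> bool:
    """Return True only when all three symptom conditions hold for this VM."""
    # Assumption: a SATA device or controller has "sata" in its label.
    # Connection state is deliberately ignored, because a disconnected
    # device still counts.
    has_sata = any("sata" in label.lower() for label in vm.device_labels)
    return (
        vm.hardware_version.strip().lower() == "vmx-10"
        and vm.tools_version.strip() == "11.0.5"
        and has_sata
    )
```

A VM that fails any one of the three conditions is not reported, which mirrors the statement that VMs with a newer Hardware version or a newer VMware Tools version are not affected.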

 
The VM's vmware.log file shows the VMX process panicking on signal 11:
YYYY-MM-DDThh:mm:ss Z[+1776.122] Wa(03) vcpu-1 - Caught signal 11 -- tid 93834351 (addr 0)
YYYY-MM-DDThh:mm:ss Z[+1776.122] In(05) vcpu-1 - SIGNAL: rip 0xcc514defb3 rsp 0xcc98113758 rbp 0xcc981137c0
YYYY-MM-DDThh:mm:ss Z[+1776.122] In(05) vcpu-1 - SIGNAL: rax 0x1c rbx 0xcc9669a1a0 rcx 0x2 rdx 0x0 rsi 0xcc52ede4a0 rdi 0x0
YYYY-MM-DDThh:mm:ss Z[+1776.122] In(05) vcpu-1 -         r8 0xcc9669a1b8 r9 0xcc9669a1b8 r10 0xcc52ede4a0 r11 0x202 r12 0xcc52ede4a0 r13 0xf2 r14 0xcc9667e000 r15 0xcc97c9d000
YYYY-MM-DDThh:mm:ss Z[+1776.123] Cr(01) vcpu-1 - PANIC: Unexpected signal: 11.
YYYY-MM-DDThh:mm:ss Z[+1777.081] Wa(03) vcpu-1 - A core file is available in "/vmfs/volumes/vsan:XXXXXXX/XXXXXXX/vmx-zdump.XXX"
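To confirm the signature in a copy of vmware.log, the panic line and the core file path can be picked out with two regular expressions. This is a hedged sketch: the function name `scan_vmware_log` is hypothetical, and the patterns are derived only from the sample lines above, so other builds may word these messages differently.

```python
import re

# Patterns taken from the sample log lines above.
PANIC_RE = re.compile(r"PANIC: Unexpected signal: 11\b")
CORE_RE = re.compile(r'A core file is available in "([^"]+)"')


def scan_vmware_log(lines):
    """Return (panic_lines, core_paths) found in an iterable of log lines."""
    panic_lines, core_paths = [], []
    for line in lines:
        if PANIC_RE.search(line):
            panic_lines.append(line.rstrip("\n"))
        match = CORE_RE.search(line)
        if match:
            core_paths.append(match.group(1))
    return panic_lines, core_paths
```

The function accepts any iterable of strings, so it can be given an open file object for a downloaded vmware.log. An empty `panic_lines` result means only that this exact message was not found in the file that was scanned.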


Cause

A change made between Hardware V10 and V11 causes the VMX process to panic when a Hardware V10 VM running VMware Tools 11.0.5 has a SATA device and/or Controller device attached.

Resolution

Upgrade the VM's hardware to V11 or newer. As a best practice, also upgrade the version of VMware Tools installed in the VM.

Workaround:
Remove the SATA and/or Controller device from the VM using the Edit Settings button.

Additional Information

Impact/Risks:
Affected VMs will continue to be restarted by vSphere HA at random until the resolution or workaround is applied.