Handling powered off or suspended PMem VMs on host reboot

Article ID: 332718

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:

  • In an HA-enabled cluster, when a host is rebooted, powered-off and suspended PMem VMs fail to relocate to other hosts.
  • The reboot causes the host to enter a NotResponding state, which in turn causes the VMs to appear as disconnected in the vSphere Client.
  • The disconnected VMs return to the connected state once the rebooted host comes back up.
  • In /var/run/log/vpxd-*.log, you see failed attempts to migrate these VMs to other hosts.
  • You see messages similar to:
YYYY-MM-DDThh:mm:ss info vpxd[15833] [Originator@6876 sub=vpxLro opID=lro-294357-3f61ac0e-01] [VpxLRO] -- BEGIN lro-294358 -- -- HostMoReregisterPoweredOffVmLro --
YYYY-MM-DDThh:mm:ss verbose vpxd[15833] [Originator@6876 sub=vmmoVm opID=lro-294357-3f61ac0e-01] Relocating VM vm-1 from host id host-78 (##.##.#.#) to host id host-72
YYYY-MM-DDThh:mm:ss error vpxd[15833] [Originator@6876 sub=VmProv opID=lro-294357-3f61ac0e-01-01] Local-VC Host Migrate failed at vpx.vmprov.InvokeCallbacks for poweredOff VM 'vm-1' (vm-99, ds:///vmfs/volumes/########-######e3/vm-1/vm-1.vmx) on host-78 (##.##.#.#) in pool resgroup-69 with ds ds:///vmfs/volumes/########-######e3/ to host-72 (##.##.#.#) in pool resgroup-69 with ds ds:///vmfs/volumes/########-######e3/ with migId 4140586645260472458 with fault vmodl.fault.HostCommunication.
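
To find the affected VMs quickly, the failure line above can be matched with a regular expression. The following is a minimal sketch, not an official tool: the pattern is derived only from the sample message shown in this article, so other vpxd versions may word the line differently. The names MIGRATE_FAILURE, SAMPLE_LINE, and parse_migrate_failures are illustrative.

```python
import re

# Pattern built from the sample "Host Migrate failed" line in this article.
# Assumption: the field order matches the sample; adjust if your log differs.
MIGRATE_FAILURE = re.compile(
    r"Host Migrate failed at \S+ for (?P<state>\w+) VM '(?P<name>[^']+)' "
    r"\((?P<moid>vm-\d+), (?P<vmx>[^)]+)\) on (?P<src>host-\d+) "
    r".*? to (?P<dst>host-\d+) .*? with fault (?P<fault>[\w.]+?)\.?$"
)

# The redacted sample line from this article, used here as test input.
SAMPLE_LINE = (
    "YYYY-MM-DDThh:mm:ss error vpxd[15833] [Originator@6876 sub=VmProv "
    "opID=lro-294357-3f61ac0e-01-01] Local-VC Host Migrate failed at "
    "vpx.vmprov.InvokeCallbacks for poweredOff VM 'vm-1' (vm-99, "
    "ds:///vmfs/volumes/########-######e3/vm-1/vm-1.vmx) on host-78 (##.##.#.#) "
    "in pool resgroup-69 with ds ds:///vmfs/volumes/########-######e3/ to "
    "host-72 (##.##.#.#) in pool resgroup-69 with ds "
    "ds:///vmfs/volumes/########-######e3/ with migId 4140586645260472458 "
    "with fault vmodl.fault.HostCommunication."
)


def parse_migrate_failures(lines):
    """Yield a dict of named fields for every failed host-migrate line."""
    for line in lines:
        match = MIGRATE_FAILURE.search(line.rstrip("\n"))
        if match:
            yield match.groupdict()
```

Passing an open log file (or any iterable of lines) to parse_migrate_failures yields one dictionary per failure, with the VM name, power state, .vmx path, source and destination host IDs, and the fault type.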



Environment

VMware vSphere ESXi 7.0.x

Cause

When a host with powered-off or suspended PMem VMs is rebooted, vCenter Server attempts to re-register these VMs on another compatible host in the cluster. Re-registration fails because vCenter Server cannot initiate a Storage vMotion of the PMem NVDIMM: PMem is a host-local store and is inaccessible while the host is rebooting. As a result, the relocate operation fails and the VM remains in a disconnected state on the rebooted host.

Resolution

If the host does not come back after the reboot, or the host failure is permanent, use the following workaround to relocate powered-off or suspended PMem VMs to another host that has PMem resources.

To manually register and reconfigure the VMs so that they reconnect to vCenter Server:

1. Find the VM home directory on the shared datastore (VMFS, NFS, vSAN, etc.) and re-register the VM on a host that has enough CPU, memory, and PMem resources for it. The registration succeeds, but all NVDIMMs are set to 0 bytes.
2. Browse to the virtual machine in the vSphere Client. Right-click the VM, select Edit Settings, remove all the NVDIMMs, and select OK. This initiates a VM reconfigure task.
3. The VM reconfigure task fails because the PMem vmdk is not found; however, the NVDIMMs are still removed from the VM configuration.
4. Browse to the virtual machine in the vSphere Client. Right-click the VM, select Edit Settings, add all the NVDIMMs, and select OK. This initiates another VM reconfigure task.
5. This reconfigure task succeeds and new NVDIMMs are added to the VM. After this, the VM can be powered on on the new host if needed.

Note: The same workaround applies to PMem VMs whether the failover flag is enabled or disabled.
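
Step 1 of the workaround (re-registering the VM on a chosen host) can also be scripted. The sketch below assumes the pyVmomi SDK and that the caller has already connected to vCenter Server and looked up the target VM folder, resource pool, and host as managed objects; none of that lookup is shown. It relies on the Folder method RegisterVM_Task from the vSphere API. The helper names datastore_path and register_vm, and the datastore name in the usage note, are hypothetical. Steps 2 to 5 are still performed in the vSphere Client as described above.

```python
def datastore_path(datastore_name, vm_dir, vmx_file):
    """Build a '[datastore] dir/file.vmx' path for VM registration."""
    return f"[{datastore_name}] {vm_dir}/{vmx_file}"


def register_vm(folder, vmx_path, pool, host):
    """Start registering an existing .vmx file on a specific host.

    Assumption: folder, pool, and host are pyVmomi managed objects
    (vim.Folder, vim.ResourcePool, vim.HostSystem) obtained elsewhere.
    Returns the task object; the caller must wait for it to complete.
    """
    return folder.RegisterVM_Task(
        path=vmx_path,     # datastore path to the VM's .vmx file
        asTemplate=False,  # register as a regular VM, not a template
        pool=pool,         # resource pool that will own the VM
        host=host,         # target host with enough CPU, memory, and PMem
    )
```

For example, with a hypothetical shared datastore named sharedDS, datastore_path("sharedDS", "vm-1", "vm-1.vmx") produces the path to pass to register_vm. After the returned task completes, the VM appears in the inventory with its NVDIMMs set to 0 bytes, as noted in step 1.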