VM Migration Fails at 68% with 'Module NVMAN power on failed'

Article ID: 334934


Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

VM live migration (vMotion) fails at 68% with the error: Module 'Nvman' power on failed.

Destination vmkernel logs:

YYYY-MM-DDTHH:MM:SS.442Z cpu48:30315019)Fil3: 5010: Lock failed on file: test-vm.vmx on vol '/vmfs/volumes/<datastore-name>' with FD: <FD c39 r70>
YYYY-MM-DDTHH:MM:SS.444Z cpu48:30315019)WARNING: Migrate: 6460: ##### D: Migration considered a failure by the VMX. It is most likely a timeout, but check the VMX log for the true error.
YYYY-MM-DDTHH:MM:SS.444Z cpu48:30315019)Migrate: 102: ##### D: MigrateState: Failed
YYYY-MM-DDTHH:MM:SS.444Z cpu48:30315019)WARNING: Migrate: 256: ##### D: Failed: Migration determined a failure by the VMX (0xbad0092) @0x42003a0b41bb
YYYY-MM-DDTHH:MM:SS.444Z cpu48:30315019)VMotion: 7473: ##### D: Estimated network bandwidth 2990.698 MB/s before failure

 

Destination vpxa logs:

YYYY-MM-DDTHH:MM:SS.863Z info vpxa[#####] [Originator@6876 sub=Default opID=#####-#####-auto-hkce-h5:XXXXX-71-01-31-01] [VpxLRO] -- ERROR task-#####-- -- vim.host.VMotionManager.initiateDestination:tracking: vim.fault.GenericVmConfigFault:
--> Result:
--> (vim.fault.GenericVmConfigFault) {
--> faultCause = (vmodl.MethodFault) null,
--> faultMessage = (vmodl.LocalizableMessage) [
--> (vmodl.LocalizableMessage) {
--> key = "msg.moduletable.powerOnFailed",
--> arg = (vmodl.KeyAnyValue) [
--> (vmodl.KeyAnyValue) {
--> key = "1",
--> value = "Nvman"
--> }
--> ],
--> message = "Module 'Nvman' power on failed. "
--> },
--> (vmodl.LocalizableMessage) {
--> key = "msg.migrate.resume.fail",
--> arg = <unset>,
--> message = "The VM failed to resume on the destination during early power on. "
--> },
--> (vmodl.LocalizableMessage) {
--> key = "faultTime",
--> arg = <unset>,
--> message = "YYYY-MM-DDTHH:MM:SS.638714Z"
--> }
--> ],
--> reason = "Module 'Nvman' power on failed. "
--> msg = "Module 'Nvman' power on failed. "
--> }
--> Args:
-->

vmware.log:

YYYY-MM-DDTHH:MM:SS.244Z In(05) vcpu-0 - Closing disk 'scsi0:0'
YYYY-MM-DDTHH:MM:SS.244Z In(05) vcpu-0 - LWD: Closing disk 1760440B00 <================
YYYY-MM-DDTHH:MM:SS.795Z Er(02) worker-29931522 - IOFIPC: Unable to connect to UDS at /var/run/vmwarelwd/daemon: No such file or directory
YYYY-MM-DDTHH:MM:SS.795Z Er(02) worker-29931522 - IOFIPC: Error creating a connection to add to pool 'daemonId': No such file or directory
YYYY-MM-DDTHH:MM:SS.795Z In(05) worker-29931522 - IOFIPC: Client connection failed in a previous attempt; delaying 65536ms before next attempt
YYYY-MM-DDTHH:MM:SS.247Z Er(02) worker-29931525 - LWD: Failed to connect IPC client to serve request 17B3BBFED0 for disk 1760440B00; error: Connection timed out
YYYY-MM-DDTHH:MM:SS.247Z Wa(03) vcpu-0 - LWD: Error sending CloseDisk IPC to daemon for disk 1760440B00: Connection timed out
YYYY-MM-DDTHH:MM:SS.249Z In(05) vcpu-0 - LWD: Closed disk 1760440B00
YYYY-MM-DDTHH:MM:SS.249Z In(05) vcpu-0 - LWD: LwdFilter_Exit while on disk 1760440B00
YYYY-MM-DDTHH:MM:SS.251Z Er(02) worker-29931523 - LWD: Failed to connect IPC client to serve request 17A9A95480 for disk 0; error: Connection timed out
YYYY-MM-DDTHH:MM:SS.251Z In(05) vcpu-0 - IOFIPC: IPC Service is no longer accepting connections on 109
YYYY-MM-DDTHH:MM:SS.251Z Wa(03) vcpu-0 - IOFIPC: Unable to remove timer for server listening to 'filterId'
YYYY-MM-DDTHH:MM:SS.251Z In(05) vcpu-0 - IOFIPC: IPC Service is no longer accepting connections on 109
YYYY-MM-DDTHH:MM:SS.257Z In(05) vcpu-0 - IOFIPC: IPC management subsystem shut down
YYYY-MM-DDTHH:MM:SS.257Z In(05) vcpu-0 - LWD: LwdFilter_MinimalExit while on disk 1760440B00  <============
YYYY-MM-DDTHH:MM:SS.259Z In(05) vcpu-0 - DISKLIB-VMFS : "/vmfs/volumes/<datastore-name>/#####_4-flat.vmdk" : closed.

Closing the disk takes a long time and is retried repeatedly because the LWD filter cannot connect its IPC client to the daemon socket.
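
The socket path in these errors can be checked directly on the affected ESXi host. This is a suggested verification and not part of the logged output; if the directory or socket is missing, the LWD daemon is not running on that host:

        Check for the daemon socket : ls -l /var/run/vmwarelwd/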


Environment

VMware vSphere ESXi 7.0

Cause

The virtual machine was previously protected with DP/LWD on the source host, so its disks still reference the vmwarelwd I/O filter. The VM was migrated to another cluster that does not have the DP service (dpd) running, so the LWD filter cannot connect to its daemon during the early power on of the migrated VM and the migration fails.

Resolution

Confirm the VMDK descriptor file contains the LWD I/O filter entry:

ddb.iofilters = "spif:vmwarelwd"
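
The descriptor can be inspected directly on the ESXi host. The path below is a placeholder for the VM's descriptor VMDK (not the -flat.vmdk file); substitute the actual datastore, folder, and file names:

        Check the descriptor : grep -i "ddb.iofilters" /vmfs/volumes/<datastore-name>/<vm-name>/<vm-name>.vmdk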

  • Enable DP on the destination hosts by calling EPS (enable protection service on the cluster) or by starting the dpd service on the destination ESXi host(s), then attempt the migration again.
        Start the dpd service : /etc/init.d/dpd start
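
Before retrying the migration, it can help to confirm on each destination host that the service is running and that the daemon socket from the vmware.log errors now exists. This verification step is a suggestion, assuming the init script supports the status action:

        Check the dpd service : /etc/init.d/dpd status
        Check for the daemon socket : ls -l /var/run/vmwarelwd/daemon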


Additional Information