Enabling Fault Tolerance in vCenter Server 5.x and 6.0 on a virtual machine fails with the error: A general system error occurred: Source detected that destination failed to resume
Article ID: 344643
Products
VMware vCenter Server
VMware vSphere ESXi
Issue/Introduction
Symptoms:
When enabling Fault Tolerance on a virtual machine, you see the error:
A general system error occurred: Source detected that destination failed to resume
The secondary virtual machine shows as Unprotected.
Running the following command on the ESXi host(s) to check whether vmkstatelogger is listening for connections returns no output:
localcli network ip connection list | grep LISTEN | grep 8100
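For reference, localcli bypasses the hostd management agent and is useful when hostd is unresponsive; on a host where hostd is healthy, the same check can also be run through esxcli:
esxcli network ip connection list | grep LISTEN | grep 8100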
In the /var/log/hostd.log file on the primary ESXi host, you see entries similar to:
YYYY-MM-DDT<time> [62140B90 info 'ha-eventmgr'] Event 3106 : Fault Tolerance state of VMname on host 'Hostname' in cluster 'ClusterName' in ha-datacenter changed from Starting to Need Secondary VM
YYYY-MM-DDT<time> [62140B90 verbose 'vm:/vmfs/volumes/datastore/virtual_machine/virtual_machine.vmx'] Need secondary reason: starting
YYYY-MM-DDT<time> [62A5BB90 verbose 'vm:/vmfs/volumes/datastore/virtual_machine/virtual_machine.vmx'] VMotionStatusCb: Completed
YYYY-MM-DDT<time> [62A5BB90 verbose 'vm:/vmfs/volumes/datastore/virtual_machine/virtual_machine.vmx'] VMotionResolveCheck: Firing ResolveCb
YYYY-MM-DDT<time> [62A5BB90 info 'VMotionSrc (1442411082520420)'] ResolveCb: VMX reports needsUnregister = false for migrateType MIGRATE_TYPE_FT
YYYY-MM-DDT<time> [62A5BB90 info 'VMotionSrc (1442411082520420)'] ResolveCb: Failed with fault: (vmodl.fault.SystemError) {
--> dynamicType = <unset>,
--> faultCause = (vmodl.MethodFault) null,
--> reason = "Source detected that destination failed to resume.",
--> msg = "",
--> }
YYYY-MM-DDT<time> [62A5BB90 verbose 'VMotionSrc (1442411082520420)'] Migration changed state from MIGRATING to DONE
In the /var/log/fdm.log file on the primary ESXi host, you see entries similar to:
YYYY-MM-DDT<time> [619B6B90 verbose 'FDM'] [FdmService] New event: Event=vim.event.VmFailoverFailed vm=/vmfs/volumes/datastore/virtual_machine/virtual_machine.vmx host=host-4082 tag=host-4082:-2006118068:13
YYYY-MM-DDT<time> [61934B90 warning 'Execution'] [ExecutionCommandUpdate::Deserialize] vm /vmfs/volumes/datastore/virtual_machine/virtual_machine.vmx failed to failover on host-4082. Fault: [N3Vim5Fault29InvalidOperationOnSecondaryVmE:0xbee3a58]
YYYY-MM-DDT<time> [61934B90 verbose 'Placement'] [PlacementManagerImpl::HandleFailedPlacement(VmFailover)] Remove /vmfs/volumes/datastore/virtual_machine/virtual_machine.vmx
In the /var/log/vmkernel.log file on the primary ESXi host, you see entries similar to:
YYYY-MM-DDT<time> cpu40:30199371)VMotion: 3878: 1442411082520420 S: Stopping pre-copy: only 41 pages left to send, which can be sent within the switchover time goal of 0.500 seconds (network bandwidth ~0.005 MB/s, 52332800% t2d)
YYYY-MM-DDT<time> cpu40:30207567)VSCSI: 6227: handle 8755(vscsi0:0):Destroying Device for world 30199371 (pendCom 0)
YYYY-MM-DDT<time> cpu40:30207567)VMKStateLogger: 4756: 2150733668: Primary executing with FT id 2150733668 and vmx worldID 30207567
YYYY-MM-DDT<time> cpu16:30199378)VMotionSend: 3508: 1442411082520420 S: Sent all modified pages to destination (network bandwidth ~56.106 MB/s)
YYYY-MM-DDT<time> cpu39:30207571)WARNING: Migrate: 269: 1442411082520420 S: Failed: Failed to resume VM (0xbad0044) @0x4180113889fc
YYYY-MM-DDT<time> cpu43:30207562)WARNING: Migrate: 4998: 1442411082520420 S: Migration considered a failure by the VMX. It is most likely a timeout, but check the VMX log for the true error.
YYYY-MM-DDT<time> cpu43:30207562)WARNING: VMKStateLogger: 4041: 2150733668: The secondary VM is not responding.
In the /var/log/vpxa.log file on the primary ESXi host, you see entries similar to:
YYYY-MM-DDT<time> [61A8CB90 info 'Default' opID=########-########-##-##-##-##-dc] [VpxLRO] -- ERROR task-823027 -- -- vim.host.VMotionManager.initiateSourceEx: vmodl.fault.SystemError:
--> Result:
--> (vmodl.fault.SystemError) {
--> dynamicType = <unset>,
--> faultCause = (vmodl.MethodFault) null,
--> reason = "Source detected that destination failed to resume.",
--> msg = "A general system error occurred: Source detected that destination failed to resume."
In the /var/log/hostd.log file on the secondary ESXi host, you see entries similar to:
YYYY-MM-DDT<time> [2953BB90 info 'Libs'] Vix: [9413 foundryVMPowerOps.c:973]: FoundryVMPowerStateChangeCallback: /vmfs/volumes/datastore/virtual_machine/virtual_machine.vmx, vmx/execState/val = poweredOff.
YYYY-MM-DDT<time> [2953BB90 info 'DiskLib'] DISKLIB-DSCPTR: DescriptorDetermineType: failed to open 'VMname.vmdk': Could not find the file (600000003)
YYYY-MM-DDT<time> [2953BB90 info 'DiskLib'] DISKLIB-LINK : "virtual_machine.vmdk" : failed to open (The system cannot find the file specified).
YYYY-MM-DDT<time> [2953BB90 info 'DiskLib'] DISKLIB-CHAIN : "virtual_machine.vmdk" : failed to open (The system cannot find the file specified).
YYYY-MM-DDT<time> [2953BB90 info 'DiskLib'] DISKLIB-LIB : Failed to open 'virtual_machine.vmdk' with flags 0x15 The system cannot find the file specified (25).
YYYY-MM-DDT<time> [2953BB90 info 'Libs'] SNAPSHOT: failed to open virtual_machine.vmdk: The system cannot find the file specified (25)
In the /vmfs/volumes/datastore/virtual_machine/vmware.log file on the primary ESXi host, you see entries similar to:
YYYY-MM-DDT<time>| vcpu-0| StateLogger::FT saving on primary to create new secondary
YYYY-MM-DDT<time>| vcpu-0| StateLogger::Connection accepted, ft id 2150733668.
YYYY-MM-DDT<time>| vcpu-0| StateLogger::STATE LOGGING ENABLED (interponly 0 interpbt 0)
YYYY-MM-DDT<time>| vcpu-0| StateLogger::LOG data
YYYY-MM-DDT<time>| vcpu-0| StateLogger::USING BOUNCE BUFFERS
YYYY-MM-DDT<time>| vcpu-0| DISKLIB-VMFS : "/vmfs/volumes/datastore/virtual_machine/virtual_machine-flat.vmdk" : open successful (21) size = 64424509440, hd = 0. Type 3
YYYY-MM-DDT<time>| vcpu-0| DISKLIB-VMFS : "/vmfs/volumes/datastore/virtual_machine/virtual_machine-flat.vmdk" : closed.
YYYY-MM-DDT<time>| vcpu-0| Progress 101% (none)
YYYY-MM-DDT<time>| vcpu-0| Migrate: VM successfully stunned.
YYYY-MM-DDT<time>| vcpu-0| MigrateSetState: Transitioning from state 3 to 4.
YYYY-MM-DDT<time>| vmx| VMXVmdb_SetMigrationHostLogState: hostlog state transits to failure for migrate 'to' mid 1442411082520420
YYYY-MM-DDT<time>| vmx| MigrateSetStateFinished: type=1 new state=5
YYYY-MM-DDT<time>| vmx| MigrateSetState: Transitioning from state 4 to 5.
YYYY-MM-DDT<time>| vmx| Migrate_SetFailure: switching to new log file.
YYYY-MM-DDT<time>| vmx| Migrate_SetFailure: Now in new log file.
YYYY-MM-DDT<time>| vmx| Migrate_SetFailure: Source detected that destination failed to resume.
YYYY-MM-DDT<time>| vmx| StateLogger::Migration of primary failed (creating secondary)
YYYY-MM-DDT<time>| vmx| Migrate: Attempting to continue running on the source.
In the /vmfs/volumes/datastore/virtual_machine/vmware.log file on the secondary ESXi host, you see entries similar to:
YYYY-MM-DDT<time>| vmx| StateLogger::FT restoring to create secondary
YYYY-MM-DDT<time>| vmx| StateLogger::Connection to <IP_address> failed: failure
YYYY-MM-DDT<time>| vmx| StateLogger::Secondary couldn't connect to primary with generation 0
YYYY-MM-DDT<time>| vmx| Progress 101% (none)
YYYY-MM-DDT<time>| vmx| VMXVmdb_SetMigrationHostLogState: hostlog state transits to failure for migrate 'from' mid 1442411082520420
YYYY-MM-DDT<time>| vmx| MigrateSetStateFinished: type=2 new state=11
YYYY-MM-DDT<time>| vmx| MigrateSetState: Transitioning from state 10 to 11.
YYYY-MM-DDT<time>| vmx| Migrate_SetFailure: Failed to resume on destination.
YYYY-MM-DDT<time>| vmx| StateLogger::Migration of secondary failed (creating secondary)
YYYY-MM-DDT<time>| vmx| Msg_Post: Error
YYYY-MM-DDT<time>| vmx| [msg.checkpoint.mrestoregroup.failed] An error occurred restoring the virtual machine state during migration.
YYYY-MM-DDT<time>| vmx| [msg.checkpoint.migration.failedReceive] Failed to receive migration.
YYYY-MM-DDT<time>| vmx| ----------------------------------------
YYYY-MM-DDT<time>| vmx| Module CheckpointLate power on failed.
Note: The preceding log excerpts are examples only. Dates, times, and environment-specific values vary from environment to environment.
Environment
VMware vSphere ESXi 5.0
VMware vSphere ESXi 5.1
VMware vSphere ESXi 5.5
VMware vSphere ESXi 6.0
VMware vCenter Server 5.0.x
VMware vCenter Server 5.1.x
VMware vCenter Server 5.5.x
VMware vCenter Server 6.0.x
Cause
This issue occurs when the vmkstatelogger module for Fault Tolerance is not loaded properly on the ESXi host(s).
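To confirm whether the module is currently loaded on a host, list the VMkernel modules. A quick check using standard ESXi commands (output format may vary slightly between releases):
esxcli system module list | grep -i vmkstatelogger
vmkload_mod -l | grep -i vmkstatelogger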
Resolution
To resolve the issue, unload and then reload the vmkstatelogger module on the primary and secondary ESXi hosts.
Place the affected ESXi hosts in maintenance mode.
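If the hosts are not reachable through the vSphere Client, maintenance mode can also be entered from the command line, for example:
esxcli system maintenanceMode set --enable true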
Verify that the vmkstatelogger module is not listening for connections by running this command:
localcli network ip connection list | grep LISTEN | grep 8100
If this command returns output, another Fault Tolerance process is already running on the host. Stop that Fault Tolerance process, or migrate its virtual machines to another host, before continuing.
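To identify which world owns the listening socket, compare the World ID reported for port 8100 with the running virtual machine worlds (an illustrative check; column names differ slightly between ESXi releases):
localcli network ip connection list | grep 8100
esxcli vm process list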
Unload the vmkstatelogger module by running this command: