Enabling Fault Tolerance in vCenter Server 5.x and 6.0 on a virtual machine fails with the error: A general system error occurred: Source detected that destination failed to resume

Article ID: 344659

Products

VMware vCenter Server
VMware vSphere ESXi

Issue/Introduction

Symptoms:
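Disclaimer: This article is a translation of Enabling Fault Tolerance in vCenter Server 5.x and 6.0 on a virtual machine fails with the error: A general system error occurred: Source detected that destination failed to resume (2134015). Although we continually strive to provide the best translation of this article, localized content may become out of date. For the most current content, see the English version.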


  • When you enable Fault Tolerance on a virtual machine, you see the following error:

    A general system error occurred: Source detected that destination failed to resume

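  • The secondary virtual machine is displayed as Unprotected.
  • Running the following command on the ESXi host to check whether vmkstatelogger is listening for connections returns no results: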

    localcli network ip connection list | grep LISTEN | grep 8100

  • In the /var/log/hostd.log file on the primary ESXi host, you see entries similar to:

    YYYY-MM-DDT<time> [62140B90 info 'ha-eventmgr'] Event 3106 : Fault Tolerance state of VMname on host 'Hostname' in cluster 'ClusterName' in ha-datacenter changed from Starting to Need Secondary VM
    YYYY-MM-DDT<time> [62140B90 verbose 'vm:/vmfs/volumes/datastore/virtual_machine/virtual_machine.vmx'] Need secondary reason: starting
    YYYY-MM-DDT<time> [62A5BB90 verbose 'vm:/vmfs/volumes/datastore/virtual_machine/virtual_machine.vmx'] VMotionStatusCb: Completed
    YYYY-MM-DDT<time> [62A5BB90 verbose 'vm:/vmfs/volumes/datastore/virtual_machine/virtual_machine.vmx'] VMotionResolveCheck: Firing ResolveCb
    YYYY-MM-DDT<time> [62A5BB90 info 'VMotionSrc (1442411082520420)'] ResolveCb: VMX reports needsUnregister = false for migrateType MIGRATE_TYPE_FT
    YYYY-MM-DDT<time> [62A5BB90 info 'VMotionSrc (1442411082520420)'] ResolveCb: Failed with fault: (vmodl.fault.SystemError) {
    --> dynamicType = <unset>,
    --> faultCause = (vmodl.MethodFault) null,
    --> reason = "Source detected that destination failed to resume.",
    --> msg = "",
    --> }
    YYYY-MM-DDT<time> [62A5BB90 verbose 'VMotionSrc (1442411082520420)'] Migration changed state from MIGRATING to DONE

  • In the /var/log/fdm.log file on the primary ESXi host, you see entries similar to:

    YYYY-MM-DDT<time> [619B6B90 verbose 'FDM'] [FdmService] New event: Event=vim.event.VmFailoverFailed vm=/vmfs/volumes/datastore/virtual_machine/virtual_machine.vmx host=host-4082 tag=host-4082:-2006118068:13
    YYYY-MM-DDT<time> [61934B90 warning 'Execution'] [ExecutionCommandUpdate::Deserialize] vm /vmfs/volumes/datastore/virtual_machine/virtual_machine.vmx failed to failover on host-4082. Fault: [N3Vim5Fault29InvalidOperationOnSecondaryVmE:0xbee3a58]
    YYYY-MM-DDT<time> [61934B90 verbose 'Placement'] [PlacementManagerImpl::HandleFailedPlacement(VmFailover)] Remove /vmfs/volumes/datastore/virtual_machine/virtual_machine.vmx

  • In the /var/log/vmkernel.log file on the primary ESXi host, you see entries similar to:

    YYYY-MM-DDT<time> cpu40:30199371)VMotion: 3878: 1442411082520420 S: Stopping pre-copy: only 41 pages left to send, which can be sent within the switchover time goal of 0.500 seconds (network bandwidth ~0.005 MB/s, 52332800% t2d)
    YYYY-MM-DDT<time> cpu40:30207567)VSCSI: 6227: handle 8755(vscsi0:0):Destroying Device for world 30199371 (pendCom 0)
    YYYY-MM-DDT<time> cpu40:30207567)VMKStateLogger: 4756: 2150733668: Primary executing with FT id 2150733668 and vmx worldID 30207567
    YYYY-MM-DDT<time> cpu16:30199378)VMotionSend: 3508: 1442411082520420 S: Sent all modified pages to destination (network bandwidth ~56.106 MB/s)
    YYYY-MM-DDT<time> cpu39:30207571)WARNING: Migrate: 269: 1442411082520420 S: Failed: Failed to resume VM (0xbad0044) @0x4180113889fc
    YYYY-MM-DDT<time> cpu43:30207562)WARNING: Migrate: 4998: 1442411082520420 S: Migration considered a failure by the VMX. It is most likely a timeout, but check the VMX log for the true error.
    YYYY-MM-DDT<time> cpu43:30207562)WARNING: VMKStateLogger: 4041: 2150733668: The secondary VM is not responding.

  • In the /var/log/vpxa.log file on the primary ESXi host, you see entries similar to:

    YYYY-MM-DDT<time> [61A8CB90 info 'Default' opID=013B99DF-000005A8-1d-95-c2-ff-dc] [VpxLRO] -- ERROR task-823027 -- -- vim.host.VMotionManager.initiateSourceEx: vmodl.fault.SystemError:
    --> Result:
    --> (vmodl.fault.SystemError) {
    --> dynamicType = <unset>,
    --> faultCause = (vmodl.MethodFault) null,
    --> reason = "Source detected that destination failed to resume.",
    --> msg = "A general system error occurred: Source detected that destination failed to resume."

  • In the /var/log/hostd.log file on the secondary ESXi host, you see entries similar to:

    YYYY-MM-DDT<time> [2953BB90 info 'Libs'] Vix: [9413 foundryVMPowerOps.c:973]: FoundryVMPowerStateChangeCallback: /vmfs/volumes/datastore/virtual_machine/virtual_machine.vmx, vmx/execState/val = poweredOff.
    YYYY-MM-DDT<time> [2953BB90 info 'DiskLib'] DISKLIB-DSCPTR: DescriptorDetermineType: failed to open 'VMname.vmdk': Could not find the file (600000003)
    YYYY-MM-DDT<time> [2953BB90 info 'DiskLib'] DISKLIB-LINK : "virtual_machine.vmdk" : failed to open (The system cannot find the file specified).
    YYYY-MM-DDT<time> [2953BB90 info 'DiskLib'] DISKLIB-CHAIN : "virtual_machine.vmdk" : failed to open (The system cannot find the file specified).
    YYYY-MM-DDT<time> [2953BB90 info 'DiskLib'] DISKLIB-LIB : Failed to open 'virtual_machine.vmdk' with flags 0x15 The system cannot find the file specified (25).
    YYYY-MM-DDT<time> [2953BB90 info 'Libs'] SNAPSHOT: failed to open virtual_machine.vmdk: The system cannot find the file specified (25)

  • In the /vmfs/volumes/datastore/virtual_machine/vmware.log file on the primary ESXi host, you see entries similar to:

    YYYY-MM-DDT<time>| vcpu-0| StateLogger::FT saving on primary to create new secondary
    YYYY-MM-DDT<time>| vcpu-0| StateLogger::Connection accepted, ft id 2150733668.
    YYYY-MM-DDT<time>| vcpu-0| StateLogger::STATE LOGGING ENABLED (interponly 0 interpbt 0)
    YYYY-MM-DDT<time>| vcpu-0| StateLogger::LOG data
    YYYY-MM-DDT<time>| vcpu-0| StateLogger::USING BOUNCE BUFFERS
    YYYY-MM-DDT<time>| vcpu-0| DISKLIB-VMFS : "/vmfs/volumes/datastore/virtual_machine/virtual_machine-flat.vmdk" : open successful (21) size = 64424509440, hd = 0. Type 3
    YYYY-MM-DDT<time>| vcpu-0| DISKLIB-VMFS : "/vmfs/volumes/datastore/virtual_machine/virtual_machine-flat.vmdk" : closed.
    YYYY-MM-DDT<time>| vcpu-0| Progress 101% (none)
    YYYY-MM-DDT<time>| vcpu-0| Migrate: VM successfully stunned.
    YYYY-MM-DDT<time>| vcpu-0| MigrateSetState: Transitioning from state 3 to 4.
    YYYY-MM-DDT<time>| vmx| VMXVmdb_SetMigrationHostLogState: hostlog state transits to failure for migrate 'to' mid 1442411082520420
    YYYY-MM-DDT<time>| vmx| MigrateSetStateFinished: type=1 new state=5
    YYYY-MM-DDT<time>| vmx| MigrateSetState: Transitioning from state 4 to 5.
    YYYY-MM-DDT<time>| vmx| Migrate_SetFailure: switching to new log file.
    YYYY-MM-DDT<time>| vmx| Migrate_SetFailure: Now in new log file.
    YYYY-MM-DDT<time>| vmx| Migrate_SetFailure: Source detected that destination failed to resume.
    YYYY-MM-DDT<time>| vmx| StateLogger::Migration of primary failed (creating secondary)
    YYYY-MM-DDT<time>| vmx| Migrate: Attempting to continue running on the source.

  • In the /vmfs/volumes/datastore/virtual_machine/vmware.log file on the secondary ESXi host, you see entries similar to:

    YYYY-MM-DDT<time>| vmx| StateLogger::FT restoring to create secondary
    YYYY-MM-DDT<time>| vmx| StateLogger::Connection to <IP_address> failed: failure
    YYYY-MM-DDT<time>| vmx| StateLogger::Secondary couldn't connect to primary with generation 0
    YYYY-MM-DDT<time>| vmx| Progress 101% (none)
    YYYY-MM-DDT<time>| vmx| VMXVmdb_SetMigrationHostLogState: hostlog state transits to failure for migrate 'from' mid 1442411082520420
    YYYY-MM-DDT<time>| vmx| MigrateSetStateFinished: type=2 new state=11
    YYYY-MM-DDT<time>| vmx| MigrateSetState: Transitioning from state 10 to 11.
    YYYY-MM-DDT<time>| vmx| Migrate_SetFailure: Failed to resume on destination.
    YYYY-MM-DDT<time>| vmx| StateLogger::Migration of secondary failed (creating secondary)
    YYYY-MM-DDT<time>| vmx| Msg_Post: Error
    YYYY-MM-DDT<time>| vmx| [msg.checkpoint.mrestoregroup.failed] An error occurred restoring the virtual machine state during migration.
    YYYY-MM-DDT<time>| vmx| [msg.checkpoint.migration.failedReceive] Failed to receive migration.
    YYYY-MM-DDT<time>| vmx| ----------------------------------------
    YYYY-MM-DDT<time>| vmx| Module CheckpointLate power on failed.

    Note: The preceding log excerpts are only examples. Dates, times, and environmental variables may vary depending on your environment.
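
To narrow down which of the logs above actually contain the failure, the message text can be searched for directly. The following is a minimal sketch, not a supported tool: the function name scan_ft_failure and the example file list are illustrative assumptions. It counts lines containing the string "Source detected that destination failed to resume", which appears in the hostd.log, vpxa.log, and primary vmware.log excerpts. The vmkernel.log excerpt uses a different message ("Failed to resume VM"), so it would not match this pattern.

```shell
#!/bin/sh
# Sketch: count lines containing the FT failure message in each log file.
# scan_ft_failure is a hypothetical helper name, not a VMware command.

FT_SIGNATURE='Source detected that destination failed to resume'

# Print "<file>: <count>" for each file given as an argument.
scan_ft_failure() {
    for log_file in "$@"; do
        if [ -r "$log_file" ]; then
            # grep -c prints the number of matching lines; it exits non-zero
            # when that number is 0, so "|| true" keeps the loop going.
            count=$(grep -c -F -- "$FT_SIGNATURE" "$log_file" || true)
            echo "$log_file: $count"
        else
            echo "$log_file: not readable"
        fi
    done
}

# Example on the primary host (paths taken from the excerpts above):
# scan_ft_failure /var/log/hostd.log /var/log/vpxa.log
```

A non-zero count only confirms that the message is present. The other symptoms listed above still need to match before concluding that this article applies.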


Environment

VMware vSphere ESXi 5.1
VMware vSphere ESXi 6.0
VMware vCenter Server 6.0.x
VMware vSphere ESXi 5.5
VMware vCenter Server 5.0.x
VMware vCenter Server 5.5.x
VMware vSphere ESXi 5.0
VMware vCenter Server 5.1.x

Cause

This issue occurs when the vmkstatelogger module for Fault Tolerance is not loaded correctly on the ESXi host.
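
As a rough first check, the list of loaded VMkernel modules can be inspected for vmkstatelogger. The sketch below makes three assumptions: that `vmkload_mod -l` prints one module per line with the module name in the first column, that awk is available in the ESXi shell, and that the hypothetical helper name module_listed is acceptable. A module that appears in the list may still not be working, so the port 8100 listener check described in this article remains the more direct test.

```shell
#!/bin/sh
# Sketch: succeed if the named module appears in the first column of
# module-list output supplied on standard input.
# module_listed is a hypothetical helper name, not a VMware command.
module_listed() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Example on an ESXi host (assumes the list format described above):
# vmkload_mod -l | module_listed vmkstatelogger && echo "vmkstatelogger is listed"
```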

Resolution

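To resolve this issue, unload and then reload the vmkstatelogger module on both the primary and the secondary ESXi hosts.
  1. Place the affected ESXi host into maintenance mode.
  2. Verify that the vmkstatelogger module is not listening for connections by running this command: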

    localcli network ip connection list | grep LISTEN | grep 8100

    If this command returns output, another Fault Tolerance process is running. You must stop that Fault Tolerance process or move the virtual machine to another host.

  3. Unload the vmkstatelogger module by running this command:

    vmkload_mod -u vmkstatelogger

  4. Reload the module by running this command:

    vmkload_mod vmkstatelogger
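
Steps 2 through 4 can be combined into one guarded sequence so that the module is never unloaded while something is still listening on port 8100. This is a minimal sketch under stated assumptions: the host is already in maintenance mode (step 1 is not automated here), and the function names ft_listener_present and reload_vmkstatelogger are hypothetical. The commands themselves are the ones given in the steps above.

```shell
#!/bin/sh
# Sketch of steps 2-4. Assumes the host is already in maintenance mode.
# Function names are illustrative; only the commands come from the article.

# Reads connection-list output on standard input and succeeds if a line
# matches both LISTEN and 8100 (the same filter as step 2).
ft_listener_present() {
    grep LISTEN | grep -q 8100
}

reload_vmkstatelogger() {
    # Step 2: refuse to continue while a listener on port 8100 exists.
    if localcli network ip connection list | ft_listener_present; then
        echo "Port 8100 is in use: stop the Fault Tolerance process or move the virtual machine first." >&2
        return 1
    fi
    # Step 3: unload the module; stop if the unload fails.
    vmkload_mod -u vmkstatelogger || return 1
    # Step 4: load the module again.
    vmkload_mod vmkstatelogger
}

# Example (run on the primary host, then on the secondary host):
# reload_vmkstatelogger
```

Because the filter is the plain `grep 8100` from step 2, it also matches any other line that happens to contain 8100, so a refusal is worth confirming by reading the raw command output.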


Additional Information

Enabling Fault Tolerance in vCenter Server 5.x and 6.0 on a virtual machine fails with the error: A general system error occurred: Source detected that destination failed to resume