1. Planned migration takes a long time to complete and fails with an "Operation timed out" error.
2. The VM fails at the 'Change recovery site storage to writable' step and displays the error: Operation timed out: 900 seconds
/var/log/vmware/srm/vmware-dr.log:
2024-05-03T18:45:09.360Z error vmware-dr[39178] [SRM@6876 sub=Recovery ctxID=cbaabae9 opID=71c1c838-8f2a-4036-8a63-8f119f06ea3d-failover:b1a1] Plan 'VMRecovery' failed: N2Dr16TimeoutExceptionE Operation timed out after '900.000591' seconds
-->
[context]zKq7AVECAAQAAL/UWwEJdm13YXJlLWRyAADM6xtsaWJ2bWFjb3JlLnNvAAFZCw9saWJjb25uZWN0aW9uLWJhc2Uuc28AApg6A2xpYmZ1bmN0aW9uYWwuc28AAIKvQQDeSDUA4mE1ALCLSgOujgBsaWJwdGhyZWFkLnNvLjAABC/eD2xpYmMuc28uNgA=[/context]
2024-05-03T18:45:09.296Z error vmware-dr[39198] [SRM@6876 sub=Recovery ctxID=cbaabae9 opID=71c1838-8f2a-4036-8a63-8f119f06ea3d-failover:b1a1] Plan execution (failover workflow) failed; plan id:bb5a722b-345c-46f1-b11f-8de586c39859, plan name: VMRecovery, error: N2Dr16TimeoutExceptionE Operation timed out after '900.000591' seconds
-->
[context]zKq7AVECAAQAAL/UWwEJdm13YXJlLWRyAADM6xtsaWJ2bWFjb3JlLnNvAAFZCw9saWJjb25uZWN0aW9uLWJhc2Uuc28AApg6A2xpYmZ1bmN0aW9uYWwuc28AAIKvQQDeSDUA4mE1ALCLSgOujgBsaWJwdGhyZWFkLnNvLjAABC/eD2xpYmMuc28uNgA=[/context]
2024-05-03T18:45:09.093Z error vmware-dr[39187] [SRM@6876 sub=Recovery ctxID=cbaabae9 opID=71c8c838-8f2a-4036-8a63-8f119f06ea3d-failover:b1a1:f93f:c124] [bb5a722b-345c-46f1-b12f-8de586c39859.failoverOrchJob] Replication reported failure for VM Venus [null], the vm will not be recovered: N2Dr16TimeoutExceptionE Operation timed out after '900.000591' seconds
-->
[context]zKq7AVECAAQAAL/UWwEJdm13YXJlLWRyAADM6xtsaWJ2bWFjb3JlLnNvAAFZCw9saWJjb25uZWN0aW9uLWJhc2Uuc28AApg6A2xpYmZ1bmN0aW9uYWwuc28AAIKvQQDeSDUA4mE1ALCLSgOujgBsaWJwdGhyZWFkLnNvLjAABC/eD2xpYmMuc28uNgA=[/context]
2024-05-03T18:45:09.094Z warning vmware-dr[39187] [SRM@6876 sub=Recovery ctxID=cbaabae9 opID=71c8c838-8f2a-4036-8a63-8f119f06ea3d-failover:b1a1:f93f:c124] Replication Result contained no Infos
2024-05-03T18:45:09.095Z verbose vmware-dr[39187] [SRM@6876 sub=Default ctxID=cbaabae9 opID=71c8c838-8f2a-4036-8a63-8f119f06ea3d-failover:b1a1:f93f:c124] [bb5a722b-345c-46f1-b12f-8de586c39859.failoverOrchJob] Setting job failure: N2Dr16TimeoutExceptionE Operation timed out after '900.000591' seconds
-->
[context]zKq7AVECAAQAAL/UWwEJdm13YXJlLWRyAADM6xtsaWJ2bWFjb3JlLnNvAAFZCw9saWJjb25uZWN0aW9uLWJhc2Uuc28AApg6A2xpYmZ1bmN0aW9uYWwuc28AAIKvQQDeSDUA4mE1ALCLSgOujgBsaWJwdGhyZWFkLnNvLjAABC/eD2xpYmMuc28uNgA=[/context]
2024-05-03T18:45:09.059Z verbose vmware-dr[01385] [SRM@6876 sub=Replication opID=71c8c838-8f2a-4036-8a63-8f119f06ea3d-failover:b1a1:f93f:c124:2957] EntityFailed: Received a failure update for protected VM Id=[dr.replication.ProtectedVm:574c6e3d-4999-4261-a035-c3dc928ceb8b:protected-vm-155751], error=
--> N2Dr16TimeoutExceptionE Operation timed out after '900.000591' seconds
-->
[context]zKq7AVECAAQAAL/UWwEJdm13YXJlLWRyAADM6xtsaWJ2bWFjb3JlLnNvAAFZCw9saWJjb25uZWN0aW9uLWJhc2Uuc28AApg6A2xpYmZ1bmN0aW9uYWwuc28AAIKvQQDeSDUA4mE1ALCLSgOujgBsaWJwdGhyZWFkLnNvLjAABC/eD2xpYmMuc28uNgA=[/context]
2024-05-03T18:45:09.060Z verbose vmware-dr[01385] [SRM@6876 sub=Replication opID=71c8c838-8f2a-4036-8a63-8f119f06ea3d-failover:b1a1:f93f:c124:2957] HandleEntityCompletion: Queuing callback for completion of protected VM Id=[dr.replication.ProtectedVm:574c6e3d-4999-4261-a035-c3dc928ceb8b:protected-vm-155751]
2024-05-03T18:45:09.061Z error vmware-dr[39179] [SRM@6876 sub=Replication.VmRecoveryInterface ctxID=b9a6720f opID=71c8c838-8f2a-4036-8a63-8f119f06ea3d-failover:b1a1:f93f:c124] Protection VM 'dr.replication.ProtectedVm:protected-vm-155751' failed operation 'Recover'! There are '0' warnings for this Protection Group. Failure Reason: N2Dr16TimeoutExceptionE Operation timed out after '900.000591' seconds
-->
[context]zKq7AVECAAQAAL/UWwEJdm13YXJlLWRyAADM6xtsaWJ2bWFjb3JlLnNvAAFZCw9saWJjb25uZWN0aW9uLWJhc2Uuc28AApg6A2xpYmZ1bmN0aW9uYWwuc28AAIKvQQDeSDUA4mE1ALCLSgOujgBsaWJwdGhyZWFkLnNvLjAABC/eD2xpYmMuc28uNgA=[/context]
2024-05-03T18:45:09.051Z error vmware-dr[39178] [SRM@6876 sub=HbrProvider opID=71c8c838-8f2a-4036-8a63-8f119f06ea3d-failover:b1a1:f93f:c124:2957] HbrVmErrorFunc: Error during recovering VM vm-137301 protected VM protected-vm-155751 error
--> N2Dr16TimeoutExceptionE Operation timed out after '900.000591' seconds
-->
[context]zKq7AVECAAQAAL/UWwEJdm13YXJlLWRyAADM6xtsaWJ2bWFjb3JlLnNvAAFZCw9saWJjb25uZWN0aW9uLWJhc2Uuc28AApg6A2xpYmZ1bmN0aW9uYWwuc28AAIKvQQDeSDUA4mE1ALCLSgOujgBsaWJwdGhyZWFkLnNvLjAABC/eD2xpYmMuc28uNgA=[/context]
2024-05-03T18:45:09.049Z verbose vmware-dr[39174] [SRM@6876 sub=LocalHms HMSUM opID=71c8c838-8f2a-4036-8a63-8f119f06ea3d-failover:b1a1:f93f:c124:2957 tid=HTID-8e758ddc-883d-41e3-8eac-0216aafa022f.CreateFailoverImageExtended] Destroying filter with filter token '700'
2024-05-03T18:45:09.049Z error vmware-dr[39174] [SRM@6876 sub=HbrRecoveryEngine opID=71c8c838-8f2a-4036-8a63-8f119f06ea3d-failover:b1a1:f93f:c124:2957] StartImageCreationProcess: Unable to create a failover image for the HMS group MoId=GID-380bb735-b216-39ab-bef0-417695d9337c, name='Venus' in optimized reprotect mode
--> N2Dr16TimeoutExceptionE Operation timed out after '900.000591' seconds
-->
[context]zKq7AVECAAQAAL/UWwEJdm13YXJlLWRyAADM6xtsaWJ2bWFjb3JlLnNvAAFZCw9saWJjb25uZWN0aW9uLWJhc2Uuc28AApg6A2xpYmZ1bmN0aW9uYWwuc28AAIKvQQDeSDUA4mE1ALCLSgOujgBsaWJwdGhyZWFkLnNvLjAABC/eD2xpYmMuc28uNgA=[/context]
2024-05-03T18:45:09.050Z error vmware-dr[39174] [SRM@6876 sub=HbrRecoveryEngine opID=71c8c838-8f2a-4036-8a63-8f119f06ea3d-failover:b1a1:f93f:c124:2957] CreateImageFailed: Unable to create an image for the HMS group (MoId=GID-380bb735-b216-39ab-bef0-417695d9337c, name='Venus'), error= --> N2Dr16TimeoutExceptionE Operation timed out after '900.000591' seconds
-->
[context]zKq7AVECAAQAAL/UWwEJdm13YXJlLWRyAADM6xtsaWJ2bWFjb3JlLnNvAAFZCw9saWJjb25uZWN0aW9uLWJhc2Uuc28AApg6A2xpYmZ1bmN0aW9uYWwuc28AAIKvQQDeSDUA4mE1ALCLSgOujgBsaWJwdGhyZWFkLnNvLjAABC/eD2xpYmMuc28uNgA=[/context]
2024-05-03T18:45:09.050Z error vmware-dr[39174] [SRM@6876 sub=HbrRecoveryEngine opID=71c8c838-8f2a-4036-8a63-8f119f06ea3d-failover:b1a1:f93f:c124:2957] CreateImageFailed: Image creation for the HMS group (MoId=GID-380bb735-b216-39ab-bef0-417695d9337c, name='Venus') failed with an unknown error: --> N2Dr16TimeoutExceptionE Operation timed out after '900.000591' seconds
-->
[context]zKq7AVECAAQAAL/UWwEJdm13YXJlLWRyAADM6xtsaWJ2bWFjb3JlLnNvAAFZCw9saWJjb25uZWN0aW9uLWJhc2Uuc28AApg6A2xpYmZ1bmN0aW9uYWwuc28AAIKvQQDeSDUA4mE1ALCLSgOujgBsaWJwdGhyZWFkLnNvLjAABC/eD2xpYmMuc28uNgA=[/context]
2024-05-03T18:45:09.051Z error vmware-dr[39174] [SRM@6876 sub=HbrRecoveryEngine opID=71c8c838-8f2a-4036-8a63-8f119f06ea3d-failover:b1a1:f93f:c124:2957] GetGroupImageDone: Unable to create an image for the HMS group (MoId=GID-380bb735-b216-39ab-bef0-417695d9337c, name='Venus')
--> N2Dr16TimeoutExceptionE Operation timed out after '900.000591' seconds
-->
[context]zKq7AVECAAQAAL/UWwEJdm13YXJlLWRyAADM6xtsaWJ2bWFjb3JlLnNvAAFZCw9saWJjb25uZWN0aW9uLWJhc2Uuc28AApg6A2xpYmZ1bmN0aW9uYWwuc28AAIKvQQDeSDUA4mE1ALCLSgOujgBsaWJwdGhyZWFkLnNvLjAABC/eD2xpYmMuc28uNgA=[/context]
2024-05-03T18:45:09.051Z error vmware-dr[39174] [SRM@6876 sub=HbrRecoveryEngine opID=71c8c838-8f2a-4036-8a63-8f119f06ea3d-failover:b1a1:f93f:c124:2957] HbrGroupErrorCB: Failed to recover HMS group GID-380bb735-b216-39ab-bef0-417695d9337c error=
--> N2Dr16TimeoutExceptionE Operation timed out after '900.000591' seconds
-->
[context]zKq7AVECAAQAAL/UWwEJdm13YXJlLWRyAADM6xtsaWJ2bWFjb3JlLnNvAAFZCw9saWJjb25uZWN0aW9uLWJhc2Uuc28AApg6A2xpYmZ1bmN0aW9uYWwuc28AAIKvQQDeSDUA4mE1ALCLSgOujgBsaWJwdGhyZWFkLnNvLjAABC/eD2xpYmMuc28uNgA=[/context]
2024-05-03T18:45:09.051Z error vmware-dr[39174] [SRM@6876 sub=HbrRecoveryEngine opID=71c8c838-8f2a-4036-8a63-8f119f06ea3d-failover:b1a1:f93f:c124:2957] HbrGroupErrorCB: Calling error for VM vm-137301 in HMS group GID-380bb735-b216-39ab-bef0-417695d9337c
2024-05-03T18:45:09.051Z error vmware-dr[39174] [SRM@6876 sub=HbrProvider opID=71c8c838-8f2a-4036-8a63-8f119f06ea3d-failover:b1a1:f93f:c124:2957] HbrGroupErrorFunc: Unable to recover HMS group MoId=GID-380bb735-b216-39ab-bef0-417695d9337c, error=
--> N2Dr16TimeoutExceptionE Operation timed out after '900.000591' seconds
/var/log/vmware/hbrsrv.log:
2024-05-03T18:44:26.148Z info hbrsrv[01322] [Originator@6876 sub=StatsLog] HbrEvent: {"eventID":"consolidateProgress","groupID":"GID-380bb731-b216-39ab-bef0-417695d9337c","diskID":"RDID-1f3caa1g-71b3-37cf-a76c-f34c859e2326","percentCompleted":8,"ETA":"2024-05-03T21:49:58.148709Z","serverID":"52da2711-b7e3-de2a-a359-7cec87e1d700","hbrEvent":1}
2024-05-03T18:45:06.155Z info hbrsrv[01629] [Originator@6876 sub=StatsLog groupID=GID-380bb731-b216-39ab-bef0-417695d9337c opID=hsl-246a8da0] HbrEvent: {"eventID":"consolidateProgress","groupID":"GID-380bb731-b216-39ab-bef0-417695d9337c","percentCompleted":55,"ETA":"2024-05-03T22:08:21.155268Z","serverID":"52da2711-b7e3-de2a-a359-7cec87e1d700","hbrEvent":1}
2024-05-03T18:47:46.185Z info hbrsrv[01322] [Originator@6876 sub=StatsLog] HbrEvent: {"eventID":"consolidateProgress","groupID":"GID-380bb731-b216-39ab-bef0-417695d9337c","diskID":"RDID-1f3caa1g-71b3-37cf-a76c-f34c859e2326","percentCompleted":10,"ETA":"2024-05-03T21:19:16.185035Z","serverID":"52da2711-b7e3-de2a-a359-7cec87e1d700","hbrEvent":1}
2024-05-03T18:48:16.181Z info hbrsrv[01322] [Originator@6876 sub=StatsLog] HbrEvent: {"eventID":"consolidateProgress","groupID":"GID-380bb731-b216-39ab-bef0-417695d9337c","diskID":"RDID-d687f081-92d4-38c5-8a27-6588cf844777","percentCompleted":10,"ETA":"2024-05-04T02:19:46.180999Z","serverID":"52da2711-b7e3-de2a-a359-7cec87e1d700","hbrEvent":1}
2024-05-03T18:51:06.224Z info hbrsrv[01322] [Originator@6876 sub=StatsLog] HbrEvent: {"eventID":"consolidateProgress","groupID":"GID-380bb731-b216-39ab-bef0-417695d9337c","diskID":"RDID-1f3caa1g-71b3-37cf-a76c-f34c859e2326","percentCompleted":11,"ETA":"2024-05-03T23:49:15.224013Z","serverID":"52da2711-b7e3-de2a-a359-7cec87e1d700","hbrEvent":1}
3. Re-running the recovery immediately afterwards also fails, with the error: Object 'GID-11e66c24-996b-316e-89d8-e5f4e0cebc0b' is locked by another ongoing operation in vSphere Replication Management Server. Try again later.
For large VMs with multiple disks, merging the associated hbrdisks into their parent disks can take a long time. When this consolidation runs longer than the configured timeout (900 seconds by default), the recovery fails with an "Operation timed out" error.
Example: Base disk < hbrdisk A < hbrdisk B < hbrdisk C
In this high-level example, hbrdisk C merges into hbrdisk B, which merges into hbrdisk A, which finally merges into the base disk.
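The merge order in the example can be illustrated with a short Python sketch. This is only a toy model of the ordering described above, not how vSphere Replication implements consolidation; the function name merge_order and the list representation of the chain are assumptions made for illustration.

```python
def merge_order(chain):
    """Return (child, parent) merge steps for a disk chain listed base-first.

    Toy model only: each delta disk is merged into its immediate parent,
    starting from the newest delta and ending at the base disk.
    """
    steps = []
    # Walk from the last (newest) disk back toward the base disk.
    for i in range(len(chain) - 1, 0, -1):
        steps.append((chain[i], chain[i - 1]))
    return steps


# Chain from the example: Base disk < hbrdisk A < hbrdisk B < hbrdisk C
chain = ["base disk", "hbrdisk A", "hbrdisk B", "hbrdisk C"]
for child, parent in merge_order(chain):
    print(f"merge {child} into {parent}")
```

Each step must finish before the next one can start, which is why a long chain of large hbrdisks can keep the consolidation running well beyond the timeout.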
1. Allow some time for the consolidation process to complete, then run the Recovery Plan again; it should then succeed. To find out when the consolidation task completes, monitor hbrsrv.log on the target vSphere Replication appliance.
2. For a permanent resolution, increase the following timeout settings on both SRM sites to a value suited to your environment.
Change Remote Manager Settings
A. Configure the maximum time to wait for a remote operation to complete. The default value is 900 seconds.
B. Configure an additional timeout period for tasks to complete on the remote site. The default value is 900 seconds.
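To judge when to re-run the plan, or how large a timeout might be needed, the consolidateProgress events in hbrsrv.log can be summarized with a small script. The following is a minimal Python sketch, assuming the log lines have the same layout as the hbrsrv.log excerpt above; the function names and the DEFAULT_TIMEOUT constant are illustrative and not part of any VMware tooling. The ETA values in the log fluctuate, so treat the result as a rough estimate only.

```python
import json
import re
from datetime import datetime, timezone

DEFAULT_TIMEOUT = 900  # seconds; the default value stated in this article

# Captures the leading timestamp and the JSON payload of an HbrEvent line.
EVENT_RE = re.compile(r"^(\S+Z) .*HbrEvent: (\{.*\})\s*$")


def parse_ts(text):
    """Parse a UTC timestamp such as 2024-05-03T18:44:26.148Z."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def consolidation_progress(lines):
    """Return the latest consolidateProgress event per (groupID, diskID)."""
    latest = {}
    for line in lines:
        match = EVENT_RE.match(line)
        if not match:
            continue
        try:
            event = json.loads(match.group(2))
        except json.JSONDecodeError:
            continue  # skip truncated or malformed payloads
        if event.get("eventID") != "consolidateProgress":
            continue
        logged = parse_ts(match.group(1))
        eta = parse_ts(event["ETA"])
        # Group-level events carry no diskID, so use a placeholder key.
        key = (event["groupID"], event.get("diskID", "<group>"))
        latest[key] = {
            "percent": event["percentCompleted"],
            "seconds_left": (eta - logged).total_seconds(),
        }
    return latest


# Two lines taken from the hbrsrv.log excerpt in this article.
SAMPLE = [
    '2024-05-03T18:44:26.148Z info hbrsrv[01322] [Originator@6876 sub=StatsLog] HbrEvent: {"eventID":"consolidateProgress","groupID":"GID-380bb731-b216-39ab-bef0-417695d9337c","diskID":"RDID-1f3caa1g-71b3-37cf-a76c-f34c859e2326","percentCompleted":8,"ETA":"2024-05-03T21:49:58.148709Z","serverID":"52da2711-b7e3-de2a-a359-7cec87e1d700","hbrEvent":1}',
    '2024-05-03T18:47:46.185Z info hbrsrv[01322] [Originator@6876 sub=StatsLog] HbrEvent: {"eventID":"consolidateProgress","groupID":"GID-380bb731-b216-39ab-bef0-417695d9337c","diskID":"RDID-1f3caa1g-71b3-37cf-a76c-f34c859e2326","percentCompleted":10,"ETA":"2024-05-03T21:19:16.185035Z","serverID":"52da2711-b7e3-de2a-a359-7cec87e1d700","hbrEvent":1}',
]

progress = consolidation_progress(SAMPLE)
for (group_id, disk_id), info in progress.items():
    status = "exceeds" if info["seconds_left"] > DEFAULT_TIMEOUT else "within"
    print(f"{group_id} {disk_id}: {info['percent']}% done, "
          f"about {info['seconds_left']:.0f}s left ({status} {DEFAULT_TIMEOUT}s)")
```

To run it against a live appliance, replace SAMPLE with the lines read from /var/log/vmware/hbrsrv.log on the target vSphere Replication appliance.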