Error "Operation aborted due to an unexpected error" when executing a Test Failover in VMware Cloud Availability

Article ID: 405831

Updated On:

Products

VMware Cloud Director

Issue/Introduction

  • The DR test (Test Failover) fails.
  • The integrity status of all VMs is red.
  • The task fails with the error "Operation aborted due to an unexpected error", and the subsequent cleanup also fails with the error "Cleanup failed".
  • The cloud.log file contains an error similar to:

com.vmware.h4.jobengine.JobExecution : Task XXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX (WorkflowInfo{type='failoverTestCleanup', resourceType='vmReplication', resourceId='C4-XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX', isPrivate=false, resourceName='<VM_NAME>'}) has failed

com.vmware.vcloud.client.exception.TimeoutException: Could not wait for the RT VM 'XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX' to refresh.
at com.vmware.h4.cloud.job.C4ReplicationBaseJob.waitRtVmCondition(C4ReplicationBaseJob.java:632)
at com.vmware.h4.cloud.job.C4ReplicationBaseJob.lambda$waitRtVmCondition$8(C4ReplicationBaseJob.java:639)
at com.vmware.h4.jobengine.JobEngine.execute(JobEngine.java:238)
at com.vmware.h4.jobengine.JobEngine.lambda$resumeLater$4(JobEngine.java:202)
at com.vmware.h4.jobengine.ThrottledExecutorService$1.run(ThrottledExecutorService.java:132)
at com.vmware.h4.common.mdc.MDCRunnableWrapper.run(MDCRunnableWrapper.java:32)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:840)
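
To check whether a given cloud.log contains this timeout, and for which RT VM IDs, a short script can scan the file for the exception message. The following is a minimal sketch, not a VMware tool. The function name find_rt_vm_timeouts is hypothetical, and the pattern is built only from the message text shown in the excerpt above, so it may need adjusting if the wording differs in other versions.

```python
import re

# Pattern derived from the log excerpt above; the RT VM ID is captured
# from between the single quotes. This is an assumption about the message
# format, not a documented log schema.
RT_VM_TIMEOUT = re.compile(
    r"TimeoutException: Could not wait for the RT VM '([^']+)' to refresh"
)


def find_rt_vm_timeouts(lines):
    """Return the RT VM IDs named in matching timeout lines, in order of appearance."""
    ids = []
    for line in lines:
        match = RT_VM_TIMEOUT.search(line)
        if match:
            ids.append(match.group(1))
    return ids
```

The function accepts any iterable of strings, so an open file object for cloud.log can be passed to it directly.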

Environment

VMware Cloud Availability 4.7.3

Cause

The RT VM is an empty VM without disks. During failover, VCDA reloads the VM in vSphere and attaches the disks, then waits for the RT VM in Cloud Director to reflect that the VM has disks.

The reverse happens during cleanup: VCDA removes the disks and waits for Cloud Director to reflect the change. The operation times out during this wait.
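
The wait described above follows a common poll-until-timeout pattern, which can be sketched as follows. This is an illustration of the general technique only, not VCDA's actual implementation. The function name wait_for_condition, its parameters, and the use of milliseconds are all assumptions made for the example.

```python
import time


def wait_for_condition(check, timeout_ms, poll_interval_ms=1000):
    """Poll check() until it returns True; raise TimeoutError once timeout_ms has elapsed."""
    # A monotonic clock is used so that system clock changes do not affect the deadline.
    deadline = time.monotonic() + timeout_ms / 1000.0
    while True:
        if check():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout_ms} ms")
        time.sleep(poll_interval_ms / 1000.0)
```

In this pattern, a condition that takes longer to become true than the timeout allows produces a timeout error even though the underlying change may complete shortly afterwards. Allowing a longer wait gives a slow change more time to be observed.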

Resolution

Workaround:

1. SSH to the Manager Appliance and edit '/opt/vmware/h4/cloud/config/application.properties'.

2. Add the following line to the file, then save it: vcd.rt.vms.condition.timeout=300000

3. Restart the cloud service (see the document "Restart the services").
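
If the property has to be applied repeatedly or on several appliances, the file edit in step 2 can be scripted so that it does not add duplicate entries. The following is a minimal sketch under stated assumptions: set_property is a hypothetical helper, not part of VCDA, and it assumes a simple key=value properties file and that Python is available wherever the script runs. Back up the file before changing it. The service restart in step 3 is still required.

```python
from pathlib import Path


def set_property(path, key, value):
    """Set key=value in a properties file, replacing an existing entry or appending a new one."""
    file = Path(path)
    lines = file.read_text(encoding="utf-8").splitlines() if file.exists() else []
    new_line = f"{key}={value}"
    replaced = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Leave commented-out entries untouched.
        if stripped.startswith(("#", "!")):
            continue
        if "=" in stripped and stripped.split("=", 1)[0].strip() == key:
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    file.write_text("\n".join(lines) + "\n", encoding="utf-8")
```

For the workaround above, the call would be set_property('/opt/vmware/h4/cloud/config/application.properties', 'vcd.rt.vms.condition.timeout', '300000'). Running it a second time leaves the file unchanged.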