Running multiple Manager Services at the same time causes a race condition. This is generally caused by an error during installation or by a system administrator manually failing over the services without properly stopping the problematic node.
Note: Manager Service automatic failover was introduced in vRealize Automation 7.3.
In vRealize Automation 7.3, when an IaaS Manager Service failover occurs, the Manager Service that becomes passive cannot stop one or more of the scheduled operations it manages (for example, Data Collection), and those operations continue to run after the node has entered the passive state. As a result, multiple nodes race for the same tasks, which leaves some of those tasks in an inconsistent state.
The automatic Manager Service failover trigger for this issue is resolved in vRealize Automation 7.3.1 and later, available at VMware Downloads.
Although one specific trigger for this issue, the automatic Manager Service failover logic, is resolved in 7.3.1 and 7.4, the IaaS workflow engine can still become overloaded due to other triggers, for example if the Manager Service is manually started on the secondary IaaS Manager node when automatic failover is not enabled. In that case, the workaround below still applies.
To work around the issue: Stop the passive Manager Service and restart the active one.
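If swapping the Manager Service components does not resolve the issue of machines stuck in the Requested state, follow the SQL instructions below for vRealize Automation 7.3.
To identify the machines that are stuck in the Requested state, run the following query against the IaaS SQL database: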
SELECT vm.VirtualMachineId, vm.VirtualMachineName, vm.VirtualMachineState, vm.RecCreationTime, vm.VMCreationDate
FROM InstanceState ins
JOIN VirtualMachine vm ON (ins.uidInstanceID = vm.VirtualMachineID)
WHERE vm.VirtualMachineState = 'Requested'
ORDER BY vm.VMCreationDate DESC
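
On a busy system, the query above may also return machines that were requested only moments ago and are still progressing normally. As a rough sketch, the same query can be narrowed to machines that have stayed in the Requested state longer than a chosen threshold. The one-hour cutoff here is an illustrative assumption, not a documented value, and the comparison assumes that VMCreationDate is stored in the database server's local time (if it is stored in UTC, GETUTCDATE() would be the appropriate function). Only tables and columns that already appear in the query above are used.

```sql
-- Sketch only: lists machines that have been in the Requested state
-- for more than one hour. The one-hour threshold is an assumption;
-- adjust it to match normal provisioning times in your environment.
SELECT vm.VirtualMachineId, vm.VirtualMachineName, vm.VirtualMachineState, vm.RecCreationTime, vm.VMCreationDate
FROM InstanceState ins
JOIN VirtualMachine vm ON (ins.uidInstanceID = vm.VirtualMachineID)
WHERE vm.VirtualMachineState = 'Requested'
  -- Assumes VMCreationDate is in server local time; use GETUTCDATE() if it is UTC.
  AND vm.VMCreationDate < DATEADD(HOUR, -1, GETDATE())
ORDER BY vm.VMCreationDate DESC
```

This is a read-only SELECT and does not modify the database.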