In the current setup, the Recovery Plan runs four separate post power-on scripts sequentially, and each script is followed by a server reboot. The VMware Site Recovery Manager (SRM) logs show that the server reboots after the first script runs. SRM then attempts to run the next script immediately, but fails because it cannot locate the script inside the guest operating system (OS): the guest is still rebooting when SRM tries to access it.
The following vmware-dr log entries, starting at 2024-12-05T19:25:13.809+05:30, show the error:
2024-12-05T19:25:13.809+05:30 info vmware-dr[01194] [SRM@6876 sub=vmomi.soapStub[45] opID=6e3f3e24-ea2c-447e-92dc-6cebf0ee99ba-test:3ad1:ec73:6bc1:dbe5:9f45:5018:c46e] SOAP request returned HTTP failure; <SSL(<io_obj p:0x00007fda20068c20, h:40, <TCP '10.x.xx.75 : 48824'>, <TCP '10.x.xx.63 : 443'>>), /sdk>, method: initiateFileTransferFromGuest; code: 500(Internal Server Error); fault: (vim.fault.FileNotFound) { ---> SRM is unable to locate the script because the Guest OS is rebooting.
--> faultCause = (vmodl.MethodFault) null,
--> faultMessage = <unset>,
--> file = "C:\Windows\TEMP\vmware-SYSTEM\srm-vmware214\srmStdOut.log"
--> msg = "Received SOAP response fault from [<SSL(<io_obj p:0x00007fda20068c20, h:40, <TCP '10.x.xx.75 : 48824'>, <TCP '10.x.xx.63 : 443'>>), /sdk>]: initiateFileTransferFromGuest
--> File C:\Windows\TEMP\vmware-SYSTEM\srm-vmware214\srmStdOut.log was not found"
--> }
2024-12-05T19:25:13.885+05:30 error vmware-dr[01110] [SRM@6876 sub=Recovery ctxID=8cf9b2e4 opID=6e3f3e24-ea2c-447e-92dc-6cebf0ee99ba-test:3ad1:ec73:6bc1:dbe5:9f45:5018:c46e] [9b6cadea-4437-4e76-8859-6c6287b95720.postPowerOnCallouts-protected-vm-14xxxx53-POST-1] Complete with Error '(dr.recovery.fault.CalloutFailure) { --> This fault is raised when a user-defined callout (as opposed to a built-in step) returns a value other than 0. It indicates a failure in the callout process and is often associated with errors in virtual machine customization.
--> faultCause = (vmodl.MethodFault) null,
--> faultMessage = <unset>,
--> result = (dr.recovery.CalloutResult) {
--> commandLine = "cmd.exe /c "C:\Scripts\rename_computer.cmd"",
--> output = "",
--> returnValue = 1115
2024-12-05T19:25:13.886+05:30 error vmware-dr[01242] [SRM@6876 sub=Recovery ctxID=8cf9b2e4 opID=6e3f3e24-ea2c-447e-92dc-6cebf0ee99ba-test:3ad1:ec73:6bc1:dbe5:9f45] [9b6cadea-4437-4e76-8859-6c6287b95720.postPowerOnCallouts-protected-vm-142xxx3] Per VM Callout failed to complete inside guest VM: [dr.replication.ProtectedVm:3909a231-641a-4d4e-92fd-227e744125a6:protected-vm-1425353]. Error: (dr.recovery.fault.CalloutFailure) { --> SRM is reporting that the script inside the Guest OS has failed.
--> faultCause = (vmodl.MethodFault) null,
--> faultMessage = <unset>,
--> result = (dr.recovery.CalloutResult) {
--> commandLine = "cmd.exe /c "C:\Scripts\rename_computer.cmd"",
--> output = "",
--> returnValue = 1115
2024-12-05T19:25:13.888+05:30 error vmware-dr[01303] [SRM@6876 sub=Recovery ctxID=8cf9b2e4 opID=6e3f3e24-ea2c-447e-92dc-6cebf0ee99ba-test:3ad1:ec73:6bc1] [9b6cadea-4437-4e76-8859-6c6287b95720.failoverOrchJob] Failure while powering on VM W2xxx-Test [vm-2xxx3] --> This indicates that the VM is rebooting while SRM is attempting to execute the script.
2024-12-05T19:25:13.888+05:30 verbose vmware-dr[01303] [SRM@6876 sub=Default ctxID=8cf9b2e4 opID=6e3f3e24-ea2c-447e-92dc-6cebf0ee99ba-test:3ad1:ec73:6bc1] [9b6cadea-4437-4e76-8859-6c6287b95720.failoverOrchJob] Setting job failure: (dr.recovery.fault.CalloutFailure) {
--> faultCause = (vmodl.MethodFault) null,
--> faultMessage = <unset>,
--> result = (dr.recovery.CalloutResult) {
--> commandLine = "cmd.exe /c "C:\Scripts\rename_computer.cmd"",
--> output = "",
--> returnValue = 1115 --> Return value 1115 corresponds to the Windows system error ERROR_SHUTDOWN_IN_PROGRESS ("A system shutdown is in progress"), which is consistent with the guest OS shutting down for a reboot while SRM runs the script.
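When reviewing a longer vmware-dr log, it can help to pull out every failed callout together with its return value. The following is a minimal Python sketch, not an SRM tool: it assumes the log keeps the `commandLine = "..."` and `returnValue = N` layout shown above, and the function name `find_callout_failures` and the small error-code table are illustrative choices. The only code mapped is 1115, the Windows system error ERROR_SHUTDOWN_IN_PROGRESS.

```python
import re

# Assumption: a deliberately tiny lookup table; extend it as needed.
WINDOWS_ERRORS = {
    1115: "ERROR_SHUTDOWN_IN_PROGRESS (a system shutdown is in progress)",
}

# Matches the layout seen in the excerpt: commandLine = "<command>",
COMMAND_RE = re.compile(r'commandLine = "(.*)",\s*$')
RETURN_RE = re.compile(r"returnValue = (\d+)")


def find_callout_failures(log_text):
    """Return unique (command, return value, meaning) tuples in order of appearance."""
    results = []
    command = None
    for line in log_text.splitlines():
        match = COMMAND_RE.search(line)
        if match:
            command = match.group(1)
            continue
        match = RETURN_RE.search(line)
        if match and command is not None:
            code = int(match.group(1))
            entry = (command, code, WINDOWS_ERRORS.get(code, "unknown"))
            if entry not in results:  # the same fault is logged several times
                results.append(entry)
            command = None
    return results


if __name__ == "__main__":
    # Lines copied from the excerpt above (raw string keeps the backslashes).
    sample = r'''
--> commandLine = "cmd.exe /c "C:\Scripts\rename_computer.cmd"",
--> output = "",
--> returnValue = 1115
'''
    for command, code, meaning in find_callout_failures(sample):
        print(f"{command} returned {code}: {meaning}")
```

Because SRM logs the same CalloutFailure at several levels (per callout, per VM, and per job), the sketch reports each distinct command and return value only once.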
Site Recovery Manager 8.x
Site Recovery Manager 9.x
Windows
To resolve the issue, add a Post Power On step with a prompt between consecutive scripts. The prompt requires user acknowledgment before SRM proceeds with the next recovery step, and this pause gives the guest OS enough time to fully reboot and stabilize before SRM attempts to run the next script.
Steps:
Modify the recovery plan by adding a Post Power On step with a user acknowledgment prompt after each script.
Confirm that the recovery process pauses after the VM reboots and waits for the user to acknowledge the prompt.
Once the VM is online and fully rebooted, the user should click Dismiss on the prompt to resume the recovery process and execute the next script.
With a prompt between script executions, the VM can complete its reboot and become fully accessible to SRM before the next script runs, so SRM can locate the necessary scripts and complete the recovery plan without the "FileNotFound" error.
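Before dismissing a prompt, the operator needs some way to tell that the VM is back online. As an optional aid, the following Python sketch polls a TCP port on the recovered VM from an administrator workstation. It is not part of SRM, and everything in it is an assumption for illustration: the function name `wait_for_port`, the default timings, and the use of a reachable service port (for example RDP on 3389) as a sign that the guest has finished rebooting. A reachable port does not prove that VMware Tools is running, so treat the result as a hint rather than a guarantee.

```python
import socket
import time


def wait_for_port(host, port, timeout=600.0, interval=5.0,
                  required_successes=3, connect_timeout=3.0):
    """Poll a TCP port until it accepts several connections in a row.

    Returns True once `required_successes` consecutive connects succeed,
    or False if `timeout` seconds pass first. Requiring a streak reduces
    the chance of catching the guest just before it goes down for reboot.
    """
    deadline = time.monotonic() + timeout
    streak = 0
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=connect_timeout):
                streak += 1
        except OSError:
            streak = 0  # refused, unreachable, or timed out: start over
        if streak >= required_successes:
            return True
        time.sleep(interval)
    return False
```

A call such as `wait_for_port("recovered-vm.example.com", 3389)` (a hypothetical host name) blocks until the port has answered three times in a row or ten minutes have passed, after which the operator can decide whether to dismiss the prompt.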
For details, refer to "Create Message Prompts or Command Steps for Individual Virtual Machines" in the Site Recovery Manager documentation.