Symptoms:
- Recovery of VMs intermittently fails after 20 minutes during reconfiguration and the subsequent power-on
- The VMs in the target site are in a shutdown state
- The virtual NICs of the affected VMs are in a disconnected state, and IP customization fails after 20 retries
- The correct port group needs to be manually assigned to the virtual NIC, and the Recovery Plan needs to be run again to overcome the failure
- As part of the SRM recovery workflow, the VM's network ID gets re-configured and updated correctly with the dvSwitch ID of the DR site
- However, during power-on, the dvSwitch ID of the VM reverts to the dvSwitch ID of the production site
- This issue does not occur if the VM is powered on on the same ESXi host where it is re-configured
- The issue does not occur if DRS in the target site is disabled or set to manual
- The issue occurs only when DRS decides to power on the VM on a host other than the one where it originally resides and where the re-configure operation happens
- This issue is seen only when ESXi is used with Nutanix AOS Stargate for NFS
- "Hostd" logs for the re-configure operation on the ESXi host in the target site:
2023-07-13T07:07:20.763Z verbose hostd[2100051] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/c9bb7c2f-35be9a24/Test/Test.vmx opID=4e7b7aa4-9162-4907-800e-4d36124733ba-failover:e94a:dde7:2670:3cb8:05dc-9f-01-75-7127 user=vpxuser:user\SRM-39c5cb64-a157-45e0-8651-8e798808a7d6] Reconfigure: (vim.vm.ConfigSpec) {
--> createDate = "2022-10-27T00:50:27.35248Z",
--> files = (vim.vm.FileInfo) {
--> vmPathName = "[]/vmfs/volumes/c9bb7c2f-35be9a24/Dummy2/Dummy2.vmx",
--> },
--> deviceChange = (vim.vm.device.VirtualDeviceSpec) [
--> (vim.vm.device.VirtualDeviceSpec) {
--> operation = "edit",
--> device = (vim.vm.device.VirtualVmxnet3) {
--> key = 4000,
--> deviceInfo = (vim.Description) {
--> label = "Network adapter 1",
--> summary = "DVSwitch: 50 00 f4 cf b3 8f 98 b4-3c e9 61 af 7c d5 25 b0"
--> },
--> backing = (vim.vm.device.VirtualEthernetCard.DistributedVirtualPortBackingInfo) {
--> port = (vim.dvs.PortConnection) {
--> switchUuid = "50 18 b9 f9 48 87 b4 36-dc 50 c0 8d ba 42 ce 6c",
--> portgroupKey = "dvportgroup-36",
--> portKey = "80",
--> connectionCookie = 2075781137
- Hostd logs during registration of the VM on the target ESXi host where DRS decides to power on the VM. The dvSwitch ID changes here and, as expected, hostd reports that the DVS cannot be found:
2023-07-13T07:07:23.529Z warning hostd[2099929] [Originator@6876 sub=Hostsvc.NetworkProvider opID=4e7b7aa4-9162-4907-800e-4d36124733ba-failover:e94a:dde7:2670:594c:ad7a-62-01-01-06-01-38-a024] GetDvsById: dvs 50 00 f4 cf b3 8f 98 b4-3c e9 61 af 7c d5 25 b0 not found
2023-07-13T07:07:23.529Z warning hostd[2099929] [Originator@6876 sub=Hostsvc.NetworkProvider opID=4e7b7aa4-9162-4907-800e-4d36124733ba-failover:e94a:dde7:2670:594c:ad7a-62-01-01-06-01-38-a024] Error getting dvs 50 00 f4 cf b3 8f 98 b4-3c e9 61 af 7c d5 25 b0 : Fault cause: vim.fault.NotFound
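
To check whether a given failover matches this signature, the two log excerpts above can be compared mechanically. The following is a minimal sketch, not a supported tool: it assumes the hostd log text from both ESXi hosts has already been collected into one string, and the function name compare_dvs_ids and the keys of its result are hypothetical. It pulls the switchUuid values written by the Reconfigure operation and the DVS IDs that GetDvsById reports as not found, then lists the IDs that were looked up without ever having been configured. Whitespace inside an ID is normalized so that hard-wrapped log lines still match.

```python
import re

# A DVS UUID as hostd prints it: 8 hex pairs, a hyphen, 8 more hex pairs.
# \s+ between pairs tolerates a line wrap in the middle of the ID.
_PAIR = r"[0-9a-f]{2}"
_UUID = rf"(?:{_PAIR}\s+){{7}}{_PAIR}\s*-\s*(?:{_PAIR}\s+){{7}}{_PAIR}"

# switchUuid = "..." inside the Reconfigure ConfigSpec dump.
RECONFIGURED = re.compile(rf'switchUuid\s*=\s*"({_UUID})"', re.IGNORECASE)
# "GetDvsById: dvs <uuid> not found" warnings from Hostsvc.NetworkProvider.
NOT_FOUND = re.compile(rf"GetDvsById:\s*dvs\s+({_UUID})\s+not\s+found", re.IGNORECASE)


def _normalize(uuid):
    """Lower-case the ID and collapse wrapped whitespace to single spaces."""
    collapsed = re.sub(r"\s+", " ", uuid.lower()).strip()
    return re.sub(r"\s*-\s*", "-", collapsed)


def compare_dvs_ids(log_text):
    """Return the configured, not-found, and mismatched DVS IDs in log_text."""
    configured = {_normalize(u) for u in RECONFIGURED.findall(log_text)}
    not_found = {_normalize(u) for u in NOT_FOUND.findall(log_text)}
    return {
        "configured": sorted(configured),
        "not_found": sorted(not_found),
        # Looked up at registration but never written by the reconfigure.
        "mismatched": sorted(not_found - configured),
    }
```

A non-empty "mismatched" list alongside a non-empty "configured" list is consistent with the behavior described above, where the VM is reconfigured with the DR site dvSwitch ID but registered on another host with a different one. It does not by itself confirm the root cause.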