Planned migration for virtual machine fails with the error "A problem occurred with the storage on datastore path '[datastore] vm_name/hbrdisk.RDID-xxxxx-xxxxx-xxxxx.vmdk'."

Products

VMware Live Recovery

Issue/Introduction

Symptoms:

Planned migration of virtual machines fails during the synchronize storage phase with the following error:

A problem occurred with the storage on datastore path '[datastore] vm_name/hbrdisk.RDID-xxxxx-xxxxx-xxxxx.vmdk'
The affected virtual machines are configured for replication using vSphere Replication.
On checking the replication status, the VM is found to be in an error state with the same error as above.
Reconfiguring the replication for the affected VM fails with the same error.
The VM contains multiple virtual disks, but the issue is reported for only one specific RDID disk.
No issues are reported on the target datastore where the replica files reside, and all other disks in the VM remain healthy and unaffected.

Environment

vSphere Replication 8.x

vSphere Replication 9.x

Cause

This issue is caused by a generic storage access failure during replication, where the NFC (Network File Copy) service is unable to open or write to a remote disk (.vmdk) file at the replication target location. Specifically, NFC error code 23 indicates a failure to open the replica disk file, which prevents both synchronization and reconfiguration of the replication.

Cause Validation

On the source ESXi host, /var/log/vmkernel.log shows errors during replication session initialization, indicating a failure to update the HBR files for a specific RDID:

2025-05-27T04:39:41.787Z cpu33:16326177)Hbr: 3410: Command: INIT_SESSION: error result=Failed gen=-1: Error for (datastoreUUID: "67b70ba7-dbbbxxxx-xxxx-000af7a5aba2"), (diskId: "RDID-afd5d3fd-xxxx-xxxx-xxxx-96a733a8a379"), (hostId: "host-20047"), (pathn$
2025-05-27T04:39:41.787Z cpu33:16326177)WARNING: Hbr: 3438: Command INIT_SESSION failed (result=Failed) (isFatal=FALSE) (Id=0) (GroupID=GID-17996dbb-xxxx-xxxx-xxxx-110c73f788a0)
2025-05-27T04:39:41.787Z cpu33:16326177)WARNING: Hbr: 5093: Failed to establish connection to [10.#.#.#]:31031 (groupID=GID-17996dbb-1c63-46c2-b41f-110c73f788a0): Failure
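When vmkernel.log is large, the failed sessions can be picked out with a short script. The following is a minimal Python sketch, not a supported tool. It assumes the WARNING line keeps the layout shown in the excerpt above; the function name failed_init_sessions and the regular expression are illustrative and not part of any VMware utility.

```python
import re

# Assumed layout, taken from the WARNING line quoted in this article:
#   ...WARNING: Hbr: 3438: Command INIT_SESSION failed (result=Failed) ... (GroupID=GID-...)
INIT_SESSION_FAILED = re.compile(
    r"Hbr: \d+: Command INIT_SESSION failed \(result=(\w+)\)"
    r".*?\(GroupID=([^)]+)\)"
)


def failed_init_sessions(lines):
    """Map each replication group ID to the result of its failed INIT_SESSION."""
    failures = {}
    for line in lines:
        match = INIT_SESSION_FAILED.search(line)
        if match:
            result, group_id = match.group(1), match.group(2)
            failures[group_id] = result
    return failures
```

The function accepts any iterable of lines, so an open file handle for a copy of vmkernel.log can be passed in directly. The group IDs it returns can then be matched against the groupID field in hbrsrv.log on the target appliance.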

On the target vSphere Replication appliance, /var/log/vmware/hbrsrv.log shows NFC error code 23, confirming the failure to open the replica disk:

2025-05-27T10:09:39.855+05:30 info hbrsrv[01962] [Originator@6876 sub=Libs groupID=GID-17996dbb-1c63-46c2-b41f-110c73f788a0 opID=hsl-393fc3ea] [NFC ERROR]NfcFssrvrProcessErrorMsg: received NFC error 23 from server: NfcFssrvrOpen: Failed to open '/vmfs/volumes/67b70ba7-dbbbd2bc-dac7-000af7a5aba2/vm_name/hbrdisk.RDID-afd5d3fd-xxxx-xxxx-xxxx-96a733a8a379.3434312.100826641431978.vmdk': The request will complete asynchronously (NFC_ASYNC)
2025-05-27T10:09:40.150+05:30 info hbrsrv[01962] [Originator@6876 sub=Main groupID=GID-17996dbb-1c63-46c2-b41f-110c73f788a0 opID=hsl-393fc3ea] HbrError for (datastoreUUID: "67b70ba7-dbbbd2bc-dac7-000af7a5aba2"), (hostId: "host-20047"), (pathname: "vm_name/hbrdisk.RDID-afd5d3fd-xxxx-xxxx-xxxx-96a733a8a379.3434312.100826641431978.vmdk"), (flags: on-disk-open, nfc-error, retriable) stack:
2025-05-27T10:09:40.150+05:30 info hbrsrv[01962] [Originator@6876 sub=Main groupID=GID-17996dbb-1c63-46c2-b41f-110c73f788a0 opID=hsl-393fc3ea] [0] Class: NFC Code: 23
2025-05-27T10:09:40.150+05:30 info hbrsrv[01962] [Originator@6876 sub=Main groupID=GID-17996dbb-1c63-46c2-b41f-110c73f788a0 opID=hsl-393fc3ea] [1] NFC error: NFC_ASYNC
2025-05-27T10:09:40.150+05:30 info hbrsrv[01962] [Originator@6876 sub=Main groupID=GID-17996dbb-1c63-46c2-b41f-110c73f788a0 opID=hsl-393fc3ea] [2] Code set to: Generic storage error.
2025-05-27T10:09:40.150+05:30 info hbrsrv[01962] [Originator@6876 sub=Main groupID=GID-17996dbb-1c63-46c2-b41f-110c73f788a0 opID=hsl-393fc3ea] [3] Set error flag: retriable
2025-05-27T10:09:40.150+05:30 info hbrsrv[01962] [Originator@6876 sub=Main groupID=GID-17996dbb-1c63-46c2-b41f-110c73f788a0 opID=hsl-393fc3ea] [4] Set error flag: nfc-error
2025-05-27T10:09:40.150+05:30 info hbrsrv[01962] [Originator@6876 sub=Main groupID=GID-17996dbb-1c63-46c2-b41f-110c73f788a0 opID=hsl-393fc3ea] [5] Can't open remote disk /vmfs/volumes/67b70ba7-dbbbd2bc-dac7-000af7a5aba2/vm_name/hbrdisk.RDID-afd5d3fd-xxxx-xxxx-xxxx-96a733a8a379.3434312.100826641431978.vmdk
2025-05-27T10:09:40.150+05:30 info hbrsrv[01962] [Originator@6876 sub=Main groupID=GID-17996dbb-1c63-46c2-b41f-110c73f788a0 opID=hsl-393fc3ea] [6] Set error flag: on-disk-open
2025-05-27T10:09:40.150+05:30 info hbrsrv[01962] [Originator@6876 sub=Main groupID=GID-17996dbb-1c63-46c2-b41f-110c73f788a0 opID=hsl-393fc3ea] [7] Attempt 1 of 4, will retry after 50 ms.
2025-05-27T10:09:40.150+05:30 info hbrsrv[01962] [Originator@6876 sub=Main groupID=GID-17996dbb-1c63-46c2-b41f-110c73f788a0 opID=hsl-393fc3ea] [8] Ignored error.

The logs confirm that the failure is isolated to one RDID disk. The open is flagged as retriable and retried (the excerpt shows attempt 1 of 4), but it fails each time, which leaves the replication in an error state and prevents the planned migration from starting.
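To check that only one disk is affected, the NFC open failures in hbrsrv.log can be counted per RDID. The sketch below is one possible way to do this in Python. It assumes the "received NFC error ... Failed to open" line keeps the layout shown in the excerpt, and that each line starts with an ISO 8601 timestamp with a UTC offset; the function name summarize_nfc_open_failures and both patterns are illustrative.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# Assumed layout, taken from the hbrsrv.log excerpt in this article.
NFC_OPEN_FAILURE = re.compile(
    r"(\S+) .*received NFC error (\d+) from server: "
    r"NfcFssrvrOpen: Failed to open '([^']+)'"
)
# RDID as it appears in the replica file name; "x" is allowed because
# the article masks parts of the identifier.
RDID = re.compile(r"RDID-[0-9a-fx]+(?:-[0-9a-fx]+){4}")


def summarize_nfc_open_failures(lines):
    """Count NFC open failures per (error code, RDID).

    Also records when each pair was first seen, converted to UTC so it
    can be compared with the UTC timestamps in vmkernel.log.
    """
    counts = Counter()
    first_seen = {}
    for line in lines:
        match = NFC_OPEN_FAILURE.search(line)
        if match is None:
            continue
        stamp, code, path = match.group(1), int(match.group(2)), match.group(3)
        disk = RDID.search(path)
        key = (code, disk.group(0) if disk else path)
        counts[key] += 1
        when = datetime.fromisoformat(stamp).astimezone(timezone.utc)
        first_seen.setdefault(key, when)
    return counts, first_seen
```

If the result contains a single key, the failures are confined to one disk, which matches the behavior described in this article. For the excerpt above, the hbrsrv.log timestamp 10:09:39 +05:30 converts to 04:39:39 UTC, about two seconds before the INIT_SESSION failure logged at 04:39:41Z on the source host.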

Resolution

In this case, the recovery cannot proceed because synchronization fails for one of the disks. To restore production, follow these steps:

1. Remove the virtual machine from the protection group

Select the Protection Groups tab, select a protection group, and on the right pane, click the Virtual Machines tab.
Right-click a virtual machine and select Remove Protection.
Click Yes to confirm the removal of protection from the virtual machine.
Right-click the virtual machine and select Remove VM.

2. Select the Recovery Plans tab, right-click the recovery plan which failed, and select Delete.

3. Select the Protection Groups tab, right-click the protection group, and select Delete.

4. Manually power on the VM at the DC site to restore production.

Once production is restored, follow these steps so that the VM can be recovered successfully:

1. Reconfigure the replication of the virtual machine to exclude the problematic disk. See "Exclude a Disk from the Replication".

2. Reconfigure the replication again to include the problematic disk, without using seeds. See "Include a Disk to the Replication".

3. Recreate the protection group and recovery plan. See "Create vSphere Replication Protection Groups".

4. Once the VM replication is in the OK state, perform a test and then initiate a planned migration.