Unable to configure replication on a VM when the target vSAN cluster has reached physical capacity

Article ID: 413260

Products

VMware vSAN
VMware Live Recovery

Issue/Introduction

When starting a new replication job targeting a vSAN cluster, the following error is received:

A runtime error occurred in the vSphere Replication Management Server. Exception details 'https://<server FQDN>:8123/ invocation failed with "com.vmware.vim.vomi.core.exception.MarshallException: Missing value for managed object reference"'.

The target vSAN cluster uses deduplication and compression, and one or more of its physical disks exceeded 80% capacity, triggering a reactive rebalance during the replication attempt (this can be seen in vSAN Skyline Health or in the hosts' clomd logs).
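
To confirm this condition from an ESXi host in the target cluster, checks along these lines can be used (a sketch only; the log path and output fields may vary by ESXi/vSAN version):

  # Run the vSAN health checks from the host:
  esxcli vsan health cluster list

  # Look for reactive rebalance activity in the clomd log:
  grep -i "REACTIVE_REBALANCE" /var/log/clomd.log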

Environment

vSphere Replication (all versions)
vSAN OSA (all versions)

Cause

  • During the initial replication attempt, the vSAN namespace (VM directory) creation on the target cluster succeeded, but the replication failed due to a space issue.
  • The delete operation to clean up the namespace after the failure hung and entered an incomplete state.
  • On each subsequent replication attempt, the operation loops: creation of the directory is attempted, the directory is found to already exist, deletion is attempted, the directory is then not found, and the cycle repeats.

See the Additional Information section for related log entries.

Resolution

To prevent this issue, monitor space consumption and keep cluster utilization below 80%.
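
For example, per-disk utilization on each host can be reviewed against the 80% threshold (a sketch; exact field names in the output vary by vSAN version):

  # List the physical disks on this host with their capacity and usage:
  esxcli vsan debug disk list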

To resolve this problem once encountered:

  • Identify the VM name involved in the failure.
  • Identify the VM namespace using esxcli vsan debug object list | less, searching for the VM name to find the associated vmdk name and path (a command sketch follows this list):
    Path:
      /vmfs/volumes/vsan:528d2ba6fb8e2599-############/<Namespace UUID>/<VM Name>.vmdk
  • Search the same output for the Namespace UUID from the path to identify the Owning Host of the namespace object:
    Object UUID:
      1f29ec62-ce81-6389-87cc-############
    Version:
      15
    Owner:
      <Owning Host Name>
  • Place the Owning Host into maintenance mode and reboot it.
  • Retry replication for the VM.
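
A minimal command sketch of the identification steps above (illustrative only; <VM Name> and <Namespace UUID> are placeholders to substitute, and output layout may vary by vSAN version):

  # Find the vmdk path for the affected VM; the directory component of
  # the path is the Namespace UUID:
  esxcli vsan debug object list | grep -i "<VM Name>"

  # Display the namespace object entry, including its Owner, by UUID:
  esxcli vsan debug object list | grep -A 5 "<Namespace UUID>"

  # On the Owning Host, enter maintenance mode before rebooting
  # (ensureObjectAccessibility keeps vSAN objects available during the reboot):
  esxcli system maintenanceMode set -e true -m ensureObjectAccessibility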

Additional Information

clomd log from a host showing a reactive rebalance in progress that is unable to move data because of capacity utilization across disks:

2025-10-06T14:57:20.147Z info clomd[209####] [Originator@6876 opID=180587####] CLOMProcessWorkItem: Op REACTIVE_REBALANCE starts:180587####
2025-10-06T14:57:20.155Z info clomd[209####] [Originator@6876 opID=180587####] CLOMReconfigure: Reconfiguring 90de3163-0c08-3f24-de45-############ workItem type REACTIVE_REBALANCE
2025-10-06T14:57:20.174Z error clomd[209####] [Originator@6876 opID=180587####] CLOMBalance_CheckMoveGoodnessV2:  Failed to move comp 90de3163-f01d-312b-d276-############:52dd7cfe-6475-237f-1d6d-############ size: 88534081863 initSD: 0.028839, finSD: 0.077276 srcFF 0.885608 destFF 0.939870
2025-10-06T14:57:20.174Z warning clomd[209####] [Originator@6876 opID=180587####] CLOMBalance_CheckMoveGoodnessV2: Unsetting disk assignment!
2025-10-06T14:57:20.174Z error clomd[209####] [Originator@6876 opID=180587####] CLOMReplaceComponentsWork: Partial fix supported but failed to find any successful fixes

hostd log from a target-side host:

2025-10-06T15:00:51.774Z info hostd[2103275] [Originator@6876 sub=Vimsvc.TaskManager opID=cc1ac5a6-a746-4d10-82e5-############-###-#######-##-##-#### user=vpxuser:<domain>\<user>] Task Created : haTask--vim.VirtualDiskManager.deleteVirtualDisk-########
2025-10-06T15:00:51.777Z info hostd[2469409] [Originator@6876 sub=Libs opID=cc1ac5a6-a746-4d10-82e5-############-###-#######-##-##-#### user=vpxuser:<domain>\<user>] [NFC INFO]Nfc_NewClientV3: Successfully created a new local client. Client name : UnknownClient OpId : UnknownOpId
2025-10-06T15:00:51.778Z info hostd[2134173] [Originator@6876 sub=OsfsClient opID=cc1ac5a6-a746-4d10-82e5-############-###-#######-##-##-#### user=vpxuser:<domain>\<user>] Delete VMDKs in /vmfs/volumes/vsan:<Datastore UUID>/<VM Name>/<VM Name>_1.vmdk
2025-10-06T15:00:51.778Z error hostd[2134173] [Originator@6876 sub=OsfsClient opID=cc1ac5a6-a746-4d10-82e5-############-###-#######-##-##-#### user=vpxuser:<domain>\<user>] Object search failed: N7Vmacore24InvalidArgumentExceptionE(Invalid argument)
--> ==[/context]
2025-10-06T15:00:51.779Z info hostd[2134173] [Originator@6876 sub=OsfsClient opID=cc1ac5a6-a746-4d10-82e5-############-###-#######-##-##-#### user=vpxuser:<domain>\<user>] 0 vmdks found
2025-10-06T15:00:51.779Z info hostd[2134173] [Originator@6876 sub=OsfsClient opID=cc1ac5a6-a746-4d10-82e5-############-###-#######-##-##-#### user=vpxuser:<domain>\<user>] Done deleting objects in directory: /vmfs/volumes/vsan:<Datastore UUID>/<VM Name>/<VM Name>_1.vmdk
2025-10-06T15:00:51.780Z error hostd[2134173] [Originator@6876 sub=Nfcsvc opID=cc1ac5a6-a746-4d10-82e5-############-###-#######-##-##-#### user=vpxuser:<domain>\<user>] file delete force (/vmfs/volumes/vsan:<Datastore UUID>/<VM Name>/<VM Name>_1.vmdk -> /vmfs/volumes/vsan:<Datastore UUID>/<VM Name>/<VM Name>_1.vmdk) by 'vpxuser' ended with error code'16'
2025-10-06T15:00:51.780Z info hostd[2134173] [Originator@6876 sub=Vimsvc.TaskManager opID=cc1ac5a6-a746-4d10-82e5-############-###-#######-##-##-#### user=vpxuser:<domain>\<user>] Task Completed : haTask--vim.VirtualDiskManager.deleteVirtualDisk-######## Status error
2025-10-06T15:00:51.780Z info hostd[2134173] [Originator@6876 sub=Vimsvc.ha-eventmgr opID=cc1ac5a6-a746-4d10-82e5-############-HMS-3238001-ec-d6-a640] Event 2653530 : Deletion of file or directory /vmfs/volumes/vsan:<Datastore UUID>/<VM Name>/<VM Name>_1.vmdk from <Datastore Name> was initiated from 'VMware-client/[email protected]' and completed with status 'Failure'
2025-10-06T15:00:52.786Z info hostd[2103229] [Originator@6876 sub=Libs opID=cc1ac5a6-a746-4d10-82e5-############-HMS-3238001-ec-d6-a640] VsanFileSystemImpl: vSAN datastore cid <Datastore UUID>, aid <Datastore UUID> totalRawCapacity: 272693449945088, usedRawCapacity: 237636238433119
2025-10-06T15:00:52.787Z info hostd[2103229] [Originator@6876 sub=Hostsvc.DatastoreSystem opID=cc1ac5a6-a746-4d10-82e5-############-HMS-3238001-ec-d6-a640] RefreshVdiskDatastores: Done refreshing datastores.
2025-10-06T15:00:53.506Z info hostd[2103241] [Originator@6876 sub=Vimsvc.TaskManager opID=cc1ac5a6-a746-4d10-82e5-############-HMS-########-##-##-#### user=vpxuser:VSPHERE.LOCAL\com.vmware.vr-sa-15b51c02-011d-48c9-97ee-############] Task Created : haTask--vim.DatastoreNamespaceManager.DeleteDirectory-########
2025-10-06T15:00:53.508Z info hostd[5609260] [Originator@6876 sub=AdapterServer opID=cc1ac5a6-a746-4d10-82e5-############-HMS-########-##-##-#### user=vpxuser:VSPHERE.LOCAL\com.vmware.vr-sa-15b51c02-011d-48c9-97ee-############] AdapterServer caught exception; <<52360b09-1153-7af9-757d-############, <TCP '127.0.0.1 : 8307'>, <TCP '127.0.0.1 : 41394'>>, ha-datastore-namespace-manager, vim.DatastoreNamespaceManager.DeleteDirectory>, N3Vim5Fault12FileNotFound9ExceptionE(Fault cause: vim.fault.FileNotFound
--> )
--> [context] ==[/context]
2025-10-06T15:00:53.509Z info hostd[5609260] [Originator@6876 sub=Vimsvc.TaskManager opID=cc1ac5a6-a746-4d10-82e5-############-HMS-########-##-##-#### user=vpxuser:VSPHERE.LOCAL\com.vmware.vr-sa-15b51c02-011d-48c9-97ee-############] Task Completed : haTask--vim.DatastoreNamespaceManager.DeleteDirectory-65379901 Status error
2025-10-06T15:00:53.509Z info hostd[5609260] [Originator@6876 sub=Solo.Vmomi opID=cc1ac5a6-a746-4d10-82e5-############-HMS-########-##-##-#### user=vpxuser:VSPHERE.LOCAL\com.vmware.vr-sa-15b51c02-011d-48c9-97ee-############] Activation finished; <<52360b09-1153-7af9-757d-############, <TCP '127.0.0.1 : 8307'>, <TCP '127.0.0.1 : 41394'>>, ha-datastore-namespace-manager, vim.DatastoreNamespaceManager.DeleteDirectory>
2025-10-06T15:00:53.509Z verbose hostd[5609260] [Originator@6876 sub=Solo.Vmomi opID=cc1ac5a6-a746-4d10-82e5-############-HMS-########-##-##-#### user=vpxuser:VSPHERE.LOCAL\com.vmware.vr-sa-15b51c02-011d-48c9-97ee-############] Arg datacenter:
--> (null)
2025-10-06T15:00:53.509Z verbose hostd[5609260] [Originator@6876 sub=Solo.Vmomi opID=cc1ac5a6-a746-4d10-82e5-############-HMS-########-##-##-#### user=vpxuser:VSPHERE.LOCAL\com.vmware.vr-sa-15b51c02-011d-48c9-97ee-############] Arg datastorePath:
--> "/vmfs/volumes/vsan:<Datastore UUID>/<VM Name>"
2025-10-06T15:00:53.509Z info hostd[5609260] [Originator@6876 sub=Solo.Vmomi opID=cc1ac5a6-a746-4d10-82e5-############-HMS-########-##-##-#### user=vpxuser:VSPHERE.LOCAL\com.vmware.vr-sa-15b51c02-011d-48c9-97ee-############] Throw vim.fault.FileNotFound
2025-10-06T15:00:53.509Z info hostd[5609260] [Originator@6876 sub=Solo.Vmomi opID=cc1ac5a6-a746-4d10-82e5-############-HMS-########-##-##-#### user=vpxuser:VSPHERE.LOCAL\com.vmware.vr-sa-15b51c02-011d-48c9-97ee-############] Result:
--> (vim.fault.FileNotFound) {
-->    file = "/vmfs/volumes/vsan:<Datastore UUID>/<VM Name>",
-->    msg = "",
--> }

HMS log from the target replication appliance (/opt/vmware/hms/logs/hms.log):

2025-10-06 15:04:36.515 ERROR com.vmware.hms.TaskRunnable [hms-main-thread-#####] (..jvsl.util.Slf4jUtil) [operationID=cb692270-e1d0-4603-8ae7-############-###-#######,sessionID=30F7####, operationID=cb692270-e1d0-4603-8ae7-############-###-#######,sessionID=30F7####, task=HTID-46d19301-7f3c-4ec9-bcb8-############] | runTask-failed name: "Configure Replication Secondary"; class: com.vmware.hms.job.impl.SecondaryVc2VcConfigureReplicationWorkflow; groupMoId: GID-03f6eb01-0194-4993-9355-############; hbrTag: null; err: com.vmware.vim.binding.hms.fault.UnableToCreateDiskFault; time: 10557 ms
com.vmware.vim.binding.hms.fault.UnableToCreateDiskFault: com.vmware.vim.binding.hms.fault.UnableToCreateDiskFault

2025-10-06 15:04:37.002 DEBUG com.vmware.hms.replication.secondaryGroup [#####-#] (..hms.replication.SecondaryGroupImpl) [operationID=cb692270-e1d0-4603-8ae7-############-###-#######,sessionID=30F74440] | remoteSetConfigurationState(GID-03f6eb01-0194-4993-9355-############, (hms.VersionedConfigurationState) {
   dynamicType = null,
   dynamicProperty = null,
   configurationState = error,
   version = 2
}, (hms.fault.UnableToCreateDiskFault) {
   faultCause = (vim.fault.FileNotFound) {
      faultCause = null,
      faultMessage = null,
      file = [<Datastore Name>] <Namespace UUID>/<VM Name>-000001.vmdk
   },
   faultMessage = null,
   originalMessage = null,
   groupName = <VM Name>,
   deviceKey = 2000,
   sourceDiskFileName = [Source Datastore Name] <Namespace UUID>/<VM Name>-000001.vmdk,
   diskFile = [<Datastore Name>]<Namespace UUID>/<VM Name>-000001.vmdk
}) : (hms.VersionedStatus) {
   dynamicType = null,
   dynamicProperty = null,
   status = error,
   version = 1
}