DOM: DOM2PCPrintDescriptor:1797: [105568173:0x4313fe8f3718] => Stuck descriptor
2022-05-31T11:42:46.065Z: [vSANCorrelator] 10605891965954us: [vob.vsan.lsom.devicerepair] vSAN device 521a74ce-####-####-####-########daf is being repaired due to I/O failures, and will be out of service until the repair is complete. If the device is part of a dedup disk group, the entire disk group will be out of service until the repair is complete.
2022-05-31T11:42:46.065Z: [vSANCorrelator] 10606062774178us: [esx.problem.vob.vsan.lsom.devicerepair] Device 521a74ce-####-####-####-########daf is in offline state and is getting repaired
2022-06-03 01:44:16,575 INFO vsandevicemonitord stderr None, stdout b"VsanUtil::ReadFromDevice: Failed to open /vmfs/devices/disks/naa.500a075128######, errno (5)\nVsanUtil::GetVsanDisks: Error occurred 'Failed to open device /vmfs/devices/disks/naa.500a075128######', create disk with null id\nVsanUtil::ReadFromDevice: Failed to open /vmfs/devices/disks/naa.500a075128######, errno (5)\nErrors: \nUnable to mount: Failed to open device /vmfs/devices/disks/naa.500a075128######\n" from command /sbin/localcli vsan storage diskgroup mount -d naa.500a075128######.
2022-06-03 01:44:16,575 INFO vsandevicemonitord Mounting failed on VSAN device naa.500a075128######.
2022-06-03 01:44:16,575 INFO vsandevicemonitord Repair attempt 131 for device 521a74ce-####-####-####-########daf
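The repair-attempt counter in the vsandevicemonitord entries above is a quick way to tell whether a device has been stuck in repair for a long time. The following is a minimal Python sketch, not a VMware tool. It assumes the log lines have already been collected into a list of strings and that they follow the "Repair attempt N for device UUID" format shown above. The function names and the `threshold` value are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Matches entries such as:
#   "... INFO vsandevicemonitord Repair attempt 131 for device <uuid>"
# The pattern is derived from the sample log lines above (an assumption about
# the format, not a documented log specification).
REPAIR_RE = re.compile(r"vsandevicemonitord Repair attempt (\d+) for device (\S+)")


def max_repair_attempts(lines):
    """Return {device_uuid: highest repair attempt number seen in lines}."""
    attempts = defaultdict(int)
    for line in lines:
        match = REPAIR_RE.search(line)
        if match:
            count, device = int(match.group(1)), match.group(2)
            attempts[device] = max(attempts[device], count)
    return dict(attempts)


def stuck_devices(lines, threshold=10):
    """List devices whose repair attempt count reached a hypothetical threshold."""
    return sorted(
        device
        for device, count in max_repair_attempts(lines).items()
        if count >= threshold
    )


if __name__ == "__main__":
    sample = [
        "2022-06-03 01:44:16,575 INFO vsandevicemonitord Mounting failed on VSAN device naa.500a075128######.",
        "2022-06-03 01:44:16,575 INFO vsandevicemonitord Repair attempt 131 for device 521a74ce-####-####-####-########daf",
    ]
    for device in stuck_devices(sample):
        print("Device repeatedly failing repair:", device)
```

A device that shows a high attempt count here has likely been in the repair state long enough for the log build-up described in this article to occur.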
VMware vSAN 7.x
Because RELOG did not run on the failed disk, PLOG entries built up, which caused congestion and latency at the VM level.
RELOG is an internal vSAN process that frees up space in the LSOM layer for log reclamation.
RELOG does not run on a device while the device remains in the repair state. If the device stays in that state for a long time, the log can build up.
The issue has been fixed in 6.7 U3 P05 and higher, and in 7.0 U3D and higher.
The behavior described above has been reported on ESXi 7.0 GA and 7.0 U1.
With the fix applied, vSAN runs RELOG on a disk that is under repair, which avoids PLOG build-up.