Absent Disk removal via vSphere UI or command line fails with error "A general system error occurred: Sysinfo error on operation returned status : Not found. Please see the VMkernel log for detailed error information"
Example:
Name,Drive Type,Disk Tier,Capacity,Virtual SAN Health Status,State,Transport Type,Adapter
Absent VSAN Disk (VSAN UUID:12345678-1234-1234-1234-19ooo265####),HDD,Capacity,0.00 B,--,Dead or Error,
The disk has been replaced, but the failed disk still appears in the disk group list as a UUID when running the vdq -iH command on the ESXi host.
# vdq -iH
Mappings:
DiskMapping[0]:
SSD: naa.60000000000000000000000000000000
MD: naa.60000000000000000000000000000001
MD: naa.60000000000000000000000000000002
MD: naa.60000000000000000000000000000003
MD: naa.60000000000000000000000000000004
MD: naa.60000000000000000000000000000005
MD: naa.60000000000000000000000000000006
MD: 12345678-1234-1234-1234-19ooo265#### ------> Failed disk entry
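On a host with many disk groups, the stale entry can be spotted mechanically. The following is a minimal Python sketch, not a VMware tool: it scans saved `vdq -iH` output and reports any SSD/MD entry whose identifier does not start with a usual device-name prefix. The prefix list and the function name `find_uuid_entries` are assumptions made for illustration.

```python
import re

# Assumption: healthy devices are listed by a canonical device name with one
# of these prefixes; anything else (such as a bare UUID) is reported.
KNOWN_PREFIXES = ("naa.", "eui.", "t10.", "mpx.")

def find_uuid_entries(vdq_output):
    """Return (role, identifier) pairs that are not listed by device name."""
    suspects = []
    for line in vdq_output.splitlines():
        # Match lines such as "MD: naa.6000..." and keep only the first token,
        # so trailing annotations on the line are ignored.
        match = re.match(r"\s*(SSD|MD):\s*(\S+)", line)
        if match and not match.group(2).startswith(KNOWN_PREFIXES):
            suspects.append((match.group(1), match.group(2)))
    return suspects

sample_vdq = """Mappings:
DiskMapping[0]:
SSD: naa.60000000000000000000000000000000
MD: naa.60000000000000000000000000000001
MD: 12345678-1234-1234-1234-19ooo265####
"""

for role, identifier in find_uuid_entries(sample_vdq):
    print(role, identifier)
```

Any identifier reported this way is only a candidate; confirm it against the CMMDS data before taking action.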
Running cmmds-tool find -u <DISK UUID> -f json may show the absent disk's type as "DISK_INCOMING". If the customer has already replaced the disk, the stale entry can be confirmed in the CMMDS data using the following command.
# cmmds-tool find -u 12345678-1234-1234-1234-19ooo265xxxx -f json
{
"entries":
[
{
"uuid": "12345678-1234-1234-1234-19ooo265####", >> Verify the disk NAA ID for its health
"owner": "00000000-0000-0000-0000-000000000000",
"health": "Unhealthy",
"revision": "42",
"type": "DISK_INCOMING", >> For a failed disk, this type indicates a stale entry
"flag": "0",
"minHostVersion": "0",
"md5sum": "71d234da97c86bde01f267bf33a81306",
"valueLen": "8",
"content": "[[ ]]",
"errorStr": "(null)"
}
]
}
DISK_INCOMING means that the vSAN Data Components are being relocated to this disk.
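The CMMDS check can be sketched in Python as follows. This is an illustration only: the matching rule (type "DISK_INCOMING" together with health "Unhealthy") is taken from the example in this article rather than from a documented specification, and the function name `stale_disk_entries` is hypothetical. The input must be the raw JSON from cmmds-tool, without the ">>" annotations shown in the listing.

```python
import json

def stale_disk_entries(cmmds_json):
    """Return UUIDs of entries that look like the stale entry in this article."""
    data = json.loads(cmmds_json)
    stale = []
    for entry in data.get("entries", []):
        # Assumption: an unhealthy entry still typed DISK_INCOMING after the
        # physical disk was replaced is a stale entry.
        if entry.get("type") == "DISK_INCOMING" and entry.get("health") == "Unhealthy":
            stale.append(entry["uuid"])
    return stale

sample_cmmds = """
{
  "entries": [
    {
      "uuid": "12345678-1234-1234-1234-19ooo265####",
      "owner": "00000000-0000-0000-0000-000000000000",
      "health": "Unhealthy",
      "revision": "42",
      "type": "DISK_INCOMING",
      "flag": "0",
      "content": "[[ ]]",
      "errorStr": "(null)"
    }
  ]
}
"""

print(stale_disk_entries(sample_cmmds))
```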
Removing the absent vSAN disk via the vSphere Web Client fails with the following error:
"A general system error occurred: Sysinfo error on operation returned status : Not found. Please see the VMkernel log for detailed error information"
Removing the absent vSAN disk via the command line fails with the following error:
# esxcli vsan storage remove -u 12345678-1234-1234-1234-19ooo265xxxx
Unable to remove device: Sysinfo error on operation returned status : Not found. Please see the VMkernel log for detailed error information
vmkernel.log shows the following warnings:
2019-04-23T02:53:57.703Z cpu18:36099 opID=70f0a7c5)WARNING: PLOG: PLOG_ExecVSIOp:1552: Disk 12345678-1234-1234-1234-19ooo265#### not found in plog list
2019-04-23T02:54:54.468Z cpu15:37184 opID=2519065f)WARNING: PLOG: PLOG_ExecVSIOp:1552: Disk 12345678-1234-1234-1234-19ooo265#### not found in plog list
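To confirm that the failed removal attempts correspond to these warnings, the log can be searched for the PLOG message. The sketch below is a hedged example, not an official utility: the regular expression is modeled only on the two log lines shown above, and the function name `disks_missing_from_plog` is hypothetical.

```python
import re

# Pattern modeled on the warning lines quoted in this article; other ESXi
# builds may format the message differently.
PLOG_NOT_FOUND = re.compile(
    r"^(\S+)\s.*PLOG_ExecVSIOp:\d+: Disk (\S+) not found in plog list"
)

def disks_missing_from_plog(log_text):
    """Map each disk UUID to the timestamps at which the warning was logged."""
    hits = {}
    for line in log_text.splitlines():
        match = PLOG_NOT_FOUND.match(line)
        if match:
            timestamp, disk_uuid = match.group(1), match.group(2)
            hits.setdefault(disk_uuid, []).append(timestamp)
    return hits

sample_log = (
    "2019-04-23T02:53:57.703Z cpu18:36099 opID=70f0a7c5)WARNING: PLOG: "
    "PLOG_ExecVSIOp:1552: Disk 12345678-1234-1234-1234-19ooo265#### not found in plog list\n"
    "2019-04-23T02:54:54.468Z cpu15:37184 opID=2519065f)WARNING: PLOG: "
    "PLOG_ExecVSIOp:1552: Disk 12345678-1234-1234-1234-19ooo265#### not found in plog list\n"
)

for disk_uuid, timestamps in disks_missing_from_plog(sample_log).items():
    print(disk_uuid, len(timestamps), "warning(s)")
```

The UUID reported here should match the one seen in the vdq -iH output and in the failed esxcli command.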
vSAN OSA 7.x, 8.x, 9.x
This issue occurs when the failed vSAN disk is replaced but the stale disk entry is not removed from the vSAN CMMDS data.
This typically happens when the customer replaces the failed disk prior to removing it from the vSAN disk group, or if it's a failed cache tier disk.
The scenario may also be seen if a capacity disk fails in a deduplication-enabled disk group. The rest of the disk group will not be healthy in vSAN, and will need to be recreated.
This activity must be performed only in the presence of a VMware GSS vSAN TSE.
esxcli vsan storage remove -u <UUID of the cache disk of the disk group>