Removing an absent vSAN disk via the vSphere UI or the command line fails with the error "A general system error occurred: Sysinfo error on operation returned status : Not found. Please see the VMkernel log for detailed error information"
Example:
Name,Drive Type,Disk Tier,Capacity,Virtual SAN Health Status,State,Transport Type,Adapter
Absent VSAN Disk (VSAN UUID:12345678-1234-1234-1234-19ooo265####),HDD,Capacity,0.00 B,--,Dead or Error,
The disk has been replaced, but the failed disk still appears in the disk group list as a UUID when running the vdq -iH command on the ESXi host.
# vdq -iH
Mappings:
DiskMapping[0]:
SSD: naa.60000000000000000000000000000000
MD: naa.60000000000000000000000000000001
MD: naa.60000000000000000000000000000002
MD: naa.60000000000000000000000000000003
MD: naa.60000000000000000000000000000004
MD: naa.60000000000000000000000000000005
MD: naa.60000000000000000000000000000006
MD: 12345678-1234-1234-1234-19ooo265#### ------> Failed disk entry
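The check above can also be done programmatically. The following is a minimal sketch, assuming the vdq -iH output has been saved as text. It flags any SSD or MD entry that is a bare UUID instead of a device name. The function name find_uuid_entries and the sample UUID are hypothetical and only illustrate the pattern.

```python
import re

# A bare UUID (8-4-4-4-12 hex digits) instead of a device name such as naa.*
UUID_RE = re.compile(r"^[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")


def find_uuid_entries(vdq_output):
    """Return (role, value) pairs whose value is a bare UUID."""
    hits = []
    for line in vdq_output.splitlines():
        role, sep, value = line.strip().partition(":")
        # Only SSD/MD lines describe disks; skip headers such as "Mappings:".
        if not sep or role not in ("SSD", "MD"):
            continue
        parts = value.split()
        if parts and UUID_RE.match(parts[0]):
            hits.append((role, parts[0]))
    return hits


# Hypothetical sample; the UUID below is made up for illustration.
SAMPLE = """\
Mappings:
DiskMapping[0]:
SSD: naa.60000000000000000000000000000000
MD: naa.60000000000000000000000000000001
MD: 52b3f7d8-9f71-17cb-0a1b-19a9f265abfb
"""

print(find_uuid_entries(SAMPLE))
```

An empty result means every disk in the mapping is still listed by its device name.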
Running cmmds-tool find -u <DISK UUID> -f json reports the absent disk's type as "DISK_INCOMING". Because the customer has already replaced the disk, this is a stale entry in the CMMDS data, which can be confirmed with the following command.
# cmmds-tool find -u 12345678-1234-1234-1234-19ooo265xxxx -f json
{
"entries":
[
{
"uuid": "12345678-1234-1234-1234-19ooo265####", >> Verify the disk NAA ID for its health
"owner": "00000000-0000-0000-0000-000000000000",
"health": "Unhealthy",
"revision": "42",
"type": "DISK_INCOMING", >> A failed disk with this type indicates a stale entry
"flag": "0",
"minHostVersion": "0",
"md5sum": "71d234da97c86bde01f267bf33a81306",
"valueLen": "8",
"content": "[[ ]]",
"errorStr": "(null)"
}
]
}
DISK_INCOMING means that the vSAN Data Components are being relocated to this disk.
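As a rough illustration, the JSON returned by cmmds-tool can be screened for entries that match the pattern shown above. This is a sketch under assumptions: the input must be the raw JSON (without the ">>" annotations added in this article), the function name stale_candidates is hypothetical, and the rule "type DISK_INCOMING plus health Unhealthy" is a heuristic taken from the example output, not an official definition of a stale entry.

```python
import json


def stale_candidates(cmmds_json):
    """Return UUIDs of entries that look like the stale entry in this article."""
    doc = json.loads(cmmds_json)
    return [
        entry["uuid"]
        for entry in doc.get("entries", [])
        # Heuristic from the example: unhealthy entry still typed DISK_INCOMING.
        if entry.get("type") == "DISK_INCOMING"
        and entry.get("health") == "Unhealthy"
    ]


# Hypothetical sample; the UUID is made up for illustration.
SAMPLE_JSON = """
{
  "entries": [
    {
      "uuid": "52b3f7d8-9f71-17cb-0a1b-19a9f265abfb",
      "owner": "00000000-0000-0000-0000-000000000000",
      "health": "Unhealthy",
      "type": "DISK_INCOMING"
    }
  ]
}
"""

print(stale_candidates(SAMPLE_JSON))
```

A match is only a hint. The entry should still be reviewed against the disk's actual state before any removal is attempted.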
Removing the absent vSAN disk via the vSphere Web Client fails with the following error:
"A general system error occurred: Sysinfo error on operation returned status : Not found. Please see the VMkernel log for detailed error information"
Removing the absent vSAN disk via the command line fails with the following error:
# esxcli vsan storage remove -u 12345678-1234-1234-1234-19ooo265xxxx
Unable to remove device: Sysinfo error on operation returned status : Not found. Please see the VMkernel log for detailed error information
vmkernel.log shows the following warnings:
2019-04-23T02:53:57.703Z cpu18:36099 opID=70f0a7c5)WARNING: PLOG: PLOG_ExecVSIOp:1552: Disk 12345678-1234-1234-1234-19ooo265#### not found in plog list
2019-04-23T02:54:54.468Z cpu15:37184 opID=2519065f)WARNING: PLOG: PLOG_ExecVSIOp:1552: Disk 12345678-1234-1234-1234-19ooo265#### not found in plog list
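To confirm that the warnings all refer to the same disk, the log can be scanned for this message. The sketch below assumes the log text has been read into a string and that the message format matches the two lines above. The function name count_missing_plog_disks is hypothetical.

```python
import re
from collections import Counter

# Matches the warning format shown in the log lines above.
PLOG_RE = re.compile(r"PLOG_ExecVSIOp:\d+: Disk (\S+) not found in plog list")


def count_missing_plog_disks(log_text):
    """Count 'not found in plog list' warnings per disk UUID."""
    return Counter(match.group(1) for match in PLOG_RE.finditer(log_text))


# The two warnings quoted in this article, UUID masked as in the original.
SAMPLE_LOG = (
    "2019-04-23T02:53:57.703Z cpu18:36099 opID=70f0a7c5)WARNING: PLOG: "
    "PLOG_ExecVSIOp:1552: Disk 12345678-1234-1234-1234-19ooo265#### "
    "not found in plog list\n"
    "2019-04-23T02:54:54.468Z cpu15:37184 opID=2519065f)WARNING: PLOG: "
    "PLOG_ExecVSIOp:1552: Disk 12345678-1234-1234-1234-19ooo265#### "
    "not found in plog list\n"
)

for disk, hits in count_missing_plog_disks(SAMPLE_LOG).items():
    print(disk, hits)
```

The UUID reported here should match the one shown by vdq -iH for the absent disk.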
This issue occurs when the failed vSAN disk has been replaced but the stale disk entry was not flushed from the vSAN CMMDS data.
This typically happens when the customer replaces the failed disk before removing it from the vSAN disk group, or before removing the whole disk group when deduplication is enabled or the failed disk is a cache-tier disk.
This activity must be performed only in the presence of a VMware GSS vSAN TSE.
Note: Ensure the vSAN datastore has at least 30% free space before performing a full data migration.
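The 30% threshold is a simple ratio of free space to total capacity. A minimal sketch of the arithmetic follows. The function name has_enough_free_space and the capacity figures are hypothetical, and the real values would come from the vSAN datastore capacity view.

```python
def has_enough_free_space(capacity_bytes, free_bytes, min_free_fraction=0.30):
    """True when free space is at least min_free_fraction of total capacity."""
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    return free_bytes / capacity_bytes >= min_free_fraction


TIB = 1024 ** 4

# Hypothetical 20 TiB datastore.
print(has_enough_free_space(20 * TIB, 7 * TIB))  # 35% free
print(has_enough_free_space(20 * TIB, 5 * TIB))  # 25% free
```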
Example:
# vdq -iH
Mappings:
DiskMapping[0]:
SSD: naa.60000000000000000000000000000000
MD: 52b3f7d8-9f71-17cb-XXXX-19a9f265abfb
# esxcli vsan storage remove -u <UUID of the absent capacity disk to remove>