Unable to delete absent Disk with UUID via vSphere UI or command line


Article ID: 326475



Products

VMware vSAN

Issue/Introduction

Symptoms:

vSAN Disk Management shows the disk as an Absent vSAN Disk.

Removing the absent disk via the vSphere UI or the command line fails with the error "A general system error occurred: Sysinfo error on operation returned status : Not found. Please see the VMkernel log for detailed error information".
 
Example:
Name,Drive Type,Disk Tier,Capacity,Virtual SAN Health Status,State,Transport Type,Adapter
Absent VSAN Disk (VSAN UUID:12345678-1234-1234-1234-19ooo265####),HDD,Capacity,0.00 B,--,Dead or Error,

 
The disk has been replaced, but the failed disk still appears in the disk group list as a UUID when running the vdq -iH command on the ESXi host.
 
# vdq -iH
Mappings:
   DiskMapping[0]:
           SSD:  naa.60000000000000000000000000000000
            MD:  naa.60000000000000000000000000000001
            MD:  naa.60000000000000000000000000000002
            MD:  naa.60000000000000000000000000000003
            MD:  naa.60000000000000000000000000000004
            MD:  naa.60000000000000000000000000000005
            MD:  naa.60000000000000000000000000000006
            MD:  12345678-1234-1234-1234-19ooo265#### ------> Failed disk entry 
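
The check above can be automated against saved vdq -iH output. The following is a minimal Python sketch, not a VMware tool. It assumes that healthy devices are listed by device name (for example naa.…) and that a stale entry appears as a bare vSAN UUID in the usual 8-4-4-4-12 hexadecimal form. The function name find_uuid_only_disks and the sample UUID are hypothetical. The masked UUID used in this article (containing "ooo" and "####") is not valid hexadecimal and would not match.

```python
import re

# Assumption: a stale entry is listed as a bare UUID (8-4-4-4-12 hex digits)
# instead of a device name such as "naa.<id>".
UUID_RE = re.compile(r"^[0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}$")


def find_uuid_only_disks(vdq_output):
    """Return (role, uuid) pairs for SSD/MD entries listed by UUID only."""
    found = []
    for line in vdq_output.splitlines():
        role, sep, value = line.strip().partition(":")
        if not sep:
            continue  # not a "key: value" line
        role = role.strip()
        tokens = value.split()
        # Only the first token is the identifier; trailing notes are ignored.
        if role in ("SSD", "MD") and tokens and UUID_RE.match(tokens[0]):
            found.append((role, tokens[0]))
    return found


# Hypothetical sample modelled on the output shown above.
sample_vdq = """Mappings:
   DiskMapping[0]:
           SSD:  naa.60000000000000000000000000000000
            MD:  naa.60000000000000000000000000000001
            MD:  52a1b2c3-d4e5-f607-1829-3a4b5c6d7e8f
"""

print(find_uuid_only_disks(sample_vdq))
```

Any pair returned is only a candidate for a stale entry and should be confirmed in the CMMDS data before any action is taken.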
 

Running cmmds-tool find -u <DISK UUID> -f json may show the absent disk with type "DISK_INCOMING". Although the disk has already been physically replaced, a stale entry for it remains in the CMMDS data, which can be confirmed with the following command.
 
#  cmmds-tool find -u 12345678-1234-1234-1234-19ooo265xxxx -f json
{
 "entries":
[
 {
   "uuid": "12345678-1234-1234-1234-19ooo265####",     >> Verify the disk NAA ID for its health
   "owner": "00000000-0000-0000-0000-000000000000",
   "health": "Unhealthy",
   "revision": "42",
   "type": "DISK_INCOMING",                            >> A failed disk with this type indicates a stale entry
   "flag": "0",
   "minHostVersion": "0",
   "md5sum": "71d234da97c86bde01f267bf33a81306",
   "valueLen": "8",
   "content": "[[ ]]",
   "errorStr": "(null)"
 }
]
}
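
The same inspection can be scripted when the JSON has been saved to a file or captured as text. This is a minimal Python sketch under the assumption that the pattern shown above ("health" of "Unhealthy" together with "type" of "DISK_INCOMING") is what marks the stale entry. The function name stale_disk_entries and the sample UUID are hypothetical. The ">>" annotations in the example above are editorial notes and are not part of the real output, so they must not be present in the input.

```python
import json


def stale_disk_entries(cmmds_json):
    """Return UUIDs of entries that are Unhealthy and of type DISK_INCOMING."""
    data = json.loads(cmmds_json)
    return [
        entry.get("uuid")
        for entry in data.get("entries", [])
        if entry.get("type") == "DISK_INCOMING"
        and entry.get("health") == "Unhealthy"
    ]


# Hypothetical sample modelled on the output shown above (annotations removed).
sample_cmmds = """
{
 "entries":
 [
  {
   "uuid": "52a1b2c3-d4e5-f607-1829-3a4b5c6d7e8f",
   "owner": "00000000-0000-0000-0000-000000000000",
   "health": "Unhealthy",
   "type": "DISK_INCOMING",
   "content": "[[ ]]"
  }
 ]
}
"""

print(stale_disk_entries(sample_cmmds))
```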

 
A type of DISK_INCOMING means that vSAN data components are being relocated to this disk.

Removing the Absent vSAN Disk via the vSphere Web Client fails with the following error:
 
"A general system error occurred: Sysinfo error on operation returned status : Not found. Please see the VMkernel log for detailed error information"
 
Removing the Absent vSAN Disk via the command line fails with the following error:
 
# esxcli vsan storage remove -u 12345678-1234-1234-1234-19ooo265xxxx
Unable to remove device: Sysinfo error on operation returned status : Not found. Please see the VMkernel log for detailed error information

 
vmkernel.log shows the following warnings:
 
2019-04-23T02:53:57.703Z cpu18:36099 opID=70f0a7c5)WARNING: PLOG: PLOG_ExecVSIOp:1552: Disk 12345678-1234-1234-1234-19ooo265#### not found in plog list
2019-04-23T02:54:54.468Z cpu15:37184 opID=2519065f)WARNING: PLOG: PLOG_ExecVSIOp:1552: Disk 12345678-1234-1234-1234-19ooo265#### not found in plog list
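
To collect the affected disk UUIDs from a copy of the log, a small filter is enough. The Python sketch below assumes the warning keeps the wording shown above ("Disk <UUID> not found in plog list"). The function name disks_missing_from_plog is hypothetical, and the sample lines are the two entries quoted in this article.

```python
import re

# Matches the PLOG warning shown above and captures the disk identifier.
PLOG_RE = re.compile(r"PLOG_ExecVSIOp:\d+: Disk (\S+) not found in plog list")


def disks_missing_from_plog(log_text):
    """Return unique disk identifiers from matching warnings, in first-seen order."""
    seen = []
    for line in log_text.splitlines():
        match = PLOG_RE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


sample_log = (
    "2019-04-23T02:53:57.703Z cpu18:36099 opID=70f0a7c5)WARNING: PLOG: "
    "PLOG_ExecVSIOp:1552: Disk 12345678-1234-1234-1234-19ooo265#### not found in plog list\n"
    "2019-04-23T02:54:54.468Z cpu15:37184 opID=2519065f)WARNING: PLOG: "
    "PLOG_ExecVSIOp:1552: Disk 12345678-1234-1234-1234-19ooo265#### not found in plog list\n"
)

print(disks_missing_from_plog(sample_log))
```

Repeated warnings for the same disk are reported once, so the result can be compared directly with the UUID shown by vdq -iH.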


Environment

vSAN OSA 7.x, 8.x, 9.x

Cause

This issue occurs when the failed vSAN disk has been replaced but its stale entry has not been removed from the vSAN CMMDS data.

This typically happens when the customer replaces the failed disk before removing it from the vSAN disk group, or when the failed disk is a cache tier disk.

The scenario may also be seen if a capacity disk fails in a deduplication-enabled disk group. In that case the rest of the disk group will not be healthy in vSAN, and the disk group will need to be recreated.

Resolution

This procedure must be performed only in the presence of a VMware GSS vSAN TSE.

  • There should be a valid backup of the data.
  • Put the ESXi host into Maintenance Mode with Ensure Accessibility before removing the disk group to ensure there will be no impact to data accessibility. 
  • If deduplication is not enabled:
    • Select the vSAN disk group that contains the ghost disk entry and do a full data migration of the healthy capacity disks in the disk group, leaving the cache disk and the ghost vSAN disk entry.
  • If deduplication is enabled:
    • The entire disk group will already be offline, and will need to be destroyed and recreated.
  • Remove the disk group with no data migration.
    • If removal via the Disk Management view does not work, use this command:
      esxcli vsan storage remove -u <UUID of the cache disk of the disk group>

 

  • Recreate the disk group.
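
The order of the steps above, including the branch on deduplication, can be written down as a checklist generator. This Python sketch executes nothing on the host. It only returns the steps as text so they can be reviewed with support before any change is made. The function name removal_plan and its parameters are hypothetical. The only command included is the esxcli command quoted in the list above.

```python
def removal_plan(dedup_enabled, cache_disk_uuid):
    """Return the ordered steps as text; nothing is executed."""
    steps = [
        "Confirm that a valid backup of the data exists",
        "Put the ESXi host into Maintenance Mode with Ensure Accessibility",
    ]
    if dedup_enabled:
        # With deduplication, the whole disk group is already offline.
        steps.append("Disk group is already offline; it must be destroyed and recreated")
    else:
        steps.append("Do a full data migration of the healthy capacity disks")
    steps.append("Remove the disk group with no data migration")
    steps.append(
        "If the UI removal fails: esxcli vsan storage remove -u " + cache_disk_uuid
    )
    steps.append("Recreate the disk group")
    return steps


# Placeholder UUID; substitute the cache disk UUID of the affected disk group.
for number, step in enumerate(removal_plan(False, "<cache disk UUID>"), start=1):
    print(number, step)
```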

 

Additional Information

More information on command line instructions to manage the disks.