VM not accessible after physical disk failure in vSAN

Article ID: 404029

Updated On:

Products

VMware vSAN

Issue/Introduction

Symptom:

  • One of the physical disks reported a failure.

  • One of the VMs was reported as 'inaccessible' in the vCenter inventory.

  • The affected VM has its namespace and both virtual disks configured with a vSAN RAID1 storage policy.

  • vSAN Skyline Health shows only one inaccessible object, and that object represents virtual disk 1 of the VM.

  • The vSAN upgrade pre-check reports the UUID of the vSAN object as inaccessible.

Environment

VMware vSAN 8.x

Cause

  • The VM is inaccessible because the backend vSAN object of one of its VMDKs is not healthy.
    The object is inaccessible and fails to rebuild from the other Active components in the object.

  • Two components in this object were 'Reconfiguring' when the physical disk failure put a third component into the 'Absent' state.

  • As a result, the 'Reconfiguring' components did not return to the 'Active' state, and the whole object is in an inaccessible state.

  • The output of the command # esxcli vsan debug object list -u <object_UUID> is as follows:

    (Note: The object UUID is obtained from vSAN Skyline Health.)

RAID_1
    Component: ########-####-####-####-############
    Component State: ACTIVE, Address Space(B): 273804165120 (255.00GB), Disk UUID: ########-####-####-####-############, Disk Name: naa..#################:2
    Votes: 2, Capacity Used(B): 276396244992 (257.41GB), Physical Capacity Used(B): 273657364480 (254.86GB), Total 4K Blocks Used(B): 273635536896 (254.84GB), Host Name: <esxi fqdn>
    Component: ########-####-####-####-############
    Component State: ACTIVE, Address Space(B): 273804165120 (255.00GB), Disk UUID: ########-####-####-####-############, Disk Name: naa..#################:2
    Votes: 2, Capacity Used(B): 276396244992 (257.41GB), Physical Capacity Used(B): 273657364480 (254.86GB), Total 4K Blocks Used(B): 273635536896 (254.84GB), Host Name: <esxi fqdn>
RAID_1
    Component: ########-####-####-####-############
    Component State: ABSENT, Address Space(B): 273804165120 (255.00GB), Disk UUID: ########-####-####-####-############, Disk Name: N/A, Transient: 1
    Votes: 1, Host UUID: None
RAID_1
    Component: ########-####-####-####-############
    Component State: RECONFIGURING, Address Space(B): 273804165120 (255.00GB), Disk UUID: ########-####-####-####-############, Disk Name: naa..#################:2
    Votes: 1, Capacity Used(B): 279315480576 (260.13GB), Physical Capacity Used(B): 2206203904 (2.05GB), Total 4K Blocks Used(B): 2193866752 (2.04GB), Host Name: <esxi fqdn>
    Component: ########-####-####-####-############
    Component State: RECONFIGURING, Address Space(B): 273804165120 (255.00GB), Disk UUID: ########-####-####-####-############, Disk Name: naa.#################:2
    Votes: 1, Capacity Used(B): 279315480576 (260.13GB), Physical Capacity Used(B): 2206203904 (2.05GB)
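
When reviewing output like the above, it can help to tally how many components are in each state. The following is a minimal sketch, not part of any VMware tooling: the function name count_component_states is hypothetical, and it assumes the output contains "Component State: <STATE>" fields formatted as shown in this article.

```shell
# Hypothetical helper: tally components per state from text on stdin.
# Assumes each component carries a "Component State: <STATE>" field,
# as in the sample output in this article.
count_component_states() {
    grep -o 'Component State: [A-Z_]*' \
        | sed 's/Component State: //' \
        | sort \
        | uniq -c \
        | awk '{ print $2 "=" $1 }'
}
```

For example, the output of esxcli vsan debug object list -u <object_UUID> could be piped into count_component_states. For the object shown above, the expected tally would be ABSENT=1, ACTIVE=2 and RECONFIGURING=2.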

Resolution

  • Because of the 'Absent' component, the reconfiguration of the other two components cannot complete.
    This renders the object, and hence the VM, inaccessible.

  • Remove the VM from the vCenter inventory and restore it from backup, because the object cannot be recovered.

  • Remove the inaccessible object using the following command:
    /usr/lib/vmware/osfs/bin/objtool delete -u <UUID-of-inaccessible-object> -f

    (Note: Use this command with caution and make sure the correct UUID is entered. This is a destructive command, and the object cannot be recovered once deleted.)
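
Because the delete command is irreversible, one possible safeguard is to check that the UUID is well formed before the command is typed or pasted. The sketch below is an illustration only: the helper names is_wellformed_uuid and print_delete_command are hypothetical, and the 8-4-4-4-12 hexadecimal layout is assumed from the masked UUIDs shown in this article. It only prints the command for review and does not execute it, and it cannot confirm that the UUID belongs to the intended object.

```shell
# Hypothetical helper: succeed only if $1 looks like an 8-4-4-4-12 hex UUID.
is_wellformed_uuid() {
    # Reject any character outside hex digits and hyphens (including newlines).
    case "$1" in
        *[!0-9a-fA-F-]*) return 1 ;;
    esac
    printf '%s\n' "$1" | grep -Eq \
        '^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$'
}

# Hypothetical helper: print (do not run) the delete command for review.
print_delete_command() {
    if is_wellformed_uuid "$1"; then
        echo "/usr/lib/vmware/osfs/bin/objtool delete -u $1 -f"
    else
        echo "Refusing: '$1' is not a well-formed UUID" >&2
        return 1
    fi
}
```

The printed line can then be compared against the UUID reported by vSAN Skyline Health before it is run manually.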