VMs on vSAN fail to power on with Unable to enumerate all disks. 5 (Input/output error)

Article ID: 370584

Products

VMware vSAN

Issue/Introduction

Symptoms:

  • VMs reside on vSAN datastore
  • All vSAN objects are reporting as healthy
  • vSAN Skyline Health is reporting no issues
  • VMs fail to power on with the error: Unable to enumerate all disks. 5 (Input/output error)

The hostd.log on the host where the VM resides contains messages similar to the following:

2024-06-23T17:26:55.835Z Er(163) Hostd[2103480]: [Originator@6876 sub=Libs opID=lxlcjff5-159689-auto-3f7u-h5:70061404-98-01-01-c4-3eb4 sid=52acff62 user=vpxuser:USER.CH\admuser] OBJLIB-VSANOBJ: VsanObjReadPolicyInt: Failed to readPolicy for object f5600166-####-####-####-1423f296a836: Input/output error (327682).
2024-06-23T17:26:55.835Z Er(163) Hostd[2103480]: [Originator@6876 sub=Libs opID=lxlcjff5-159689-auto-3f7u-h5:70061404-98-01-01-c4-3eb4 sid=52acff62 user=vpxuser:USER.CH\admuser] OBJLIB-VSANOBJ: VsanObjGetExtParams: Could not read policy for 'vsan://520295x4x1x82xx1-07###82f8xx5765x/f5600166-####-####-####-1423f296a836'.

2024-06-23T17:26:55.835Z In(166) Hostd[2103480]: [Originator@6876 sub=Libs opID=lxlcjff5-159689-auto-3f7u-h5:70061404-98-01-01-c4-3eb4 sid=52acff62 user=vpxuser:USER.CH\admuser] SNAPSHOT: Snapshot DiskTreeAddDiskHierarchy: Couldn't add disk '/vmfs/volumes/vsan:520295x4x1x82xx1-07###82f8xx5765x/f5600166-####-####-####-1423f296a836/VM0034_1.vmdk': Input/output error (5).
2024-06-23T17:26:55.836Z In(166) Hostd[2103480]: [Originator@6876 sub=Libs opID=lxlcjff5-159689-auto-3f7u-h5:70061404-98-01-01-c4-3eb4 sid=52acff62 user=vpxuser:USER.CH\admuser] VigorOfflineGetAllDisks: Failed to retrieve disk files: Input/output error


Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.

The following commands do not return any results:

  • esxcli vsan debug object list -u <OBJECT UUID>
  • cmmds-tool find -f python -u <OBJECT UUID>
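When several VMs are affected, the object UUIDs can be pulled out of hostd.log rather than copied by hand. The following is a minimal Python sketch, assuming the log lines follow the "Failed to readPolicy for object" format shown in the example above. The function names and the regular expression are illustrative and are not part of any VMware tooling. The script only prints the two lookup commands listed above for each UUID; it does not run them.

```python
import re
import sys

# Matches the readPolicy failure shown in the example log and captures the
# object UUID. '#' is accepted only because the example in this article masks
# part of the UUID; real UUIDs are hexadecimal.
READ_POLICY_RE = re.compile(
    r"Failed to readPolicy for object "
    r"([0-9a-fA-F#]+(?:-[0-9a-fA-F#]+){4}): Input/output error"
)


def find_failed_objects(log_lines):
    """Return unique object UUIDs from readPolicy failures, in first-seen order."""
    seen = []
    for line in log_lines:
        match = READ_POLICY_RE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


def lookup_commands(uuid):
    """Build the two lookup commands quoted in this article for one UUID."""
    return [
        f"esxcli vsan debug object list -u {uuid}",
        f"cmmds-tool find -f python -u {uuid}",
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # The path to a copy of hostd.log is passed as the first argument.
    with open(sys.argv[1], errors="replace") as handle:
        for uuid in find_failed_objects(handle):
            for command in lookup_commands(uuid):
                print(command)
```

If every printed command returns no results for a given UUID, that object matches the symptom described in this article.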

Environment

VMware vSAN

Cause

This occurs because the backing object no longer exists on the vSAN datastore. Typically, the vSAN object was marked for deletion as a result of manual intervention, and once the locks on the object were released (VM power-off, etc.), vSAN proceeded with deleting the object.

vSAN does not delete objects at random. If an object is deleted, it is due to user intervention or, in rare circumstances, a bug in the code. VMware by Broadcom has seen these occurrences in the following scenarios:

  • Interacting with VMware Support Log bundles within ESXi. See the KB "Interacting with VMware Support Log bundles within ESXi may impact vSAN Object Accessibility" for more details.
  • A user-run script deleting vSAN objects (VMware by Broadcom recommends using such scripts only with extreme caution, to ensure that what is being deleted is no longer needed)
  • A user accidentally deleting the wrong files/VMs from the environment
    Note: In the above scenarios, the objects were marked for deletion by vSAN, but because they were locked by an active process, they were not deleted until the lock was released by a VM migration, power cycle, or HA event.

Resolution

Restore the VM/affected vmdk from backup.

To potentially determine why the object no longer exists on vSAN, logs for the entire cluster, including vCenter, need to be collected as close to the time of the event as possible. Include the names of the affected VMs, the timestamps, and the time zone where the cluster is located.

Upload these logs to the case opened with VMware by Broadcom support for review. We will provide a best-effort review of the logs, but cannot guarantee that a cause will be found: the object may have been marked for deletion days, weeks, or months ago, and any pertinent data may have already rolled off the system.
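When recording timestamps for the support case, note that the trailing "Z" on the hostd.log timestamps in the example above denotes UTC. The following Python sketch converts such a timestamp to a fixed UTC offset so it can be reported alongside the cluster's local time. The function name is hypothetical, and the +2 hour offset is an assumed example, not a value taken from this article.

```python
from datetime import datetime, timedelta, timezone


def hostd_utc_to_local(stamp, utc_offset_hours):
    """Convert a hostd.log UTC timestamp (trailing 'Z') to a fixed UTC offset."""
    utc_time = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ").replace(
        tzinfo=timezone.utc
    )
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return utc_time.astimezone(local_zone).isoformat(timespec="milliseconds")


# Timestamp from the example log; the +2 offset is an assumed cluster time zone.
print(hostd_utc_to_local("2024-06-23T17:26:55.835Z", 2))
```

A fixed offset is used here to keep the sketch dependency-free; it does not account for daylight saving changes, so confirm the offset that applied on the date of the event.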