A vSAN Disk enters a Permanent Device Loss State and the Devices briefly show offline in ILO


Article ID: 414016



Products

VMware vSAN

Issue/Introduction

When using vSAN and HPE iLO on firmware version 1.7 or below, the disks may go offline during maintenance tasks. 

You may see PCIe hotplug interrupts in vmkernel.log when the host enters maintenance mode:


yyyy-mm-ddThh:mm:ss In(14) vobd[#######]: [UserLevelCorrelator] 422269070104us: [esx.audit.maintenancemode.entered] The host has entered maintenance mode.
yyyy-mm-ddThh:mm:ss In(###) vmkernel: cpu10:2097438)PCIEHP: ####: 0000:##:##.#: hotplug slot:0x##: num reads=1 slot status=0x108.
yyyy-mm-ddThh:mm:ss In(###) vmkernel: cpu10:2097438)PCIEHP: ####: 0000:##:##.#: hotplug slot:0x## (0000:65:00.0) Adapter removed.
yyyy-mm-ddThh:mm:ss In(###) vmkernel: cpu10:2097438)PCIEHP: ####: 0000:##:##.#: hotplug slot:0x##: Setting PowerIndicator State BLINKING

yyyy-mm-ddThh:mm:ss In(###) vmkernel: cpu6:2097436)PCIEHP: ####: 0000:##:##.#: hotplug slot:0x##: num reads=1 slot status=0x108.
yyyy-mm-ddThh:mm:ss In(###) vmkernel: cpu6:2097436)PCIEHP: ####: 0000:##:##.#: hotplug slot:0x## (0000:64:00.0) Adapter removed.

 

In this case, iLO shows the disk offline for a few seconds before it reconnects. Because the device has entered a PDL state, you may need to reboot the host to clear it. (See Permanent Device Loss (PDL) and All-Paths-Down (APD) on host).
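To check a saved copy of the log for this pattern, the following Python sketch may help. It is not a VMware or HPE tool. It assumes the log lines follow the same layout as the excerpt above, and that the maintenance mode audit entry and the PCIEHP entries are present in the same input. The names HotplugRemoval and find_removals_after_maintenance are hypothetical and exist only for this illustration.

```python
import re
from typing import Iterable, List, NamedTuple


class HotplugRemoval(NamedTuple):
    """One 'Adapter removed' event reported by PCIEHP (illustrative type)."""
    timestamp: str
    slot: str
    device: str


# Assumed line layout, taken from the excerpt in this article:
#   <timestamp> In(###) vmkernel: cpuN:ID)PCIEHP: ...: hotplug slot:0x## (0000:65:00.0) Adapter removed.
REMOVAL_RE = re.compile(
    r"^(?P<ts>\S+)\s.*PCIEHP:.*hotplug slot:(?P<slot>\S+)\s+"
    r"\((?P<device>[^)]+)\)\s+Adapter removed"
)

# Event ID that appears in the audit entry when the host enters maintenance mode.
MAINTENANCE_MARKER = "esx.audit.maintenancemode.entered"


def find_removals_after_maintenance(lines: Iterable[str]) -> List[HotplugRemoval]:
    """Return 'Adapter removed' events logged after a maintenance mode entry."""
    seen_maintenance = False
    removals: List[HotplugRemoval] = []
    for raw in lines:
        line = raw.strip()
        if MAINTENANCE_MARKER in line:
            seen_maintenance = True
            continue
        if not seen_maintenance:
            # Ignore hotplug events that happened before maintenance mode.
            continue
        match = REMOVAL_RE.match(line)
        if match:
            removals.append(
                HotplugRemoval(match["ts"], match["slot"], match["device"])
            )
    return removals
```

The function accepts any iterable of lines, so an open file object for a copied log can be passed to it directly. A non-empty result only indicates that the log matches the pattern described here; it does not confirm the iLO firmware as the cause.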

Environment

VMware vSAN 8.x

VMware vSAN 7.x

Cause

This is caused by a bug in the iLO firmware that occurs when the iLO sends a message to reinitialize the disk. While the disk restarts, the ESXi host loses access to it, which triggers the PDL error.

 

Resolution

Contact HPE to obtain an iLO firmware version that contains the fix.

Additional Information