vCenter Server alarm "Cannot connect to Storage"

Article ID: 407403

Products

VMware vCenter Server

Issue/Introduction

Symptoms:

  • The alarm "Cannot connect to Storage" is triggered in vCenter Server for multiple hosts.

Environment

VMware vSphere ESXi 7.x
VMware vSphere ESXi 8.x

Cause

An All-Paths-Down (APD) situation occurs when all paths to a device are down. Because nothing indicates whether the device loss is permanent or temporary, the ESXi host keeps trying to reestablish connectivity. APD situations commonly occur when a LUN is incorrectly unpresented from the ESXi host.

The timeout period begins when the storage device becomes unavailable to the ESXi host and enters the APD state. By default, the APD timeout is 140 seconds. During this period, the host keeps trying to reestablish connectivity with the device. If the device has not recovered when the timeout expires, the host stops retrying any I/O that does not come from virtual machines.
Typical causes of an APD state include a failed switch or a disconnected storage cable.
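
The timeout behavior described above can be illustrated with a minimal Python sketch. It only models the timeline and does not query an ESXi host. The function names, the phase labels, and the sample timestamps are hypothetical; the 140-second default is the value stated in this article.

```python
from datetime import datetime, timedelta, timezone

# Default APD timeout stated in this article, in seconds.
APD_TIMEOUT_SECONDS = 140


def apd_timeout_deadline(apd_start, timeout_seconds=APD_TIMEOUT_SECONDS):
    """Return the moment at which the APD timeout expires."""
    return apd_start + timedelta(seconds=timeout_seconds)


def apd_phase(apd_start, now, timeout_seconds=APD_TIMEOUT_SECONDS):
    """Classify a point in time relative to an APD event (labels are illustrative)."""
    if now < apd_start:
        return "connected"
    if now < apd_timeout_deadline(apd_start, timeout_seconds):
        # The host is still trying to reestablish connectivity.
        return "apd"
    # The timeout expired without recovery; non-VM I/O is no longer retried.
    return "apd_timeout"


# Hypothetical example: the device became unavailable at 00:24:31 UTC.
start = datetime(2024, 1, 1, 0, 24, 31, tzinfo=timezone.utc)
print(apd_timeout_deadline(start).time())
print(apd_phase(start, start + timedelta(seconds=60)))
print(apd_phase(start, start + timedelta(seconds=140)))
```

With the default timeout, a device that enters the APD state at 00:24:31 reaches the timeout state 2 minutes and 20 seconds later.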

  • The /var/log/vobd.log file on the ESXi host records the All Paths Down timeout error:

    YYYY-MM-DD T00:26:51.504Z: [APDCorrelator] 2682686563317us: [esx.problem.storage.apd.timeout] Device or filesystem with identifier [11ace9d3-7bebe4e8] has entered the All Paths Down Timeout state after being in the All Paths Down state for 140 seconds. I/Os will now be fast failed.
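
To locate such entries in a copy of vobd.log, a small script along the following lines may help. This is a sketch, not a supported tool: the regular expression is derived only from the sample entry above and may need adjusting if the message wording differs, and the function name is hypothetical.

```python
import re

# Pattern derived from the sample vobd.log entry in this article (an assumption,
# not an official log format specification).
APD_TIMEOUT_RE = re.compile(
    r"\[esx\.problem\.storage\.apd\.timeout\]"
    r".*?identifier \[(?P<identifier>[^\]]+)\]"
    r".*?for (?P<seconds>\d+) seconds"
)


def find_apd_timeouts(lines):
    """Return (identifier, seconds_in_apd) for each APD timeout entry found."""
    events = []
    for line in lines:
        match = APD_TIMEOUT_RE.search(line)
        if match:
            events.append((match.group("identifier"), int(match.group("seconds"))))
    return events


# The sample entry from this article, with its placeholder date left as-is.
SAMPLE = (
    "YYYY-MM-DD T00:26:51.504Z: [APDCorrelator] 2682686563317us: "
    "[esx.problem.storage.apd.timeout] Device or filesystem with identifier "
    "[11ace9d3-7bebe4e8] has entered the All Paths Down Timeout state after "
    "being in the All Paths Down state for 140 seconds. "
    "I/Os will now be fast failed."
)

print(find_apd_timeouts([SAMPLE]))
```

To scan a log file copied from a host, pass an open file object instead of the sample list, for example `find_apd_timeouts(open("vobd.log"))`, where the local file name is only an example.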

Resolution

Due to the nature of an APD situation, there is no clean way to recover.

  • Resolve the APD situation at the storage array or fabric layer to restore connectivity to the host.
  • Affected ESXi hosts may require a reboot to clear any residual references to devices that were in an APD state.
  • Identify the cause of the disconnected LUNs by reviewing the environment, for example the storage array, the SAN switches, and any device failures.
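
When reviewing the environment, it can help to see which hosts report which device or filesystem identifiers. The sketch below groups identifiers by host; the host names, the identifiers in the example, and the function name are all hypothetical. As a rule of thumb (an assumption, not a statement from this article), an identifier reported by many hosts suggests a shared component such as the array or a SAN switch, while an identifier reported by a single host suggests looking at that host's own connectivity first.

```python
from collections import defaultdict


def hosts_per_identifier(identifiers_by_host):
    """Map each identifier to the sorted list of hosts that reported it.

    identifiers_by_host: dict of host name -> iterable of identifiers taken
    from that host's APD timeout entries in vobd.log.
    """
    hosts = defaultdict(set)
    for host, identifiers in identifiers_by_host.items():
        for identifier in identifiers:
            hosts[identifier].add(host)
    return {identifier: sorted(names) for identifier, names in hosts.items()}


# Hypothetical input collected from two hosts.
reported = {
    "esx01": ["11ace9d3-7bebe4e8", "22bdf0e4-8cfcf5f9"],
    "esx02": ["11ace9d3-7bebe4e8"],
}

for identifier, names in hosts_per_identifier(reported).items():
    print(identifier, len(names), names)
```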

If the virtual machines on the affected datastores remain responsive, power them off or migrate them to a different datastore or host.

Additional Information