All Paths Down (APD) is not triggered for LUNs behind IBM SAN Volume Controller (SVC)

Article ID: 338053

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:
All Paths Down (APD) is not triggered for LUNs behind IBM SAN Volume Controller (SVC) target even when no paths can service I/Os.

The vmkernel logs report repeating messages similar to the following:

2018-04-02T18:33:01.623Z cpu21:33633)WARNING: NMP: nmpDeviceAttemptFailover:603: Retry world failover device "naa.600507680c80802c1800000000000072" - issuing command 0x43ba40eaa180
2018-04-02T18:33:01.623Z cpu21:33633)WARNING: vmw_psp_rr: psp_rrSelectPath:1315: Could not select path for device "naa.600507680c80802c1800000000000072".
2018-04-02T18:33:01.623Z cpu21:33633)WARNING: NMP: nmpDeviceAttemptFailover:678: Retry world failover device "naa.600507680c80802c1800000000000072" - failed to issue command due to Not found (APD), try again...
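
To check how often a host is logging this symptom, the messages can be counted per device. The following is a minimal sketch, assuming the vmkernel log has already been read into a string. The function name `count_apd_retries` and the regular expression are illustrative and are derived only from the sample lines above, so they may need adjusting for other ESXi builds.

```python
import re
from collections import Counter

# Matches the NMP warning shown above. The capture group is the device ID.
# The pattern is derived from the sample log lines only (an assumption).
APD_RETRY = re.compile(
    r'nmpDeviceAttemptFailover:\d+: Retry world failover device "([^"]+)"'
    r" - failed to issue command due to Not found \(APD\)"
)


def count_apd_retries(log_text: str) -> Counter:
    """Count 'Not found (APD)' failover retry failures per device."""
    counts = Counter()
    for line in log_text.splitlines():
        match = APD_RETRY.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample input copied from the log excerpt in this article.
SAMPLE = (
    '2018-04-02T18:33:01.623Z cpu21:33633)WARNING: NMP: nmpDeviceAttemptFailover:603: '
    'Retry world failover device "naa.600507680c80802c1800000000000072" '
    "- issuing command 0x43ba40eaa180\n"
    '2018-04-02T18:33:01.623Z cpu21:33633)WARNING: vmw_psp_rr: psp_rrSelectPath:1315: '
    'Could not select path for device "naa.600507680c80802c1800000000000072".\n'
    '2018-04-02T18:33:01.623Z cpu21:33633)WARNING: NMP: nmpDeviceAttemptFailover:678: '
    'Retry world failover device "naa.600507680c80802c1800000000000072" '
    "- failed to issue command due to Not found (APD), try again...\n"
)

if __name__ == "__main__":
    for device, n in count_apd_retries(SAMPLE).most_common():
        print(f"{device}: {n} retry failure(s)")
```

A count that keeps growing for the same device while the host never declares APD is consistent with the symptom described here.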


Environment

VMware vSphere ESXi 6.7
VMware vSphere ESXi 6.5

Cause

If an issue, such as a power failure on the backend storage, causes all paths to the backend storage servicing an IBM SAN Volume Controller (SVC) to become unavailable, the ESXi hosts may not report the expected APD state for the affected volumes.

As a result, I/O is not fast-failed and the host may become unresponsive, requiring a reboot to recover.

Resolution

Option 1: VMware ESXi Resolution

This issue is resolved in the following releases:
  • ESXi 6.5 P03 (Build 10884925)
  • ESXi 6.7 Update 1 (Build 10302608)

The fix is disabled by default. To enable it, change the ESXi advanced configuration option /Scsi/ExtendAPDCondition from the default of 0 to 1:
  • Select the ESXi/ESX host in the Inventory Panel, then navigate to Configuration > Software > Advanced Settings to open the Settings window.
  • Change the value of “Scsi.ExtendAPDCondition” from the default of 0 to 1.

For a full list of alternative methods, see Configuring advanced options for ESX/ESXi (1038578).
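
As an illustration of the steps above, the sketch below checks whether a host build is at or above the fixed builds listed in this article and prints the esxcli commands that would display and set the option. The helper names (`build_has_fix`, `ui_name`, `esxcli_commands`) are hypothetical. The `esxcli system settings advanced` syntax is the usual command-line form for advanced options, but verify it against KB 1038578 for your release. The sketch also assumes that build numbers only increase within a release line.

```python
from typing import List

# Minimum fixed builds from this article, keyed by ESXi release line.
FIXED_BUILDS = {"6.5": 10884925, "6.7": 10302608}
OPTION_PATH = "/Scsi/ExtendAPDCondition"


def build_has_fix(release: str, build: int) -> bool:
    """Return True if the build is at or above the fixed build for its release.

    Returns False for release lines this article does not list, meaning
    "not confirmed by this article", not "confirmed absent".
    """
    minimum = FIXED_BUILDS.get(release)
    return minimum is not None and build >= minimum


def ui_name(option_path: str) -> str:
    """Convert '/Scsi/ExtendAPDCondition' to the dotted name shown in the UI."""
    return option_path.strip("/").replace("/", ".")


def esxcli_commands(option_path: str = OPTION_PATH, value: int = 1) -> List[str]:
    """Build the commands to display the option, then set it to `value`."""
    return [
        f"esxcli system settings advanced list -o {option_path}",
        f"esxcli system settings advanced set -o {option_path} -i {value}",
    ]


if __name__ == "__main__":
    print(ui_name(OPTION_PATH))
    if build_has_fix("6.7", 10302608):
        for command in esxcli_commands():
            print(command)
```

The commands are only printed, not executed, so they can be reviewed before being run in an ESXi shell.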

Option 2: Alternative IBM SVC Resolution

The IBM fix is included in APAR HU01839, which is available in the 8.2.1.0+, 8.1.3.4+ and 7.8.1.8+ releases.

The IBM fix addresses the problem only if the host LUN type is changed from generic to 'adminlun' using a CLI command, or to VVOLS using the Storwize GUI.
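
The release list for APAR HU01839 can be turned into a simple code-level check. This is only a sketch under stated assumptions: each listed level is treated as the minimum within its own maintenance stream (the first three components), any level at or above 8.2.1.0 is assumed to carry the fix, and every other stream is reported as undetermined. The function name `includes_hu01839` is hypothetical. Confirm the result with IBM support before relying on it.

```python
from typing import Optional, Tuple

# Minimum fourth component per maintenance stream, from the list above.
MINIMUM_LEVELS = {(7, 8, 1): 8, (8, 1, 3): 4, (8, 2, 1): 0}
# Assumption: every level at or above this one carries the fix forward.
FIRST_FIXED_MAINLINE = (8, 2, 1, 0)


def parse_level(level: str) -> Tuple[int, ...]:
    """Parse a four-part code level such as '8.1.3.4' into a tuple of ints."""
    parts = tuple(int(p) for p in level.split("."))
    if len(parts) != 4:
        raise ValueError(f"expected a four-part code level, got {level!r}")
    return parts


def includes_hu01839(level: str) -> Optional[bool]:
    """Return True or False when the list above decides it, otherwise None."""
    parts = parse_level(level)
    if parts >= FIRST_FIXED_MAINLINE:
        return True
    stream, fix_level = parts[:3], parts[3]
    if stream in MINIMUM_LEVELS:
        return fix_level >= MINIMUM_LEVELS[stream]
    return None  # stream not covered by the article's list


if __name__ == "__main__":
    for level in ("7.8.1.8", "8.1.3.3", "8.2.0.1", "8.2.1.0"):
        print(level, includes_hu01839(level))
```

Returning `None` for unlisted streams avoids implying anything about levels that the article does not mention.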

Please contact IBM support for further information in relation to the IBM resolution, referencing this VMware KB article.