Trespassing of LUNs on the array when using Active Non-optimized (ANO) path
Article ID: 323089
Products
VMware vSphere ESXi
Issue/Introduction
Symptoms:
You observe trespassing of the LUNs on the array.
The array has been configured correctly as per the storage vendor's recommendation.
The path policy on all the ESX/ESXi hosts accessing the LUN is set correctly.
The logs contain entries similar to:
Aug 5 10:47:24 vmkernel: 16:13:50:01.679 cpu1:10072)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x41027ef3ea40) to NMP device "naa.60060160267022007a3f86f93998de11" failed on physical path "vmhba2:C0:T0:L2" H:0x0 D:0x2 P:0x0 Valid sense data: 0x6
Aug 5 10:47:24 vmkernel: 16:13:50:01.679 cpu1:10072)WARNING: NMP: nmp_DeviceRetryCommand: Device "naa.60060160267022007a3f86f93998de11": awaiting fast path state update for failover with I/O blocked. No prior reservation exists on the device.
Aug 5 10:47:25 vmkernel: 16:13:50:02.684 cpu9:4227)VMW_SATP_ALUA: satp_alua_activatePaths: Activation disallowed due to follow-over.
Aug 5 10:47:25 vmkernel: 16:13:50:02.685 cpu1:4471)WARNING: NMP: nmpDeviceAttemptFailover: Retry world failover device "naa.60060160267022007a3f86f93998de11" - issuing command 0x41027ef3ea40
Aug 5 10:47:25 vmkernel: 16:13:50:02.685 cpu1:148055)NMP: nmpCompleteRetryForPath: Retry world recovered device "naa.60060160267022007a3f86f93998de11"
Aug 5 10:47:25 vmkernel: 16:13:50:03.381 cpu1:145880)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x41027f238d40) to NMP device "naa.60060160267022007a3f86f93998de11" failed on physical path "vmhba2:C0:T0:L2" H:0x0 D:0x2 P:0x0 Valid sense data: 0x
Aug 5 10:47:25 vmkernel: 16:13:50:03.381 cpu1:145880)WARNING: NMP: nmp_DeviceRetryCommand: Device "naa.60060160267022007a3f86f93998de11": awaiting fast path state update for failover with I/O blocked. No prior reservation exists on the device.
Aug 5 10:47:26 vmkernel: 16:13:50:03.684 cpu4:4471)WARNING: NMP: nmpDeviceAttemptFailover: Retry world failover device "naa.60060160267022007a3f86f93998de11" - issuing command 0x41027f238d40
Aug 5 10:47:26 vmkernel: 16:13:50:03.685 cpu1:145865)NMP: nmpCompleteRetryForPath: Retry world recovered device "naa.60060160267022007a3f86f93998de11"
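To gauge how often these entries appear, the log can be searched for the two most telling messages. The following is a minimal sketch, not a VMware-provided tool: the function name count_trespass_hints is hypothetical, and the log file location is an assumption that varies by release (for example /var/log/vmkernel.log on recent ESXi versions).

```shell
#!/bin/sh
# Hypothetical helper: summarize failover-related entries in a vmkernel log.
# Usage (log path is an assumption; adjust for your release):
#   count_trespass_hints /var/log/vmkernel.log naa.60060160267022007a3f86f93998de11
count_trespass_hints() {
    log_file="$1"
    device="$2"

    # Failover retries name the device, so filter on the device ID first.
    retries=$(grep -F "$device" "$log_file" | grep -c "nmpDeviceAttemptFailover")

    # The follow-over message does not name a device, so count it log-wide.
    follow_over=$(grep -c "Activation disallowed due to follow-over" "$log_file")

    echo "failover_retries=$retries follow_over=$follow_over"
}
```

A steadily growing retry count for the same device suggests that the LUN is repeatedly moving between storage processors.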
This issue occurs if the Path Selection Policy (PSP) is configured to issue I/O on the active non-optimized (ANO) path.
When Round Robin path selection policy (PSP_RR) is used with ALUA SATP, the default PSP_RR setting is to use only active-optimized paths if they are available.
You can override this default and include ANO paths by running this command:
esxcli storage nmp psp roundrobin deviceconfig set --useano=1 -d <naa of the device>
Some ALUA arrays automatically initiate a LUN failover (trespass) when enough I/O is directed to a non-optimized path. This happens so I/O is internally re-directed to an active optimized path.
Configuring PSP_RR to use ANO paths for such arrays may result in path thrashing and poor I/O performance.
You can also experience this issue if the Fixed path policy is in use and the preferred path is a non-optimized path.
Resolution
To resolve this issue, make sure that the ANO path is not enabled for I/O:
Determine if the LUN is configured to use the ANO path using this command:
esxcli storage nmp psp roundrobin deviceconfig get -d naa_of_device
For example:
esxcli storage nmp psp roundrobin deviceconfig get -d naa.600507680282008c400000000000010f
You see output similar to:
Byte Limit: 10485760
Device: naa.600507680282008c400000000000010f
IOOperation Limit: 1000
Limit Type: Iops
Use Active Unoptimized Paths: true
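When checking several LUNs, it can help to extract just the ANO flag from this output. The sketch below assumes the field label "Use Active Unoptimized Paths" appears exactly as shown above. The function name ano_enabled is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `deviceconfig get` output on stdin and print
# the value of the "Use Active Unoptimized Paths" field (true or false).
# Prints nothing if the field is not present in the input.
# Usage:
#   esxcli storage nmp psp roundrobin deviceconfig get -d <naa of the device> | ano_enabled
ano_enabled() {
    sed -n 's/.*Use Active Unoptimized Paths: *\([A-Za-z]*\).*/\1/p'
}
```

A value of true means the device is configured to send I/O down ANO paths and is a candidate for the change described in the next step.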
Change this setting by running this command for the LUN that is thrashing:
esxcli storage nmp psp roundrobin deviceconfig set --useano=0 -d device_naa
Note: By default, when you set a path policy, the ANO value is set to 0 (false). The host uses the non-optimized path only when there are no optimized paths available.
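If more than one LUN is thrashing, the same command can be applied to each device in turn. This is a minimal sketch that wraps the set command shown above. The function name disable_ano is hypothetical, and the device IDs passed to it are expected to be ones you have already confirmed with the get command.

```shell
#!/bin/sh
# Hypothetical helper: disable ANO path usage on each device given as an argument.
# Returns non-zero if esxcli fails for any device.
# Usage:
#   disable_ano naa.600507680282008c400000000000010f
disable_ano() {
    rc=0
    for dev in "$@"; do
        if esxcli storage nmp psp roundrobin deviceconfig set --useano=0 -d "$dev"; then
            echo "$dev: useano set to 0"
        else
            echo "$dev: esxcli failed" >&2
            rc=1
        fi
    done
    return $rc
}
```

Processing devices one at a time and reporting each result makes it easier to spot a device ID that was mistyped or that does not use the Round Robin policy.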
There are two use-cases for using the active non-optimized ALUA paths:
When there are no optimized paths available.
During SAN boot, the multi-path driver is not loaded yet, and the simple BIOS SAN boot agent uses whatever paths it finds first.