Identifying correct LUN pathing settings
The first step in identifying shared storage connectivity issues is to make sure your SAN and ESX Server are configured to work with each other. Start by determining whether your SAN is an Active / Active or an Active / Passive storage array. An Active / Active storage array uses a path policy of "Fixed", and an Active / Passive storage array uses a path policy of "Most Recently Used (MRU)". Also make sure to use the correct "Host Mode Type" on your shared storage (SAN) for LUNs presented to your ESX hosts.
Setting an incorrect storage path policy for your SAN model may cause "path thrashing", which in turn may cause your shared storage devices to disconnect from your ESX Server hosts.
To find out whether your certified storage device is an "active / active" device that requires a "fixed" path policy or an "active / passive" device that requires an "mru" path policy, look up the device in VMware's online list of supported and certified Storage / SAN devices for your version of ESX Server.
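The pairing of array type and path policy described above can be captured in a small lookup. The following Python sketch is only an illustration of that rule: the function and table names are hypothetical, and the result is no substitute for checking your array against VMware's compatibility list.

```python
# Mapping taken from the text of this article; all names here are
# illustrative and not part of any VMware tool or API.
PATH_POLICY_BY_ARRAY_TYPE = {
    "active/active": "fixed",
    "active/passive": "mru",
}


def recommended_path_policy(array_type: str) -> str:
    """Return the path policy this article pairs with an array type."""
    # Normalize input such as "Active / Passive" to "active/passive".
    key = array_type.lower().replace(" ", "")
    try:
        return PATH_POLICY_BY_ARRAY_TYPE[key]
    except KeyError:
        raise ValueError(f"unknown array type: {array_type!r}") from None
```

For example, recommended_path_policy("Active / Passive") returns "mru". An unrecognized array type raises ValueError rather than guessing, because an incorrect policy is exactly what leads to path thrashing.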
Defining your Host Mode Type
Using the wrong "Host Mode Type" for LUNs presented to ESX Server may also cause shared storage disconnects. Consult with your storage vendor for the specific "Host Mode Type" you need to use on your storage device, so that the LUNs you present to ESX Server version 2.5.x and 3.x systems function properly.
Using the VMkernel error log to diagnose storage issues
Additionally, you may log in to your ESX Server service console as root and check the /var/log/vmkernel log file for entries similar to:
Feb 10 13:41:16 esx02 vmkernel: 93:07:30:44.339 cpu14)WARNING: SCSI: 5663: vmhba1:0:30:1 status = 2/0 0x6 0x29 0x0
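The values after "status =" are the device status and host status (2/0 in this example), followed by three hex values taken from the SCSI sense data: the Sense Key, the Additional Sense Code (ASC), and the Additional Sense Code Qualifier (ASCQ).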
The above error message translates to:
Device Check Condition
Host no errors
UNIT ATTENTION
POWER ON RESET or BUS DEVICE RESET OCCURRED
Additional error messages that indicate storage connectivity problems are:
Device Check Condition
Host no errors
ABORTED COMMAND
COMMANDS CLEARED BY ANOTHER INITIATOR
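Decoding these entries by hand is error-prone, so a short script can help. The Python sketch below parses a line in the format shown in the sample entry and looks up its fields. It makes several assumptions: the regular expression is modeled on that single sample line, the two status numbers are read as decimal, and the lookup tables contain only the codes mentioned in this article (standard SCSI sense key 0x6 and 0xB, ASC/ASCQ 0x29/0x00 and 0x2F/0x00). All function and variable names are hypothetical.

```python
import re

# Pattern modeled on the sample vmkernel line in this article (an assumption;
# other ESX builds may format the message differently).
LINE_RE = re.compile(
    r"(?P<path>vmhba\d+:\d+:\d+:\d+)\s+status\s*=\s*"
    r"(?P<device>\d+)/(?P<host>\d+)\s+"
    r"(?P<key>0x[0-9a-fA-F]+)\s+(?P<asc>0x[0-9a-fA-F]+)\s+(?P<ascq>0x[0-9a-fA-F]+)"
)

# Deliberately small tables: only the codes discussed in this article.
DEVICE_STATUS = {2: "Device Check Condition"}
HOST_STATUS = {0: "Host no errors"}
SENSE_KEYS = {0x6: "UNIT ATTENTION", 0xB: "ABORTED COMMAND"}
ADDITIONAL_SENSE = {
    (0x29, 0x00): "POWER ON RESET or BUS DEVICE RESET OCCURRED",
    (0x2F, 0x00): "COMMANDS CLEARED BY ANOTHER INITIATOR",
}


def decode_scsi_warning(line):
    """Decode one vmkernel SCSI status line; return None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    device = int(m.group("device"))
    host = int(m.group("host"))
    key = int(m.group("key"), 16)
    asc = int(m.group("asc"), 16)
    ascq = int(m.group("ascq"), 16)
    # Unknown codes are reported numerically instead of being guessed.
    return {
        "path": m.group("path"),
        "device_status": DEVICE_STATUS.get(device, f"device status {device}"),
        "host_status": HOST_STATUS.get(host, f"host status {host}"),
        "sense_key": SENSE_KEYS.get(key, f"sense key {key:#x}"),
        "additional_sense": ADDITIONAL_SENSE.get(
            (asc, ascq), f"ASC/ASCQ {asc:#x}/{ascq:#x}"
        ),
    }
```

Passing the sample log entry to decode_scsi_warning yields the path vmhba1:0:30:1 with a sense key of UNIT ATTENTION. Codes outside the tables are returned in numeric form so they can be looked up with your storage vendor.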
Error messages like those listed above in the ESX Server's /var/log/vmkernel log file indicate that the shared storage device has encountered problems that caused it to disconnect from ESX Server. As a result, virtual machines disconnect and stop responding until shared storage connectivity is restored. Review your shared storage device log files for any indication of failures and contact your storage vendor for additional assistance.
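Before contacting your storage vendor, it can be useful to know which paths report these warnings most often. The Python sketch below counts SCSI status warnings per path. It assumes the message format matches the sample entry in this article, and the function name is hypothetical. It is intended to be run on a workstation against a copy of the log, since the Python interpreter on the service console may be too old for it.

```python
import re
from collections import Counter

# Matches the "WARNING: SCSI: <n>: vmhbaA:T:L:P status = ..." form seen in the
# sample entry (an assumption about the log format).
STATUS_RE = re.compile(r"WARNING: SCSI: \d+: (vmhba\d+:\d+:\d+:\d+) status = ")


def count_scsi_warnings(lines):
    """Count SCSI status warnings per vmhba path in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        m = STATUS_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts
```

A typical use is to open a copy of the vmkernel log, pass the file object to count_scsi_warnings, and print the result of most_common() to list the noisiest paths first.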