In a VxRail environment, the lsu-lsi-lsi-msgpt3-plugin can conflict with PTAgent, resulting in the ESXi host becoming unresponsive.

Article ID: 334767

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:
ESXi may become unresponsive on Dell VxRail platforms running VxRail 4.0.302 or an earlier release. You may experience one or more of the following symptoms:
  • The lock file /etc/vmware/esx.conf.LOCK exists
  • hostd is not responding
The following error may be seen in the CLI on one or more nodes:
Exception occurred: Error interacting with configuration file /etc/vmware/esx.conf: Timout while waiting for lock, /etc/vmware/esx.conf.LOCK, to be released. Another process has kept this file locked for more than 30 seconds. The process currently holding the lock is esxcfg-mpath(237504201).  This is likely a temporary condition.  Please try your operation again.

The vmkernel.log of one or more nodes may contain something similar to:
2017-07-30T11:49:26.388Z cpu79:37028)FSS: 6264: Failed to open file 'naa.5002538a072cb670'; Requested flags 0x5, world: 37028 [DellPTAgent], (Existing flags 0x4005, world: 36267 [hostd-worker]): Busy
2017-07-30T11:50:28.040Z cpu66:37028)FSS: 6264: Failed to open file 'naa.5002538a072cb620'; Requested flags 0x5, world: 37028 [DellPTAgent], (Existing flags 0x4005, world: 35579 [hostd-worker]): Busy


AND / OR:
2017-06-17T16:22:54.246Z warning hostd-probe[FF8D0350] [Originator@6876 sub=Default] TimeoutException -- Operation timed out hostd detected to be non-responsive

The hostd.log of one or more nodes may contain something similar to:
2017-06-17T15:19:28.454Z warning hostd[47781B70] [Originator@6876 sub=Hostsvc.NetworkProvider opID=54708a35 user=vpxuser] Error getting dvs 5b 45 05 50 62 c4 95 15-61 3d 4f 33 c1 1d 67 37 : Error interacting with configuration file /etc/vmware/esx.conf: Timout while waiting for lock, /etc/vmware/esx.conf.LOCK, to be released. Another process has kept this file locked for more than 30 seconds. The process currently holding the lock is esxcfg-mpath(578473882).  This is likely a temporary condition.  Please try your operation again.
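
The two log signatures quoted above can be counted with a small helper. This is a minimal sketch, not a supported tool: the function name count_signatures is hypothetical, and the log paths are passed in by the caller (on ESXi they are normally /var/log/vmkernel.log and /var/log/hostd.log).

```shell
# Count log lines matching the two signatures quoted in this article.
# count_signatures is a hypothetical helper name, not an ESXi command.
# Usage: count_signatures <vmkernel.log path> <hostd.log path>
count_signatures() {
    vmk_log=$1
    hostd_log=$2
    # DellPTAgent failing to open a device that a hostd-worker already holds.
    # grep -c exits non-zero on zero matches, so fall back to 0 explicitly.
    busy=$(grep -c 'DellPTAgent.*hostd-worker.*Busy' "$vmk_log" 2>/dev/null) || busy=${busy:-0}
    # hostd timing out on the esx.conf lock ("Timout" is the literal spelling in the log).
    lock=$(grep -c 'Timout while waiting for lock, /etc/vmware/esx.conf.LOCK' "$hostd_log" 2>/dev/null) || lock=${lock:-0}
    echo "ptagent_busy=$busy esxconf_lock_timeout=$lock"
}
```

A non-zero count for either signature is only a hint that the host matches this article; it does not by itself prove the plug-in is the cause.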

Environment

VMware vSphere ESXi 6.0

Cause

Certain ESXi commands generate excessive IOCTL commands to the storage controllers. These commands take a long time to return, and the ESXi host eventually becomes unresponsive.
 
Removing the lsu-lsi-lsi-msgpt3-plugin VIB can mitigate the issue.
 

[root@prme-vxrail-12-03:~] esxcli software vib list | grep msgpt3-plugin
lsu-lsi-lsi-msgpt3-plugin      1.0.0-1vmw.600.0.0.2494585           VMware  VMwareCertified   2017-02-07

// vdu_-a-.txt:
For '/opt/lsi/lib/libstorelibir-3.so':
                  tardisk dell_pta.v00:       586789
 
For '/opt/dell/DellPTAgent/lsi/lib/libstorelibir-3.so':
                  tardisk dell_pta.v00:       586789

For '/usr/lib/vmware/lsu_plugins/lsi-lsi-msgpt3-plugin/libstorelibir-3.so':
                  tardisk lsu_lsi_.v01:       406996
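
The VIB check shown above can be wrapped in a function that reports whether the plug-in is installed. This is a sketch under the assumption that the VIB name is the first column and the version the second, as in the sample output; find_msgpt3_plugin is a hypothetical name.

```shell
# Read `esxcli software vib list` output on stdin.
# Print "name version" for the msgpt3 LSU plug-in and return 0 if it is listed;
# print nothing and return 1 if it is absent.
find_msgpt3_plugin() {
    awk '$1 == "lsu-lsi-lsi-msgpt3-plugin" { print $1, $2; found = 1 }
         END { exit found ? 0 : 1 }'
}
```

On a host this might be run as `esxcli software vib list | find_msgpt3_plugin`.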

Resolution

This issue has been addressed in VxRail 4.5.x and 4.0.400:
  • lsu-lsi-lsi-msgpt3-plugin has been removed from the VxRail 4.5.x and 4.0.400 Golden Image
  • lsu-lsi-lsi-msgpt3-plugin has been removed from the VxRail 4.5.x and 4.0.400 upgrade composite bundle
  • lsu-lsi-lsi-msgpt3-plugin is removed from the host when upgrading to 4.5.x or 4.0.400


Workaround:
Customers on VxRail 4.0.302 or an earlier release who are affected by this issue can use whichever option below matches the state of the host.

a) If /etc/vmware/esx.conf.LOCK exists
1. Stage the removal action

# esxcli software vib remove --no-live-install -n lsu-lsi-lsi-msgpt3-plugin
Removal Result
   Message: The update completed successfully, but the system needs to be rebooted for the changes to be effective.
   Reboot Required: true
   VIBs Installed:
   VIBs Removed: VMware_bootbank_lsu-lsi-lsi-msgpt3-plugin_1.0.0-1vmw.600.0.0.2494585

   VIBs Skipped:

2. Reboot the ESXi host. The VIB removal does not take effect until the host has been rebooted.
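
Whether a removal still needs a reboot can be read from the "Reboot Required" line of the result. The helper below is a minimal sketch that assumes the result format shown in this article; reboot_required_value is a hypothetical name.

```shell
# Read a `vib remove` result on stdin and print the value of its
# "Reboot Required" line ("true" or "false" in the outputs shown here).
reboot_required_value() {
    awk -F': *' '/Reboot Required/ { print $2 }'
}
```

A saved copy of the removal output can be piped into the function; a printed "true" means the removal is only staged until the host reboots.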

b) If /etc/vmware/esx.conf.LOCK does not exist
1. Log in to the node and run the following command to list the processes linked with the plug-in:

# lsof | grep msgpt3
2. Stop the processes that are linked with the plug-in, one at a time:
# /etc/init.d/<name of process> stop
3. Confirm that no running processes are still linked with the plug-in:
# lsof | grep msgpt3
4. If the command in step 3 returns no output, remove the plug-in:
# localcli software vib remove -n lsu-lsi-lsi-msgpt3-plugin
5. Restart the stopped processes:
# /etc/init.d/<name of process> start
6. Verify that no process is linked to lsu-lsi-lsi-msgpt3-plugin:
# lsof | grep msgpt3
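
The choice between options a) and b) above depends only on whether the lock file exists, which can be sketched as follows. pick_workaround is a hypothetical helper; the optional path argument exists only so the logic can be tried against a test file and otherwise defaults to the lock file named in this article.

```shell
# Print which workaround applies, based on whether the lock file exists.
# Usage: pick_workaround [lock file path]
pick_workaround() {
    lock_file=${1:-/etc/vmware/esx.conf.LOCK}
    if [ -e "$lock_file" ]; then
        echo "a: stage the removal with --no-live-install, then reboot"
    else
        echo "b: stop linked processes, remove the VIB, restart the processes"
    fi
}
```

The function only reports a recommendation; it does not run any removal command itself.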

Example of removing the plug-in from a node:
# lsof | grep msgpt3
35350  smartd        MMAP  -1   /usr/lib/vmware/lsu_plugins/liblsu_lsi_lsi_msgpt3_plugin.so (prot:--/len:12288)
35350  smartd        MMAP  -1   /usr/lib/vmware/lsu_plugins/liblsu_lsi_lsi_msgpt3_plugin.so (prot:--/len:8192)
35350  smartd        MMAP  -1   /usr/lib/vmware/lsu_plugins/lsi-lsi-msgpt3-plugin/libstorelibir-3.so (prot:--/len:393216)
35350  smartd        MMAP  -1   /usr/lib/vmware/lsu_plugins/lsi-lsi-msgpt3-plugin/libstorelibir-3.so (prot:--/len:16384)
34915  storageRM     MMAP  -1   /usr/lib/vmware/lsu_plugins/liblsu_lsi_lsi_msgpt3_plugin.so (prot:--/len:12288)
34915  storageRM     MMAP  -1   /usr/lib/vmware/lsu_plugins/liblsu_lsi_lsi_msgpt3_plugin.so (prot:--/len:8192)
34915  storageRM     MMAP  -1   /usr/lib/vmware/lsu_plugins/lsi-lsi-msgpt3-plugin/libstorelibir-3.so (prot:--/len:393216)
34915  storageRM     MMAP  -1   /usr/lib/vmware/lsu_plugins/lsi-lsi-msgpt3-plugin/libstorelibir-3.so (prot:--/len:16384)
34901  sdrsInjector  MMAP  -1   /usr/lib/vmware/lsu_plugins/liblsu_lsi_lsi_msgpt3_plugin.so (prot:--/len:12288)
34901  sdrsInjector  MMAP  -1   /usr/lib/vmware/lsu_plugins/liblsu_lsi_lsi_msgpt3_plugin.so (prot:--/len:8192)
34901  sdrsInjector  MMAP  -1   /usr/lib/vmware/lsu_plugins/lsi-lsi-msgpt3-plugin/libstorelibir-3.so (prot:--/len:393216)
34901  sdrsInjector  MMAP  -1   /usr/lib/vmware/lsu_plugins/lsi-lsi-msgpt3-plugin/libstorelibir-3.so (prot:--/len:16384)


# /etc/init.d/smartd stop
watchdog-smartd: Terminating watchdog process with PID 35303
smartd stopped
# /etc/init.d/storageRM stop
watchdog-storageRM: Terminating watchdog process with PID 34868
storageRM stopped
# /etc/init.d/sdrsInjector stop
watchdog-sdrsInjector: Terminating watchdog process with PID 34847
sdrsInjector stopped

# localcli software vib remove -n lsu-lsi-lsi-msgpt3-plugin
Removal Result:
   Message: Operation finished successfully.
   Reboot Required: false
   VIBs Installed:
   VIBs Removed: VMware_bootbank_lsu-lsi-lsi-msgpt3-plugin_1.0.0-1vmw.600.0.0.2494585
   VIBs Skipped:

# /etc/init.d/sdrsInjector start
sdrsInjector started
# /etc/init.d/storageRM start
storageRM started
# /etc/init.d/smartd start
smartd started
# /etc/init.d/hostd start
Ramdisk 'hostd' with estimated size of 1803MB already exists
hostd started.

# lsof | grep msgpt3
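
The lsof output in the example lists each process several times. A short filter can reduce it to the unique process names to stop and later restart. This is a sketch that assumes the process name is the second column, as in the example output, and that each name matches a script under /etc/init.d (true for smartd, storageRM and sdrsInjector in the example, but not guaranteed in general); plugin_holders is a hypothetical name.

```shell
# Read `lsof | grep msgpt3` output on stdin and print each process name
# (column 2) once, in the order it first appears.
plugin_holders() {
    awk '!seen[$2]++ { print $2 }'
}
```

On a host this might be run as `lsof | grep msgpt3 | plugin_holders`, with each printed name then checked against /etc/init.d before it is stopped.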