In a VxRail environment, the lsu-lsi-lsi-msgpt3-plugin can conflict with PTAgent, causing the ESXi host to become unresponsive.
Article ID: 334767
Products
VMware vSphere ESXi
Issue/Introduction
Symptoms: ESXi may become unresponsive on Dell VxRail platforms running VxRail 4.0.302 or an earlier release. You may experience one or more of the following symptoms:
Lock file exists - /etc/vmware/esx.conf.LOCK
hostd not responding
The following error may appear in the CLI on one or more nodes:
   Exception occurred: Error interacting with configuration file /etc/vmware/esx.conf: Timout while waiting for lock, /etc/vmware/esx.conf.LOCK, to be released. Another process has kept this file locked for more than 30 seconds. The process currently holding the lock is esxcfg-mpath(237504201). This is likely a temporary condition. Please try your operation again.
The vmkernel.log of one or more nodes may contain something similar to:
   2017-07-30T11:49:26.388Z cpu79:37028)FSS: 6264: Failed to open file 'naa.5002538a072cb670'; Requested flags 0x5, world: 37028 [DellPTAgent], (Existing flags 0x4005, world: 36267 [hostd-worker]): Busy
   2017-07-30T11:50:28.040Z cpu66:37028)FSS: 6264: Failed to open file 'naa.5002538a072cb620'; Requested flags 0x5, world: 37028 [DellPTAgent], (Existing flags 0x4005, world: 35579 [hostd-worker]): Busy
AND / OR:
   2017-06-17T16:22:54.246Z warning hostd-probe[FF8D0350] [Originator@6876 sub=Default] TimeoutException -- Operation timed out
   hostd detected to be non-responsive
The hostd.log of one or more nodes may contain something similar to:
   2017-06-17T15:19:28.454Z warning hostd[47781B70] [Originator@6876 sub=Hostsvc.NetworkProvider opID=54708a35 user=vpxuser] Error getting dvs 5b 45 05 50 62 c4 95 15-61 3d 4f 33 c1 1d 67 37 : Error interacting with configuration file /etc/vmware/esx.conf: Timout while waiting for lock, /etc/vmware/esx.conf.LOCK, to be released. Another process has kept this file locked for more than 30 seconds. The process currently holding the lock is esxcfg-mpath(578473882). This is likely a temporary condition. Please try your operation again.
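The symptoms listed above can be checked in one pass. The following is a minimal sketch, not a supported tool: the function name check_symptoms is hypothetical, and the default log paths /var/log/vmkernel.log and /var/log/hostd.log are assumptions about where the logs live on the host. Only the lock file path and the log signatures come from this article.

```shell
#!/bin/sh
# Sketch: report which of the symptoms from this article are present.
# check_symptoms is a hypothetical helper; the default log paths are
# assumptions and can be overridden with arguments 2 and 3.

check_symptoms() {
    lock_file="${1:-/etc/vmware/esx.conf.LOCK}"
    vmkernel_log="${2:-/var/log/vmkernel.log}"
    hostd_log="${3:-/var/log/hostd.log}"
    found=0

    # Symptom: the esx.conf lock file exists.
    if [ -e "$lock_file" ]; then
        echo "lock file present: $lock_file"
        found=1
    fi

    # Symptom: DellPTAgent fails to open a device that hostd holds open.
    if [ -r "$vmkernel_log" ] &&
       grep -q 'Failed to open file .*\[DellPTAgent\].*Busy' "$vmkernel_log"; then
        echo "DellPTAgent busy-open failures in $vmkernel_log"
        found=1
    fi

    # Symptom: hostd times out waiting for the esx.conf lock.
    if [ -r "$hostd_log" ] &&
       grep -q 'esx.conf.LOCK, to be released' "$hostd_log"; then
        echo "esx.conf lock timeouts in $hostd_log"
        found=1
    fi

    [ "$found" -eq 1 ] || echo "no matching symptoms found"
}
```

Calling check_symptoms with no arguments uses the default paths. A match only indicates that a log signature is present; it does not by itself confirm this issue.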
Environment
VMware vSphere ESXi 6.0
Cause
Certain ESXi commands generate an excessive number of IOCTL commands to the storage controllers. These commands then take a long time to return, and the ESXi host eventually becomes unresponsive.
Removing the lsu-lsi-lsi-msgpt3-plugin VIB can mitigate the issue.
// vdu_-a-.txt:
For '/opt/lsi/lib/libstorelibir-3.so': tardisk dell_pta.v00: 586789
For '/opt/dell/DellPTAgent/lsi/lib/libstorelibir-3.so': tardisk dell_pta.v00: 586789
For '/usr/lib/vmware/lsu_plugins/lsi-lsi-msgpt3-plugin/libstorelibir-3.so': tardisk lsu_lsi_.v01: 406996
Resolution
This issue is resolved in VxRail 4.5.x and 4.0.400, which include the following changes:
Removed lsu-lsi-lsi-msgpt3-plugin from VxRail 4.5.x, 4.0.400 Golden Image
Removed lsu-lsi-lsi-msgpt3-plugin from VxRail 4.5.x, 4.0.400 upgrade composite bundle
Removed lsu-lsi-lsi-msgpt3-plugin when upgrading to 4.5.x or 4.0.400
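After an upgrade, or after applying a workaround, it can be useful to confirm that the plug-in is no longer installed. The sketch below is one way to do that, assuming the VIB name appears in the first column of the esxcli software vib list output. The function name plugin_installed is hypothetical.

```shell
#!/bin/sh
# Sketch: succeed (exit 0) if the plug-in VIB appears in a VIB listing
# read from stdin, fail (exit 1) otherwise. plugin_installed is a
# hypothetical helper; it assumes the VIB name is the first
# whitespace-separated column of each row.

plugin_installed() {
    awk -v name="lsu-lsi-lsi-msgpt3-plugin" '
        $1 == name { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example use on a host (not executed here):
#   esxcli software vib list | plugin_installed && echo "plug-in still installed"
```

Reading the listing from stdin keeps the check independent of how the listing was produced, so it also works against a saved copy of the output.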
Workaround: Customers on VxRail 4.0.302 or an earlier release who are affected by this issue can use the option below that matches the state of the host.
a) If /etc/vmware/esx.conf.LOCK exists
1. Stage the removal action:
   # esxcli software vib remove --no-live-install -n lsu-lsi-lsi-msgpt3-plugin
   Removal Result
   Message: The update completed successfully, but the system needs to be rebooted for the changes to be effective.
   Reboot Required: true
   VIBs Installed:
   VIBs Removed: VMware_bootbank_lsu-lsi-lsi-msgpt3-plugin_1.0.0-1vmw.600.0.0.2494585
   VIBs Skipped:
2. The VIB removal does not take effect until the ESXi host is rebooted.
b) If /etc/vmware/esx.conf.LOCK doesn't exist
1. Log in to the node and run the following command to list the processes linked with the plug-in:
   # lsof | grep msgpt3
2. Stop the processes that are linked with the plug-in, one at a time:
   # /etc/init.d/<name of process> stop
3. Confirm that no running processes are still linked with the plug-in:
   # lsof | grep msgpt3
4. If the output of the command from step 3 is blank, remove the plug-in:
   # localcli software vib remove -n lsu-lsi-lsi-msgpt3-plugin
5. Restart the stopped processes:
   # /etc/init.d/<name of process> start
6. Confirm that no process is linked to lsu-lsi-lsi-msgpt3-plugin:
   # lsof | grep msgpt3
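The choice between options a and b depends only on whether the lock file exists. The sketch below prints the removal command for the applicable option without running anything, so it can be reviewed first. The function name plan_workaround is hypothetical; the commands it prints are the ones given in the steps above.

```shell
#!/bin/sh
# Sketch: print (do not run) the removal command for the workaround
# option that applies. plan_workaround is a hypothetical helper; the
# lock file path can be overridden with the first argument.

PLUGIN="lsu-lsi-lsi-msgpt3-plugin"

plan_workaround() {
    lock_file="${1:-/etc/vmware/esx.conf.LOCK}"
    if [ -e "$lock_file" ]; then
        echo "# option a: lock file present, stage the removal, then reboot"
        echo "esxcli software vib remove --no-live-install -n $PLUGIN"
    else
        echo "# option b: no lock file, stop linked processes, then remove live"
        echo "lsof | grep msgpt3"
        echo "localcli software vib remove -n $PLUGIN"
    fi
}
```

For option b the printed lines are only a reminder. The processes reported by lsof still have to be stopped and restarted by hand as described in steps 2 and 5.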