ESXi hosts that boot from an SD card (Note: see the revised SD card guidance) may fail remediation in clusters managed with a single image.
Remediation of cluster failed:
Remediation failed for Host : <esxi_hostname>
<esxi_hostname> - Failed to remediate host
The lifecycle log on the ESXi host (/var/run/log/lifecycle.log) shows errors similar to the following snippets:
Er(11) lifecycle[2119667]: imagemanagerctl:96 [InstallationError]
Er(11)[+] lifecycle[2119667]: VMware_locker_tools-light_12.3.5.22544099-23305545: Error while waiting for untar process '['/bin/tar', 'xzf', '-', '-C', '/locker/packages/']': Timeout (30 seconds) expired waiting for output from command '['/bin/tar', 'xzf', '-', '-C', '/locker/packages/']', pid '2119711'.
Er(11) lifecycle[2119667]: imagemanagerctl:101 Traceback (most recent call last):
Er(11) lifecycle[2119667]: imagemanagerctl:101 File "/tmp/esx-update-2119667/lib64/python3.8/site-packages/vmware/esximage/Installer/LockerInstaller.py", line 107, in close
Er(11) lifecycle[2119667]: imagemanagerctl:101 File "/lib64/python3.8/site-packages/vmware/runcommand.py", line 290, in waitProcessToComplete
Er(11) lifecycle[2119667]: imagemanagerctl:101 vmware.runcommand.RunCommandError: Timeout (30 seconds) expired waiting for output from command '['/bin/tar', 'xzf', '-', '-C', '/locker/packages/']', pid '2119711'.
Er(11) lifecycle[2119667]: imagemanagerctl:101
Er(11) lifecycle[2119667]: imagemanagerctl:101 During handling of the above exception, another exception occurred:
Er(11) lifecycle[2119667]: imagemanagerctl:101
Er(11) lifecycle[2119667]: imagemanagerctl:101 Traceback (most recent call last):
Er(11) lifecycle[2119667]: imagemanagerctl:101 File "/tmp/esx-update-2119667/lib64/python3.8/site-packages/vmware/esximage/HostImage.py", line 947, in _download_and_stage
Er(11) lifecycle[2119667]: imagemanagerctl:101 File "/tmp/esx-update-2119667/lib64/python3.8/site-packages/vmware/esximage/HostImage.py", line 816, in _verify_and_write_payload
Er(11) lifecycle[2119667]: imagemanagerctl:101 File "/tmp/esx-update-2119667/lib64/python3.8/site-packages/vmware/esximage/Installer/LockerInstaller.py", line 112, in close
Er(11) lifecycle[2119667]: imagemanagerctl:101 esximage.Errors.InstallationError: Error while waiting for untar process '['/bin/tar', 'xzf', '-', '-C', '/locker/packages/']': Timeout (30 seconds) expired waiting for output from command '['/bin/tar', 'xzf', '-', '-C', '/locker/packages/']', pid '2119711'.
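To check a collected log for this signature, a small scan along the following lines may help. This is a minimal sketch: the regular expression is built only from the error text shown above, and the names UNTAR_TIMEOUT and find_untar_timeouts are illustrative, not part of any VMware tooling.

```python
import re

# Pattern derived from the error text in the log snippets; the leading VIB
# name is optional because some log lines omit it.
UNTAR_TIMEOUT = re.compile(
    r"(?:(?P<vib>[A-Za-z0-9]+_[A-Za-z0-9_.-]+): )?"
    r"Error while waiting for untar process .*?"
    r"Timeout \((?P<seconds>\d+) seconds\) expired .*?"
    r"pid '(?P<pid>\d+)'"
)


def find_untar_timeouts(log_text):
    """Return one dict per log line that carries the untar timeout error."""
    hits = []
    for line in log_text.splitlines():
        match = UNTAR_TIMEOUT.search(line)
        if match:
            hits.append({
                "vib": match.group("vib"),          # None when the line has no VIB prefix
                "seconds": int(match.group("seconds")),
                "pid": int(match.group("pid")),
            })
    return hits
```

Passing the contents of a copied lifecycle.log to find_untar_timeouts returns an empty list when the signature is absent, which makes it easy to tell this failure apart from other remediation errors.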
On a baseline-managed cluster that uses predefined baselines, the error message and corresponding log snippets are as follows:
During installation, remediation fails with: "An error occurred during host configuration".
Log snippets:
vmware-vum-server.log:
yyyy-mm-ddT info vmware-vum-server[09945] [Originator@6876 sub=VciRemediateTask.RemediateTask{603}] [vciTaskBase 1372] SerializeToVimFault fault:
--> (integrity.fault.VcIntegrityFault) {
--> faultCause = (vmodl.MethodFault) null,
--> faultMessage = <unset>
--> msg = "Install error on host: , error details: Platform Configuration Error: /usr/sbin/esxupdate returned with exit status: 15"
--> }
--> Converted fault:
--> (vim.fault.ExtendedFault) {
--> faultCause = (vmodl.MethodFault) null,
--> faultMessage = <unset>,
--> faultTypeId = "com.vmware.vcIntegrity.VcIntegrityFault",
--> data = (vim.KeyValue) [
--> (vim.KeyValue) {
--> key = "faultCause",
--> value = ""
--> },
--> (vim.KeyValue) {
--> key = "faultMessage",
--> value = ""
--> }
--> ]
--> msg = "Install error on host: , error details: Platform Configuration Error: /usr/sbin/esxupdate returned with exit status: 15"
--> }
hostd.log:
yyyy-mm-ddT info hostd[2100677] [Originator@6876 sub=Hostsvc.VmkVprobSource] VmkVprobSource::Post event: (vim.event.EventEx) {
--> key = 58,
--> chainId = 1025431560,
--> createdTime = "",
--> userName = "",
--> host = (vim.event.HostEventArgument) {
--> name = "",
--> host = 'vim.HostSystem:ha-host'
--> },
--> eventTypeId = "esx.problem.esximage.install.stage.error",
--> arguments = (vmodl.KeyAnyValue) [
--> (vmodl.KeyAnyValue) {
--> key = "1",
--> value = "(Updated) Cisco-UCS-Custom-ESXi-7-15843807_4.1.1-a"
--> },
--> (vmodl.KeyAnyValue) {
--> key = "2",
--> value = "VMware_locker_tools-light_12.3.5.22544099-23794019: Error while waiting for untar process '['/bin/tar', 'xzf', '-', '-C', '/locker/packages/']': Timeout (30 seconds) expired waiting for output from command '['/bin/tar', 'xzf', '-', '-C', '/locker/packages/']', pid '2171551'."
--> }
--> ],
--> objectId = "ha-host",
--> objectType = "vim.HostSystem",
--> }
yyyy-mm-ddT info hostd[2100677] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 7601 : Could not stage image profile '(Updated) Cisco-UCS-Custom-ESXi-7-15843807_4.1.1-a': VMware_locker_tools-light_12.3.5.22544099-23794019: Error while waiting for untar process '['/bin/tar', 'xzf', '-', '-C', '/locker/packages/']': Timeout (30 seconds) expired waiting for output from command '['/bin/tar', 'xzf', '-', '-C', '/locker/packages/']', pid '2171551'.
esxupdate.log:
esxupdate.1:yyyy-mm-ddT esxupdate: 2171510: vmware.runcommand: INFO: runcommand called with: args = '['/usr/lib/vmware/vob/bin/addvob', 'vob.user.esximage.install.stage.error', '(Updated) Cisco-UCS-Custom-ESXi-7-15843807_4.1.1-a', "VMware_locker_tools-light_12.3.5.22544099-23794019: Error while waiting for untar process '['/bin/tar', 'xzf', '-', '-C', '/locker/packages/']': Timeout (30 seconds) expired waiting for output from command '['/bin/tar', 'xzf', '-', '-C', '/locker/packages/']', pid '2171551'."]', outfile = 'None', returnoutput = 'True', timeout = '0.0'.
esxupdate.1:yyyy-mm-ddT esxupdate: 2171510: esxupdate: ERROR: esximage.Errors.InstallationError: Error while waiting for untar process '['/bin/tar', 'xzf', '-', '-C', '/locker/packages/']': Timeout (30 seconds) expired waiting for output from command '['/bin/tar', 'xzf', '-', '-C', '/locker/packages/']', pid '2171551'.
esxupdate.1:yyyy-mm-ddT esxupdate: 2171510: esxupdate: ERROR: esximage.Errors.InstallationError: VMware_locker_tools-light_12.3.5.22544099-23794019: Error while waiting for untar process '['/bin/tar', 'xzf', '-', '-C', '/locker/packages/']': Timeout (30 seconds) expired waiting for output from command '['/bin/tar', 'xzf', '-', '-C', '/locker/packages/']', pid '2171551'.
esxupdate.6:yyyy-mm-ddT esxupdate: 2139653: esxupdate: ERROR: esximage.Errors.InstallationError: Error while waiting for untar process '['/bin/tar', 'xzf', '-', '-C', '/locker/packages/']': Timeout (30 seconds) expired waiting for output from command '['/bin/tar', 'xzf', '-', '-C', '/locker/packages/']', pid '2139687'.
esxupdate.6:yyyy-mm-ddT1 esxupdate: 2139653: esxupdate: ERROR: esximage.Errors.InstallationError: VMware_locker_tools-light_12.3.5.22544099-23794019: Error while waiting for untar process '['/bin/tar', 'xzf', '-', '-C', '/locker/packages/']': Timeout (30 seconds) expired waiting for output from command '['/bin/tar', 'xzf', '-', '-C', '/locker/packages/']', pid '2139687'.
esxupdate.7:yyyy-mm-ddT esxupdate: 2139653: vmware.runcommand: INFO: runcommand called with: args = '['/usr/lib/vmware/vob/bin/addvob', 'vob.user.esximage.install.stage.error', '(Updated) Cisco-UCS-Custom-ESXi-7-15843807_4.1.1-a', "VMware_locker_tools-light_12.3.5.22544099-23794019: Error while waiting for untar process '['/bin/tar', 'xzf', '-', '-C', '/locker/packages/']': Timeout (30 seconds) expired waiting for output from command '['/bin/tar', 'xzf', '-', '-C', '/locker/packages/']', pid '2139687'."]', outfile = 'None', returnoutput = 'True', timeout = '0.0'.
This issue occurs when the untar operation for an individual VIB takes longer than the default 30-second timeout during remediation. On hosts that use an SD card for the OSData partition, the VMware Tools untar process can take somewhat longer than that.
This is a known issue with vSphere ESXi 7.x and 8.x. Engineering is actively working on a fix for a future release.
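The failure mode can be illustrated with a short sketch: a child process that is slower than the caller's timeout is treated as failed even though it would have completed. This is not the esximage implementation. It uses a total-runtime timeout from Python's subprocess module as a simplification of the "waiting for output" timeout seen in the logs, and the names run_with_timeout and SLOW_JOB are hypothetical.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout):
    """Return True if cmd exits successfully within timeout seconds, else False."""
    try:
        subprocess.run(
            cmd,
            check=True,
            timeout=timeout,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return True
    except subprocess.TimeoutExpired:
        # The child is killed here, much like an untar that overruns its limit.
        return False


# Stand-in for a slow untar onto SD-card-backed storage: a child that
# simply sleeps for one second before exiting normally.
SLOW_JOB = [sys.executable, "-c", "import time; time.sleep(1)"]
```

With SLOW_JOB, a timeout shorter than the job's runtime reports failure while a generous timeout reports success, which is the reasoning behind raising the limit in the first workaround.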
Workaround 1:
Step 1: Manually increase the timeout of the untar process:
Note: These changes are not persistent and revert to the defaults after the host reboots.
vsish -e set /config/VisorFS/intOpts/VisorFSPristineTardisk 0
cp /lib64/python3.8/site-packages/vmware/esximage/Installer/LockerInstaller.py /tmp/LockerInstaller.py
cp /tmp/LockerInstaller.py /tmp/LockerInstaller.py.bak
chmod 777 /tmp/LockerInstaller.py
vi /tmp/LockerInstaller.py
In the close method, change the default timeout value from 30 to 120.
Before:
    def close(self, timeout=30):
        '''Close untar stream and wait for process completion.
        Parameters:
        * timeout - the amount of time in seconds, to wait for output
        or completion of the process.
        '''
After:
    def close(self, timeout=120):
        '''Close untar stream and wait for process completion.
        Parameters:
        * timeout - the amount of time in seconds, to wait for output
        or completion of the process.
        '''
Save the file and exit the vi editor using :wq!
mv /tmp/LockerInstaller.py /lib64/python3.8/site-packages/vmware/esximage/Installer/LockerInstaller.py
vsish -e set /config/VisorFS/intOpts/VisorFSPristineTardisk 1
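The manual edit in Step 1 could also be expressed as a text substitution, which avoids typing mistakes in the editor. The following is a minimal sketch that works on the source text of a copy such as /tmp/LockerInstaller.py. The function name raise_close_timeout is hypothetical, and the sketch assumes the method signature appears exactly as shown above. It does not change the VisorFS setting or move the file back into place.

```python
import re

# Matches the close() signature shown in Step 1, with the default of 30.
_CLOSE_DEF = re.compile(r"(def close\(self, timeout=)30(\):)")


def raise_close_timeout(source, new_timeout=120):
    """Return source with the close() default timeout changed from 30.

    Raises ValueError unless exactly one matching definition is found, so
    an already-patched or unexpected file is never silently rewritten.
    """
    patched, count = _CLOSE_DEF.subn(
        lambda m: f"{m.group(1)}{new_timeout}{m.group(2)}", source
    )
    if count != 1:
        raise ValueError(
            f"expected exactly one close() with timeout=30, found {count}"
        )
    return patched
```

A caller would read the copied file, pass its text through raise_close_timeout, and write the result back to the copy before continuing with the mv and vsish commands above.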
Step 2: For vLCM, manually upgrade VMware Tools to the version specified in the vLCM image desired state.
esxcli software component apply -n VMware-VM-Tools -d /<path>/VMware-ESXi-X.0-XXXXXXXX-depot.zip
Step 3: Proceed with remediation, using esxcli software vib update instead of esxcli software profile update.
Workaround 2:
Step 1: Create a new baseline, or modify the existing baseline or image, to include a newer VMware Tools package (for example, 12.4.x or newer).
Step 2: Proceed with remediation.
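When reviewing which tools-light package a baseline or image carries, the version can be read from the VIB name format seen in the logs. The sketch below assumes that naming format and uses 12.4 as the threshold only because it is the example given above, not because it is a confirmed fixed version. The helper names are hypothetical.

```python
import re


def tools_version(vib_name):
    """Extract the dotted version from a tools-light VIB name as a tuple of ints."""
    match = re.search(r"tools-light_(\d+(?:\.\d+)*)-", vib_name)
    if match is None:
        raise ValueError(f"not a tools-light VIB name: {vib_name!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def is_at_least(vib_name, minimum=(12, 4)):
    """Return True if the VIB's leading version fields are >= minimum."""
    return tools_version(vib_name)[:len(minimum)] >= minimum
```

For the package named in the log snippets, VMware_locker_tools-light_12.3.5.22544099-23794019, is_at_least returns False, which matches the advice to move to a newer package.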