NSX-T VIB install/upgrade fails when SNMP is enabled on the ESX host.

Article ID: 327367

Products

VMware NSX Networking

Issue/Introduction

Symptoms:
The following error messages appear in /var/log/esxupdate.log:
20**-**-**T13:42:43Z esxupdate: 2181140: HostImage: DEBUG: installer LiveImageInstaller failed: ([], '([], "Error in running rm /tardisks/nsx_esx_.v00:\\nReturn code: 1\\nOutput: rm: can\'t remove \'/tardisks/nsx_esx_.v00\': Device or resource busy\\n\\nIt is not safe to continue. Please reboot the host immediately to discard the unfinished update.")'). Clean up the installation.
20**-**T13:42:43Z esxupdate: 2181140: vmware.runcommand: INFO: runcommand called with: args = 'localcli system visorfs ramdisk list | grep /stageliveimage && localcli system visorfs ramdisk remove -t /tmp/stageliveimage', outfile = 'None', returnoutput = 'True', timeout = '0.0'.
20**-**T13:42:43Z esxupdate: 2181140: root: ERROR: Traceback (most recent call last):
20**-**T13:42:43Z esxupdate: 2181140: root: ERROR:   File "/build/mts/release/bora-13981272/bora/build/esx/release/vmvisor/esxupdate/lib64/python3.5/site-packages/vmware/esximage/Installer/LiveImageInstaller.py", line 1352, in RunCmdWithRetries
20**-**T13:42:43Z esxupdate: 2181140: root: ERROR:   File "/build/mts/release/bora-13981272/bora/build/esx/release/vmvisor/esxupdate/lib64/python3.5/site-packages/vmware/esximage/Installer/LiveImageInstaller.py", line 1344, in RunCmdWithMsg
20**-**T13:42:43Z esxupdate: 2181140: root: ERROR: vmware.esximage.Errors.InstallationError: ([], "Error in running rm /tardisks/nsx_esx_.v00:\nReturn code: 1\nOutput: rm: can't remove '/tardisks/nsx_esx_.v00': Device or resource busy\n\nIt is not safe to continue. Please reboot the host immediately to discard the unfinished update.")
20**-**T13:42:43Z esxupdate: 2181140: root: ERROR: 
20**-**T13:42:43Z esxupdate: 2181140: root: ERROR: During handling of the above exception, another exception occurred:
20**-**T13:42:43Z esxupdate: 2181140: root: ERROR: 
20**-**T13:42:43Z esxupdate: 2181140: root: ERROR: Traceback (most recent call last):
2019-10-10T13:42:43Z esxupdate: 2181140: root: ERROR:   File "/build/mts/release/bora-13981272/bora/build/esx/release/vmvisor/esxupdate/lib64/python3.5/site-packages/vmware/esximage/Installer/LiveImageInstaller.py", line 285, in Remediate
20**-**T13:42:43Z esxupdate: 2181140: root: ERROR:   File "/build/mts/release/bora-13981272/bora/build/esx/release/vmvisor/esxupdate/lib64/python3.5/site-packages/vmware/esximage/Installer/LiveImageInstaller.py", line 401, in _RemoveVibs
20**-**T13:42:43Z esxupdate: 2181140: root: ERROR:   File "/build/mts/release/bora-13981272/bora/build/esx/release/vmvisor/esxupdate/lib64/python3.5/site-packages/vmware/esximage/Installer/LiveImageInstaller.py", line 556, in _UnmountTardisk
20**-**T13:42:43Z esxupdate: 2181140: root: ERROR:   File "/build/mts/release/bora-13981272/bora/build/esx/release/vmvisor/esxupdate/lib64/python3.5/site-packages/vmware/esximage/Installer/LiveImageInstaller.py", line 1357, in RunCmdWithRetries
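
To confirm that a host shows this signature, the log can be searched for the failing rm command together with the "Device or resource busy" text. The following is a minimal sketch and not part of the original procedure: the function name has_tardisk_busy_error is hypothetical, and it assumes both strings appear on the same log line, as they do in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: exits 0 when the given log file contains a line
# reporting that rm on an NSX tardisk failed with "Device or resource busy".
has_tardisk_busy_error() {
    # First keep only lines about the failed rm under /tardisks/,
    # then check whether any of them mention the busy error.
    grep -F "Error in running rm /tardisks/" "$1" \
        | grep -qF "Device or resource busy"
}

# Possible usage on the affected host:
#   has_tardisk_busy_error /var/log/esxupdate.log && echo "signature found"
```

Fixed-string matching (-F) is used so that the dots and slashes in the path are not interpreted as regular-expression characters.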


Check the SNMP status on the affected host:
[root@iccvxr1-101:~] /etc/init.d/snmpd status
snmpd is running
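
The status check above can be turned into a simple pre-upgrade test. This is a minimal sketch under stated assumptions: the function name snmpd_running is hypothetical, and it relies on the status command printing the text "snmpd is running" exactly as shown in the sample output.

```shell
#!/bin/sh
# Hypothetical helper: exits 0 when the supplied status text reports
# a running snmpd (matching the sample output shown above).
snmpd_running() {
    case "$1" in
        *"snmpd is running"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible usage on the host before starting the install or upgrade:
#   if snmpd_running "$(/etc/init.d/snmpd status 2>&1)"; then
#       echo "SNMP is enabled; disable it before upgrading"
#   fi
```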


Environment

VMware NSX-T Data Center
VMware NSX-T Data Center 2.x

Cause

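This issue was introduced when rebootless upgrade was enabled in NSX-T Data Center 2.4.0. With snmpd running on the host, the live installer cannot remove the existing NSX tardisk (the rm command returns "Device or resource busy"), so the install or upgrade fails.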

Resolution

This issue is fixed in NSX-T Data Center 2.5 and later.

Workaround:

Disable SNMP before starting the upgrade and re-enable it after the upgrade is completed.

Before starting the upgrade, disable SNMP:
esxcli system snmp set --enabled false

After the install is completed, re-enable SNMP:
esxcli system snmp set --enabled true
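
When the install is run by hand from the host shell, the two commands above can be wrapped around it so that SNMP is always restored. The following is only a sketch: the function name with_snmp_disabled and the ESXCLI override variable are hypothetical, and the command passed to the wrapper is whatever install step applies in your environment. If the upgrade is driven from NSX Manager, run the two esxcli commands separately instead.

```shell
#!/bin/sh
# ESXCLI can be overridden (for example ESXCLI=echo) for a dry run.
ESXCLI="${ESXCLI:-esxcli}"

# Hypothetical wrapper: disable SNMP, run the given command, then
# re-enable SNMP whether or not the command succeeded.
with_snmp_disabled() {
    # Stop here if SNMP cannot be disabled; do not start the install.
    $ESXCLI system snmp set --enabled false || return 1
    "$@"
    rc=$?
    $ESXCLI system snmp set --enabled true
    # Report the exit status of the wrapped command to the caller.
    return "$rc"
}

# Possible usage (the install command itself is not shown here):
#   with_snmp_disabled <your install command>
```

Re-enabling SNMP even after a failed command is a design choice that restores the host's original configuration; adjust it if you prefer to leave SNMP off until the host has been recovered.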

Alternatively, disable SNMP on the host from vCenter.



Additional Information

Impact/Risks:
Manual intervention is required to recover the transport node. Because the VMs have already been vMotioned off the host at the point of this failure, VM traffic is not affected. However, any other traffic, such as vmk traffic on the NSX-T switch, will be down on the transport node until the node is recovered.