ERROR task-executor-4-1-workitem-MP-DataMigrationDryRun MPRollingUpgradeServiceImpl 4023 SYSTEM [nsx@6876 comp="nsx-manager" errorCode="MP30459" level="ERROR" subcomp="upgrade-coordinator"] Failed to create local backup on '<NSX Manager Node IP>' nodeFailure reason is "Failed to run 'umount /dev/mapper/nsx-config__bak' command" in /var/log/syslog:
INFO task-executor-4-1-workitem-MP-DataMigrationDryRun ExecutionMonitorServiceImpl 4023 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="upgrade-coordinator"] Execution monitor service invoked to react to failure of node DataMigrationDryRun [Failed to create Local backup.]
INFO http-nio-127.0.0.1-7442-exec-5 UpgradeQueryServiceImpl 4023 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="upgrade-coordinator"] Returning upgrade status summary for MP details as Failed to create Local backup.
NSX 2365 - [nsx@6876 comp="nsx-manager" subcomp="node-mgmt" username="root" level="WARNING" invalid="true"] Response payload for InvokeGetClusterCentralAPI failed validation: Invalid body in response - Request does not allow body content
NSX 4023 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="upgrade-coordinator"] response: Response:: status code:200 OK status:OK body:{"status": "Failed", "progress_percentage": 80, "failure_reason": "Failed to run 'umount /dev/mapper/nsx-config__bak' command.", "_self": {"href": "/cluster/<Cluster UUID>/node/local-backup/status", "rel": "self"}}
NSX 4023 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="upgrade-coordinator"] Execution monitor service invoked to react to failure of node DataMigrationDryRun [Local backup creation failed on <NSX Manager Node IP> node]
NSX 4023 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="upgrade-coordinator"] Retrieving upgrade unit with id DataMigrationDryRun
NSX 4023 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="upgrade-coordinator"] Found upgrade unit with id: DataMigrationDryRun
NSX 4023 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="upgrade-coordinator"] Failed upgrade unit DataMigrationDryRun belongs to a component for which [pause_on_error] property is true. Initiating Pause on execution...
In /var/log/nvpapi/api_server.log, the local backup fails because a user shell is already inside "/config_bak/backup/4.1.2.4.0.23786742":
Note: The NSX version shown in the logs depends on the local NSX environment and may differ from the one shown here.
napi.root.node.local_backup INFO [LCL-BKP]Starting local backup generation for node version 4.2.1.3.0.24533887
--
napi.root.node.local_backup INFO [LCL-BKP]lsof /dev/mapper/nsx-config__bak: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
bash 170103 root cwd DIR 252,1 4096 279044 /config_bak/backup/4.1.2.4.0.23786742/cluster-node-backups/4.1.2.4.0.23786742-<Cluster UUID>-<NSX Manager Node IP>
--
napi.root.node.local_backup INFO [LCL-BKP]Unmounting /dev/mapper/nsx-config__bak
api_server ERROR Exception in thread
api_server ERROR Thread-367 (_generate_local_backup)
api_server ERROR :
api_server ERROR Traceback (most recent call last):
api_server ERROR File "/opt/vmware/nsx-node-api/bin/python/management_api/napi/root/node/local_backup.py", line 583, in _generate_local_backup
api_server ERROR
api_server ERROR mount_config_bak()
api_server ERROR
api_server ERROR File "/opt/vmware/nsx-node-api/bin/python/management_api/napi/root/node/local_backup.py", line 401, in mount_config_bak
api_server ERROR _unmount_device(CONFIG_BAK_PARTITION_DEVICE)
api_server ERROR File "/opt/vmware/nsx-node-api/bin/python/management_api/napi/root/node/local_backup.py", line 390, in _unmount_device
api_server ERROR _run_unmount_cmd(path)
api_server ERROR File "/opt/vmware/nsx-node-api/bin/python/management_api/napi/rest_routine_roothelper.py", line 192, in new_fn
api_server ERROR raise return_exception
api_server ERROR Exception
api_server ERROR :
api_server ERROR Failed to run 'umount /dev/mapper/nsx-config__bak' command.
api_server ERROR
During handling of the above exception, another exception occurred.
VMware NSX-T Data Center 3.x
VMware NSX 4.x
The backup process, which is part of the upgrade workflow, failed because it could not unmount the config_bak partition, a step NSX requires when performing a backup. As the lsof output in the logs shows, a root bash session had its working directory under /config_bak/backup/4.1.2.4.0.23786742, which kept the partition busy and caused _unmount_device(CONFIG_BAK_PARTITION_DEVICE) to fail.
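To make the lsof evidence in the log easier to read, the following minimal Python sketch parses lsof output in its default column layout and lists the processes that hold the device. It is an illustration only, not part of NSX: the `OpenFile` record and the `parse_lsof` function are hypothetical names. It assumes every row has all nine default columns, which holds for the row in the log but not necessarily for every lsof row type.

```python
from typing import List, NamedTuple


class OpenFile(NamedTuple):
    """One row of default lsof output (hypothetical helper type)."""
    command: str
    pid: int
    user: str
    fd: str
    name: str


def parse_lsof(output: str) -> List[OpenFile]:
    """Parse lsof output, skipping the header and any non-data lines."""
    records = []
    for line in output.splitlines():
        # Default columns: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
        fields = line.split(None, 8)
        if len(fields) < 9 or not fields[1].isdigit():
            continue  # header row, log prefix, or a row with missing columns
        records.append(
            OpenFile(fields[0], int(fields[1]), fields[2], fields[3], fields[8])
        )
    return records


if __name__ == "__main__":
    # Sample taken from the api_server.log excerpt above.
    sample = (
        "COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME\n"
        "bash 170103 root cwd DIR 252,1 4096 279044 "
        "/config_bak/backup/4.1.2.4.0.23786742/cluster-node-backups/"
        "4.1.2.4.0.23786742-<Cluster UUID>-<NSX Manager Node IP>\n"
    )
    for rec in parse_lsof(sample):
        print(f"{rec.command} (PID {rec.pid}, user {rec.user}) "
              f"holds {rec.fd} on {rec.name}")
```

An FD value of `cwd` means the process has its current working directory on the file system, which is enough to make `umount` fail.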
This is a condition that may occur in a VMware NSX environment.
Workaround
Ensure that no user session or shell has its working directory in the /config_bak folder before retrying the NSX upgrade Data Migration Dry Run.
As per the NSX Pre-Upgrade Tasks list, terminate any active SSH sessions or local shell scripts running on the NSX Manager or NSX Edge nodes before you begin the upgrade process.
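Before retrying, one way to confirm that nothing is still sitting in the folder is to check each process's working directory through the Linux /proc file system. The sketch below is an assumption-laden illustration, not an NSX tool. `find_cwd_holders` is a hypothetical helper, and the mount point /config_bak is inferred from the paths in the log. It also assumes Python can be run as root on the node. It only detects working directories; `lsof /dev/mapper/nsx-config__bak`, as used in the log, also reports other open files.

```python
import os
from typing import List, Tuple


def find_cwd_holders(mount_point: str, proc_root: str = "/proc") -> List[Tuple[int, str]]:
    """Return (pid, cwd) pairs for processes whose working directory is
    at or below mount_point. Returns an empty list if proc_root is absent."""
    if not os.path.isdir(proc_root):
        return []
    root = mount_point.rstrip("/")
    prefix = root + "/"
    holders = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are process IDs
        try:
            cwd = os.readlink(os.path.join(proc_root, entry, "cwd"))
        except OSError:
            continue  # process exited, or not enough privileges to read it
        # Match the mount point itself or anything beneath it, but not
        # sibling paths that merely share the same leading characters.
        if cwd == root or cwd.startswith(prefix):
            holders.append((int(entry), cwd))
    return sorted(holders)


if __name__ == "__main__":
    # "/config_bak" is an assumed mount point, inferred from the log paths.
    blockers = find_cwd_holders("/config_bak")
    if blockers:
        for pid, cwd in blockers:
            print(f"PID {pid} has its working directory in {cwd}")
    else:
        print("No process has its working directory under /config_bak")
```

If the check reports a shell, changing out of the directory in that session (for example with `cd /`) or closing the session releases the partition.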