NSX Manager backups are failing intermittently after upgrade to 4.0/4.1 release

Article ID: 324251

Products

VMware NSX

Issue/Introduction

  • NSX Manager backups fail intermittently after upgrading to an NSX 4.0 or 4.1 release.
  • You see an error message similar to the following in the NSX UI:

    either sftp server disk full or bad directory path or check if the directory path length is beyond 260 character limit on windows server

  • The NSX Manager syslog (/var/log/syslog.log) contains error messages similar to the following:

    YYYY-MM-DDTHH:MM:SS <hostname> NSX ##### - [nsx@#### comp="nsx-manager" subcomp="node-mgmt" username="root" level="ERROR" errorCode="NOD110"] Cluster backup file copy operation failed due to 404 Not Found#015#012Content-Type: application/json#015#012Content-Length: 329#015#012#015#012{"error_code": 36219, "error_message": "Specified uri '<sftp backup directory>/cluster-node-backups/<nsx version>-<uuid>-<ip>/backup-YYYY-MM-DDTHH_MM_SSUTC/cluster_backup-<uuid>-<ip>-nsx-ufo-backup-restore.tar' not found.", "module_name": "node-services"}

    YYYY-MM-DDTHH:MM:SS <hostname> NSX ##### SYSTEM [nsx@#### comp="nsx-manager" errorCode="MP29111" level="ERROR" reqId="<uuid>" subcomp="manager" username="admin"] Cluster backup for NSX_UFO_BACKUP_RESTORE had error

    YYYY-MM-DDTHH:MM:SS <hostname> NSX ##### - [nsx@#### comp="nsx-manager" errorCode="MP29111" level="ERROR" reqId="<uuid>" s2comp="backup-restore" subcomp="manager" username="admin"] Cluster backup failed with ; BackupAsyncStatus [BackupStatus [status=URI_NOT_FOUND, statusDetail=Specified uri '<sftp backup directory>/cluster-node-backups/<nsx version>-<uuid>-<ip>/backup-YYYY-MM-DDTHH_MM_SSUTC/cluster_backup-<uuid>-<ip>-nsx-ufo-backup-restore.tar' not found., remoteUri=<sftp backup directory>/cluster-node-backups/<nsx version>-<uuid>-<ip>/backup-YYYY-MM-DDTHH_MM_SSUTC/cluster_backup-<uuid>-<ip>-nsx-ufo-backup-restore.tar, errorCode=null, startTime=#########, endTime=#########]; responseBody=null].
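To confirm that a given failure matches this article, the syslog can be searched for the signatures shown above. The following is a minimal Python sketch, not a VMware-provided tool. The function name find_backup_failures is hypothetical, and the patterns are taken only from the sample messages in this article, so they may need adjusting for your environment.

```python
import re

# Signatures copied from the sample log messages in this article.
# They are assumptions about what identifies the failure, not an official list.
_SIGNATURES = (
    re.compile(r'errorCode="NOD110".*Cluster backup file copy operation failed'),
    re.compile(r'errorCode="MP29111"'),
    re.compile(r'status=URI_NOT_FOUND'),
)


def find_backup_failures(lines):
    """Return (line_number, line) pairs that match any failure signature."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(sig.search(line) for sig in _SIGNATURES):
            hits.append((number, line.rstrip("\n")))
    return hits
```

The function accepts any iterable of lines, so an open file handle for the syslog file can be passed directly. The timestamps of the matching lines can then be compared with the times of the failed backups.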

Environment

VMware NSX 4.0.0.1
VMware NSX 4.1.0
VMware NSX 4.1.0.2
VMware NSX 4.1.1

Cause

This issue occurs when sshd_config on the SFTP backup server ends with a Match block that sets ChrootDirectory for the SFTP user group.
In the following example, "ChrootDirectory /sftpfiles" is configured for the sftp group:

------------
Match group sftp
ChrootDirectory /sftpfiles
#X11Forwarding no
#AllowTcpForwarding no
ForceCommand internal-sftp
------------
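To check whether an SFTP server is configured this way, its sshd_config can be scanned for Match blocks that set ChrootDirectory. The following is a minimal Python sketch under simplifying assumptions: the function name find_chroot_match_blocks is hypothetical, each directive is assumed to be written as a keyword followed by whitespace and its argument, and Include files are not followed.

```python
def find_chroot_match_blocks(config_text):
    """Return (match_criteria, chroot_path) pairs for Match blocks that set ChrootDirectory."""
    results = []
    current_match = None  # criteria of the Match block being read, if any
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        keyword = parts[0].lower()  # sshd_config keywords are case-insensitive
        argument = parts[1].strip() if len(parts) > 1 else ""
        if keyword == "match":
            # A Match block runs until the next Match line or the end of the file.
            current_match = argument
        elif keyword == "chrootdirectory" and current_match is not None:
            if argument.lower() != "none":  # "none" means no chroot is applied
                results.append((current_match, argument))
    return results
```

For the configuration shown above, the function returns one entry with the criteria "group sftp" and the path "/sftpfiles".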

Resolution

This issue is resolved in VMware NSX 4.2.0.

Workaround:

  1. SSH to the NSX Manager as the root user
  2. cd /opt/vmware/nsx-node-api/bin/python/management_api/napi/root/node/file_store
  3. cp utils.py utils.py.bak

    Note: Run the ls command to confirm that the utils.py.bak file was created.

  4. Open the utils.py file with a text editor

    Search for check_prev_output in utils.py and locate the following block:
    ------------------------
    if check_prev_output and _SFTP_RE.search(prev_output):
    	check_prev_output = False
    	output = prev_output

    ------------------------

    Replace it with the following block, which adds the cmd_list[3] == "-i" check to the if condition:
    ------------------------
    if cmd_list[3] == "-i" and check_prev_output and _SFTP_RE.search(prev_output):
    	check_prev_output = False
    	output = prev_output

    ------------------------

  5. Run: service nsx-mp-api-server restart

Note: If the issue recurs, perform the same steps on all three NSX Manager nodes.
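Because the same edit may have to be repeated on several nodes, the text change in step 4 can be expressed as a small function. The following Python sketch is illustrative only and is not part of the product. The names patch_utils_source, OLD_CONDITION and NEW_CONDITION are hypothetical, and the sketch assumes the condition appears exactly once in utils.py, as the workaround implies.

```python
# Condition strings copied from step 4 of the workaround.
OLD_CONDITION = "if check_prev_output and _SFTP_RE.search(prev_output):"
NEW_CONDITION = (
    'if cmd_list[3] == "-i" and check_prev_output '
    "and _SFTP_RE.search(prev_output):"
)


def patch_utils_source(source):
    """Return (patched_source, changed). Only the if-condition text is rewritten."""
    if NEW_CONDITION in source:
        return source, False  # already patched, nothing to do
    if source.count(OLD_CONDITION) != 1:
        # Refuse to guess if the file does not look as expected.
        raise ValueError("expected condition not found exactly once; edit manually")
    # Indentation and the body of the if block are left untouched.
    return source.replace(OLD_CONDITION, NEW_CONDITION), True
```

A caller would read utils.py, keep the utils.py.bak copy from step 3, write the patched text back only when the second return value is True, and then restart the service as in step 5.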
