vCenter Server 8.0 Update 3i scheduled backups fail sporadically with "server validation failure"

Article ID: 435925

Products

VMware vCenter Server
VMware vCenter Server 8.0

Issue/Introduction

  • After upgrading to vCenter Server 8.0 Update 3i (P08), scheduled file-based backups fail intermittently.
  • Manual backups triggered through the VAMI (Virtual Appliance Management Interface) succeed.
  • The backup job status reports: Backup job failed to start due to server validation failure.
  • In /var/log/vmware/applmgmt/backupScheduler.log, an unknown-twisted-error is logged when the scheduled run is issued:
    ERROR: Failed to issue the Schedules.run request. Exception: {messages : [LocalizableMessage(id='com.vmware.appliance.unknown-twisted-error', default_message='Internal error. See logs for details. https://vcenter.fqdn/appliance/support-bundle', args=['vcenter.fqdn'], params=None, localized=None)], data : None, error_type : None}
    Traceback (most recent call last):
      File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/Scheduler.py", line 133, in ExecScheduleRun
        status = svc_handle.run(scheduleId, comment='SCHEDULED')
      File "/usr/lib/applmgmt/pyclient/applmgmt_client-1.0-py2.7.egg/com/vmware/appliance/recovery/backup_client.py", line 1197, in run
        return self._invoke('run',
      File "/usr/lib/applmgmt/vapi/lib/vapi_runtime-2.100.0-py2.py3-none-any.whl/vmware/vapi/bindings/stub.py", line 345, in _invoke
        return self._api_interface.native_invoke(ctx, _method_name, kwargs)
      File "/usr/lib/applmgmt/vapi/lib/vapi_runtime-2.100.0-py2.py3-none-any.whl/vmware/vapi/bindings/stub.py", line 295, in native_invoke
        raise TypeConverter.convert_to_python(method_result.error,  # pylint: disable=E0702
    com.vmware.vapi.std.errors_client.Error: {messages : [LocalizableMessage(id='com.vmware.appliance.unknown-twisted-error', default_message='Internal error. See logs for details. https://vcenter.fqdn/appliance/support-bundle', args=['vcenter.fqdn'], params=None, localized=None)], data : None, error_type : None}
  • In /var/log/vmware/applmgmt/applmgmt.log, the following SFTP/curl errors are present:
    INFO:plugins.SftpStorageIOLib:URL exist : sftp://sftp.server.fqdn/data/backup/vCenter/
    ERROR:plugins.FtpStorageIOLib:sftp cmd failed. RC: 21, Err: curl: (21) mkdir command failed: Operation failed
    , Cmd: ['/usr/bin/curl', '--fail', '-u', '#########:****', '--connect-timeout', '120', '--ssl-reqd', '-k', '--noproxy', '*', '--silent', '--show-error', '-Q', 'mkdir /data/backup/vCenter/backup_YYYYMMDD-HHMM/', 'sftp://sftp.server.fqdn/data/backup/vCenter/'], stdout:
    ERROR:plugins.FtpStorageIOLib:sftp cmd failed. RC: 21, Err: curl: (21) rmdir command failed: Operation failed
    , Cmd: ['/usr/bin/curl', '--fail', '-u', '#########:****', '--connect-timeout', '120', '--ssl-reqd', '-k', '--noproxy', '*', '--silent', '--show-error', '-Q', 'rmdir /data/backup/vCenter/backup_YYYYMMDD-HHMM/', 'sftp://sftp.server.fqdn/data/backup/vCenter/'], stdout:
    ERROR:backupRestoreAPI:Backup Server validation failed with PluginError: PluginError('Plugin error occurred. ErrCode: 102, Args: ()')
    DEBUG:plugins.SftpStorageIOLib:Delete known host file with command: rm /root/.ssh/br_temp_known_hosts
    ERROR:backupRestoreAPI:Backup Server validation failed for scheduled backup: Plugin error occurred. ErrCode: 102, Args: ()
    ERROR:backupRestoreAPI:Validation failure error message: Structure('LocalizableMessage', dict(id='com.vmware.applmgmt.backup.failed_start_validation', defaultMessage='Backup job failed to start due to server validation failure.', args=[]))
    ERROR:backupRestoreAPI:set message: [defaultdict(<function ConfigManager.defaultFactory at 0x7f971c9a57e0>, {'id': 'com.vmware.applmgmt.backup.failed_start_validation', 'defaultMessage': 'Backup job failed to start due to server validation failure.', 'args': []})]
    ERROR:backupRestoreAPI:Saving state history with state: FAILED: <util.BackupRestoreState.BackupRestoreState object at 0x7f9716674f40>
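
To check whether a given applmgmt.log shows this pattern, the two key messages above can be searched for together. The following is a minimal sketch, not a supported tool: the function name `matches_collision_signature` is hypothetical, and the patterns are taken only from the log excerpts in this article, so other builds may word the messages differently.

```python
import re

# Patterns copied from the applmgmt.log excerpts in this article.
MKDIR_FAILED = re.compile(r"curl: \(21\) mkdir command failed")
VALIDATION_FAILED = re.compile(r"Backup Server validation failed.*ErrCode: 102")


def matches_collision_signature(log_text: str) -> bool:
    """Return True if both the mkdir failure and the ErrCode 102
    validation failure appear somewhere in the given log text."""
    return bool(MKDIR_FAILED.search(log_text)) and bool(
        VALIDATION_FAILED.search(log_text)
    )
```

A match only indicates that the symptoms are present. It does not by itself prove that a second vCenter Server instance caused the mkdir failure; for example, call it with the contents of /var/log/vmware/applmgmt/applmgmt.log and then compare the backup schedules of the instances sharing that location.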

Environment

VMware vCenter Server 8.0 Update 3i

Cause

vCenter Server 8.0 Update 3i introduced a pre-backup validator script that verifies the backup server's health before the full backup process starts. The script performs the following steps:

  1. Creates a temporary directory on the backup server named backup_<YYYYMMDD-HHMM>.
  2. Transfers sample files to test write capabilities.
  3. Deletes the temporary directory.

When multiple vCenter Server instances point to the same backup directory and are scheduled to start at the exact same time, they attempt to create the same temporary directory name simultaneously. This causes a collision on the SFTP/backup server, where the second mkdir command fails, leading to the validation failure and the cancellation of the backup job.
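
The collision can be illustrated with a small local simulation. This is only a sketch under stated assumptions: the real validator runs curl against the SFTP server and removes the directory afterwards, whereas this example uses a local temporary directory and `os.mkdir`. The function names and the sample date are hypothetical; only the backup_<YYYYMMDD-HHMM> naming pattern comes from this article.

```python
import os
import tempfile
from datetime import datetime


def validation_dir_name(now: datetime) -> str:
    # Minute-resolution name following the backup_<YYYYMMDD-HHMM> pattern.
    return now.strftime("backup_%Y%m%d-%H%M")


def simulate_validation(backup_root: str, now: datetime) -> bool:
    """Try to create the temporary validation directory.
    Return False if a directory with the same name already exists."""
    try:
        os.mkdir(os.path.join(backup_root, validation_dir_name(now)))
        return True
    except FileExistsError:
        return False


def demo() -> tuple:
    start = datetime(2025, 1, 1, 1, 0)  # arbitrary sample schedule time
    with tempfile.TemporaryDirectory() as root:
        first = simulate_validation(root, start)    # first instance
        second = simulate_validation(root, start)   # same minute, same name
        third = simulate_validation(root, start.replace(minute=5))  # offset
    return first, second, third
```

In this simulation the second attempt in the same minute fails because the name already exists, while the attempt offset by five minutes produces a different name and succeeds.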

Resolution

There is currently no resolution. This issue is under investigation by Engineering.

Workaround

To prevent directory name collisions during the validation phase, stagger the backup schedules for all vCenter Server instances sharing the same backup location.

  1. Log in to the vCenter Server Management Interface (VAMI) at https://vcenter_fqdn:5480.
  2. Navigate to Backup > Schedule.
  3. Edit the backup schedule.
  4. Offset the start time of each vCenter Server by at least 5 minutes (e.g., vCenter 1 at 01:00, vCenter 2 at 01:05, vCenter 3 at 01:10).
  5. Save the changes.
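
When many instances share one backup location, the offsets can be planned up front. The sketch below only computes a list of start times to enter manually in the VAMI; it does not change any vCenter Server configuration. The function name `staggered_start_times` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta


def staggered_start_times(names, first_start="01:00", offset_minutes=5):
    """Map each vCenter Server name to an HH:MM start time, spacing
    consecutive instances offset_minutes apart (wraps past midnight)."""
    base = datetime.strptime(first_start, "%H:%M")
    return {
        name: (base + timedelta(minutes=i * offset_minutes)).strftime("%H:%M")
        for i, name in enumerate(names)
    }
```

For example, `staggered_start_times(["vCenter 1", "vCenter 2", "vCenter 3"])` reproduces the 01:00, 01:05, 01:10 plan from step 4.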