NSX Network Detection and Response - Change the status of a stuck backup manually

Article ID: 323971

Products

VMware

Issue/Introduction

Change the status of a stuck backup manually using the CLI.

Symptoms:
To configure and run a backup, please follow the guide: NSX Network Detection and Response - How to backup appliances (900094)

After a backup is configured, a backup job can in some cases show the status Pending or Running indefinitely. To run a new backup, we must either wait for the current job to finish or change its status manually.

Resolution

The Manager UI shows the status of each backup: waiting, running, successful, or failed:

image.png

The time the manager takes to finish a backup depends on several factors, such as the amount of data being compressed and transferred and the network connectivity and performance.

Steps:

1. If we suspect the backup is stuck or never finished, we need to validate the following using the CLI of the manager:

a. There are no tar/gz/ssh processes among the active processes.
e.g.:

image.png

b. The output of the lvs command does not show a snapshot partition.
e.g.:

image.png

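In the screenshots above, the backup is still running, so we recommend leaving it running until it finishes. If neither the processes nor a snapshot partition are present while the job still shows Pending or Running, the job is likely stuck.

However, if a backup has been in the same state for more than 4 days, we might need to stop it manually.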
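
The two checks in step 1 can be sketched as small shell filters. This is only an illustration: the function names are hypothetical, "gz" is assumed to mean the gzip process, and a snapshot is assumed to be any logical volume with a non-empty Origin column in the lvs output.

```shell
# Sketch only: helper names are hypothetical, not part of the appliance.

# Reads "PID COMMAND" lines on stdin and keeps only those whose command
# name is exactly tar, gzip, or ssh (assumed backup-related processes).
filter_backup_procs() {
    awk '$2 == "tar" || $2 == "gzip" || $2 == "ssh"'
}

# Reads "LV ORIGIN" lines on stdin and prints the name of every volume
# that has an origin, which is how lvs reports a snapshot.
snapshot_lvs() {
    awk 'NF >= 2 { print $1 }'
}
```

For example, `ps -eo pid,comm | filter_backup_procs` and `lvs --noheadings -o lv_name,origin | snapshot_lvs` feed live data into the filters. Empty output from both suggests that no backup activity remains. Note that an unrelated ssh client session would also match the first filter.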

2. Verify the status of the last job by running the command:

# previct-backup.py job lastatus
LAST_JOB 3:running LAST_JOB

The output shows running.

The possible states for a backup are:
Waiting (1) 
Running (2)
Successful (3)
Failed (4)
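
If the status needs to be read from a script, the state word can be extracted from the output line. The sketch below assumes the output always has the shape shown in step 2 (LAST_JOB, a value, a colon, a lowercase state word, LAST_JOB). The value before the colon is not interpreted, and the function name is hypothetical.

```shell
# Sketch only: job_state is a hypothetical helper.
# Prints the state word from a line shaped like "LAST_JOB 3:running LAST_JOB".
# Prints nothing if the line does not match that assumed shape.
job_state() {
    printf '%s\n' "$1" | sed -n 's/^LAST_JOB [^:]*:\([a-z]*\) LAST_JOB$/\1/p'
}
```

A possible use is `job_state "$(previct-backup.py job lastatus)"`, which would print only the state word for the last job.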


3. Now we need to change the status of the last job manually in the database of the manager by running the command:

mysql --defaults-file=/root/.my.cnf -e 'update pcloud.backup_job set job_status=4,description=\"Internal error\",end_date=UTC_TIMESTAMP() where job_status=2;'

The number in the last portion of the command (where job_status=2) must match the current status reported by the command in step 2: use 1 if the job is waiting or 2 if it is running. Be sure to include all single and double quotes exactly as shown in the example.

Note: We have to be extremely careful when modifying the database, since a mistake can completely break the appliance. Be sure to run ONLY the command in step 3.
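
To reduce the chance of using the wrong number in the where clause, the choice can be expressed as a small guard. This is a sketch with a hypothetical function name. It assumes the state words are spelled as in the sample output, and it does not touch the database itself.

```shell
# Sketch only: stuck_where_value is a hypothetical helper.
# Prints the job_status value to use in the where clause for a stuck job.
# Only waiting (1) and running (2) are accepted; any other state is refused.
stuck_where_value() {
    case "$1" in
        waiting) echo 1 ;;
        running) echo 2 ;;
        *)
            echo "refusing: state '$1' is not a stuck state" >&2
            return 1
            ;;
    esac
}
```

Refusing the successful and failed states keeps a finished job from being overwritten by mistake.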
 

4. After changing the status of the backup job, we can check it again using the previct-backup utility in the CLI:

# previct-backup.py job lastatus
LAST_JOB 3:failed LAST_JOB

We can also verify that there are no backup processes or snapshot partitions, as described in step 1.


If further assistance is needed, create a support request using our Customer Connect Portal:
How to file a Support Request in Customer Connect and via Cloud Services Portal (2006985)
 


Additional Information

Note: This article is applicable to the standalone NSX Network Detection and Response product (formerly Lastline) and is not intended to be applied to the NSX NDR feature of NSX-T.