When a planned network or general maintenance window may impact your TKGI environment, complete the following steps before the maintenance begins:
- Disable the BOSH resurrector. Run the command bosh update-resurrection off
- Log in to each cluster that will go into maintenance
- List all running jobs. Run a command similar to bosh -d <service-instance> ssh master -c "sudo monit summary"
- Stop all running jobs. Run a command similar to bosh -d <service-instance> ssh master -c "sudo monit stop <job>"
- Check that each job has stopped by running a command similar to bosh -d <service-instance> ssh master -c "sudo monit status <job>".
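
The list, stop, and check steps above can be sketched as a small shell wrapper. This is a minimal sketch, not an official TKGI tool: the function names monit_on_master and stop_cluster_jobs and the DRY_RUN variable are hypothetical, and the job names you pass should come from the monit summary output of your own cluster. With DRY_RUN=1 (the default in this sketch) the commands are only printed, so the sequence can be reviewed before anything is stopped.

```shell
#!/bin/sh
# Sketch only. DRY_RUN=1 (default) prints each bosh command instead of running it.

# Hypothetical helper: run "sudo monit <args>" on the master VMs of one cluster.
monit_on_master() {
  target="$1"
  shift
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "bosh -d $target ssh master -c \"sudo monit $*\""
  else
    bosh -d "$target" ssh master -c "sudo monit $*"
  fi
}

# Hypothetical wrapper: stop_cluster_jobs <service-instance> <job>...
stop_cluster_jobs() {
  deployment="$1"
  shift
  monit_on_master "$deployment" summary          # list the jobs and their state
  for job in "$@"; do
    monit_on_master "$deployment" stop "$job"    # stop each named job
  done
  for job in "$@"; do
    monit_on_master "$deployment" status "$job"  # check that the job has stopped
  done
}

# Example call (hypothetical names):
#   stop_cluster_jobs service-instance_example job-a job-b
```

Set DRY_RUN=0 only after reviewing the printed commands. The resurrector is disabled separately, once, before any cluster is processed.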
Once the maintenance window is complete, perform the following steps:
- Log in to each cluster that went into maintenance
- List all the jobs and their status. Run a command similar to bosh -d <service-instance> ssh master -c "sudo monit summary"
- Start the previously stopped jobs. Run a command similar to bosh -d <service-instance> ssh master -c "sudo monit start <job>".
- Check if the service has resumed by running a command similar to bosh -d <service-instance> ssh master -c "sudo monit status <job>".
- Enable the BOSH resurrector. Run the command bosh update-resurrection on
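
The post-maintenance steps can be sketched the same way. Again, this is only an illustration under assumptions: monit_on_master, start_cluster_jobs, and DRY_RUN are hypothetical names, and the jobs passed in should be the ones that were stopped before the maintenance window. With DRY_RUN=1 (the default in this sketch) the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch only. DRY_RUN=1 (default) prints each bosh command instead of running it.

# Hypothetical helper: run "sudo monit <args>" on the master VMs of one cluster.
monit_on_master() {
  target="$1"
  shift
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "bosh -d $target ssh master -c \"sudo monit $*\""
  else
    bosh -d "$target" ssh master -c "sudo monit $*"
  fi
}

# Hypothetical wrapper: start_cluster_jobs <service-instance> <job>...
start_cluster_jobs() {
  deployment="$1"
  shift
  monit_on_master "$deployment" summary          # list the jobs and their status
  for job in "$@"; do
    monit_on_master "$deployment" start "$job"   # start each previously stopped job
  done
  for job in "$@"; do
    monit_on_master "$deployment" status "$job"  # check that the job has resumed
  done
}

# Example call (hypothetical names):
#   start_cluster_jobs service-instance_example job-a job-b
```

After every cluster reports its jobs as running, re-enable the resurrector with bosh update-resurrection on.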
Note: During the maintenance window, all persistent volume operations will be queued and will resume once the service has started again.