Options:
Which command to use depends on what you are trying to do. Use the following guidelines to select the correct command:
- If you are troubleshooting a failed `bosh deploy` or a failed Apply Changes in Operations Manager (Ops Manager), you should never use `bosh recreate`, `bosh start`, `bosh stop`, or `bosh restart`. Running any of these commands can cause BOSH to recreate VMs, and not just the VMs you requested, using the last known good state of the deployment. This happens because the changes are only partly applied, and these commands can trigger BOSH to put the deployment back into its last known working state, which predates your failed deploy.
If you are in this situation and need to recreate a VM, use `bosh cck` and pick the option to recreate. The `bosh cck` command recreates the VM using the most current state. If `bosh cck` does not detect a problem with the VM you want to recreate, you can power off the VM in your IaaS. Before you do this, make sure BOSH resurrection is off; otherwise the resurrector will detect the missing VM and attempt to recreate it before you can run `bosh cck`. Powering off the VM allows `bosh cck` to detect the problem and offer you the option to recreate the VM.
If you are just trying to restart the processes on a VM, `bosh ssh` to the VM and run `monit restart all` (or `monit restart <proc>` for a single process) instead.
- If you are troubleshooting and would like to do root cause analysis, you should never use `bosh recreate` or `bosh cck` with the option to recreate. These options delete the VM where the problem is occurring, along with its logs and application state. In most cases, root cause analysis needs at least the application logs, and often other state (memory dumps, thread dumps, etc.) from the VM in question. Because recreating the VM destroys this state, it makes root cause analysis difficult or impossible.
- Because it involves more work, `bosh recreate` takes longer than the other commands. To reduce wait times, you should generally use `bosh start`, `bosh stop`, or `bosh restart` when they are sufficient for the task at hand. Only recreate when you truly need a fresh VM.
- Similarly, using `monit` to restart an individual process is faster than using `bosh restart` to restart all of the processes on the VM. If you only need to restart a single process, you can do that with `monit restart <proc>` and save some time.
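The `bosh cck` workflow and the root cause analysis caution above can be sketched as two small shell helpers. This is a minimal sketch, not a definitive procedure: the function names `recreate_via_cck` and `save_logs` are hypothetical, and it assumes the BOSH CLI (v2) is installed and already logged in to the Director.

```shell
# Hypothetical helpers; assumes the bosh CLI (v2) is installed and already
# logged in to the Director (for example via BOSH_ENVIRONMENT).

# Recreate a VM during a failed deploy without reverting the deployment.
# Usage: recreate_via_cck <deployment>
recreate_via_cck() {
  deployment="$1"
  # Turn the resurrector off so it does not recreate the VM first.
  # Turn it back on afterwards with `bosh update-resurrection on`
  # if it was enabled before.
  bosh update-resurrection off || return 1
  # If cck reports no problem, power off the VM in the IaaS and rerun.
  # cck is interactive: pick the option that recreates the VM.
  bosh -d "$deployment" cck
}

# Save logs before any recreate so root cause analysis stays possible.
# Usage: save_logs <deployment> <instance-group/id> <target-dir>
save_logs() {
  deployment="$1"; instance="$2"; dir="$3"
  mkdir -p "$dir" || return 1
  bosh -d "$deployment" logs "$instance" --dir "$dir"
}
```

For example, `save_logs cf-example router/0 /tmp/rca-logs` followed by `recreate_via_cck cf-example`, where `cf-example` and `router/0` are placeholder names. Memory or thread dumps still have to be collected on the VM itself before it is recreated.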
The BOSH CLI offers the following options to interact with your services:
- `bosh start`
- `bosh stop`
- `bosh restart`
- `bosh recreate`
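All four commands share the same invocation shape, which the following sketch illustrates. The wrapper name `bosh_lifecycle` is hypothetical, and the sketch assumes the BOSH CLI (v2) with the deployment selected through `-d`.

```shell
# Hypothetical wrapper around the four lifecycle commands.
# Usage: bosh_lifecycle <deployment> <start|stop|restart|recreate> [instance-group/id]
bosh_lifecycle() {
  deployment="$1"; action="$2"; instance="$3"
  case "$action" in
    start|stop|restart|recreate) ;;
    *) echo "unsupported action: $action" >&2; return 2 ;;
  esac
  if [ -n "$instance" ]; then
    # Act on a single instance, e.g. router/0.
    bosh -d "$deployment" "$action" "$instance"
  else
    # Without an instance, the command applies to the whole deployment.
    bosh -d "$deployment" "$action"
  fi
}
```

For example, `bosh_lifecycle cf-example restart router/0` runs `bosh -d cf-example restart router/0`; the deployment and instance names here are placeholders.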
BOSH Start
This command starts all processes on a BOSH-deployed VM. It does not affect the state of the VM itself, so it will not create or recreate the VM. The VM must already exist for this command to be effective.
This command is the same as running `monit start all` when you are on the VM.
BOSH Stop
This command is the opposite of `bosh start`. It stops all the processes that are running on a BOSH-deployed VM. It does not affect the state of the VM, so it will not stop or delete the VM unless you specifically pass it the `--hard` flag.
This command, without the `--hard` flag, is the same as running `monit stop all` when you are on the VM.
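The difference between a plain stop and a stop with the `--hard` flag described above can be sketched as follows. The helper name `stop_instance` and its `hard` argument are hypothetical conveniences, not part of the BOSH CLI.

```shell
# Hypothetical helper contrasting the two stop modes.
# Usage: stop_instance <deployment> <instance-group/id> [hard]
stop_instance() {
  deployment="$1"; instance="$2"; mode="${3:-soft}"
  if [ "$mode" = "hard" ]; then
    # --hard also deletes the VM (persistent disks are kept).
    bosh -d "$deployment" stop "$instance" --hard
  else
    # Default: stop the processes only; the VM keeps running.
    bosh -d "$deployment" stop "$instance"
  fi
}
```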
BOSH Restart
This command is a combination of `bosh stop` and `bosh start`. It stops and then starts the processes on a BOSH-deployed VM. It does not affect the state of the VM itself and will not cause the VM to be recreated.
This command is the same as running `monit restart all` when you are on the VM.
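The `monit` equivalent can also be run from a workstation through `bosh ssh`. The sketch below is one possible way to do that; the helper name `monit_restart` is hypothetical, and it assumes `monit` is installed at `/var/vcap/bosh/bin/monit`, its usual location on BOSH stemcells, and that it must run as root.

```shell
# Hypothetical helper: restart one monit-managed process, or all of them,
# on a single VM.
# Usage: monit_restart <deployment> <instance-group/id> [process]
monit_restart() {
  deployment="$1"; instance="$2"; process="${3:-all}"
  # Assumption: monit lives at /var/vcap/bosh/bin/monit and needs root.
  bosh -d "$deployment" ssh "$instance" \
    -c "sudo /var/vcap/bosh/bin/monit restart $process"
}
```

For example, `monit_restart cf-example router/0 gorouter` restarts a single process, while `monit_restart cf-example router/0` restarts all of them. The deployment, instance, and process names are placeholders; `monit summary` on the VM lists the actual process names.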
BOSH Recreate
This command stops all processes, stops the VM, destroys the ephemeral disk, creates a new VM, and starts all processes on a BOSH-deployed VM.
Because this destroys and recreates the VM itself, there is no direct equivalent using `monit`. There are other ways to recreate a VM, though. For example, you could power off or delete the VM in your IaaS, then either wait for the BOSH Resurrector to recreate the VM or run `bosh cck` and pick the option to recreate the VM.
For more information on these commands, please see the BOSH documentation.
Some usage tips for convergence concerns:
- `bosh -d cf-XXXX deploy /var/tempest/workspaces/default/deployments/cf-XXX.yml` - Use this when supplying a new manifest, for example when an Apply Changes run failed and you wish to push forward.
- `bosh deploy --fix` - Use this when supplying a new manifest (Apply Changes) and there are VMs in an `unresponsive_agent` state.
- `bosh recreate <VM>` - Use this if the goal is to converge the entire deployment to the last known good-state manifest, for example when a VM instance is having an issue with the new manifest, or a deploy got halfway through and failed and you want to revert.
  - Operators can try to recreate the failed VM, and BOSH will roll back all the VMs that were updated successfully and recreate them too. If there are no other VMs in a bad state, it will only act on the instance you specify.
  - To avoid a full recreate and rollback of the deployment, operators can use the `--no-converge` flag mentioned below.
See the guideline below for specific details:
- `bosh recreate <VM> --no-converge` - Use this when you wish to recreate a specific VM instance that was updated to a new manifest and is in a bad state. The instance will be reverted to the last known good state. BOSH will not try to converge any other instances that are out of state; it acts only on the specified instance.
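The convergence choices above can be summarized in a short sketch. The helper names `recreate_instance` and `deploy_fix` are hypothetical, and the sketch assumes a BOSH CLI version that supports the `--no-converge` flag.

```shell
# Hypothetical helpers for the convergence tips above.

# Recreate one instance, optionally without converging the rest of the
# deployment.
# Usage: recreate_instance <deployment> <instance-group/id> [no-converge]
recreate_instance() {
  deployment="$1"; instance="$2"; mode="${3:-converge}"
  if [ "$mode" = "no-converge" ]; then
    # Act only on the given instance; leave other instances untouched.
    bosh -d "$deployment" recreate "$instance" --no-converge
  else
    # May also roll back other instances to the last known good state.
    bosh -d "$deployment" recreate "$instance"
  fi
}

# Redeploy with a manifest while repairing VMs with unresponsive agents.
# Usage: deploy_fix <deployment> <manifest.yml>
deploy_fix() {
  bosh -d "$1" deploy "$2" --fix
}
```

For example, `recreate_instance cf-example router/0 no-converge` limits the change to one instance, where `cf-example` and `router/0` are placeholder names.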