Relocate Bosh VMs to another cluster in vSphere

Article ID: 381468

Products

  • VMware Tanzu Kubernetes Grid Integrated (TKGi)
  • VMware Tanzu Application Service
  • VMware Tanzu Application Service for VMs

Issue/Introduction

This procedure describes how to relocate Bosh-created VMs from one vSphere cluster to another, including between vCenters if necessary.

It is recommended to test in a lab environment before making any changes in production.

This process describes how to move the Harbor tile and does not cover service instance tiles, such as TPCF data services or TKGI clusters.

Resolution

High level overview

  1. Move all VMs to a shared datastore, by recreating them
  2. Prepare config in Operations Manager for the target cluster
  3. Apply changes to recreate the VMs on the target cluster
  4. Move all VMs to the target datastore, by recreating them

Requirements

  • This process was tested using Ops Manager 3.0.33.
  • It is strongly recommended to use Bosh Backup and Restore to back up the director and any tiles that will be changed.
  • There must be a shared datastore that exists on both the source and target clusters, and both the source and target datastores MUST have the SAME name.
  • The network must be available on the source and target clusters, and both the source and target port groups MUST have the SAME name.
  • Tested with NSX disabled in the vCenter config. If NSX is enabled, extra testing is required!
  • This procedure does not apply to service instances; additional steps are needed to recreate service instances.
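
The same-name requirements above can be checked before starting. The following is a minimal sketch, assuming the datastore and port group names have already been collected by hand (for example from the vSphere Client); the function name and its inputs are hypothetical and it does not query vCenter itself.

```python
from typing import Iterable, List


def names_missing_on_target(required_names: Iterable[str],
                            target_names: Iterable[str]) -> List[str]:
    """Return the required names that are not present on the target cluster.

    required_names: the shared datastore and port group names the
                    deployment uses on the source cluster.
    target_names:   the datastore and port group names visible on the
                    target cluster.
    The comparison is exact (case-sensitive) to stay on the safe side.
    """
    return sorted(set(required_names) - set(target_names))
```

An empty result means every required name was found on the target cluster; any name returned would need to be created or renamed before continuing.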

Procedure

1. Move the disks to the shared datastore

This step uploads the stemcell to the shared datastore, re-creates the VMs with ephemeral snapshots from the shared datastore, attaches new persistent disks from the shared datastore to copy the persistent data, and finally detaches the source persistent disks.

  1. On the Bosh director "vCenter Config" tab, update the persistent disk and ephemeral disk datastores to the shared datastore.
  2. On the "Director Config" tab, check "Recreate VMs deployed by the BOSH Director".
  3. Apply changes to move the Bosh director and all non-service instance VMs to the shared datastore.

2. Recreate all the VMs on the target cluster

This step configures Ops Manager with the target infrastructure and re-creates all the VMs on it. For a highly available system such as TPCF, this process is online unless there are singletons, such as the built-in blob store, or static IP addresses are defined. Static IP addresses need to be temporarily unset.

2.1 Edit bosh-state.json
  1. Open an SSH session to the Ops Manager VM, using the user "ubuntu" and the private key paired with the public key that Ops Manager was installed with.
  2. Elevate privileges to root:
    sudo -i
  3. Take a backup of /var/tempest/workspaces/default/deployments/bosh-state.json.
    cp /var/tempest/workspaces/default/deployments/bosh-state.json /var/tempest/workspaces/default/deployments/bosh-state.json.bak
  4. Edit /var/tempest/workspaces/default/deployments/bosh-state.json to empty the stemcells section, as shown below. This forces a re-upload of the stemcell on the next apply changes.
    Source:
    ...
    "stemcells": [
        {
            "id": "d00c524b-998d-49ff-67ef-200938f8565f",
            "name": "bosh-vsphere-esxi-ubuntu-jammy-go_agent",
            "version": "1.572",
            "api_version": 3,
            "cid": "sc-25fcc87d-eaf0-4a81-8d6c-7aa4072ced0d"
        }
    ],
    ...
    Target:
    ...
    "stemcells": []
    ...
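
The backup and edit in steps 3 and 4 can also be scripted instead of done by hand. The following is a minimal sketch, assuming python3 is available on the Ops Manager VM and that the file is plain JSON with a top-level "stemcells" key as in the example above; the function name is hypothetical.

```python
import json
import shutil
from pathlib import Path

# Path taken from the steps above; pass a different path when testing.
STATE_FILE = Path("/var/tempest/workspaces/default/deployments/bosh-state.json")


def clear_stemcells(state_file: Path = STATE_FILE) -> Path:
    """Back up state_file, then set its "stemcells" key to an empty list.

    Returns the path of the backup copy (<name>.bak next to the original).
    """
    backup = state_file.with_name(state_file.name + ".bak")
    shutil.copy2(state_file, backup)  # same effect as the cp command above

    state = json.loads(state_file.read_text())
    state["stemcells"] = []  # forces a stemcell re-upload on the next apply
    # Key order is preserved; whitespace may differ from the original file.
    state_file.write_text(json.dumps(state, indent=4) + "\n")
    return backup
```

On the Ops Manager VM this would be run as root by calling clear_stemcells() with no arguments. Because the file is parsed and rewritten, a malformed file raises an error instead of being silently changed.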
2.2 Tile settings
  1. (if the target cluster is in a different vCenter) On the Bosh "vCenter Config" page, add the new vCenter, taking care to specify the correct datastores.
  2. On the Bosh "Create Availability Zones" page, add the target cluster as an availability zone.
  3. Edit the required networks on the Bosh "Create Networks" page, to add the new availability zone to all required networks.
  4. On the Bosh "Director Config" page, check "Recreate VMs deployed by the BOSH Director" (this gets cleared after the previous successful apply changes).
  5. (If a static IP is defined on the Harbor deployment or in any other tile) Note down the IPs, then remove them by setting the field to blank.
  6. Enable Ops Manager Advanced mode. Advanced mode will time out after a short period of time.
  7. On the Bosh "Assign AZs and Networks" page, update the assigned availability zones of the Bosh director.
  8. On any tiles that need to be moved, update their assigned availability zones.
  9. Cleanly shut down the Bosh director VM (not any other VMs). The VM name can be found on the "Status" tab under the Bosh tile.
  10. Apply changes in Ops Manager to re-create all non-service instance VMs on the target cluster.
  11. (If a static IP is defined on the Harbor deployment or in any other tile) Add back any static IPs which were previously defined and apply changes again.
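
After the apply changes completes, it can be useful to confirm that each deployment's VMs landed in the new availability zone. The following is a minimal sketch that inspects the output of "bosh -d <deployment> vms --json". It assumes that output contains a "Tables" list whose "Rows" carry "instance", "az" and "process_state" keys; check this against the Bosh CLI version in use. The function name and the AZ names in the usage note are hypothetical.

```python
import json
from typing import Iterable, List


def misplaced_instances(vms_json: str, expected_azs: Iterable[str]) -> List[str]:
    """Return instances that are not running or not in one of expected_azs.

    vms_json is the text printed by `bosh -d <deployment> vms --json`
    (assumed layout: {"Tables": [{"Rows": [{...}, ...]}]}).
    """
    allowed = set(expected_azs)
    problems = []
    for table in json.loads(vms_json).get("Tables") or []:
        for row in table.get("Rows") or []:
            in_az = row.get("az") in allowed
            running = row.get("process_state") == "running"
            if not (in_az and running):
                problems.append(row.get("instance", "<unknown>"))
    return problems
```

For example, misplaced_instances(output, {"az-target"}) returns an empty list when every instance is running in the hypothetical availability zone "az-target". The Bosh director VM itself is not listed by this command and has to be checked in vCenter.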

3. Tidy up steps

  1. On the source cluster, edit the powered-off Bosh director VM to detach (NOT delete) the persistent disk, which is the 3rd disk, by selecting "Remove device". Note that vSphere will not let you delete this disk by accident, because it is locked by the director VM that is powered on on the target cluster.
  2. Delete the powered-off source Bosh director VM by selecting "Delete from Disk" in the VM context menu.
  3. On the Bosh director "vCenter Config" tab, update the persistent disk and ephemeral disk datastores to the target datastores.
  4. On the Bosh "Director Config" tab, check "Recreate VMs deployed by the BOSH Director".
  5. Apply changes in Ops Manager to move the Bosh director and all non-service instance VMs to the target datastore.