Unable to recreate missing worker VM

Article ID: 394082

Products

VMware Tanzu Kubernetes Grid Integrated Edition

Issue/Introduction

If a worker VM and its persistent disk have been deleted from vSphere and no longer exist, attempting to recreate the VM with bosh commands (recreate, cck, or deploy) can fail with an error.

Example error from bosh task logs:

Error: CPI error 'Bosh::Clouds::DiskNotFound' with message 'Could not find disk with id 'disk-xxxxxxxx-9999-xxxx-9999-xxxx9999xxxx'' in 'create_vm' CPI method (CPI request ID: 'cpi-999999')

Environment

VMware Tanzu Kubernetes Grid Integrated Edition

Cause

The BOSH Director cannot find the persistent disk because it has been deleted from vSphere. The Director still records that disk as attached to the VM, so it cannot recreate the VM while the disk reference remains.
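
To check which disk the Director is still looking for, the disk CID can be pulled out of the error text and compared against the instance details. The following is a minimal sketch, not part of the original procedure: the helper names extract_disk_cid and find_disk_reference are hypothetical, and it assumes the bosh CLI is already logged in to the Director and that DEPLOYMENT is set to the cluster deployment name.

```shell
# Hypothetical helper: print the first disk CID found in text read on stdin.
extract_disk_cid() {
  grep -o 'disk-[0-9a-zA-Z-]*' | head -n 1
}

# Hypothetical helper: show instance detail rows that still mention the given
# disk CID. Assumes the bosh CLI is logged in and DEPLOYMENT is set.
find_disk_reference() {
  bosh -d "$DEPLOYMENT" instances --details | grep -F -- "$1"
}
```

For example, piping the saved error line into extract_disk_cid and passing the result to find_disk_reference should show which instance the Director still associates with the deleted disk.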

Resolution

Steps to recover:

  1. Confirm that the worker VM and its disk no longer exist in vSphere.
  2. Run bosh cck on the cluster deployment.

    bosh -d $DEPLOYMENT cck

  3. Choose the option to "Delete the VM reference" of the missing VM.
  4. Choose the option to "Delete the Disk reference" of the missing disk.
  5. Run bosh ignore on every instance in the deployment except the missing VM instance, so that the deploy acts only on the missing worker.
  6. Retrieve and save the manifest of the cluster deployment.

    bosh -d $DEPLOYMENT manifest > $CLUSTERNAME-manifest.yaml

  7. Run a bosh deploy using the manifest.

    bosh -d $DEPLOYMENT deploy $CLUSTERNAME-manifest.yaml --fix

  8. If the deploy succeeds, the worker should have been recreated with a new persistent disk. Run bosh unignore on each instance that was ignored in step 5.
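
The ignore and unignore steps (5 and 8) can be tedious on a large cluster, so they can be scripted. The following is a minimal sketch under stated assumptions: the function names, the DRY_RUN variable, the MISSING variable, and the ignored.txt file are hypothetical and not part of the original procedure. It assumes DEPLOYMENT is set, that instance names are supplied one per line in group/id form, and that the bosh CLI's --column option is available to produce that list.

```shell
# Print every instance name from stdin except the missing one ($1).
# Only the first whitespace-separated field of each line is used, so
# trailing tabs or spaces in the listing are tolerated.
instances_to_ignore() {
  awk -v m="$1" '$1 != "" && $1 != m { print $1 }'
}

# Run bosh, or only print the command when DRY_RUN is set (hypothetical switch).
run_bosh() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf 'bosh %s\n' "$*"
  else
    bosh "$@"
  fi
}

# Apply "ignore" or "unignore" ($1) to each instance name read from stdin.
apply_to_instances() {
  action="$1"
  while IFS= read -r instance; do
    [ -n "$instance" ] || continue
    run_bosh -d "$DEPLOYMENT" "$action" "$instance"
  done
}

# Possible usage (MISSING holds the group/id of the missing worker):
#   bosh -d "$DEPLOYMENT" instances --column=instance \
#     | instances_to_ignore "$MISSING" > ignored.txt
#   apply_to_instances ignore < ignored.txt      # step 5
#   ...retrieve the manifest and deploy...       # steps 6 and 7
#   apply_to_instances unignore < ignored.txt    # step 8
```

Saving the filtered list to a file keeps the unignore step limited to exactly the instances that were ignored earlier. Setting DRY_RUN=1 first prints the commands without running them, which is a reasonable check before touching a production cluster.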