"failed to add disk" error while deploying Supervisor Services or vSphere Pods.

Article ID: 385016

Products

VMware vSphere with Tanzu

Issue/Introduction

  • Deploying a Supervisor Service or a vSphere Pod fails with "failed to setup images: failed to add disk [<Datastore Name>] fcd/xxxxx.vmdk".

  • The failed Pod YAML from the UI, or the output of describing the failed Pod, shows the below error messages (a kubectl sketch for retrieving this from the command line follows this list):
    status:
      conditions:
      - lastProbeTime: null
        lastTransitionTime: "YY-MM-DDTHH:MM:SSZ"
        status: "True"
        type: PodScheduled
      - lastProbeTime: null
        lastTransitionTime: "YY-MM-DDTHH:MM:SSZ"
        reason: UnknownContainerStatuses
        status: "False"
        type: Initialized
      - lastProbeTime: null
        lastTransitionTime: "YY-MM-DDTHH:MM:SSZ"
        reason: UnknownContainerStatuses
        status: "False"
        type: ContainersReady
      - lastProbeTime: null
        lastTransitionTime: "YY-MM-DDTHH:MM:SSZ"
        reason: UnknownContainerStatuses
        status: "False"
        type: Ready
      message: 'failed to setup images: failed to add disk [<DatastoreName>] fcd/xxxxx.vmdk:
        VM.AddDevice failed error = context deadline exceeded Post "https://localhost/sdk":
        context deadline exceeded: ErrImageSetup'
      phase: Failed
      qosClass: BestEffort
      reason: ErrImageSetup
  • From the ESXi host, the below log snippet appears in /var/run/log/spherelet.log:

    "YY-MM-DDTHH:MM:SSZ" No(5) spherelet[<OP_ID>]: time=""YY-MM-DDTHH:MM:SSZ"" level=error msg="unexpected fault: &{{{{<nil> [{{} msg.disk.hotadd.Failed [{{} 1 scsi2:0}] Failed to add disk 'scsi2:0'.} {{} msg.disk.hotadd.poweron.failed [{{} 1 scsi2:0}] Failed to power on 'scsi2:0'.} {{} msg.disk.noBackEnd [{{} 1 /vmfs/volumes/<Datastore-Name>/fcd/xxxxx.vmdk}] Cannot open the disk '/vmfs/volumes/<Datastore-Name>/fcd/xxxxx.vmdk' or one of the snapshot disks it depends on. } {{} msg.disklib.INVALIDMULTIWRITER [] Thin/TBZ/Sparse disks cannot be opened in multiwriter mode} {{} vob.fssvec.OpenFile.file.failed [] File system specific implementation of OpenFile[file] failed} {{} msg.disk.invalidClusterDisk [{{} 1 VMware ESX} {{} 2 /vmfs/volumes/<Datastore-Name>/fcd/xxxxx.vmdk}] VMware ESX cannot open the virtual disk \"/vmfs/volumes/<Datastore-Name>/fcd/xxxxx.vmdk\" for clustering. Verify that the virtual disk was created using the thick option. }]}}} Failed to add disk 'scsi2:0'.} taskerror: Failed to add disk 'scsi2:0'." VM-OP=AddDevice namespace=svc-contour-domain-cX pod=envoy-xxxxx uid=<UUID>

Environment

vSphere with Tanzu 8.x

Cause

This issue occurs when the Storage Policy used for the deployment does not have the volume allocation type set to "Fully Initialized". The cached image disks are attached to Pod VMs in multi-writer mode, and, as the spherelet log indicates ("Thin/TBZ/Sparse disks cannot be opened in multiwriter mode"), only fully initialized (eager-zeroed thick) disks can be opened that way.
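
To confirm that a cached image disk was provisioned thin, you can inspect its VMDK descriptor on the ESXi host. Below is a minimal sketch, assuming shell access to the host and the disk path from the error message; the ddb.thinProvisioned descriptor key is the usual marker for thin disks, but treat its presence as an assumption for your build:

    # On the ESXi host: the .vmdk file under fcd/ is the descriptor file
    grep -i thinProvisioned /vmfs/volumes/<Datastore-Name>/fcd/xxxxx.vmdk
    # ddb.thinProvisioned = "1" means the disk is thin and therefore
    # cannot be opened in multi-writer mode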

Resolution

Resolution 1: Create a new Storage Policy.

  1. Create a new Storage Policy and make sure "Fully Initialized" is selected under Volume allocation type.

  2. From the vCenter UI, go to Workload Management > Supervisor > Configure > Storage Policy and edit the configuration to use the newly created Storage Policy.

  3. Retry the Supervisor Service or vSphere Pod deployment (a sketch for re-triggering a failed Pod follows this list).
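
If the earlier attempt left a Pod in the Failed state, deleting it prompts its owning controller (Deployment, DaemonSet, and so on) to recreate it against the updated storage configuration. A minimal sketch with placeholder names:

    # Remove the failed Pod so its controller schedules a fresh one
    kubectl delete pod <pod-name> -n <namespace>
    # Watch the replacement start; it should no longer report ErrImageSetup
    kubectl get pods -n <namespace> -w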

Resolution 2: Update the existing Storage Policy.

  1. Modify the existing Storage Policy to use "Fully Initialized":
    • Edit VM Storage Policy> VMFS rules> Placement> Volume allocation type> Fully Initialized
  2. If there is a failed deployment for the Supervisor Services, clean up the cached image disks to ensure the storage policy changes are applied.
    • Run the below command for each existing ImageDisk CR (a sketch for removing them all at once follows this list):
      • kubectl delete imagedisk <name> -n vmware-system-kubeimage
Note: The ImageDisk CRs and their corresponding VMDKs are caches, so they can be safely deleted. The new Pod will be created with the new volume allocation type.
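
To clear every cached ImageDisk in one pass instead of deleting them by name, a minimal sketch, assuming the Supervisor cluster context:

    # List the cached ImageDisk CRs before removing them
    kubectl get imagedisk -n vmware-system-kubeimage
    # Delete them all; the caches are rebuilt on the next deployment
    kubectl delete imagedisk --all -n vmware-system-kubeimage
    # After retrying the deployment, confirm that fresh ImageDisks appear
    kubectl get imagedisk -n vmware-system-kubeimage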