Velero Backup Failure – “Failed to create snapshot: failed to take snapshot of the volume”

Article ID: 420293

Products

Tanzu Kubernetes Runtime

VMware Tanzu Kubernetes Grid Management

Issue/Introduction

  • In certain vSphere environments, administrators may observe that Velero backup jobs fail partially or completely. The failure occurs even when backup commands are executed with correct parameters.
  • Typical commands attempted include:

          velero backup create ##### --include-namespaces ###### --snapshot-move-data --wait

  • During these operations, Velero logs may report errors similar to the following:

      level=error msg="Fail to wait VolumeSnapshot turned to ReadyToUse: CSI got timed out with error: Failed to create snapshot: failed to take snapshot of the volume ##########: rpc error: code = DeadlineExceeded desc = context deadline exceeded" Backup=##### Operation ID=##### Source PVC=#####

      message: /VolumeSnapshotContent snapcontent-#### has error: Failed to create snapshot: failed to take snapshot of the volume ####: "rpc error: code = DeadlineExceeded desc = context deadline exceeded"

  • This issue is commonly associated with timeouts during snapshot preparation when using the CSI plugin.
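
To confirm that a failed backup matches this signature, the backup log can be filtered for the error text quoted above. The following is a minimal sketch, not an official tool: the function name `find_snapshot_timeouts` is hypothetical, and the pattern is taken only from the sample messages in this article, so it may need adjusting if the log wording differs in a given release.

```shell
#!/bin/sh
# Sketch only: find_snapshot_timeouts is a hypothetical helper, not part of Velero.
# Reads log text on stdin and prints every line that contains the CSI snapshot
# timeout signature quoted in this article.
# Exit status follows grep: 0 if at least one line matched, 1 if none did.
find_snapshot_timeouts() {
    grep -E 'failed to take snapshot of the volume .*code = DeadlineExceeded'
}
```

A typical use would be to pipe the log of the failed backup into the function, for example `velero backup logs <backup-name> | find_snapshot_timeouts`, where `<backup-name>` is the name of the backup.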

Environment

Velero

Tanzu Kubernetes Grid Management

Cause

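The failure occurs because the Kopia CSI PVC plugin times out while preparing the data upload. As the log shows, the CSI driver does not finish taking the volume snapshot before the deadline (DeadlineExceeded), so the VolumeSnapshot never becomes ReadyToUse. The snapshot creation process therefore does not complete, and the backup job fails.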

Resolution

To bypass the slow CSI snapshot and “prepare upload” process, administrators can run the backup using the filesystem backup method. This approach avoids reliance on CSI snapshots and mitigates timeout issues.

Run the following command:

velero backup create ##### --include-namespaces ##### --default-volumes-to-fs-backup --include-cluster-resources=true --wait
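
For repeated use, the filesystem backup command can be wrapped in a small shell function that also prints the backup details afterwards. This is a sketch under stated assumptions: `run_fs_backup` and the `VELERO_BIN` variable are hypothetical names introduced here for illustration, and the function uses the `--include-namespaces` flag spelling defined by the Velero CLI.

```shell
#!/bin/sh
# Sketch only: run_fs_backup and VELERO_BIN are hypothetical names, not part of Velero.
run_fs_backup() {
    backup_name=$1   # name for the new backup
    namespace=$2     # namespace to include in the backup

    # VELERO_BIN defaults to the real CLI; set it to "echo" for a dry run
    # that only prints the arguments.
    "${VELERO_BIN:-velero}" backup create "$backup_name" \
        --include-namespaces "$namespace" \
        --default-volumes-to-fs-backup \
        --include-cluster-resources=true \
        --wait || return 1

    # Show the backup details, including per-volume results, once it has finished.
    "${VELERO_BIN:-velero}" backup describe "$backup_name" --details
}
```

Calling `VELERO_BIN=echo run_fs_backup <backup-name> <namespace>` prints the two command lines without contacting the cluster, which is a convenient way to check the arguments before running the real backup.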

Additional Information

The filesystem backup method reads volume data directly from the pod's mounted file system instead of taking CSI snapshots, so the backup does not wait on snapshot creation and is less likely to hit this timeout.

For detailed guidance, refer to the official Velero documentation: https://velero.io/docs/v1.17/file-system-backup/