Velero Repo Maintenance Job pod fails with "Fatal: config or key <key-value> is damaged: ciphertext verification failed"

Article ID: 405021

Updated On:

Products

VMware vSphere Kubernetes Service

Issue/Introduction

Velero repo-maintenance-job pods are failing on the VKS cluster due to errors encountered during Restic prune operations. Specifically, the prune step fails with a ciphertext verification error, which prevents repository cleanup and may impact backup efficiency or retention.

$ kubectl logs <velero-maintenance-job-pod> -n velero
time="2025-07-02T08:06:01Z" level=error msg="Restic command fail with ExitCode: 1. Process ID is 18, Exit error is: exit status 1" logSource="pkg/util/exec/exec.go:66"
time="2025-07-02T08:06:01Z" level=error msg="An error occurred when running repo prune" error="failed to prune repo: error running command=restic prune --repo=<REDACTED_REPO_URL> --password-file=<REDACTED_PASSWORD_PATH> --cache-dir=<REDACTED_CACHE_DIR>, stdout=, stderr=Fatal: config or key <REDACTED_KEY> is damaged: ciphertext verification failed\n: exit status 1" error.file="pkg/repository/restic/repository.go:123" error.function="(*RepositoryService).exec" logSource="pkg/cmd/cli/repomaintenance/maintenance.go:72"
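To check several maintenance job pods for this specific failure, the log text can be filtered for the error string. The following is a minimal sketch, not an official procedure: the helper name has_ciphertext_error is hypothetical, and the pod name in the usage comment is a placeholder.

```shell
# Exit status 0 when the log text on stdin contains the Restic ciphertext error.
# (has_ciphertext_error is a hypothetical helper name used for illustration.)
has_ciphertext_error() {
  grep -q 'ciphertext verification failed'
}

# Hypothetical usage against a live cluster (the pod name is a placeholder):
#   kubectl logs <velero-maintenance-job-pod> -n velero | has_ciphertext_error \
#     && echo "repository affected"
```

The function only matches the message text shown in the log above, so it does not distinguish between a damaged config and a damaged key.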

Environment

VMware vSphere Kubernetes Service

Velero 1.12+

Cause

Velero’s Restic-based jobs are failing because the repository’s encryption data (the config or key named in the error) is corrupted. As a result, Restic rejects prune and unlock operations with "ciphertext verification failed" errors. Because multiple repositories show the same issue, the corruption likely stems from transient I/O interruptions or degraded storage performance during backup writes.
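To see which repositories are Restic-based and therefore candidates for this failure, the BackupRepository resources can be listed by type. This is a sketch that assumes the upstream Velero BackupRepository fields spec.repositoryType and status.phase are present in the installed version; the helper name restic_repos is hypothetical.

```shell
# Reads "NAME TYPE PHASE" rows (with a header line) on stdin and prints the
# names of repositories whose type is restic.
# (restic_repos is a hypothetical helper name used for illustration.)
restic_repos() {
  awk 'NR > 1 && $2 == "restic" { print $1 }'
}

# Hypothetical usage, assuming Velero runs in the "velero" namespace:
#   kubectl -n velero get backuprepositories.velero.io \
#     -o custom-columns=NAME:.metadata.name,TYPE:.spec.repositoryType,PHASE:.status.phase \
#     | restic_repos
```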

Resolution

  • If the problem occurs with Restic-based backups, migrate to Kopia, which replaces Restic as the default volume backup engine in Velero v1.12+. Restic is officially deprecated as of Velero v1.15.
  • If corruption errors persist after upgrading to v1.15+, open a support case with Broadcom for further investigation.
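Before migrating, it can help to confirm which uploader the Velero server is currently configured to use. The sketch below assumes that the server runs as a deployment named velero in the velero namespace and that the uploader is set through the --uploader-type server argument, as in upstream Velero. On VKS the setting may instead be managed by the installed package, so treat this as illustrative only. The helper name uploader_type is hypothetical.

```shell
# Reads the server's container arguments on stdin and prints the value of
# --uploader-type, or "unset" when the argument is absent (in which case the
# server's built-in default applies).
# (uploader_type is a hypothetical helper name used for illustration.)
uploader_type() {
  value=$(grep -o -- '--uploader-type=[a-z]*' | head -n 1 | cut -d= -f2)
  echo "${value:-unset}"
}

# Hypothetical usage (deployment and namespace names are assumptions):
#   kubectl -n velero get deployment velero \
#     -o jsonpath='{.spec.template.spec.containers[0].args}' | uploader_type
```

A result of restic indicates that new file system backups still go through Restic, and kopia indicates that the migration recommended above is already in effect for new backups.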