Commvault Kubernetes Backup Failure - Worker Pod Creation Failed Due to Pod Security Admission Policy Violation in Namespace

Article ID: 415616

Products

VMware vSphere Kubernetes Service

Issue/Introduction

During a Commvault backup, the worker pod for the prometheus deployment in a namespace could not be created, and the backup job failed. The Commvault logs show the following sequence:

  • Failed to create PVC from snapshot: K8sApp::CreateVolsFromSnapshot() - Attempting to create PVC [<YOUR_NAMESPACE>][prometheus-pvc-prometheus-cv-<>] from snapshot
  • Worker pod not found after creation attempt: K8sApp::CreateTARWorker() - Failed to fetch Pod failure reason. ... App not found. [Kind : Pod][Namespace : <YOUR_NAMESPACE>][Name: prometheus-pvc-prometheus-cv-<>]
  • Cleanup of PVC and snapshot after failure.
  • Backup job fails: VSBkpWorker::BackupVMFileCollection() - Failed to open file collection object.

Investigation of kube-apiserver audit logs on a guest cluster control plane node revealed that the attempt to create the pod in the namespace was blocked by a Pod Security Admission (PSA) policy, showing a "403 Forbidden" error:

{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "RequestResponse",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/YOUR_NAMESPACE/pods",
  "verb": "create",
  "user": {
    "username": "system:serviceaccount:default:commvault-sa",
    "groups": [
      "system:serviceaccounts",
      "system:serviceaccounts:default",
      "system:authenticated"
    ]
  },
  "objectRef": {
    "resource": "pods",
    "namespace": "YOUR_NAMESPACE",
    "name": "prometheus-pvc-prometheus-cv-<>",
    "apiVersion": "v1"
  },
  "responseStatus": {
    "status": "Failure",
    "message": "pods \"prometheus-pvc-prometheus-cv-<>\" is forbidden: violates PodSecurity \"restricted:latest\": allowPrivilegeEscalation != false (container \"cvcontainer\" must set securityContext.allowPrivilegeEscalation=false),unrestricted capabilities (container \"cvcontainer\" must set securityContext.capabilities.drop=[\"ALL\"]), runAsNonRoot != true (pod or container \"cvcontainer\" must set securityContext.runAsNonRoot=true),seccompProfile (pod or container \"cvcontainer\" must set securityContext.seccompProfile.type to \"RuntimeDefault\" or \"Localhost\")",
    "reason": "Forbidden",
    "code": 403
  },
  "annotations": {
    "pod-security.kubernetes.io/enforce-policy": "restricted:latest"
  }
}
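
To locate events like the one above, the audit log can be filtered for the PSA denial text. The following is a minimal sketch, not a vendor-provided procedure: the function name find_psa_denials is hypothetical, and the default log path is an assumption, because the actual location depends on the kube-apiserver --audit-log-path setting of the cluster.

```shell
# Assumed default path; override AUDIT_LOG to match your kube-apiserver --audit-log-path.
AUDIT_LOG="${AUDIT_LOG:-/var/log/kubernetes/kube-apiserver.log}"

# Hypothetical helper: print every audit event in file $1 that
# Pod Security Admission rejected (fixed-string match on the denial text).
find_psa_denials() {
  grep -F 'violates PodSecurity' "$1"
}

# Run the search only when the log file is readable on this node.
if [ -r "$AUDIT_LOG" ]; then
  find_psa_denials "$AUDIT_LOG"
fi
```

Each matching line is a complete audit event, so the responseStatus.message field of the match names the profile and the violated controls.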

Cause

The backup fails because the namespace enforces the Kubernetes Pod Security Admission (PSA) "restricted:latest" profile, which prevents creation of the Commvault worker pod. The Kubernetes API server rejects the worker pod definition because its security context violates the enforced profile. The rejection appears as a "403 Forbidden" error in the kube-apiserver audit logs, which lists the specific security context violations: allowPrivilegeEscalation, capabilities.drop, runAsNonRoot, and seccompProfile.
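
To confirm which PSA level applies to the namespace, its pod-security.kubernetes.io labels can be displayed. This is a minimal sketch: the function name show_psa_labels is hypothetical, and it assumes kubectl is already configured against the guest cluster.

```shell
# Hypothetical helper: show the Pod Security Admission labels of namespace $1.
# Each -L flag adds one column for that label key; an empty column means the
# label is not set on the namespace, so the cluster-wide default applies.
show_psa_labels() {
  kubectl get namespace "$1" \
    -L pod-security.kubernetes.io/enforce \
    -L pod-security.kubernetes.io/enforce-version \
    -L pod-security.kubernetes.io/warn \
    -L pod-security.kubernetes.io/audit
}
```

For example, show_psa_labels YOUR_NAMESPACE prints one row for the namespace with a column per PSA label.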

Resolution

To resolve this issue, relax the Pod Security Admission (PSA) enforcement level of the affected namespace so that privileged pods are allowed. This is done by applying a label to the namespace.

Steps to Relax PSA Enforcement:

  1. Identify the affected namespace (e.g., YOUR_NAMESPACE).
  2. Execute the following kubectl command to overwrite the Pod Security enforcement label for the namespace:
    kubectl label --overwrite ns YOUR_NAMESPACE pod-security.kubernetes.io/enforce=privileged
    • Note: Replace YOUR_NAMESPACE with the actual name of your namespace.

This action will allow the worker pod to be created successfully, enabling the backup process to proceed without further security policy violations.
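
The labeling step above can be wrapped in a small helper that also displays the resulting label for verification. This is a sketch under assumptions: the function name relax_psa is hypothetical, and kubectl is assumed to be configured against the guest cluster with permission to label namespaces.

```shell
# Hypothetical helper: set the PSA enforce level of namespace $1 to
# "privileged" (the same command as step 2), then show the resulting label.
relax_psa() {
  ns="$1"
  if [ -z "$ns" ]; then
    echo "usage: relax_psa NAMESPACE" >&2
    return 1
  fi
  # The get command runs only if the label command succeeded.
  kubectl label --overwrite namespace "$ns" pod-security.kubernetes.io/enforce=privileged &&
    kubectl get namespace "$ns" -L pod-security.kubernetes.io/enforce
}
```

For example, relax_psa YOUR_NAMESPACE applies the label; the ENFORCE column of the output should then read privileged. Rerun the Commvault backup job afterwards to confirm that the worker pod is created.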

Additional Information

For more detailed information on configuring PSA for TKR 1.25 and later, please refer to the official documentation: Managing Security for TKR Service Clusters.