During a Commvault backup, the worker pod for the prometheus deployment could not be created in a namespace, and the backup failed. The Commvault logs showed:
Failed to create PVC from snapshot: K8sApp::CreateVolsFromSnapshot() - Attempting to create PVC [<YOUR_NAMESPACE>][prometheus-pvc-prometheus-cv-<>] from snapshot
Worker pod not found after creation attempt: K8sApp::CreateTARWorker() - Failed to fetch Pod failure reason. ... App not found. [Kind : Pod][Namespace : <YOUR_NAMESPACE>][Name: prometheus-pvc-prometheus-cv-<>]
Cleanup of PVC and snapshot after failure.
Backup job fails: VSBkpWorker::BackupVMFileCollection() - Failed to open file collection object.
Investigation of the kube-apiserver audit logs on a guest cluster control plane node revealed that the attempt to create the pod in the namespace was blocked by a Pod Security Admission (PSA) policy, with a "403 Forbidden" error:
{"kind": "Event","apiVersion": "audit.k8s.io/v1","level": "RequestResponse","stage": "ResponseComplete","requestURI": "/api/v1/namespaces/YOUR_NAMESPACE/pods","verb": "create","user": {"username": "system:serviceaccount:default:commvault-sa","groups": ["system:serviceaccounts","system:serviceaccounts:default","system:authenticated"]},"objectRef": {"resource": "pods","namespace": "YOUR_NAMESPACE","name": "prometheus-pvc-prometheus-cv-<>","apiVersion": "v1"},"responseStatus": {"status": "Failure","message": "pods \"prometheus-pvc-prometheus-cv-<>\" is forbidden: violates PodSecurity \"restricted:latest\": allowPrivilegeEscalation != false (container \"cvcontainer\" must set securityContext.allowPrivilegeEscalation=false),unrestricted capabilities (container \"cvcontainer\" must set securityContext.capabilities.drop=[\"ALL\"]), runAsNonRoot != true (pod or container \"cvcontainer\" must set securityContext.runAsNonRoot=true),seccompProfile (pod or container \"cvcontainer\" must set securityContext.seccompProfile.type to \"RuntimeDefault\" or \"Localhost\")","reason": "Forbidden","code": 403},"annotations": {"pod-security.kubernetes.io/enforce-policy": "restricted:latest"}}
The backup failure is caused by Kubernetes Pod Security Admission (PSA) policies, specifically the "restricted:latest" profile, which prevents the creation of the necessary worker pod. The Kubernetes API server rejects the worker pod definition because its security context violates the enforced PSA policies. This rejection is evidenced by a "403 Forbidden" error in the kube-apiserver audit logs, detailing specific security context violations such as allowPrivilegeEscalation, capabilities.drop, runAsNonRoot, and seccompProfile.
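To locate rejections like the one above, the audit log can be filtered for the PSA denial message. The following is a minimal sketch, assuming the audit log is a file with one JSON event per line. The helper name `find_psa_denials` is hypothetical, and the log path is not given in this article because it depends on how the cluster is configured.

```shell
#!/bin/sh
# Sketch only: find_psa_denials is a hypothetical helper, not a kubectl or Commvault command.
# It prints every audit event whose response message reports a Pod Security violation.
find_psa_denials() {
  # $1 = path to a kube-apiserver audit log (assumed: one JSON event per line)
  grep -F 'violates PodSecurity' "$1"
}
```

A possible invocation on the control plane node is `find_psa_denials /path/to/audit.log`, where the path is a placeholder for the audit log location in your environment.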
To resolve this issue, the Pod Security Admission (PSA) enforcement level in the affected namespace needs to be relaxed to allow for privileged pods. This can be achieved by applying a label to the namespace.
Steps to Relax PSA Enforcement:
kubectl label --overwrite ns YOUR_NAMESPACE pod-security.kubernetes.io/enforce=privileged
This action will allow the worker pod to be created successfully, enabling the backup process to proceed without further security policy violations.
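After relabeling, it may help to confirm that the namespace carries the expected enforce level before rerunning the backup. The following is a minimal sketch, assuming kubectl is pointed at the guest cluster. The helper name `psa_enforce_level` is hypothetical, and `-L` is the standard kubectl option that adds a label as an output column.

```shell
#!/bin/sh
# Sketch only: psa_enforce_level is a hypothetical wrapper around kubectl.
# It lists the namespace with an extra column showing the value of the
# pod-security.kubernetes.io/enforce label (empty if the label is not set).
psa_enforce_level() {
  # $1 = namespace name
  kubectl get namespace "$1" -L pod-security.kubernetes.io/enforce
}
```

For example, `psa_enforce_level YOUR_NAMESPACE` should show `privileged` in the label column once the label from the step above has been applied.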
For more detailed information on configuring PSA for TKR 1.25 and later, please refer to the official documentation: Managing Security for TKR Service Clusters.