Spark App IDS Degraded Due to PVC Pending – Storage Policy Compatibility Issue

Article ID: 434085

Updated On:

Products

VMware vDefend Firewall with Advanced Threat Prevention

Issue/Introduction

  • Alert observed: Platform Service spark-app-ids is degraded

 

The Spark driver logs show the warnings below. Reference article: https://knowledge.broadcom.com/external/article?articleNumber=384122

Command from the above KB:

 k -n nsxi-platform get pod spark-app-ids-driver -o jsonpath='{.metadata.ownerReferences[0].kind}'

Warnings in the Spark driver logs:

 WARN task-starvation-timer TaskSchedulerImpl - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
 WARN task-starvation-timer TaskSchedulerImpl - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
  • Some pods may be stuck in the Pending state with the error below.

Error seen when describing the pods:

 0/8 nodes are available: pod has unbound immediate PersistentVolumeClaims
 preemption: 0/8 nodes are available: preemption is not helpful for scheduling

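Command to find the affected PVCs from SSPi (the -i flag is needed because the status is reported as "Pending", and grep is case-sensitive by default):

 k get pvc -A | grep -i pending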
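
Matching the word "pending" anywhere in the line can also catch a namespace or claim name that happens to contain it. As a minimal sketch, assuming that k is an alias for kubectl and that the status is the third column of the `get pvc -A --no-headers` output, the hypothetical helper `pending_pvcs` below prints only the claims whose status is exactly Pending:

```shell
# Hypothetical helper (not part of SSP or kubectl): reads the output of
# `kubectl get pvc -A --no-headers` on stdin and prints "namespace/name"
# for every claim whose STATUS column (third field) is exactly "Pending".
pending_pvcs() {
  awk '$3 == "Pending" { print $1 "/" $2 }'
}

# Usage, run where kubectl can reach the SSP cluster:
#   kubectl get pvc -A --no-headers | pending_pvcs
# Then review the events recorded for one affected claim:
#   kubectl -n <namespace> describe pvc <pvc-name>
```

The events at the end of the `describe pvc` output usually repeat the provisioning error reported by the CSI driver, which helps confirm that the claim is failing for the reason described in this article.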

This indicates that the pods were unable to be scheduled because the required PersistentVolumeClaims (PVCs) were not successfully bound. Kubernetes could not provision or attach the required storage.

The CSI controller logs may show the error below:

Error message: "No compatible datastore found for storagePolicy", indicating a conflict between the requested storage policy and the available infrastructure datastores.

Identify the CSI controller pod

 kubectl get pods -A | grep -i csi

Typically, you’ll see something like:

  • vsphere-csi-controller-xxxxx

  • Namespace usually: vmware-system-csi (or similar)

View the logs of the CSI controller pod:

 k logs -n vmware-system-csi <csi-controller-pod-name>

CSI controller pods usually have multiple containers, such as:

  • vsphere-csi-controller

  • csi-provisioner

  • csi-attacher

  • csi-resizer

  • liveness-probe

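Because the pod has several containers, select the one to read with -c. For example, to follow the vsphere-csi-controller container logs:

 k logs -f -n vmware-system-csi <pod-name> -c vsphere-csi-controller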
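
The controller log can be long, so it may help to filter it for the error quoted earlier. The following is a minimal sketch, not a supported tool: `policy_errors` is a hypothetical helper name, and the namespace and container name are the ones assumed in the commands above.

```shell
# Hypothetical helper: keeps only the log lines that mention the storage
# policy compatibility failure described in this article (case-insensitive).
policy_errors() {
  grep -i 'No compatible datastore found'
}

# Usage:
#   kubectl logs -n vmware-system-csi <pod-name> -c vsphere-csi-controller | policy_errors
```

If the filter prints nothing, the provisioning failure may have a different cause and the full log should be reviewed.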

Environment

SSP 5.1

Cause

  • The storage policy associated with the PVC does not have any compatible datastores available.
  • As a result, the applied storage policy does not return any compatible storage.
  • This may lead to a storage provisioning failure.
  • The PVCs may then remain unbound, and the pods stay in the Pending state.
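
To find which storage policy to check in vCenter, the affected PVC can be traced to its StorageClass. The sketch below is illustrative only: `pvc_storage_policy` is a hypothetical helper, and it assumes that the StorageClass names the policy in its parameters (upstream vSphere CSI StorageClasses commonly use a `storagepolicyname` parameter, but the SSP deployment may differ).

```shell
# Hypothetical helper: prints the StorageClass requested by a PVC, followed by
# that StorageClass's parameters, where a vSphere CSI StorageClass would
# normally name its storage policy (assumption: e.g. "storagepolicyname").
pvc_storage_policy() {
  ns="$1"
  pvc="$2"
  # Read the StorageClass name from the claim spec.
  sc=$(kubectl -n "$ns" get pvc "$pvc" -o jsonpath='{.spec.storageClassName}') || return 1
  if [ -z "$sc" ]; then
    echo "PVC $ns/$pvc has no storageClassName" >&2
    return 1
  fi
  echo "StorageClass: $sc"
  # Print all parameters instead of guessing a single key.
  kubectl get storageclass "$sc" -o jsonpath='{.parameters}'
  echo
}

# Usage:
#   pvc_storage_policy <namespace> <pvc-name>
```

The policy name found this way is the one to look up under VM Storage Policies in vCenter.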

 

Resolution

Ensure that the storage policy used for SSP lists valid compatible datastores. If the compatible datastore list for the policy is empty, contact Broadcom Support for further troubleshooting. For example, the storage policy in the screenshot below does not have any datastores in its compatible list.

vCenter Server > Policies and Profiles > VM Storage Policies > policy used for SSP