Alert observed: Platform Service spark-app-ids is degraded
Reference article: https://knowledge.broadcom.com/external/article?articleNumber=384122
Command as per the above KB:
k -n nsxi-platform get pod spark-app-ids-driver -o jsonpath='{.metadata.ownerReferences[0].kind}'
Spark driver logs show:
WARN task-starvation-timer TaskSchedulerImpl - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
WARN task-starvation-timer TaskSchedulerImpl - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
Error seen while describing pods: 0/8 nodes are available: pod has unbound immediate PersistentVolumeClaims
preemption: 0/8 nodes are available: preemption is not helpful for scheduling
Commands to find the affected PVCs from SSPi:
k get pvc -A | grep -i pending
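A plain grep can also match the word "pending" inside a PVC or namespace name. As a minimal sketch, the helper below reads the output of the PVC listing on standard input and prints only the entries whose STATUS column is exactly Pending. The function name filter_pending_pvcs is hypothetical and is not part of kubectl or SSP; the sketch assumes the default column layout of the -A listing (NAMESPACE, NAME, STATUS, ...).

```shell
# Hypothetical helper (not part of kubectl or SSP).
# Reads `kubectl get pvc -A` output on stdin and prints
# "<namespace>/<name>" for each PVC whose STATUS column is Pending.
# Assumes the default -A column order: NAMESPACE NAME STATUS ...
filter_pending_pvcs() {
  # The header row has "STATUS" in column 3, so it never matches.
  awk '$3 == "Pending" { print $1 "/" $2 }'
}

# Example invocation (needs cluster access):
#   kubectl get pvc -A | filter_pending_pvcs
```

Each printed namespace/name pair can then be passed to "k describe pvc -n <namespace> <name>" to see the provisioning events for that claim.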
This indicates that the pods were unable to be scheduled because the required PersistentVolumeClaims (PVCs) were not successfully bound. Kubernetes could not provision or attach the required storage.
CSI controller logs may show the error below:
Error message: "No compatible datastore found for storagePolicy", indicating a conflict between the requested storage policy and the available infrastructure datastores.
Identify the CSI controller pod
kubectl get pods -A | grep -i csi
Typically, you’ll see a pod named something like:
vsphere-csi-controller-xxxxx
in the namespace vmware-system-csi (or similar).
Then fetch its logs:
k logs -n vmware-system-csi <csi-controller-pod-name>
CSI controller pods usually have multiple containers, such as:
vsphere-csi-controller
csi-provisioner
csi-attacher
csi-resizer
liveness-probe
k logs -f -n vmware-system-csi <pod-name> -c vsphere-csi-controller
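The controller log can be long, so it may help to filter it for the storage policy error quoted earlier. The sketch below is one possible way to do that: it reads log text on standard input and prints only the lines that contain that message. The function name find_policy_errors is hypothetical, and the exact wording of the message in your build may differ from the string used here.

```shell
# Hypothetical helper (not part of kubectl or the CSI driver).
# Reads log text on stdin and prints only the lines containing the
# storage policy error. -F treats the pattern as a fixed string.
# "|| true" keeps the exit status 0 when nothing matches, so the
# helper is safe to use in scripts that run with "set -e".
find_policy_errors() {
  grep -F 'No compatible datastore found for storagePolicy' || true
}

# Example invocation (needs cluster access; placeholder pod name):
#   kubectl logs -n vmware-system-csi <pod-name> -c vsphere-csi-controller | find_policy_errors
```

No output means the message was not found in the log that was piped in, which does not by itself rule out a storage policy problem.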
SSP 5.1
The applied storage policy did not return any compatible storage
This may lead to storage provisioning failure
As a result, PVCs may remain unbound and pods will stay in Pending state
Ensure that the storage policy used for SSP shows valid datastores. If the compatible datastore list does not show any entries, please contact Broadcom Support for further troubleshooting. For example, in the screenshot below the storage policy does not have any datastores in the compatible list.
vCenter Server > Policies and Profiles > VM Storage Policies > policy used for SSP