Symptoms:
1) After enabling NSX Intelligence/Security Intelligence, the feature shows as down. On further inspection, a few Spark application executors are in Pending state with the error "persistentvolumeclaim "rawflowcorrelator-xxxxxx-exec-x-pvc-0" not found".
SSH into the NSX Manager CLI using root credentials and run the command below to check the rawflowcorrelator pod status.
# napp-k get pods -n nsxi-platform | grep rawflow
nsxi-platform rawflowcorrelator-0x0x0x0x0x0x-exec-3 0/1 Pending 0 37m <none> <none> <none> <none>
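Optionally, you can list only the Pending executors without grepping. This is a sketch that assumes napp-k passes standard kubectl options through and that the executor pods carry the usual Spark-on-Kubernetes spark-role=executor label:
# napp-k get pods -n nsxi-platform -l spark-role=executor --field-selector=status.phase=Pending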
2) Describe the pod for more details on the error and check whether the following appears under "Events":
# napp-k describe pod rawflowcorrelator-0x0x0x0x0x0x-exec-3
"Warning FailedScheduling 5m55s (x553 over 8h) default-scheduler 0/8 nodes are available: persistentvolumeclaim "rawflowcorrelator-0x0x0x0x0x0x-exec-3-pvc-0" not found. preemption: 0/8 nodes are available: 8 Preemption is not helpful for scheduling."
3) If the above error event is observed in the describe output, check whether the PVC for the Pending executor exists. Run the command below; if the PVC is missing, the output will be empty and you should proceed to the resolution steps.
# napp-k get pvc | grep rawflow
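You can also query the exact PVC name reported in the scheduling error; if it is missing, the command returns a "not found" error. The pod/PVC name below is the placeholder from the example output above and will differ in your environment:
# napp-k get pvc rawflowcorrelator-0x0x0x0x0x0x-exec-3-pvc-0 -n nsxi-platform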
If any of the above symptoms does not match, this KB is not a relevant match for your problem statement. Try checking other KBs for relevant matches.
NAPP 4.1.2
PVC creation attempts from the spark-app-rawflow-driver or spark-app-overflow-driver pods failed during startup due to a slow network connection.
The application whose executors are stuck in Pending state needs to be restarted.
If rawflow executors are stuck in Pending state, restart the spark-app-rawflow-driver pod.
# napp-k delete pod spark-app-rawflow-driver -n nsxi-platform
If overflow executors are stuck in Pending state, restart the spark-app-overflow-driver pod.
# napp-k delete pod spark-app-overflow-driver -n nsxi-platform
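After deleting a driver pod, you can confirm that it is recreated and that the new executors now have their PVCs. The commands below are a sketch; exact pod names will differ in your environment:
# napp-k get pods -n nsxi-platform | grep spark-app-rawflow-driver
# napp-k get pvc -n nsxi-platform | grep rawflow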
Deleting the driver pod restarts the corresponding flow processing application.
This causes a momentary disruption to flow processing, but the flows are queued and processed once the pod restarts.
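To confirm recovery, watch the rawflowcorrelator executor pods until they reach the Running state (assuming the same pod naming as in the symptom output; press Ctrl+C to stop watching):
# napp-k get pods -n nsxi-platform -w | grep rawflowcorrelator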