You are running SSP 5.0 or later and are seeing an alarm with the following description:
"The disk storage usage of Security Services Platform node {{ .ResourceID }} is currently {{ .Value }}%, which is above the threshold value."
This alarm indicates that one or more worker nodes in your Security Services Platform (SSP) cluster are experiencing high disk storage usage, potentially impacting application performance, logging, and overall node health.
vDefend SSP >= 5.0
High disk usage on worker nodes can occur due to various reasons:
Run the 'k describe node {{ .ResourceID }}' command and pick any pod from the nsxi-platform namespace listed under the node. Restart the workload (Deployment or StatefulSet) that owns the pod. To find the owner kind, run:
k -n nsxi-platform get pod <pod-name> -o jsonpath='{.metadata.ownerReferences[0].kind}'
(Note: <service-name> is <pod-name> without the hash suffix.)
If the output is StatefulSet, follow the StatefulSet restart steps.
If the output is ReplicaSet, the pod belongs to a Deployment.
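The mapping from <pod-name> to <service-name> described above can be sketched as a small shell function. This is a minimal sketch, assuming the standard Kubernetes naming patterns (a Deployment pod is named <name>-<replicaset-hash>-<pod-suffix>, a StatefulSet pod is named <name>-<ordinal>). The function name service_name_from_pod is hypothetical and not part of SSP.

```shell
# Hypothetical helper: derive <service-name> from a pod name and its owner kind.
# Assumes standard Kubernetes pod naming; verify the result against
# the actual Deployment/StatefulSet names before restarting anything.
service_name_from_pod() {
  kind="$1"
  pod="$2"
  case "$kind" in
    ReplicaSet)
      # Deployment pod: strip the trailing "-<replicaset-hash>-<pod-suffix>"
      echo "$pod" | sed -E 's/-[^-]+-[^-]+$//'
      ;;
    StatefulSet)
      # StatefulSet pod: strip the trailing "-<ordinal>"
      echo "$pod" | sed -E 's/-[0-9]+$//'
      ;;
    *)
      echo "unsupported owner kind: $kind" >&2
      return 1
      ;;
  esac
}
```

For example, passing the owner kind returned by the jsonpath query together with the pod name prints the name to use as <service-name> in the restart commands.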
If {{ service-name }} is a StatefulSet, run:
k -n nsxi-platform rollout restart statefulset {{ service-name }}
Otherwise, run:
k -n nsxi-platform rollout restart deployment {{ service-name }}
Wait about 10 minutes, then run 'k -n nsxi-platform get pods' to confirm that the restarted pods are up and running.
Note: When a pod terminates, the services it provides are temporarily unavailable until a new pod is scheduled and becomes ready.
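The restart-and-wait steps above could be combined into one helper, sketched below. It assumes that 'k' in this article is shorthand for kubectl and that the owner kind has already been looked up. The function name restart_and_wait is hypothetical. Instead of a fixed wait, the sketch uses 'kubectl rollout status' with a 10-minute timeout, which returns as soon as the rollout completes.

```shell
# Hypothetical helper: restart the workload that owns a pod and wait for it.
# Usage: restart_and_wait <owner-kind> <service-name>
# Assumes kubectl is on PATH and points at the SSP cluster.
restart_and_wait() {
  kind="$1"
  svc="$2"
  case "$kind" in
    StatefulSet) resource="statefulset" ;;
    ReplicaSet)  resource="deployment" ;;   # ReplicaSet pods belong to a Deployment
    *)
      echo "unsupported owner kind: $kind" >&2
      return 1
      ;;
  esac
  # Trigger the rolling restart, then block until it finishes or 10 minutes pass.
  kubectl -n nsxi-platform rollout restart "$resource" "$svc" &&
    kubectl -n nsxi-platform rollout status "$resource" "$svc" --timeout=10m
}
```

A non-zero exit status means the restart was rejected or the rollout did not complete within the timeout, in which case inspect the pods with 'k -n nsxi-platform get pods'.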
Once scaling is complete, validate that the new nodes are operational:
k get nodes
Expected output:
NAME       STATUS   ROLES    AGE   VERSION
node-1     Ready    <role>   xxm   v1.xx.x
node-2     Ready    <role>   xxm   v1.xx.x
new-node   Ready    <role>   xxm   v1.xx.x   # Ensure the new node is Ready
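On a larger cluster it may be easier to list only the nodes that are not Ready than to read the full table. The following is a minimal sketch, assuming 'k' is shorthand for kubectl. The function name not_ready_nodes is hypothetical.

```shell
# Hypothetical helper: read `kubectl get nodes --no-headers` output on stdin
# and print the name of every node whose STATUS column is not exactly "Ready".
# A cordoned node (STATUS "Ready,SchedulingDisabled") is also reported,
# because it cannot accept newly scheduled pods.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}
```

Run it as 'kubectl get nodes --no-headers | not_ready_nodes'. Empty output means every node, including the new one, reports Ready.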
Run 'k -n nsxi-platform get pods -o wide' to see which node each pod is running on, and repeat the first two steps to distribute services across the new node.
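To judge whether services have actually spread onto the new node, a per-node pod count can be more readable than the wide listing. The following is a minimal sketch, assuming 'k' is shorthand for kubectl. The function name pods_per_node is hypothetical.

```shell
# Hypothetical helper: read one node name per line on stdin and print
# "<node> <pod-count>" for each distinct node.
pods_per_node() {
  sort | uniq -c | awk '{ print $2, $1 }'
}
```

Feed it the node name of every pod in the namespace, for example with 'kubectl -n nsxi-platform get pods -o custom-columns=NODE:.spec.nodeName --no-headers | pods_per_node'. Pods that are not yet scheduled appear under '<none>'.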