vDefend SSP Alarm: Metrics service CPU usage is high or very high

Article ID: 384109

Products

VMware vDefend Firewall
VMware vDefend Firewall with Advanced Threat Prevention

Issue/Introduction

An alarm is raised with the following description:

"The CPU usage of Metrics service {service-name}/{pod} is currently {value}%, which is above the threshold value."

Example:

The CPU usage of Metrics service metrics-manager/metrics-manager-59959c48b7-zgjx7 is currently 95%, which is above the threshold value.

The CPU usage of Metrics service metrics-postgresql-ha-postgresql/metrics-postgresql-ha-postgresql-0 is currently 93%, which is above the threshold value.

The {service-name} is one of the following:

1. metrics-manager
2. metrics-query-server
3. metrics-app-server
4. metrics-db-helper
5. metrics-postgresql-ha-postgresql
6. metrics-postgresql-ha-pgpool

The {pod} is dynamic and is prefixed by the service name, e.g. metrics-manager-xxxxxxxxxx-xxxxx.
The {value} is dynamic and represents the current CPU usage of the {pod}.
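
When triaging several alarms, it can help to pull the three dynamic fields out of the description text. The following is a minimal sketch, assuming the alarm text follows the format quoted above exactly; the function name parse_metrics_alarm is hypothetical and is not part of the product.

```shell
# Hypothetical helper (not shipped with the product): split an alarm
# description into "<service-name> <pod> <value>".
# Assumes the description matches the format quoted in this article.
parse_metrics_alarm() {
  # sed -n ... p prints only when the pattern matches, so an unrelated
  # message produces empty output.
  printf '%s\n' "$1" | sed -n \
    's|.*Metrics service \([^/ ]*\)/\([^ ]*\) is currently \([0-9.]*\)%.*|\1 \2 \3|p'
}

parse_metrics_alarm "The CPU usage of Metrics service metrics-manager/metrics-manager-59959c48b7-zgjx7 is currently 95%, which is above the threshold value."
```

The first field is the {service-name} used in the Resolution steps.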

If the alarm stays open for more than 30 minutes or occurs repeatedly, proceed to the Resolution section.

Environment

vDefend SSP >= 5.0

Cause

The current CPU limits for {service-name} are not sufficient to process the current load.

Resolution

Maintenance window required for remediation?
No

First, try restarting the deployment or statefulset. This should resolve any transient issues.

Log in to the SSPI root shell.

If {service-name} is metrics-postgresql-ha-postgresql:

Run 'k rollout restart statefulset metrics-postgresql-ha-postgresql -n nsxi-platform'

Otherwise:

Run 'k rollout restart deployment {service-name} -n nsxi-platform'
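
The branch above can be sketched as a small shell function that prints the matching command for review before it is run. This is an illustrative sketch only: the function name restart_cmd is hypothetical, and it assumes the six service names listed in this article are the only valid inputs.

```shell
# Hypothetical helper: print (do not execute) the restart command for a
# Metrics service, following the statefulset/deployment split described above.
restart_cmd() {
  case "$1" in
    metrics-postgresql-ha-postgresql)
      # The only service this article restarts as a statefulset.
      echo "k rollout restart statefulset $1 -n nsxi-platform" ;;
    metrics-manager|metrics-query-server|metrics-app-server|metrics-db-helper|metrics-postgresql-ha-pgpool)
      echo "k rollout restart deployment $1 -n nsxi-platform" ;;
    *)
      # Reject names outside the list in this article.
      echo "unknown Metrics service: $1" >&2
      return 1 ;;
  esac
}

restart_cmd metrics-query-server
```

Printing the command rather than executing it keeps the sketch safe to try anywhere; copy the printed line into the SSPI root shell to run it.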

Wait approximately 20 minutes and check whether the alarm has auto-resolved.

If the alarm persists and {service-name} is metrics-manager, metrics-query-server, or metrics-app-server, follow the steps below to scale out the service:

- On the Security Services Platform UI:

  1. Navigate to System | Platform & Features | Core Services
  2. On the Metrics tile, click Scale Out

- Log in to the SSPI root shell
- Run 'k rollout restart deployment {service-name} -n nsxi-platform'
- Run 'k rollout restart deployment metrics-manager -n nsxi-platform'

For any other service, please open a ticket with Broadcom Support.
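
The follow-up after scaling out can be sketched the same way. The function below is a hypothetical illustration, not a product tool: it prints the restart commands listed above for the three services that support scale-out and returns an error for any other service. Skipping the duplicate command when {service-name} is already metrics-manager is an assumption of this sketch, not something the article states.

```shell
# Hypothetical helper: print the post-scale-out restart commands for a service.
post_scale_out_cmds() {
  case "$1" in
    metrics-manager|metrics-query-server|metrics-app-server) ;;
    *)
      # Any other service is outside the scale-out path in this article.
      echo "$1: open a ticket with Broadcom Support" >&2
      return 1 ;;
  esac
  echo "k rollout restart deployment $1 -n nsxi-platform"
  # Assumption: avoid printing the same metrics-manager restart twice.
  if [ "$1" != "metrics-manager" ]; then
    echo "k rollout restart deployment metrics-manager -n nsxi-platform"
  fi
}

post_scale_out_cmds metrics-app-server
```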
