SSP: Metrics Memory Usage High alarm observed on SSP UI due to metrics services scale out

Article ID: 401844

Products

VMware vDefend Firewall
VMware vDefend Firewall with Advanced Threat Prevention

Issue/Introduction

The Metrics Memory Usage High alarm is observed on the SSP UI.

Environment

SSP 5.0 and NSX 4.2.1

Cause

The memory alarm on the Metrics Postgres instance is primarily triggered by a high volume of metrics being ingested and processed. This typically indicates that the current memory allocation is insufficient to handle the load efficiently. The spike in memory usage is largely driven by the scale of metric reporters in the system, particularly those related to large-scale components such as LSPs, DFW, and IDPS. Each of these components contributes significantly to the overall metrics footprint, depending on their deployment size.

Resolution

Note: This issue is fixed in SSP 5.1

STEP1: Log in to the SSPI CLI with root credentials and confirm that all metrics pods are up and running. Once they are, set the metrics replicas as shown below.

alias kn='k -n nsxi-platform'

kn get pods | grep metrics

kn scale deployment metrics-manager --replicas=2
kn scale deployment metrics-app-server --replicas=1
kn scale deployment metrics-query-server --replicas=1
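The pod check in STEP1 can also be scripted rather than read by eye. The following is a minimal sketch, not part of the official procedure: the helper name metrics_pods_ready is hypothetical, it assumes the default `get pods` column layout (NAME, READY, STATUS), and it assumes that pods in Completed state are finished jobs that can be ignored.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `get pods` output on stdin and succeeds only if
# every metrics pod is Running with all of its containers ready (e.g. 2/2).
metrics_pods_ready() {
  awk '
    # Assumption: Completed pods are finished jobs and are skipped.
    $1 ~ /metrics/ && $3 != "Completed" {
      seen = 1
      split($2, ready, "/")
      if ($3 != "Running" || ready[1] != ready[2]) {
        print "not ready: " $1
        bad = 1
      }
    }
    # Fail if any pod is not ready, or if no metrics pod was found at all.
    END { exit (bad || !seen) ? 1 : 0 }
  '
}

# Usage on the SSPI CLI (assumes the `k` alias resolves to kubectl):
#   kubectl -n nsxi-platform get pods --no-headers | metrics_pods_ready \
#     && echo "all metrics pods ready"
```

The function only inspects text, so it can be tried against saved command output before being used on a live system.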

STEP2: Update the following values of metrics-postgresql-ha-pgpool using the command below:

kn edit deployment/metrics-postgresql-ha-pgpool

Initial values:
PGPOOL_NUM_INIT_CHILDREN=200
PGPOOL_MAX_POOL=2
Update the values to:
PGPOOL_NUM_INIT_CHILDREN=80
PGPOOL_MAX_POOL=1

This restarts metrics-postgresql-ha-pgpool. Wait for both replicas to be running, using the command below:

kn get pods | grep metrics-postgresql-ha-pgpool
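One way to read the STEP2 change is as a cap on how many PostgreSQL connections Pgpool can open. In Pgpool-II, each child process can cache up to max_pool connections, so one instance can hold roughly num_init_children × max_pool connections per backend node. The sketch below works through that arithmetic with the values from this article. The multiplication by two replicas and the comparison with the STEP4 limit are an interpretation, not a rationale stated by the vendor.

```shell
#!/usr/bin/env bash
# Values taken from STEP2 and STEP4 of this article.
pgpool_replicas=2        # STEP2 waits for "both replicas"
max_connections=180      # POSTGRESQL_MAX_CONNECTIONS set in STEP4

# Upper bound on backend connections across all Pgpool instances.
# Arguments: num_init_children max_pool replicas
backend_ceiling() {
  echo $(( $1 * $2 * $3 ))
}

before=$(backend_ceiling 200 2 "$pgpool_replicas")   # initial values
after=$(backend_ceiling 80 1 "$pgpool_replicas")     # updated values

echo "before: up to $before backend connections"
echo "after:  up to $after backend connections (PostgreSQL limit: $max_connections)"
```

Under these assumptions the ceiling drops from 800 to 160, which fits below the connection limit of 180 configured in STEP4.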

STEP3: Update the memory-related configuration in the configmap using the command below:

kn edit configmap metrics-postgresql-ha-postgresql-extended-configuration

Update the following parameters in the data section:

shared_buffers = 1500MB

work_mem = 8MB
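As a rough sanity check on these values, a common rule of thumb estimates PostgreSQL memory demand as shared_buffers plus work_mem for every allowed connection. The sketch below applies it using the connection limit from STEP4. It is only an approximation: work_mem applies per sort or hash operation, so a single complex query can use several multiples of it, and other memory areas are ignored.

```shell
#!/usr/bin/env bash
# Values from STEP3 and STEP4 of this article, in megabytes.
shared_buffers_mb=1500
work_mem_mb=8
max_connections=180

# Rule-of-thumb estimate: shared_buffers + one work_mem per connection.
estimate_mb=$(( shared_buffers_mb + work_mem_mb * max_connections ))

echo "rough memory estimate: ${estimate_mb} MB"
```

With the values above the estimate is 2940 MB, which can be compared against the memory limit of the metrics-postgresql-ha-postgresql pod.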

STEP4: Update POSTGRESQL_MAX_CONNECTIONS

kn set env sts/metrics-postgresql-ha-postgresql POSTGRESQL_MAX_CONNECTIONS=180

This restarts metrics-postgresql-ha-postgresql-0. Wait for the pod to reach the Running state, then verify that all pods are up and running:

kn get pods | grep metrics-postgresql-ha-postgresql-0

Additional Information

If the alarm is still observed after completing these steps, contact Broadcom Support for further troubleshooting.
