vDefend SSP Alarm: Platform Service status is down or degraded

Article ID: 384122

Products

VMware vDefend Firewall
VMware vDefend Firewall with Advanced Threat Prevention

Issue/Introduction

You are running vDefend SSP 5.0 or later and have encountered an alarm with the description:
"Platform Service {{ .ResourceID }} is degraded."

This indicates that one of the Platform Services is currently degraded or in an unhealthy state, impacting its functionality.

The following platform services can raise this alarm.

Deployments

  1. authelia

  2. authserver

  3. cluster-api

  4. monitor

  5. postgresql-ha-pgpool

  6. sentinel

StatefulSets

  1. nsx-config-0

  2. nsx-config-1

  3. postgresql-ha-postgresql

If the alarm stays open for more than 30 minutes, or if it occurs repeatedly, proceed to the Resolution section.

Environment

vDefend SSP Version: 5.0 and later

Cause

One or more pods of the platform service {{ .ResourceID }} are not in a running state.

Resolution

Steps to resolve:

Restart the Deployment or StatefulSet. This usually clears transient issues.

  • Log in to the SSPI root shell.

  • Get the pod name for {{ .ResourceID }}:

    k -n nsxi-platform get pods | grep {{ .ResourceID }}
  • Check which kind of workload owns the pod:

    k -n nsxi-platform get pod <pod-name> -o jsonpath='{.metadata.ownerReferences[0].kind}'

    • If the output is StatefulSet, the service is a StatefulSet.
    • If the output is ReplicaSet, the pod belongs to a Deployment.
  • If {{ .ResourceID }} is a StatefulSet, run:

    k -n nsxi-platform rollout restart statefulset {{ .ResourceID }}
  • Otherwise, run:

    k -n nsxi-platform rollout restart deployment {{ .ResourceID }}
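
The choice between the two restart commands can be sketched as a small shell helper. This is an illustrative sketch only, not a tool shipped with the product: the function names `workload_kind_for_owner` and `restart_platform_service` are hypothetical, and it assumes that `k` on the SSPI is an alias for `kubectl`. Aliases are not expanded in scripts, so the sketch calls the binary named in the `KUBECTL` variable. It also matches pods by name prefix, which is slightly stricter than the `grep` in the manual steps.

```shell
#!/bin/sh
# Hypothetical helper; assumes `k` in the manual steps is an alias for kubectl.
KUBECTL="${KUBECTL:-kubectl}"
NAMESPACE="nsxi-platform"

# Map a pod's owner kind to the workload kind that `rollout restart` expects.
workload_kind_for_owner() {
  case "$1" in
    StatefulSet) echo "statefulset" ;;
    ReplicaSet)  echo "deployment" ;;   # ReplicaSets are created by Deployments
    *) echo "unexpected owner kind: $1" >&2; return 1 ;;
  esac
}

# Restart the platform service whose name is passed as $1.
restart_platform_service() {
  resource_id="$1"

  # Pick the first pod whose name starts with the resource ID.
  pod=$("$KUBECTL" -n "$NAMESPACE" get pods --no-headers |
        awk -v id="$resource_id" 'index($1, id) == 1 { print $1; exit }')
  [ -n "$pod" ] || { echo "no pod found for $resource_id" >&2; return 1; }

  # Read the owner kind exactly as in the manual step.
  owner=$("$KUBECTL" -n "$NAMESPACE" get pod "$pod" \
          -o jsonpath='{.metadata.ownerReferences[0].kind}')
  kind=$(workload_kind_for_owner "$owner") || return 1

  "$KUBECTL" -n "$NAMESPACE" rollout restart "$kind" "$resource_id"
}
```

After sourcing the file, a call such as `restart_platform_service monitor` would run the Deployment restart for `monitor`, provided its pods report ReplicaSet as their owner kind.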

Wait about 20 minutes, then check whether the alarm has auto-resolved. To confirm that the restarted pods are up, run 'k -n nsxi-platform get pods'.
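
Checking the pod list by eye can be replaced with a small filter. The sketch below is a hypothetical helper (`count_not_ready` is not a product command). It assumes the standard `kubectl get pods --no-headers` column order of NAME, READY, STATUS, and counts the pods of one service that are not yet fully ready.

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin
# and prints how many pods whose name starts with $1 are not fully ready.
count_not_ready() {
  awk -v id="$1" '
    index($1, id) == 1 {
      split($2, ready, "/")          # READY column, for example "1/1"
      if ($3 != "Running" || ready[1] != ready[2]) n++
    }
    END { print n + 0 }              # print 0 when nothing matched
  '
}
```

One way to use it is `kubectl -n nsxi-platform get pods --no-headers | count_not_ready monitor`, where a result of 0 suggests the restarted pods are up. As an alternative, `kubectl -n nsxi-platform rollout status deployment/monitor --timeout=20m` blocks until the rollout completes or the timeout expires.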

If the alarm persists, check for the following:

  • Check for disk usage alarms, KB: 384119
  • Check for memory usage alarms, KB: 384120
  • Check for CPU usage alarms, KB: 384118

If none of the above applies, open a ticket with Broadcom Support.