Load Balancer Service Status Degraded Alarm

Article ID: 372264

Products

VMware NSX

Issue/Introduction

Title: Alarm for Load Balancer Service Status Degraded
Event ID: load_balancer.lb_status_degraded

Alarm Description:

  • Purpose: To inform users that the load balancer service is degraded.
  • Impact: 
    • For the centralized load balancer, traffic is at risk: there is no standby Edge node available for this load balancer service, so a failure of the active node would bring traffic down.
    • For the distributed load balancer, load balancing does not work on logical switch ports (LSPs) that are in the not-ready or conflict state.

Environment

VMware NSX-T Data Center
VMware NSX

Cause

  • For the centralized load balancer, the load balancer service is not ready on the standby Edge node.
  • For the distributed load balancer, some logical switch ports are not ready on ESXi host nodes.

Resolution

Steps to resolve
For versions 3.1.2 and higher

Recommended Action:

For centralized load balancer:

  1. A degraded status means that the load balancer is not ready on the standby Edge node, so check its status there. On the standby Edge node, invoke the NSX CLI command `get load-balancer <lb-uuid> status`.
  2. If the LB-State of the load balancer service is not_ready, or if the command returns no output, put the Edge node into maintenance mode and then exit maintenance mode.
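
The decision in step 2 can be sketched as a small check on the captured CLI output. This is a minimal illustration, not a supported tool: the function name `needs_maintenance_cycle` is hypothetical, and the line shape `LB-State : <value>` is an assumption about how the CLI prints the field. Only the field name LB-State and the value not_ready come from this article.

```python
import re

# Assumed line shape in the CLI output, for example "LB-State : not_ready".
# Only the field name and the value not_ready are taken from the article.
_LB_STATE = re.compile(r"LB-State\s*:?\s*(\S+)", re.IGNORECASE)


def needs_maintenance_cycle(cli_output: str) -> bool:
    """Return True when step 2 applies: no output, or LB-State is not_ready."""
    if not cli_output.strip():
        return True  # the command returned no output at all
    match = _LB_STATE.search(cli_output)
    if match is None:
        # Output is present but has no LB-State field: inconclusive,
        # so do not recommend a maintenance-mode cycle automatically.
        return False
    return match.group(1).lower() == "not_ready"
```

A `True` result only indicates that step 2 applies. Entering and exiting maintenance mode remains a manual action that needs a maintenance window.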

For distributed load balancer:

  1. Get the detailed status by invoking the NSX API `GET /policy/api/v1/infra/lb-services/<LBService>/detailed-status?source=realtime`.
  2. In the API output, find any ESXi host that reports a non-zero instance_number with status NOT_READY or CONFLICT.
  3. On that ESXi host node, invoke the NSX CLI command `get load-balancer <lb-uuid> status`.
     • If 'Conflict LSP' is reported, check whether this LSP is attached to another load balancer service, and decide whether the conflict is acceptable.
     • If 'Not Ready LSP' is reported, check the status of this LSP by invoking the NSX CLI command `get logical-switch-port status`.
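
Steps 1 and 2 can be sketched in Python as follows. This is a hedged example: the helper names `detailed_status_url` and `find_degraded` are hypothetical, and because the response layout is not shown in this article, the search walks the parsed JSON recursively instead of assuming a particular schema. It relies only on the `instance_number` and `status` fields and the NOT_READY and CONFLICT values named above.

```python
from urllib.parse import quote

BAD_STATES = {"NOT_READY", "CONFLICT"}


def detailed_status_url(manager: str, lb_service_id: str) -> str:
    """Build the realtime detailed-status URL given in step 1."""
    return (
        f"https://{manager}/policy/api/v1/infra/lb-services/"
        f"{quote(lb_service_id, safe='')}/detailed-status?source=realtime"
    )


def find_degraded(node, path=()):
    """Yield (path, status, instance_number) for every degraded entry.

    Walks dicts and lists recursively, so it does not depend on the exact
    response layout (which is an unknown here).
    """
    if isinstance(node, dict):
        status = node.get("status")
        count = node.get("instance_number")
        if (
            isinstance(status, str)
            and status in BAD_STATES
            and isinstance(count, int)
            and count > 0
        ):
            yield path, status, count
        for key, value in node.items():
            yield from find_degraded(value, path + (key,))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_degraded(item, path + (index,))
```

Fetch the URL with any HTTP client, parse the body as JSON, and pass the result to `find_degraded`. The returned path shows where in the response the degraded entry sits, which helps identify the ESXi host to inspect in step 3.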

NOTE: The degraded status can be transient. If the alarm resolves on its own within 5 minutes, it can be ignored.
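
The 5-minute grace period can be sketched as a polling helper that waits before treating the alarm as actionable. Everything here is illustrative: `alarm_clears_within` and its parameters are hypothetical names, and `is_degraded` stands for whatever check you use to read the current alarm or service state.

```python
import time
from typing import Callable


def alarm_clears_within(
    is_degraded: Callable[[], bool],
    timeout_s: float = 300.0,   # the 5 minutes from the note above
    interval_s: float = 30.0,   # polling interval, an arbitrary choice
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True if is_degraded() becomes False within timeout_s seconds."""
    deadline = clock() + timeout_s
    while True:
        if not is_degraded():
            return True  # transient: the alarm cleared on its own
        remaining = deadline - clock()
        if remaining <= 0:
            return False  # still degraded after the grace period
        sleep(min(interval_s, remaining))
```

If the helper returns `False`, continue with the recommended actions above. The `sleep` and `clock` parameters exist only so the helper can be exercised without real waiting.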

Maintenance window required for remediation? Yes, for the centralized load balancer.