Down Status of Load Balancer Pool Alarm

Article ID: 372279

Updated On:

Products

VMware NSX

Issue/Introduction

Title: Alarm for Down Status of Load Balancer Pool
Event ID: load_balancer.pool_status_down
 
Alarm Description:
  • Purpose: To inform users that the load balancer pool is down.
  • Impact: All traffic to this pool is disrupted.

Environment

VMware NSX-T Data Center
VMware NSX

Cause

  • All backend servers in this pool may be unreachable over the network.
  • The service on all backend servers may not be running or responding correctly.

Resolution

Steps to resolve
For 3.0.0 and higher

Recommended Action:

Determine which load balancer pools and members are DOWN.

  • In the UI, navigate to Networking -> Load Balancing and select the Server Pools tab.
  • Look for any pools that show a status of DOWN.
  • Click the DOWN status to open a window showing the status of all pool members.
  • Identify which members have a status of DOWN.


  • You can also review the pool member status from the Edge command line for further detail.
  • Log in as admin to the Edge node that hosts the load balancer.
  • Run the following commands to view the load balancer and pool status:

> get load-balancers
Thu Dec 12 2024 UTC 19:30:37.690
Load Balancer
Applied To                         :
    Logical Router Id              : ########-####-####-####-############
    Service Router Id              : ########-####-####-####-############
Display Name                       : Test-lb
Enabled                            : True
UUID                               : ########-####-####-####-############
Log Level                          : LB_LOG_LEVEL_INFO
Relax Scale Validation             : False
Size                               : SMALL
Virtual Server Id                  : ########-####-####-####-############


> get load-balancer <LB UUID from previous command> pool status
Thu Dec 12 2024 UTC 19:53:20.041
Pool
UUID            : ########-####-####-####-############
Display-Name    : dummy-lb-pool
Members         : 1
Status          : down
Primary-UP-No   : 0
Backup-UP-No    : 0


> get load-balancer <LB UUID from previous command> pool <Pool UUID from previous command> status
Thu Dec 12 2024 UTC 19:54:08.338
Pool
UUID                        : ########-####-####-####-############
Display-Name                : dummy-lb-pool
Status                      : down
Total-Members               : 1
Primary Up                  : 0
Primary Down                : 1
Primary Disabled            : 0
Primary Graceful Disabled   : 0
Primary Unknown             : 0
Backup Up                   : 0
Backup Down                 : 0
Backup Graceful Disabled    : 0
Backup Disabled             : 0
Backup Unknown              : 0

Member
Display-Name                : dummy-backend
Type                        : primary
IP                          : ##.##.##.##
Port                        : 443
Status                      : down
Last-State-Change-Time      : 2024-12-12 18:59:36

Monitor
Display-Name                : default-http-lb-monitor
Type                        : HTTP
Status                      : unknown
Url                         : /
Last-Check-Time             : 2024-12-12 18:59:36
Last-State-Change-Time      : 2024-12-12 19:53:38
Failure-Reason              : Health Check Intializing
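
When several pools or members are involved, the detailed pool status output above can be scanned with a small script. The following is a minimal sketch, assuming the output has been copied from the Edge CLI into a string and keeps the "Key : value" layout shown above. The function names parse_pool_status and members_needing_attention are hypothetical helpers for illustration, not part of NSX.

```python
import re

# Field lines look like "Key-Name    : value"; keys hold letters, spaces, hyphens.
FIELD_RE = re.compile(r"^([A-Za-z][A-Za-z -]*?)\s*:\s*(.*)$")
# Section headers are single bare words such as "Pool", "Member", "Monitor".
SECTION_RE = re.compile(r"^[A-Za-z]+$")

# Abbreviated copy of the sample output from this article.
SAMPLE = """\
Thu Dec 12 2024 UTC 19:54:08.338
Pool
UUID                        : ########-####-####-####-############
Display-Name                : dummy-lb-pool
Status                      : down
Total-Members               : 1

Member
Display-Name                : dummy-backend
Type                        : primary
IP                          : ##.##.##.##
Port                        : 443
Status                      : down

Monitor
Display-Name                : default-http-lb-monitor
Type                        : HTTP
Status                      : unknown
Failure-Reason              : Health Check Intializing
"""


def parse_pool_status(text):
    """Split CLI output into a list of (section_name, fields_dict) tuples."""
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        if SECTION_RE.match(line):
            # A bare word starts a new section.
            sections.append((line, {}))
            continue
        match = FIELD_RE.match(line)
        if match and sections:
            # Attach the field to the most recent section.
            sections[-1][1][match.group(1)] = match.group(2)
        # Anything else (blank lines, the timestamp line) is ignored.
    return sections


def members_needing_attention(sections):
    """Return (name, ip, port, status) for members that are down or unknown."""
    flagged = []
    for name, fields in sections:
        status = fields.get("Status", "").lower()
        if name == "Member" and status in ("down", "unknown"):
            flagged.append(
                (fields.get("Display-Name"), fields.get("IP"),
                 fields.get("Port"), status)
            )
    return flagged


if __name__ == "__main__":
    for member in members_needing_attention(parse_pool_status(SAMPLE)):
        print(member)
```

The parser only relies on the layout visible in the sample output; if a release formats the output differently, the two regular expressions would need adjusting.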

 

For each member that is showing DOWN or UNKNOWN:

  • Check network connectivity from the load balancer to the impacted pool members.
  • Validate application health of each pool member.
  • Once the member passes its health checks again, the pool member status is updated to healthy according to the 'Rise Count' configured in the monitor.
  • Remediate the issue by rebooting the pool member, or by placing the Edge node into maintenance mode and then exiting maintenance mode.
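
The connectivity check and the 'Rise Count' behaviour described above can be illustrated with a short sketch. It assumes a simplified model in which a member is marked up only after 'Rise Count' consecutive successful checks and any failure resets the streak; the exact monitor semantics in NSX are not confirmed here. The names tcp_check and checks_until_up are hypothetical. Note that tcp_check tests reachability from the machine running the script, which is not necessarily the same network path the load balancer uses.

```python
import socket


def tcp_check(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def checks_until_up(results, rise_count):
    """Return the 1-based index of the check at which the member would be
    marked up, or None if the streak never reaches rise_count.

    Assumed model: only consecutive successes count; a failure resets them.
    """
    streak = 0
    for index, ok in enumerate(results, start=1):
        streak = streak + 1 if ok else 0
        if streak >= rise_count:
            return index
    return None


if __name__ == "__main__":
    # Hypothetical check history: one success, one failure, then recovery.
    history = [True, False, True, True, True]
    print(checks_until_up(history, rise_count=3))
```

Under this model, a member that has just recovered does not return to service immediately; it stays down until enough consecutive checks have passed.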

Maintenance window required for remediation? Optional.