NSX Distributed Load Balancer shows Degraded alarm

Article ID: 420132

Updated On:

Products

VMware NSX

Issue/Introduction

  • The NSX Distributed Load Balancer (DLB) reports a 'Degraded' status on the Alarms dashboard.

  • One or more hosts show Logical Switch Ports (LSPs) in the “Not Ready” state:
    esxi> get load-balancer 39f9####-1e38-####-abec-5ae9####2241 status
    Load Balancer
    UUID : 39f9####-1e38-####-abec-5ae9####2241
    Display-Name : mydlb1
    Status : partially ready
    Ready LSP Count : 1
    Not Ready LSP Count: 2
    Conflict LSP Count : 0
    Ready LSP : 90cc####-7602-####-a27d-a45e####180d
    Not Ready LSP : 8c25####-64a8-####-b323-181f####1b45
                         01c0####-61be-####-9f20-5306####3531
    Conflict LSP :
    Warning : LSP below is not ready as DFW Exclusion List
                         8c25####-64a8-####-b323-181f####1b45
                         01c0####-61be-####-9f20-5306####3531
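
When several hosts need checking, the output above can be parsed rather than read by eye. The following is a minimal sketch, assuming the output has been captured as text and keeps the layout shown in this article. The function name `parse_dlb_status` and the returned dictionary keys are hypothetical, not part of NSX.

```python
import re

# Matches an LSP UUID as printed by the CLI. '#' is accepted only because
# the sample output in this article is masked.
UUID_RE = re.compile(
    r"[0-9a-f#]{8}-[0-9a-f#]{4}-[0-9a-f#]{4}-[0-9a-f#]{4}-[0-9a-f#]{12}", re.I
)


def parse_dlb_status(text):
    """Return the status, the not-ready LSPs, and the LSPs listed under the
    DFW Exclusion List warning (assumes the layout shown in this article)."""
    result = {"status": None, "not_ready": [], "excluded": []}
    section = None  # list currently being filled by continuation lines
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key == "Status":
            result["status"] = value.strip()
            section = None
        elif sep and key == "Not Ready LSP":
            section = "not_ready"
            result[section].extend(UUID_RE.findall(value))
        elif sep and key == "Warning":
            section = "excluded" if "DFW Exclusion List" in value else None
        elif sep:
            section = None  # any other "key : value" line ends the section
        elif section:
            # Continuation line: a bare UUID belonging to the open section.
            result[section].extend(UUID_RE.findall(line))
    return result


SAMPLE = """\
Status : partially ready
Not Ready LSP Count: 2
Not Ready LSP : 8c25####-64a8-####-b323-181f####1b45
                     01c0####-61be-####-9f20-5306####3531
Conflict LSP :
Warning : LSP below is not ready as DFW Exclusion List
                     8c25####-64a8-####-b323-181f####1b45
                     01c0####-61be-####-9f20-5306####3531
"""

if __name__ == "__main__":
    parsed = parse_dlb_status(SAMPLE)
    # LSPs that are not ready for a reason other than the Exclusion List.
    print(sorted(set(parsed["not_ready"]) - set(parsed["excluded"])))
```

If every not-ready LSP also appears under the warning, the Exclusion List is the likely cause described in this article.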

Environment

VMware NSX 

Cause

The affected LSPs are part of the Distributed Firewall (DFW) Exclusion List.

Group members in the Exclusion List should have no other services configured. If another service is configured, that service experiences an outage. For example, if group members in the Exclusion List have the DLB service configured, the DLB workloads experience an outage.

Resolution

This is a known issue impacting VMware NSX.

Workaround: Remove the ports that are in the DFW Exclusion List from the Group that is used by the DLB.
For example, if the Avi SE VMs are part of the Exclusion List and the DLB is configured with a Group that contains all Segments in the inventory:

1. Create an NSX Inventory Group that excludes the Avi Service Engines (SE) on the target NSX Segment:

NSX Segment    Tag Equals <Domain name>    Scope Equals ncp/cluster

AND

NSX Segment    Tag Not Equals avi          Scope Equals ncp/created_for

  • Define the group membership criteria using the condition: ncp/created_for != avi
  • Verify the Avi SE VMs are successfully excluded from the newly created group.
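
To make the intent of the two criteria concrete, the sketch below evaluates them locally against a segment's tags. It only illustrates the logic and is not how NSX evaluates group membership internally. It assumes that "Tag Not Equals avi" means the segment carries no `avi` tag in the `ncp/created_for` scope. The function name `in_dlb_group` and the cluster value `my-cluster` are hypothetical.

```python
def in_dlb_group(tags, cluster):
    """Return True if a segment with these tags would match the criteria.

    tags: list of {"scope": ..., "tag": ...} dicts, the same shape as the
    "tags" arrays in the API output in this article.
    Assumption: "Not Equals avi" is read as "no ncp/created_for=avi tag".
    """
    pairs = {(t["scope"], t["tag"]) for t in tags}
    belongs_to_cluster = ("ncp/cluster", cluster) in pairs
    created_for_avi = ("ncp/created_for", "avi") in pairs
    return belongs_to_cluster and not created_for_avi


if __name__ == "__main__":
    workload_segment = [{"scope": "ncp/cluster", "tag": "my-cluster"}]
    avi_segment = [
        {"scope": "ncp/cluster", "tag": "my-cluster"},
        {"scope": "ncp/created_for", "tag": "avi"},
    ]
    print(in_dlb_group(workload_segment, "my-cluster"))
    print(in_dlb_group(avi_segment, "my-cluster"))
```

The check in the second bullet above (verifying that the Avi SE VMs are excluded) remains the authoritative test, because it reflects what NSX actually computed.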


2. Retrieve the DLB service configuration via API:

curl -k -u admin 'https://nsx_mgr_ip/policy/api/v1/infra/lb-services/'
{
  "results" : [ {
    "connectivity_path" : "/infra/domains/<lB-Domain name>/groups/clusterip_domain-d23:###-df3-###-234e-###d5tu_all_segments",
    "enabled" : true,
    "relax_scale_validation" : false,
    "size" : "DLB",
    "error_log_level" : "INFO",
    "resource_type" : "LBService",
    "id" : "<lB-Domain name>",
    "display_name" : "clusterip_domain-d23:###-df3-###-234e-###d5tu_all_segments",
    "tags" : [ {
      "scope" : "ncp/version",
      "tag" : "1.2.0"
    }, {
      "scope" : "ncp/cluster",
      "tag" : "<lB-Domain name>"
    }, {
      "scope" : "ncp/created_for",
      "tag" : "DLB"
    }, {
      "scope" : "external_id",
      "tag" : "4561####-c5b4-####-8402-####e652####"
    } ],
    "path" : "/infra/lb-services/<lB-Domain name>",
    "relative_path" : "clusterip_domain-d23:###-df3-###-234e-###d5tu_all_segments",
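
If the response lists several load balancer services, the DLB entries can be picked out by their `size` value. This is a small sketch, assuming the complete response has been saved and parsed into a Python dictionary with the structure shown above. The function name `dlb_services` is hypothetical.

```python
def dlb_services(response):
    """Return id and connectivity_path for every service whose size is "DLB".

    response: the parsed JSON body of the GET request, i.e. a dict with a
    "results" list as in the output shown in this article.
    """
    return [
        {"id": svc.get("id"), "connectivity_path": svc.get("connectivity_path")}
        for svc in response.get("results", [])
        if svc.get("size") == "DLB"
    ]


if __name__ == "__main__":
    # Placeholder values, not real identifiers.
    example = {
        "results": [
            {"id": "dlb-1", "size": "DLB",
             "connectivity_path": "/infra/domains/default/groups/all-segments"},
            {"id": "lb-2", "size": "SMALL"},
        ]
    }
    for svc in dlb_services(example):
        print(svc["id"], svc["connectivity_path"])
```

The `connectivity_path` printed for each entry is the value that step 3 replaces.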

3. Update the DLB service payload, replacing the existing connectivity_path with the path of the new group (the one that excludes the Avi SEs), and send it with a PUT or PATCH request.

curl -k -u admin 'https://nsx_mgr_ip/policy/api/v1/infra/lb-services/<lB-Domain name>' -X PATCH -d @<payload file> -H 'accept: application/json' -H 'Content-type: application/json' -H 'X-Allow-Overwrite: true'

{
  "connectivity_path" : "/infra/domains/default/groups/dlb-without-avi-se-all-segments",
  "enabled" : true,
  "relax_scale_validation" : false,
  "size" : "DLB",
  "error_log_level" : "INFO",
  "resource_type" : "LBService",
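  "display_name" : "clusterip_domain-d23:###-df3-###-234e-###d5tu_all_segments",
  "tags" : [ {
    "scope" : "ncp/version",
    "tag" : "1.2.0"
  }, {
    "scope" : "ncp/cluster",
    "tag" : "<lB-Domain name>"
  }, {
    "scope" : "ncp/created_for",
    "tag" : "DLB"
  }, {
    "scope" : "external_id",
    "tag" : "4561####-c5b4-####-8402-####e652####"
  } ]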
  }, {
    "scope" : "ncp/created_for",
    "tag" : "DLB"
  }, {
    "scope" : "external_id",
    "tag" : "4561####-c5b4-####-8402-####e652####"
  } ]
}
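
The payload above can also be derived from the step 2 output instead of being edited by hand. The following is a minimal sketch, assuming the service object has been parsed into a Python dictionary. The keys it drops (`id`, `path`, `relative_path`) are inferred by comparing the two JSON listings in this article, and a real response may contain further fields that need the same treatment. The function name `build_patch_payload` is hypothetical.

```python
import copy
import json

# Keys present in the GET output but absent from the PATCH payload shown in
# this article. Assumption: a full response may include others.
DROPPED_KEYS = ("id", "path", "relative_path")


def build_patch_payload(service, new_group_path):
    """Return a copy of the service with connectivity_path replaced."""
    payload = copy.deepcopy(service)  # leave the caller's dict untouched
    for key in DROPPED_KEYS:
        payload.pop(key, None)
    payload["connectivity_path"] = new_group_path
    return payload


if __name__ == "__main__":
    # Placeholder service object, not a real response.
    service = {
        "connectivity_path": "/infra/domains/default/groups/all-segments",
        "enabled": True,
        "size": "DLB",
        "resource_type": "LBService",
        "id": "example-id",
        "path": "/infra/lb-services/example-id",
        "relative_path": "example-id",
    }
    payload = build_patch_payload(
        service, "/infra/domains/default/groups/dlb-without-avi-se-all-segments"
    )
    # Write this output to the file that the curl command sends.
    print(json.dumps(payload, indent=2))
```

Copying the object first keeps the original response available for comparison or rollback if the change needs to be reverted.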

4. Monitor the Alarms dashboard. The 'Degraded' status should clear within 5 minutes.