LB status showing as degraded when using Distributed Load Balancer while AVI is also used in the environment

Article ID: 417411

Updated On:

Products

VMware NSX

Issue/Introduction

  • Using DLB for TKG cluster(s)
  • AVI is also in the environment, but not being used for the cluster
  • Degraded alarm is showing, but the traffic is passing as expected
  • One or more hosts show Logical Switch Ports (LSPs) in a "Not ready" state:
    esxi> get load-balancer 39f9####-1e38-####-abec-5ae9####2241 status
    Load Balancer
    UUID : 39f9####-1e38-####-abec-5ae9####2241
    Display-Name : mydlb1
    Status : partially ready
    Ready LSP Count : 1
    Not Ready LSP Count: 2
    Conflict LSP Count : 0
    Ready LSP : 90cc####-7602-####-a27d-a45e####180d
    Not Ready LSP : 8c25####-64a8-####-b323-181f####1b45
                         01c0####-61be-####-9f20-5306####3531
    Conflict LSP :
    Warning : LSP below is not ready as DFW Exclusion List
                         8c25####-64a8-####-b323-181f####1b45
                         01c0####-61be-####-9f20-5306####3531
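
As a rough illustration, the status output above can be checked programmatically. The following Python sketch parses the text of the status command as pasted from a host and reports whether any LSPs are not ready. The names parse_dlb_status and is_degraded are hypothetical, and the parser assumes the field labels appear exactly as in the sample output.

```python
import re

# Matches "Label : value" lines. Continuation lines that hold only a UUID
# contain no colon, so they are skipped.
FIELD_RE = re.compile(r"^\s*([A-Za-z][A-Za-z \-]*?)\s*:\s*(.*)$")


def parse_dlb_status(text):
    """Return a dict of the labelled fields found in the status output."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


def is_degraded(fields):
    """True when the output reports at least one LSP that is not ready."""
    return int(fields.get("Not Ready LSP Count", "0")) > 0


# Sample text copied from the output shown in this article (IDs are masked).
SAMPLE = """\
Load Balancer
UUID : 39f9####-1e38-####-abec-5ae9####2241
Display-Name : mydlb1
Status : partially ready
Ready LSP Count : 1
Not Ready LSP Count: 2
Conflict LSP Count : 0
"""

if __name__ == "__main__":
    parsed = parse_dlb_status(SAMPLE)
    print(parsed["Status"], is_degraded(parsed))
```

This only reads text that has already been captured; it does not connect to a host or run the command itself.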

Environment

  • VMware NSX

 

Cause

This issue occurs because AVI also uses the DLB to provide services.  Each host that has AVI will have a service engine VM that is linked to the DLB interfaces.

When DLB is used outside of AVI, AVI is still referenced for the ports and will return a negative result, which causes the DLB port to report "not ready".

Resolution

Currently, there is no fix for this issue.  You may work around the issue with the following procedure.

 

  1. Create a security group to exclude the AVI service engine VMs.
    1. Navigate to the Inventory->Groups section of the NSX UI.
    2. Click the Add Group button.
    3. Add the group criteria.
      1. NSX Segment Tag Equals <segment name>
        1. The segment name is the name of the NSX segment that the load balancer is attached to.
      2. Scope Equals ncp/cluster.  If ncp/cluster does not show up in the selection list, you may enter it manually.
      3. Click the + button at the end of the first line to add another criterion.  Ensure that the operator is set to AND.
      4. NSX Segment Tag Not Equals avi.  If avi does not appear in the selection list, you may enter it manually.
      5. Scope Equals ncp/created_for
  2. Verify that the AVI service engine VMs do NOT show up in this group.
    1. Navigate to the Inventory->Groups section of the NSX UI.
    2. Find your new group.
    3. Select the View Members link.
    4. Verify that there aren't any AVI service engine VMs listed.
  3. Get the path to the new security group.
    1. Navigate to the Inventory->Groups section of the NSX UI.
    2. Find your new group.
    3. Select the ellipsis (three dots) on the left side of the group and select the Copy Path to Clipboard option.
    4. Save this path in a text file.
  4. Get the Load Balancer domain ID.  This may be done from an NSX Manager node via SSH as root or from any command line that supports the curl command.
    1. curl -k -u 'admin' 'https://<NSX manager IP>/policy/api/v1/infra/lb-services/'
    2. Find the 'id' field in the output entry whose 'display_name' matches your load balancer.
    3. Save this ID in a text file.
  5. Output the distributed load balancer service JSON data to a file.  This may be done from an NSX Manager node via SSH as root or from any command line that supports the curl command.
    1. curl -k -u 'admin' 'https://<NSX manager IP>/policy/api/v1/infra/lb-services/<load balancer domain ID>' > data.json
      1. The load balancer domain ID is the id from step 4.
  6. Edit the data.json file and change the connectivity_path section to use the security group path from Step 3.
  7. Update the DLB domain using the edited data.json file.
    1. curl -k -u 'admin' 'https://<NSX manager IP>/policy/api/v1/infra/lb-services/<cluster_ip_domain>' -X PATCH -d @data.json -H 'accept: application/json' -H 'Content-type: application/json' -H 'X-Allow-Overwrite: true'
      1. <cluster_ip_domain> is the load balancer domain ID from step 4.
  8. After 5 minutes, verify that the load balancer no longer shows the alarm.
    1. curl -k -u 'admin' 'https://<NSX manager IP>/policy/api/v1/infra/lb-services/<cluster_ip_domain>/detailed-status?source=realtime&enforcement_point_path=/infra/sites/default/enforcement-points/default'
    2. The NOT_READY instance number should be 0 for all hosts.
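
Steps 4 and 6 involve reading and editing JSON by hand, which is easy to get wrong. The following Python sketch shows one way those two steps might be scripted against files already saved with curl. It is an illustration under assumptions, not part of the official procedure: the function names are hypothetical, the listing from step 4 is assumed to hold its entries in a "results" array with "id" and "display_name" fields, and the group path in the usage note is a placeholder for the path copied in step 3.

```python
import json


def find_lb_service_id(listing, display_name):
    """Return the 'id' of the listed service whose 'display_name' matches.

    `listing` is the parsed JSON from the step 4 request. Assumption: the
    entries sit under a top-level "results" array.
    """
    for entry in listing.get("results", []):
        if entry.get("display_name") == display_name:
            return entry.get("id")
    return None


def set_connectivity_path(service, group_path):
    """Return a copy of the service body with connectivity_path replaced."""
    updated = dict(service)
    updated["connectivity_path"] = group_path
    return updated


def rewrite_data_file(path, group_path):
    """Apply the step 6 edit to the file written in step 5 (data.json)."""
    with open(path, "r", encoding="utf-8") as handle:
        service = json.load(handle)
    updated = set_connectivity_path(service, group_path)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(updated, handle, indent=2)
    return updated
```

For example, rewrite_data_file("data.json", "<path copied in step 3>") would update the file in place before the PATCH request in step 7. Reviewing the resulting file before sending it is still advisable, since every other field is passed through unchanged.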