NSX Distributed Load Balancer status is Degraded or Partially Up when a Transport Node is disconnected or unhealthy


Article ID: 433461


Products

VMware NSX

Issue/Introduction

In the VMware NSX Manager UI, the Distributed Load Balancer (DLB) service status appears as Degraded.

  • When querying the detailed status for the DLB via the API (/policy/api/v1/infra/lb-services/<cluster_ip_domain>/detailed-status?source=realtime&enforcement_point_path=/infra/sites/default/enforcement-points/default), the service status is returned as PARTIALLY_UP.
  • The Virtual Servers and Virtual Pools are reported as UP.
  • The API output shows that one or more Transport Nodes report an empty instance status: instance_number is 0 for every status, instead of a READY count greater than 0.
    Example of the affected node in the API output:

"transport_node_id": "<UUID of TN>",
"instance_detail_per_status": [
  {
    "status": "READY",
    "instance_number": 0
  },
  {
    "status": "CONFLICT",
    "instance_number": 0
  },
  {
    "status": "NOT_READY",
    "instance_number": 0
  }
]

Environment

VMware NSX

Cause

The DLB service status is aggregated across all Transport Nodes (hosts) within the scope of the service. Per the product documentation, the status is marked as Degraded when at least one Transport Node returns a status of ready or partially ready, but not all related Transport Nodes return a ready status.

In this scenario, the Degraded state is triggered because a host is disconnected from NSX (for example, due to a hardware issue). Because the host is not communicating, it reports no instances, which appears in the API output as a count of 0 for READY, CONFLICT, and NOT_READY. This prevents the overall service from reaching the Success/Up state.

From the Admin Guide, DLB Status:

Status is Degraded when all the following conditions are true:

  • At least one transport node returns status for the distributed load balancer service as ready or partially ready.
  • Not all the related transport nodes return status for the load balancer service as ready.
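The two conditions above can be expressed as a small predicate. The following is a minimal sketch for illustration only, not NSX's actual implementation: the status labels "READY" and "PARTIALLY_READY" and the use of None for a Transport Node that reports nothing are assumptions made for this example.

```python
from typing import Iterable, Optional

# Hypothetical labels for this sketch; None models a Transport Node
# that is disconnected and therefore returns no status at all.
READY = "READY"
PARTIALLY_READY = "PARTIALLY_READY"


def is_degraded(tn_statuses: Iterable[Optional[str]]) -> bool:
    """Return True when both documented Degraded conditions hold."""
    statuses = list(tn_statuses)
    # Condition 1: at least one node returns ready or partially ready.
    some_reporting = any(s in (READY, PARTIALLY_READY) for s in statuses)
    # Condition 2: not all related nodes return ready.
    all_ready = all(s == READY for s in statuses)
    return some_reporting and not all_ready


# Two healthy hosts plus one disconnected host: Degraded.
print(is_degraded([READY, READY, None]))   # True
# Every host reports ready: not Degraded.
print(is_degraded([READY, READY]))         # False
```

A single silent host is enough to satisfy the second condition, which is why the service stays Degraded even though the Virtual Servers and Pools are UP.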

Resolution

 

This behavior is by design and indicates an underlying host issue. To identify and resolve it:

  1. Identify the problematic Transport Node by running the following API command against the NSX Manager:

    curl -k -u 'admin' 'https://<NSX-Manager-IP>/policy/api/v1/infra/lb-services/<DLB-ID>/detailed-status?source=realtime&enforcement_point_path=/infra/sites/default/enforcement-points/default'

  2. Locate the instance_detail_per_tn section in the JSON output.

  3. Identify the transport_node_id where instance_number for all statuses (READY, NOT_READY, CONFLICT) is 0.

  4. Navigate to System > Fabric > Nodes > Host Transport Nodes in the NSX UI and verify the connection health of the identified host. If it is unclear which host the ID belongs to, search for the Transport Node ID in the NSX UI.

  5. Resolve the underlying host connectivity or hardware issue.

  6. Once the host is reconnected and synchronized, the DLB status will return to Success/Up.
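Steps 2 and 3 can be automated once the output of the curl command has been saved as JSON. The sketch below is illustrative only: it relies on the field names shown in this article (instance_detail_per_tn, transport_node_id, instance_detail_per_status, status, instance_number) and, because the exact nesting of the response is not shown here, it searches the whole document for instance_detail_per_tn. The "results" wrapper and the IDs "tn-a" and "tn-b" in the sample are hypothetical.

```python
from typing import Any, Iterator, List

STATUSES = ("READY", "NOT_READY", "CONFLICT")


def iter_tn_entries(node: Any) -> Iterator[dict]:
    """Yield every dict found under any 'instance_detail_per_tn' key."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "instance_detail_per_tn" and isinstance(value, list):
                for entry in value:
                    if isinstance(entry, dict):
                        yield entry
            else:
                yield from iter_tn_entries(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_tn_entries(item)


def silent_transport_nodes(response: Any) -> List[str]:
    """Return IDs of Transport Nodes whose count is 0 for every status."""
    silent = []
    for entry in iter_tn_entries(response):
        counts = {
            detail.get("status"): detail.get("instance_number", 0)
            for detail in entry.get("instance_detail_per_status", [])
        }
        if all(counts.get(status, 0) == 0 for status in STATUSES):
            silent.append(entry.get("transport_node_id", "<unknown>"))
    return silent


# Hypothetical sample; in practice load the saved curl output with json.load().
sample = {
    "results": [{
        "instance_detail_per_tn": [
            {"transport_node_id": "tn-a", "instance_detail_per_status": [
                {"status": "READY", "instance_number": 3},
                {"status": "CONFLICT", "instance_number": 0},
                {"status": "NOT_READY", "instance_number": 0}]},
            {"transport_node_id": "tn-b", "instance_detail_per_status": [
                {"status": "READY", "instance_number": 0},
                {"status": "CONFLICT", "instance_number": 0},
                {"status": "NOT_READY", "instance_number": 0}]},
        ]
    }]
}
print(silent_transport_nodes(sample))  # ['tn-b']
```

The IDs returned are the ones to look up under Host Transport Nodes in step 4.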

 

Additional Information

This KB applies only if no LSPs are reported as NOT_READY or CONFLICT in the output of the API call described in the Issue/Introduction section.
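As a rough illustration of this applicability check, the sketch below takes the per-Transport-Node entries from the API output (the items of instance_detail_per_tn) and reports whether any of them has a non-zero NOT_READY or CONFLICT count. The function name and the sample IDs are hypothetical; the field names are the ones shown in this article.

```python
from typing import Iterable, Mapping


def kb_applies(tn_entries: Iterable[Mapping]) -> bool:
    """Return False if any node reports NOT_READY or CONFLICT instances."""
    for entry in tn_entries:
        for detail in entry.get("instance_detail_per_status", []):
            if (detail.get("status") in ("NOT_READY", "CONFLICT")
                    and detail.get("instance_number", 0) > 0):
                return False
    return True


# Hypothetical entries: one healthy node and one node reporting a conflict.
healthy = {"transport_node_id": "tn-a", "instance_detail_per_status": [
    {"status": "READY", "instance_number": 2},
    {"status": "CONFLICT", "instance_number": 0},
    {"status": "NOT_READY", "instance_number": 0}]}
conflicted = {"transport_node_id": "tn-c", "instance_detail_per_status": [
    {"status": "READY", "instance_number": 1},
    {"status": "CONFLICT", "instance_number": 1},
    {"status": "NOT_READY", "instance_number": 0}]}

print(kb_applies([healthy]))              # True
print(kb_applies([healthy, conflicted]))  # False
```

If the check returns False, the Degraded status has a cause other than the disconnected-host scenario described here.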

Similar KBs

LB status showing as degraded when using Distributed Load Balancer while AVI is also used in the environment

Load Balancer Service Status Degraded Alarm