During deployment of a TKG Management Cluster the Native Load Balancer Virtual Server and server pool are Down.

Article ID: 409213

Updated On:

Products

  • VMware NSX
  • VMware Tanzu Kubernetes Grid
  • VMware Tanzu Kubernetes Grid Integrated Edition
  • VMware Tanzu Kubernetes Grid Integrated Edition (Core)
  • VMware Tanzu Kubernetes Grid Integrated Edition Starter Pack (Core)
  • VMware Tanzu Kubernetes Grid Management
  • VMware Tanzu Kubernetes Grid Plus

Issue/Introduction

  • TKG is creating a management cluster in which it will deploy Control Plane and Worker nodes.
  • The process appears to be stalled.   
  • The Control Plane and Worker nodes are seen as deployed in the vCenter server inventory.
  • The Control Plane and Worker nodes do not show any IP configuration in vCenter.
  • In NSX Manager, under Load Balancers, the virtual server and server pool show a status of Down.

Environment

  • VMware NSX 4.x
  • Tanzu Kubernetes Grid

Cause

  • The virtual server and server pool will be in a Down status when there is no communication between the server pool members and the virtual server. This is expected. 


  • In NSX Manager, the virtual server shows a status of Down; clicking the status expands the status window.
  • Expanding the alarm reveals a message that the issue is related to communication between the virtual server and the server pool members.


  • Opening the alarms under server pools reveals that there are no alarms open.


  • Clicking on Down under the status of the server pool opens the details window for the server pool.
  • Checking the membership of the server pool verifies that there are no members in the pool.
  • The expected status is Down if there are no server pool members. 
  • In the vCenter Server inventory, one or all of the TKG nodes may appear as created, but only partially deployed.
  • None of the nodes will have an IP address configured. These nodes are the intended server pool members.
  • When TKG creates the management cluster, it communicates with both the vCenter Server and the NSX Managers.
  • vCenter creates the cluster and the management nodes (virtual machines) and configures them.
  • NSX configures a native load balancer with the configuration requested by TKG.
  • Once communication is established, the virtual server and server pool members will show a status of Success.

    NOTE:
    NSX has done the work asked of it by TKG: it created the virtual server and the server pool. NSX is only waiting for the pool members to establish communication with the virtual server before it transitions to a Success status.
    vCenter has the tasks of creating the TKG management cluster and the management nodes. The nodes are configured according to the data in the TKG YAML files. If this process stalls before the TKG nodes are fully configured with IP settings, NSX will wait. The server pool member IP addresses are taken from the same YAML files.
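
The reasoning above can be summarized as a small decision sketch. This is only an illustration of the triage logic, not a VMware tool: the `Node` record, the `diagnose` function, and the returned labels are hypothetical names, and the inputs are assumed to be collected by hand from NSX Manager (server pool membership) and from the vCenter inventory (node IP addresses).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical record for a TKG node VM as seen in the vCenter inventory."""
    name: str
    ip: Optional[str]  # None when vCenter shows no IP configuration


def diagnose(pool_member_ips: List[str], nodes: List[Node]) -> str:
    """Classify a Down virtual server / server pool using the article's reasoning.

    pool_member_ips: member IPs listed for the server pool in NSX Manager.
    nodes: TKG Control Plane and Worker nodes observed in vCenter.
    """
    # Nodes that exist in vCenter but never received an IP configuration.
    nodes_without_ip = [n for n in nodes if not n.ip]

    if not pool_member_ips:
        if not nodes or nodes_without_ip:
            # Empty pool and unconfigured nodes: Down is expected, and the
            # stall is on the node deployment side rather than in NSX.
            return "stalled_deployment"
        # Nodes have IPs but the pool is still empty: not the case described here.
        return "empty_pool_with_ips"

    # The pool has members, so Down points at connectivity between the
    # virtual server and its members.
    return "check_connectivity"
```

For the symptoms described in this article (no pool members, nodes without IP addresses), `diagnose([], [Node("control-plane-0", None)])` returns `"stalled_deployment"`.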

Resolution

Reboot the vCenter Server.

Additional Information

  • If the reboot does not correct this issue, a joint call with the TKG team, the Compute team, and the NSX team will be needed.
  • If a root cause analysis (RCA) is to be attempted, gather the vCenter support logs both before and after the reboot.
  • Time and date stamps will be very important, particularly for the TKG cluster deployment.