TKGi with NSX service of type LoadBalancer is not distributing the traffic equally to all endpoints

Article ID: 377196

Products

VMware Tanzu Kubernetes Grid Integrated Edition

Issue/Introduction

The deployment contains 4 pods. During high traffic, the load balancer routes requests to one particular pod, and the traffic is not distributed to the other 3 pods. While this is happening, the application behaviour is not stable.

Environment

TKGi 1.18.x

TKGi 1.19.x

NSX 3.x 

NSX 4.x

A Service of type LoadBalancer is represented as an NSX L4 load balancer that uses ROUND_ROBIN for distribution

Cause

ROUND_ROBIN as a balancing method hands new requests to the pool members one after another in a fixed order. For example, with 4 pods and 15 requests, the requests are spread across the endpoints as follows:

POD1 - 1,5,9,13

POD2 - 2,6,10,14

POD3 - 3,7,11,15

POD4 - 4,8,12
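
The assignment above can be reproduced with a few lines of shell. This is only an illustrative sketch of strict rotation, not NSX code; the function name round_robin and its arguments are made up for this example.

```shell
# Print which request numbers each pod receives under strict rotation.
# $1 = total number of requests, $2 = number of pods (both hypothetical inputs)
round_robin() {
  requests=$1
  pods=$2
  p=1
  while [ "$p" -le "$pods" ]; do
    line="POD$p -"
    sep=" "
    r=$p
    # Pod p receives requests p, p+pods, p+2*pods, ...
    while [ "$r" -le "$requests" ]; do
      line="$line$sep$r"
      sep=","
      r=$((r + pods))
    done
    echo "$line"
    p=$((p + 1))
  done
}

round_robin 15 4
```

Note that the rotation only counts requests as they arrive; nothing in it looks at how long each request keeps its pod busy.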

However, this method only considers the order in which new sessions arrive. It does not take into account the sessions that are already established on each endpoint or the load they generate.

Different clients (for example, 3rd party consumers of the service) can behave differently: some keep a session open for a longer period, and some send or receive more data than other clients while holding a single session for longer.

Because its distribution principle is so basic, ROUND_ROBIN does not account for such behaviour, and this inevitably leaves some pods more loaded than others.
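
The effect can be illustrated with a toy model. The sketch below is not how NSX is implemented; it assumes one new session arrives per time step, each with a made-up duration, and compares strict rotation (rr) with picking the pod that currently has the fewest open sessions (lc). The function name simulate and the duration values are hypothetical.

```shell
# Toy comparison of round robin (rr) and least connection (lc).
# $1 = rr or lc, $2 = number of pods, remaining arguments = session durations.
# Prints the number of sessions still open on each pod after the last arrival.
simulate() {
  algo=$1
  pods=$2
  shift 2
  echo "$*" | awk -v algo="$algo" -v pods="$pods" '
  {
    for (t = 1; t <= NF; t++) {
      # Count the sessions still open on each pod at time t.
      for (p = 1; p <= pods; p++) active[p] = 0
      for (s = 1; s < t; s++) if (stop[s] > t) active[pod[s]]++
      if (algo == "rr") {
        pick = (t - 1) % pods + 1
      } else {
        # Least connection: fewest open sessions, lowest index wins ties.
        pick = 1
        for (p = 2; p <= pods; p++) if (active[p] < active[pick]) pick = p
      }
      pod[t] = pick
      stop[t] = t + $t
    }
    # Report the sessions that are still open one step after the last arrival.
    now = NF + 1
    for (p = 1; p <= pods; p++) active[p] = 0
    for (s = 1; s <= NF; s++) if (stop[s] > now) active[pod[s]]++
    out = ""
    for (p = 1; p <= pods; p++) out = out (p > 1 ? " " : "") active[p]
    print out
  }'
}

# Every 4th client holds its session for a long time (100 steps), the rest for 1 step.
simulate rr 4 100 1 1 1 100 1 1 1 100 1 1 1
simulate lc 4 100 1 1 1 100 1 1 1 100 1 1 1
```

With this arrival pattern, rotation places all three long-lived sessions on the first pod, while the least-connection choice spreads them over three pods. Real traffic is less regular, but the same mechanism applies.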

 

Resolution

Change the algorithm to another distribution method described in the NSX documentation. Currently, TKGi and NCP do not provide a way to change the distribution method, so the change has to be completed through the NSX API.

Below are the required steps for the MANAGER API:

1. Get the details for the pool

curl -k -u 'admin:PASSWORD' -X GET https://<NSXMANAGER>/api/v1/loadbalancer/pools/5788511e-6bb0-4251-87f3-6afd30d95812 > patch.json

The pool ID can be found in NSX. Search for the pool name, in this example pks-6cac40ac-0f84-4f8d-8ed0-3c7a338fbac2-data-nginx-pong-80, where data is the namespace and nginx-pong-80 is the name of the LoadBalancer service.

Once you find the pool under the Server Pools section in NSX, copy the ID from there.

2. Edit the JSON file created in step 1:

The original file begins like this:

{
  "algorithm" : "ROUND_ROBIN",
  "members" : [ {
    "display_name" : "nginx-pong.nginx-pong-deployment-77455d444d-chdsz",
    "ip_address" : "10.10.x.3",

Modify the algorithm:

{
  "algorithm" : "LEAST_CONNECTION",
  "members" : [ {
    "display_name" : "nginx-pong.nginx-pong-deployment-77455d444d-chdsz",
    "ip_address" : "10.10.x.3",
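
If you prefer not to edit the file by hand, the same change can be scripted. The sketch below is a plain text substitution with sed, not a JSON parser, so it assumes the file keeps the "algorithm" : "VALUE" layout shown above; the function name set_algorithm is made up for this example, and a JSON-aware tool would be more robust.

```shell
# Replace the value of the "algorithm" key and print the result to stdout.
# $1 = path to the JSON file, $2 = new algorithm name
set_algorithm() {
  sed -E 's/("algorithm"[[:space:]]*:[[:space:]]*")[A-Z_]+"/\1'"$2"'"/' "$1"
}

# Demo on a stand-in file with the same layout as the snippet above.
demo=$(mktemp)
cat > "$demo" <<'EOF'
{
  "algorithm" : "ROUND_ROBIN",
  "members" : [ ]
}
EOF
set_algorithm "$demo" LEAST_CONNECTION
rm -f "$demo"
```

Write the output to a new file and review it before sending it anywhere; redirecting the output onto the input file itself would truncate it.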

 

3. Once the file is updated, send it back with a PUT request to the same pool ID:

curl -k -H "X-Allow-Overwrite: true" -H "Content-Type: application/json" -u 'admin:PASSWORD' -X PUT -d @patch.json https://<NSXMANAGER>/api/v1/loadbalancer/pools/5788511e-6bb0-4251-87f3-6afd30d95812

Where:

-k <--- skip SSL certificate validation

-H "X-Allow-Overwrite: true" <--- allow overwriting a protected object

-H "Content-Type: application/json" <--- the request body is in JSON format

-d @patch.json <--- name of the file to send as the request body

This changes the algorithm to LEAST_CONNECTION.

The available options are:

ROUND_ROBIN, WEIGHTED_ROUND_ROBIN, LEAST_CONNECTION, WEIGHTED_LEAST_CONNECTION, IP_HASH
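
Because the value has to match one of these names, a small local check before sending the PUT request can catch typos. This is only a sketch that compares against the list above exactly as written; the function name is_valid_algorithm is hypothetical and the check does not contact NSX.

```shell
# Return 0 if $1 is one of the algorithm names listed in this article.
is_valid_algorithm() {
  case "$1" in
    ROUND_ROBIN|WEIGHTED_ROUND_ROBIN|LEAST_CONNECTION|WEIGHTED_LEAST_CONNECTION|IP_HASH)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Try a correct name and two common slips.
for name in LEAST_CONNECTION least_connection LEAST_CONNECTED; do
  if is_valid_algorithm "$name"; then
    echo "$name: ok"
  else
    echo "$name: not in the list"
  fi
done
```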

If you are using the POLICY API, the calls are different. Please review the NSX API documentation for the correct API calls.