The External IP (load balancer VIP) of a Kubernetes LoadBalancer service changes automatically, without the user recreating or updating the service.
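You can observe the symptom by watching the service with kubectl. For example, for the nginx service in the nginx namespace referenced in the log entries below, the EXTERNAL-IP column changes even though the service itself was never modified:

kubectl get service nginx -n nginx --watch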
In /var/vcap/sys/log/ncp/ncp.stdout.log on the K8S master node, you see entries indicating that the service load balancer is being recreated:
2019-08-09T21:01:06.273Z 42053046-217c-458c-9e7c-ac57ca49aedf NSX 11715 - [nsx@5876 comp="nsx-container-ncp" subcomp="ncp" level="WARNING"] nsx_ujo.ncp.k8s.service_lb_controller There is an error happened or service type changed to service Loadbalancer nginx project nginx, LB will be deleted
2019-08-09T21:01:07.958Z 42053046-217c-458c-9e7c-ac57ca49aedf NSX 11715 - [nsx@5876 comp="nsx-container-ncp" subcomp="ncp" level="INFO"] nsx_ujo.ncp.k8s.service_lb_controller External ip used for service nginx in namepsace nginx is 10.40.14.42 from ip_pool e9219c53-4454-4b86-af06-9aaefa2d78ae
2019-08-09T21:01:08.715Z 42053046-217c-458c-9e7c-ac57ca49aedf NSX 11715 - [nsx@5876 comp="nsx-container-ncp" subcomp="ncp" level="INFO"] nsx_ujo.ncp.k8s.service_lb_controller Successfully updated Loadbalancer resources for service nginx under project nginx
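On the master node, you can search the log directly for these recreation events:

grep "LB will be deleted" /var/vcap/sys/log/ncp/ncp.stdout.log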
In the ncp.stdout.log file, you also see that the NCP API request rate exceeds the limit (default: 100 requests per second) configured on NSX-T Manager:
2019-08-09T20:52:55.958Z 42053046-217c-458c-9e7c-ac57ca49aedf NSX 11563 - [nsx@5876 comp="nsx-container-ncp" subcomp="ncp" level="WARNING"] vmware_nsxlib.v3.client The HTTP request returned error code 429, whereas 200 response codes were expected. Response body {u'module_name': u'common-service', u'error_message': u'Exceeded request rate of 100 per second', u'error_code': u'98'}
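As a rough measure of how often NCP is being throttled, you can count the 429 responses in the same log:

grep -c "error code 429" /var/vcap/sys/log/ncp/ncp.stdout.log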
In scaled PKS environments, the heartbeat API calls sent from the NCP process to NSX Manager may not receive a response in time because of high CPU utilization on NSX Manager. This causes the NCP processes in the Kubernetes clusters to restart, triggering NCP master re-election. When the NCP processes in all Kubernetes clusters go through re-election at the same time, the number of API calls exceeds the default API rate limit configured on NSX Manager. Some of these API calls then fail to get a response, which causes the NCP process to fail and restart again, leading to a re-election loop.
Whenever this happens, if NCP requests the details of a service load balancer, NSX Manager cannot return that information. NCP then assumes that the load balancer does not exist and recreates it, which allocates a new external IP from the IP pool.
This is a known issue impacting VMware Enterprise PKS 1.3.x and earlier.
This behavior is improved in VMware Enterprise PKS 1.4.1 with VMware NSX-T 2.4.1, which update the default values for the API rate limit and concurrency limit. Additionally, you can tune the NCP parameters for increased PKS scaling by following the KB article Tuning NCP parameters for increased PKS Scaling in VMware Enterprise PKS environment (68085).
In NSX-T 2.4.1, the default API rate limit values are:
nsxmgr-01a> get service http
Service name: http
Service state: running
Session timeout: 1800 seconds
Connection timeout: 30 seconds
Client API rate limit: 100 requests/sec
Client API concurrency limit: 40 connections
Global API concurrency limit: 199 connections
Redirect host: (not configured)
To work around this issue in VMware Enterprise PKS 1.3.x, increase the API rate limits on NSX-T Manager and update the NCP parameters on each Kubernetes master node in the PKS cluster:
To update the rate limit values on NSX-T Manager:
Connect to the NSX-T Manager CLI as the admin user.
Set the API rate limit to unlimited and the concurrency limit to 199 by running the following commands on NSX-T Manager:
set service http client-api-rate-limit 0
set service http client-api-concurrency-limit 199
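To confirm that the new values took effect, display the HTTP service configuration again using the same command shown earlier:

nsxmgr-01a> get service http

The Client API concurrency limit line should now read 199 connections, and the Client API rate limit line should no longer report 100 requests/sec.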
To update the NCP parameters on each master node:
Connect to the first master node using Bosh:
bosh -d <service-instance> ssh master/0
Switch to the root user by running: sudo -i
Stop the ncp service by running the command: monit stop ncp
Add the following parameters under the [ha] section in the /var/vcap/jobs/ncp/config/ncp.ini file (a sketch of the resulting section is shown after this list):
master_timeout = 18
heartbeat_period = 6
update_timeout = 10
Start the ncp service by running the command: monit start ncp
Repeat the same steps on other master nodes.
Repeat the above steps for other PKS clusters as well.
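For reference, below is a minimal sketch of how the edited [ha] section in /var/vcap/jobs/ncp/config/ncp.ini should look after the steps above; keep any other keys already present in the section (values are in seconds):

[ha]
# Workaround values from the steps above
master_timeout = 18
heartbeat_period = 6
update_timeout = 10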
Note: These NCP HA changes are ephemeral and reset to the default values if the master node is recreated by Ops Manager or bosh cck. In that case, you need to manually apply the NCP HA changes again.