Same ingress IP used across multiple clusters in VMware Tanzu Kubernetes Grid Integrated Edition (TKGI) 1.7



Article ID: 298730



Products

VMware Tanzu Kubernetes Grid Integrated Edition

Issue/Introduction

After a VMware Tanzu Kubernetes Grid Integrated Edition (TKGI) cluster has been created with a network profile that specifies the ingress_ip parameter in cni_configurations, subsequent cluster creations of the following types appear to succeed, but the virtual servers for ingress are never created, as seen in the NSX-T UI:
  • TKGI cluster created without a network profile.
  • TKGI cluster created with a network profile that does not specify the ingress_ip parameter in cni_configurations.

Cause

After a TKGI cluster is created using a network profile that specifies ingress_ip, the http_and_https_ingress_ip parameter in the NCP configuration (/var/vcap/jobs/ncp/config/ncp.ini) is not overridden for subsequent clusters, whether they are created with or without a network profile. As a result, NCP picks the same ingress_ip that was used to create the virtual servers for the previous cluster.

Because that ingress_ip is already allocated in the FIP pool, new virtual servers for the ingress controller of subsequent clusters are not created.

Even though cluster creation succeeds for the subsequent cluster, the NCP log on that cluster (/var/vcap/sys/log/ncp/ncp.stdout.log) records the allocation conflict:
[nsx@6876 comp="nsx-container-ncp" subcomp="ncp" level="INFO"] nsx_ujo.ncp.nsx.lb_layer4_service Assigned user specified ip 10.40.14.60 to Layer7 virtual servers
[nsx@6876 comp="nsx-container-ncp" subcomp="ncp" level="WARNING"] vmware_nsxlib.v3.client The HTTP request returned error code 409, whereas 201/200 response codes were expected. Response body {u'error_code': 5141, u'error_message': u'Requested IP Address 10.40.14.60 is already allocated.', u'httpStatus': u'CONFLICT', u'module_name': u'id-allocation service'}
[nsx@6876 comp="nsx-container-ncp" subcomp="ncp" level="ERROR" errorCode="NCP00164"] nsx_ujo.ncp.nsx.manager.base_k8s_nsxapi Failed to allocate ip from ip pool: NSX IP 10.40.14.60 is already allocated in ip pool efe30760-39c2-4644-b6b1-103b7cfe609e
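
To check quickly whether a cluster is affected, the conflict messages can be pulled out of the NCP log. The following Python sketch assumes the log wording matches the sample lines above (other NCP versions may phrase them differently); the function name is illustrative and is not part of NCP or TKGI.

```python
import re

IPV4 = r"(\d{1,3}(?:\.\d{1,3}){3})"

# Patterns copied from the sample log lines above; treat them as assumptions.
PATTERNS = (
    re.compile(r"Requested IP Address " + IPV4 + r" is already allocated"),
    re.compile(r"NSX IP " + IPV4 + r" is already allocated in ip pool"),
)


def find_conflicting_ips(lines):
    """Return the sorted, de-duplicated IPs that the log reports as already allocated."""
    ips = set()
    for line in lines:
        for pattern in PATTERNS:
            match = pattern.search(line)
            if match:
                ips.add(match.group(1))
    return sorted(ips)
```

The function accepts any iterable of lines, so an open file handle for ncp.stdout.log can be passed directly. A non-empty result suggests that NCP tried to reuse an ingress IP that was already taken.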


How to reproduce this issue?

Creating TKGI clusters in the following sequence results in the same ingress IP being used for the virtual servers deployed in NSX-T.

1. Create a TKGI cluster (cluster1) with a Network Profile (test-np1.json) that defines the ingress_ip parameter under cni_configurations to manually specify an available floating IP (FIP) from the FIP pool.

Example: Network Profile (test-np1.json):

{
    "name": "test-np1",
    "description": "Network profile using API lookup mode",
    "parameters": {
        "nodes_dns": [
            "<redacted>"
        ],
        "fip_pool_ids": [
            "<redacted>"
        ],
        "dns_lookup_mode": "API",
        "cni_configurations": {
            "type": "nsxt",
            "parameters": {
                "ingress_ip": "10.40.14.60"
            }
        }
    }
}

2. Create another TKGI cluster (cluster2) with a Network Profile (test-np2.json) that does not manually specify a FIP for the ingress_ip parameter.

Example: Network Profile (test-np2.json):
{
    "name": "test-np2",
    "description": "Network profile using API lookup mode",
    "parameters": {
        "nodes_dns": [
            "<redacted>"
        ],
        "fip_pool_ids": [
            "<redacted>"
        ],
        "dns_lookup_mode": "API"
    }
}

3. BOSH SSH to the master VMs of cluster1 and cluster2 and observe that /var/vcap/jobs/ncp/config/ncp.ini contains the same IP for the http_and_https_ingress_ip parameter on both clusters.

4. Lastly, in the NSX-T Manager UI, navigate to Advanced Networking & Security > Load Balancing > Virtual Servers. You will see virtual servers created only for cluster1 and not for cluster2.
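
The comparison in step 3 can be scripted once the ncp.ini contents of each master VM have been copied off. This is a minimal Python sketch under that assumption: it scans for the http_and_https_ingress_ip key without relying on a particular section name, and the function names are hypothetical helpers rather than TKGI or NCP tooling.

```python
def read_ingress_ip(ini_text, key="http_and_https_ingress_ip"):
    """Return the value of `key` from ncp.ini text, or None if unset or commented out."""
    value = None
    for raw in ini_text.splitlines():
        line = raw.strip()
        # Skip blank lines and full-line comments.
        if not line or line.startswith(("#", ";")):
            continue
        name, sep, rest = line.partition("=")
        if sep and name.strip() == key:
            value = rest.strip() or None  # the last occurrence wins
    return value


def clusters_sharing_ingress_ip(ini_by_cluster):
    """Map each ingress IP configured on more than one cluster to those cluster names."""
    by_ip = {}
    for cluster, text in ini_by_cluster.items():
        ip = read_ingress_ip(text)
        if ip is not None:
            by_ip.setdefault(ip, []).append(cluster)
    return {ip: sorted(names) for ip, names in by_ip.items() if len(names) > 1}
```

Passing a dictionary such as {"cluster1": text1, "cluster2": text2} returns an empty dictionary when every cluster has its own ingress IP, and lists the affected clusters per IP otherwise.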

Environment

Product Version: 1.7
OS: Ubuntu

Resolution

Workaround

As a workaround, create every TKGI cluster with a network profile that explicitly specifies a new, unused IP for the ingress_ip parameter in cni_configurations, so that the old ingress_ip is overridden in the NCP configuration (ncp.ini) on the TKGI cluster nodes.
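
One way to apply the workaround consistently is to generate each network profile from a template and refuse any ingress_ip that has already been handed out. The Python sketch below mirrors the structure of test-np1.json. The function name and the used_ingress_ips bookkeeping set are assumptions for illustration only, and the check is purely local: it does not query the NSX-T FIP pool.

```python
import ipaddress
import json


def build_network_profile(name, ingress_ip, used_ingress_ips, fip_pool_ids, nodes_dns):
    """Return a network profile dict shaped like test-np1.json.

    Raises ValueError if ingress_ip is malformed or already in used_ingress_ips;
    on success the IP is added to used_ingress_ips.
    """
    ip = str(ipaddress.ip_address(ingress_ip))  # validates the address format
    if ip in used_ingress_ips:
        raise ValueError(f"ingress_ip {ip} is already assigned to another cluster")
    used_ingress_ips.add(ip)
    return {
        "name": name,
        "description": "Network profile using API lookup mode",
        "parameters": {
            "nodes_dns": list(nodes_dns),
            "fip_pool_ids": list(fip_pool_ids),
            "dns_lookup_mode": "API",
            "cni_configurations": {
                "type": "nsxt",
                "parameters": {"ingress_ip": ip},
            },
        },
    }


def to_json(profile):
    """Serialize a profile for writing to a .json file."""
    return json.dumps(profile, indent=4)
```

For example, with used_ingress_ips initialized to {"10.40.14.60"} for cluster1, a profile for cluster2 would need a different free address from the FIP pool (10.40.14.61 here is a made-up value); requesting 10.40.14.60 again raises an error before any cluster is created.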