Service VIP allocated from wrong CIDR when TKG management cluster is deployed using the AVI network separation feature along with "AVI_CONTROL_PLANE_HA_PROVIDER: true"

Article ID: 371361

Updated On:

Products

VMware Tanzu Kubernetes Grid

Issue/Introduction

In TKG 2.4.1, when "AVI_CONTROL_PLANE_HA_PROVIDER" is set to "true" and "AVI_MANAGEMENT_CLUSTER_VIP_NETWORK_CIDR" and "AVI_MANAGEMENT_CLUSTER_CONTROL_PLANE_VIP_NETWORK_CIDR" are set to different networks, cluster creation might fail because the VIP for the control plane service might be allocated from the wrong network.

AKO then tries to reassign the correct IP but fails with the following error:

2024-06-19T01:13:08.545Z    WARN    rest/rest_operation.go:304    key: admin/tkg-system-mgmt-test--tkg-system-tkg-system-mgmt-test-control-plane, msg: RestOp method PUT path /api/vsvip/vsvip-a8d0b819-2b46-42f6-86ad-241d6dffad6f tenant admin Obj {"_last_modified":"1718759587857132","cloud_ref":"https://<AVI-Controller-IP>/api/cloud/cloud-7f2c7a55-95e1-4b6d-99bd-af191fe78f44#Default-Cloud","east_west_placement":false,"markers":[{"key":"clustername","values":["tkg-system-mgmt-test"]}],"name":"tkg-system-mgmt-test--tkg-system-tkg-system-mgmt-test-control-plane","tenant_ref":"https://<AVI-Controller-IP>/api/tenant/admin#admin","url":"https://<AVI-Controller-IP>/api/vsvip/vsvip-a8d0b819-2b46-42f6-86ad-241d6dffad6f#tkg-system-mgmt-test--tkg-system-tkg-system-mgmt-test-control-plane","uuid":"vsvip-a8d0b819-2b46-42f6-86ad-241d6dffad6f","vip":[{"auto_allocate_ip":true,"ip_address":{"addr":"192.168.30.10","type":"V4"},"ipam_network_subnet":{"network_ref":"/api/network/?name=VLAN-20-PG"},"vip_id":"0"}],"vrf_context_ref":"https://<AVI-Controller-IP>/api/vrfcontext/vrfcontext-066fcecc-5faa-4994-a48a-c05e6fab2a06#global","vsvip_cloud_config_cksum":"2996509940"} returned err {"code":0,"message":"map[error:[tkg-system-mgmt-test--tkg-system-tkg-system-mgmt-test-control-plane#0] There are no configured static IP pools available for VIPs in network <Portgroup-name> which contain IP address 192.168.30.10]","Verb":"PUT","Url":"https://<AVI-Controller-IP>//api/vsvip/vsvip-a8d0b819-2b46-42f6-86ad-241d6dffad6f","HttpStatusCode":400} with response null
2024-06-19T01:13:08.545Z    WARN    rest/dequeue_nodes.go:620    key: admin/tkg-system-mgmt-test--tkg-system-tkg-system-mgmt-test-control-plane, msg: there was an error sending the macro Error during PUT: Encountered an error on PUT request to URL https://<AVI-Controller-IP>//api/vsvip/vsvip-a8d0b819-2b46-42f6-86ad-241d6dffad6f: HTTP code: 400; error from Avi: map[error:[tkg-system-mgmt-test--tkg-system-tkg-system-mgmt-test-control-plane#0] There are no configured static IP pools available for VIPs in network <Portgroup-name> which contain IP address 192.168.30.10]

 

In this example, AVI_MANAGEMENT_CLUSTER_CONTROL_PLANE_VIP_NETWORK_CIDR is set to 192.168.20.0/24, but the service was assigned 192.168.30.10, which belongs to AVI_MANAGEMENT_CLUSTER_VIP_NETWORK_CIDR.
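
To confirm which network an allocated VIP actually belongs to, a small check like the following may help. It is only a sketch: the control plane CIDR and the IP are taken from the example above, while the value 192.168.30.0/24 for AVI_MANAGEMENT_CLUSTER_VIP_NETWORK_CIDR is an assumption chosen to contain 192.168.30.10, and the function name is hypothetical. Substitute the values from your own cluster configuration.

```python
import ipaddress

# Control plane CIDR comes from the example in this article.
# The data VIP CIDR below is an ASSUMED value (the article does not state it).
CIDRS = {
    "AVI_MANAGEMENT_CLUSTER_CONTROL_PLANE_VIP_NETWORK_CIDR": "192.168.20.0/24",
    "AVI_MANAGEMENT_CLUSTER_VIP_NETWORK_CIDR": "192.168.30.0/24",
}


def networks_containing(ip, cidrs):
    """Return the names of all configured CIDRs that contain the given IP."""
    addr = ipaddress.ip_address(ip)
    return [
        name
        for name, cidr in cidrs.items()
        if addr in ipaddress.ip_network(cidr, strict=False)
    ]


# The VIP reported in the AKO warning above.
print(networks_containing("192.168.30.10", CIDRS))
```

If the only name returned is AVI_MANAGEMENT_CLUSTER_VIP_NETWORK_CIDR, the control plane VIP was allocated from the data VIP network rather than the control plane VIP network, which matches the symptom described here.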

Cause

This is caused by a bug in AKO 1.10. When AKO starts up, it does not wait for the AVI Infra Setting to be ready before reconciling the service for the control plane endpoint. As a result, it might assign a wrong IP because it does not yet have the correct information from the AVI Infra Setting.

Resolution

There is no permanent resolution at this time.

Workaround:

Before creating the cluster, set "VSPHERE_CONTROL_PLANE_ENDPOINT" to an IP address from AVI_MANAGEMENT_CLUSTER_VIP_NETWORK_CIDR.
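
As a pre-deployment sanity check, the workaround can be verified against the cluster configuration before running the deployment. The sketch below assumes the configuration is a flat list of "KEY: value" lines; the sample values (including the 192.168.30.0/24 CIDR and the endpoint IP) and the helper names are hypothetical and used for illustration only.

```python
import ipaddress


def parse_flat_config(text):
    """Parse flat 'KEY: value' lines, skipping blank lines and '#' comments."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        # Drop optional surrounding quotes from the value.
        config[key.strip()] = value.strip().strip('"').strip("'")
    return config


def endpoint_in_vip_cidr(config):
    """True if VSPHERE_CONTROL_PLANE_ENDPOINT lies inside
    AVI_MANAGEMENT_CLUSTER_VIP_NETWORK_CIDR, as the workaround requires."""
    endpoint = ipaddress.ip_address(config["VSPHERE_CONTROL_PLANE_ENDPOINT"])
    network = ipaddress.ip_network(
        config["AVI_MANAGEMENT_CLUSTER_VIP_NETWORK_CIDR"], strict=False
    )
    return endpoint in network


# Hypothetical sample configuration; replace with your own file contents.
SAMPLE = """
AVI_CONTROL_PLANE_HA_PROVIDER: "true"
AVI_MANAGEMENT_CLUSTER_VIP_NETWORK_CIDR: 192.168.30.0/24
AVI_MANAGEMENT_CLUSTER_CONTROL_PLANE_VIP_NETWORK_CIDR: 192.168.20.0/24
VSPHERE_CONTROL_PLANE_ENDPOINT: 192.168.30.10
"""

print(endpoint_in_vip_cidr(parse_flat_config(SAMPLE)))
```

The check raises a KeyError if either variable is missing from the configuration, which is a useful signal that the endpoint has not been set explicitly.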