Failure to create load balancers for Kubernetes clusters with the same Service name in the same Resource Group on Azure

Article ID: 298481

Products

VMware Tanzu Kubernetes Grid Integrated Edition

Issue/Introduction

Suppose that you have two different Kubernetes clusters in the same Resource Group on Azure, and you want to create a load balancer in each cluster using the same Service name, based on the following simple manifest.
---
apiVersion: v1
kind: Service
metadata:
  name: my-loadbalancer
  labels:
    run: my-loadbalancer
spec:
  ports:
  - port: 20001
    protocol: TCP
  selector:
    run: my-loadbalancer
  type: LoadBalancer

The load balancer for the first cluster is created without any problems. For the second cluster, however, creation may fail, and the kubectl describe command reports the following error for the Service:
Error syncing load balancer: failed to ensure load balancer: timed out waiting for the condition


Resolution

When this issue occurs, the stderr log of the kube-controller-manager job on the master node contains the following messages:
I0304 14:05:15.646864       6 controller.go:336] Ensuring load balancer for service default/my-loadbalancer
  :
E0304 14:05:41.817056       6 azure_backoff.go:749] processHTTPRetryResponse: backoff failure, will retry, err=network.PublicIPAddressesClient#Delete: Failure sending request: StatusCode=400 -- Original Error: Code="PublicIPAddressCannotBeDeleted" Message="Public IP address /subscriptions/********-****-****-****-************/resourceGroups/my-rsg/providers/Microsoft.Network/publicIPAddresses/kubernetes-******************************** can not be deleted since it is still allocated to resource /subscriptions/********-****-****-****-************/resourceGroups/my-rsg/providers/Microsoft.Network/loadBalancers/kubernetes/frontendIPConfigurations/********************************. In order to delete the public IP, disassociate/detach the Public IP address from the resource.  To learn how to do this, see aka.ms/deletepublicip." Details=[]
E0304 14:05:41.817168       6 controller.go:243] error processing service default/my-loadbalancer (will retry): failed to ensure load balancer: timed out waiting for the condition
I0304 14:05:41.817334       6 event.go:281] Event(v1.ObjectReference{Kind:"Service", Namespace:"default", Name:"my-loadbalancer", UID:"*************-****-****-************", APIVersion:"v1", ResourceVersion:"6538", FieldPath:""}): type: 'Warning' reason: 'SyncLoadBalancerFailed' Error syncing load balancer: failed to ensure load balancer: timed out waiting for the condition

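The log shows a PublicIPAddressesClient#Delete request failing because the public IP address is still allocated to a frontend IP configuration of an existing load balancer. When Azure assigns a public IP address to the second load balancer, it appears to try to reuse the IP address already assigned to the first one. Because the first load balancer is still active, the address cannot be released, and the operation for the second load balancer fails.

Azure appears to select the public IP address for a load balancer based on the Service name, so two clusters in the same Resource Group that use the same Service name end up contending for the same address.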

There are two ways to resolve this issue.

1. Use a different Service name in each Kubernetes cluster in the same Resource Group when creating load balancers on Azure.

2. Specify the public IP address for each cluster manually in .spec.loadBalancerIP of your .yaml file rather than relying on Azure's default behavior. For details, see the following document provided by Microsoft: Use a static public IP address and DNS label with the Azure Kubernetes Service (AKS) load balancer
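
The two options above can be sketched as variations of the original manifest for the second cluster. This is an illustration only, not a tested configuration: the Service name my-loadbalancer-cluster2 and the address 203.0.113.10 are hypothetical placeholders. For the second option, the address is assumed to be a static public IP that you have already created in Azure and that the cluster is permitted to use. Apply only one of the two documents.

```yaml
---
# Option 1: give the Service in the second cluster a different name.
# "my-loadbalancer-cluster2" is a hypothetical name used for illustration.
apiVersion: v1
kind: Service
metadata:
  name: my-loadbalancer-cluster2
  labels:
    run: my-loadbalancer
spec:
  ports:
  - port: 20001
    protocol: TCP
  selector:
    run: my-loadbalancer   # unchanged, so the same pods are still selected
  type: LoadBalancer
---
# Option 2: keep the Service name and set the public IP address explicitly.
# 203.0.113.10 is a placeholder; replace it with a static public IP
# that already exists in Azure (assumption, see the Microsoft document).
apiVersion: v1
kind: Service
metadata:
  name: my-loadbalancer
  labels:
    run: my-loadbalancer
spec:
  loadBalancerIP: 203.0.113.10
  ports:
  - port: 20001
    protocol: TCP
  selector:
    run: my-loadbalancer
  type: LoadBalancer
```

With the first option, any client or DNS record that refers to the Service by name in the second cluster must use the new name. With the second option, each cluster should be given its own distinct address.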