AKO crashing while retrieving resource lock in a standalone deployment

Article ID: 420692

Products

VMware Avi Load Balancer

Issue/Introduction

This article addresses an unexpected behaviour in which a standalone deployment of Avi Kubernetes Operator (AKO) repeatedly restarts after failing to acquire the leader lease lock from the Kubernetes API server.

Restarting after losing the lease is standard and expected in a High Availability (HA) deployment, but it causes service disruption when AKO is deployed as a single instance.

Environment

AKO 1.3.x and previous versions

Cause

  • AKO uses a leader election mechanism, based on a Kubernetes Lease object, to ensure that in an HA setup only one instance (the leader) actively reconciles configurations.
  • By design, an instance that cannot obtain or maintain this lock must restart so that its peer can take over leadership.
  • This behaviour is expected when AKO is deployed as an HA pair.
  • In this case, however, the same behaviour was seen in a standalone deployment of AKO, where there is no peer to take over.
  • The following logs are seen when AKO crashes:
    ERROR logr/logr.go:299 error retrieving resource lock avi-system/ako-lease-lock: Get "https://172.##.##.##:443/apis/coordination.k8s.io/v1/namespaces/avi-system/leases/ako-lease-lock": context deadline exceeded
    INFO logr/logr.go:278 failed to renew lease avi-system/ako-lease-lock: timed out waiting for the condition
    INFO lib/lib.go:320 Setting Disable Sync to: true
    FATAL k8s/leader_election_callbacks.go:82 AKO lost the leadership
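
To check a saved AKO log for this crash signature, a small script along the following lines may help. This is a minimal sketch, not an official tool: the function names strip_ansi and matches_lease_crash are hypothetical, and the three message fragments are taken from the log excerpt in this article. It assumes that the messages appear in the same order as in the excerpt and that terminal colour codes may be present in the copied text.

```python
import re

# Matches ANSI colour codes in three forms that can appear in copied logs:
# a real ESC byte ("\x1b[31m"), caret notation ("^[[31m"), and the form
# left behind when the ESC byte is lost or mangled ("[[31m").
ANSI_RE = re.compile(r"(?:\x1b|\^\[|\[)\[[0-9;]*m")

# Message fragments from the log excerpt in this article, in order.
SIGNATURE = (
    "error retrieving resource lock",
    "failed to renew lease",
    "AKO lost the leadership",
)


def strip_ansi(text: str) -> str:
    """Remove terminal colour codes so that plain-text matching works."""
    return ANSI_RE.sub("", text)


def matches_lease_crash(log_text: str) -> bool:
    """Return True if every signature fragment occurs, in order."""
    clean = strip_ansi(log_text)
    pos = 0
    for fragment in SIGNATURE:
        idx = clean.find(fragment, pos)
        if idx == -1:
            return False
        pos = idx + len(fragment)
    return True
```

For example, reading the output of a previous AKO container's log into a string and passing it to matches_lease_crash returns True only when all three messages are present in that order.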

Resolution

This issue is fixed for standalone AKO deployments in AKO 2.1.1.

With this fix, AKO no longer crashes when it cannot acquire the lease lock in a standalone setup.
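
When auditing several clusters, the version check can be scripted. The following is a minimal sketch: the function names parse_version and has_standalone_lease_fix are hypothetical, the only fact taken from this article is that the fix is in 2.1.1, and the sketch assumes that later releases retain the fix and that version strings are plain dotted numbers (optionally prefixed with "v").

```python
def parse_version(version: str) -> tuple:
    """Turn a string such as "2.1.1" or "v2.1.1" into a tuple of ints."""
    return tuple(int(part) for part in version.strip().lstrip("vV").split("."))


# First release that contains the fix, according to this article.
FIXED_IN = parse_version("2.1.1")


def has_standalone_lease_fix(version: str) -> bool:
    """Return True if the version is 2.1.1 or later."""
    return parse_version(version) >= FIXED_IN
```

Version strings with suffixes (for example a build tag) are not handled and raise a ValueError, which is intentional here so that unexpected formats are not silently misclassified.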

Release Notes:

https://techdocs.broadcom.com/us/en/vmware-security-load-balancing/avi-load-balancer/avi-kubernetes-operator/2-1/ako-release-notes/release-notes-for-ako-version-2-1-1.html

Under Key Changes in AKO 2.1.1:

  • AKO does not create a lease lock object when only a single AKO instance is running.
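
The effect of this change can be illustrated with a toy model. This is not AKO's implementation: the function run_instance and its parameters are hypothetical and only mirror the behaviour described in this article. An instance that holds a lease exits when a renewal fails, whereas a single instance that never creates a lease has nothing to lose and keeps running.

```python
from typing import Iterable


def run_instance(
    renew_results: Iterable[bool],
    single_instance: bool,
    skip_lease_when_single: bool,
) -> str:
    """Simulate one instance and return "running" or "exited".

    renew_results holds the outcome of each lease renewal attempt
    (True means the lease was renewed in time).
    skip_lease_when_single models the behaviour described for 2.1.1.
    """
    if single_instance and skip_lease_when_single:
        # No lease object is created, so API server timeouts on the
        # lease endpoint cannot cause a loss of leadership.
        return "running"
    for renewed in renew_results:
        if not renewed:
            # Leadership lost: exit so that a peer can take over.
            return "exited"
    return "running"
```

With a failed renewal such as [True, False], a single instance exits when skip_lease_when_single is False and keeps running when it is True. An HA instance exits in both cases, which matches the expected HA behaviour.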