Error: "DHCP lease expired" causes CCI Controller Manager pods to fail in Supervisor

search cancel

Error: "DHCP lease expired" causes CCI Controller Manager pods to fail in Supervisor

book

Article ID: 432680

calendar_today

Updated On:

Products

VMware vSphere Kubernetes Service

Issue/Introduction

CCI Controller Manager pods, which manage the Local Consumption Interface service within the Supervisor, transition into an error state. This issue results in potential control plane disruptions and degraded functionality within the containerized environment due to the loss of active controller managers.
When reviewing the Kubernetes cluster status and pod error logs, a "DHCP lease expired" condition is observed, as seen in the following output:

kubectl get pods -n svc-cci-ns-<domain-cXX>

NAME READY STATUS RESTARTS AGE
cci-ns-controller-manager-##########-##### 1/1 dhcp lease expired couldn't renew it: server rejected request with Nak (msg: requested address not available) 0 15h
cci-ns-controller-manager-##########-##### 1/1 dhcp lease expired couldn't renew it: server rejected request with Nak (msg: requested address not available) 0 17h
cci-ns-controller-manager-##########-##### 1/1 dhcp lease expired couldn't renew it: server rejected request with Nak (msg: requested address not available) 0 25h
cci-ns-controller-manager-##########-##### 1/1 dhcp lease expired couldn't renew it: server rejected request with Nak (msg: requested address not available) 0 12h

Environment

VMware Kubernetes Service

Cause

The DHCP lease duration configured in the external DHCP server (such as Infoblox) is excessively short (e.g., 1 hour). This short duration forces frequent IP renewal attempts. The DHCP server rejects these renewals with a NAK response, causing the pods to lose network connectivity and the controller-manager process to fail upon lease expiration.

Resolution

To restore the Local Consumption Interface service and resolve the configuration mismatch, complete the following steps.

Clear Errored Pods

Delete the affected pods to force the Kubernetes scheduler to recreate them. This action initiates a new DHCP DORA (Discover, Offer, Request, Acknowledge) sequence, restoring the pods to a Running state with newly assigned IP addresses.

1. Connect to the Supervisor control plane cluster.
2. Run the following command to delete the errored pods, replacing <namespace> with the appropriate namespace: kubectl delete pods -l control-plane=controller-manager -n <namespace>

Modify DHCP Configuration

To prevent future recurrences and pod failures, increase the DHCP lease time configured within the DHCP server (e.g., Infoblox) consulting the network team. Extending the lease duration mitigates the frequency of renewal attempts and prevents premature lease expirations and NAK responses.

Feedback

thumb_up Yes

thumb_down No