2.1.1 to 2.2.0 TKGm management cluster upgrade to k8s version 1.25.x fails when using Antrea CNI
Article ID: 369894
Products
Tanzu Kubernetes Grid
VMware Tanzu Kubernetes Grid
Issue/Introduction
A TKGm management cluster upgrade from 2.1.1 to 2.2.0 fails when the cluster uses the Antrea CNI and the upgrade targets k8s version 1.25.x, because the installed Antrea version depends on the old EndpointSlice v1beta1 API.
The control plane nodes upgrade as expected but worker nodes fail to upgrade.
TKGm 2.1.1 includes Antrea 1.7.2, which reconciles the EndpointSlice API at version v1beta1. This version of the EndpointSlice resource was removed in k8s 1.25 (see the Deprecated API Migration Guide in the Kubernetes documentation).
During an upgrade to TKGm 2.2.0 and k8s version 1.25.x, the API server is updated first and no longer serves EndpointSlice v1beta1, while the Antrea CNI pods running on the worker nodes still expect to reconcile that resource version.
This causes the upgrade to fail.
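The version mismatch described above can be expressed as a small compatibility check. The following Python sketch is illustrative only and is not part of TKG or Antrea. The function and constant names are hypothetical, and the version boundaries are the ones stated in this article (EndpointSlice v1 available since k8s 1.21, v1beta1 removed in k8s 1.25). The k8s release in which v1beta1 was first served is not modelled.

```python
from typing import Tuple

# Boundaries taken from this article.
V1_AVAILABLE_FROM = (1, 21)   # EndpointSlice v1 is served from k8s 1.21 onward
V1BETA1_REMOVED_IN = (1, 25)  # EndpointSlice v1beta1 is no longer served from k8s 1.25

# EndpointSlice API version each Antrea release reconciles, per this article.
ANTREA_ENDPOINTSLICE_API = {
    "1.7.2": "v1beta1",  # shipped with TKGm 2.1.1
    "1.9.0": "v1",       # shipped with TKGm 2.2.0
}


def parse_minor(version: str) -> Tuple[int, int]:
    """Turn '1.25.7', '1.25.x' or 'v1.25.7' into (1, 25)."""
    major, minor = version.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def api_server_serves(endpointslice_api: str, k8s_version: str) -> bool:
    """Return True if the API server at k8s_version serves this EndpointSlice version."""
    version = parse_minor(k8s_version)
    if endpointslice_api == "v1beta1":
        # Assumption: the lower bound for v1beta1 is ignored in this sketch.
        return version < V1BETA1_REMOVED_IN
    if endpointslice_api == "v1":
        return version >= V1_AVAILABLE_FROM
    raise ValueError(f"unknown EndpointSlice API version: {endpointslice_api}")


def antrea_compatible(antrea_version: str, k8s_version: str) -> bool:
    """Return True if the given Antrea release can reconcile EndpointSlices on k8s_version."""
    return api_server_serves(ANTREA_ENDPOINTSLICE_API[antrea_version], k8s_version)
```

With these definitions, `antrea_compatible("1.7.2", "1.25.7")` is `False`, which corresponds to the failing upgrade, while `antrea_compatible("1.7.2", "1.24.9")` and `antrea_compatible("1.9.0", "1.25.7")` are both `True`.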
Resolution
Upgrading to TKGm 2.2.0 with k8s version 1.23.x or 1.24.x succeeds.
This upgrade also moves Antrea to version 1.9.0, which reconciles EndpointSlice v1 resources.
The EndpointSlice v1 API has been available since k8s 1.21.
To upgrade to k8s 1.25.x, perform a two-step upgrade: first upgrade to one of the versions mentioned above, then upgrade to 1.25.x.
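The two-step path can be sketched as a small planning helper. This is a minimal Python illustration, not a TKG tool: `plan_upgrade` and the constants are hypothetical names, and the default intermediate version of 1.24.x is an assumption (this article allows either 1.23.x or 1.24.x as the first step).

```python
from typing import List, Tuple

# First k8s minor version that no longer serves EndpointSlice v1beta1 (per this article).
V1BETA1_REMOVED_IN = (1, 25)
# Intermediate versions this article names as safe first steps.
SAFE_INTERMEDIATE = ("1.23.x", "1.24.x")


def parse_minor(version: str) -> Tuple[int, int]:
    """Turn '1.25.7', '1.25.x' or 'v1.25.7' into (1, 25)."""
    major, minor = version.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def plan_upgrade(target_k8s: str, intermediate: str = "1.24.x") -> List[str]:
    """Return the k8s versions to upgrade through, in order, starting from TKGm 2.1.1."""
    if intermediate not in SAFE_INTERMEDIATE:
        raise ValueError(f"intermediate must be one of {SAFE_INTERMEDIATE}")
    if parse_minor(target_k8s) >= V1BETA1_REMOVED_IN:
        # Step 1 brings in Antrea 1.9.0 while v1beta1 is still served;
        # step 2 then moves to the target version.
        return [intermediate, target_k8s]
    # Targets below 1.25 can be reached in a single upgrade.
    return [target_k8s]


if __name__ == "__main__":
    print(plan_upgrade("1.25.7"))  # ['1.24.x', '1.25.7']
    print(plan_upgrade("1.24.9"))  # ['1.24.9']
```

The helper only encodes the ordering rule from this article; the upgrades themselves are still performed with the usual TKG upgrade procedure.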