TKGI version upgrade without new stemcell fails for Containerd runtime clusters with Istio CNI

Article ID: 442521

Updated On:

Products

VMware Tanzu Kubernetes Grid Integrated Edition

Issue/Introduction

Upgrading the TKGI version without a new stemcell fails for containerd runtime clusters that use Istio CNI. This breaking change is documented in the TKGI release notes: https://techdocs.broadcom.com/us/en/vmware-tanzu/standalone-components/tanzu-kubernetes-grid-integrated-edition/1-25/tkgi/release-notes.html#-tkgi-version-upgrade-without-new-stemcell-fails-for-containerd-runtime-clusters-with-istio-cni

On clusters configured to use the containerd runtime and Istio CNI, upgrading the TKGI version without also upgrading the stemcell fails with the errors "kubelet cannot find istio-cni binary" and "nsx fails to receive message header".

This error does not occur when you upgrade to a new stemcell along with the new TKGI version.

Environment

TKGI

Cause

When TKGI upgrades a cluster and drains the nodes during the upgrade, it leaves the cluster nodes' Istio CNI agent and CNI configuration in a corrupted state.

If the cluster nodes are not automatically re-created by a stemcell change, the corrupted Istio CNI state remains.

Resolution

If the environment is already running the latest stemcell and a TKGI upgrade is planned, there are two options:

1. Wait for the new stemcell. The latest version is currently 1.1193; the next release is expected in June.

2. Perform the upgrade, then manually recreate all VMs to work around the issue described in the breaking changes.
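
For option 2, one possible way to recreate the VMs is the BOSH CLI `recreate` command, run against each cluster's deployment. The sketch below only illustrates that idea and is not taken from the release notes: it assumes cluster deployments are named `service-instance_<cluster UUID>`, and the helper names `recreate_command` and `recreate_clusters` are hypothetical. It defaults to a dry run that prints the commands without executing them.

```python
import subprocess

# Assumption: TKGI names each cluster's BOSH deployment
# "service-instance_<cluster UUID>". Verify with `bosh deployments`.
DEPLOYMENT_PREFIX = "service-instance_"


def recreate_command(cluster_uuid):
    """Build the BOSH CLI call that recreates every VM in one cluster's deployment."""
    return ["bosh", "-d", DEPLOYMENT_PREFIX + cluster_uuid, "recreate"]


def recreate_clusters(cluster_uuids, dry_run=True):
    """Process the clusters one at a time; with dry_run=True, only print the commands."""
    commands = [recreate_command(uuid) for uuid in cluster_uuids]
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True raises on the first failed recreate, so later
            # clusters are not touched after an error.
            subprocess.run(command, check=True)
    return commands
```

Calling `recreate_clusters(["<cluster-uuid>"])` with a real UUID in place of the placeholder prints the command for review; passing `dry_run=False` runs it, and the BOSH CLI will then prompt for confirmation as usual. Processing clusters sequentially keeps a failure in one cluster from affecting the rest.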