TSM Software upgrade fails for certain clusters with message Upgrading mesh dependencies

Article ID: 379671

Updated On:

Products

VMware Tanzu Service Mesh

Issue/Introduction

When an upgrade of the Istio installation from v1.18.5 to v1.22.2 is started from Tanzu Service Mesh (TSM) or Tanzu Mission Control (TMC), the upgrade is initiated but fails after a while, and a rollback is started.


Environment

Tanzu on vSphere clusters

Tanzu Service Mesh (TSM)

Tanzu Mission Control (TMC)

Cause

A jump upgrade from 1.18.x to 1.22.2 runs through several stages, and a separate task is executed for each stage.

Two issues were discovered during the upgrade operations:

1. A Pod Disruption Budget (PDB) configured on telemetry prevented the telemetry pods from restarting.

2. During the second phase, all proxies configured in the cluster (in the enabled namespaces) have to be restarted. A validating webhook denied this operation, so the proxy restart failed. This led to a rollback, shown in the UI as "Rolling back mesh dependencies".

Other causes could be related to specific configuration or stuck objects, but none were found during analysis.
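Both conditions can be checked from `kubectl` JSON output. The following is a minimal sketch, not part of the TSM tooling: it assumes `kubectl` is configured for the affected cluster, treats any PDB whose `status.disruptionsAllowed` is 0 as a candidate blocker, and uses a simple text match on event messages as a heuristic for admission webhook denials. The function names are illustrative.

```python
from __future__ import annotations

import json
import subprocess


def blocking_pdbs(pdb_list: dict) -> list[str]:
    """Return 'namespace/name' for every PDB that allows zero disruptions."""
    blocked = []
    for item in pdb_list.get("items", []):
        # A missing status is treated as 0 so that unknown PDBs get reviewed too.
        allowed = item.get("status", {}).get("disruptionsAllowed", 0)
        if allowed == 0:
            meta = item["metadata"]
            blocked.append(f"{meta['namespace']}/{meta['name']}")
    return blocked


def webhook_denials(event_list: dict) -> list[str]:
    """Return event messages that look like admission webhook rejections."""
    denials = []
    for event in event_list.get("items", []):
        message = event.get("message") or ""
        # Heuristic text match (assumption): the API server reports rejections
        # as 'admission webhook "<name>" denied the request: ...'.
        if "admission webhook" in message and "denied the request" in message:
            denials.append(message)
    return denials


def kubectl_json(*args: str) -> dict:
    """Run kubectl with JSON output and parse the result."""
    result = subprocess.run(
        ["kubectl", *args, "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def report() -> None:
    """Print candidate PDB blockers and webhook denials for the current cluster."""
    print("PDBs allowing no disruptions:", blocking_pdbs(kubectl_json("get", "pdb", "-A")))
    for message in webhook_denials(kubectl_json("get", "events", "-A")):
        print("Denied:", message)
```

Calling `report()` against the affected cluster while the upgrade is in progress should show whether either condition applies. Events expire, so an empty result does not rule out a webhook denial.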

Resolution

The following two changes resolved the problem:

1. PDB issue: save a copy of the PDB, delete it for the duration of the upgrade, and restore it after the upgrade completes.

2. Validating webhook (Gatekeeper): disable the Gatekeeper webhooks during the upgrade process to allow the proxy pods to restart:

  • gatekeeper-mutating-webhook
  • gatekeeper-validating-webhook
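The exact names of the Gatekeeper webhook configuration objects can differ between installations, so it may help to confirm them before changing anything. The sketch below assumes the input is the parsed output of `kubectl get validatingwebhookconfiguration,mutatingwebhookconfiguration -o json` and that the Gatekeeper objects have names starting with `gatekeeper`. The function name and the prefix default are illustrative.

```python
from __future__ import annotations


def gatekeeper_webhook_configs(
    config_list: dict, prefix: str = "gatekeeper"
) -> list[tuple[str, str]]:
    """Return (kind, name) for webhook configurations whose name starts with prefix."""
    found = []
    for item in config_list.get("items", []):
        name = item.get("metadata", {}).get("name", "")
        if name.startswith(prefix):
            found.append((item.get("kind", ""), name))
    return found
```

The returned kinds and names are the ones to use when saving and disabling the webhooks for the upgrade.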

Applying these two changes allowed us to complete the upgrade.
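The save, delete, and restore steps could be scripted roughly as follows. This is a sketch under assumptions, not the procedure used by support: it treats "disable" as backing up and deleting the object, then re-applying the backup after the upgrade. It also removes server-populated metadata and `status` from the backup so that the saved file can be applied cleanly later. All function names are illustrative, and the object names and namespaces must be taken from the actual cluster.

```python
from __future__ import annotations

import json
import subprocess

# Fields the API server fills in; they are removed from the backup (assumption:
# a clean manifest is easier to re-apply than a raw dump of the live object).
SERVER_FIELDS = {
    "resourceVersion", "uid", "creationTimestamp",
    "generation", "managedFields", "selfLink",
}


def strip_server_fields(obj: dict) -> dict:
    """Return a copy of obj without status and server-populated metadata."""
    clean = {key: value for key, value in obj.items() if key != "status"}
    clean["metadata"] = {
        key: value
        for key, value in obj.get("metadata", {}).items()
        if key not in SERVER_FIELDS
    }
    return clean


def _with_namespace(cmd: list[str], namespace: str | None) -> list[str]:
    """Append -n <namespace> for namespaced objects such as PDBs."""
    return cmd + ["-n", namespace] if namespace else cmd


def get_cmd(kind: str, name: str, namespace: str | None = None) -> list[str]:
    return _with_namespace(["kubectl", "get", kind, name, "-o", "json"], namespace)


def delete_cmd(kind: str, name: str, namespace: str | None = None) -> list[str]:
    return _with_namespace(["kubectl", "delete", kind, name], namespace)


def apply_cmd(backup_path: str) -> list[str]:
    return ["kubectl", "apply", "-f", backup_path]


def save_and_delete(kind: str, name: str, backup_path: str,
                    namespace: str | None = None) -> None:
    """Write a cleaned backup of the object to backup_path, then delete it."""
    raw = subprocess.run(get_cmd(kind, name, namespace),
                         check=True, capture_output=True, text=True).stdout
    with open(backup_path, "w") as handle:
        json.dump(strip_server_fields(json.loads(raw)), handle, indent=2)
    # Delete only after the backup file has been written successfully.
    subprocess.run(delete_cmd(kind, name, namespace), check=True)


def restore(backup_path: str) -> None:
    """Re-create the object from its backup once the upgrade has completed."""
    subprocess.run(apply_cmd(backup_path), check=True)
```

For example, `save_and_delete("pdb", "<pdb-name>", "pdb-backup.json", namespace="<namespace>")` before the upgrade and `restore("pdb-backup.json")` afterwards. The same pair of calls applies to the webhook configurations with kind `validatingwebhookconfiguration` or `mutatingwebhookconfiguration` and no namespace. While the webhooks are removed, Gatekeeper policies are not enforced, so the window should be kept as short as possible.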