Recreating cluster deployment fails after a failed upgrade in VMware Tanzu Kubernetes Grid Integrated Edition (TKGI)

Article ID: 298647

Updated On:

Products

VMware Tanzu Kubernetes Grid Integrated Edition

Issue/Introduction

An upgrade of VMware Tanzu Kubernetes Grid Integrated Edition (TKGI) fails for an unknown reason, and a subsequent attempt to recreate one of the master nodes fails with the following error:

Task 1234 | 14:11:14 | Updating instance master: master/5bd430c6-24bd-41aa-9f66-8c55495b2d8 (0) (canary) (00:09:23)
                     L Error: 'master/5bd430c6-24bd-41aa-9f66-8c55495b2d8 (0)' is not running after update. Review logs for failed jobs: kube-apiserver, etcd
Task 1234 | 14:20:37 | Error: 'master/5bd430c6-24bd-41aa-9f66-8c55495b2d8 (0)' is not running after update. Review logs for failed jobs: kube-apiserver, etcd

In addition, etcd.stderr.log shows that the etcd cluster cannot be downgraded:

2020-01-19 15:15:48.414046 I | raft: newRaft 18f305ed866fbc3 [peers: [18f305ed866fbc3,3872e9a619e11d92,7a0ea7da8e713ed4], term: 597, commit: 43294283, applied: 43294283, lastindex: 43294284, lastterm: 597]
2020-08-16 15:15:48.414241 C | etcdserver/membership: cluster cannot be downgraded (current version: 3.3.17 is lower than determined cluster version: 3.4).
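To confirm that a master node is hitting this condition, you can scan etcd.stderr.log for the fatal message quoted above. The following Python snippet is a minimal sketch, not a supported tool. The function names find_downgrade and is_older are hypothetical, and the pattern assumes the message wording matches the log excerpt in this article exactly.

```python
import re

# Pattern for the fatal etcd message quoted in this article.
# Assumption: the wording in your log is identical to that excerpt.
DOWNGRADE_RE = re.compile(
    r"cluster cannot be downgraded \(current version: (?P<current>[\d.]+) "
    r"is lower than determined cluster version: (?P<cluster>[\d.]+)\)"
)


def find_downgrade(log_text):
    """Return (current_version, cluster_version) for the first match, or None."""
    match = DOWNGRADE_RE.search(log_text)
    if match is None:
        return None
    return match.group("current"), match.group("cluster")


def is_older(current, cluster):
    """Compare major.minor only, because the cluster version is logged as major.minor."""
    def major_minor(version):
        return tuple(int(part) for part in version.split(".")[:2])

    return major_minor(current) < major_minor(cluster)
```

For example, passing the contents of etcd.stderr.log to find_downgrade returns the version of the etcd binary on the recreated VM and the version already recorded by the cluster. If is_older reports that the binary is older, the node was recreated from a manifest that predates the upgrade.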


Environment

Product Version: 1.8

Resolution

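The TKGI cluster instances were not upgraded successfully. When you try to recreate the failed cluster, the VMs are recreated from the last successfully deployed manifest, which attempts to downgrade etcd on the master nodes. etcd refuses to start because the older binary (3.3.17 in this example) is lower than the version already determined for the cluster (3.4).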

To work around this issue, run the upgrade process again with "pks upgrade-cluster". If it fails again, review the logs and open a ticket with VMware Tanzu Support.
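If you script the retry, the following Python snippet is one possible sketch. It assumes the pks CLI is installed, on the PATH, and already logged in, and that the command takes the cluster name as its argument. The function names and the cluster name "my-cluster" are hypothetical placeholders. By default the helper only returns the command line so you can inspect it before running anything.

```python
import shlex
import subprocess


def upgrade_cluster_command(cluster_name):
    """Build the argument list for `pks upgrade-cluster CLUSTER-NAME`."""
    if not cluster_name or cluster_name.startswith("-"):
        raise ValueError("cluster name must be a non-empty name, not a flag")
    return ["pks", "upgrade-cluster", cluster_name]


def run_upgrade(cluster_name, dry_run=True):
    """Return the command as a string (dry run) or execute it with the pks CLI."""
    argv = upgrade_cluster_command(cluster_name)
    if dry_run:
        # Nothing is executed; the caller can review the exact command first.
        return shlex.join(argv)
    # Raises CalledProcessError if pks exits with a non-zero status.
    return subprocess.run(argv, check=True)
```

Calling run_upgrade("my-cluster") returns the string "pks upgrade-cluster my-cluster" without running it. Pass dry_run=False to execute the command once you have confirmed the cluster name.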