- After upgrading to TCA 3.3, pods fail to pull their container images after a reboot
- During the TCA 3.3.0 upgrade, a bug in cleanup-images.sh deletes both the old (3.2.0.1) and new (3.3.0) images for some critical services such as Kafka and Postgres. Because containerd permits image deletion even while pods are running, any later pod crash or reboot leads to ImagePullBackOff errors due to the missing images.
3.3
After the upgrade to version 3.3.0, a cleanup script runs to remove old TCA images. Due to a bug in the script, some off-the-shelf (OTS) images (e.g., Kafka, Postgres, Istio) that remained unchanged between versions 3.2.0.1 and 3.3.0 are also inadvertently deleted.
As a result, if any pods using these images are recreated, they fail to start and enter an ImagePullBackOff state because the corresponding images are no longer available locally.
Pods can be recreated in several scenarios, including a pod crash or an appliance reboot.
Impact:
All OTS pods, such as Istio and Kafka, fail to come back up, leaving TCA non-functional.
VMware by Broadcom is aware of this issue and is working to fix it in a TCA 3.3 patch.
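To check whether an appliance is currently showing the symptom, the pod list can be filtered for image pull failures. The following is a minimal sketch, assuming kubectl is available and configured on the appliance; list_image_pull_failures is a hypothetical helper name and is not part of the TCA tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `kubectl get pods -A` output on stdin and prints
# namespace/name for every pod whose STATUS column (4th field) reports
# ImagePullBackOff or ErrImagePull, including Init: variants.
list_image_pull_failures() {
  awk 'NR > 1 && $4 ~ /ImagePullBackOff|ErrImagePull/ { print $1 "/" $2 }'
}

# Usage (assumes kubectl is configured on the appliance):
#   kubectl get pods -A | list_image_pull_failures
```

An empty result only means that no pod has been recreated yet. The images may still be missing locally, so the workaround still applies.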
Workaround:
To fix the issue, download the patch manually and apply it on both the TCA-M and TCA-CP appliances:
tar -xvzf recover-missing-images-tca-3-3-0-patch.tar.gz
bash recover-missing-images-tca-3-3-0-patch/recover-missing-images-tca-3-3-0.sh
NOTE: MD5 and SHA256 checksums of recover-missing-images-tca-3-3-0-patch.tar.gz:
MD5: 1b88f048e2645c9978502e9a441757d0
SHA256: 85a34abfe7ffd038ad312338fa5c685105cb89331d18deeafb14d8a651a819ea
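The downloaded archive can be compared against the published SHA256 value before it is extracted. The following is a minimal sketch, assuming sha256sum (GNU coreutils) is present on the appliance; verify_sha256 is a hypothetical helper name and is not shipped with the patch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 when the SHA256 of the file in $1 matches the
# expected hex digest in $2, non-zero otherwise (including a missing file).
verify_sha256() {
  local file="$1" expected="$2" actual
  actual="$(sha256sum "$file" 2>/dev/null | awk '{ print $1 }')"
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Usage, run in the directory that holds the downloaded archive:
#   verify_sha256 recover-missing-images-tca-3-3-0-patch.tar.gz \
#     85a34abfe7ffd038ad312338fa5c685105cb89331d18deeafb14d8a651a819ea \
#     && echo "checksum OK" || echo "checksum mismatch, do not apply"
```

If the check reports a mismatch, download the archive again rather than applying it.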