Symptoms:
After a vCenter upgrade, patch, or other vCenter maintenance operation, new Telco Cloud Automation (TCA) cluster operations fail or do not complete.
Other symptoms may include messages like the following in the capv-controller-manager logs:
I0924 21:50:02.070872 1 vimmachine.go:147] "capv-controller-manager/vspheremachine-controller/Cluster-NameSpace/Cluster-Name-control-plane-gnx88-htrkg: waiting for ready state"
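To check whether this message is present, the controller logs can be searched with a sketch like the one below. It assumes the controller runs as the `capv-controller-manager` deployment in the `capv-system` namespace (the names used in the restart command in this article) and that kubectl is pointed at the cluster hosting it. The function name `capv_waiting_lines` and the 500-line tail are hypothetical choices for illustration.

```shell
# Hypothetical helper: print recent CAPV controller log lines that show
# a machine stuck waiting for ready state.
# Assumes the deployment and namespace names used in this article.
capv_waiting_lines() {
  kubectl logs deploy/capv-controller-manager -n capv-system \
    --all-containers=true --tail=500 |
    grep "waiting for ready state"
}
```

Calling `capv_waiting_lines` prints the matching lines, or nothing (with a non-zero exit status) if no such message appears in the sampled log lines.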
2.x
The Cluster API Provider for vSphere (CAPV) loses its connection to the vCenter API and, in some instances, is unable to restore it. This is a known issue in CAPV. For additional details, refer to "TKG VMs Not Provisioned in vSphere - status.ready not found vSphereVM".
TCA 3.0+ with TKG 2.3.1+
Restart the CAPV controller with the command below. The restart has no impact on existing clusters.
kubectl rollout restart deploy/capv-controller-manager -n capv-system
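After the restart, it may help to wait for the rollout to finish before retrying the failed TCA operation. The sketch below wraps the restart command above with `kubectl rollout status`. The function name `restart_capv` and the 300-second timeout are illustrative assumptions, not values from this article.

```shell
# Hypothetical helper: restart the CAPV controller and wait for the new pod
# to become ready. Namespace and deployment names are taken from the
# restart command in this article; the timeout is an assumed value.
restart_capv() {
  ns="capv-system"
  target="deploy/capv-controller-manager"
  # Only wait on the rollout if the restart request itself succeeded.
  kubectl rollout restart "$target" -n "$ns" &&
    kubectl rollout status "$target" -n "$ns" --timeout=300s
}
```

Running `restart_capv` returns a non-zero exit status if either the restart or the rollout wait fails, which makes it straightforward to check before retrying the cluster operation.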