The autoscaler pod keeps crashing with the following error:
F0829 13:56:06.574078 1 clusterapi_provider.go:205] could not find preferred version for CAPI group "cluster.x-k8s.io": failed to get ServerGroups: Get "https://##.##.##.##/api?timeout=32s": net/http: TLS handshake timeout
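To confirm that a crashing pod shows this exact symptom, its log can be checked for the fatal line above. The following is a minimal sketch: the helper name `has_capi_tls_timeout` is hypothetical, and the pod name and namespace in the usage comment are placeholders that are not given in this article.

```shell
# Returns 0 if stdin contains the CAPI discovery failure caused by a TLS handshake timeout.
has_capi_tls_timeout() {
  grep -q 'could not find preferred version for CAPI group.*TLS handshake timeout'
}

# Hypothetical usage against a live cluster (placeholders, adjust to your environment):
#   kubectl logs --previous <autoscaler-pod> -n <namespace> | has_capi_tls_timeout && echo "symptom matches"

# Self-contained check using the log line quoted in this article.
sample='F0829 13:56:06.574078 1 clusterapi_provider.go:205] could not find preferred version for CAPI group "cluster.x-k8s.io": failed to get ServerGroups: Get "https://##.##.##.##/api?timeout=32s": net/http: TLS handshake timeout'
printf '%s\n' "$sample" | has_capi_tls_timeout && echo "symptom matches"
```

`kubectl logs --previous` reads the log of the last terminated container, which is usually the useful one when a pod is in a crash loop.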
vSphere with Tanzu 8.0.3 with tkg-service starting from v3.0.0
Isolated networks between the Supervisor and the guest cluster
The autoscaler pod cannot reach the Supervisor API server through the floating IP. This IP points to a randomly selected Control Plane (CP) node of the Supervisor.
The CP node receives the packet but tries to respond through the additional NIC. The reply leaves on a different path than the request arrived on, and this asymmetric routing breaks the connection.
This is a known issue. A workaround is available internally through this KB article.
Fixed in:
vCenter v8.0.3 and tkg-service v3.3.0
Action:
Open a case with Broadcom Support and an engineer will assist you with the workaround steps.
kubectl patch pkgi autoscaler -n tkg-system --type=json -p='[{"op": "remove", "path": "/metadata/annotations/ext.packaging.carvel.dev~1ytt-paths-from-secret-name.0"}]'
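The `path` in the patch above uses JSON Pointer syntax (RFC 6901), in which `~1` is the escape for `/` and `~0` is the escape for `~`. The command therefore removes the annotation whose key is `ext.packaging.carvel.dev/ytt-paths-from-secret-name.0`. The sketch below only decodes the escaped token to make that visible; the function name `decode_pointer_token` is hypothetical and the snippet does not touch the cluster.

```shell
# Decode one JSON Pointer reference token (RFC 6901).
# "~1" is replaced before "~0" so that "~01" decodes to "~1" rather than "/".
decode_pointer_token() {
  printf '%s\n' "$1" | sed -e 's|~1|/|g' -e 's|~0|~|g'
}

# Prints the annotation key targeted by the patch:
# ext.packaging.carvel.dev/ytt-paths-from-secret-name.0
decode_pointer_token 'ext.packaging.carvel.dev~1ytt-paths-from-secret-name.0'
```

The escape is needed because an unescaped `/` in the annotation key would be read as a path separator and the patch would point at a nested field that does not exist.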