When you create a Tanzu Kubernetes Grid (TKG) management or workload cluster, cluster creation appears to hang or remains stuck indefinitely.
vSphere is the underlying infrastructure provider, and the following error messages appear in the capv-controller-manager logs:
E0309 17:32:31.959993 1 machine_controller_noderef.go:152] controllers/Machine "msg"="Failed to parse ProviderID" "error"="providerID is empty" "providerID"={} "node"="tkg-mgmt-vsphere-20210309172258-md-0-86bd98f6c9-j4l56"
E0309 17:33:19.866066 1 machinehealthcheck_controller.go:477] controllers/MachineHealthCheck "msg"="Unable to retrieve machine from node" "error"="expecting one machine for node tkg-mgmt-vsphere-20210309172258-control-plane-8cqbm, got []" "node"="tkg-mgmt-vsphere-20210309172258-control-plane-8cqbm"
For TKG versions before 1.2, these error messages repeat in the logs without indicating the underlying cause.
For TKG versions 1.2 and above, the following error message appears in the capv-controller-manager logs:
E0607 00:24:51.827811 1 controller.go:248] controller-runtime/controller "msg"="Reconciler error" "error"="failed to reconcile cloud config secret for VSphereCluster tkg-system/mgmt: failed to create cloud provider secret for Cluster tkg-system/mgmt: Secret \"cloud-provider-vsphere-credentials\" is invalid: [data[https://10.16.18.51.password]: Invalid value: \"https://10.16.18.51.password\": a valid config key must consist of alphanumeric characters, '-', '_' or '.' (e.g. 'key.name', or 'KEY_NAME', or 'key-name', regex used for validation is '[-._a-zA-Z0-9]+'), data[https://10.186.198.51.username]: Invalid value: \"https://10.186.198.51.username\": a valid config key must consist of alphanumeric characters, '-', '_' or '.' (e.g. 'key.name', or 'KEY_NAME', or 'key-name', regex used for validation is '[-._a-zA-Z0-9]+')]" "controller"="vspherecluster" "name"="mgmt" "namespace"="tkg-system"
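The validation rule quoted in this error can be reproduced locally. The following is a minimal Python sketch, not the actual Kubernetes or CAPV code: it applies the regex from the message to the secret keys. The helper names (is_valid_secret_key, credential_keys) are hypothetical, and the "<server>.username" / "<server>.password" key layout is an assumption inferred from the keys shown in the log.

```python
import re

# Pattern quoted in the error message; fullmatch applies it to the whole key.
CONFIG_KEY_RE = re.compile(r"[-._a-zA-Z0-9]+")


def is_valid_secret_key(key: str) -> bool:
    """Return True if the key uses only the characters the quoted regex allows."""
    return CONFIG_KEY_RE.fullmatch(key) is not None


def credential_keys(server: str) -> list:
    """Build the two secret keys for a server value.

    Assumption: the key layout is inferred from the log output above,
    not taken from the CAPV source.
    """
    return [f"{server}.username", f"{server}.password"]


if __name__ == "__main__":
    for server in ("https://10.16.18.51", "10.16.18.51"):
        for key in credential_keys(server):
            print(key, "valid" if is_valid_secret_key(key) else "invalid")
```

The characters ':' and '/' are outside the allowed set, so any key derived from a server value that contains "https://" fails this check.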
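The invalid keys in this error begin with https://, which indicates that VSPHERE_SERVER was set with a URL scheme. Set it to the vCenter Server address only.
Correct configuration:
VSPHERE_SERVER: "10.16.18.51"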
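Before creating a cluster, a VSPHERE_SERVER value can be checked and reduced to a bare address. The following is a hedged Python sketch of one way to do this. The function normalize_vsphere_server is hypothetical and not part of TKG or its CLI; it drops any scheme, port, and path, on the assumption that only the host is wanted.

```python
from urllib.parse import urlsplit


def normalize_vsphere_server(value: str) -> str:
    """Return only the host part of a VSPHERE_SERVER value.

    Hypothetical helper: strips a URL scheme such as "https://" together
    with any port or path, and leaves a bare address unchanged.
    """
    value = value.strip()
    if "://" in value:
        host = urlsplit(value).hostname
        if not host:
            raise ValueError(f"no host found in {value!r}")
        return host
    # No scheme present: only remove a stray trailing slash.
    return value.rstrip("/")


if __name__ == "__main__":
    print(normalize_vsphere_server("https://10.16.18.51"))
    print(normalize_vsphere_server("10.16.18.51"))
```

Using urlsplit from the standard library avoids hand-written string slicing, and raising ValueError on an empty host keeps a malformed value from silently reaching the cluster configuration.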