vSphere Supervisor operations hang or fail to complete
- Namespace deletion fails to complete.
- VKS cluster deployments are stuck in provisioning.
- On the vCenter, /var/log/vmware/wcpsvc.log may show repeated retry messages for the failing deployment:
Masked Output:
YYYY-MM-DDThh:mm:ss.xxxxZ debug wcp [kubelifecycle/kube_instance.go:4395] [opID=########-########-####-####-####-############] Cluster is not ready yet, would retry in 1m0s time.
YYYY-MM-DDThh:mm:ss.xxxxZ debug wcp [kubelifecycle/kube_instance.go:4395] [opID=########-########-####-####-####-############] Cluster is not ready yet, would retry in 1m0s time.
YYYY-MM-DDThh:mm:ss.xxxxZ debug wcp [kubelifecycle/kube_instance.go:4395] [opID=########-########-####-####-####-############] Cluster is not ready yet, would retry in 1m0s time.
YYYY-MM-DDThh:mm:ss.xxxxZ debug wcp [kubelifecycle/kube_instance.go:4395] [opID=########-########-####-####-####-############] Cluster is not ready yet, would retry in 1m0s time.
YYYY-MM-DDThh:mm:ss.xxxxZ debug wcp [kubelifecycle/kube_instance.go:4395] [opID=########-########-####-####-####-############] Cluster is not ready yet, would retry in 1m0s time.
- /var/log/vmware/vmdird/vmdird contains errors similar to the following:
YYYY-MM-DDTHH:MM:SS:t@################:WARNING: Lockout policy check - account lockout. (cn=wcp-storage-user-########-####-####-####-############-########-####-####-####-############,cn=serviceprincipals,dc=tanzu,dc=local)
YYYY-MM-DDTHH:MM:SS:t@################:ERROR: VdirPasswordFailEvent from user(cn=wcp-storage-user-########-####-####-####-############-########-####-####-####-############,cn=serviceprincipals,dc=tanzu,dc=local), error(0)()
YYYY-MM-DDTHH:MM:SS:t@################:ERROR: VmDirSendLdapResult: Request (Bind), Error (LDAP_INVALID_CREDENTIALS(49)), Message ((49)(SASL step failed.)), (0) socket (127.0.0.1)
YYYY-MM-DDTHH:MM:SS:t@################:ERROR: Bind Request Failed (127.0.0.1) error 49: Protocol version: 3, Bind DN: "CN=wcp-storage-user-########-####-####-####-############-########-####-####-####-############,cn=ServicePrincipals,dc=tanzu,dc=local", Method: SASL
YYYY-MM-DDTHH:MM:SS:t@################:ERROR: SASLSessionStep: sasl error (-13)(SASL(-13): authentication failure: client evidence does not match what we calculated. Probably a password error)
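To check quickly whether a vCenter shows this signature, the vmdird log can be searched for lockout and failed-bind entries that name a wcp-storage-user principal. The following is a minimal sketch, not a Broadcom-provided tool: the function name count_wcp_bind_failures is hypothetical, and the log path is passed as an argument rather than hard-coded because the exact file name should be confirmed on the appliance.

```shell
#!/bin/sh
# Hypothetical helper (not part of vCenter): count log lines that report an
# account lockout, password failure, or failed bind for a wcp-storage-user
# service principal.
# Usage: count_wcp_bind_failures <path-to-vmdird-log>
count_wcp_bind_failures() {
    log_file="$1"
    # First grep keeps only lines that mention the wcp-storage-user principal.
    # Second grep counts (-c) those that match one of the failure messages.
    # "|| true" keeps the exit status 0 when the count is 0.
    grep 'wcp-storage-user' "$log_file" \
        | grep -c -E 'account lockout|VdirPasswordFailEvent|Bind Request Failed' \
        || true
}
```

A non-zero count suggests the authentication failures described above; a count of 0 suggests the stuck operation has a different cause.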
vSphere with Tanzu Supervisor
The wcp-storage-user password is used by the wcp service and the cns-driver of the Supervisor cluster to perform volume management operations.
This issue occurs when the password has expired or is out of sync between the Supervisor and vCenter, or when another issue prevents the user from authenticating successfully.
A resync operation can be triggered by restarting the wcp service using the following command:
service-control --restart wcp
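One way to judge whether the restart cleared the condition is to record the length of the vmdird log beforehand and then count only the failed binds appended afterwards. The sketch below assumes that approach; the function name new_wcp_bind_failures is hypothetical, the log path is supplied by the caller, and the result is unreliable if the log rotates between the two measurements.

```shell
#!/bin/sh
# Hypothetical helper (not part of vCenter): count "Bind Request Failed"
# lines for a wcp-storage-user principal that were appended to a log after
# a previously recorded line count.
# Usage: new_wcp_bind_failures <path-to-log> <line-count-before-restart>
new_wcp_bind_failures() {
    log_file="$1"
    lines_before="$2"
    # tail -n +K prints from line K onward, so start one past the old length.
    tail -n "+$((lines_before + 1))" "$log_file" \
        | grep 'wcp-storage-user' \
        | grep -c 'Bind Request Failed' \
        || true
}

# Example flow on the vCenter (LOG is a placeholder for the vmdird log path):
#   before=$(wc -l < "$LOG")
#   service-control --restart wcp
#   ...wait for the service to come back and retry...
#   new_wcp_bind_failures "$LOG" "$before"
```

If the count stays at 0 after the service has had time to retry, the resync likely succeeded; if new failures keep appearing, continue with the KB referenced below.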
If the errors persist after executing the restart, please refer to the following KB for additional steps in unlocking and/or resyncing the passwords:
CSI: Correct sync between CSI pod secret and workload_storage_management user password in vSphere with Tanzu
Should any issues or complications occur during remediation, please open a Broadcom support case.