TKGS VMOP User account password unlock and reset procedure

Article ID: 305325

Products

VMware vSphere ESXi VMware vSphere Kubernetes Service

Issue/Introduction

This article describes how to check whether the VMOP user account password is in sync between vCenter and the Supervisor Cluster, how to unlock the account as a workaround, and how to reset the password manually if unlocking does not resolve the issue and the passwords are out of sync.

Symptoms:
The VMOP user account is used by vSphere with Tanzu to manage VM object creation and management in vCenter. If this account is locked or the password is not in sync, it can cause failures to create new TKGS objects, leading to failures in Guest Cluster creation/update/management operations. 

You might see errors like the following from VMOP pod logs on the Supervisor Cluster:

2022-05-16T01:00:35.979221848Z stderr F E0429 04:20:35.979126       1 contentsource_controller.go:309] controllers/ContentSource "msg"="failed to difference images" "error"="login failed for url: https://<VCENTER_FQDN>:443/sdk: ServerFaultCode: Cannot complete login due to an incorrect user name or password."

In vCenter server, the /var/log/vmware/vmdird/vmdird-syslog.log might show errors like:

2022-05-16T01:00:53.884683+00:00 warning vmdird  t@140502791870208: LoginBlocked DN (cn=wcp-vmop-user-domain-<ClusterID>-<VC_MachineID>,cn=serviceprincipals,dc=vsphere,dc=local), error (9241)(Account access blocked)
2022-04-29T04:41:53.886225+00:00 info vmdird  t@140502791870208: Bind failed () (9241)
2022-04-29T04:41:58.884787+00:00 err vmdird  t@140502791870208: VmDirSendLdapResult: Request (Bind), Error (LDAP_INVALID_CREDENTIALS(49)), Message (), (0) socket (127.0.0.1)
2022-04-29T04:41:58.884890+00:00 err vmdird  t@140502791870208: Bind Request Failed (127.0.0.1) error 49: Protocol version: 3, Bind DN: "CN=wcp-vmop-user-domain-<ClusterID>-<VC_MachineID>,cn=ServicePrincipals,dc=vsphere,dc=local", Method: SASL
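
The lockout messages above can be picked out of a large vmdird-syslog.log programmatically. The following is a minimal sketch, not a supported tool: the function name find_blocked_vmop_accounts is hypothetical, and the pattern is inferred only from the sample lines in this article, so other vCenter builds may format these messages differently.

```python
import re

# Assumption: the message format matches the sample vmdird-syslog.log lines
# shown in this article ("LoginBlocked DN (cn=wcp-vmop-user-...,").
LOGIN_BLOCKED = re.compile(
    r"LoginBlocked DN \((cn=wcp-vmop-user-[^,]+),", re.IGNORECASE
)


def find_blocked_vmop_accounts(lines):
    """Return the distinct wcp-vmop-user account names reported as LoginBlocked."""
    blocked = []
    for line in lines:
        match = LOGIN_BLOCKED.search(line)
        if match:
            # Drop the leading "cn=" to keep only the account name.
            account = match.group(1).split("=", 1)[1]
            if account not in blocked:
                blocked.append(account)
    return blocked
```

On vCenter, the function could be fed an open file handle for /var/log/vmware/vmdird/vmdird-syslog.log, since it accepts any iterable of lines.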


Environment

VMware vSphere 7.0 with Tanzu

Cause

The password sync and lockout failure is a very rare condition, and the root cause is still under investigation.

Resolution

Currently, the resolution is to wait for the vmop account password sync to run on its automated schedule, which triggers every 12 hours. If the issue is blocking time-critical operations, apply the workaround below to speed up the process.

Workaround:

Check VMOP user account lock status:

 

1. From vCenter SSH: Check the wcp log and gather the wcp-vmop account ID. The log is located on vCenter at /var/log/vmware/wcp/wcpsvc.log

- Example of wcp-vmop user ID: wcp-vmop-user-domain-*****-########-####-####-####-############

- The ***** is the ClusterID on which WCP was built. The ########-####-####-####-############ is the vCenter MachineID
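
To make the naming convention above concrete, here is a small sketch that splits an account ID into its ClusterID and vCenter MachineID parts. The helper name split_vmop_account is hypothetical, and the pattern assumes the ClusterID contains no hyphen and the MachineID is a standard 8-4-4-4-12 hexadecimal UUID, as in the examples in this article.

```python
import re

# Assumption: ClusterID has no "-" and MachineID is an 8-4-4-4-12 hex UUID.
ACCOUNT_ID = re.compile(
    r"^wcp-vmop-user-domain-(?P<cluster_id>[^-]+)-"
    r"(?P<machine_id>[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})$"
)


def split_vmop_account(account):
    """Return (cluster_id, machine_id) parsed from a wcp-vmop account ID."""
    match = ACCOUNT_ID.match(account)
    if match is None:
        raise ValueError("not a recognised wcp-vmop account ID: %r" % account)
    return match.group("cluster_id"), match.group("machine_id")
```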

2. From Supervisor VM: Check vmop pod logs on Supervisor cluster to see if they're reporting login failures:

# kubectl logs -n vmware-system-vmop vmware-system-vmop-controller-manager-<POD_ID> -c manager | less
 
- Note that there are 3 vmop controller manager pods but only 1 leader. You might need to adjust the POD_ID to get the logs from the current leader.


3. From vCenter SSH: Check /var/log/vmware/vmdird/vmdird-syslog.log to see if account is locked. You will see messages like the following if it is:
 
2022-04-29T04:41:53.884683+00:00 warning vmdird  t@140502791870208: LoginBlocked DN (cn=wcp-vmop-user-domain-<ClusterID>-<VC_MachineID>,cn=serviceprincipals,dc=vsphere,dc=local), error (9241)(Account access blocked)
2022-04-29T04:41:53.886225+00:00 info vmdird  t@140502791870208: Bind failed () (9241)
2022-04-29T04:41:58.884787+00:00 err vmdird  t@140502791870208: VmDirSendLdapResult: Request (Bind), Error (LDAP_INVALID_CREDENTIALS(49)), Message (), (0) socket (127.0.0.1)
2022-04-29T04:41:58.884890+00:00 err vmdird  t@140502791870208: Bind Request Failed (127.0.0.1) error 49: Protocol version: 3, Bind DN: "CN=wcp-vmop-user-domain-<ClusterID>-<VC_MachineID>,cn=ServicePrincipals,dc=vsphere,dc=local", Method: SASL

4. If the account is reported as locked, check the user with dir-cli from vCenter SSH:

# /usr/lib/vmware-vmafd/bin/dir-cli user find-by-name --account wcp-vmop-user-domain-*****-########-####-####-####-############ --level 2
 
- Output will look like:
 
    Account: wcp-vmop-user-domain-*****-########-####-####-####-############
    UPN: wcp-vmop-user-domain-c1006-9e3ac6d5-5116-4dee-9b7e-b9d066402e95@VSPHERE.LOCAL
    Account disabled: FALSE
    Account locked: TRUE
    Password never expires: FALSE
    Password expired: FALSE
    Password expiry: 9998 day(s) 19 hour(s) 57 minute(s) 58 second(s)
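
If the dir-cli output needs to be checked from a script, the "Key: value" lines can be parsed as in the sketch below. The function names are hypothetical, and the field labels are taken only from the sample output above, so treat them as assumptions for other versions.

```python
def parse_dir_cli_user(output):
    """Parse 'Key: value' lines from dir-cli user find-by-name output."""
    fields = {}
    for line in output.splitlines():
        # Split on the first colon only; values such as the expiry text
        # or the UPN are kept whole.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def is_account_locked(fields):
    """True when the parsed output reports 'Account locked: TRUE'."""
    return fields.get("Account locked", "").upper() == "TRUE"
```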

5. If the account shows as locked, use the following command to unlock it. Note that the shell passes everything between <<EOF and the final EOF line to the command as input:
 
# /opt/likewise/bin/ldapmodify -x -D cn=Administrator,cn=Users,dc=vsphere,dc=local -W <<EOF
dn: CN=wcp-vmop-user-domain-*****-########-####-####-####-############,CN=ServicePrincipals,dc=vsphere,dc=local
changetype: modify
replace: userAccountControl
userAccountControl: 0
EOF


 
- If the account locks again after manually unlocking it, check the below steps to ensure the passwords match between vCenter and Supervisor Cluster. If they don't match, reset the password to force a sync.
 



Check VMOP User Account Sync and reset password manually if needed:

 

1. Check user account password in Supervisor Cluster secrets:
 
# kubectl get secrets wcp-vmop-sa-vc-auth -n vmware-system-vmop -o jsonpath='{.data.password}'|base64 -d

2. Check user account password in Database: 
 
- This procedure requires creating a new file to query the secured database. We will copy an existing file and edit it to query the vmop password.

- Use the following command to copy the required file for editing:

# cp /usr/lib/vmware-wcp/decryptK8Pwd.py /usr/lib/vmware-wcp/decryptvmop.py
 
- Edit the new file and change the following:

- FROM:
   
    cur = conn.cursor()
    res = cur.execute(
        '''select cluster, master_mgmt_ip, password from cluster_db_configs''')
    rows = cur.fetchall()
    for row in rows:
        pt = decrypt(row[2], key)
        print("Cluster: %s" % row[0])
        print("IP: %s" % row[1])
        print("PWD: %s" % pt)

        print("-" * 60 + "\n")

if __name__ == '__main__':
    main()

 
- TO:
 
    cur = conn.cursor()
    res = cur.execute(
        '''select cluster, vmoperator_svcacct_pwd from cluster_db_configs''')
    rows = cur.fetchall()
    for row in rows:
        pt = decrypt(row[1], key)
        print("Cluster: %s" % row[0])
        print("vmoperator_svcacct_pwd: %s" % pt)
        print("-" * 60 + "\n")

if __name__ == '__main__':
    main()


 
- Save and quit the file.
 
- Change the file to executable:

# chmod +x /usr/lib/vmware-wcp/decryptvmop.py
 
- Execute the file to show the vmop user password, and compare the output with the output from Step 1:

# /usr/lib/vmware-wcp/decryptvmop.py

 
 
Example Output:
 
Read key from file

Connected to PSQL

Cluster: domain-c8-c161be69-d8d8-4b86-972c-55503fd5573d
vmoperator_svcacct_pwd: n6g0N>,P ue!7v<+.^yu
------------------------------------------------------------
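
The comparison between the Supervisor Cluster secret and the database password can also be done in a script rather than by eye, which helps when the password contains spaces or characters that are easy to misread. This is a sketch under stated assumptions: secret_matches_database is a hypothetical helper, and it expects the raw, still base64-encoded {.data.password} value (the Step 1 command without the trailing base64 -d) together with the plaintext printed by decryptvmop.py.

```python
import base64
import hmac


def secret_matches_database(secret_b64, database_password):
    """Compare the base64-encoded secret value with the plaintext DB password."""
    secret_plain = base64.b64decode(secret_b64)
    # compare_digest does a constant-time comparison of the two byte strings.
    return hmac.compare_digest(secret_plain, database_password.encode("utf-8"))
```

A result of False corresponds to the out-of-sync case handled in the next step.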

 
3. If the passwords from Step 1 and Step 2 do not match, use the password returned by the /usr/lib/vmware-wcp/decryptvmop.py script to update the secret in the Supervisor Cluster:
 
- From Supervisor Node SSH, use base64 to encode the password output from the decryptvmop.py command:
 
# echo -n 'n6g0N>,P ue!7v<+.^yu' | base64
 
Example Output:

bjZnME4+LFAgdWUhN3Y8Ky5eeXU=
 
- Edit the wcp-vmop-sa-vc-auth secret in the vmware-system-vmop namespace:

# kubectl edit secret wcp-vmop-sa-vc-auth -n vmware-system-vmop
 
 
- Replace the value of the password: entry with the base64 output from the previous step, then type :wq to write and quit.
Example:
 
password: LTk4NDMtMWFhMGQzZjU0NzQ3
 
Changes to:
password: bjZnME4+LFAgdWUhN3Y8Ky5eeXU=
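
The encoding step uses echo -n so that no trailing newline ends up inside the stored password. As an illustration of why that matters, the sketch below performs the same encoding in Python. The helper name encode_secret_value and the sample strings are hypothetical and are not values from a real environment.

```python
import base64


def encode_secret_value(password):
    """Base64-encode a plaintext password for a Kubernetes secret data field."""
    return base64.b64encode(password.encode("utf-8")).decode("ascii")


# A trailing newline changes the encoded value, so the secret would no longer
# match the password stored in vCenter.
without_newline = encode_secret_value("abc")    # "YWJj"
with_newline = encode_secret_value("abc\n")     # "YWJjCg=="
```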
 
 
4. Check the Supervisor Cluster secret to ensure the password has been changed:
 

# kubectl get secrets wcp-vmop-sa-vc-auth -n vmware-system-vmop -o jsonpath='{.data.password}'|base64 -d
 
 
5. Finally, scale the VMOP pods down and then back up from the Supervisor Cluster so that they pick up the new password:
 

- Use the following command from Supervisor Cluster to scale down vmop pods:

# kubectl scale deployment -n vmware-system-vmop vmware-system-vmop-controller-manager --replicas=0


- Scale the vmop pods back to 3 with the following command:

# kubectl scale deployment -n vmware-system-vmop vmware-system-vmop-controller-manager --replicas=3
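
The two scale commands above can be wrapped in a short script so they always run in the right order. This is only a sketch: the function names are hypothetical, it assumes kubectl is on the PATH of the Supervisor node session, and it does not wait for the pods to terminate or become ready between the two calls.

```python
import subprocess

NAMESPACE = "vmware-system-vmop"
DEPLOYMENT = "vmware-system-vmop-controller-manager"


def scale_command(replicas):
    """Build the kubectl scale command used in this article."""
    return [
        "kubectl", "scale", "deployment",
        "-n", NAMESPACE, DEPLOYMENT,
        "--replicas=%d" % replicas,
    ]


def restart_vmop(run=subprocess.run):
    """Scale the vmop deployment to 0, then back to 3 replicas."""
    for replicas in (0, 3):
        # check=True raises CalledProcessError if kubectl exits non-zero.
        run(scale_command(replicas), check=True)
```

Calling restart_vmop() with no arguments runs both commands. The run parameter exists so the sequence can be inspected without touching a cluster.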

 

PLEASE NOTE: If authentication failures persist with LDAP error 49 after the password reset, confirm that the user account has been UNLOCKED. If the user account is locked, the password change will fail.

 

Additional Information

Impact/Risks:
If the VMOP account is locked, or if the password is out of sync, vSphere with Tanzu Supervisor Clusters will not be able to request resource creation or updates from vCenter. This will lead to failures in creating or managing new or existing TKGS cluster objects. 

The user might see a completely healthy Supervisor Cluster report, but will not be able to create, upgrade, or modify TKGS guest clusters.