Unable to put a host in maintenance mode.
vpxd.log shows errors similar to:
2020-12-21T12:43:56.848-08:00 info vpxd[10034] [Originator@6876 sub=MoHost opID=opId-18b14-105289-d9] WCP exitMaintenanceMode vAPI returns error: Error:
--> com.vmware.vapi.std.errors.unauthenticated
--> Messages:
--> vapi.security.authentication.invalid<Unable to authenticate user>
-->
2020-12-21T12:43:56.851-08:00 error vpxd[10034] [Originator@6876 sub=MoHost opID=opId-18b14-105289-d9] [Delete] Failed to delete vAPI session. Error:
--> Error:
--> com.vmware.vapi.std.errors.unauthenticated
--> Messages:
--> vapi.security.authentication.invalid<Unable to authenticate user>
.. .. ..
2020-12-21T12:43:56.860-08:00 info vpxd[10034] [Originator@6876 sub=Default opID=opId-18b14-105289-d9] [VpxLRO] -- ERROR task-6215 -- host-9421 -- vim.HostSystem.enterMaintenanceMode: vim.fault.InvalidState:
--> Result:
--> (vim.fault.InvalidState) {
--> faultCause = (vmodl.MethodFault) null,
--> faultMessage = (vmodl.LocalizableMessage) [
--> (vmodl.LocalizableMessage) {
--> key = "com.vmware.cdrs.maintenancemode.wcp.entermaintenancemode",
--> arg = <unset>,
--> message = <unset>
SDDC Manager workflows fail with the error:
FAILED_TO_GET_WCP_CLUSTER_STATUS Failed to get Workload Management cluster status for vCenter <VC_FQDN>
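To confirm that a vCenter is hitting this signature, the log can be searched for the two strings quoted above. The following is a minimal sketch, not an official check: the function name find_wcp_auth_errors is hypothetical, and the log location mentioned in the comment is an assumption about a typical vCenter appliance.

```shell
# Print the lines of a vpxd.log file that match the error signatures
# quoted in the log excerpt above.
# $1: path to the log file. On a vCenter appliance this is commonly
#     /var/log/vmware/vpxd/vpxd.log (an assumption; adjust as needed).
find_wcp_auth_errors() {
    grep -E 'vapi\.security\.authentication\.invalid|com\.vmware\.cdrs\.maintenancemode\.wcp' "$1"
}

# Example (run on the vCenter appliance):
#   find_wcp_auth_errors /var/log/vmware/vpxd/vpxd.log
```

If the function prints nothing, the log does not contain either signature and the failure may have a different cause.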
/usr/lib/vmware-vmafd/bin/vmafd-cli get-machine-id --server-name localhost
hostname -f
3. Generate WCP solution user key:
/usr/lib/vmware-vmca/bin/certool --server localhost --genkey --privkey=/tmp/wcp.key --pubkey=/tmp/wcp.pub
4. Generate WCP solution user certificate:
/usr/lib/vmware-vmca/bin/certool --server=localhost --genCIScert --privkey=/tmp/wcp.key --cert=/tmp/wcp.crt --Name=wcp --Hostname=<VC_FQDN>
5. Get the WCP service name using dir-cli (default name: wcp-<machine id>):
/usr/lib/vmware-vmafd/bin/dir-cli service list
6. Update the WCP service with the new WCP certificate:
/usr/lib/vmware-vmafd/bin/dir-cli service update --name <insert wcp service name from the service list> --cert /tmp/wcp.crt
7. Delete the WCP solution user entry from VECS store:
/usr/lib/vmware-vmafd/bin/vecs-cli entry delete --store wcp --alias wcp -y
/usr/lib/vmware-vmafd/bin/vecs-cli force-refresh
8. Add the new WCP solution user certificate to the VECS store:
/usr/lib/vmware-vmafd/bin/vecs-cli entry create --store wcp --alias wcp --cert /tmp/wcp.crt --key /tmp/wcp.key
9. Verify that the WCP certificate is updated. The Subject should contain the unique CN as updated in wcp.cfg, and the issue and expiration dates should be new:
/usr/lib/vmware-vmafd/bin/vecs-cli entry getcert --store wcp --alias wcp --text
10. Restart services on the vCenter:
service-control --stop --all && service-control --start --all
11. Retry the workflow that was previously failing due to WCP errors.
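The lookup in step 5 and the verification in step 9 can be wrapped in two small shell helpers. This is a sketch under stated assumptions, not part of the official procedure: the function names wcp_service_name and cert_summary are hypothetical, the first assumes the service uses the default wcp-<machine id> naming, and the second assumes openssl is available and that vecs-cli entry getcert without --text prints the certificate in PEM form.

```shell
# Extract the WCP service name from `dir-cli service list` output on stdin.
# Assumes the default naming scheme wcp-<machine id>; prints the first match.
wcp_service_name() {
    grep -Eo 'wcp-[0-9A-Za-z-]+' | head -n 1
}

# Print the subject and validity dates of a PEM certificate read from stdin.
cert_summary() {
    openssl x509 -noout -subject -dates
}

# Example usage on the vCenter appliance (dir-cli may prompt for SSO
# administrator credentials):
#   /usr/lib/vmware-vmafd/bin/dir-cli service list | wcp_service_name
#   /usr/lib/vmware-vmafd/bin/vecs-cli entry getcert --store wcp --alias wcp | cert_summary
```

The notBefore and notAfter values printed by cert_summary should reflect the newly issued certificate rather than the old one.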
Impact/Risks:
WARNING: This process changes the solution user registration and certificate, which are stored in VMDIR. It is highly recommended to take offline snapshots of all vCenters in the SSO domain before attempting the steps in the resolution.
This issue is being checked by Diagnostics for VMware Cloud Foundation.
The check is as follows: