ESX hosts booting from Auto-Deploy are stuck at /vmw/rbd/host/##########/waiter.tgz

Article ID: 411647

Products

VMware vCenter Server

Issue/Introduction

  • An ESXi host deployed with Auto Deploy fails to boot and stays stuck at the loading screen, showing: /vmw/rbd/host/##########/waiter.tgz

  • In the vCenter /var/log/vmware/rbd/rbd-cgi.log file, an error similar to the following is seen:

err: Operation Failed: _PyErr_SetObject: exception <class 'vmca.vmca_exception'> is not a BaseException subclass
YYYY-MM-DDTHH:MM:SS [1752679:Thread-1 (process_request_worker)]ERROR:pluginmaster:exception:rbdplugins.sslcert.vmwWaiterTgz -- 0:Error: 5, VMCAGetSignedCertificatePrivate() failedError Code : 5
Message :UNKNOWN
:Operation Failed: _PyErr_SetObject: exception <class 'vmca.vmca_exception'> is not a BaseException subclass
Traceback (most recent call last):
  File "bora/install/vmvisor/autodeploy/site-packages/vmware/rbd/utils/pluginmaster.py", line 236, in _curry
  File "bora/install/vmvisor/autodeploy/var/rbdplugins/sslcert.py", line 234, in vmwWaiterTgz
  File "bora/install/vmvisor/autodeploy/site-packages/vmware/rbd/utils/vmcacertutil.py", line 149, in generateCert
  File "bora/install/vmvisor/autodeploy/site-packages/vmware/rbd/utils/vmcacertutil.py", line 82, in _handleVmcaUtilError
Exception: 0:Error: 5, VMCAGetSignedCertificatePrivate() failedError Code : 5
Message :UNKNOWN
:Operation Failed: _PyErr_SetObject: exception <class 'vmca.vmca_exception'> is not a BaseException subclass
YYYY-MM-DDTHH:MM:SS [1752679:Thread-1 (process_request_worker)]WARNING:waitertgz:retrying waiter tgz because of rc: [None, None, None, None, None], except: [Exception("0:Error: 5, VMCAGetSignedCertificatePrivate() failedError Code : 5\nMessage :UNKNOWN\n:Operation Failed: _PyErr_SetObject: exception <class 'vmca.vmca_exception'> is not a BaseException subclass")]
YYYY-MM-DDTHH:MM:SS [1752679:Thread-1 (process_request_worker)]INFO:director: - - "GET /vmw/rbd/host/effd2a0ffab9903d2657df8e6197ab60/waiter.tgz HTTP/1.1" 503 -

  • In the vCenter /var/log/vmware/vmcad/vmcad-syslog.log file, entries similar to the following are seen:

YYYY-MM-DDTHH:MM:SS info vmcad YYYY-MM-DDTHH:MM:SS.715 [vmcad][INFO] [RPC] Exiting RpcVMCAGetSignedCertificate, Status = 5
YYYY-MM-DDTHH:MM:SS info vmcad YYYY-MM-DDTHH:MM:SS.599 [vmcad][INFO] [OPID :RPC] Entering RpcVMCAGetSignedCertificate
YYYY-MM-DDTHH:MM:SS info vmcad YYYY-MM-DDTHH:MM:SS.604 [vmcad][INFO] Checking upn: cn=CAAdmins,cn=Builtin,dc=vsphere,dc=local against CA admin group: autodeploy-########-####-####-####-############@vsphere.local
YYYY-MM-DDTHH:MM:SS info vmcad YYYY-MM-DDTHH:MM:SS.604 [vmcad][INFO] Checking user's group: CN=ServiceProviderUsers,DC=vsphere,DC=local against CA admin group: cn=CAAdmins,cn=Builtin,dc=vsphere,dc=local
YYYY-MM-DDTHH:MM:SS info vmcad YYYY-MM-DDTHH:MM:SS.604 [vmcad][WARNING] [lotus/vmca/service/auth.c:VMCALdapAccessCheck:92] error code: 0x00000005
YYYY-MM-DDTHH:MM:SS info vmcad YYYY-MM-DDTHH:MM:SS.604 [vmcad][INFO] VMCACheckAccessKrb: Access denied as user is not administrator
YYYY-MM-DDTHH:MM:SS info vmcad YYYY-MM-DDTHH:MM:SS.604 [vmcad][WARNING] [lotus/vmca/service/rpcserv.c:VMCACheckAccess:111] error code: 0x00000005
YYYY-MM-DDTHH:MM:SS info vmcad YYYY-MM-DDTHH:MM:SS.604 [vmcad][WARNING] [lotus/vmca/service/rpcserv.c:RpcVMCAGetSignedCertificate:372] error code: 0x00000005
YYYY-MM-DDTHH:MM:SS info vmcad YYYY-MM-DDTHH:MM:SS.604 [vmcad][INFO] [RPC] Exiting RpcVMCAGetSignedCertificate, Status = 5

  • Validation with an LDAP browser such as JXplorer confirms that the "autodeploy-########-####-####-####-############" user is missing from the CAAdmins group.
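The two log signatures above can also be checked from a shell on the vCenter. The following is a minimal sketch, not an official diagnostic: the function name check_vmca_denied and the choice of match strings are illustrative assumptions, while the default file paths are the ones named in this article.

```shell
# Sketch: report whether both log signatures from this article are present.
# check_vmca_denied is a hypothetical helper name; the default paths are the
# log files named in the article and can be overridden by arguments.
check_vmca_denied() {
    rbd_log="${1:-/var/log/vmware/rbd/rbd-cgi.log}"
    vmcad_log="${2:-/var/log/vmware/vmcad/vmcad-syslog.log}"
    found=0

    # Auto Deploy side: the certificate request made for waiter.tgz fails.
    if grep -qF 'VMCAGetSignedCertificatePrivate() failed' "$rbd_log" 2>/dev/null; then
        echo "rbd-cgi.log: VMCA certificate request failure found"
        found=$((found + 1))
    fi

    # VMCA side: the caller is rejected because it is not a CA administrator.
    if grep -qF 'VMCACheckAccessKrb: Access denied' "$vmcad_log" 2>/dev/null; then
        echo "vmcad-syslog.log: access denied entry found"
        found=$((found + 1))
    fi

    # Succeed only when both signatures are present.
    [ "$found" -eq 2 ]
}
```

Calling check_vmca_denied with no arguments reads the default paths. A zero exit status only means that both strings appear somewhere in the logs, so the timestamps should still be compared against the failed boot attempt.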

Environment

  • vCenter 7.x
  • vCenter 8.x

Cause

This issue occurs when the "autodeploy-########-####-####-####-############" user is missing from the CAAdmins group. The membership may be missing after a domain repoint has been performed on the vCenter.
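As an alternative to an LDAP browser, the group membership can be inspected with the same dir-cli utility used in the Resolution, through its group list subcommand. The sketch below assumes the listing has first been saved to a file (for example by copying the output of /usr/lib/vmware-vmafd/bin/dir-cli group list --name CAAdmins). The helper name has_autodeploy_member is hypothetical, and the exact layout of the listing is not shown in this article, so the check only looks for the "autodeploy-" prefix.

```shell
# Sketch: decide from a saved CAAdmins listing whether an autodeploy-* user
# is a member. has_autodeploy_member is a hypothetical helper; it assumes
# only that each member entry contains the user name somewhere on its line.
has_autodeploy_member() {
    listing="$1"
    # Case-insensitive match so "CN=autodeploy-..." and "cn=autodeploy-..." both count.
    grep -qi 'autodeploy-' "$listing"
}

# Example (the file name is an assumption):
#   has_autodeploy_member /tmp/caadmins.txt \
#       && echo "autodeploy user present" \
#       || echo "autodeploy user missing"
```

A "missing" result is consistent with the cause described above. A "present" result points to a different cause.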

Resolution

Note: Take a snapshot of the vCenter before proceeding with the steps below. If the vCenter is in Enhanced Linked Mode (ELM), refer to the article "VMware vCenter in Enhanced Linked Mode pre-changes snapshot (online or offline) best practice" for snapshot best practices.

Add the missing "autodeploy-########-####-####-####-############" user to the CAAdmins group by performing the following steps:

    1. SSH into the affected vCenter.
    2. Run the following command, substituting the actual autodeploy user name from your environment:
      /usr/lib/vmware-vmafd/bin/dir-cli group modify --name CAAdmins --add autodeploy-########-####-####-####-############
    3. The host should then continue to boot into ESXi.
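The user name in step 2 is masked in this article, so the real value has to be looked up first. One possible way, sketched below, is to take it from the vmcad-syslog.log entries shown in the Issue section and print the resulting command for review before running it. The function names find_autodeploy_user and print_add_command are hypothetical, and the ID is assumed to consist of hexadecimal digits and hyphens, as the masked name suggests.

```shell
# Sketch: recover the full autodeploy user name from vmcad-syslog.log and
# print (not run) the dir-cli command from step 2. Both function names are
# illustrative and not part of any VMware tooling.
find_autodeploy_user() {
    log="${1:-/var/log/vmware/vmcad/vmcad-syslog.log}"
    # Keep the most recent autodeploy-<id> token. The ID is assumed to be
    # hexadecimal digits and hyphens, so the match stops before "@vsphere.local".
    grep -o 'autodeploy-[0-9A-Fa-f-]*' "$log" 2>/dev/null | tail -n 1
}

print_add_command() {
    user="$1"
    # Refuse to build a command when no user name was found.
    [ -n "$user" ] || return 1
    echo "/usr/lib/vmware-vmafd/bin/dir-cli group modify --name CAAdmins --add $user"
}

# Example:
#   print_add_command "$(find_autodeploy_user)"
```

Printing the command instead of executing it keeps the change a deliberate manual step, in line with the snapshot note above.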