ESXi bootup through Auto Deploy stuck at /vmw/rbd/host/xxxxxxxxxx/waiter.tgz

Article ID: 323193

Products

VMware vCenter Server

Issue/Introduction

This article explains how to recreate a missing waiter user or restore its missing permissions.


Symptoms:

An ESXi host deployed with Auto Deploy fails to boot and remains stuck at the loading screen with an error: /vmw/rbd/host/xxxxxxxxxx/waiter.tgz

Log Location
In the /var/log/vmware/rbd/rbd-cgi.log file, you see errors similar to:
  
YYYY-MM-DDTHH:MM:SS.602 [54236]ERROR:vmcacertutil: Could not generate certificates for: abcd.example.com: 0 out: b'Error: 5, VMCAGetSignedCertificatePrivate() failedError Code : 5\nMessage :UNKNOWN\n'err: b"Operation Failed: exception <class 'vmca.vmca_exception'> not a BaseException subclass"
YYYY-MM-DDTHH:MM:SS.642[54236]ERROR:pluginmaster:exception:rbdplugins.sslcert.vmwWaiterTgz -- 0:b'Error: 5, VMCAGetSignedCertificatePrivate() failedError Code : 5\nMessage :UNKNOWN\n':b"Operation Failed: exception <class 'vmca.vmca_exception'> not a BaseException subclass"Traceback (most recent call last):

The rbd-vmca-certificate component also logs an "Operation Failed" error similar to:
YYYY-MM-DDTHH:MM:SS.445 [35946]INFO:rbd-vmca-certificate:generating certificates for: abcd.example.com, , 10.10.xx.xx, /var/lib/rbd/ssl/XXXXXXXXXXXXXXXXXXX, rui.key, rui.crt
YYYY-MM-DDTHH:MM:SS.580 [35946]ERROR:rbd-vmca-certificate:Operation Failed
Traceback (most recent call last):

Log Location
In the /var/log/vmware/vmcad/vmcad-syslog.log file, you see entries similar to:

YYYY-MM-DDTHH:MM:SS.863098-07:00 info vmcad  t@140531036313344: VMCACheckAccessKrb: Authenticated user [email protected]
YYYY-MM-DDTHH:MM:SS.867730-07:00 info vmcad  t@140531036313344: Checking upn: cn=CAAdmins,cn=Builtin,dc=vsphere,dc=local against CA admin group: [email protected]
YYYY-MM-DDTHH:MM:SS.867942-07:00 warning vmcad  t@140531036313344: error code: 0x00000005

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
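
The log checks above can be sketched as a small shell helper. This is a minimal sketch, not a VMware tool: the function name check_waiter_symptoms and its optional path arguments are illustrative assumptions. The default paths and the two search strings are taken from the excerpts in this article.

```shell
#!/bin/bash
# Sketch: count the error signatures from this article in the Auto Deploy and VMCA logs.
# check_waiter_symptoms is a hypothetical helper name, not part of any VMware tooling.

check_waiter_symptoms() {
    # Default paths are the log locations named in this article.
    local rbd_log="${1:-/var/log/vmware/rbd/rbd-cgi.log}"
    local vmca_log="${2:-/var/log/vmware/vmcad/vmcad-syslog.log}"
    local rbd_hits=0 vmca_hits=0

    # grep -c prints the number of matching lines; -F matches the text literally.
    # "|| true" keeps a zero-match result from being treated as a failure.
    if [ -r "$rbd_log" ]; then
        rbd_hits=$(grep -cF 'VMCAGetSignedCertificatePrivate() failed' "$rbd_log" || true)
    fi
    if [ -r "$vmca_log" ]; then
        vmca_hits=$(grep -cF 'error code: 0x00000005' "$vmca_log" || true)
    fi

    echo "rbd-cgi.log matches: $rbd_hits"
    echo "vmcad-syslog.log matches: $vmca_hits"

    # Succeed only when both signatures are present.
    [ "$rbd_hits" -gt 0 ] && [ "$vmca_hits" -gt 0 ]
}
```

Run without arguments on the vCenter Server Appliance, the function reads the default log paths. A zero exit status only means both signatures were found, so it is a hint that this article may apply rather than a confirmed diagnosis.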

 

 

Environment

VMware vCenter Server Appliance 6.7.x

Cause

The vCenter Server waiter user is missing or does not have the required permissions. This can occur after repairing replication issues, after a broken upgrade, or after a cross-domain repoint.
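
One way to check the permissions part of this cause is to look for a waiter account in the CAAdmins group. The helper below is a minimal sketch under stated assumptions: has_waiter_member is a hypothetical name, and it only scans whatever member listing is passed to it on standard input. On the appliance the listing might come from dir-cli (for example, a group list for CAAdmins under /usr/lib/vmware-vmafd/bin). Treat that command and its output format as assumptions to verify in your environment.

```shell
#!/bin/bash
# Sketch: report whether a waiter-* account appears in a CAAdmins member listing.
# has_waiter_member is a hypothetical helper; it reads the listing from standard input.

has_waiter_member() {
    # Match "waiter-" followed by a hex-and-dash ID, like the
    # waiter-747e2b48-... account name shown in the script output in this article.
    grep -Eiq 'waiter-[0-9a-f-]+'
}
```

The function returns success when a waiter entry is present in the input and failure otherwise. It does not confirm that the account's stored password is valid.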

Resolution

To resolve this issue, run the attached script, recreate_rbd_waiter.sh:

1. Copy the script to the affected vCenter Server Appliance using a file transfer tool such as WinSCP.
 

2. Modify permissions of the script so that it can be executed.

chmod +x recreate_rbd_waiter.sh

3. Run the script:

./recreate_rbd_waiter.sh

4. The output should be similar to the following:

# ./recreate_rbd_waiter.sh
RECREATE WAITER ACCOUNT
=======================

> Please enter password for [email protected]:
> Waiter account name detected: waiter-747e2b48-8e05-4bfa-9b9b-7c161c336369
> waiter-747e2b48-8e05-4bfa-9b9b-7c161c336369 does not exist!  Creating it...
|---- Generating password       SUCCESS!
|---- Creating the waiter account       SUCCESS!
|---- The following will succeed even if already set
|---- Add account to CAAdmins           SUCCESS!
|---- Set password to never expire      SUCCESS!
|---- Update password in database       SUCCESS!
> Script has finished.  Please restart the rbd service.

5. Restart the RBD service:

service-control --stop vmware-rbd-watchdog && service-control --start vmware-rbd-watchdog
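
Steps 2 through 5 can also be sketched as a single function. This is an illustration, not part of the attached script: apply_waiter_fix and the SERVICE_CONTROL override are hypothetical names, and the sketch assumes recreate_rbd_waiter.sh exits with a non-zero status when it fails.

```shell
#!/bin/bash
# Sketch: steps 2 to 5 of this article as one function.
# apply_waiter_fix and SERVICE_CONTROL are illustrative names, not VMware tooling.

apply_waiter_fix() {
    local script="${1:-./recreate_rbd_waiter.sh}"
    # SERVICE_CONTROL can be overridden for a dry run, for example SERVICE_CONTROL=echo.
    local svc="${SERVICE_CONTROL:-service-control}"

    # A bare file name would be looked up in PATH, so anchor it to the current directory.
    case "$script" in
        */*) ;;
        *) script="./$script" ;;
    esac

    if [ ! -f "$script" ]; then
        echo "Script not found: $script" >&2
        return 1
    fi

    chmod +x "$script" || return 1    # step 2: make the script executable
    "$script" || return 1             # step 3: run it (prompts for the SSO password)

    # step 5: restart the RBD service only if the script succeeded
    "$svc" --stop vmware-rbd-watchdog && "$svc" --start vmware-rbd-watchdog
}
```

Calling apply_waiter_fix from the directory that holds the script runs the full sequence. Setting SERVICE_CONTROL=echo prints the restart commands instead of running them, which is useful for a rehearsal.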
 


Workaround:
N/A

Additional Information

Impact/Risks:
No Impact

Attachments

recreate_rbd_waiter