ESXi bootup through Auto-Deploy stuck at /vmw/rbd/host/######/waiter.tgz

Article ID: 323193

Products

VMware vCenter Server

Issue/Introduction

  • An ESXi host deployed with Auto Deploy fails to boot and remains stuck at the loading screen at: /vmw/rbd/host/#####/waiter.tgz
  • /var/log/vmware/rbd/rbd-cgi.log in vCenter shows the following entries:

    YYYY-MM-DDTHH:MM:SS.602 [54236]ERROR:vmcacertutil: Could not generate certificates for: name.example.com: 0 out: b'Error: 5, VMCAGetSignedCertificatePrivate() failedError Code : 5\nMessage :UNKNOWN\n'err: b"Operation Failed: exception <class 'vmca.vmca_exception'> not a BaseException subclass"
    YYYY-MM-DDTHH:MM:SS.642[54236]ERROR:pluginmaster:exception:rbdplugins.sslcert.vmwWaiterTgz -- 0:b'Error: 5, VMCAGetSignedCertificatePrivate() failedError Code : 5\nMessage :UNKNOWN\n':b"Operation Failed: exception <class 'vmca.vmca_exception'> not a BaseException subclass"Traceback (most recent call last):
    YYYY-MM-DDTHH:MM:SS.445 [35946]INFO:rbd-vmca-certificate:generating certificates for: abcd.example.com, , 10.10.##.##, /var/lib/rbd/ssl/#######, rui.key, rui.crt
    YYYY-MM-DDTHH:MM:SS.580 [35946]ERROR:rbd-vmca-certificate:Operation Failed
    Traceback (most recent call last):

  • /var/log/vmware/vmcad/vmcad-syslog.log in vCenter shows the following entries:

    YYYY-MM-DDTHH:MM:SS.863098-07:00 info vmcad  t@140531036313344: VMCACheckAccessKrb: Authenticated user waiter-########@vsphere.local
    YYYY-MM-DDTHH:MM:SS.867730-07:00 info vmcad  t@140531036313344: Checking upn: cn=CAAdmins,cn=Builtin,dc=vsphere,dc=local against CA admin group: waiter-########@vsphere.local
    YYYY-MM-DDTHH:MM:SS.867942-07:00 warning vmcad  t@140531036313344: error code: 0x00000005

    Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
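To confirm quickly whether these signatures are present on a given appliance, a small helper such as the following may be useful. It is a minimal sketch: the log paths and match strings are taken from the excerpts above, while the function name count_matches is hypothetical and not part of any VMware tooling.

```shell
#!/bin/bash
# count_matches is a hypothetical helper (not a VMware tool).
# It prints how many lines of a log file contain a fixed string,
# and prints 0 when the file is missing or unreadable.
count_matches() {
    local signature="$1" logfile="$2"
    if [ -r "$logfile" ]; then
        # -c counts matching lines, -F treats the signature as a literal string.
        # grep exits non-zero on zero matches, so the status is neutralized.
        grep -cF -- "$signature" "$logfile" || true
    else
        echo 0
    fi
}

# Log paths as given in this article.
RBD_LOG=/var/log/vmware/rbd/rbd-cgi.log
VMCA_LOG=/var/log/vmware/vmcad/vmcad-syslog.log

echo "rbd-cgi.log VMCA signing failures: $(count_matches 'VMCAGetSignedCertificatePrivate() failed' "$RBD_LOG")"
echo "vmcad-syslog.log error code 5 warnings: $(count_matches 'error code: 0x00000005' "$VMCA_LOG")"
```

Non-zero counts in both files are consistent with the symptoms described in this article, but they do not prove the cause on their own. Review the surrounding log lines before proceeding.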

Environment

VMware vCenter Server Appliance 6.7.x

Cause

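The vCenter Server waiter account (shown as waiter-########@vsphere.local in the logs above) is missing or does not have the required permissions, such as membership in the CAAdmins group. This can occur after replication issues are repaired, after a failed upgrade, or after a cross-domain repoint.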
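Before running the fix, it may help to check whether any waiter account is currently listed in the CAAdmins group. The sketch below assumes that the dir-cli utility is available at /usr/lib/vmware-vmafd/bin/dir-cli on the appliance and that its group listing contains the account name somewhere on each member line. The function name filter_waiter_members is hypothetical.

```shell
#!/bin/bash
# filter_waiter_members is a hypothetical helper (not a VMware tool).
# It reads a group-membership listing on stdin and prints only the
# lines that mention a waiter account (case-insensitive).
# Exit status: 0 if at least one waiter entry was found, 1 otherwise.
filter_waiter_members() {
    grep -i 'waiter-'
}

# Intended use on the appliance (assumption: dir-cli is at this path
# and asks for the SSO administrator password interactively):
#
#   /usr/lib/vmware-vmafd/bin/dir-cli group list --name CAAdmins | filter_waiter_members
```

If the command prints nothing and returns a non-zero status, no waiter entry was found in the listing, which matches the cause described above. Because the output is piped, the password prompt may not be visible, so treat an empty result with caution and rerun the dir-cli command without the filter if in doubt.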

Resolution

  1. Download the script attached to this KB and upload it to the affected vCenter Server Appliance using a tool such as WinSCP. See "How to upload or download files to or from vCenter and ESXi hosts" for more information.

  2. Modify permissions of the script so that it can be executed:

       chmod +x recreate_rbd_waiter.sh

  3. Run the script with the following command:

       ./recreate_rbd_waiter.sh

    Output:

      ./recreate_rbd_waiter.sh
       RECREATE WAITER ACCOUNT
       =======================
       > Please enter password for [email protected]:
       > Waiter account name detected: waiter-7######8-8##5-4###-9###-7##########9
       > waiter-7######8-8##5-4###-9###-7##########9 does not exist!  Creating it...
       |---- Generating password       SUCCESS!
       |---- Creating the waiter account       SUCCESS!
       |---- The following will succeed even if already set
       |---- Add account to CAAdmins           SUCCESS!
       |---- Set password to never expire      SUCCESS!
       |---- Update password in database       SUCCESS!
       Script has finished.  Please restart the rbd service.

  4. Restart the RBD service:

    service-control --stop vmware-rbd-watchdog && service-control --start vmware-rbd-watchdog

Attachments

recreate_rbd_waiter