Auto Deploy boot stuck at waiter.tgz: ValueError Days Left for Certificate Expiry

Article ID: 437637

Products

VMware vCenter Server

Issue/Introduction

  • Stateless ESXi host boot remains stuck at the end of the boot process at waiter.tgz

  • The /var/log/vmware/rbd/rbd-cgi.log file on the vCenter Server reports that the VMCA root certificate is approaching expiration, showing errors similar to the following: 


YYYY:MM:DDTHH:MM:SS.Z [1821]INFO:director:request -- Request[{'headers_out': {}, 'uri': '/4EZ63AtKeW2NKm4h/vmw/rbd/host/d#######################/waiter.tgz', 'environ': {'REQUEST_METHOD': 'GET', 'QUERY_STRING': ''}, 'rfile': <_io.BufferedReader name=15>, 'headers_in': <http.client.HTTPMessage object at 0x7f2344368810>, 'remote_ip': '1#.##.##.##', 'request': <socket.socket fd=15, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 35069), raddr=('127.0.0.1', 43168)>}]
YYYY:MM:DDTHH:MM:SS.Z [1821]INFO:pxe_context:active rule sets -- ['RuleSet-10']
YYYY:MM:DDTHH:MM:SS.Z [1821]INFO:pxe_context:finding host d#######################
YYYY:MM:DDTHH:MM:SS.Z [1821]INFO:pxe_context:d#######################: host identifier 'uuid=f#######-4###-e###-####-###########' maps to user-specified host moref: ('host-3##', 1)
YYYY:MM:DDTHH:MM:SS.Z [1821]INFO:pxe_context:d#######################: host identifier 'mac=00:##:##:##:##:##' maps to user-specified host moref: None
YYYY:MM:DDTHH:MM:SS.Z [1821]INFO:pxe_context:d#######################: found the following host mappings -- ['host-3##', None, None]
YYYY:MM:DDTHH:MM:SS.Z [1821]INFO:pxe_context:host ID d####################### maps to Vim.HostSystem (1, 'host-3##')
YYYY:MM:DDTHH:MM:SS.Z [1821]INFO:waitertgz:Preparing waitertgz for kernelversion : 6.5.0
YYYY:MM:DDTHH:MM:SS.Z [1821]INFO:beacon:Finding address family preference based on vcAddress(HOSTNAME.DOMAIN)
YYYY:MM:DDTHH:MM:SS.Z [1821]INFO:beacon:Sorting the list with IPv4 addresses first.
YYYY:MM:DDTHH:MM:SS.Z [1821]INFO:beacon:Adding etc/vmware/autodeploy/waiterNotify.json
YYYY:MM:DDTHH:MM:SS.Z [1821]INFO:hostprofile:admin password is in the answer file for -- host-3##
YYYY:MM:DDTHH:MM:SS.Z [1821]INFO:sslutil:cert files are missing from /var/lib/rbd/ssl/host-3##
YYYY:MM:DDTHH:MM:SS.Z [1821]INFO:sslutil:cert files are missing from /var/lib/rbd/ssl/d#######################
YYYY:MM:DDTHH:MM:SS.Z [1821]INFO:sslcert:Using VC_FQDN as the expected hostname
YYYY:MM:DDTHH:MM:SS.Z [1821]INFO:sslutil:cert files are missing from /var/lib/rbd/ssl/host-3##
YYYY:MM:DDTHH:MM:SS.Z [1821]INFO:sslcert:Generating SSL cert for d####################### (VC_FQDN)
YYYY:MM:DDTHH:MM:SS.Z [1821]INFO:sslcert:Validating certificate checks for hostId: d#######################
YYYY:MM:DDTHH:MM:SS.Z [1821]INFO:sslutil:Days left for expiry: 202 days
YYYY:MM:DDTHH:MM:SS.Z [1821]ERROR:pluginmaster:exception:rbdplugins.sslcert.vmwWaiterTgz -- The days left for certificate expiry is less than the threshold value, Days_left:202, Configured_threshold:240
Traceback (most recent call last):
  File "bora/install/vmvisor/autodeploy/site-packages/vmware/rbd/utils/pluginmaster.py", line 236, in _curry
  File "bora/install/vmvisor/autodeploy/var/rbdplugins/sslcert.py", line 249, in vmwWaiterTgz
  File "bora/install/vmvisor/autodeploy/site-packages/vmware/rbd/utils/sslutil.py", line 133, in validateCertFiles
  File "bora/install/vmvisor/autodeploy/site-packages/vmware/rbd/utils/sslutil.py", line 115, in validateCert
ValueError: The days left for certificate expiry is less than the threshold value, Days_left:202, Configured_threshold:240
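
To confirm that a host is hitting this condition, the log can be scanned for the error text shown above. The following is a minimal Python sketch, not a VMware-provided tool: the function names find_threshold_errors and scan_log are hypothetical, and the regular expression assumes the message format matches the excerpt.

```python
import re

# Matches the tail of the error message shown in the log excerpt,
# e.g. "Days_left:202, Configured_threshold:240".
_THRESHOLD_ERROR = re.compile(r"Days_left:(\d+), Configured_threshold:(\d+)")


def find_threshold_errors(lines):
    """Return the unique (days_left, threshold) pairs found in log lines."""
    found = []
    for line in lines:
        match = _THRESHOLD_ERROR.search(line)
        if match:
            pair = (int(match.group(1)), int(match.group(2)))
            if pair not in found:
                found.append(pair)
    return found


def scan_log(path="/var/log/vmware/rbd/rbd-cgi.log"):
    """Scan the Auto Deploy CGI log (path taken from this article)."""
    with open(path, errors="replace") as handle:
        return find_threshold_errors(handle)
```

Calling scan_log() on the vCenter Server returns an empty list when no such error has been logged. Any returned pair shows the days remaining and the threshold that was in effect at the time.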

 

 

Environment

  • VMware vCenter Server 7.x 
  • VMware vSphere ESXi 8.x

Cause

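In vSphere 7.x, the vCenter Server signs host certificates for ESXi hosts booting via Auto Deploy. If the VMCA signing certificate has fewer than 240 days of validity left (the threshold controlled by the advanced setting vpxd.certmgmt.certs.softThreshold), certificate generation fails and the host never gets past waiter.tgz. In the log excerpt above, the certificate has 202 days left against a configured threshold of 240, so the validation raises a ValueError.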
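
The check behind this failure can be pictured as a simple comparison. The sketch below is a Python illustration under stated assumptions, not the actual sslutil.py code: the function names are hypothetical, and the strict less-than comparison is inferred from the wording of the error message.

```python
from datetime import datetime, timezone

# Default value of vpxd.certmgmt.certs.softThreshold, per this article.
DEFAULT_THRESHOLD_DAYS = 240


def days_until_expiry(not_after, now=None):
    """Whole days between now and the certificate's notAfter time (both timezone-aware)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return (not_after - now).days


def check_threshold(days_left, threshold=DEFAULT_THRESHOLD_DAYS):
    """Raise ValueError when days_left is below the threshold (assumed comparison)."""
    if days_left < threshold:
        raise ValueError(
            "The days left for certificate expiry is less than the threshold "
            f"value, Days_left:{days_left}, Configured_threshold:{threshold}"
        )
    return days_left
```

With the values from the log excerpt, check_threshold(202) raises the error, while check_threshold(202, threshold=30) passes. This is why lowering the setting lets the host boot.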

Resolution

Workaround: To temporarily bypass the issue and allow ESXi hosts to boot, you can lower the certificate expiration threshold:

  1. Log in to the vSphere Client.

  2. Select the vCenter Server in the inventory.

  3. Navigate to Configure > Advanced Settings.

  4. Click Edit Settings and filter for vpxd.certmgmt.certs.softThreshold.

  5. Modify the value to 30 (or a value lower than the current days remaining on the certificate).

  6. Click Save.

  7. Reboot the affected ESXi host and verify that it boots past the waiter.tgz stage.
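
The value chosen in step 5 depends on how many days the certificate has left. As a rough aid, the following Python sketch picks a temporary threshold. The function name pick_temporary_threshold is hypothetical, and the rule it applies (use 30 when possible, otherwise one day less than the days remaining) is only one reading of the guidance above.

```python
def pick_temporary_threshold(days_left, preferred=30):
    """Suggest a temporary softThreshold value below the days remaining.

    days_left: days remaining on the certificate, as reported in the log.
    preferred: value suggested by this article when the certificate allows it.
    """
    if days_left <= 1:
        # Assumption: with so little validity left, renewal is the only sensible path.
        raise ValueError("certificate is at or near expiry; renew it instead")
    if preferred < days_left:
        return preferred
    return days_left - 1
```

For the 202 days reported in the log excerpt, this returns 30. Because the days remaining shrink daily, any lowered value is only a stopgap until the certificate is renewed.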

Permanent Resolution: To permanently resolve the issue, perform a full VMCA root certificate renewal. Once the certificate is renewed, revert the vpxd.certmgmt.certs.softThreshold advanced setting to its default value of 240.

Additional Information

For further guidance on managing and renewing the VMCA root certificate, refer to the Broadcom TechDocs topic Regenerate a New VMCA Root Certificate and Replace All Certificates.