After a vCenter Server certificate replacement or an import of trusted roots into vCenter Server, the vSphere Auto Deploy service fails to load the new certificates, and errors similar to the following are logged:
YYYY-MM-DDTHH:MM:SS [pool-20-thread-1 [] ERROR com.vmware.certificatemanagement.notifications.AsyncNotifier opId=] Failed to notify AUTODEPLOY on http://localhost:1080/external-vecs/http1/localhost/6501/vmw/rbd/config/refresh-certificates, retrying again.
YYYY-MM-DDTHH:MM:SS [pool-20-thread-1 [] ERROR com.vmware.certificatemanagement.notifications.AsyncNotifier opId=] Failed to notify AUTODEPLOY on http://localhost:1080/external-vecs/http1/localhost/6501/vmw/rbd/config/refresh-certificates, retrying again.
YYYY-MM-DDTHH:MM:SS [pool-20-thread-1 [] ERROR com.vmware.certificatemanagement.notifications.AsyncNotifier opId=] Failed to notify AUTODEPLOY on http://localhost:1080/external-vecs/http1/localhost/6501/vmw/rbd/config/refresh-certificates, retrying again.
YYYY-MM-DDTHH:MM:SS [pool-20-thread-1 [] ERROR com.vmware.certificatemanagement.notifications.AsyncNotifier opId=] Final error while notifying AUTODEPLOY on http://localhost:1080/external-vecs/http1/localhost/6501/vmw/rbd/config/refresh-certificates
java.lang.Exception: Failed to notify AUTODEPLOY on http://localhost:1080/external-vecs/http1/localhost/6501/vmw/rbd/config/refresh-certificates HTTP Error code: 503 Failed HTTP error message : Service Unavailable ErrorStream: no healthy upstream
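To confirm this symptom quickly, the notifier messages can be tallied per service. The following is a minimal Python sketch, not a VMware tool: the function name summarize_notifier_errors is hypothetical, and the patterns are derived only from the excerpt above, so other builds may word these messages differently. It takes any iterable of log lines, so no log file location is assumed.

```python
import re

# Patterns derived from the log excerpt above (assumption: wording is stable).
SERVICE = re.compile(r"(?:Failed to notify|Final error while notifying) (\w+) on \S+")
HTTP_CODE = re.compile(r"HTTP Error code: (\d{3})")


def summarize_notifier_errors(lines):
    """Tally retries, final errors and HTTP codes per notified service."""
    summary = {}
    for line in lines:
        match = SERVICE.search(line)
        if not match:
            continue
        entry = summary.setdefault(
            match.group(1),
            {"retries": 0, "final_errors": 0, "http_codes": set()},
        )
        if "retrying again" in line:
            entry["retries"] += 1
        elif "Final error" in line:
            entry["final_errors"] += 1
        # The exception line carries the HTTP status returned for the request.
        code = HTTP_CODE.search(line)
        if code:
            entry["http_codes"].add(int(code.group(1)))
    return summary
```

Several retries followed by a final error and an HTTP 503 for AUTODEPLOY match the pattern described in this article.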
The Auto Deploy (rbd) service is not running:
# service-control --status rbd
Stopped:
rbd
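When this check is scripted, the status output can be parsed rather than read by eye. The sketch below assumes the output always has the shape shown above: a state header ending in a colon, followed by service names. The function names parse_service_status and is_stopped are hypothetical helpers, not part of service-control.

```python
def parse_service_status(output):
    """Map each state header (for example 'Stopped') to the services under it."""
    states = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):
            # A header such as "Stopped:" opens a new group of service names.
            current = line[:-1]
            states.setdefault(current, [])
        elif current is not None:
            states[current].extend(line.split())
    return states


def is_stopped(output, service="rbd"):
    """Return True if the service is listed under the 'Stopped' header."""
    return service in parse_service_status(output).get("Stopped", [])
```

The text passed in would be the captured output of service-control --status rbd.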
The Auto Deploy service was stopped with the service-control --stop --all command:
/var/log/vmware/cloudvm/service-control.log
YYYY-MM-DDTHH:MM:SS INFO service-control ********** Start ['--stop', '--all', '--ignore'] **********
The unsubscribe operation failed during the shutdown triggered by the service-control --stop --all command:
/var/log/vmware/rbd/rbd-watchdog-linux.log
YYYY-MM-DDTHH:MM:SS [251670:MainThread]INFO:rbd_watchdog_linux:Unsubsribing from NDC
YYYY-MM-DDTHH:MM:SS [251670:MainThread]ERROR:rbd_watchdog_linux:Failed to unsubscribe from NDC
Traceback (most recent call last):
File "/var/lib/rbd/bin/rbd_watchdog_linux.py", line 486, in main
refreshcertsutil.NonDisruptiveCerts.unsubscribe()
File "bora/install/vmvisor/autodeploy/site-packages/vmware/rbd/utils/refreshcertsutil.py", line 164, in unsubscribe
File "bora/install/vmvisor/autodeploy/site-packages/vmware/rbd/utils/vapiutil.py", line 179, in createVsphereClient
File "bora/install/vmvisor/autodeploy/site-packages/vmware/rbd/utils/svcaccountutil.py", line 176, in getStsHokSamlAssertion
File "/usr/lib/vmware/site-packages/pyVim/ssov2.py", line 72, in get_hok_saml_assertion_for_service_user
hok_token = self.perform_request(soap_message, public_key, private_key,
File "/usr/lib/vmware/site-packages/pyVim/sso.py", line 264, in perform_request
webservice.endheaders()
File "/usr/lib/python3.10/http/client.py", line 1278, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib/python3.10/http/client.py", line 1038, in _send_output
self.send(msg)
File "/usr/lib/python3.10/http/client.py", line 976, in send
self.connect()
File "/usr/lib/python3.10/http/client.py", line 942, in connect
self.sock = self._create_connection(
File "/usr/lib/python3.10/socket.py", line 845, in create_connection
raise err
File "/usr/lib/python3.10/socket.py", line 833, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
vCenter Server 7.x
vCenter Server 8.x
This issue occurs because the Auto Deploy (rbd) service remains subscribed to the certificatemanagement service even after it has stopped.
Subscriptions to the certificatemanagement service are created when a service starts and removed when the service stops.
The certificatemanagement service notifies its subscribers whenever a certificate is renewed or imported.
However, when running the service-control --stop --all command, the certificatemanagement service stops before the rbd service.
This causes the unsubscribe operation to fail, leaving the subscription active.
As a result, the certificatemanagement service keeps sending certificate update notifications to the stopped rbd service. These notifications fail with HTTP error 503 and trigger an alarm.
Broadcom VCF engineering is aware of this issue and is working towards a fix.
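The ordering problem described above can be illustrated with a toy model. This is not the actual vCenter implementation: the class CertificateManagement and the function simulate_stop are hypothetical, and the only behaviour modelled is that an unsubscribe request is refused once the certificatemanagement service is down.

```python
class CertificateManagement:
    """Toy stand-in for the service that keeps the subscriber list."""

    def __init__(self):
        self.running = True
        self.subscribers = set()

    def subscribe(self, name):
        self.subscribers.add(name)

    def unsubscribe(self, name):
        if not self.running:
            # Mirrors the "Connection refused" seen in the rbd watchdog log.
            raise ConnectionRefusedError("certificatemanagement is stopped")
        self.subscribers.discard(name)


def simulate_stop(order):
    """Stop services in the given order and return the leftover subscribers."""
    certmgmt = CertificateManagement()
    certmgmt.subscribe("rbd")  # created when rbd starts
    for service in order:
        if service == "certificatemanagement":
            certmgmt.running = False
        elif service == "rbd":
            try:
                certmgmt.unsubscribe("rbd")
            except ConnectionRefusedError:
                pass  # the failure is only logged; rbd stops anyway
    return certmgmt.subscribers
```

In this model, stopping certificatemanagement first leaves rbd in the subscriber set, while stopping rbd first leaves the set empty. The same reasoning suggests why stopping rbd on its own, while certificatemanagement is running, clears the stale subscription.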
Workaround:
To work around this issue, start and then stop the Auto Deploy service manually:
service-control --start rbd && service-control --stop rbd
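If the workaround has to be run from a script, the same two commands can be wrapped as in the sketch below. The function name run_workaround and its run parameter are hypothetical; passing check=True stops after a failed start, which mirrors the && in the command above.

```python
import subprocess


def run_workaround(run=subprocess.run):
    """Start and then stop rbd, aborting if either command fails."""
    commands = [
        ["service-control", "--start", "rbd"],
        ["service-control", "--stop", "rbd"],
    ]
    for command in commands:
        # check=True raises CalledProcessError on a non-zero exit status.
        run(command, check=True)
    return commands
```

The run parameter exists only so the sequence can be exercised without touching a real appliance. Afterwards, service-control --status rbd should again report the service as stopped.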