When you remediate ESXi hosts through Update Manager (VUM) in vCenter Server 8.0, the remediation operation fails and the VUM service crashes. The issue is host-specific and might not affect every host in your environment.
You observe the following symptoms:
core.updatemgr-worke.*
Note: Core dump files indicate a VUM service crash. However, the logs do not show specific error messages about certificate thumbprint mismatch. You confirm the diagnosis when disconnect/reconnect resolves the issue. This operation refreshes the stored certificate thumbprint.
Additional symptoms reported:
VMware vCenter Server Appliance 8.0 and later (all 8.0.x releases)
During ESXi host remediation, Update Manager validates the host connection. It compares the certificate thumbprint stored in the vCenter database against the host's current certificate. This validation ensures secure communication between vCenter and the ESXi host.
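To make the comparison concrete, the following shell sketch reads the SHA-1 fingerprint of the certificate that a host currently presents and compares it with a stored value. It is an illustration only, not how Update Manager implements the check. The function names, the example host name, and the assumption that the stored value is a SHA-1 thumbprint are all hypothetical.

```shell
#!/bin/sh
# Illustrative sketch only; names and inputs are assumptions, not VUM internals.

# Normalize a thumbprint: drop any "label=" prefix, remove colons and
# whitespace, and upper-case the hex digits so two spellings compare equal.
normalize_thumbprint() {
    printf '%s' "$1" | sed 's/^.*=//' | tr -d ': \n' | tr 'a-f' 'A-F'
}

# Print the SHA-1 fingerprint of the certificate presented on <host>:443.
current_thumbprint() {
    openssl s_client -connect "$1:443" </dev/null 2>/dev/null \
        | openssl x509 -noout -fingerprint -sha1
}

# Return 0 when the two thumbprints match after normalization.
thumbprints_match() {
    [ "$(normalize_thumbprint "$1")" = "$(normalize_thumbprint "$2")" ]
}

# Example (hypothetical host name and stored value):
#   stored="AA:BB:CC:..."
#   thumbprints_match "$stored" "$(current_thumbprint esxi01.example.com)" \
#       && echo "match" || echo "mismatch"
```

A mismatch reported by a check like this corresponds to the condition that triggers the crash described in this article.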
The Update Manager process should handle certificate thumbprint mismatches gracefully. It should generate an error message and fail the remediation operation. However, when VUM encounters a mismatch, the process does not handle this exception properly. Instead of returning an error, the VUM service crashes.
This crash results in:
The thumbprint mismatch can occur when:
Before you perform the resolution, identify which ESXi hosts have the certificate thumbprint mismatch.
1. Connect to the vCenter Server Appliance shell via SSH or direct console access
2. Check for VUM service crash indicators:
ls -lh /var/core/core.updatemgr-worke*
Understanding core dump files: Core dump files are memory snapshots. The system creates them when a process crashes unexpectedly. Files matching the pattern core.updatemgr-worke.* indicate the Update Manager worker process crashed. Note the timestamp of these files. Correlate them with remediation attempts.
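3. Analyze the most recent core dump file to confirm the thumbprint validation failure. In the following command, replace ##### with the process ID from the file name found in step 2, and adjust the directory if step 2 listed the file in a different location: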
strings /var/log/vmware/vmware-updatemgr/vum-server/core.updatemgr-worke.##### | grep -i "failed to validate SSL thumbprint"
If you see the message Failed to validate SSL thumbprint of host, this confirms that the crash occurred during certificate thumbprint validation. The message indicates a mismatch between the stored thumbprint and the host's current certificate.
4. Extract the affected host identifier from the core dump:
strings /var/log/vmware/vmware-updatemgr/vum-server/core.updatemgr-worke.##### | grep "host-" | less
Look for a host identifier that appears repeatedly throughout the output. For example, host-12345. The affected host will have numerous references in the core dump. Note this host ID number.
5. Correlate the host ID to the actual ESXi hostname:
.../host-12345/...)
6. Alternatively, if you already tried remediation and observed failures, check VUM logs for the hostnames:
grep -E "host.*remediat|remediat.*host" /var/log/vmware/vmware-updatemgr/vum-server/vmware-vum-server-default.log | grep -i "fail\|error" | tail -50
7. Check that VUM service crashes match remediation attempts:
grep -i "starting\|stopped\|restart" /var/log/vmware/vmware-updatemgr/vum-server/vmware-vum-server-default.log | tail -30
If service restarts align with the core dump timestamps and remediation failures, continue with the resolution steps below.
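The manual search for a repeated host identifier in step 4 can be condensed into a frequency count. The sketch below is a hypothetical helper (the name count_host_ids is not part of any VMware tooling). It reads text on standard input, extracts every identifier of the form host-<digits>, and lists the identifiers by descending number of occurrences.

```shell
#!/bin/sh
# Hypothetical helper: count host-<digits> identifiers read from stdin
# and print them as "<count> <identifier>", most frequent first.
count_host_ids() {
    grep -o 'host-[0-9][0-9]*' | sort | uniq -c | sort -rn
}

# Usage (same core dump file as in steps 3 and 4):
#   strings <path-to-core-dump> | count_host_ids | head -5
```

The identifier at the top of the list is the most likely candidate for the affected host, but treat it as a hint and confirm it against the failed remediation task.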
Do the following steps for each affected ESXi host. This refreshes the certificate thumbprint.
1. Log in to the vSphere Client at https://vcenter-fqdn/ui
2. Go to the Hosts and Clusters view
3. Find the ESXi host that had failed remediation
4. Right-click the ESXi host and select Connection > Disconnect
5. Click Yes to confirm the disconnect operation
Note: Disconnecting a host from vCenter does not affect running virtual machines on that host. VMs continue to run normally during the disconnect and reconnect process. You do not need maintenance mode or virtual machine migration. The operation typically completes within seconds. However, HA protection and DRS may be temporarily unavailable for VMs on the disconnected host.
6. After the host status changes to Disconnected, right-click the host and select Connection > Connect
7. If prompted, enter the ESXi root credentials and click OK
8. Check that the host status shows Connected. The certificate thumbprint updates automatically during reconnection.
9. Start the Update Manager remediation operation for the host
10. Check that the remediation completes successfully. The VUM service should not crash.
If multiple hosts are affected, repeat steps 3 through 10 for each host. If you suspect that many hosts are affected, test the resolution on one host first, then apply it to the remaining hosts.
After you complete the resolution, confirm the issue is resolved.
1. Connect to the vCenter Server Appliance shell. Check that no new core dump files were created:
ls -lt /var/log/vmware/vmware-updatemgr/vum-server/core.updatemgr-worke* | head -5
2. Confirm VUM service stability. Check for unexpected restarts:
grep -i "starting\|stopped" /var/log/vmware/vmware-updatemgr/vum-server/vmware-vum-server-default.log | tail -10
The VUM service should remain running. You should not see unexpected restarts.
3. In the vSphere Client, confirm that host remediation operations complete successfully
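The core dump check in step 1 can be narrowed to files created after the fix. This is a minimal sketch, assuming you created a marker file (for example with touch) immediately before you reconnected the host. The function name, the marker file, and the directory argument are hypothetical.

```shell
#!/bin/sh
# Print Update Manager worker core dumps in directory $1 that are newer
# than the marker file $2. No output means no new crash since the marker.
new_vum_core_dumps() {
    find "$1" -maxdepth 1 -type f -name 'core.updatemgr-worke*' -newer "$2"
}

# Usage:
#   touch /tmp/vum-fix-marker        # run before the disconnect/reconnect
#   new_vum_core_dumps <core-dump-directory> /tmp/vum-fix-marker
```

Comparing against a marker avoids confusing older core dumps, which remain on disk, with new crashes.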
For more information about managing ESXi host certificates, see the ESXi Certificate Management documentation.
For more information about Update Manager operations, see the Using VMware Update Manager/Lifecycle Manager documentation.
Related articles:
Diagnostic limitations:
The diagnostic indicators in this article may also appear for other VUM issues. Core dumps and remediation failures are not unique to thumbprint mismatches. Two characteristics distinguish this issue: the "Failed to validate SSL thumbprint of host" message in the core dump, and the fact that disconnect/reconnect resolves the problem by refreshing the stored thumbprint. If disconnect/reconnect does not resolve the remediation failure, investigate other potential causes.