Supervisor configuration operations failing after replacing vCenter certificate

Article ID: 313182

Products

VMware vCenter Server

Issue/Introduction

Symptoms:

/var/log/vmware/wcp/wcpsvc.log on vCenter shows repeated x509 errors about an unknown certificate authority. Two examples:

error wcp [kubelifecycle/master_node.go:1280] [opID=659cfc6a-b22e5481-d124-4a9e-95aa-3b88aeb6abfb] Failed to remove VMs from list view. Err Post "https://vcenter-1.rainpole.local:443/sdk": x509: certificate signed by unknown authority
...
error wcp [storage/policy.go:87] [opID=upgrade-domain-c19] Failed to get list of cluster datastores. Err Post "https://sc2-10-186-87-18.eng.vmware.com:443/sdk": x509: certificate signed by unknown authority


The error signature of relevance is the following:

Err Post "https://vcenter-1.rainpole.local:443/sdk": x509: certificate signed by unknown authority

This indicates that the WCP service (wcpsvc) is unable to trust vCenter's certificate.
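To confirm the symptom quickly, the log can be searched for the error signature. The following is a minimal sketch, not an official tool: the function name count_x509_errors is hypothetical, and it assumes the log lives at the path named above.

```shell
#!/bin/sh
# Hypothetical helper: count log lines that carry the x509 error signature.
# Usage: count_x509_errors [logfile]
# The default path is the wcpsvc log location named in this article.
count_x509_errors() {
    log_file="${1:-/var/log/vmware/wcp/wcpsvc.log}"
    # -F matches the pattern as a fixed string; -c prints the number of matching lines.
    grep -cF 'x509: certificate signed by unknown authority' "$log_file"
}
```

Running count_x509_errors with no argument on the vCenter Server Appliance prints the number of matching lines in the current log file. A non-zero count after a certificate replacement is consistent with this issue.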

Cause

vCenter 8.0U2 introduced non-disruptive certificate replacement, which allows certificates to be replaced without restarting services. See https://core.vmware.com/resource/whats-new-vsphere-8-update-2#sec31672-sub3 for more information. wcpsvc, the service that manages vSphere Supervisors, is one of the services that participates in this mechanism.

When the vCenter certificate is replaced, wcpsvc reloads the trusted root certificates from /etc/ssl/certs on disk and uses them to initiate TLS connections to vCenter. This reload works around a Golang limitation: system trust roots are cached once loaded (see the Golang issue "Add ability to reload root certificates").

Occasionally, this certificate reload does not detect the updated vCenter trust bundles, and wcpsvc vCenter clients are no longer able to establish TLS connections to vCenter. This results in wcpsvc being unable to manage Supervisors or talk to vCenter.
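One way to check the on-disk side of this is to ask openssl whether the certificate chain presented by vCenter verifies against /etc/ssl/certs. The sketch below rests on assumptions: the helper name chain_verified is hypothetical, openssl is assumed to be available on the appliance, and /etc/ssl/certs is assumed to be a hashed CA directory usable with -CApath. If the chain verifies on disk while wcpsvc still logs the x509 error, that would be consistent with wcpsvc holding stale trust roots, although it does not prove it.

```shell
#!/bin/sh
# Hypothetical helper: read `openssl s_client` output on stdin and succeed
# only if openssl reported a successful chain verification.
chain_verified() {
    grep -q 'Verify return code: 0 (ok)'
}

# Example invocation (the hostname is a placeholder taken from the log sample):
#   openssl s_client -connect vcenter-1.rainpole.local:443 \
#       -CApath /etc/ssl/certs </dev/null 2>/dev/null | chain_verified \
#       && echo "on-disk trust store verifies vCenter" \
#       || echo "on-disk trust store does NOT verify vCenter"
```

The helper only interprets openssl's own verdict, so it says nothing about the roots wcpsvc has loaded in memory.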

Resolution

Currently there is no resolution. This will be fixed in a future release.


Workaround:

To work around the issue, restart wcpsvc on vCenter, either through appliance management or as root on the vCenter Server Appliance. The restart forces wcpsvc to reload the updated certificates.

If restarting via appliance management, restart the 'Workload Control Plane' service.

If restarting as root on vCenter, run:
vmon-cli -r wcp
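
After the restart, it can be useful to confirm that no new x509 errors are being logged. The following is a rough sketch under stated assumptions: the function name new_x509_errors is hypothetical, and it assumes the log file is not rotated between the two measurements.

```shell
#!/bin/sh
# Hypothetical helper: count x509 signature matches that appear after a given
# line number in the log, i.e. only in lines written since that point.
# Usage: new_x509_errors <logfile> <line_count_before_restart>
new_x509_errors() {
    # tail -n +K starts output at line K, so skip the first $2 lines.
    tail -n "+$(( $2 + 1 ))" "$1" \
        | grep -cF 'x509: certificate signed by unknown authority'
}

# Example flow on the appliance (assumes the log is not rotated meanwhile):
#   log=/var/log/vmware/wcp/wcpsvc.log
#   before=$(wc -l < "$log")
#   vmon-cli -r wcp
#   ...retry the failing Supervisor operation...
#   new_x509_errors "$log" "$before"    # a count of 0 suggests the reload worked
```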