"ESXi cannot connect to the KMS server" error on Skyline Health after vCenter appliance certificate change

Article ID: 381893

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:

After the certificate is renewed on the vCenter appliance, vSAN Skyline Health reports the error "Failed to get host encryption health result".

The /var/log/vmware/vsan-health/vmware-vsan-health-service.log file on vCenter contains entries similar to the following:

ERROR vsan-mgmt[412291] [VsanVapiUtil::GetVapiConfigStubBySolUser opID=06e96157] Fail to connect vAPI by solution user vpxd-extension
Traceback (most recent call last):
  File "bora/vsan/health/vpxd/pyMoVsan/VsanVapiUtil.py", line 161, in GetVapiConfigStubBySolUser
  File "bora/vsan/health/vpxd/pyMoVsan/VsanVapiUtil.py", line 140, in _getConfigStubBySolUser
  File "bora/vsan/health/vpxd/pyMoVsan/VsanVapiUtil.py", line 103, in _getSamlToken
  File "/usr/lib/vmware/site-packages/pyVim/sso.py", line 388, in get_hok_saml_assertion
  File "/usr/lib/vmware/site-packages/pyVim/sso.py", line 277, in perform_request
pyVim.sso.SoapException: SoapException:
faultcode: ns0:FailedAuthentication
faultstring: Invalid credentials
faultxml: <?xml version='1.0' encoding='UTF-8'?><S:Envelope xmlns:S="http://schemas.xmlsoap.org/soap/envelope/"><S:Body><S:Fault><faultcode xmlns:ns0="http://docs.oasis-open.org/ws-sx/ws-trust/200512">ns0:FailedAuthentication</faultcode><faultstring>Invalid credentials</faultstring></S:Fault></S:Body></S:Envelope>
vsan-mgmt[412291] [VsanHealthEncUtil::GenerateEncryptionHealthSummaryForKmx opID=06e96157] Error when GetVpxdHostProviderInfo:
Traceback (most recent call last):
  File "bora/vsan/health/esx/pyMo/VsanHealthEncUtil.py", line 533, in GenerateEncryptionHealthSummaryForKmx
  File "bora/vsan/clusterconfig/vpxd/pyMoVsan/VsanVcEncryption.py", line 470, in GetVpxdHostProviderInfo
  File "bora/vsan/health/vpxd/pyMoVsan/VsanVapiUtil.py", line 250, in GetConfigStub
  File "bora/vsan/health/vpxd/pyMoVsan/VsanVapiUtil.py", line 164, in GetVapiConfigStubBySolUser
  File "bora/vsan/health/vpxd/pyMoVsan/VsanVapiUtil.py", line 161, in GetVapiConfigStubBySolUser
  File "bora/vsan/health/vpxd/pyMoVsan/VsanVapiUtil.py", line 140, in _getConfigStubBySolUser
  File "bora/vsan/health/vpxd/pyMoVsan/VsanVapiUtil.py", line 103, in _getSamlToken
  File "/usr/lib/vmware/site-packages/pyVim/sso.py", line 388, in get_hok_saml_assertion
  File "/usr/lib/vmware/site-packages/pyVim/sso.py", line 277, in perform_request
pyVim.sso.SoapException: SoapException:
faultcode: ns0:FailedAuthentication
faultstring: Invalid credentials
faultxml: <?xml version='1.0' encoding='UTF-8'?><S:Envelope xmlns:S="http://schemas.xmlsoap.org/soap/envelope/"><S:Body><S:Fault><faultcode xmlns:ns0="http://docs.oasis-open.org/ws-sx/ws-trust/200512">ns0:FailedAuthentication</faultcode><faultstring>Invalid credentials</faultstring></S:Fault></S:Body></S:Envelope>

The same log then shows vCenter health reporting an error for the host encryption health check:
WARNING vsan-mgmt[66558] [VsanHealthEncUtil::_AggregateEncryptionConfigHealthForKmx opID=06e96157] Host: encryption health error: (vim.fault.VsanFault) {
  faultMessage = (vmodl.LocalizableMessage) [
    (vmodl.LocalizableMessage) {
      key = 'com.vmware.vsan.health.msg.list.kmxa.provider.error',
      message = 'get provider info error, please check the health logs'
    }
  ]
}
WARNING vsan-mgmt[66558] [VsanHealthEncUtil::_AggregateEncryptionConfigHealthForKmx opID=06e96157] Host:  health error: (vim.fault.VsanFault) {
  faultMessage = (vmodl.LocalizableMessage) [
    (vmodl.LocalizableMessage) {
      key = 'com.vmware.vsan.health.msg.list.kmxa.provider.error',
      message = 'get provider info error, please check the health logs'
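
To confirm that a vCenter shows this signature, the log can be searched for the two messages quoted above. The following is a minimal sketch, not an official diagnostic: the function name count_sso_failures is hypothetical, and the default path is the log file named in this article.

```shell
#!/bin/sh
# Count log lines that match the failure signature described in this article.
# count_sso_failures is a hypothetical helper name, not a VMware tool.
# Usage: count_sso_failures [logfile]
count_sso_failures() {
    log="${1:-/var/log/vmware/vsan-health/vmware-vsan-health-service.log}"
    # First pattern: the solution-user authentication failure.
    # Second pattern: the resulting encryption health fault key.
    grep -c -E 'Fail to connect vAPI by solution user vpxd-extension|kmxa\.provider\.error' "$log"
}

# Example (run on the vCenter appliance):
#   count_sso_failures
```

A non-zero count suggests the same failure pattern; a count of zero means the log does not contain either message and another cause should be considered.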

 

Environment

VMware vSphere vCenter 8.0.x

Cause

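After the certificate change, /storage/vsan-health/vpxd-extension.cert and /storage/vsan-health/vpxd-extension.key contain stale credentials for the vpxd-extension solution user. The vSAN health service uses these files for SSO authentication, so the authentication request fails with "Invalid credentials".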
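
One way to check whether the cached copy is stale is to compare its fingerprint with the certificate currently held for the solution user. The sketch below rests on two assumptions that this article does not state: that the cached .cert file is PEM-encoded, and that the current certificate is the one in the vpxd-extension VECS store. The helper names cert_fingerprint and same_cert are hypothetical.

```shell
#!/bin/sh
# Print the SHA-256 fingerprint of a PEM certificate file.
cert_fingerprint() {
    openssl x509 -noout -fingerprint -sha256 -in "$1"
}

# Return 0 when both files hold the same certificate, 1 when they differ,
# and 2 when either file cannot be read as a certificate.
same_cert() {
    a=$(cert_fingerprint "$1" 2>/dev/null) || return 2
    b=$(cert_fingerprint "$2" 2>/dev/null) || return 2
    [ "$a" = "$b" ]
}

# Example on the vCenter appliance (assumption: VECS holds the current certificate):
#   /usr/lib/vmware-vmafd/bin/vecs-cli entry getcert --store vpxd-extension \
#       --alias vpxd-extension --output /tmp/vpxd-extension-vecs.crt
#   same_cert /storage/vsan-health/vpxd-extension.cert /tmp/vpxd-extension-vecs.crt \
#       && echo "cached copy matches VECS" || echo "cached copy differs or is unreadable"
```

If the fingerprints differ, the cached files were probably not refreshed when the certificate changed, which is consistent with the cause described above.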

Resolution

The fix is planned to be included in a future release.

 The current workaround for this issue is:
   1. Log in to vCenter via SSH.
   2. Back up vpxd-extension.cert and vpxd-extension.key.
       ----> cp /storage/vsan-health/vpxd-extension.cert /storage/vsan-health/vpxd-extension.cert.bak
       ----> cp /storage/vsan-health/vpxd-extension.key /storage/vsan-health/vpxd-extension.key.bak

   3. Remove vpxd-extension.cert and vpxd-extension.key.
       ----> rm /storage/vsan-health/vpxd-extension.cert
       ----> rm /storage/vsan-health/vpxd-extension.key

   4. Once the files are removed, rerun the vSAN Skyline Health checks from the vCenter interface: select the cluster, click Monitor, go to vSAN Skyline Health, and initiate a retest. This should clear the alarm.
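
Steps 2 and 3 can also be run as one guarded sequence so that a file is removed only after its backup succeeds. This is a minimal sketch rather than a supported script: the function name remove_stale_vpxd_creds is hypothetical, and the default directory is the path given in the steps above.

```shell
#!/bin/sh
# Back up, then remove, the cached vpxd-extension credential files (steps 2 and 3).
# remove_stale_vpxd_creds is a hypothetical helper name.
# Usage: remove_stale_vpxd_creds [directory]
remove_stale_vpxd_creds() {
    dir="${1:-/storage/vsan-health}"
    for name in vpxd-extension.cert vpxd-extension.key; do
        src="$dir/$name"
        if [ ! -f "$src" ]; then
            echo "skip: $src not found" >&2
            continue
        fi
        # -p preserves mode and timestamps on the backup copy.
        cp -p "$src" "$src.bak" || return 1
        # Remove the original only after the backup copy succeeded.
        rm "$src" || return 1
        echo "backed up and removed: $src"
    done
}

# Example (run on the vCenter appliance over SSH):
#   remove_stale_vpxd_creds
```

After the function completes, continue with the retest in step 4. The .bak files remain in the same directory in case the originals need to be restored.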