Hosts with custom certs appear non-responsive when vmware-vsan-health is enabled on vCenter

Article ID: 327021

Updated On:

Products

VMware vCenter Server
VMware vSAN

Issue/Introduction

Symptoms:
- You are using VMCA as an intermediate/subordinate certificate authority (CA).
- You have verified that all certificates across the environment are correct and consistent.

When the vmware-vsan-health service is started on vCenter, the following error messages appear in the vpxa.log on the hosts:

2020-01-28T00:27:38.423Z info vpxa[16742038] [Originator@6876 sub=Default opID=vsan-PC-59d27b8701b25-sq111:j1-W910-sq123:j2-c6-98] [VpxLRO] -- ERROR lro-52 -- vsanSystem -- vim.host.VsanSystem.fetchVsanSharedSecret: vim.fault.NotAuthenticated:
--> Result:
--> (vim.fault.NotAuthenticated) {
-->    faultCause = (vmodl.MethodFault) null,
-->    faultMessage = <unset>,
-->    object = 'vim.host.VsanSystem:vsanSystem',
-->    privilegeId = "none"
-->    msg = "Received SOAP response fault from [<cs p:000000a694a672d0, TCP:localhost:8307>]: fetchVsanSharedSecret
--> The session is not authenticated."
--> }
--> Args:
-->

This eventually leads to the host disconnecting from vCenter with the following messages in the vpxa.log:

2020-01-27T23:58:24.985Z error vpxa[16721871] [Originator@6876 sub=HTTP session map] Out of HTTP sessions: Limited to 500
2020-01-27T23:58:34.991Z error vpxa[16721873] [Originator@6876 sub=HTTP session map] Out of HTTP sessions: Limited to 500
2020-01-27T23:58:44.993Z error vpxa[16721898] [Originator@6876 sub=HTTP session map] Out of HTTP sessions: Limited to 500
2020-01-27T23:58:54.995Z error vpxa[16721875] [Originator@6876 sub=HTTP session map] Out of HTTP sessions: Limited to 500
2020-01-27T23:59:04.990Z error vpxa[16721874] [Originator@6876 sub=HTTP session map] Out of HTTP sessions: Limited to 500
2020-01-27T23:59:14.998Z error vpxa[16721871] [Originator@6876 sub=HTTP session map] Out of HTTP sessions: Limited to 500
2020-01-27T23:59:25.002Z error vpxa[16721899] [Originator@6876 sub=HTTP session map] Out of HTTP sessions: Limited to 500
2020-01-27T23:59:35.006Z error vpxa[16721870] [Originator@6876 sub=HTTP session map] Out of HTTP sessions: Limited to 500
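
To check quickly whether a host shows both symptoms, the log can be searched for the two messages quoted above. The following is a minimal sketch, not an official diagnostic tool. The function name count_vpxa_symptoms is hypothetical, and /var/log/vpxa.log is assumed to be the log location on the ESXi host.

```shell
# Count the lines in a vpxa.log file that match the two symptoms quoted above.
# count_vpxa_symptoms is a hypothetical helper name, not a VMware tool.
count_vpxa_symptoms() {
    log_file="$1"
    # grep -c prints the number of matching lines (0 when there are none).
    auth_faults=$(grep -c 'fetchVsanSharedSecret: vim.fault.NotAuthenticated' "$log_file")
    session_errors=$(grep -c 'Out of HTTP sessions' "$log_file")
    echo "NotAuthenticated faults: $auth_faults"
    echo "Out of HTTP sessions errors: $session_errors"
}

# Example on an ESXi host (log path is an assumption):
# count_vpxa_symptoms /var/log/vpxa.log
```

Non-zero counts for both messages are consistent with the behavior described in this article, but they do not confirm the cause on their own.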

Environment

VMware vCenter Server 6.5.x
VMware vSAN 6.7.x
VMware vCenter Server 6.7.x
VMware vSAN 6.5.x

Cause

This issue occurs because VsanMgmtAdapters.py (located in /usr/lib/vmware-vpx/vsan-health/pyMoVsan/) calculates the host certificate thumbprint incorrectly.
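
For reference when comparing thumbprints manually, the SHA-1 thumbprint of a host certificate can be computed with openssl. This is a sketch under assumptions and does not reproduce the logic in VsanMgmtAdapters.py. The function name cert_sha1_thumbprint is hypothetical, and /etc/vmware/ssl/rui.crt is assumed to be the host certificate path on ESXi.

```shell
# Print the SHA-1 thumbprint of the first certificate in a PEM file.
# cert_sha1_thumbprint is a hypothetical helper name.
cert_sha1_thumbprint() {
    # openssl prints a line such as "SHA1 Fingerprint=AA:BB:..."; keep the part after "=".
    openssl x509 -in "$1" -noout -fingerprint -sha1 | sed 's/^.*=//'
}

# Example on an ESXi host (certificate path is an assumption):
# cert_sha1_thumbprint /etc/vmware/ssl/rui.crt
```

If the PEM file contains a chain, openssl x509 reads only the first certificate in the file.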

Resolution

Upgrade vCenter to 6.7U3b build 15132721 or higher and ESXi to 6.7U3 P01 build 15160138 or higher.
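
To compare an installed build against the minimum builds listed above, the build number can be extracted from the version string and compared numerically. This is a minimal sketch. The function names extract_build and build_is_fixed are hypothetical, and the version string is assumed to contain the build in the form "build-NNNNNNNN", as printed by vmware -v on ESXi.

```shell
# Extract the build number from a version string such as
# "VMware ESXi 6.7.0 build-15160138". Prints nothing if no build is found.
extract_build() {
    echo "$1" | sed -n 's/.*build-\([0-9][0-9]*\).*/\1/p'
}

# Return success (exit status 0) when the build in the version string
# is greater than or equal to the required minimum build.
build_is_fixed() {
    build=$(extract_build "$1")
    [ -n "$build" ] && [ "$build" -ge "$2" ]
}

# Example on an ESXi host, using the minimum ESXi build from this section:
# build_is_fixed "$(vmware -v)" 15160138 && echo "fixed build" || echo "upgrade needed"
```

For vCenter, the same check would use the minimum build 15132721 and whatever command reports the vCenter build in your environment.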

Workaround:
The workaround modifies VsanMgmtAdapters.py to correct the way it calculates the host certificate thumbprint.
Download the attached script "fix_vsanMgmtAdapters.sh" to the /tmp directory on vCenter.

Make the script executable:
# chmod +x /tmp/fix_vsanMgmtAdapters.sh

Run the script:
# /tmp/fix_vsanMgmtAdapters.sh
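
Because the script changes VsanMgmtAdapters.py, it may be prudent to keep a copy of the original file before running it. The following is a sketch only. The function name backup_file is hypothetical, and it is an assumption that the script edits the file in place at the path given in the Cause section.

```shell
# Copy a file to a timestamped backup next to the original and print the backup path.
# backup_file is a hypothetical helper name.
backup_file() {
    src="$1"
    dest="$src.bak.$(date +%Y%m%d%H%M%S)"
    # cp -p preserves the mode and timestamps of the original file.
    cp -p "$src" "$dest" && echo "$dest"
}

# Example on vCenter, run before the fix script (path taken from the Cause section):
# backup_file /usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanMgmtAdapters.py
```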

After the script completes, restart the management agents on each affected host with the following commands:

/etc/init.d/hostd restart
/etc/init.d/vpxa restart

Additional Information

Impact/Risks:
Hosts become non-responsive in vCenter, and management capabilities are lost. This affects both vSAN and non-vSAN clusters managed by the same vCenter.

Attachments

fix_vsanMgmtAdapters_sh