Host shows the vSphere HA status as Network Partitioned
Article ID: 307480
Products
VMware vCenter Server
VMware vCenter Server 8.0
VMware vCenter Server 7.0
VMware vCenter Server 6.0
Issue/Introduction
Symptoms:
A host reports the vSphere HA status as Network Partitioned.
In the fdm.log file of the network partitioned host, you see these entries:
info 'Cluster' opID=SWI-8948711d] [ClusterManagerImpl::MainLoop] curState 1 lastState 3
info 'Election' opID=SWI-eb460f87] Startup: Got AmMaster 0
info 'Election' opID=SWI-eb460f87] [ClusterElection::ChangeState] Startup => SlaveConnecting : StartupStateFunc
'Cluster' opID=SWI-eb460f87] Change state to SlaveConnecting:20224569933819
verbose 'Cluster' opID=SWI-8948711d] [ClusterManagerImpl::CheckElectionState] Transitioned from Startup to SlaveConnecting
info 'Cluster' opID=SWI-eb460f87] [ClusterManagerImpl::ConnectToMaster] Connecting to master host-###@ #.#.#.#:8182
info 'Cluster' opID=SWI-8948711d] [ClusterManagerImpl::MainLoop] curState 3 lastState 1
warning 'Libs' opID=SWI-2e80d2bf] SSL_VerifyX509: Certificate verification is disabled, so connection will proceed despite the error
verbose 'Cluster' opID=SWI-eb460f87] [ClusterManagerImpl::VerifyHost] Thumbprint match ##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:## for host host-###
'Cluster' opID=SWI-eb460f87] [ClusterManagerImpl::ConnectToMaster] Connected to master @ host-###
2012-01-23T15:51:55.105Z [FFE48B90 info 'Election' opID=SWI-eb460f87] Slave to host @ #.#.#.#
2012-01-23T15:51:55.105Z [FFE48B90 info 'Election' opID=SWI-eb460f87] [ClusterElection::ChangeState] SlaveConnecting => Slave : SlaveConnectingStateFunc
2012-01-23T15:51:55.105Z [FFE48B90 info 'Cluster' opID=SWI-eb460f87] Change state to Slave:20224569933819
2012-01-23T15:51:55.105Z [FFE07B90 verbose 'Cluster' opID=SWI-8948711d] [ClusterManagerImpl::CheckElectionState] Transitioned from SlaveConnecting to Slave
'Cluster' opID=SWI-8948711d] [ClusterManagerImpl::MainLoop] curState 4 lastState 3
'Election' opID=SWI-eb460f87] [ClusterElection::ChangeState] Slave => Startup : Lost master
'Cluster' opID=SWI-eb460f87] Change state to Startup:0
verbose 'Cluster' opID=SWI-8948711d] [ClusterManagerImpl::CheckElectionState] Transitioned from Slave to Startup
info 'Message'] Destroying connection
In the fdm.log file on the host that is listed as the master for the HA enabled cluster, you see these entries:
verbose 'Cluster' opID=SWI-14c14582 '[ClusterManagerImpl::IsBAdIP] #.#.#.# is bad ip
verbose 'Cluster' opID=SWI-14c14582 '[ClusterManagerImpl::InvalidCredentialsIP::IsBadIP #.#.#.# is in the bad ip manager
warning 'Election' opID=SWI-14c14582ReadMsg [60 times] Received messge from bad ip #.#.#.# - dropping
verbose 'Cluster' opID=SWI-14c14582 '[ClusterManagerImpl::Verify Host] Thumbprint mismatch (##:##:##:##:##:##:##:##:##:##:A3:7F:
verbose 'Cluster' opID=SWI-14c14582 '[ClusterManagerImpl::InvalidCredentialsIP::SetBadIP] Blacklisting ip address #.#.#.# for 60 seconds
warning 'Cluster Slave host-124 has invalid credentials - closing connection
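Checking a large fdm.log by eye for these entries can be tedious. The following is a minimal Python sketch, not a VMware tool, that looks for the message fragments shown in the excerpts above. The function name `scan_fdm_log` and the category labels are hypothetical and chosen only for illustration. The patterns are plain substrings copied from the excerpts, so they may need adjusting if the log wording differs in your release.

```python
import re

# Message fragments copied from the fdm.log excerpts in this article.
# The dictionary keys are illustrative labels, not VMware terminology.
SIGNATURES = {
    "thumbprint_mismatch": re.compile(r"Thumbprint mismatch"),
    "bad_ip": re.compile(r"is bad ip|bad ip manager|from bad ip"),
    "blacklisted": re.compile(r"Blacklisting ip address"),
    "invalid_credentials": re.compile(r"has invalid credentials"),
    "lost_master": re.compile(r"Lost master"),
}


def scan_fdm_log(lines):
    """Return a dict mapping each label to the log lines that matched it."""
    hits = {}
    for line in lines:
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits.setdefault(label, []).append(line.rstrip("\n"))
    return hits


# Example usage (the file path is an assumption; pass the location of
# your own copy of the log):
#
#     with open("fdm.log", encoding="utf-8", errors="replace") as handle:
#         for label, matched in scan_fdm_log(handle).items():
#             print(label, len(matched))
```

A `thumbprint_mismatch` or `invalid_credentials` hit in the master host's log, together with `lost_master` hits in the partitioned host's log, corresponds to the symptom pattern described above.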
Environment
VMware vCenter Server 6.7
VMware vCenter Server 7.0
VMware vCenter Server 8.0
Cause
This issue occurs when the SSL certificate thumbprint presented to the master host does not match the thumbprint that the master host expects. The thumbprint mismatch error in the fdm.log file of the master host indicates this condition.
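To illustrate what a thumbprint match means, here is a small Python sketch that compares two thumbprint strings after normalizing their formatting. It assumes 20-byte SHA-1 thumbprints written as hexadecimal, as in the log excerpts. The helper names `normalize_thumbprint` and `thumbprints_match` are hypothetical, and this is not how the HA agent itself performs the check.

```python
import re


def normalize_thumbprint(text):
    """Return a SHA-1 thumbprint as upper-case, colon-separated hex pairs.

    Accepts input with or without colons, dashes, or spaces, in any case.
    Raises ValueError if the input is not exactly 40 hex digits.
    """
    compact = re.sub(r"[\s:\-]", "", text)
    if not re.fullmatch(r"[0-9A-Fa-f]{40}", compact):
        raise ValueError("not a 40-digit hexadecimal SHA-1 thumbprint")
    compact = compact.upper()
    return ":".join(compact[i:i + 2] for i in range(0, 40, 2))


def thumbprints_match(presented, expected):
    """Return True if both strings describe the same thumbprint."""
    return normalize_thumbprint(presented) == normalize_thumbprint(expected)
```

Normalizing first avoids false mismatches caused only by case or separator differences between the value shown on the host and the value shown in vCenter Server.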
Resolution
To resolve this issue:
Disable HA.
To disable HA:
In the vCenter Server inventory, right-click the cluster and click Edit Settings.
Deselect the Turn On vSphere HA option.
Click OK.
Wait for all hosts to reconfigure HA in Recent Tasks.
Ensure that SSL Certificate Checking is enabled.
For vCenter Server 5.0 and earlier:
In the vSphere Client connected to the vCenter Server, click Administration > vCenter Server Settings.
If the vCenter Server system is a part of a connected group, select the server you want to configure from the Current vCenter Server dropdown.
In the settings list, select SSL Settings.
For vCenter Server 5.1 and later:
In the vSphere Web Client, navigate to the vCenter Server instance.
Click the Manage tab.
Under Settings, click General.
Click Edit and select SSL settings.
Select vCenter requires verified host SSL certificates. If there are hosts that require manual validation, these hosts appear in the host list at the bottom of the dialog.
Click OK.
Determine the host thumbprint for each host that requires validation.
Log in to the direct console. For more information, see the Using the ESXi Shell section.
Click View Support Information in the System Customization menu. The thumbprint is displayed in the right column.
Notes:
If you do not have access to the direct console, connect directly to the host with a vSphere Client that has not yet installed the host's certificate. When it prompts for certificate confirmation, click View Certificate > Details and then scroll down to Thumbprint.
If your issue is occurring because the SSL thumbprints do not match, all listed hosts disconnect from vCenter Server when you click OK. Reconnect each host to refresh the SSL thumbprints. This requires the root password.
Compare the thumbprint you obtained from the host with the thumbprint listed in the vCenter Server Settings dialog.
If the thumbprints match, select the check box for the host.
Click OK. Hosts that you have not selected are now disconnected.
Reconnect the host to vCenter Server.
To reconnect the host:
Right-click the disconnected host and click Connect.
When prompted, enter the host's credentials to connect the host back to vCenter Server.
A pop-up displays the SHA1 thumbprint of the SSL certificate for the host.
Click Yes. The host is now connected to vCenter Server.
Enable HA.
To enable HA:
In the vCenter Server inventory, right-click the cluster and click Edit Settings.
Select the Turn On vSphere HA option.
Click OK.
Wait for all hosts to reconfigure HA in Recent Tasks.
The host should now show the vSphere HA status as connected (slave) or connected (master), depending on the election results when HA was enabled.
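As an additional way to obtain a host thumbprint for the comparison in the "Determine the host thumbprint" step, the following Python sketch retrieves the certificate a host presents and computes its SHA-1 thumbprint using only the standard library. The function names are hypothetical, and port 443 is an assumption about where the host serves its management certificate. This is a convenience sketch, not a documented VMware procedure, and the value should still be confirmed against a trusted source such as the direct console.

```python
import hashlib
import ssl


def sha1_thumbprint_from_der(der_bytes):
    """Return the SHA-1 digest of DER certificate bytes as AA:BB:... text."""
    digest = hashlib.sha1(der_bytes).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def sha1_thumbprint_from_pem(pem_text):
    """Convert a PEM certificate to DER, then compute its thumbprint."""
    return sha1_thumbprint_from_der(ssl.PEM_cert_to_DER_cert(pem_text))


def fetch_host_thumbprint(host, port=443):
    """Fetch the certificate presented by host:port and return its thumbprint.

    With no ca_certs argument, ssl.get_server_certificate does not validate
    the certificate; it only retrieves it, which suits self-signed host
    certificates. Port 443 is an assumption.
    """
    pem_text = ssl.get_server_certificate((host, port))
    return sha1_thumbprint_from_pem(pem_text)


# Example usage (hypothetical host name):
#
#     print(fetch_host_thumbprint("esxi01.example.com"))
```

Because the certificate is fetched over the network without validation, treat the result as a cross-check rather than as proof of the host's identity.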