Host shows the vSphere HA status as Network Partitioned

Article ID: 307480

Products

VMware vCenter Server

Issue/Introduction

Symptoms:

  • A host reports the vSphere HA status as Network Partitioned
  • In the fdm.log file of the network partitioned host, you see these entries:

    info 'Cluster' opID=SWI-8948711d] [ClusterManagerImpl::MainLoop] curState 1 lastState 3
    info 'Election' opID=SWI-eb460f87] Startup: Got AmMaster
    0 info 'Election' opID=SWI-eb460f87] [ClusterElection::ChangeState] Startup => SlaveConnecting : StartupStateFunc
    'Cluster' opID=SWI-eb460f87] Change state to SlaveConnecting:20224569933819
    verbose 'Cluster' opID=SWI-8948711d] [ClusterManagerImpl::CheckElectionState] Transitioned from Startup to SlaveConnecting
    info 'Cluster' opID=SWI-eb460f87] [ClusterManagerImpl::ConnectToMaster] Connecting to master host-143 @ 10.4.252.33:8182
    info 'Cluster' opID=SWI-8948711d] [ClusterManagerImpl::MainLoop] curState 3 lastState 1
    warning 'Libs' opID=SWI-2e80d2bf] SSL_VerifyX509: Certificate verification is disabled, so connection will proceed despite the error
    verbose 'Cluster' opID=SWI-eb460f87] [ClusterManagerImpl::VerifyHost] Thumbprint match ##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:##:## for host host-143
    'Cluster' opID=SWI-eb460f87] [ClusterManagerImpl::ConnectToMaster] Connected to master @ host-143
    2012-01-23T15:51:55.105Z [FFE48B90 info 'Election' opID=SWI-eb460f87] Slave to host @ 10.4.252.33
    2012-01-23T15:51:55.105Z [FFE48B90 info 'Election' opID=SWI-eb460f87] [ClusterElection::ChangeState] SlaveConnecting => Slave : SlaveConnectingStateFunc
    2012-01-23T15:51:55.105Z [FFE48B90 info 'Cluster' opID=SWI-eb460f87] Change state to Slave:20224569933819
    2012-01-23T15:51:55.105Z [FFE07B90 verbose 'Cluster' opID=SWI-8948711d] [ClusterManagerImpl::CheckElectionState] Transitioned from SlaveConnecting to Slave
    'Cluster' opID=SWI-8948711d] [ClusterManagerImpl::MainLoop] curState 4 lastState 3
    'Election' opID=SWI-eb460f87] [ClusterElection::ChangeState] Slave => Startup : Lost master
    'Cluster' opID=SWI-eb460f87] Change state to Startup:0
    verbose 'Cluster' opID=SWI-8948711d] [ClusterManagerImpl::CheckElectionState] Transitioned from Slave to Startup
    info 'Message'] Destroying connection

     
  • In the fdm.log file on the host that is listed as the master for the HA-enabled cluster, you see these entries:

    verbose 'Cluster' opID=SWI-14c14582 '[ClusterManagerImpl::IsBadIP] 192.168.111.41 is bad ip
    verbose 'Cluster' opID=SWI-14c14582 '[ClusterManagerImpl::InvalidCredentialsIP::IsBadIP] 192.168.111.41 is in the bad ip manager
    warning 'Election' opID=SWI-14c14582ReadMsg [60 times] Received message from bad ip 192.168.111.41 - dropping
    verbose 'Cluster' opID=SWI-14c14582 '[ClusterManagerImpl::VerifyHost] Thumbprint mismatch (##:##:##:##:##:##:##:##:##:##:A3:7F:
    verbose 'Cluster' opID=SWI-14c14582 '[ClusterManagerImpl::InvalidCredentialsIP::SetBadIP] Blacklisting ip address 192.168.111.41 for 60 seconds
    warning 'Cluster Slave host-124 has invalid credentials - closing connection
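
The election entries in the partitioned host's fdm.log can be pulled out with a short script when the log is long. The following is a minimal sketch in Python, not a VMware tool: the function names `election_transitions` and `lost_master` are hypothetical, and the regular expression assumes the `[ClusterElection::ChangeState] A => B : reason` layout shown in the excerpt above.

```python
import re

# Assumed layout, taken from the excerpt:
#   [ClusterElection::ChangeState] <old> => <new> : <reason>
CHANGE_STATE = re.compile(
    r"\[ClusterElection::ChangeState\]\s+(\w+)\s+=>\s+(\w+)\s+:\s+(.+)"
)


def election_transitions(lines):
    """Return (old_state, new_state, reason) tuples in log order."""
    transitions = []
    for line in lines:
        match = CHANGE_STATE.search(line)
        if match:
            old, new, reason = match.groups()
            transitions.append((old, new, reason.strip()))
    return transitions


def lost_master(transitions):
    """True if the host fell back from Slave to Startup at least once."""
    return any(old == "Slave" and new == "Startup" for old, new, _ in transitions)


# Lines copied from the partitioned host's excerpt.
sample = [
    "0 info 'Election' opID=SWI-eb460f87] [ClusterElection::ChangeState] "
    "Startup => SlaveConnecting : StartupStateFunc",
    "2012-01-23T15:51:55.105Z [FFE48B90 info 'Election' opID=SWI-eb460f87] "
    "[ClusterElection::ChangeState] SlaveConnecting => Slave : SlaveConnectingStateFunc",
    "'Election' opID=SWI-eb460f87] [ClusterElection::ChangeState] "
    "Slave => Startup : Lost master",
]

for old, new, reason in election_transitions(sample):
    print(f"{old} -> {new} ({reason})")
```

A host that repeatedly cycles through Startup, SlaveConnecting, Slave, and back to Startup with the reason "Lost master" matches the symptom described here.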



Environment

VMware vCenter Server 5.1.x
VMware vCenter Server 5.0.x

Cause

This issue occurs when the SSL certificate thumbprint presented to the master host is not what the master host is expecting. This is indicated by the thumbprint mismatch error in the fdm.log file of the master host.
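
To confirm this cause on the master host, the fdm.log file can be searched for the rejection messages listed under Symptoms. The sketch below is one way to do that in Python. It assumes the message wording is exactly as in the excerpt; the names `PATTERNS` and `scan_master_log` are illustrative, not part of any VMware tooling.

```python
import re

# Message fragments taken from the master host's log excerpt.
IPV4 = r"(\d{1,3}(?:\.\d{1,3}){3})"
PATTERNS = {
    "thumbprint_mismatch": re.compile(r"Thumbprint mismatch"),
    "blacklisted_ip": re.compile(r"Blacklisting ip address " + IPV4),
    "invalid_credentials": re.compile(r"Slave (\S+) has invalid credentials"),
}


def scan_master_log(lines):
    """Group matching log lines by the kind of rejection they report."""
    findings = {name: [] for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                # Keep the captured IP or host ID when the pattern has one,
                # otherwise keep the whole line.
                value = match.group(1) if match.groups() else line.strip()
                findings[name].append(value)
    return findings


# Abridged lines from the master host's excerpt.
sample = [
    "verbose 'Cluster' opID=SWI-14c14582 Thumbprint mismatch (##:##:##:##:##:##:##:##:##:##:A3:7F:",
    "verbose 'Cluster' opID=SWI-14c14582 '[ClusterManagerImpl::InvalidCredentialsIP::SetBadIP] "
    "Blacklisting ip address 192.168.111.41 for 60 seconds",
    "warning 'Cluster Slave host-124 has invalid credentials - closing connection",
]

for kind, values in scan_master_log(sample).items():
    print(kind, values)
```

The blacklisted IP address and the rejected host ID identify which host presented the unexpected thumbprint.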

Resolution

To resolve this issue:
  1. Disable HA.

    To disable HA:
    1. In the vCenter Server inventory, right-click the cluster and click Edit Settings.
    2. Deselect the Turn On vSphere HA option.
    3. Click OK.
    4. Wait for all hosts to reconfigure HA in Recent Tasks.
       
  2. Ensure that SSL Certificate Checking is enabled.

    For vCenter Server 5.0 and earlier:
    1. In the vSphere Client connected to the vCenter Server, click Administration > vCenter Server Settings.
    2. If the vCenter Server system is a part of a connected group, select the server you want to configure from the Current vCenter Server dropdown.
    3. In the settings list, select SSL Settings.

    For vCenter Server 5.1 and later:
    1. In the vSphere Web Client, navigate to the vCenter Server instance.
    2. Click the Manage tab.
    3. Under Settings, click General.
    4. Click Edit and select SSL settings.
       
  3. Select vCenter requires verified host SSL certificates. If there are hosts that require manual validation, these hosts appear in the host list at the bottom of the dialog.
  4. Click OK.
  5. Determine the host thumbprint for each host that requires validation.
    1. Log in to the direct console. For more information, see the Log in to the ESXi Shell section of the vSphere Installation and Setup Guide.
    2. Click View Support Information in the System Customization menu. The thumbprint is displayed in the right column.

      Notes:
      • If you do not have access to the direct console, connect a vSphere Client that has not installed the host's certificate directly to the host. When it prompts for certificate confirmation, click View Certificate > Details and then scroll down to Thumbprint.
      • If the issue is caused by mismatched SSL thumbprints, all listed hosts disconnect from vCenter Server when you click OK. Reconnect each host to refresh the SSL thumbprints. This requires the root password.
         
  6. Compare the thumbprint you obtained from the host with the thumbprint listed in the vCenter Server Settings dialog.
  7. If the thumbprints match, select the check box for the host.
  8. Click OK. Hosts that you have not selected are now disconnected.
     
  9. Reconnect the host to vCenter Server.

    To reconnect the host:
    1. Right-click the disconnected host and click Connect.
    2. When prompted, enter the host's credentials to connect the host back to vCenter Server.

      You see a popup that mentions the SHA1 thumbprint of the SSL certificate for the host.

       
    3. Click Yes. The host is now connected to vCenter Server.
       
  10. Enable HA.

    To enable HA:
    1. In the vCenter Server inventory, right-click the cluster and click Edit Settings.
    2. Select the Turn On vSphere HA option.
    3. Click OK.
    4. Wait for all hosts to reconfigure HA in Recent Tasks.

The host should now show the vSphere HA status as connected (slave) or connected (master), depending on the election results when HA was enabled.
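
The thumbprint comparison in the procedure above can also be cross-checked from a workstation. The following Python sketch computes the SHA1 thumbprint of the certificate a host presents and compares it with a value copied from the vCenter Server Settings dialog. It is an illustration under stated assumptions, not a replacement for the validation steps: the host name `esx01.example.com`, port 443, and all function names are hypothetical, and the script trusts whatever certificate the network returns, so use it only on a trusted management network.

```python
import hashlib
import ssl


def format_thumbprint(der_bytes):
    """Return the SHA1 digest of DER certificate bytes as AA:BB:... text."""
    digest = hashlib.sha1(der_bytes).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def normalize(thumbprint):
    """Strip separators and case so differently formatted values compare equal."""
    return thumbprint.replace(":", "").replace(" ", "").upper()


def thumbprints_match(first, second):
    return normalize(first) == normalize(second)


def host_thumbprint(hostname, port=443):
    """Fetch the certificate presented on hostname:port and return its thumbprint.

    No certificate validation is performed here; the certificate is only hashed.
    """
    pem = ssl.get_server_certificate((hostname, port))
    return format_thumbprint(ssl.PEM_cert_to_DER_cert(pem))


# Offline demonstration of the comparison logic with two spellings of one value.
print(thumbprints_match("da:39:a3:ee", "DA39A3EE"))

# Hypothetical usage against a real host (requires network access):
# from_host = host_thumbprint("esx01.example.com")
# from_vcenter = "<value copied from the vCenter Server Settings dialog>"
# print(thumbprints_match(from_host, from_vcenter))
```

If the two values differ, the host is presenting a certificate that vCenter Server has not recorded, which is the condition the reconnect step corrects.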

Additional Information

Note: This issue may also occur if proxy ARP is enabled on the ESX/ESXi management VLAN. To resolve the issue, disable proxy ARP. For more information, see Troubleshooting network connection issues caused by proxy ARP (1005965).