Mapping VMware High Availability Heartbeats and Communication Paths

Article ID: 322160

Updated On:

Products

VMware vCenter Server VMware vSphere ESXi

Issue/Introduction

Symptoms:
You may experience these symptoms:
  • VMware High Availability (HA) shows false alerts
  • Cannot determine where an HA issue is originating


Resolution

When troubleshooting an HA problem in a cluster, it can be very difficult to tell where the problem originates.
 
If several hosts are showing issues, the problem may lie with any of the primary hosts, because client (non-primary) hosts do not talk directly to the current primary node. Instead, the primaries share state communications with the other hosts in the cluster. This is described as one of the scalability features of HA.
 
To identify the real primary host, view the following file:
/etc/opt/vmware/aam/vmware-sites

This file maps each hostname to its ID number for the environment.

Log entries often show only the ID, not the hostname.

Note: The ID is what is recorded in the /var/log/vmware/aam/procMon/{hostname}_fatal.out* file(s) as the host actually being used for the connections in the LAG and various EVENT statements. The host ID appears to be the leading number before the / inside the parentheses.
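To make the ID lookup less manual, the IDs referenced in the fatal.out logs can be collected first and then compared against the vmware-sites file. The following is a minimal sketch only. It assumes that host IDs appear as an opening parenthesis, a number, and a slash, as the note above suggests; the article does not show the exact log line layout, and the function name extract_host_ids is hypothetical.

```shell
# Sketch: list the distinct host IDs referenced in one or more log files.
# Assumption: an ID appears as "(<number>/" somewhere on the line.
extract_host_ids() {
    # Concatenate all files given as arguments, keep only the "(<number>/"
    # fragments, strip the punctuation, and print each ID once in numeric order.
    cat -- "$@" | grep -o '([0-9][0-9]*/' | tr -d '(/' | sort -n | uniq
}
```

On a host, it could be called as extract_host_ids /var/log/vmware/aam/procMon/*_fatal.out* and each resulting ID then checked against /etc/opt/vmware/aam/vmware-sites.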

The vpxa logs show only the primary node as the connection, but this may not be the host that a secondary's communication actually goes to. The intent is to allow HA to distribute the heartbeat load among the primaries.

Run the following commands on a live system to view the mapping of the cluster's hosts:
export FT_DIR=/opt/vmware/aam/bin
export FT_DOMAIN=vmware
$FT_DIR/ftcli
fcli> la -l
(Shows mapping of the cluster hosts)
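
The environment setup above can be wrapped in a small helper so that a missing binary is reported before the interactive session is started. This is a sketch under stated assumptions: the function name start_ftcli is hypothetical, and the defaults are simply the FT_DIR and FT_DOMAIN values given in this article. The listing command is still entered by hand at the prompt.

```shell
# Sketch: export the variables from the article, then launch ftcli if present.
start_ftcli() {
    # Use the article's values unless the caller has already set them.
    FT_DIR=${FT_DIR:-/opt/vmware/aam/bin}
    FT_DOMAIN=${FT_DOMAIN:-vmware}
    export FT_DIR FT_DOMAIN

    # Fail early with a clear message if the binary is not where expected.
    if [ ! -x "$FT_DIR/ftcli" ]; then
        echo "ftcli not found or not executable in $FT_DIR" >&2
        return 1
    fi

    "$FT_DIR/ftcli"
}
```

A non-zero return indicates that ftcli was not found in FT_DIR, which may mean the host is not running this HA agent or that the path differs on that system.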