One or more ESXi host Fault Tolerance vmkernel interface not able to communicate with the rest of the cluster fault tolerance vmkernel interfaces.


Article ID: 410508


Products

VMware

Issue/Introduction

  • A Fault Tolerance (FT) network problem affects one or more hosts in the cluster.
  • The Fault Tolerance vmkernel interface network configuration is identical on every ESXi host in the cluster, apart from each host having its own IP address in the same subnet.
  • The Fault Tolerance tag is set correctly on each ESXi host.
  • vmkping to other ESXi hosts over the vmkernel port tagged for Fault Tolerance fails:  vmkping -I vmk# <IP of other host FT vmk> 
  • Two active uplinks are configured in the Teaming and Failover settings of the dvs portgroup.
  • The load balancing method in the Teaming and Failover settings of the dvs portgroup used by FT is neither LACP nor IP Hash.
  • esxtop shows that the FT vmkernel interface on the affected host is using a different vmnic from the other hosts, for example ESXi1 is using vmnic0 and ESXi2 is using vmnic1:
    • Log in to ESXi1 over SSH.
      • Type:  esxtop
      • Press n to switch to the network view.
      • Note the TEAM-PNIC value next to the vmk used by Fault Tolerance (vmk2 in this example): 
           PORT-ID USED-BY                         TEAM-PNIC DNAME              PKTTX/s  MbTX/s   PSZTX    PKTRX/s  MbRX/s   PSZRX %DRPTX %DRPRX
          67108868 Management                            n/a DvsPortset-0          0.00    0.00    0.00       0.00    0.00    0.00   0.00   0.00
          67108870 Shadow of vmnic0                      n/a DvsPortset-0          0.00    0.00    0.00       0.00    0.00    0.00   0.00   0.00
          67108872 Shadow of vmnic1                      n/a DvsPortset-0          0.00    0.00    0.00       0.00    0.00    0.00   0.00   0.00
          67108873 vmk0                               vmnic0 DvsPortset-0          4.01    0.02  663.00       3.81    0.01  274.00   0.00   0.00
          67108874 vmk1                               vmnic1 DvsPortset-0        585.37   46.87 10495.00     819.21    0.64  102.00   0.00   0.00
          67108875 vmk2                               vmnic0 DvsPortset-0          0.00    0.00    0.00       0.00    0.00    0.00   0.00   0.00
    • Log in to ESXi2 over SSH and repeat the steps above to check whether the working host is using a different vmnic. In the following example it is using vmnic1, unlike ESXi1 above:
         PORT-ID USED-BY                         TEAM-PNIC DNAME              PKTTX/s  MbTX/s   PSZTX    PKTRX/s  MbRX/s   PSZRX %DRPTX %DRPRX
        67108868 Management                            n/a DvsPortset-0          0.00    0.00    0.00       0.00    0.00    0.00   0.00   0.00
        67108870 Shadow of vmnic0                      n/a DvsPortset-0          0.00    0.00    0.00       0.00    0.00    0.00   0.00   0.00
        67108872 Shadow of vmnic1                      n/a DvsPortset-0          0.00    0.00    0.00       0.00    0.00    0.00   0.00   0.00
        67108873 vmk0                               vmnic0 DvsPortset-0          4.01    0.02  663.00       3.81    0.01  274.00   0.00   0.00
        67108874 vmk1                               vmnic1 DvsPortset-0        585.37   46.87 10495.00     819.21    0.64  102.00   0.00   0.00
        67108875 vmk2                               vmnic1 DvsPortset-0          0.00    0.00    0.00       0.00    0.00    0.00   0.00   0.00
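
The TEAM-PNIC comparison above can also be done on saved esxtop output instead of by eye. The following is only a minimal sketch. It assumes each data row has the same 12 columns as the output shown in this article (other ESXi builds may print additional columns) and that the text was copied from the esxtop network view on each host. The function names and the shortened sample rows are illustrative and are not part of any VMware tool.

```python
# Sketch: compare the TEAM-PNIC used by the FT vmkernel port on several hosts.
# Assumption: rows follow the 12-column layout shown in this article:
# PORT-ID USED-BY TEAM-PNIC DNAME + eight numeric counters.
TRAILING_COLUMNS = 9  # DNAME plus the eight rate/size/drop counters


def team_pnic_by_port(esxtop_text):
    """Map each USED-BY name (for example 'vmk2') to its TEAM-PNIC value."""
    mapping = {}
    for line in esxtop_text.splitlines():
        fields = line.split()
        # Data rows start with a numeric PORT-ID; skip headers and blank lines.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        # USED-BY may contain spaces ("Shadow of vmnic0"), so count from the right.
        team_pnic = fields[-(TRAILING_COLUMNS + 1)]
        used_by = " ".join(fields[1:-(TRAILING_COLUMNS + 1)])
        mapping[used_by] = team_pnic
    return mapping


def ft_uplinks(outputs_by_host, ft_vmk):
    """Return {host: TEAM-PNIC of ft_vmk} for every host's esxtop text."""
    return {host: team_pnic_by_port(text).get(ft_vmk)
            for host, text in outputs_by_host.items()}


def hosts_disagree(uplinks):
    """True when the hosts do not all use the same uplink."""
    return len(set(uplinks.values())) > 1


# Shortened sample rows based on the example output in this article.
ESXI1 = """\
 PORT-ID USED-BY TEAM-PNIC DNAME PKTTX/s MbTX/s PSZTX PKTRX/s MbRX/s PSZRX %DRPTX %DRPRX
67108872 Shadow of vmnic1 n/a DvsPortset-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
67108874 vmk1 vmnic1 DvsPortset-0 585.37 46.87 10495.00 819.21 0.64 102.00 0.00 0.00
67108875 vmk2 vmnic0 DvsPortset-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
"""
ESXI2 = ESXI1.replace("vmk2 vmnic0", "vmk2 vmnic1")

uplinks = ft_uplinks({"ESXi1": ESXI1, "ESXi2": ESXI2}, "vmk2")
print(uplinks, "mismatch" if hosts_disagree(uplinks) else "same uplink")
```

A mismatch is not a fault in itself, since this teaming policy may place each host's vmkernel port on either active uplink. It only shows which hosts are sending FT traffic through different physical paths.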

Cause

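Network connectivity between the Fault Tolerance vmkernel interfaces of the ESXi hosts is impaired by an issue in the external (physical) network. Because the FT vmkernel ports on different hosts can be using different uplinks, their traffic may traverse different physical switches, and a fault on one of those paths (for example, a VLAN missing from a trunk) breaks communication between them.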

Resolution

Engage the network team to troubleshoot why communication over the Fault Tolerance network fails when the ESXi hosts send traffic through two different physical switches.
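
When handing the case to the network team, it can help to show whether the vmkping failures line up with host pairs that use different uplinks. The sketch below is one possible way to tabulate that. The function name is illustrative, and the sample inputs, including the third host ESXi3 and all ping results, are hypothetical values to be replaced with the TEAM-PNIC values and vmkping outcomes actually observed.

```python
# Sketch: group vmkping results by whether the two hosts share an uplink.
# All sample values below are hypothetical placeholders.

def summarize_by_uplink_pairing(uplink_by_host, ping_ok):
    """Count successful and failed pings for same-uplink and cross-uplink pairs.

    uplink_by_host: {host: TEAM-PNIC of its FT vmkernel port}
    ping_ok:        {(source_host, destination_host): True if vmkping succeeded}
    """
    summary = {
        "same_uplink": {"ok": 0, "failed": 0},
        "cross_uplink": {"ok": 0, "failed": 0},
    }
    for (src, dst), ok in ping_ok.items():
        same = uplink_by_host[src] == uplink_by_host[dst]
        bucket = summary["same_uplink" if same else "cross_uplink"]
        bucket["ok" if ok else "failed"] += 1
    return summary


# Hypothetical observations: ESXi1 on vmnic0, the other two hosts on vmnic1.
uplink_by_host = {"ESXi1": "vmnic0", "ESXi2": "vmnic1", "ESXi3": "vmnic1"}
ping_ok = {
    ("ESXi1", "ESXi2"): False,
    ("ESXi1", "ESXi3"): False,
    ("ESXi2", "ESXi3"): True,
}

summary = summarize_by_uplink_pairing(uplink_by_host, ping_ok)
for kind, counts in summary.items():
    print(kind, counts)
```

If failures appear only in the cross-uplink group, that pattern is consistent with a problem on the path between the two physical switches, although it does not by itself identify the cause.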

Additional Information

vDS Health Check may also detect that the VLAN is missing from the trunk; see the article "vDS Health Check reports unsupported VLANs for MTU and VLAN".