vSAN health alert "Configuration Check" triggered and cluster partition noticed.


Article ID: 418748


Updated On:

Products

VMware vSAN

Issue/Introduction

Symptoms:

  • The partitioned ESXi host can successfully ping other hosts, and the unicast table remains intact. The ESXi host repeatedly joins the original vSAN cluster and then partitions again immediately.

  • The following error is displayed: "Data-in-transit encryption is disabled on the vSAN cluster but is enabled on this host".

  • After firmware patches are applied to an ESXi host and the host is rebooted, the host becomes partitioned from the cluster.
  • A vSAN health alert, "Configuration Check," is triggered on the ESXi host that is partitioned from the cluster.

  • The health check details for the partitioned host show either "Data-in-transit encryption is disabled on the host" or "Data-in-transit encryption is disabled on the vSAN cluster but is enabled on this host".

  • Running the command esxcli vsan network security get on the affected host shows the "Data-in-Transit Encryption status" as "true", whereas the same command on the non-partitioned ESXi hosts shows "false", which matches the vSAN cluster setting where Data-in-transit encryption is disabled. The reverse can also occur: the affected host shows "false" while the cluster and the other hosts have Data-in-transit encryption enabled.
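When several hosts need to be compared, it can help to reduce the command output to the single status value. The following is a minimal sketch, assuming the field label matches the sample output shown in this article and that awk is available in the shell being used; the function name dit_status is hypothetical.

```shell
# Extract the Data-in-Transit encryption status from the text printed by
# `esxcli vsan network security get`, read from stdin.
# Assumption: the field is labelled "Data-in-Transit Encryption status",
# as in this article's sample output.
dit_status() {
    awk -F': *' '/Data-in-Transit Encryption status/ { print tolower($2); exit }'
}

# Hypothetical usage on an ESXi host:
#   esxcli vsan network security get | dit_status
```

Running this on the partitioned host and on a healthy host should print opposite values when the mismatch described above is present.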

Environment

VMware vSAN 8.x

Cause

The issue is caused by a configuration mismatch in the Data-in-transit (DIT) encryption status among the ESXi hosts in the vSAN cluster.

The vSAN cluster requires all member ESXi hosts to adhere to a consistent security protocol for seamless and secure inter-host communication.

  • Configuration Mismatch:

    • The partitioned ESXi host has its DIT encryption status set to the opposite of the cluster's setting.

  • Communication Failure:

    • When the partitioned ESXi host attempts to join the cluster or send a heartbeat (communication packet), it encrypts the data payload because its DIT status is "TRUE."

    • The Master ESXi Host in the cluster, which operates with DIT status "FALSE" (disabled), expects unencrypted traffic from all members.

    • The Master Host cannot decrypt or process the incoming encrypted payload.

    • It interprets this unrecognized, encrypted traffic as an unauthenticated or invalid communication attempt and consequently rejects the connection.

This rejection prevents the host from establishing stable membership, resulting in the observed behavior where the host repeatedly joins and immediately partitions from the vSAN cluster.
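The mismatch condition can be illustrated with a small sketch that lists the hosts whose status differs from the cluster setting. It assumes the per-host values have already been collected into "hostname status" lines; that input format and the function name find_dit_mismatch are hypothetical and are not produced by any vSAN tool.

```shell
# Print the hosts whose DIT status differs from the cluster's setting.
# $1    : the cluster-level setting, "true" or "false"
# stdin : one "hostname status" pair per line (hypothetical format)
find_dit_mismatch() {
    expected=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
    while read -r host status; do
        [ -n "$host" ] || continue                      # skip blank lines
        status=$(printf '%s' "$status" | tr 'A-Z' 'a-z')
        [ "$status" = "$expected" ] || printf '%s\n' "$host"
    done
}

# Hypothetical usage:
#   printf 'esx01 false\nesx02 true\n' | find_dit_mismatch false
```

Any host printed by this function is a candidate for the join-and-partition behavior described above.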

Resolution

The resolution is to set Data-in-Transit encryption on the partitioned ESXi host to match the cluster's Data-in-Transit encryption setting, using the command "esxcli vsan network security set". In the scenario described above, the cluster's DIT setting is disabled (FALSE), so the partitioned host must also be set to FALSE.

To activate Data-in-Transit encryption on the vSAN cluster:

  1. In the Hosts and Clusters inventory, select the vSphere cluster that uses vSAN as storage.

  2. Click the Configure tab and under vSAN, click Services.

  3. Click the Data Services Edit button.

  4. In the vSAN Services dialog box, activate the Data-In-Transit encryption toggle switch, configure the rekey interval, and click Apply.

To disable Data-in-Transit encryption on the hosts that have it enabled:

  1. Connect to each affected host with SSH.

  2. Run esxcli vsan network security set -e false on the host.

  3. Verify that 'Data-in-Transit Encryption status' is false by running the following command:

[root@esxihost:~] esxcli vsan network security get
   Sub-Cluster UUID: ########-####-####-############
   Data-in-Transit Encryption status: false
   Rekey Interval (in minutes): 1440

This ensures that Data-in-Transit encryption is disabled on the host and is aligned with the cluster setting.
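The set-then-verify steps above can be combined into one helper. This is a sketch only, assuming it runs in an SSH session on the affected host and that awk is available there; the function name align_dit is hypothetical, and only the two esxcli commands already shown in this article are used.

```shell
# Set this host's DIT encryption to the given value ("true" or "false"),
# then confirm the change by re-reading the status.
align_dit() {
    want=$1
    case "$want" in
        true|false) ;;
        *) echo "usage: align_dit true|false" >&2; return 2 ;;
    esac

    # Apply the setting with the command from the steps above.
    esxcli vsan network security set -e "$want" || return 1

    # Re-read the status and compare it with the requested value.
    got=$(esxcli vsan network security get |
          awk -F': *' '/Data-in-Transit Encryption status/ { print tolower($2); exit }')
    if [ "$got" = "$want" ]; then
        echo "DIT encryption status is now $got"
    else
        echo "DIT encryption status is '$got', expected '$want'" >&2
        return 1
    fi
}

# Hypothetical usage when the cluster has DIT encryption disabled:
#   align_dit false
```

Pass the value that matches the cluster setting: false when Data-in-Transit encryption is disabled on the cluster, true when it is enabled.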

 

NOTE: This issue can also happen when the vSAN cluster Data-in-Transit (DIT) encryption is enabled (True) and the partitioned ESXi host DIT setting is disabled (False). The fundamental requirement is to maintain a consistent DIT configuration across all hosts in the vSAN cluster.