vSAN Health Check fails with message "Invalid Unicast List" and Rebooting vCenter Results in Incorrect Unicast Agent List Causing a Cluster Partition
Article ID: 326460
Products
VMware vSAN
Issue/Introduction
Symptoms: The vSAN health check raises the alert "Invalid unicast list" even though the cluster is formed. If vCenter is rebooted, vCenter alters the unicast agent list of the data nodes.
Impact/Risks: If the witness previously used a different IP address for vSAN/witness traffic, rebooting vCenter causes vCenter to push the old witness IP address to the unicast agent list of the data nodes.
If the resolution below cannot be implemented, enable IgnoreClusterMemberListUpdates on the data nodes after the cluster is formed to prevent vCenter from overwriting the unicast list: "esxcfg-advcfg -s 1 /VSAN/IgnoreClusterMemberListUpdates"
Cause
vCenter retains the old witness IP address(es). When vCenter is rebooted, it pushes the old IP address to all nodes in the cluster, which results in a partition.
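One way to confirm this condition is to check, on each data node, whether the unicast agent list still contains the witness address you expect. The following is a minimal sketch, not a VMware-supplied tool: the function name check_witness_entry and the IP address passed to it are placeholders chosen for illustration. It relies only on "esxcli vsan cluster unicastagent list" and a whole-word text match, so it makes no assumption about the exact column layout of the output.

```shell
# Run on a data node over SSH.
# check_witness_entry is a hypothetical helper name; the argument is the
# witness IP address you expect the data node to be using.
check_witness_entry() {
    expected_ip="$1"

    # Capture the unicast agents this host currently knows about.
    agents="$(esxcli vsan cluster unicastagent list 2>&1)"

    # -F treats the IP as a fixed string (dots are literal),
    # -w matches whole words only, so 10.0.0.1 does not match 10.0.0.10.
    if printf '%s\n' "$agents" | grep -qwF -- "$expected_ip"; then
        echo "OK: $expected_ip is present in the unicast agent list"
    else
        echo "MISMATCH: $expected_ip not found; current entries:"
        printf '%s\n' "$agents"
        return 1
    fi
}
```

For example, calling check_witness_entry with the witness's current vSAN/witness traffic address before and after a vCenter reboot would show whether the entry was overwritten with an older address.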
Resolution
This can be fixed by reconfiguring the stretched cluster. Note that objects will have reduced availability during the process.
1. Disable IgnoreClusterMemberListUpdates on all data nodes with the command "esxcfg-advcfg -s 0 /VSAN/IgnoreClusterMemberListUpdates".
2. Put the witness into maintenance mode.
3. Delete its disk groups.
4. Go to Fault Domains and disable the stretched cluster.
5. SSH to the witness and make sure it is not part of a cluster:
   - Use "esxcli vsan cluster get" to check whether it is part of a cluster.
   - Use "esxcli vsan cluster leave" to leave the cluster if it is still part of one.
6. Go to Fault Domains and reconfigure the stretched cluster.
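The command-line parts of this procedure can be sketched as two small shell helpers. This is an illustrative sketch under stated assumptions, not an official script: the function names are hypothetical, and the membership check assumes that "esxcli vsan cluster get" prints a "Sub-Cluster UUID" line only while the host is a cluster member. Verify that against the output on your ESXi build before relying on it. The maintenance mode, disk group, and Fault Domains steps are still performed in the vSphere Client.

```shell
# Data nodes: let vCenter manage the unicast list again, then read the
# setting back so the new value can be confirmed by eye.
# disable_ignore_updates is a hypothetical helper name.
disable_ignore_updates() {
    esxcfg-advcfg -s 0 /VSAN/IgnoreClusterMemberListUpdates &&
    esxcfg-advcfg -g /VSAN/IgnoreClusterMemberListUpdates
}

# Witness: leave the cluster only if the host still reports membership.
# witness_leave_if_member is a hypothetical helper name.
witness_leave_if_member() {
    state="$(esxcli vsan cluster get 2>&1)"

    # Assumption: a "Sub-Cluster UUID" line appears only for cluster members.
    if printf '%s\n' "$state" | grep -q 'Sub-Cluster UUID'; then
        echo "Witness still reports cluster membership; leaving the cluster"
        esxcli vsan cluster leave
    else
        echo "Witness is not part of a cluster"
    fi
}
```

Run disable_ignore_updates on each data node and witness_leave_if_member on the witness after the stretched cluster has been disabled. Keeping the leave command behind a check avoids issuing it on a host that has already left.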