The vSAN performance service alarm 'Stats primary election' and the vSAN network alarm 'Hosts with connectivity issues' are triggered together and then auto-resolve.
These two alerts keep repeating.
The host reported in the 'Hosts with connectivity issues' alarm keeps changing.
There are no issues over the management network between the hosts and vCenter.
Restarting the vsanmgmtd service on the host, as described in vSAN Health Service - Network Health - Hosts with connectivity issues, does not resolve the issue.
Recreating the performance stats object, as described in vSAN Health Service - Performance Service - Stats master election check, does not resolve the 'Stats primary election' alert.
VMware Aria Operations 8.16.x
VMware Aria Operations 8.17.x
VMware vSAN 8.x
The 'Stats primary election' alert is triggered because the vsanmgmtd service on the vSAN hosts frequently goes into a not-responding state.
This happens because API calls with unsupported parameters are made to the service continuously. Each call runs into an error while fetching the details, and the volume of failing calls overloads the service.
In /var/run/log/vsanmgmt.log on the ESXi host, the call with the unsupported parameter can be seen:
YYYY-MM-DDTHH:MM:SS.SSSZ error vsand[2110597] [opID=########-#### statsdb::Run] When run command execute for mode normalMode, met exception in DB thread data processing: no such column: throughputDevRead, out is no such column: throughputDevRead Traceback (most recent call last): File "/usr/lib/vmware/vsan/perfsvc/statsdb.py", line 4535, in Run sqlite3.OperationalError: no such column: throughputDevRead
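To gauge how often this error occurs, a copy of the log can be scanned for the error text. The following is a minimal Python sketch, not a VMware-provided tool: the function names count_missing_columns and scan_log are hypothetical, and the pattern assumes the error always appears as "sqlite3.OperationalError: no such column: <name>", as in the entry above.

```python
import re
from collections import Counter

# Assumption: the error text always has the form seen in the log entry above.
NO_SUCH_COLUMN = re.compile(r"sqlite3\.OperationalError: no such column: (\w+)")


def count_missing_columns(lines):
    """Count 'no such column' errors per column name in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        for match in NO_SUCH_COLUMN.finditer(line):
            counts[match.group(1)] += 1
    return counts


def scan_log(path="/var/run/log/vsanmgmt.log"):
    """Scan a log file (by default the path named in this article) for the error."""
    # errors="replace" keeps the scan going if the log contains undecodable bytes.
    with open(path, errors="replace") as handle:
        return count_missing_columns(handle)
```

Calling scan_log() with the path to a copy of the log returns a mapping from column name to the number of matching errors. A large and growing count for a column such as throughputDevRead is consistent with the repeated unsupported calls described here.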
These API calls are coming from VMware Aria Operations.
These calls also cause the hostd service on the host to go down, which triggers the 'Hosts with connectivity issues' alert.
To resolve this issue:
Upgrade VMware Aria Operations to either 8.16 Hot Fix 2 or 8.17 Hot Fix 1.
Delete and re-create the vSAN stats object as described in vSAN Health Service - Performance Service - Stats master election check.