vSAN Health Service - File Service - Infrastructure health

Article ID: 331500

Products

VMware vSAN

Issue/Introduction

This article explains the File Service - Infrastructure health in the vSAN Health Service and provides details on why it might report an error.

Environment

VMware vSAN 7.0.x

Resolution

What does the File Service - Infrastructure health check do?

It checks the file service infrastructure health state per ESXi host in the vSAN cluster.

  • The 'vSAN File Service Node' column shows whether the vSAN File Service node VM is powered on.
  • The 'VDFS Daemon' column shows whether the VDFS daemon process is running.
  • The 'Root File System Health' column shows whether the file service root file system is valid and mounted correctly on the host.
  • The 'Workload Balance Health' column shows whether file servers are well balanced across the cluster.

What does it mean when it is in an error state?

The 'Description' column provides the detailed error message, which can be looked up in the following table.

 
Check Item               Error Message Type
-----------------------  ------------------------------------------------------
vSAN File Service Node   Host in maintenance mode
                         File service VM not found on this host
                         File service VM not powered on this host
VDFS Daemon              VDFS daemon and sockrelay are not running
                         Sockrelay is not running
                         VDFS daemon is not running
Root File System Health  Root filesystem not present through CMMDS
                         Root filesystem not responsive
Workload Balance         File service is comparatively overloaded on this host.
                         Cannot fetch file service balance status
                         Load balance health check is not supported.
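
When triaging many hosts, it can help to map a 'Description' string back to its check item. The following is a minimal sketch that transcribes the table above into a Python dictionary. The names ERROR_TO_CHECK_ITEM and check_item_for are hypothetical helpers for illustration and are not part of any vSAN API. The sketch also assumes the description text contains the error message verbatim.

```python
from typing import Optional

# Transcribed from the table in this article; illustrative only, not a vSAN API.
ERROR_TO_CHECK_ITEM = {
    "Host in maintenance mode": "vSAN File Service Node",
    "File service VM not found on this host": "vSAN File Service Node",
    "File service VM not powered on this host": "vSAN File Service Node",
    "VDFS daemon and sockrelay are not running": "VDFS Daemon",
    "Sockrelay is not running": "VDFS Daemon",
    "VDFS daemon is not running": "VDFS Daemon",
    "Root filesystem not present through CMMDS": "Root File System Health",
    "Root filesystem not responsive": "Root File System Health",
    "File service is comparatively overloaded on this host": "Workload Balance",
    "Cannot fetch file service balance status": "Workload Balance",
    "Load balance health check is not supported": "Workload Balance",
}


def check_item_for(description: str) -> Optional[str]:
    """Return the check item whose known error message appears in the description."""
    text = description.strip().lower()
    for message, item in ERROR_TO_CHECK_ITEM.items():
        # Case-insensitive substring match; trailing periods are not required.
        if message.lower() in text:
            return item
    return None  # Unknown message: not listed in the table.
```

For example, check_item_for("Sockrelay is not running") returns "VDFS Daemon", and a message that is not in the table returns None.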

When the vSAN File Service is enabled, stateless containers are deployed to a maximum of eight ESXi hosts in vSAN 7.0. For example, in a nine-host cluster, only eight vSAN File Service nodes are deployed.
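
The node count described above reduces to a simple cap. The sketch below illustrates that arithmetic, assuming the eight-host limit stated for vSAN 7.0. The function name expected_file_service_nodes is hypothetical and does not query a real cluster.

```python
MAX_FILE_SERVICE_NODES = 8  # vSAN 7.0 limit stated in this article


def expected_file_service_nodes(host_count: int) -> int:
    """Return how many File Service nodes are expected for a given host count."""
    if host_count < 0:
        raise ValueError("host_count must be non-negative")
    # One node per host, capped at the vSAN 7.0 maximum.
    return min(host_count, MAX_FILE_SERVICE_NODES)
```

With nine hosts, the function returns 8, which matches the example in the text. With five hosts, it returns 5.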

When a host is placed in maintenance mode, the vSAN File Service node on that host is powered off. This can cause the Infrastructure Health check to report an error.



Once the host enters maintenance mode, an informational message will be shown.


After a period of time, assuming the ESXi host is not rebooted or shut down, the powered-off vSAN File Service node is deleted. If another host is available, a new vSAN File Service node is deployed.
 

How does one troubleshoot and fix the error state?

Click Remediate File Service to force auto-remediation. In most cases, issues can be remediated automatically. If deployment of the file service VM OVF keeps failing, check the host and vSAN status to confirm that the file service VM can be deployed. VDFS Daemon health and Root File System Health are monitored and remediated periodically.

  • For the error "File service is comparatively overloaded on this host.", click "Remediate Imbalance" to force a file server rebalance. In a cluster with only NFS shares, the issue can be remediated automatically. If the host remains overloaded, check for other file service infrastructure issues first. Note that a forced rebalance moves some file servers, which disconnects SMB clients from the SMB shares running on those file servers, so schedule a maintenance window for it.
  • For the error "Cannot fetch file service balance status", the cause is usually another file service infrastructure issue. Check for those issues first.
  • For the error "Load balance health check is not supported", fully upgrade the file service cluster to 7.0 U1 by following the vSAN file service upgrade admin guide.
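
The guidance in this section can be summarized as a small decision function. This is a sketch only: suggested_action is a hypothetical helper that restates the advice above as strings. It does not call vCenter or trigger any remediation.

```python
def suggested_action(description: str) -> str:
    """Map an error description to the remediation advice given in this article."""
    text = description.strip().lower()
    if "comparatively overloaded" in text:
        # A forced rebalance disconnects SMB clients on the moved file servers.
        return ('Click "Remediate Imbalance"; schedule a maintenance window '
                "if SMB shares are in use")
    if "cannot fetch file service balance status" in text:
        return "Check for other file service infrastructure issues first"
    if "load balance health check is not supported" in text:
        return "Fully upgrade the file service cluster to 7.0 U1"
    # Default for the remaining infrastructure errors in this article.
    return 'Click "Remediate File Service"'
```

Any description that does not match one of the three workload balance messages falls through to the general "Remediate File Service" advice.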