vSAN Skyline health alert for Infrastructure Health > vSAN File Service Node is Unhealthy

Article ID: 407745

Products

VMware vSAN

Issue/Introduction

  • After host maintenance and a reboot, the vSAN Skyline health alert for Infrastructure Health > vSAN File Service Node is Unhealthy is triggered.
  • The vSAN File Service node fails to start on the host.
  • DRS migrates all VMs off the host, including VMs that were manually placed there.

Environment

vSAN with vSAN File Services in use

Cause

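The EAM (ESX Agent Manager) extension cannot authenticate to vCenter Server, so the vSAN File Service cannot check the state of the file service VM (FSVM).

The vsan-mgmt log shows the file services error: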
2025-08-13T13:53:22.622Z ERROR vsan-mgmt[109721] [VsanClusterFileServiceSystemImpl::_CheckFSVMState opID=########] Failed to check FSVM state
Traceback (most recent call last):
....<truncated log messages>
   msg = 'EAM is still loading from database. Please try again later.',

The EAM logs show:
2025-08-13T02:25:46.527Z | ERROR | vim-monitor | VcListener.java | 124 | An unexpected error in the changes polling loop
com.vmware.eam.EamRemoteSystemException: Unexpected error communicating with the vCenter server.
....<truncated log messages>
Caused by: com.vmware.vim.binding.vim.fault.NotAuthenticated: The session is not authenticated.
2025-08-13T02:25:46.531Z |  INFO | vim-monitor | VcListener.java | 125 | Full stack trace: com.vmware.eam.EamRemoteSystemException: Unexpected error communicating with the vCenter server.
....<truncated log messages>
Caused by: (vim.fault.NotAuthenticated) {
....<truncated log messages>
2025-08-13T02:25:56.535Z |  INFO | vim-monitor | ExtensionSessionRenewer.java | 190 | [Retry:Login:com.vmware.vim.eam:182c2ab8bf6bf28] Re-login to vCenter because method: currentTime of managed object: null::ServiceInstance:ServiceInstance failed due to expired client session: null
2025-08-13T02:25:56.536Z |  INFO | vim-monitor | OpId.java | 37 | [vim:loginExtensionByCertificate:################] created from [Retry:Login:com.vmware.vim.eam:################]
2025-08-13T02:25:59.542Z |  INFO | vim-async-2 | OpIdLogger.java | 43 | [vim:loginExtensionByCertificate:################] Failed.
2025-08-13T02:25:59.542Z |  WARN | vim-async-2 | ExtensionSessionRenewer.java | 227 | [Retry:Login:com.vmware.vim.eam:182c2ab8bf6bf28] Re-login failed, due to:
com.vmware.eam.security.NotAuthenticated: Failed to authenticate extension com.vmware.vim.eam to vCenter.
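
To confirm that a log matches this signature, the strings quoted above can be searched for directly. The following is a minimal sketch, not an official tool: the function name check_eam_auth_failure is hypothetical, and the log file path is supplied by the caller because this article does not state where the logs are stored.

```shell
#!/bin/sh
# Hypothetical helper: reports whether a log file contains either of the
# failure messages quoted in this article.
# Usage: check_eam_auth_failure <path-to-log-file>
check_eam_auth_failure() {
    log_file="$1"
    if [ ! -r "$log_file" ]; then
        echo "cannot read: $log_file" >&2
        return 2
    fi
    # -F treats the patterns as fixed strings; -q only sets the exit status.
    if grep -qF \
        -e 'Failed to authenticate extension com.vmware.vim.eam to vCenter' \
        -e 'EAM is still loading from database' \
        "$log_file"; then
        echo "signature found"
        return 0
    fi
    echo "signature not found"
    return 1
}
```

The function returns 0 when either message is present, 1 when neither is present, and 2 when the file cannot be read, so it can be used in a condition such as `check_eam_auth_failure /path/to/log && echo "apply the resolution below"`.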

Resolution

Follow the Resolution section of the article "Service fails to start after replacing vCenter Server certificates":

  1. Log in to the vCenter Server Appliance using SSH
  2. Enable access to the Bash shell by typing:
    shell.set --enabled true
  3. Type shell and press Enter
  4. Run these commands to retrieve the vpxd-extension solution user certificate and key:
    mkdir /certificate
    /usr/lib/vmware-vmafd/bin/vecs-cli entry getcert --store vpxd-extension --alias vpxd-extension --output /certificate/vpxd-extension.crt
    /usr/lib/vmware-vmafd/bin/vecs-cli entry getkey --store vpxd-extension --alias vpxd-extension --output /certificate/vpxd-extension.key
  5. Run this command to update the EAM extension's certificate in vCenter Server:
    python /usr/lib/vmware-vpx/scripts/updateExtensionCertInVC.py -e com.vmware.vim.eam -c /certificate/vpxd-extension.crt -k /certificate/vpxd-extension.key -s localhost -u administrator@vsphere.local
    Note: If this produces the error "Hostname mismatch, certificate is not valid for 'localhost'", change 'localhost' to the FQDN or IP address of the vCenter Server. The script checks this value against the Subject Alternative Name (SAN) entries of the certificate.
    Note: The default user and domain is administrator@vsphere.local. If this was changed during configuration, change the domain to match your environment. When prompted, type in the administrator@vsphere.local password.
  6. Restart the EAM and WCP services with these commands:
    service-control --restart eam
    service-control --restart wcp
    Note: To restart all vCenter Server services instead, run the following command:
    service-control --stop --all && service-control --start --all
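
After the restart, it can help to confirm that EAM is actually running. The following is a minimal sketch, assuming that `service-control --status` prints a heading line such as `Running:` or `Stopped:` followed by the service names in that state. That output layout and the helper name is_service_running are assumptions and are not taken from this article.

```shell
#!/bin/sh
# Hypothetical helper: reads service status output on stdin and succeeds
# only if the named service is listed under the "Running:" heading.
# Assumed layout: a heading line ending in ":" followed by lines of
# whitespace-separated service names.
# Usage: service-control --status | is_service_running eam
is_service_running() {
    awk -v svc="$1" '
        # A line that is a single word ending in ":" starts a new section.
        /^[A-Za-z]+:[ \t]*$/ { section = $1; next }
        section == "Running:" {
            for (i = 1; i <= NF; i++) if ($i == svc) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

For example, `service-control --status | is_service_running eam && echo "eam is running"` prints the message only when eam appears in the running list. If the actual output layout differs in your vCenter version, adjust the heading match accordingly.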