vSAN Infrastructure health reports 'File server not found' on a particular cluster node, shows error: "vSAN File Service Node cannot be found on this host"

Article ID: 372631

Updated On:

Products

VMware vSAN

Issue/Introduction

  • The ESXi host is running 7.0.x and vCenter Server has been upgraded to 8.0.x.
  • The vSAN infrastructure health check reports that the file server cannot be found on a specific node in the cluster.
  • Maintenance was performed on the affected node before the issue appeared.
  • Remediating vSAN File Service fails.
  • Restarting the fsvmsockrelay service fails with the following error:

    /etc/init.d/fsvmsockrelay restart
    sockrelay is not running
    No fsvm-sockrelay resource pool found
    vSAN File Service Node cannot be found on this host
    sockrelay is not running

  • "Install agent" tasks initiated by EAM appear in vCenter and fail with the error 'Unable to access agent OVF package file'.
  • The /var/log/vmware/eam/eam.log file on the vCenter Server shows the following:

    2024-07-15T09:59:50.564Z |  WARN | vlsi | Workflow.java | 156 | [OvfValidator->Validate:http://localhost:1080/external-tp/http1/hostname.domain.com/443/e6714aaec7d3ffef1e34cd0c8e2621fe67410cff/vsanHealth/fileService/ovf/7.0.3.1000-20036589/VMware-vSAN-File-Services-Appliance-7.0.3.1000-20036589_OVF10.ovf:d6568d69a9a44e0b] NEXT WORK ITEM : Failed to instantiate
    com.vmware.eam.exception.CannotAccessOVF: Cannot access OVF at http://localhost:1080/external-tp/http1/hostname.domain.com/443/e6714aaec7d3ffef1e34cd0c8e2621fe67410cff/vsanHealth/fileService/ovf/7.0.3.1000-20036589/VMware-vSAN-File-Services-Appliance-7.0.3.1000-20036589_OVF10.ovf
          at com.vmware.eam.agency.impl.OvfDownloader.downloadInternal(OvfDownloader.java:88) ~[eam-server.jar:?]
          at com.vmware.eam.agency.impl.OvfDownloader.download(OvfDownloader.java:65) ~[eam-server.jar:?]
          at com.vmware.eam.agency.impl.OVFs.toOvfInfo(OVFs.java:122) ~[eam-server.jar:?]
          at com.vmware.eam.agency.impl.OVFs.getInternal(OVFs.java:77) ~[eam-server.jar:?]
          at com.vmware.eam.agency.impl.OVFs.get(OVFs.java:66) ~[eam-server.jar:?]
          at com.vmware.eam.agency.impl.OvfValidator.lambda$validate$0(OvfValidator.java:81) ~[eam-server.jar:?]
          at com.vmware.eam.async.workflow.impl.CancellableWorkItemProvider.provide(CancellableWorkItemProvider.java:101) ~[eam-server.jar:?]

  • The /var/log/vmware/vsan-health/vmware-vsan-health-service.log file on the vCenter Server shows that the directory that previously held the OVF files no longer exists:

    2024-07-15T09:59:50.548Z ERROR vsan-mgmt[12761] [VsanHttpProvider::doGet opID=noOpId] Looking for non-existing path /storage/vsan-health/../updatemgr/vsan/fileService/ovf-7.0.3.1000-20036589/VMware-vSAN-File-Services-Appliance-7.0.3.1000-20036589_OVF10.ovf, return 404
    2024-07-15T09:59:50.549Z INFO vsan-mgmt[12761] [VsanMgmtServer::log_message opID=noOpId] ('127.0.0.1', 52490) - - "GET /vsanHealth/fileService/ovf/7.0.3.1000-20036589/VMware-vSAN-File-Services-Appliance-7.0.3.1000-20036589_OVF10.ovf HTTP/1.1" 404 -
    2024-07-15T09:59:50.556Z INFO vsan-mgmt[12761] [VsanMgmtServer::log_message opID=noOpId] ('127.0.0.1', 52490) - - "HEAD /vsanHealth/fileService/ovf/7.0.3.1000-20036589/VMware-vSAN-File-Services-Appliance-7.0.3.1000-20036589_OVF10.ovf HTTP/1.1" 200 -
    2024-07-15T09:59:50.563Z ERROR vsan-mgmt[12761] [VsanHttpProvider::doGet opID=noOpId] Looking for non-existing path /storage/vsan-health/../updatemgr/vsan/fileService/ovf-7.0.3.1000-20036589/VMware-vSAN-File-Services-Appliance-7.0.3.1000-20036589_OVF10.ovf, return 404
    2024-07-15T09:59:50.563Z INFO vsan-mgmt[12761] [VsanMgmtServer::log_message opID=noOpId] ('127.0.0.1', 52490) - - "GET /vsanHealth/fileService/ovf/7.0.3.1000-20036589/VMware-vSAN-File-Services-Appliance-7.0.3.1000-20036589_OVF10.ovf HTTP/1.1" 404 -
    2024-07-15T09:59:50.572Z ERROR vsan-mgmt[09194] [VsanClusterFileServiceSystemImpl::_RemediateClusterFileServiceTask opID=77504a72-W3314] Exception happened in deploying OVF in cluster 'vim.ClusterComputeResource:domain-c8'
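
The missing directory can be pulled out of the health service log instead of being read by eye. The following is a minimal sketch that assumes the "Looking for non-existing path ..., return 404" message format shown in the excerpt above. The function name `missing_ovf_dirs` is hypothetical and is not part of any VMware tooling.

```shell
#!/bin/sh
# Hypothetical helper: print each unique directory that the vSAN health
# service reported as a non-existing OVF path.
# Assumes the log line format shown in the excerpt above.
missing_ovf_dirs() {
    # $1: path to vmware-vsan-health-service.log
    # Capture everything between "non-existing path " and the final
    # "/<file>.ovf, return 404", i.e. the directory part of the path.
    sed -n 's|.*Looking for non-existing path \(.*\)/[^/]*\.ovf, return 404.*|\1|p' "$1" | sort -u
}

# Example (run on the vCenter Server):
# missing_ovf_dirs /var/log/vmware/vsan-health/vmware-vsan-health-service.log
```

Repeated 404 entries for the same OVF collapse into a single line of output, so the result is the list of directories that need to be recreated.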

     

 

Environment

VMware vSAN 7.x
VMware vSAN 8.x

Cause

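This issue occurs because the OVF for the FSVM version currently deployed in the cluster is not retained by vCenter Server when it is upgraded from 7.x to 8.x. As a result, EAM cannot access the OVF package when it tries to deploy the FSVM on the affected host.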
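
One way to confirm this cause is to check, on the vCenter Server, whether the OVF directory named in the health service log still contains an .ovf file. This is a minimal sketch only; the function name `fsvm_ovf_present` is hypothetical, and the directory to pass in is whatever path your own log reports.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given directory holds at least one
# .ovf file, fail otherwise (including when the directory is missing).
fsvm_ovf_present() {
    # $1: OVF directory expected by the vSAN health service
    for f in "$1"/*.ovf; do
        # An unmatched glob stays literal, so test that the file exists.
        [ -f "$f" ] && return 0
    done
    return 1
}

# Example, using the path from this article:
# if fsvm_ovf_present "/storage/vsan-health/../updatemgr/vsan/fileService/ovf-7.0.3.1000-20036589"; then
#     echo "OVF present"
# else
#     echo "OVF missing"
# fi
```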

 

Resolution

  • Download the OVF for the required FSVM version (7.0.3.1000-20036589 in this example) again from the Broadcom portal.
  • On the vCenter Server, create the missing directory path identified in vmware-vsan-health-service.log.

    For example, in this instance the following path had to be created:

    /storage/vsan-health/../updatemgr/vsan/fileService/ovf-7.0.3.1000-20036589/

  • Copy the downloaded OVF files to the newly created path.
  • Remediate the file service from Skyline Health. This allows EAM to deploy the FSVM.
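
The directory creation and copy steps above can be sketched as a small shell function run on the vCenter Server. This is an illustration under stated assumptions, not a supported procedure: the function name `stage_fsvm_ovf` and the download location `/tmp/fsvm-ovf` are hypothetical, and the article does not address file ownership or permissions, so the sketch leaves them as copied.

```shell
#!/bin/sh
# Hypothetical helper: recreate the missing OVF directory and copy the
# downloaded FSVM files into it.
stage_fsvm_ovf() {
    src_dir=$1    # directory holding the downloaded OVF files
    dest_dir=$2   # missing path reported in vmware-vsan-health-service.log

    # Refuse to continue if there is nothing to copy.
    [ -d "$src_dir" ] || { echo "source directory not found: $src_dir" >&2; return 1; }

    # Create the missing path, including any missing parent directories.
    mkdir -p "$dest_dir" || return 1

    # Copy every file from the download directory, preserving timestamps and modes.
    cp -p "$src_dir"/* "$dest_dir"/ || return 1

    # List the result so the copied files can be checked before remediation.
    ls -l "$dest_dir"
}

# Example, using the path from this article and an assumed download location:
# stage_fsvm_ovf /tmp/fsvm-ovf "/storage/vsan-health/../updatemgr/vsan/fileService/ovf-7.0.3.1000-20036589/"
```

After the files are in place, the remediation from Skyline Health is still performed in the vSphere Client as described above.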