"File server creation failed due to unknown reason" error during vSAN File Services enabling.

Article ID: 372903

Updated On: 04-22-2025

Products

VMware vSAN

Issue/Introduction

Symptoms:

  • When enabling vSAN File Services, the process stalls at around 89% and fails with the error 'Cannot complete the operation. See the event log for details. File server creation failed due to unknown reason. Contact Broadcom Support for more information.'

  • vSAN Skyline Health reports multiple health check alerts indicating a thumbprint issue on the host.

Environment

VMware vSAN 7.x

VMware vSAN 8.x

Cause

  • One or more certificates on the hosts are invalid and need to be corrected.

  • The vSAN File Services enablement process creates the vSAN FS VMs for the hosts and creates containers on those FS VMs.

/scratch/log/vdfs_support/containers/fsvm_logs/journal:Jun 12 17:16:08 localhost vsfs-xxxxxxxxxxxxxx[1401]: [MainThread] Changing container state: container_init_succeeded -> container_start_succeeded
/scratch/log/vdfs_support/containers/fsvm_logs/journal:Jun 12 17:16:08 localhost vsfs-xxxxxxxxxxxxxx[1406]: [MainThread] Changing container state: container_init_succeeded -> container_start_succeeded
/scratch/log/vdfs_support/containers/fsvm_logs/journal:Jun 12 17:16:08 localhost vsfs-xxxxxxxxxxxxxx[1407]: [MainThread] Changing container state: container_init_succeeded -> container_start_succeeded

  • FS domain creation also starts.

2024-06-12T17:15:49.560Z info vsand[6219358] [opID=facd59b4-W99-7a2c VsanFileServiceSystemImpl::CreateFileServiceDomain] Creating the domain on the RootFS ...

  • The process eventually times out when calling back to the hosts because it fails to connect to one or more of them. The following error appears in the vsanmgmt logs.

2024-06-11T18:08:45.440Z error vsand[2104920] [opID=W3301037-W3301038 VsanVimHelpers::GetVsanVersionNamespace] Failed to test vsan vmodl version with error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131) on 10.xx.xx.xx
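
To see which hosts the callback failed against, the error lines can be filtered for the address that follows the final "on". The snippet below is a minimal sketch, not a Broadcom-provided tool: `failed_cert_peers` is a hypothetical helper name, and it assumes the log lines have the same shape as the example above. The log file location is not given in this article, so it is passed in as an argument.

```shell
# Hypothetical helper: list the distinct peers named in
# CERTIFICATE_VERIFY_FAILED entries of a vsanmgmt log file.
# $1: path to the log file (assumption: supplied by the caller).
failed_cert_peers() {
    grep 'CERTIFICATE_VERIFY_FAILED' "$1" \
        | sed -n 's/.* on \([^ ]*\)$/\1/p' \
        | sort -u
}
```

Calling `failed_cert_peers <path-to-vsanmgmt-log>` prints one address per line. These are the hosts whose certificates should be checked first.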

 

Resolution

  • Run the following command on all hosts. The check should return an OK state.

openssl verify -purpose sslclient -CAfile /etc/vmware/ssl/castore.pem /etc/vmware/ssl/rui.crt

Example of healthy cert return:

[root@vsan-host:~] openssl verify -purpose sslclient -CAfile /etc/vmware/ssl/castore.pem /etc/vmware/ssl/rui.crt
/etc/vmware/ssl/rui.crt: OK

Example of an unhealthy cert return:

[root@vsan-host:~] openssl verify -purpose sslclient -CAfile /etc/vmware/ssl/castore.pem /etc/vmware/ssl/rui.crt
/etc/vmware/ssl/rui.crt: C = country, ST = state, L = location, O = O, OU = OU, CN = vsan-host
error 20 at 0 depth lookup:unable to get local issuer certificate
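
When many hosts need checking, the two outcomes above can be reduced to a single word per host. The following is a minimal sketch under stated assumptions: `classify_verify_output` and `check_local_cert` are hypothetical helper names, and the classification relies only on the ": OK" suffix shown in the healthy example.

```shell
# Hypothetical helper: classify the text printed by `openssl verify`.
# Reads the command output on stdin; prints "healthy" if any line ends
# in ": OK" (as in the healthy example), otherwise prints "unhealthy".
classify_verify_output() {
    if grep -q ': OK$'; then
        echo healthy
    else
        echo unhealthy
    fi
}

# Run the check from this article on the local host and classify it.
# stderr is merged in so error lines are not lost from the pipeline.
check_local_cert() {
    openssl verify -purpose sslclient \
        -CAfile /etc/vmware/ssl/castore.pem /etc/vmware/ssl/rui.crt 2>&1 \
        | classify_verify_output
}
```

Running `check_local_cert` on each host gives a quick list of which hosts need their certificates renewed or reissued.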

  • If using self-signed certs, follow the article below to renew and refresh them:

Regenerate vSphere 6.x, 7.x, and 8.0 certificates using self-signed VMCA

  • If using custom certs, the certs will need to be reissued to the hosts that did not return an OK state from the above openssl command.

    Note: If the hosts use custom certs and are attached to distributed switches, all alerts triggered in Skyline Health may point to a thumbprint issue on the host. In this case, place the vSAN node into maintenance mode and remove the host from inventory. Then add the host back to the vSAN cluster and re-add it to the distributed switch by following Add Hosts to the vSphere Distributed Switch.

  • If the above steps do not resolve the issue, the certificates will need further investigation.