"Too many open files" error and slow vSAN UI when vCenter Server manages a large number of vSAN hosts

Article ID: 326894

Products

VMware vSAN

Issue/Introduction

Symptoms:
When vCenter Server manages a large number of vSAN hosts (for example, about 1000 vSAN-enabled hosts), the vSAN UI might render data slowly.
 
  • You might see the following error in the vSAN health log on vCenter, /var/log/vmware/vsan-health/vmware-vsan-health-service.log:
    OSError: [Errno 24] Too many open files
 
  • The number of open file handles held by the vSAN management process might exceed 1000. To check, run the following command on vCenter:
    lsof -p $(cat /var/log/vmware/vsan-health/vmware-vsan-health.pid) | wc -l
  • In addition, if users browse multiple vSAN UI pages for these clusters and hosts, or add or remove hosts, a cluster partition might be triggered. vCenter might fail to push updated membership configuration to some hosts because it lacks the file handles needed to open new connections.
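
The lsof check above can also be approximated from Python by counting the entries under /proc/<pid>/fd. This is a minimal sketch and not part of the product: the function names, the proc_root parameter, and the 90% margin are assumptions made for illustration. The count is usually somewhat lower than the lsof figure, because lsof also prints a header line and lists entries such as memory-mapped libraries.

```python
import os

# Default limit on open file handles quoted in this article.
DEFAULT_LIMIT = 1024


def read_pid(pid_file):
    """Read a numeric PID from a PID file such as vmware-vsan-health.pid."""
    with open(pid_file) as handle:
        return int(handle.read().strip())


def count_open_fds(pid, proc_root="/proc"):
    """Count the entries in <proc_root>/<pid>/fd, one per open descriptor."""
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


def near_limit(count, limit=DEFAULT_LIMIT, margin=0.9):
    """Return True once count reaches margin * limit (margin is an assumed threshold)."""
    return count >= limit * margin
```

On the appliance, the PID would come from read_pid("/var/log/vmware/vsan-health/vmware-vsan-health.pid"), the same file the lsof command reads. Listing another process's descriptors generally requires root privileges.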


Environment

VMware vSAN 6.5.x
VMware vSAN 6.6.x
VMware vSAN 6.7.x

Cause

This issue occurs because the vSAN management server caches many socket connections, both to each host and to the local host. Each cached socket holds an open file handle, and the total exceeds the maximum number of open file handles allowed in vCenter (1024 by default).
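
The arithmetic behind the cause can be sketched as follows. The article does not state how many connections are cached per host, so conns_per_host and baseline are hypothetical inputs, and the function names are illustrative only. The resource module call reports the limit for the Python process that runs it, not for the vSAN management process.

```python
import resource


def soft_nofile_limit():
    """Return the soft limit on open file descriptors for the current process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


def estimated_descriptors(hosts, conns_per_host, baseline=0):
    """Rough estimate: one descriptor per cached socket, plus a baseline
    for log files and other handles (both inputs are assumptions)."""
    return hosts * conns_per_host + baseline


def exceeds(hosts, conns_per_host, limit=1024, baseline=0):
    """Return True when the estimate is above the limit (1024 by default)."""
    return estimated_descriptors(hosts, conns_per_host, baseline) > limit
```

Under these assumptions, 1000 hosts with even one cached connection each, plus a modest baseline, is enough to pass 1024.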

Resolution

  • This is a known issue in vCenter Server Appliance 6.7 Update 1 and 6.5 Update 2d.
  • For vCenter Server 6.7, the issue is resolved in VMware vCenter Server 6.7 Update 1b, Build 11726888.
  • For vCenter Server 6.5, the issue is resolved in VMware vCenter Server 6.5 Update 3.


Workaround:
Note: If you need assistance with this procedure, contact support.

To work around this issue:
  • Log in to vCenter Server.
  • Stop the service: service-control --stop vmware-vsan-health
  • Back up the Python file /usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanVcExtension.py.
  • Open /usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanVcExtension.py in the vi editor and locate the following two sections:
 
conn = SoapStubAdapterForLocalhost('localhost', 443,
                                      version='vim.version.version11', path='/sdk',
                                      soapStubAdapater=HostdSoapStubAdapter)

and
         conn = SoapStubAdapterForLocalhost('localhost', port,
                                        version=GetVcVmodlVersion(),
                                        path='/sdk',
                                        httpConnectionTimeout=timeout)

Add the line poolSize=0, (including the trailing comma) to each of these calls, so that they read as follows:
         
 conn = SoapStubAdapterForLocalhost('localhost', 443,
                                      version='vim.version.version11', path='/sdk',
                                      poolSize=0,
                                      soapStubAdapater=HostdSoapStubAdapter)

and
           conn = SoapStubAdapterForLocalhost('localhost', port,
                                        version=GetVcVmodlVersion(),
                                        path='/sdk',
                                        poolSize=0,
                                        httpConnectionTimeout=timeout)
  • Save the Python file.
  • Start the service: service-control --start vmware-vsan-health
  • Restart the vSAN management server to confirm that the service starts properly: /usr/lib/vmware-vmon/vmon-cli -r vsan-health
  • Verify that the number of open file handles for the vSAN management server stays below the limit (1024 by default):
    lsof -p $(cat /var/log/vmware/vsan-health/vmware-vsan-health.pid) | wc -l
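
After saving the file, the edit itself can be double-checked with a small read-only script. The sketch below is illustrative and its function names are hypothetical. It scans the source text for each SoapStubAdapterForLocalhost( call and reports the line numbers of calls that do not pass poolSize=0. The parenthesis matching is naive: it does not account for parentheses inside string literals, and it would also flag a def SoapStubAdapterForLocalhost( line if the file contained one.

```python
import re

CALL = "SoapStubAdapterForLocalhost("
POOL_SIZE_ZERO = re.compile(r"\bpoolSize\s*=\s*0\b")


def find_calls(source):
    """Yield (line_number, call_text) for each SoapStubAdapterForLocalhost(...) call."""
    start = source.find(CALL)
    while start != -1:
        depth = 0
        end = len(source)
        # Walk forward from the opening parenthesis until it is balanced.
        for i in range(start + len(CALL) - 1, len(source)):
            if source[i] == "(":
                depth += 1
            elif source[i] == ")":
                depth -= 1
                if depth == 0:
                    end = i + 1
                    break
        yield source.count("\n", 0, start) + 1, source[start:end]
        start = source.find(CALL, end)


def unpatched_calls(source):
    """Return the line numbers of calls that do not pass poolSize=0."""
    return [line for line, text in find_calls(source)
            if not POOL_SIZE_ZERO.search(text)]
```

To use it, read the contents of /usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanVcExtension.py into a string and pass it to unpatched_calls. An empty list suggests that every call found carries the new argument.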