vSAN File Service shares may become intermittently or permanently inaccessible. This behavior is often observed in environments with high file activity and high-concurrency workloads, such as large-scale document management systems or high-transaction SMB file shares.
Symptoms:
The attempted operation cannot be performed in the current state (Powered On).
The system allows for a maximum of 100 file shares.
Failed to query vSAN file service shares. VDFS datastore is not present.
Infrastructure Health | Category: File Service | Impact area: Availability
File Server Health | Category: File Service | Impact area: Availability
Environment:
VMware vSAN
vSAN File Services
File Service VMs
Protocol: SMB
This issue is caused by a memory allocation failure (Panic) within the vdfsd-proxy service on the ESXi hosts. When navigating to the /var/core directory on the affected hosts, vdfsd-proxy-zdumps files will be present.
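The core dump check described above can be scripted. The following is a minimal sketch, not a supported tool: the function name `find_proxy_zdumps` is hypothetical, and the file name pattern assumes the dumps start with `vdfsd-proxy-zdump`, so adjust it if the files on your hosts are named differently.

```python
import fnmatch
import os

# Assumption: core dump file names begin with "vdfsd-proxy-zdump".
ZDUMP_PATTERN = "vdfsd-proxy-zdump*"


def find_proxy_zdumps(core_dir="/var/core", pattern=ZDUMP_PATTERN):
    """Return the sorted paths in core_dir whose file name matches pattern."""
    try:
        names = os.listdir(core_dir)
    except FileNotFoundError:
        # No core directory at this path means there is nothing to report.
        return []
    return sorted(
        os.path.join(core_dir, name)
        for name in names
        if fnmatch.fnmatch(name, pattern)
    )


if __name__ == "__main__":
    for path in find_proxy_zdumps():
        print(path)
```

An empty result does not rule the issue out, since core files may have been rotated or removed.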
The issue stems from a known limitation in the 9p driver used for filesystem caching. Under specific high-load conditions, the driver fails to automatically free File Identifiers (FIDs), leading to a continuous increase in cached FID information within the vdfs proxy.
An existing management mechanism triggers a proxy cache cleanup at an 80 MB threshold. However, rapid spikes in memory demand can exceed the proxy memory limit before the cleanup process completes. This results in a service crash and subsequent FSVM reboots.
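The race between cache growth and cleanup can be illustrated with a toy model. This is only a sketch for intuition and does not reflect the actual vdfsd-proxy implementation: the 80 MB threshold comes from this article, while the growth rate, the memory limit, and the cleanup duration are hypothetical parameters chosen by the caller.

```python
CLEANUP_THRESHOLD_MB = 80  # cleanup trigger stated in this article


def simulate_proxy_cache(growth_mb_per_tick, limit_mb, cleanup_ticks, max_ticks=1000):
    """Toy model of cached FID growth versus cache cleanup.

    growth_mb_per_tick, limit_mb and cleanup_ticks are hypothetical inputs.
    Returns ("crash", tick) if the cache exceeds limit_mb before a cleanup
    finishes, ("cleaned", tick) if the cleanup finishes first, or
    ("idle", max_ticks) if the threshold is never reached.
    """
    cached = 0.0
    cleanup_remaining = None  # None means no cleanup has been triggered yet
    for tick in range(1, max_ticks + 1):
        cached += growth_mb_per_tick
        if cached > limit_mb:
            return ("crash", tick)
        if cleanup_remaining is None:
            if cached >= CLEANUP_THRESHOLD_MB:
                cleanup_remaining = cleanup_ticks  # cleanup starts now
        else:
            cleanup_remaining -= 1
            if cleanup_remaining <= 0:
                return ("cleaned", tick)
    return ("idle", max_ticks)
```

With a hypothetical 100 MB limit and a cleanup that needs 5 ticks, slow growth of 1 MB per tick lets the cleanup finish, while a spike of 10 MB per tick crosses the limit first.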
There is no permanent fix available at this time. This issue is under investigation by Broadcom Engineering.
Workaround:
Restore access to the file shares by performing a rolling reboot of all the ESXi hosts in the affected vSAN cluster, using the Ensure Accessibility mode.
Once the hosts return from reboot, collect a log bundle from the ESXi hosts in the affected vSAN cluster and from vCenter, then file a case with Broadcom Support.
Note for encrypted vSAN environments: If encryption is enabled in vSAN, the log bundle must be collected using a password to ensure core dumps can be decrypted for analysis.
Per Article 319493, Step 3: Select the Password for encrypted core dumps option and specify a password. This password must be shared with the Broadcom Support Engineer assigned to the case.
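The rolling reboot in the workaround above can be sketched as a loop that handles one host at a time and stops at the first host that does not come back healthy. This is a sketch under stated assumptions, not a Broadcom-provided procedure: `rolling_reboot` and its four callables are hypothetical placeholders that you would implement with your own tooling, and the mode label must be translated to whatever value that tooling expects.

```python
# Label as used in this article; your tooling may expect a different value.
MAINTENANCE_MODE = "Ensure Accessibility"


def rolling_reboot(hosts, enter_maintenance, reboot, wait_until_healthy, exit_maintenance):
    """Reboot hosts sequentially and return the list of completed hosts.

    All four callables are hypothetical hooks supplied by the caller:
      enter_maintenance(host, mode), reboot(host),
      wait_until_healthy(host) -> bool, exit_maintenance(host).
    """
    done = []
    for host in hosts:
        enter_maintenance(host, MAINTENANCE_MODE)
        reboot(host)
        if not wait_until_healthy(host):
            # Never move on to the next host while one is still unhealthy.
            raise RuntimeError(
                f"{host} did not return healthy; completed so far: {done}"
            )
        exit_maintenance(host)
        done.append(host)
    return done
```

Processing hosts strictly one after another keeps at most one host out of the cluster at any time, which is the point of a rolling reboot.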