FSVM Loses Network Connectivity After ESXi Host Reboot or Exiting Maintenance Mode

Article ID: 416989

Products

VMware vSAN

Issue/Introduction

Symptoms: 

  • Network disruption: After a host reboot or after the host exits maintenance mode, the vNIC of the File Server Virtual Machine (FSVM) is disconnected.

  • Client access failure: Shares become inaccessible to clients when containers or file shares fail over to the affected FSVMs.

  • The FSVM's vmware.log (/vmfs/volumes/vsan:5#############8-21#############a7/7a######-4##9-a##c-4##e-84########0/vmware.log) shows "Fail to connect dvport":

    YYYY-MM-DDTHH:MM:SS In(05) vmx - DataSetsMgr_PowerOn: hwversion too low
    YYYY-MM-DDTHH:MM:SS In(05) vmx - Fail to connect dvport for 'ethernet0'.
  • Live host: Run the command esxcfg-vswitch -l on the ESXi host. If the output has no entry for the FSVM, the issue is confirmed.
  • ESXi host support bundle: /commands/esxcfg-vswitch_-l.txt does not show an entry for vSAN File Service Node (#).eth0, as in this example:

DVS Name         Num Ports   Used Ports  Configured Ports  MTU     Uplinks···
VMware Distributed Switch    9216        46          512               9000    vmnic#,vmnic#
  DVPort ID                               In Use      Client
  62                                      1           vmnic#
  63                                      1           vmnic#
  66                                      1           vmk#
  19                                      1           vmk#
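The check described above can be scripted against saved `esxcfg-vswitch -l` output, for example the copy in a support bundle. The following is a minimal sketch, not an official tool: the function names are hypothetical, and the pattern assumes the FSVM client appears as "vSAN File Service Node (N).eth0", as described in this article.

```python
import re

# Assumption: the FSVM port shows up with a client name such as
# "vSAN File Service Node (1).eth0" (format taken from this article).
FSVM_CLIENT = re.compile(r"vSAN File Service Node \(\d+\)\.eth\d+")


def fsvm_ports(vswitch_output: str) -> list[str]:
    """Return every FSVM client name found in `esxcfg-vswitch -l` output."""
    return FSVM_CLIENT.findall(vswitch_output)


def fsvm_port_missing(vswitch_output: str) -> bool:
    """True when no FSVM client is listed, which matches the symptom above."""
    return not fsvm_ports(vswitch_output)
```

To use it, read the text of /commands/esxcfg-vswitch_-l.txt (or the live command output) into a string and pass it to `fsvm_port_missing`.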

Environment

VMware vSAN 8.x 

Cause

  • A timing/race condition occurs during the ESXi host's start-up sequence where vCenter attempts to apply the vDS configuration (including the FSVM's dvPort settings) before the FSVM's vSAN namespace object is fully accessible on the vSAN datastore. The inaccessible object prevents the vDS operation from completing successfully, leading to an incomplete or failed network configuration for the FSVM.

  • /var/run/log/hostd.log shows the following events:

YYYY-MM-DDTHH:MM:SS In(166) Hostd[2108605]: [Originator@6876 sub=Vimsvc.TaskManager opID=5c3d6441-74dd sid=520254df user=vpxuser:<no user>] Task Created : haTask--vim.dvs.HostDistributedVirtualSwitchManager.applyDVPort-2###2

YYYY-MM-DDTHH:MM:SS Wa(164) Hostd[2108597]: [Originator@6876 sub=Hostsvc.NetworkProvider opID=5c3d6441-74dd sid=520254df user=vpxuser:<no user>] Skip saving  dvport DSwitchCluster0-1#4 to /vmfs/volumes/vsan:5##############3-f##############a/e0#####6-e##b-b##7-8##0-5#########20/.dvsData/50 ## 84 ## 90## 68 ##-29 ## 9e ## 8d ## b1 ##/1#4: failed to create dir

YYYY-MM-DDTHH:MM:SS In(166) Hostd[2108597]: [Originator@6876 sub=Hostsvc.NetworkProvider opID=5c3d6441-74dd sid=520254df user=vpxuser:<no user>] deleted [primary] port file /vmfs/volumes/vsan:5##############3-f##############a/e0#####6-e##b-b##7-8##0-5#########20/.dvsData/50 ## 84 ## 90## 68 ##-29 ## 9e ## 8d ## b1 ##/1#4
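To find these events in a large hostd.log, the warning line can be matched with a regular expression. This is a minimal sketch that assumes the message wording is exactly as in the excerpt above ("Skip saving dvport ... failed to create dir"); the function name is hypothetical.

```python
import re

# Matches the hostd warning quoted above. \s+ tolerates the double space
# that appears between "saving" and "dvport" in the log excerpt.
SKIP_SAVE = re.compile(
    r"Skip saving\s+dvport\s+(\S+)\s+to\s+(.+?): failed to create dir"
)


def failed_dvport_saves(log_lines):
    """Return (dvport name, port file path) for each failed dvport save."""
    hits = []
    for line in log_lines:
        match = SKIP_SAVE.search(line)
        if match:
            hits.append((match.group(1), match.group(2)))
    return hits
```

For example, `failed_dvport_saves(open("hostd.log", errors="replace"))` lists the affected dvPorts along with the vSAN namespace path that could not be written.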

 
 

Resolution

This issue has been addressed in vSphere VCF 9.0. The fix will be backported to upcoming versions of vSphere ESXi 8.0.

Workaround 

While rebooting a host or exiting maintenance mode, monitor the FSVM network connection status. If the network adapter becomes disconnected, reconnect it manually.
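The monitoring step could be partly automated. The sketch below is an illustration under assumptions, not a supported procedure. It expects a list of virtual device objects shaped like those pyVmomi returns in `vm.config.hardware.device`, where each device has `deviceInfo.label` and `connectable.connected`. It also assumes that only Ethernet cards carry a `macAddress` attribute. The function name is hypothetical, and the connection to vCenter is left out.

```python
def disconnected_nics(devices):
    """Return the labels of network adapters that are not connected."""
    labels = []
    for dev in devices:
        # Assumption: only virtual Ethernet cards expose macAddress.
        if not hasattr(dev, "macAddress"):
            continue
        connectable = getattr(dev, "connectable", None)
        if connectable is not None and not connectable.connected:
            labels.append(dev.deviceInfo.label)
    return labels
```

If the returned list is not empty for an FSVM after the host comes back, reconnect the listed adapter as described in the workaround.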