ESXi host in a vSAN cluster shows 'Not responding' in the vSphere Client and multiple vSAN objects are inaccessible


Article ID: 426511


Products

VMware vSAN

Issue/Introduction

Symptoms:

  • An ESXi host that is part of a vSAN OSA cluster reports a status of "Not Responding" in vCenter Server.

  • The host remains reachable via ping, SSH, and the ESXi Host Client, even while showing the state "Not responding" in vCenter Server.

  • A significant number of virtual objects enter Inaccessible or Reduced Availability states. The count of impacted objects fluctuates or increases over time.

    The status of the virtual objects can be verified using the below command (localcli is used instead of esxcli because it bypasses the degraded hostd agent):

    localcli vsan debug object health summary get
    Health Status                                              Number Of Objects
    ---------------------------------------------------------  -----------------
    remoteAccessible                                                           0
    inaccessible                                                              13
    reduced-availability-with-no-rebuild                                       0
    reduced-availability-with-no-rebuild-delay-timer                           0
    reducedavailabilitywithpolicypending                                       0
    reducedavailabilitywithpolicypendingfailed                                 0
    reduced-availability-with-active-rebuild                                   4
    reducedavailabilitywithpausedrebuild                                       0
    data-move                                                                  0
    nonavailability-related-reconfig                                           0
    nonavailabilityrelatedincompliancewithpolicypending                        0
    nonavailabilityrelatedincompliancewithpolicypendingfailed                  0
    nonavailability-related-incompliance                                       0
    nonavailabilityrelatedincompliancewithpausedrebuild                        0
    healthy                                                                  235
  • The host entered the "Not responding" state after new workloads were migrated to the cluster, increasing network traffic.

  • In vSAN Skyline Health, accessible via vCenter Server > vSAN Cluster > Monitor > vSAN > Skyline Health, the below errors are observed:

Environment

VMware vSAN 8.x

Cause

The issue is caused by an MTU mismatch within the ESXi networking stack. While the VMkernel interface used for vSAN, Management, and vMotion was set to jumbo frames (MTU 9000), the underlying virtual switch was configured with a standard MTU of 1500.

This mismatch caused large vSAN packets to be dropped. The resulting network instability led to repeated "Lost access to volume" events. The high volume of storage and network errors caused hostd (the ESXi management agent) to suffer severe performance degradation, eventually preventing it from communicating with vCenter Server.
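The mismatch can be confirmed from the ESXi shell by comparing the MTU of the VMkernel adapters against the MTU of the virtual switches. The commands below are a quick check for standard vSwitches; the third command lists any vSphere Distributed Switch, whose MTU is changed in vCenter Server rather than on the host:

    # MTU of each VMkernel adapter (set to 9000 in this scenario)
    esxcli network ip interface list | grep -E "^vmk|MTU"

    # MTU of each standard vSwitch (left at 1500 in this scenario)
    esxcli network vswitch standard list | grep -E "^[^ ]|MTU"

    # MTU of any distributed switches
    esxcli network vswitch dvs vmware list | grep -E "^[^ ]|MTU"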

Cause Validation:

  • vmkping between the ESXi hosts with a jumbo-frame-sized payload fails with 100% packet loss. The -d flag prevents fragmentation, so the full 9000-byte frame is tested end to end:

    vmkping -I vmk# -d -s 8972 192.#.#.#
    PING 192.#.#.# (192.#.#.#): 8972 data bytes

    --- 192.#.#.# ping statistics ---
    3 packets transmitted, 0 packets received, 100% packet loss

  • Reviewing /var/run/log/hostd.log shows hostd performance degradation events and "Lost access to volume" events for vSAN objects:

    2026-01-17T12:33:20.388Z In(166) Hostd[81569040]: [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 581818 : Issue detected on <Host IP/fqdn> in ha-datacenter: hostd performance has degraded due to high system latency
    ................................
    2026-01-20T07:40:58.753Z In(166) Hostd[2104714]: [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 586575 : Issue detected on <Host IP/fqdn> in ha-datacenter: hostd performance has degraded due to high system latency

    2026-01-20T07:42:01.174Z In(166) Hostd[81479861]: [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 586576 : Lost access to volume 694cf483-########-####-############ (83f44c69-####-####-####-############) due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly.
    2026-01-20T07:42:56.129Z In(166) Hostd[2104714]: [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 586577 : Lost access to volume 695117df-########-####-############ (df175169-####-####-####-############) due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly.
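
    To gauge how widespread these events are, the same log can be searched directly on the host (log path as shown above):

    # Count hostd performance-degradation events
    grep -c "hostd performance has degraded" /var/run/log/hostd.log

    # Show the most recent lost-volume events
    grep "Lost access to volume" /var/run/log/hostd.log | tail -5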

 

Resolution

To resolve this issue, the MTU settings must be consistent across the entire network path.

  • Navigate to the ESXi Host > Configure > Networking > VMkernel adapters.

  • Select the impacted VMkernel adapter.

  • Click Edit > NIC Settings.

  • Change the MTU to 1500.

  • Click OK.

Note: If Jumbo Frames are required for performance, the MTU must be set to 9000 on the vmkernel, the virtual switch, and the physical network switch ports.
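
As a sketch, the same change can be made from the ESXi shell. vmk1 and vSwitch0 are example names; substitute the vSAN VMkernel adapter and virtual switch used in your environment:

    # Option A: drop the VMkernel adapter back to the vSwitch MTU
    esxcli network ip interface set -i vmk1 -m 1500

    # Option B: enable jumbo frames end to end instead
    # (the physical switch ports must also allow MTU 9000)
    esxcli network vswitch standard set -v vSwitch0 -m 9000
    esxcli network ip interface set -i vmk1 -m 9000

    # Verify, then retest with a don't-fragment jumbo ping
    esxcli network ip interface list
    vmkping -I vmk1 -d -s 8972 192.#.#.#

For a vSphere Distributed Switch, the MTU is changed in vCenter Server under the switch's Edit Settings > Advanced page rather than on the host.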