ESXi Host "Not Responding" due to Management Network disconnect

Article ID: 422899

Products

VMware vSphere ESXi

Issue/Introduction

  • Total loss of connectivity to the host management IP.
  • Virtual machines running on the host lose network connectivity.
  • The host may automatically recover after several minutes, only to flap back to a disconnected state later.
  • During the failure, the host cannot ping its default gateway or other hosts within the same cluster.
  • While the issue is active, esxtop network statistics viewed from the console (DCUI) show an unusually high volume of traffic passing through vmk0 (the management VMkernel adapter).
  • A packet capture on the active physical uplink shows a high volume of traffic originating from storage-related IP addresses.
  • Management network and virtual machines are placed on the same virtual switch and use a single active uplink.
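
The packet-capture symptom above can be checked with a short summary of source addresses. The following is a minimal sketch, not a supported tool: top_sources is a hypothetical helper, and the uplink name (vmnic0) and capture path (/tmp/vmnic0.pcap) are assumptions to replace with your own values. It assumes the capture is rendered as plain tcpdump text without link-layer headers; VLAN-tagged or non-IPv4 frames may print differently and would be skipped.

```shell
# Hypothetical helper: summarise source IPv4 addresses from tcpdump-style
# text on stdin, busiest first. Expects lines such as:
#   12:00:01.000001 IP 10.0.0.5.3260 > 10.0.0.1.51234: Flags [P.], ...
top_sources() {
    awk '$2 == "IP" {
             # Field 3 is "a.b.c.d.port" (or "a.b.c.d" for ICMP); keep a.b.c.d
             n = split($3, p, ".")
             if (n >= 4) print p[1] "." p[2] "." p[3] "." p[4]
         }' |
        sort | uniq -c | sort -rn | head -n "${1:-5}"
}

# Example on the host (assumed uplink and file names):
#   pktcap-uw --uplink vmnic0 -c 10000 -o /tmp/vmnic0.pcap
#   tcpdump-uw -nr /tmp/vmnic0.pcap | top_sources 5
```

If one or two storage array or NFS server addresses dominate the list, that is consistent with the symptom described above.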

Environment

VMware vSphere ESXi

Cause

The ESXi host becomes unresponsive because the Management VMkernel adapter (vmk0) is saturated by storage-related I/O traffic.

When storage traffic is misconfigured or fails over to the management network, it consumes the available bandwidth of the physical uplink. This creates a denial-of-service condition for management heartbeats, causing vCenter to mark the host as "Not Responding."

Because the management network and the virtual machines share the same uplink, the exhausted bandwidth eventually causes the virtual machines to lose network connectivity as well.

Resolution

Review the storage configuration to ensure that storage traffic is pinned exclusively to the dedicated storage VMkernel adapters and cannot fall back to vmk0.
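
One common way storage traffic ends up on vmk0 is a storage target that sits in the management subnet, or that is reachable only through the default gateway. The sketch below is an illustration under that assumption, not a statement of the root cause in every environment: same_subnet is a hypothetical helper, all addresses are made-up examples, and it assumes a contiguous netmask.

```shell
# Hypothetical helper: return 0 when two IPv4 addresses fall in the same
# subnet under the given (contiguous) netmask, 1 otherwise.
same_subnet() {
    awk -v a="$1" -v b="$2" -v m="$3" 'BEGIN {
        split(a, x, "."); split(b, y, "."); split(m, k, ".")
        for (i = 1; i <= 4; i++) {
            # A mask octet of N leaves address blocks of size 256 - N,
            # so two octets match when they land in the same block.
            size = 256 - k[i]
            if (int(x[i] / size) != int(y[i] / size)) exit 1
        }
        exit 0
    }'
}

# Values to feed in can be read from the host with:
#   esxcli network ip interface ipv4 get    # vmk addresses and netmasks
#   esxcli storage nfs list                 # NFS server addresses
#   esxcli iscsi networkportal list         # vmk ports bound to iSCSI
#
# Example with assumed addresses (vmk0 = 192.168.1.10/24, target = 192.168.1.50):
#   same_subnet 192.168.1.10 192.168.1.50 255.255.255.0 && echo "target is in the vmk0 subnet"
```

To confirm that a dedicated storage adapter can reach the target on its own, vmkping -I vmk1 followed by the target address tests the path from that interface, where vmk1 is an assumed adapter name.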

Emergency Workaround
If the host is currently unresponsive, you must clear the network bottleneck to regain management access:

  • Option A: Perform a physical reboot of the ESXi host.
  • Option B: Force a NIC failover via the console (SSH/DCUI) to reset the link state:
    • esxcli network nic down -n vmnic#
    • esxcli network nic up -n vmnic#
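
The two Option B commands can be wrapped so that the uplink is always brought back up and its link state is shown afterwards. This is a minimal sketch: bounce_nic is a hypothetical helper, and the 5-second default pause is an assumption rather than a documented requirement. Because the host uses a single active uplink, taking it down interrupts all traffic on that virtual switch, so run this from the DCUI or a remote console rather than over the management network.

```shell
# Hypothetical helper: take an uplink down, wait, bring it back up,
# then list NIC status so the operator can confirm the link is up.
bounce_nic() {
    nic="$1"
    delay="${2:-5}"    # pause in seconds between down and up (assumed default)
    case "$nic" in
        vmnic[0-9]*) ;;    # name must start with "vmnic" followed by a digit
        *) echo "usage: bounce_nic vmnicN [delay_seconds]" >&2; return 2 ;;
    esac
    esxcli network nic down -n "$nic" || return 1
    sleep "$delay"
    esxcli network nic up -n "$nic" || return 1
    esxcli network nic list
}

# Example (assumed uplink name):
#   bounce_nic vmnic0
```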