LCM executed ESXi Host Maintenance Mode, but the host went into a frozen state during VM vMotions

Article ID: 432674

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

During a host update using Lifecycle Manager (LCM), the host becomes unresponsive while its VMs are being migrated with vMotion as the host attempts to enter Maintenance Mode. You may observe the following symptoms:

  • After the server is power cycled, a patching retry succeeds in placing the host into Maintenance Mode and applying the patch.

  • You see this error in vCenter:
    • The operation is not allowed in the current state. Host <hostname> cannot enter maintenance mode due to host latch failure.

  • On the host in /var/log/vmkernel.log, you see these messages:
    • vmkwarning: cpu160:2099710)WARNING: ScsiDeviceIO: 1780: Device naa.600############################# performance has deteriorated. I/O latency increased from average value of 1883 microseconds to 58345 microseconds.

  • On the host in /var/log/hostd.log, you see messages similar to:
    • Wa(###) Hostd[#######]: [Originator@#### sub=IoTracker] In thread #######, realpath("/vmfs/volumes/########-########-####-############/<VM directory>/<VM>.vmdk") took over 7 sec.
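To confirm that a host shows the vmkernel.log symptom, the latency warnings can be pulled out of the log and compared. The following is a minimal sketch in Python, assuming the warning wording matches the sample quoted above (other ESXi builds may phrase it differently). The names `LatencyWarning` and `parse_latency_warning` are hypothetical helpers introduced for illustration, not part of any VMware tool.

```python
import re
from typing import NamedTuple, Optional

# Pattern modeled on the vmkernel.log sample above (an assumption about the exact wording).
LATENCY_RE = re.compile(
    r"Device (?P<device>\S+) performance has deteriorated\. "
    r"I/O latency increased from average value of (?P<avg>\d+) microseconds "
    r"to (?P<now>\d+) microseconds"
)


class LatencyWarning(NamedTuple):
    device: str
    average_us: int
    current_us: int

    @property
    def factor(self) -> float:
        """How many times higher the reported latency is than the average."""
        if self.average_us == 0:
            return float("inf")
        return self.current_us / self.average_us


def parse_latency_warning(line: str) -> Optional[LatencyWarning]:
    """Return the parsed warning, or None if the line is not a latency warning."""
    match = LATENCY_RE.search(line)
    if match is None:
        return None
    return LatencyWarning(
        device=match.group("device"),
        average_us=int(match.group("avg")),
        current_us=int(match.group("now")),
    )
```

Running `parse_latency_warning` over each line of a copy of /var/log/vmkernel.log and sorting the results by `factor` gives a quick view of which devices degraded most around the time of the Maintenance Mode attempt.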

Environment

ESXi 8.0

Cause

This issue typically occurs when high system latency or storage-related delays impact the responsiveness of the ESXi management service, hostd. As a result, the host becomes unresponsive to management operations while still appearing connected in vCenter.
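Because the cause ties hostd responsiveness to storage delays, it can help to see which datastore the slow `realpath` calls in hostd.log point at. The sketch below is illustrative only: it assumes the IoTracker message format shown in the sample under Issue/Introduction, and `slowest_realpath_by_volume` is a hypothetical helper name.

```python
import re
from typing import Dict, Iterable

# Pattern modeled on the hostd.log IoTracker sample (an assumption about the exact wording).
REALPATH_RE = re.compile(
    r'realpath\("(?P<path>[^"]+)"\) took over (?P<sec>\d+) sec'
)


def slowest_realpath_by_volume(lines: Iterable[str], min_seconds: int = 1) -> Dict[str, int]:
    """Map each /vmfs/volumes/<volume> component to its longest reported realpath delay."""
    slowest: Dict[str, int] = {}
    for line in lines:
        match = REALPATH_RE.search(line)
        if match is None:
            continue
        seconds = int(match.group("sec"))
        if seconds < min_seconds:
            continue
        parts = match.group("path").split("/")
        # Expect ['', 'vmfs', 'volumes', '<volume>', ...]; skip paths with another layout.
        if len(parts) < 4 or parts[1:3] != ["vmfs", "volumes"]:
            continue
        volume = parts[3]
        slowest[volume] = max(slowest.get(volume, 0), seconds)
    return slowest
```

If the slow calls cluster on one volume, and that volume is backed by a device flagged in vmkernel.log, that supports storage latency as the reason hostd stopped responding.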

Resolution

To resolve the issue, perform the following steps:

  1. Perform a hard reset or power cycle of the physical host if management agents (hostd) are completely unresponsive.

  2. Resume the host update task in Lifecycle Manager.

Additional Information

ESX Host Unresponsiveness and VM Inaccessibility Due to Storage Latency or Fabric Issues