ESXi host will have repeating non-responsive events due to hostd crashing with LWD filter issue

Article ID: 391304

Products

VMware vSphere ESXi

Issue/Introduction

Hostd crashes repeatedly on a host due to an LWD filter issue that occurs when LwdFilter_ForceFullSync is called on a disk handle unexpectedly.

Each crash causes the ESXi host to enter a not-responding state, so the host reports repeated non-responsive events.

Environment

VMware vSphere ESXi 8.0

Cause

When LwdFilter_ForceFullSync is called on a disk handle in this manner, events similar to the following appear in vmkernel.log, ending with a hostd-worker core dump (signal 6):

vmkernel.log

YYYY-MM-DDTHH:MM:SSZ In(182) vmkernel: cpu53:7932468 opID=1c22ff72)FiltModS: 379: Aborted 0 IOs and completed 0 IOs after exit of upcall thread
YYYY-MM-DDTHH:MM:SSZ In(182) vmkernel: cpu53:7932468 opID=1c22ff72)VDFM: 1301: Destroying VDFM file node 1caeeda1-vdfm with fid 481226145.
YYYY-MM-DDTHH:MM:SSZ In(182) vmkernel: cpu37:7932483)VSCSI: 977: VSCSI_HBA world 7932483 (vmmLeader : 7932471) -- Exiting
YYYY-MM-DDTHH:MM:SSZ In(182) vmkernel: cpu17:2877868 opID=d654a344)FiltModS: 379: Aborted 0 IOs and completed 0 IOs after exit of upcall thread
YYYY-MM-DDTHH:MM:SSZ In(182) vmkernel: cpu10:2098276)ScsiDeviceIO: 4672: Cmd(0x45b9abff1b40) 0xfe, CmdSN 0x440 from world 2877868 to dev "naa.xxxxxx" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0
YYYY-MM-DDTHH:MM:SSZ In(182) vmkernel: cpu10:2098276)ScsiDeviceIO: 4672: Cmd(0x45b9c06f9080) 0xfe, CmdSN 0x442 from world 2877868 to dev "naa.xxxxxx" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0
YYYY-MM-DDTHH:MM:SSZ In(182) vmkernel: cpu10:2098276)ScsiDeviceIO: 4672: Cmd(0x45b9abf21f40) 0xfe, CmdSN 0x43e from world 2877868 to dev "naa.xxxxxx" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0
YYYY-MM-DDTHH:MM:SSZ In(182) vmkernel: cpu10:2098276)ScsiDeviceIO: 4672: Cmd(0x45b9c06d4480) 0xfe, CmdSN 0x443 from world 2877868 to dev "naa.xxxxxx" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0
YYYY-MM-DDTHH:MM:SSZ In(182) vmkernel: cpu10:2098276)ScsiDeviceIO: 4672: Cmd(0x45b9abf04540) 0xfe, CmdSN 0x43f from world 2877868 to dev "naa.xxxxxx" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0
YYYY-MM-DDTHH:MM:SSZ In(182) vmkernel: cpu10:2098276)ScsiDeviceIO: 4672: Cmd(0x45b9abe32540) 0xfe, CmdSN 0x444 from world 2877868 to dev "naa.xxxxxx" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0
YYYY-MM-DDTHH:MM:SSZ In(182) vmkernel: cpu10:2098276)ScsiDeviceIO: 4672: Cmd(0x45b9abfad140) 0xfe, CmdSN 0x445 from world 2877868 to dev "naa.xxxxxx" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0
YYYY-MM-DDTHH:MM:SSZ In(182) vmkernel: cpu10:2098276)ScsiDeviceIO: 4672: Cmd(0x45b9d889a9c0) 0xfe, CmdSN 0x441 from world 2877868 to dev "naa.xxxxxx" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0
YYYY-MM-DDTHH:MM:SSZ In(182) vmkernel: cpu4:2877868 opID=d654a344)User: 3259: hostd-worker: wantCoreDump:hostd-worker signal:6 exitCode:0 coredump:enabled
YYYY-MM-DDTHH:MM:SSZ In(182) vmkernel: cpu4:2877868 opID=d654a344)UserDump: 3157: hostd-worker: Dumping cartel 2877552 (from world 2877868) to file /var/core/hostd-zdump.000 ...
YYYY-MM-DDTHH:MM:SSZ In(182) vmkernel: cpu4:2877868 opID=d654a344)UserDump: 3452: hostd-worker: Userworld(hostd-worker) coredump complete.
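To check a host's log for this pattern, the excerpt above can be turned into a small scan. The following is a minimal Python sketch, not a VMware tool. The function name find_hostd_lwd_crashes and the matching rule (a hostd-worker abort with signal 6 whose opID logged a FiltModS "Aborted ... IOs" line earlier) are assumptions derived only from the sample lines in this article, so a match is an indicator rather than a confirmed diagnosis.

```python
import re

# Patterns are derived from the log excerpt in this article (assumption:
# other ESXi builds may format these messages slightly differently).
FILTMOD_RE = re.compile(r"opID=(?P<opid>[0-9a-f]+)\)FiltModS: \d+: Aborted \d+ IOs")
CRASH_RE = re.compile(
    r"opID=(?P<opid>[0-9a-f]+)\)User: \d+: hostd-worker: "
    r"wantCoreDump:hostd-worker signal:6\b"
)
DUMP_RE = re.compile(r"Dumping cartel \d+ \(from world \d+\) to file (?P<path>\S+)")


def find_hostd_lwd_crashes(lines):
    """Return one dict per hostd-worker abort (signal 6) whose opID
    previously logged a FiltModS abort line.

    Each dict holds the opID and, if a following line reports it,
    the path of the hostd core dump.
    """
    filtmod_opids = set()
    crashes = []
    for line in lines:
        match = FILTMOD_RE.search(line)
        if match:
            # Remember which operations touched the filter module.
            filtmod_opids.add(match.group("opid"))
            continue
        match = CRASH_RE.search(line)
        if match and match.group("opid") in filtmod_opids:
            crashes.append({"opid": match.group("opid"), "core": None})
            continue
        match = DUMP_RE.search(line)
        if match and crashes and crashes[-1]["core"] is None:
            # Attach the core dump path to the most recent crash.
            crashes[-1]["core"] = match.group("path")
    return crashes
```

One way to use it is to run it against a copy of vmkernel.log taken from the host or a support bundle, for example find_hostd_lwd_crashes(open("vmkernel.log")), and review each returned opID and core dump path alongside the hostd logs.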

Resolution

This issue is resolved in ESXi 8.0 U3 P05. 

Release notes for reference: VMware ESXi 8.0 Update 3e Release Notes