Unexpected shutdown of the virtual machine due to I/O blockage and resets

Article ID: 432273

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Virtual machines protected by Dell EMC RecoverPoint may experience unexpected resets or storage hangs. 

The ESXi host logs show entries similar to the following:

/var/run/log/vmkernel.log

YYYY-MM-DDTHH:MM:SS cpu26:2099998)esx_splitter: KL_INFO:862: #2 - ESXConfScanner_s_handleConfLineForListener:316: Debugging: esx.conf scan, /adv/Misc/HostIPAddr = ##.##.#.##
YYYY-MM-DDTHH:MM:SS  cpu26:2099998)esx_splitter: KL_INFO:862: #2 - ESXConfScanner_s_handleConfLineForListener:316: Debugging: esx.conf scan, /adv/Misc/HostName = ESXI_FQDN
YYYY-MM-DDTHH:MM:SS cpu1:5369897)esx_splitter: KL_INFO:862: #2 - EsxSplitterVolume_startIo: VAAI command 42 to protected volume guid 0x9bdee######bf4. reject with VMK_NOT_SUPPORTED
YYYY-MM-DDTHH:MM:SS cpu6:2099998)esx_splitter: KL_INFO:862: #2 - IoStats_s_printStats: total 6329 IOs over 60 seconds. average time to start 0us, pending 1230us, processing 8us
YYYY-MM-DDTHH:MM:SS cpu1:7475339)esx_splitter: KL_ERROR:937: #0 - IoEsx_ToStorage_s_forwardToLower: VSCSIFilter_IssueCommandToBackend Failed (io: 0x0), with status Would block
YYYY-MM-DDTHH:MM:SS cpu18:7508625)WARNING: iodm: IodmSasTransport:782: getSasPortStatistics failed for 'vmhba1' : Not implemented
YYYY-MM-DDTHH:MM:SS cpu39:5369897)esx_splitter: KL_INFO:862: #2 - EsxSplitterVolume_startIo: VAAI command 42 to protected volume guid 0x9bde#####7aabf4. reject with VMK_NOT_SUPPORTED
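
The two splitter signatures in the excerpt above (VAAI commands rejected with VMK_NOT_SUPPORTED, and backend I/O failing with "Would block") can be counted in a copy of vmkernel.log with a short script. The following is a minimal sketch, not a supported tool: the regular expressions are derived only from the sample lines in this article, and the category names and the function name are hypothetical.

```python
import re

# Patterns derived from the sample vmkernel.log lines in this article.
# The dictionary keys are hypothetical labels chosen for this sketch.
PATTERNS = {
    "vaai_rejected": re.compile(
        r"esx_splitter: .*VAAI command \d+ to protected volume .*"
        r"reject with VMK_NOT_SUPPORTED"
    ),
    "backend_would_block": re.compile(
        r"esx_splitter: .*VSCSIFilter_IssueCommandToBackend Failed.*"
        r"status Would block"
    ),
}


def count_splitter_events(lines):
    """Count how many lines match each splitter signature."""
    counts = {name: 0 for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

The function accepts any iterable of strings, so an open file handle on a copied vmkernel.log can be passed directly. Non-zero counts only indicate that the signatures are present; they do not by themselves confirm the root cause.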

VM log - /vmfs/volumes/Datastore/vmname/vmware.log

YYYY-MM-DDTHH:MM:SSZ vcpu-0 - CPU reset: hard (mode Emulation)
YYYY-MM-DDTHH:MM:SSZ vcpu-1 - CPU reset: hard (mode Emulation)
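
To correlate the VM resets with the host-side splitter messages, the reset timestamps can be extracted from vmware.log. This is a minimal sketch that assumes each line begins with a timestamp token, as in the sample above; the function name is hypothetical, and the pattern is kept loose because the exact line layout may differ between ESXi releases.

```python
import re

# Loose pattern based on the sample vmware.log lines in this article:
# a leading timestamp token, a vcpu-N thread name, then "CPU reset: <type>".
RESET_RE = re.compile(r"^(\S+).*?\b(vcpu-\d+)\b.*?CPU reset: (\w+)")


def find_cpu_resets(lines):
    """Return (timestamp, vcpu, reset_type) for every CPU reset line."""
    resets = []
    for line in lines:
        match = RESET_RE.match(line)
        if match:
            resets.append(match.groups())
    return resets
```

The returned timestamps can then be compared against the times of the "Would block" errors in vmkernel.log for the host that was running the VM.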

Environment

  • VMware vSphere ESXi 7.x
  • VMware vSphere ESXi 8.x
  • VMware vSphere ESXi 9.x

Cause

  • The splitter sits in the I/O path between the VM and the storage. If it encounters a synchronization issue or loses communication with the RecoverPoint Appliance (RPA), it stops forwarding I/O to the storage backend to preserve data consistency. The splitter may also reject storage acceleration (VAAI) commands. The guest OS or PVSCSI controller then sees a busy or timeout condition, which can cause the VM to reset or shut down.

  • The splitter may hold stale locks on virtual disk files, causing "Conflict between buffered and unbuffered open" errors. These locks prevent the VM from accessing its own disks during power-on or migration.

Resolution

  • Engage Dell EMC support to confirm whether the esx-splitter VIB version installed on the hosts is compatible with the current ESXi build, and to investigate why the splitter is rejecting I/O operations.

    https://www.dell.com/support/kbdoc/en-in/000192169/recoverpoint-for-vms-consistency-groups-in-error-state-rpas-appear-to-take-100-cpu-and-esxi-hosts-many-queued-error-tasks
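
Before opening the case with Dell EMC, it can help to record the splitter VIB version installed on each host. The output of `esxcli software vib list` can be saved to a file and filtered with a short script. This is a minimal sketch under stated assumptions: the installed VIB name is assumed to contain the string "splitter" (the exact name may differ by RecoverPoint release), the first three whitespace-separated columns are assumed to be Name, Version, and Vendor, and the function name is hypothetical.

```python
def find_splitter_vibs(vib_list_output, keyword="splitter"):
    """Return (name, version, vendor) rows whose VIB name contains keyword.

    vib_list_output is the saved text output of `esxcli software vib list`.
    The keyword default is an assumption; adjust it to the actual VIB name.
    """
    matches = []
    for line in vib_list_output.splitlines():
        parts = line.split()
        # Rows with fewer than three columns (blank or malformed) are skipped.
        if len(parts) < 3:
            continue
        name, version, vendor = parts[0], parts[1], parts[2]
        if keyword.lower() in name.lower():
            matches.append((name, version, vendor))
    return matches
```

The resulting name and version, together with the ESXi build reported by `vmware -vl`, are the details Dell EMC needs to check compatibility.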