VM application services stop while a backup of the VM is running

Article ID: 400658

Products

VMware vCenter Server

Issue/Introduction

Symptoms:

  • Critical application services running inside the VM guest OS stop while backup software is taking a backup of the VM.
  • In the VM's vmware.log file, a long stun time is recorded during the snapshot consolidation process:

2025-05-22T13:24:09.773Z In(05) vcpu-0 - ConsolidateDiskOpenCB: Opening disk node 'scsi0:2' for combining '/vmfs/volumes/########-########-####-############/example/example_1-000001.vmdk' -> '/vmfs/volumes/########-########-####-############/example/example_1_1.vmdk'.

2025-05-22T13:24:09.887Z In(05) vcpu-0 - CPT: vm was stunned for 16343817 us    (16 seconds)
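
To find every long stun in a vmware.log file rather than spotting one by eye, the stun message shown above can be matched with a regular expression. The following is a minimal sketch, not an official tool: the function name find_long_stuns and the 1-second default threshold are assumptions chosen for illustration, and the pattern is based only on the message format in the excerpt above.

```python
import re

# Pattern based on the stun message format in the excerpt above.
STUN_RE = re.compile(r"^(\S+)\s.*vm was stunned for (\d+) us")


def find_long_stuns(lines, threshold_s=1.0):
    """Return (timestamp, seconds) for each stun longer than threshold_s.

    threshold_s is a hypothetical cutoff; choose a value that matches
    what the guest application can tolerate.
    """
    results = []
    for line in lines:
        match = STUN_RE.search(line)
        if match:
            seconds = int(match.group(2)) / 1_000_000  # microseconds to seconds
            if seconds > threshold_s:
                results.append((match.group(1), seconds))
    return results


sample = [
    "2025-05-22T13:24:09.887Z In(05) vcpu-0 - CPT: vm was stunned for 16343817 us",
]
for timestamp, seconds in find_long_stuns(sample):
    print(f"{timestamp} stunned for {seconds:.1f} s")
```

In practice the lines argument would be an open file handle on a copy of the VM's vmware.log.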

Environment

VMware vSphere ESXi 7.x

VMware vSphere ESXi 8.x

Cause

This issue occurs when the underlying storage device experiences high latency, which prolongs the VM stun during snapshot consolidation.

  • The ESXi host's vmkernel log (/var/run/log/vmkernel.log) reports an increase in I/O latency:

yyyy-mm-ddTxx:xx:xx.xxxZ cpu51:2098041)WARNING: ScsiDeviceIO: 1513: Device naa.############################# performance has deteriorated. I/O latency increased from average value of 38762 microseconds to 776315 microseconds.
yyyy-mm-ddTxx:xx:xx.xxxZ cpu39:2098029)ScsiDeviceIO: 1513: Device naa.############################# performance has improved. I/O latency reduced from 776315 microseconds to 155097 microseconds.
yyyy-mm-ddTxx:xx:xx.xxxZ cpu47:2098037)WARNING: ScsiDeviceIO: 1513: Device naa.############################# performance has deteriorated. I/O latency increased from average value of 12017 microseconds to 254228 microseconds.
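
The latency warnings above report values in microseconds, which are easy to misread. As a rough sketch, assuming the message format matches the excerpt above, the warnings can be extracted and converted to milliseconds along with the size of each jump. The function name latency_spikes is hypothetical, and the sample lines reuse the masked device ID from the excerpt.

```python
import re

# Pattern based on the "performance has deteriorated" message shown above.
LATENCY_RE = re.compile(
    r"Device (\S+) performance has deteriorated\. "
    r"I/O latency increased from average value of (\d+) microseconds "
    r"to (\d+) microseconds"
)


def latency_spikes(lines):
    """Return (device, average_ms, spike_ms, ratio) per deterioration warning."""
    spikes = []
    for line in lines:
        match = LATENCY_RE.search(line)
        if not match:
            continue  # "performance has improved" and unrelated lines are skipped
        avg_us = int(match.group(2))
        new_us = int(match.group(3))
        ratio = new_us / avg_us if avg_us else float("inf")
        spikes.append((match.group(1), avg_us / 1000, new_us / 1000, ratio))
    return spikes


sample_log = [
    "yyyy-mm-ddTxx:xx:xx.xxxZ cpu51:2098041)WARNING: ScsiDeviceIO: 1513: "
    "Device naa.############################# performance has deteriorated. "
    "I/O latency increased from average value of 38762 microseconds "
    "to 776315 microseconds.",
    "yyyy-mm-ddTxx:xx:xx.xxxZ cpu39:2098029)ScsiDeviceIO: 1513: "
    "Device naa.############################# performance has improved. "
    "I/O latency reduced from 776315 microseconds to 155097 microseconds.",
]
for device, avg_ms, spike_ms, ratio in latency_spikes(sample_log):
    print(f"{device}: {avg_ms:.1f} ms -> {spike_ms:.1f} ms ({ratio:.1f}x)")
```

Comparing the timestamps of these warnings with the stun timestamps in vmware.log helps confirm that the two events coincide.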

Resolution

  • Use the esxtop utility on the ESXi host to monitor LUN-level device performance statistics.
    Refer to the article: Using esxtop to identify storage performance issues
  • If device latency remains high for a sustained period, check the storage array logs for any indication of a failure. If failures are logged on the array, take corrective action. Engage the storage vendor for guidance on checking the array logs.
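
One way to make "high for a sustained period" concrete is to look for runs of consecutive latency samples above a cutoff. The sketch below is illustrative only: the function name, the sample values, the 100 ms threshold, and the 3-sample window are all assumptions, not VMware guidance. The samples stand in for device latency values collected at a fixed interval, for example from esxtop.

```python
def sustained_high_latency(samples_ms, threshold_ms, min_consecutive):
    """Return (start_index, length) for each run of at least
    min_consecutive consecutive samples above threshold_ms."""
    runs = []
    start = None
    for i, value in enumerate(samples_ms):
        if value > threshold_ms:
            if start is None:
                start = i  # a new run begins here
        else:
            if start is not None and i - start >= min_consecutive:
                runs.append((start, i - start))
            start = None
    # Close a run that continues to the end of the data.
    if start is not None and len(samples_ms) - start >= min_consecutive:
        runs.append((start, len(samples_ms) - start))
    return runs


# Made-up latency samples in milliseconds, one per collection interval.
samples = [12, 15, 250, 780, 640, 155, 14, 300, 11]
print(sustained_high_latency(samples, threshold_ms=100, min_consecutive=3))
```

With these sample values, the four consecutive readings starting at index 2 form one sustained run, while the single spike at index 7 is ignored as transient.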