VM application services stop while a backup of the VM is running

Article ID: 400658

Products

VMware vCenter Server

Issue/Introduction

Symptoms:

Critical application services running inside the VM guest OS stop while the backup software is taking a backup of the VM.
In the vmware.log file of the VM, a long stun time is recorded during the snapshot consolidation process:

2025-05-22T13:24:09.773Z In(05) vcpu-0 - ConsolidateDiskOpenCB: Opening disk node 'scsi0:2' for combining '/vmfs/volumes/########-########-####-############/example/example_1-000001.vmdk' -> '/vmfs/volumes/########-########-####-############/example/example_1_1.vmdk'.

2025-05-22T13:24:09.887Z In(05) vcpu-0 - CPT: vm was stunned for 16343817 us (16 seconds)
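
To find every long stun in a large vmware.log rather than reading it by eye, the file can be scanned for the "vm was stunned" message. The following is a minimal Python sketch, not a supported tool. The pattern is taken from the excerpt above and may differ on other builds; the function name `find_long_stuns` and the 1-second default threshold are illustrative assumptions.

```python
import re

# Pattern based on the vmware.log excerpt in this article (assumption:
# other ESXi builds may word the message differently).
STUN_RE = re.compile(r"^(\S+).*vm was stunned for (\d+) us")


def find_long_stuns(lines, threshold_seconds=1.0):
    """Return (timestamp, stun_seconds) for each stun at or above the threshold.

    The threshold is a hypothetical default; choose a value that matches
    what the guest application can tolerate.
    """
    hits = []
    for line in lines:
        match = STUN_RE.search(line.strip())
        if match is None:
            continue
        stun_seconds = int(match.group(2)) / 1_000_000  # microseconds to seconds
        if stun_seconds >= threshold_seconds:
            hits.append((match.group(1), stun_seconds))
    return hits


# Sample line copied from the excerpt above.
sample = [
    "2025-05-22T13:24:09.887Z In(05) vcpu-0 - CPT: vm was stunned for 16343817 us (16 seconds)",
]
for timestamp, seconds in find_long_stuns(sample):
    print(f"{timestamp}  stunned for {seconds:.1f} s")
```

To scan a real log, pass an open file object instead of the sample list, for example `find_long_stuns(open("vmware.log", errors="replace"))`.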

Environment

VMware vSphere ESXi 7.x

VMware vSphere ESXi 8.x

Cause

This is caused by the underlying storage device experiencing high latency.

The ESXi host's vmkernel log (/var/run/log/vmkernel.log) reports an increase in I/O latency:

yyyy-mm-ddTxx:xx:xx.xxxZ cpu51:2098041)WARNING: ScsiDeviceIO: 1513: Device naa.############################# performance has deteriorated. I/O latency increased from average value of 38762 microseconds to 776315 microseconds.

yyyy-mm-ddTxx:xx:xx.xxxZ cpu39:2098029)ScsiDeviceIO: 1513: Device naa.############################# performance has improved. I/O latency reduced from 776315 microseconds to 155097 microseconds.

yyyy-mm-ddTxx:xx:xx.xxxZ cpu47:2098037)WARNING: ScsiDeviceIO: 1513: Device naa.############################# performance has deteriorated. I/O latency increased from average value of 12017 microseconds to 254228 microseconds.
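
To see which devices report latency changes and how large they are, these ScsiDeviceIO messages can be extracted from vmkernel.log. The sketch below is illustrative only: the pattern is derived from the sample messages above, and the function name `latency_events` is a hypothetical helper, not part of any VMware tool.

```python
import re

# Pattern derived from the ScsiDeviceIO messages quoted in this article
# (assumption: the wording is the same on the host being examined).
EVENT_RE = re.compile(
    r"Device (\S+) performance has (deteriorated|improved)\. "
    r"I/O latency (?:increased from average value of|reduced from) "
    r"(\d+) microseconds to (\d+) microseconds"
)


def latency_events(text):
    """Return one dict per latency message found in the given log text."""
    events = []
    for match in EVENT_RE.finditer(text):
        device, state, before_us, after_us = match.groups()
        events.append(
            {
                "device": device,
                "state": state,
                "before_us": int(before_us),
                "after_us": int(after_us),
            }
        )
    return events


# Sample text copied from the messages above.
sample_log = (
    "cpu51:2098041)WARNING: ScsiDeviceIO: 1513: Device naa.############################# "
    "performance has deteriorated. I/O latency increased from average value of "
    "38762 microseconds to 776315 microseconds.\n"
    "cpu39:2098029)ScsiDeviceIO: 1513: Device naa.############################# "
    "performance has improved. I/O latency reduced from 776315 microseconds to "
    "155097 microseconds.\n"
)
for event in latency_events(sample_log):
    print(
        f"{event['device']} {event['state']}: "
        f"{event['before_us'] / 1000:.1f} ms -> {event['after_us'] / 1000:.1f} ms"
    )
```

Because the function uses `finditer` over the whole text, it also handles logs in which several messages have been joined onto one line.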

Resolution

Use the esxtop utility on the ESXi host to monitor LUN-level device performance statistics.
For details, refer to the article: Using esxtop to identify storage performance issues
If device latency remains high for a sustained period, check the storage array logs for any indication of a failure. If failures are logged on the storage array side, take corrective action. Engage the storage vendor for guidance on checking logs on the array.
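
To check whether latency stays high over time, esxtop can be run in batch mode (for example `esxtop -b -d 5 -n 60 > esxtop.csv`) and the resulting CSV filtered offline. The following Python sketch shows one way to do that. It assumes that the latency counters are columns whose names contain "MilliSec/Command" and that the first column holds the sample timestamp; verify both against the header of your own capture. The function name, the threshold, and the sample rows are all hypothetical.

```python
import csv


def high_latency_samples(lines, threshold_ms, column_hint="MilliSec/Command"):
    """Return (timestamp, column_name, value) for samples at or above threshold_ms.

    `lines` is any iterable of CSV text lines, such as an open file.
    `column_hint` is an assumed substring of the latency column names.
    """
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:
        return []
    wanted = [i for i, name in enumerate(header) if column_hint in name]
    hits = []
    for row in reader:
        for i in wanted:
            try:
                value = float(row[i])
            except (ValueError, IndexError):
                continue  # skip blank or malformed cells
            if value >= threshold_ms:
                hits.append((row[0], header[i], value))
    return hits


# Invented sample data for illustration; real esxtop headers are much longer.
sample_csv = [
    '"time","disk(naa.example) Average Driver MilliSec/Command","disk(naa.example) Reads/sec"',
    '"t1","12.5","900"',
    '"t2","310.0","20"',
]
for timestamp, column, value in high_latency_samples(sample_csv, threshold_ms=100.0):
    print(f"{timestamp}: {column} = {value} ms")
```

For a real capture, pass the open file: `high_latency_samples(open("esxtop.csv", newline=""), threshold_ms=...)`, with a threshold chosen from the guidance in the esxtop article referenced above.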
