"performance has deteriorated" messages in ESXi host logs

Article ID: 318927

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

The ESXi host logs the following message in vmkernel.log when the latency on a device rises above its average latency:

Device naa.xxxxx123 performance has deteriorated. I/O latency increased from average value of 1832 microseconds to 19403 microseconds
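
When many of these messages need to be reviewed, it can help to extract the device name and the two latency values programmatically. The following is a minimal Python sketch, assuming the message text matches the example above; the function name parse_deterioration and the regular expression are illustrative and not part of any VMware tooling.

```python
import re

# Pattern for the message quoted above. The trailing "s" is optional so that
# both "microsecond" and "microseconds" are accepted.
PATTERN = re.compile(
    r"Device (?P<device>\S+) performance has deteriorated\. "
    r"I/O latency increased from average value of (?P<avg>\d+) microseconds? "
    r"to (?P<cur>\d+) microseconds?"
)

def parse_deterioration(line):
    """Return (device, average_us, current_us), or None if the line does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return m.group("device"), int(m.group("avg")), int(m.group("cur"))

sample = ("Device naa.xxxxx123 performance has deteriorated. I/O latency increased "
          "from average value of 1832 microseconds to 19403 microseconds")
device, avg_us, cur_us = parse_deterioration(sample)
# Ratio of the new latency to the reported average
print(device, avg_us, cur_us, round(cur_us / avg_us, 1))
# prints: naa.xxxxx123 1832 19403 10.6
```

In this example the reported latency is roughly 10.6 times the average value.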

Environment

VMware vSphere ESXi 6.x
VMware vSphere ESXi 7.x
VMware vSphere ESXi 8.x

Cause

This message is logged when either of two conditions holds: the latency ratio relative to the last time the message was logged reaches 30, or the ratio has doubled since the last logged message. Device latency may increase for any of these reasons:
  • Changes made on the target
  • Disk or media failures
  • Overload conditions on the device
  • Failover
The numbers reported in the events are measured in microseconds, and they refer to DAVG measurements, as seen in "esxtop" storage displays.  
 
With traditional storage media (prior to flash-based technologies), the generally accepted threshold above which storage latency might be considered a constraint on performance was 10 milliseconds (10,000 microseconds).
 
With flash-based storage, it is rare to see DAVG latencies above 1-2 milliseconds, so these events should be investigated if they occur frequently.
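
Because the events report microseconds while the guideline values above are given in milliseconds, a small conversion helper can make the comparison less error-prone. This is a rough sketch only: the constants restate the guideline figures from the text, and the function name exceeds_guideline is hypothetical.

```python
# Guideline values from the text above, expressed in microseconds.
TRADITIONAL_THRESHOLD_US = 10_000  # 10 ms for traditional (non-flash) media
FLASH_TYPICAL_MAX_US = 2_000       # upper end of the 1-2 ms range for flash

def exceeds_guideline(latency_us, flash):
    """Return True if a reported latency is above the guideline for the media type."""
    limit = FLASH_TYPICAL_MAX_US if flash else TRADITIONAL_THRESHOLD_US
    return latency_us > limit

reported_us = 19403
print(f"{reported_us / 1000:.1f} ms")                # 19.4 ms
print(exceeds_guideline(reported_us, flash=False))   # True
print(exceeds_guideline(1832, flash=False))          # False
```

These values are rules of thumb rather than hard limits, so a result of True is a prompt to investigate, not proof of a fault.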
 
The latency is the round-trip time from the hypervisor issuing a SCSI command, through the transport to the surface of the media, and back.

The source of any delay could therefore be anywhere in the fabric, the storage infrastructure, or both.

Resolution

High device latency

If the device latency remains high for a sustained period, check storage performance by reviewing the logs on the storage array for any indication of a failure. If failures are logged on the storage array side, take corrective action. Contact your storage vendor for information about checking logs on the array.

Also, check whether these messages coincide with scheduled tasks, such as backups or replications, as these can cause intermittent performance hits.
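
One way to look for a link with scheduled tasks is to count the messages per hour and compare the busy hours with the backup or replication schedule. The sketch below assumes each log line begins with an ISO 8601 timestamp (YYYY-MM-DDTHH:MM:SS...); the sample lines are hypothetical and simplified, not real log output, and count_by_hour is an illustrative name.

```python
from collections import Counter

MARKER = "performance has deteriorated"

def count_by_hour(lines):
    """Count matching messages per hour, keyed by the first 13 characters
    of the assumed leading timestamp (YYYY-MM-DDTHH)."""
    counts = Counter()
    for line in lines:
        if MARKER in line:
            counts[line[:13]] += 1
    return counts

# Hypothetical, simplified sample lines (assumption: timestamp comes first).
sample_lines = [
    "2024-01-15T02:05:11Z Device naa.xxxxx123 performance has deteriorated.",
    "2024-01-15T02:17:42Z Device naa.xxxxx123 performance has deteriorated.",
    "2024-01-15T02:48:03Z Device naa.xxxxx123 performance has deteriorated.",
    "2024-01-15T14:30:00Z Device naa.xxxxx123 performance has deteriorated.",
    "2024-01-15T14:31:00Z unrelated line",
]
for hour, n in sorted(count_by_hour(sample_lines).items()):
    print(hour, n)
# prints:
# 2024-01-15T02 3
# 2024-01-15T14 1
```

A cluster of messages in the same hour each day would suggest a scheduled job rather than a hardware fault.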

Overload conditions on the device

If the message is generated because of an overload condition, attempt to reduce the load on the affected storage device.
 
LUN replication tool is running

If a LUN replication tool is running, pause the task on the storage side and attempt a Storage vMotion to a different datastore. This should help improve I/O performance.

Additional Information