High VM I/O workload and IOPS latency validation on ESXi hosts

Article ID: 394192

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

  • One or more VMs in the cluster show high I/O spikes in the performance charts (vCenter UI > Hosts and Clusters view > select the VM > Monitor > Overview > Disk).

  • The /var/run/log/vmkwarning.log file shows messages similar to:

 YYYY-MM-DDTHH:MM.SSSZ  vmkwarning: cpu##:2098204)WARNING: ScsiDeviceIO: 1779: Device naa.############0 performance has deteriorated. I/O latency increased from average value of 349 microseconds to 8049 microseconds.
 YYYY-MM-DDTHH:MM.SSSZ  vmkwarning: cpu##:2098194)WARNING: ScsiDeviceIO: 1779: Device naa.############1 performance has deteriorated. I/O latency increased from average value of 349 microseconds to 8303 microseconds.
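
When many of these warnings are logged, it can help to summarize them per device. The following is a minimal sketch, assuming the message wording is exactly as in the sample lines above; the function name `parse_latency_warnings` is hypothetical and the script is not part of any VMware tooling.

```python
import re

# Matches the "performance has deteriorated" warning shown above.
# Assumption: the message wording is exactly as in the sample lines.
WARNING_RE = re.compile(
    r"Device (?P<device>\S+) performance has deteriorated\. "
    r"I/O latency increased from average value of (?P<avg>\d+) microseconds "
    r"to (?P<cur>\d+) microseconds"
)


def parse_latency_warnings(lines):
    """Return one dict per matching line: device, average, current, ratio."""
    results = []
    for line in lines:
        match = WARNING_RE.search(line)
        if match is None:
            continue  # not a latency-deterioration warning
        avg_us = int(match.group("avg"))
        cur_us = int(match.group("cur"))
        results.append({
            "device": match.group("device"),
            "avg_us": avg_us,
            "cur_us": cur_us,
            # How many times higher than the running average the latency is.
            "ratio": round(cur_us / avg_us, 1) if avg_us else None,
        })
    return results


if __name__ == "__main__":
    # Sample lines copied from the article (device IDs are masked).
    sample = [
        "vmkwarning: cpu##:2098204)WARNING: ScsiDeviceIO: 1779: Device "
        "naa.############0 performance has deteriorated. I/O latency increased "
        "from average value of 349 microseconds to 8049 microseconds.",
        "vmkwarning: cpu##:2098194)WARNING: ScsiDeviceIO: 1779: Device "
        "naa.############1 performance has deteriorated. I/O latency increased "
        "from average value of 349 microseconds to 8303 microseconds.",
    ]
    for entry in parse_latency_warnings(sample):
        print(entry)
```

The `ratio` field gives a quick sense of how far the reported latency is above the device's running average, which makes it easier to see which devices are affected most.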

Environment

  • VMware vSphere ESXi 6.x
  • VMware vSphere ESXi 7.x
  • VMware vSphere ESXi 8.x

Cause

  • When VMs running on the ESXi host generate more IOPS than the storage layer can sustain, I/O latency increases and the "performance has deteriorated" messages are logged.

  • The NAA ID in the warning can be correlated with the datastore name.

  • The NAA ID is also shown in the datastore details in the vSphere Client: Datastore view > select the datastore > Configure > Device Backing.

    Validation steps:

    localcli storage vmfs extent list 

    Volume Name  VMFS UUID                            Extent Number  Device Name        Partition
    ---------------------------------------------------------------------------------------------
    DATASTORE_1  ########-########-####-############              0  naa.############1          1
    DATASTORE_2  ########-########-####-############              0  naa.############2          1
    DATASTORE_3  ########-########-####-############              0  naa.############3          1
    DATASTORE_4  ########-########-####-############              0  naa.############4          1
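
The lookup from NAA ID to datastore name can be scripted once the extent list output has been saved as text. This is a minimal sketch, assuming the five-column layout shown above; the function name `parse_extent_list` is hypothetical, and the sample rows reuse the masked values from this article.

```python
def parse_extent_list(output):
    """Map each device name (NAA ID) to the list of datastores it backs."""
    device_to_volumes = {}
    for line in output.splitlines():
        stripped = line.strip()
        # Skip blank lines, the header row, and the dashed separator.
        if (not stripped
                or stripped.startswith("Volume Name")
                or set(stripped) <= {"-", " "}):
            continue
        # Assumption: the last four columns never contain spaces, while the
        # volume name might, so split from the right.
        fields = stripped.rsplit(None, 4)
        if len(fields) != 5:
            continue  # unexpected layout, ignore the line
        volume, _uuid, _extent, device, _partition = fields
        device_to_volumes.setdefault(device, []).append(volume)
    return device_to_volumes


if __name__ == "__main__":
    # Masked sample rows in the same shape as the output above.
    sample = """\
Volume Name  VMFS UUID                            Extent Number  Device Name        Partition
---------------------------------------------------------------------------------------------
DATASTORE_1  ########-########-####-############              0  naa.############1          1
DATASTORE_2  ########-########-####-############              0  naa.############2          1
"""
    mapping = parse_extent_list(sample)
    # Look up the device named in a vmkwarning message.
    print(mapping.get("naa.############1", ["<no matching datastore>"]))
```

A device name taken from a vmkwarning message can then be looked up in the returned mapping. Each device maps to a list because more than one extent row can reference the same device.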

Resolution

  • Engage the storage vendor to investigate the high DAVG (device latency) values reported in the esxtop data.

  • Engage the application team to identify any increased I/O generation and work on resolving the underlying cause.
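
Before engaging the storage vendor, it can be useful to pull the peak DAVG value per device out of esxtop data captured in batch mode (`esxtop -b`). The sketch below rests on assumptions that are not stated in this article: that the capture is a CSV file whose DAVG columns carry the counter name "Average Driver MilliSec/Command" in their headers, and that 20 ms is a reasonable reporting threshold. Both are parameters of the hypothetical function `peak_davg` and should be adjusted to match the actual capture and the storage vendor's guidance.

```python
import csv
import io

# Assumption: in esxtop batch-mode CSV output, the DAVG counter appears in
# column headers as "Average Driver MilliSec/Command". Adjust if yours differ.
DAVG_COUNTER = "Average Driver MilliSec/Command"


def peak_davg(csv_text, counter=DAVG_COUNTER, threshold_ms=20.0):
    """Return {column header: peak value} for DAVG columns above threshold_ms.

    threshold_ms is an illustrative default, not a value from the article.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return {}
    # Indexes of the columns that carry the DAVG counter.
    wanted = [i for i, name in enumerate(header) if counter in name]
    peaks = {}
    for row in reader:
        for i in wanted:
            try:
                value = float(row[i])
            except (IndexError, ValueError):
                continue  # missing or non-numeric sample
            name = header[i]
            if value > peaks.get(name, float("-inf")):
                peaks[name] = value
    return {name: peak for name, peak in peaks.items() if peak > threshold_ms}


if __name__ == "__main__":
    # Hypothetical two-sample capture with simplified column headers.
    sample = (
        "time,dev1 Average Driver MilliSec/Command,"
        "dev2 Average Driver MilliSec/Command\n"
        "t1,1.5,30.0\n"
        "t2,2.5,45.5\n"
    )
    for column, peak in peak_davg(sample).items():
        print(f"{column}: peak {peak} ms")
```

Only columns whose peak exceeds the threshold are returned, so the result is a short list of devices worth raising with the storage vendor.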

Additional Information