Using esxtop to identify storage performance issues for ESXi

Article ID: 344099

Products

VMware vSphere ESXi
VMware vSphere ESX 7.x
VMware vSphere ESX 8.x

Issue/Introduction

This article provides information about the esxtop utility and latency statistics that can be used when troubleshooting performance issues with SAN-connected storage (Fibre Channel or iSCSI).

Note: In ESXi 7.x and later, there may be messages indicating that performance has deteriorated. For more information, see "performance has deteriorated" message in ESXi.

Environment

VMware vSphere ESXi 6.5.x
VMware vSphere ESXi 6.7.x
VMware vSphere ESXi 7.x
VMware vSphere ESXi 8.x
VMware vSphere ESXi 9.x

Resolution




The interactive esxtop utility can be used to provide I/O metrics for various devices attached to the ESXi host.

Configuring monitoring using esxtop

General usage:

  • At the command line enter:
    esxtop 

  • Press f to open the Current Field order screen, where individual fields (columns) can be toggled on or off.

  • Press s, type 2, and press Enter to change the update interval to 2 seconds.
  • Use the keys 8 and 2 to move the highlight up and down in the list, respectively.

To monitor storage performance per HBA:

  1. Press d to switch to the disk adapter view.
  2. To view the entire device name, set the name field size to 36: press SHIFT + L, enter 36, then press Enter.

To monitor storage performance on a per-LUN basis:

  1. Press u to switch to the disk device view.

To monitor storage performance on a per-virtual machine basis:

  1. Press v to switch to the disk VM view.


 

Analyzing esxtop columns

Refer to this table for relevant columns and descriptions of these values:

 
Column      Description
CMDS/s      Total number of commands per second. Includes IOPS (Input/Output Operations Per Second) and other SCSI commands, such as SCSI reservations, locks, vendor string requests, and unit attention commands, being sent to or coming from the device or virtual machine being monitored. In most cases, CMDS/s = IOPS unless there are many metadata operations such as SCSI reservations.
DAVG/cmd    The average response time, in milliseconds, per command being sent to the device.
KAVG/cmd    The average time, in milliseconds, that the command spends in the VMkernel.
GAVG/cmd    The response time as perceived by the guest operating system. It is calculated as GAVG = DAVG + KAVG.

The /cmd columns combine reads and writes; the corresponding /rd and /wr columns show reads and writes separately. The combined value is the best way to monitor overall performance, but a high read or write response time on its own may indicate that the read or write cache is disabled on the array. While all arrays perform differently, DAVG/cmd, KAVG/cmd, and GAVG/cmd should not exceed 10 milliseconds (ms) for sustained periods of time.
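The relationship between the three latency columns and the 10 ms guideline can be illustrated with a short sketch. The LatencySample type, the function name, and the sample values are hypothetical and are not part of esxtop; the values stand in for numbers read off the esxtop screen.

```python
from dataclasses import dataclass

# Guideline from this article: latency should not exceed 10 ms for sustained periods.
LATENCY_GUIDELINE_MS = 10.0


@dataclass
class LatencySample:
    """One reading from an esxtop disk view (hypothetical container)."""
    davg_ms: float  # DAVG/cmd: response time of the device
    kavg_ms: float  # KAVG/cmd: time spent in the VMkernel

    @property
    def gavg_ms(self) -> float:
        # GAVG/cmd = DAVG/cmd + KAVG/cmd
        return self.davg_ms + self.kavg_ms


def columns_over_guideline(sample: LatencySample) -> list:
    """Return the names of the columns whose value exceeds the guideline."""
    values = {
        "DAVG/cmd": sample.davg_ms,
        "KAVG/cmd": sample.kavg_ms,
        "GAVG/cmd": sample.gavg_ms,
    }
    return [name for name, ms in values.items() if ms > LATENCY_GUIDELINE_MS]


sample = LatencySample(davg_ms=12.5, kavg_ms=0.5)  # illustrative values only
print(sample.gavg_ms)                  # 13.0
print(columns_over_guideline(sample))  # ['DAVG/cmd', 'GAVG/cmd']
```

In this illustrative reading almost all of the guest-visible latency comes from DAVG, so the device side would be the first place to look.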

If there are high latency times, investigate current performance metrics for switches and SAN targets. Check for errors that may suggest a delay in operations being sent to, received, or acknowledged.
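Because the guideline concerns sustained latency rather than a single spike, it can help to look at consecutive readings before investigating the fabric. The following is a minimal sketch, assuming latency values have been noted at each esxtop refresh. The function name and the choice of five consecutive readings as "sustained" are illustrative assumptions, not values defined by VMware.

```python
def is_sustained_high(latencies_ms, threshold_ms=10.0, min_consecutive=5):
    """Return True if at least `min_consecutive` readings in a row exceed the threshold."""
    run = 0
    for value in latencies_ms:
        if value > threshold_ms:
            run += 1
            if run >= min_consecutive:
                return True
        else:
            run = 0  # a reading at or below the threshold breaks the streak
    return False


# Illustrative readings only: one brief spike, then a longer stretch of high latency.
spike = [2.1, 35.0, 3.0, 2.4, 2.2, 2.8, 3.1]
sustained = [2.1, 14.0, 15.5, 18.2, 12.9, 11.4, 3.0]

print(is_sustained_high(spike))      # False
print(is_sustained_high(sustained))  # True
```

With a 2-second update interval, five consecutive readings correspond to roughly 10 seconds of elevated latency; adjust min_consecutive to suit the environment.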

If the response time is over 5000 ms, ESXi will time out the command and abort the operation. These events are logged and can be reviewed in /var/log/vmkernel.log
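The two thresholds mentioned in this article can be combined into a simple classification of a single response time. This is a sketch only: the function and category names are hypothetical, and the 10 ms and 5000 ms values are the ones stated above.

```python
GUIDELINE_MS = 10.0          # latency guideline from this article
COMMAND_TIMEOUT_MS = 5000.0  # above this, ESXi times out and aborts the command


def classify_response_time(response_ms: float) -> str:
    """Classify one response time (in ms) against the article's thresholds."""
    if response_ms > COMMAND_TIMEOUT_MS:
        return "timeout"  # expect abort events in /var/log/vmkernel.log
    if response_ms > GUIDELINE_MS:
        return "high"     # worth investigating if it persists
    return "ok"


for ms in (4.0, 250.0, 6000.0):  # illustrative values only
    print(ms, classify_response_time(ms))
```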
 
The type of storage logging depends on the configuration of the server. Find the value of these options by navigating to Host > Configuration > Advanced Settings > SCSI > SCSI.Log* or SCSI.Print*