Our Windows team wants to monitor hardware disk failure alerts through vCenter monitoring, and Broadcom has suggested trying the following two metrics:
QOS_VMWARE_HOST_DISK_BUS_RESETS - This monitor indicates the number of SCSI-bus reset commands issued during the collection interval.
QOS_VMWARE_HOST_DISK_COMMANDS_ABORTED - This monitor indicates the number of SCSI commands terminated during the collection interval.
Our Windows team is asking exactly which alert would trigger if the above metrics are enabled, and what the suggested thresholds for the two metrics would be.
QOS_VMWARE_HOST_DISK_BUS_RESETS
This monitor indicates the number of SCSI-bus reset commands issued during the collection interval.
The Disk Resets/Aborts view shows the number of bus resets and aborted commands on a specific device. A high number of bus resets can indicate a problem with the disk that leads to corruption or overwrites.
Suggested threshold: greater than or equal to 2
QOS_VMWARE_HOST_DISK_COMMANDS_ABORTED
This monitor indicates the number of SCSI commands terminated during the collection interval.
In VMware, an aborted disk command occurs when an ESXi host aborts a command sent by a guest. This can happen when the ESXi server can't process the command in time. Aborted commands can cause VMs to run slowly or crash, so it's best to keep the Disk commands aborted metric at 0.
Suggested threshold: greater than or equal to 1
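As an illustration only, the two suggested thresholds above amount to a simple "greater than or equal to" comparison against the value collected for each interval. The sketch below is not how the probe evaluates alarms internally; the dictionary and function names are hypothetical and exist only to show which sample values would and would not raise an alarm.

```python
# Suggested thresholds from this article (alarm when value >= threshold).
# The dictionary and function names are hypothetical, for illustration only.
SUGGESTED_THRESHOLDS = {
    "QOS_VMWARE_HOST_DISK_BUS_RESETS": 2,
    "QOS_VMWARE_HOST_DISK_COMMANDS_ABORTED": 1,
}


def breaches_threshold(metric: str, value: float) -> bool:
    """Return True when the collected value meets or exceeds the suggested threshold."""
    return value >= SUGGESTED_THRESHOLDS[metric]


if __name__ == "__main__":
    # Made-up sample values for one collection interval.
    samples = {
        "QOS_VMWARE_HOST_DISK_BUS_RESETS": 1,
        "QOS_VMWARE_HOST_DISK_COMMANDS_ABORTED": 1,
    }
    for metric, value in samples.items():
        state = "ALARM" if breaches_threshold(metric, value) else "ok"
        print(f"{metric}: value={value} -> {state}")
```

With these sample values, a single bus reset stays below its threshold of 2, while a single aborted command already meets its threshold of 1.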
For alarms on these metrics, you can create a custom message using text strings and variables in the Admin Console template, once the template has been copied for vmware.
Please refer to the following techdoc and search for Custom alarm message: