S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology) is an industry standard built into storage devices for monitoring drive health and predicting hardware failures. On NVMe devices, the standard command used for SAS/SATA devices (esxcli storage core device smart get) does not report the full set of NVMe-specific metrics. To get them, use the commands in the esxcli nvme namespace instead.
VMware vSphere ESXi
VMware vSAN
To view complete S.M.A.R.T. data on NVMe devices, retrieve the standard NVMe S.M.A.R.T. log page as defined in the NVMe specification.
Execute the following command against the NVMe adapter: esxcli nvme device log smart get -A <adapter_id>
Example Output
[root@ESXi:~] esxcli nvme device log smart get -A vmhba1
SMART And Health Info:
   Available Spare Space Below Threshold: false
   Temperature Warning: false
   NVM Subsystem Reliability Degradation: false
   Read Only Mode: false
   Volatile Memory Backup Device Failure: true
   Composite Temperature: 306 K
   Available Spare: 100 %
   Available Spare Threshold: 10 %
   Percentage Used: 0 %
   Data Units Read: 0x60ea0528
   Data Units Written: 0x2f8fbda9
   Host Read Commands: 0x27f7a927fb
   Host Write Commands: 0x12f8084edf
   Controller Busy Time: 0x13522
   Power Cycles: 0x1a
   Power On Hours: 0x3f91
   Unsafe Shutdowns: 0x9
   Media Errors: 0x0
   Number of Error Info Log Entries: 0x2c
   Warning Composite Temperature Time: 0 Mins
   Critical Composite Temperature Time: 0 Mins
   Temperature Sensor 1: 319 K
   Temperature Sensor 2: 309 K
   Temperature Sensor 3: 0 K
   Temperature Sensor 4: 0 K
   Temperature Sensor 5: 0 K
   Temperature Sensor 6: 0 K
   Temperature Sensor 7: 0 K
   Temperature Sensor 8: 0 K
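The NVMe log reports its counters in hexadecimal and its temperatures in Kelvin, so the values are easier to read after a small conversion step. The following is a minimal Python sketch of that step, not an official tool. The function names (parse_smart_log, decode, kelvin_to_celsius, data_units_to_bytes) are hypothetical, and the sample text is an abridged copy of the example output. The data unit size of 1000 x 512 bytes follows the NVMe specification's definition of the Data Units Read and Data Units Written fields.

```python
import re

# Abridged copy of the example output; on a host this text would come from
# `esxcli nvme device log smart get -A <adapter_id>`.
SAMPLE = """\
SMART And Health Info:
   Volatile Memory Backup Device Failure: true
   Composite Temperature: 306 K
   Available Spare: 100 %
   Data Units Read: 0x60ea0528
   Power Cycles: 0x1a
   Power On Hours: 0x3f91
   Media Errors: 0x0
"""


def decode(value):
    """Turn a raw field value into a bool or int where possible."""
    if value in ("true", "false"):
        return value == "true"
    if value.startswith("0x"):
        return int(value, 16)          # hexadecimal counter
    match = re.match(r"\d+", value)    # e.g. "306 K", "100 %", "0 Mins"
    return int(match.group()) if match else value


def parse_smart_log(text):
    """Return a dict of field name -> decoded value; the header line is skipped."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(": ")
        if sep:
            fields[name] = decode(value)
    return fields


def kelvin_to_celsius(kelvin):
    """Convert a Kelvin reading to the nearest whole degree Celsius."""
    return round(kelvin - 273.15)


def data_units_to_bytes(units):
    """One NVMe data unit is 1000 blocks of 512 bytes."""
    return units * 1000 * 512


smart = parse_smart_log(SAMPLE)
print("Power on hours:", smart["Power On Hours"])
print("Power cycles:", smart["Power Cycles"])
print("Composite temperature (C):", kelvin_to_celsius(smart["Composite Temperature"]))
print("Bytes read:", data_units_to_bytes(smart["Data Units Read"]))
```

For the sample above, the decoded power-on hours (16273), power cycles (26), and composite temperature (33 degrees Celsius) agree with the Power-on Hours, Power Cycle Count, and Drive Temperature values shown for the same device in the esxcli storage core device smart get example output.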
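For comparison, the standard command used for SAS/SATA devices returns only a limited subset of this data for an NVMe device: esxcli storage core device smart get -d <device_id>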
Example Output
[root@ESXi:~] esxcli storage core device smart get -d t10.NVMe____Dell_NVMe_ISE_PS1030_MU_U.2_6.4TB_______C8341036002EE4AC
Parameter                 Value    Threshold  Worst  Raw
------------------------  -------  ---------  -----  ---
Health Status             WARNING  N/A        N/A    N/A
Power-on Hours            16273    N/A        N/A    N/A
Power Cycle Count         26       N/A        N/A    N/A
Reallocated Sector Count  0        90         N/A    N/A
Drive Temperature         33       75         N/A    N/A
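If the table needs to be processed by a script, it can be split into rows and columns. The sketch below is one possible approach under stated assumptions, not a supported parser: the function name parse_smart_table is hypothetical, the sample text is copied from the example output, and the code assumes that only the Parameter column contains spaces, so each line is split from the right into the four value columns.

```python
# Table body copied from the example output; on a host this text would come
# from `esxcli storage core device smart get -d <device_id>`.
SAMPLE = """\
Parameter                 Value    Threshold  Worst  Raw
------------------------  -------  ---------  -----  ---
Health Status             WARNING  N/A        N/A    N/A
Power-on Hours            16273    N/A        N/A    N/A
Power Cycle Count         26       N/A        N/A    N/A
Reallocated Sector Count  0        90         N/A    N/A
Drive Temperature         33       75         N/A    N/A
"""


def parse_smart_table(text):
    """Return a dict of parameter name -> dict of the four value columns."""
    rows = {}
    seen_separator = False
    for line in text.splitlines():
        if line.startswith("---"):
            seen_separator = True      # data rows start after the dashed line
            continue
        if not seen_separator:
            continue                   # skip the prompt and header lines
        # Split from the right so spaces inside the parameter name survive.
        parts = line.rsplit(None, 4)
        if len(parts) != 5:
            continue
        name, value, threshold, worst, raw = parts
        rows[name.strip()] = {
            "value": value,
            "threshold": threshold,
            "worst": worst,
            "raw": raw,
        }
    return rows


table = parse_smart_table(SAMPLE)
print("Health status:", table["Health Status"]["value"])
print("Power-on hours:", table["Power-on Hours"]["value"])
```

Values are kept as strings because columns may contain N/A. Convert them with int() only after checking that the field is numeric.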