SNMP tests fail on ESXi hosts with RDMs configured

Article ID: 318692

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:

  • Testing SNMP on an ESXi host hangs and eventually fails with messages similar to the following:
Result: Agent not responding, connect uds socket (var/run/snmp.ctl) failed 2, err= No such file or directory
SnmpAgentConfigImpl: Agent > not responding, connect uds socket(/var/run/snmp.ctl) failed 11, err= Resource temporarily unavailable

 
  • The ESXi host is configured with RDMs that are not set to be perennially reserved.
  • Messages similar to the following appear in /var/log/vmkernel.log for the RDM devices:
2020-12-13T12:11:00.394Z cpu0:2098099)NMP: nmp_ResetDeviceLogThrottling:3580: Error status H:0x0 D:0x18 P:0x0 Sense Data: 0x0 0x0 0x0 from dev "naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" occurred 1870 times(of 1877 commands)

2020-12-13T12:11:01.238Z cpu5:2098235)NMP: nmp_ThrottleLogForDevice:3802: Cmd 0x1a (0x459b40e24a00, 0) to dev "naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" on path "vmhba1:CX:TX:LXX" Failed: H:0x8 D:0x0 P:0x0 Invalid sense data: 0x0 0x0 0x0. Act:EVAL
2020-12-13T12:11:01.238Z cpu5:2098235)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" state in doubt; requested fast path state update...
2020-12-13T12:11:01.238Z cpu5:2098235)ScsiDeviceIO: 3449: Cmd(0x459b40e24a00) 0x1a, CmdSN 0x814639 from world 0 to dev "naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" failed H:0x8 D:0x0 P:0x0 Invalid sense data: 0x0 0x50 0x3a.
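
To see which devices are producing these messages, the log can be filtered for the "state in doubt" warning shown above. The following is a minimal sketch, not a supported tool: the function name devices_in_doubt is hypothetical, and it assumes grep, sed, and sort are available in the ESXi shell.

```shell
# Hypothetical helper: prints the unique device IDs that appear in
# "state in doubt" warnings in the given log file.
# Defaults to /var/log/vmkernel.log when no file is passed.
devices_in_doubt() {
  log="${1:-/var/log/vmkernel.log}"
  grep 'state in doubt' "$log" \
    | sed -n 's/.*NMP device "\([^"]*\)" state in doubt.*/\1/p' \
    | sort -u
}

# Example usage on the host:
# devices_in_doubt /var/log/vmkernel.log
```

Each ID printed can then be compared against the list of RDM LUNs presented to the host.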


 



Environment

VMware vSphere ESXi 6.7
VMware vSphere ESXi 6.5
VMware vSphere ESXi 7.0.0

Cause

This occurs because the RDM LUN is not set to be perennially reserved. The SNMP test attempts to query the LUN for SMART information, the query hits a SCSI reservation conflict (device status D:0x18 in the log messages above), and the test hangs until it eventually times out. This is most often seen in environments with clustered virtual machine configurations such as Microsoft Cluster Service (MSCS).

Resolution
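
Given the cause described above, the usual remediation is to mark each affected RDM LUN as perennially reserved. The following is a minimal sketch under stated assumptions, not a verified procedure for this article: the function name mark_perennially_reserved and the DRY_RUN variable are hypothetical, and the naa IDs must be replaced with the identifiers of the actual RDM LUNs. The esxcli storage core device setconfig command and its --perennially-reserved option are part of the standard esxcli namespace.

```shell
# Hypothetical helper: marks each device ID passed as an argument
# as perennially reserved. With DRY_RUN=1 it only prints the
# commands it would run, so they can be reviewed first.
mark_perennially_reserved() {
  for dev in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "esxcli storage core device setconfig -d $dev --perennially-reserved=true"
    else
      esxcli storage core device setconfig -d "$dev" --perennially-reserved=true
    fi
  done
}

# Example usage (placeholder ID, substitute the naa ID of each RDM LUN):
# DRY_RUN=1 mark_perennially_reserved naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```

The setting is per host, so it would typically need to be applied on every ESXi host that has visibility to the LUN. The result can be confirmed with esxcli storage core device list -d followed by the device ID.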

Additional Information

To check whether a LUN is set to be perennially reserved, run the following command from an SSH session to the ESXi host and check the Is Perennially Reserved field in the output:

esxcli storage core device list

naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:
   Display Name: HITACHI Fibre Channel Disk (naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx)
   Has Settable Display Name: true
   Size: 40960
   Device Type: Direct-Access
   Multipath Plugin: NMP
   Devfs Path: /vmfs/devices/disks/naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
   Vendor: HITACHI
   Model: OPEN-V
   Revision: 5001
   SCSI Level: 2
   Is Pseudo: false
   Status: on
   Is RDM Capable: true
   Is Local: false
   Is Removable: false
   Is SSD: false
   Is VVOL PE: false
   Is Offline: false
   Is Perennially Reserved: true
   Queue Full Sample Size: 0
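
On hosts with many devices, the full listing is long. As a convenience, the output can be filtered so that only devices reporting Is Perennially Reserved: false are shown. This is a minimal sketch: the function name list_not_perennially_reserved is hypothetical, and it assumes awk is available in the ESXi shell and that each device block starts with an unindented identifier line, as in the sample output above.

```shell
# Hypothetical helper: reads "esxcli storage core device list" output
# on stdin and prints the ID of every device whose
# "Is Perennially Reserved" field is false.
list_not_perennially_reserved() {
  awk '
    # An unindented, non-empty line starts a new device block.
    /^[^ \t]/ { dev = $0; sub(/:[ \t]*$/, "", dev) }
    # Report the current device when the flag is false.
    /Is Perennially Reserved:[ \t]*false/ { print dev }
  '
}

# Run against the live host only when esxcli is present.
if command -v esxcli >/dev/null 2>&1; then
  esxcli storage core device list | list_not_perennially_reserved
fi
```

The filter lists every device with the flag set to false, including local disks and VMFS datastore LUNs, so the result still has to be matched against the LUNs that are actually used as RDMs.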