ESXi hosts report sudden NFS APD events where the host enters and exits an APD state without any interaction.

Article ID: 381569

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

ESXi hosts report intermittent NFS APD (All Paths Down) events in the absence of any obvious networking errors. The events appear in the following log:

/var/run/log/vobd.log 

####-##-##T##:##:##.###Z: [APDCorrelator] 28100498953934us: [vob.storage.apd.start] Device or filesystem with identifier [########-########] has entered the All Paths Down state.
####-##-##T##:##:##.###Z: [APDCorrelator] 28100118604623us: [esx.problem.storage.apd.start] Device or filesystem with identifier [########-########] has entered the All Paths Down state.


####-##-##T##:##:##.###Z: [APDCorrelator] 28100505907252us: [vob.storage.apd.exit] Device or filesystem with identifier [########-########] has exited the All Paths Down state.
####-##-##T##:##:##.###Z: [APDCorrelator] 28100125557746us: [esx.clear.storage.apd.exit] Device or filesystem with identifier [########-########] has exited the All Paths Down state.
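
Each start and exit entry above carries a microsecond counter before the event name, so the length of an APD window can be estimated by subtracting the two values. The following is a minimal Python sketch, assuming the vobd.log line layout matches the excerpt above. The function name `apd_durations` and the pairing of events by their `vob`/`esx` prefix and identifier are illustrative choices, not part of any VMware tool.

```python
import re

# Layout assumed from the vobd.log excerpt above:
#   [APDCorrelator] <counter>us: [<event name>] ... identifier [<id>] ...
EVENT_RE = re.compile(
    r"\[APDCorrelator\]\s+(\d+)us:\s+"
    r"\[((?:vob|esx)\.[\w.]*apd\.(start|exit))\]"
    r".*?identifier \[([^\]]+)\]"
)


def apd_durations(lines):
    """Return (source, identifier, seconds) for each matched start/exit pair."""
    pending = {}   # (source, identifier) -> start counter in microseconds
    results = []
    for line in lines:
        match = EVENT_RE.search(line)
        if match is None:
            continue
        counter_us = int(match.group(1))
        source = match.group(2).split(".", 1)[0]   # "vob" or "esx"
        kind = match.group(3)                      # "start" or "exit"
        key = (source, match.group(4))
        if kind == "start":
            pending[key] = counter_us
        elif key in pending:
            elapsed_us = counter_us - pending.pop(key)
            results.append((key[0], key[1], elapsed_us / 1e6))
    return results


# The four sample lines from this article, with timestamps masked as published.
SAMPLE = [
    "####-##-##T##:##:##.###Z: [APDCorrelator] 28100498953934us: [vob.storage.apd.start] Device or filesystem with identifier [########-########] has entered the All Paths Down state.",
    "####-##-##T##:##:##.###Z: [APDCorrelator] 28100118604623us: [esx.problem.storage.apd.start] Device or filesystem with identifier [########-########] has entered the All Paths Down state.",
    "####-##-##T##:##:##.###Z: [APDCorrelator] 28100505907252us: [vob.storage.apd.exit] Device or filesystem with identifier [########-########] has exited the All Paths Down state.",
    "####-##-##T##:##:##.###Z: [APDCorrelator] 28100125557746us: [esx.clear.storage.apd.exit] Device or filesystem with identifier [########-########] has exited the All Paths Down state.",
]

for source, ident, seconds in apd_durations(SAMPLE):
    print(f"{source} {ident}: {seconds:.2f} s")
# vob ########-########: 6.95 s
# esx ########-########: 6.95 s
```

For the sample entries, both event pairs indicate an APD window of roughly seven seconds. Short windows like this are consistent with the host entering and exiting APD without any interaction.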

Cause

These symptoms are observed when NFS GETATTR requests fail intermittently.

NFS heartbeats are used to determine whether an NFS volume is still available.

Each NFS heartbeat is a GETATTR request on the root file handle of the NFS volume.

If the APD is caused by failed GETATTR requests, this is reflected in the getAttrFailed counter of the vsish statistics for the NFS datastore:

 

vsish -e get vmkModules/nfsclient/mnt/<NFS-Datastore-name>/opsStats

Ops statistics {
   getAttr:808483
   getAttrFailed:97
   setAttr:1710
   setAttrFailed:0
   readLink:0
   readLinkFailed:0
   create:3458
   createFailed:6
   mkdir:0
   mkdirFailed:0
   rmdir:0
   rmdirFailed:0
   remove:3452
   removeFailed:0
   rename:0
   renameFailed:0
   symLink:0
   symLinkFailed:0
   readdirplus:0
   readdirplusFailed:0
   lookup:30390
   lookupFailed:19825
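
To put the failure counters in proportion, the output above can be parsed and each `<op>Failed` value compared with its matching operation counter. This is a rough Python sketch, assuming the output was saved as text and that each base counter (for example `getAttr`) counts all attempts including failed ones, which this article does not confirm. The helper names `parse_ops_stats` and `failure_ratios` are hypothetical.

```python
import re

# Matches indented "name:value" lines such as "   getAttrFailed:97".
PAIR_RE = re.compile(r"^\s*(\w+):(\d+)\s*$")


def parse_ops_stats(text):
    """Parse 'name:value' lines from saved opsStats output into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        match = PAIR_RE.match(line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def failure_ratios(stats):
    """Return failed/total for every '<op>Failed' counter whose total is non-zero."""
    ratios = {}
    for name, failed in stats.items():
        if not name.endswith("Failed"):
            continue
        op = name[: -len("Failed")]
        total = stats.get(op, 0)
        if total:
            ratios[op] = failed / total
    return ratios


# A subset of the counters shown in this article.
SAMPLE_STATS = """Ops statistics {
   getAttr:808483
   getAttrFailed:97
   create:3458
   createFailed:6
   mkdir:0
   mkdirFailed:0
   lookup:30390
   lookupFailed:19825
"""

for op, ratio in failure_ratios(parse_ops_stats(SAMPLE_STATS)).items():
    print(f"{op}: {ratio:.3%}")
# getAttr: 0.012%
# create: 0.174%
# lookup: 65.235%
```

Since the NFS heartbeat is a GETATTR request, `getAttrFailed` is the counter relevant to this article. Even a small ratio there can correspond to APD events, while the other counters are shown only for context.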

 

Resolution

This indicates an issue on either the network or the storage, as the host is not receiving a response to these requests.

Engage your storage vendor to investigate the APD events and determine whether the storage is receiving the GETATTR requests. If it is not, this indicates a network issue.
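
When collecting evidence for the storage vendor, it may help to show whether `getAttrFailed` is still rising between two points in time. The sketch below is one possible way to do this in Python, assuming two copies of the opsStats output were saved some time apart and that the counters are cumulative, which this article does not state. `read_counter` and `counter_delta` are hypothetical helper names.

```python
import re


def read_counter(text, name):
    """Return the integer value of a 'name:value' line in opsStats text, or None."""
    pattern = r"^\s*" + re.escape(name) + r":(\d+)\s*$"
    match = re.search(pattern, text, re.MULTILINE)
    return int(match.group(1)) if match else None


def counter_delta(before, after, name="getAttrFailed"):
    """Increase of one counter between two snapshots.

    Returns None if the counter is missing from either snapshot or if it
    went down, which would suggest the statistics were reset in between.
    """
    first = read_counter(before, name)
    second = read_counter(after, name)
    if first is None or second is None or second < first:
        return None
    return second - first
```

A non-zero result from `counter_delta(snapshot_1, snapshot_2)` over an interval that also contains APD entries in vobd.log would support the explanation given in the Cause section, and the two snapshots can be shared with the vendor alongside the timestamps.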

 

Additional Information