ESXi hosts report intermittent NFS APD (All Paths Down) events in the absence of any obvious networking errors.
The following events are logged in /var/run/log/vobd.log:
####-##-##T##:##:##.###Z: [APDCorrelator] 28100498953934us: [vob.storage.apd.start] Device or filesystem with identifier [########-########] has entered the All Paths Down state.
####-##-##T##:##:##.###Z: [APDCorrelator] 28100118604623us: [esx.problem.storage.apd.start] Device or filesystem with identifier [########-########] has entered the All Paths Down state.
####-##-##T##:##:##.###Z: [APDCorrelator] 28100505907252us: [vob.storage.apd.exit] Device or filesystem with identifier [########-########] has exited the All Paths Down state.
####-##-##T##:##:##.###Z: [APDCorrelator] 28100125557746us: [esx.clear.storage.apd.exit] Device or filesystem with identifier [########-########] has exited the All Paths Down state.
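To gauge how frequently APD is occurring, the vobd.log entries above can be counted with grep. A minimal sketch; the sample file, timestamps, and identifiers below are illustrative only, and on a live host you would point grep at /var/run/log/vobd.log instead:

```shell
# Create an illustrative sample of vobd.log entries (hypothetical timestamps/identifiers).
cat > /tmp/vobd_sample.log <<'EOF'
2024-01-01T00:00:00.000Z: [APDCorrelator] 100us: [vob.storage.apd.start] Device or filesystem with identifier [aaaa-bbbb] has entered the All Paths Down state.
2024-01-01T00:05:00.000Z: [APDCorrelator] 200us: [vob.storage.apd.exit] Device or filesystem with identifier [aaaa-bbbb] has exited the All Paths Down state.
EOF

# Count APD entry and exit events; on a live host use /var/run/log/vobd.log instead.
grep -c 'apd.start' /tmp/vobd_sample.log
grep -c 'apd.exit'  /tmp/vobd_sample.log
```

A large gap between the start and exit counts suggests datastores that are stuck in APD rather than recovering.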
These symptoms are observed when there are intermittent failures in NFS GETATTR requests.
NFS heartbeats are used to determine whether an NFS volume is still available.
NFS heartbeats are actually GETATTR requests on the root file handle of the NFS volume.
If the APD is caused by failed GETATTR requests, this is reflected in the vsish statistics for the NFS datastore:
vsish -e get vmkModules/nfsclient/mnt/<NFS-Datastore-name>/opsStats
Ops statistics {
getAttr:808483
getAttrFailed:97
setAttr:1710
setAttrFailed:0
readLink:0
readLinkFailed:0
create:3458
createFailed:6
mkdir:0
mkdirFailed:0
rmdir:0
rmdirFailed:0
remove:3452
removeFailed:0
rename:0
renameFailed:0
symLink:0
symLinkFailed:0
readdirplus:0
readdirplusFailed:0
lookup:30390
lookupFailed:19825
}
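The failed counters above are easiest to interpret as failure rates per operation type. A minimal sketch in Python, assuming the vsish output has been captured to a string; the helper names parse_ops_stats and failure_rates are hypothetical, and the sample text mirrors a subset of the counters shown above:

```python
# Parse "name:count" lines from a captured vsish opsStats dump and report
# the failure rate for each operation that has a matching "<op>Failed" counter.
def parse_ops_stats(text):
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if ":" in line:
            name, _, value = line.partition(":")
            if value.strip().isdigit():
                stats[name.strip()] = int(value)
    return stats

def failure_rates(stats):
    rates = {}
    for op, count in stats.items():
        failed = stats.get(op + "Failed")
        if failed is not None and count > 0:
            rates[op] = failed / count
    return rates

# Sample counters taken from the opsStats output above.
sample = """\
getAttr:808483
getAttrFailed:97
lookup:30390
lookupFailed:19825
"""
for op, rate in failure_rates(parse_ops_stats(sample)).items():
    print(f"{op}: {rate:.2%} failed")
```

With the counters above, the lookup failure rate stands out far more than the GETATTR rate, which is the kind of pattern worth raising with the storage vendor.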
This indicates an issue on either the network or the storage side, as the host is not receiving responses to these requests.
Engage your storage vendor to investigate the APD events and confirm whether the storage array is receiving the GETATTR requests. If it is not, this indicates a network issue.