Failed to power on virtual machines on a NFS3 datastore

Article ID: 345175

Products

VMware

Issue/Introduction

This article addresses an NFS client heap exhaustion issue that prevents virtual machines on NFS3 datastores from powering on.

Symptoms:
  • VMs on NFS3 datastores fail to power on.
  • The datastores may have been disconnected earlier, and VMs cannot be powered on after being shut down.
  • Migrating the affected VMs to a different ESXi host can allow them to power on.
  • Powered-on VMs hang and become inoperable.
  • Linux VMs go into read-only mode.
  • The vmkernel.log file shows the NFS I/O slab running out of memory, as below:
 2018-04-09T17:04:36.296Z cpu6:33922506)WARNING: NFS: 5603: Failed to convert sgArr to NFSIoInfo: Out of memory
2018-04-09T17:04:36.304Z cpu39:66015)WARNING: NFS: 2189: Unable to allocate fileData
2018-04-09T17:04:36.304Z cpu39:66015)NFS: 2333: Failed to get object (0x43910ef9b33e) 52 e7a8a86e 6e6796e8 080de57f5 0 1617 11415c21 80de57f5 4000000000 405ae47e2a 0 0 :Out of memory
2018-04-09T17:04:36.310Z cpu15:33173614)WARNING: NFS: 5603: Failed to convert sgArr to NFSIoInfo: Out of memory
2018-04-09T17:36:54.856Z cpu34:67840)WARNING: NFSLock: 310: out of memory
2018-04-09T17:36:54.856Z cpu34:67840)NFSLock: 1771: Failed to create lock info for lock file .lck-ea2d750400000000 :Out  of memory
2018-04-09T17:36:54.857Z cpu34:67840)NFSLock: 2965: failed to get lock on file slotsfile 0x4306244ef380 on 10.1.42.85  (10.1.42.85): Out of memory
2018-04-09T17:36:54.866Z cpu34:67840)WARNING: NFSLock: 310: out of memory
  • The vmkwarning.log file shows messages similar to the following:
 vmkwarning.4:2018-04-09T16:37:07.462Z cpu34:36813845)WARNING: Heap: 3534: Heap nfsclient already at its  maximum size. Cannot expand.
vmkwarning.4:2018-04-09T16:38:29.580Z cpu18:9791527)WARNING: Heap: 4169: Heap_Align(nfsclient, 3088/3088  bytes, 8 align) failed. caller: 0x41802ec3895a
vmkwarning.5:2018-04-09T16:34:51.623Z cpu0:34884445)WARNING: Heap: 4169: Heap_Align(nfsclient, 1552/1552  bytes, 8 align) failed. caller: 0x41802ec3895a
vmkwarning.5:2018-04-09T16:35:50.192Z cpu20:39255722)WARNING: Heap: 3534: Heap nfsclient already at its  maximum size. Cannot expand.

          

NOTE: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
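
To check a host's logs for these signatures, a small script along the following lines may help. This is a minimal sketch, not a VMware tool: the regular expressions are derived only from the example messages above, the category names are made up for illustration, and message wording may differ between ESXi builds.

```python
import re
import sys

# Signatures derived from the example log lines in this article.
# The category names are hypothetical labels chosen for this sketch.
PATTERNS = {
    "heap_at_max": re.compile(r"Heap nfsclient already at its\s+maximum size"),
    "heap_align_failed": re.compile(r"Heap_Align\(nfsclient,.*failed"),
    "nfs_out_of_memory": re.compile(
        r"NFS(Lock)?: \d+: .*out\s+of memory", re.IGNORECASE
    ),
}


def scan_lines(lines):
    """Count how many lines match each NFS heap-exhaustion signature."""
    counts = {name: 0 for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def scan_file(path):
    """Scan one uncompressed log file, replacing undecodable bytes."""
    with open(path, "r", errors="replace") as handle:
        return scan_lines(handle)


if __name__ == "__main__":
    # Pass one or more log file paths on the command line.
    for log_path in sys.argv[1:]:
        print(log_path, scan_file(log_path))
```

The script takes log file paths as arguments, for example copies of vmkernel.log and vmkwarning.log taken from the host. Non-zero counts in all three categories are consistent with the symptoms described here, but they do not prove the cause on their own.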

Cause

The NFS client heap is exhausted. This happens under heavy load combined with high NFS latency on the target. The NFS client allows a maximum of 16384 (16K) outstanding I/Os, and the heap size is computed from that maximum. When the number of outstanding I/Os exceeds 16K, allocations fail with "Out of memory".
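
To see how load and latency together can push the host past this limit, a rough estimate can be made with Little's Law (average I/Os in flight = I/O rate × latency). The sketch below is illustrative only: the workload figures are hypothetical, and it treats the 16384 limit as a single pool, which this article does not specify.

```python
NFS_MAX_OUTSTANDING_IOS = 16384  # limit stated in the Cause section


def estimated_outstanding_ios(iops, latency_ms):
    """Little's Law: average I/Os in flight = arrival rate x time in system."""
    return iops * latency_ms / 1000


def exceeds_nfs_limit(iops, latency_ms, limit=NFS_MAX_OUTSTANDING_IOS):
    """True when the estimated in-flight I/O count is above the limit."""
    return estimated_outstanding_ios(iops, latency_ms) > limit


if __name__ == "__main__":
    # Hypothetical workloads, not measurements from an affected host:
    # the same I/O rate at a healthy latency and at a badly degraded one.
    for iops, latency_ms in [(20000, 5), (20000, 1000)]:
        in_flight = estimated_outstanding_ios(iops, latency_ms)
        print(iops, latency_ms, in_flight, exceeds_nfs_limit(iops, latency_ms))
```

Under these assumed numbers, 20,000 IOPS at 5 ms keeps about 100 I/Os in flight, while the same rate at 1 second of latency implies about 20,000, which is above the 16384 limit. This is why a latency spike on the NFS target can trigger the failure even when the I/O rate itself has not changed.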

Resolution

This is a known issue in VMware vSphere ESXi 6.5. Currently there is no resolution for ESXi 6.5.
This issue is resolved in VMware vSphere ESXi 6.7 Patch ESXi670-202004002. To download the patch, go to the Customer Connect Patch Downloads page.


Workaround:
To avoid this issue, increase the NFS maxTransferSize from 64K to 1MB on the NFS server. The volume must be unmounted and remounted for this change to take effect.
Refer to the storage vendor documentation for details on how to modify maxTransferSize.
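
On the ESXi side, the unmount and remount can be done with the esxcli storage nfs remove and esxcli storage nfs add commands. The following sketch only builds and prints those command lines so they can be reviewed before being run. The datastore name and export path are hypothetical placeholders, and the helper function is not part of any VMware tooling.

```python
import shlex


def remount_commands(volume_name, nfs_host, share_path):
    """Return the esxcli argument lists to unmount, then remount, an NFS3 datastore."""
    remove = ["esxcli", "storage", "nfs", "remove", "--volume-name", volume_name]
    add = [
        "esxcli", "storage", "nfs", "add",
        "--host", nfs_host,
        "--share", share_path,
        "--volume-name", volume_name,
    ]
    return [remove, add]


if __name__ == "__main__":
    # "nfs_ds01" and "/export/vmstore" are placeholders for illustration;
    # 10.1.42.85 is the NFS server address from the log excerpt above.
    for command in remount_commands("nfs_ds01", "10.1.42.85", "/export/vmstore"):
        print(" ".join(shlex.quote(arg) for arg in command))
```

The current mounts can be listed first with esxcli storage nfs list to confirm the exact volume name, host, and share. A datastore generally cannot be unmounted while VMs on it are running, so plan the remount for a maintenance window.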

Additional Information

Impact/Risks:
None