VCF Administrator Unable to vMotion Compute Resources from Host to Host.

Article ID: 411364

Products

VMware vSphere ESX 8.x

Issue/Introduction

  • Virtual machines are unable to update their configuration because their file handles are reported as stale; the file pointers are not being updated correctly.

Environment

  • ESXi 8.x
  • NFS 4.1 share

Cause

During an attempted compute-only vMotion of a virtual machine, the operation fails and the vMotion process terminates. When this issue is encountered, numerous 'stale file handle' errors are reported in the logs.

Below is an example of the critical failures reported in hostd.log for the vMotion attempt:

(The error below is truncated and edited to remove any customer-specific data. The error listed in hostd/vmkernel will be longer and more detailed.)

  • Hostd[2100154] [Originator@6876 sub=Vigor.Vmsvc.vm:/vmfs/volumes/[datastore_UUID]/vm_name.vmx] VMotionInitiateSrc: Start message: Unable to retrieve the current working directory: Stale file handle. Check if the directory has been deleted or unmounted.
  • Hostd[###] --> Failed to initialize migration at source. Error 0xbad0007. Bad parameter
  • Hostd[###] --> value = "116 (Stale file handle)"

As noted above, the primary error is thrown because the host attempts to change the .vmx file in question and is unable to do so for the reason below:

  • Hostd[###] -->    reason = "Unable to retrieve the current working directory: Stale file handle. Check if the directory has been deleted or unmounted.

While this reads like a storage-related issue, it is actually related to how the host aligns and updates the file pointers.

Using the same timestamp (which is included in all host logs), similar 'stale file handle' messages can be observed flooding vmkernel.log during the same period:
  • vmkwarning: cpu39:####)WARNING: NFS41: NFS41FileOpGetFileAttributes:4681: Failed to get file attributes for object 0x430c923a35e0 name vmware.log: Stale file handle
  • vmkwarning: cpu39:####)WARNING: NFS41: NFS41FileOpGetFileAttributes:4681: Failed to get file attributes for object 0x430c923a35e0 name vmware.log: Stale file handle
  • vmkwarning: cpu39:####)WARNING: NFS41: NFS41FileOpGetFileAttributes:4681: Failed to get file attributes for object 0x430c923a35e0 name vmware.log: Stale file handle
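
To gauge how heavily a host is affected, the repeated warnings can be tallied per file name. The following is a minimal Python sketch, not a supported tool: the regular expression is modeled only on the three log lines quoted above, and the function name count_stale_handles is hypothetical. Real vmkernel.log lines may carry additional fields, so the pattern may need adjusting.

```python
import re
from collections import Counter

# Pattern modeled on the vmkernel.log lines quoted above (an assumption,
# not an official log format specification).
STALE_RE = re.compile(
    r"WARNING: NFS41: \w+:\d+: .*? name (?P<name>\S+): Stale file handle"
)


def count_stale_handles(lines):
    """Count NFS41 'Stale file handle' warnings per reported file name."""
    counts = Counter()
    for line in lines:
        match = STALE_RE.search(line)
        if match:
            counts[match.group("name")] += 1
    return counts
```

The function accepts any iterable of lines, so an open file handle for a copy of vmkernel.log can be passed directly; calling most_common() on the result lists the most frequently reported files first.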

This issue occurs only when NFS 4.1 shares are in use; NFS 3 shares are not affected.

Resolution

This is a known issue identified by Engineering.

The fix for this issue is included in ESXi 8.0 P06 (build 24859861), the July 29th GA release.

It is recommended that all hosts on build 24022510 or lower that are experiencing this issue be updated to build 24859861 or above. This build can be accessed through the Broadcom customer portal, or by setting up an upgrade profile in vCenter Server.
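
When several hosts need to be checked, the build comparison can be scripted. The sketch below is illustrative only: it assumes a version string that contains the build in the form "build-<number>" (as printed, for example, by the vmware -v command on a host), and the helper names parse_build and needs_update are hypothetical. The comparison is only meaningful among ESXi 8.0 hosts, since build numbers are not comparable across major release lines.

```python
import re

# Build that contains the fix, per this article (ESXi 8.0 P06).
FIXED_BUILD = 24859861


def parse_build(version_string):
    """Extract the integer build number from a string such as '... build-24022510'."""
    match = re.search(r"build-(\d+)", version_string)
    if match is None:
        raise ValueError(f"no build number found in: {version_string!r}")
    return int(match.group(1))


def needs_update(version_string, fixed_build=FIXED_BUILD):
    """Return True if the host's build is older than the build containing the fix."""
    return parse_build(version_string) < fixed_build
```

For example, a host reporting build-24022510 is flagged for update, while a host already on build-24859861 or later is not.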
