After HA is triggered VMs are reported "Moved or Copied" in NFSv3 environment
Article ID: 323108
Updated On: 10-07-2024
Products
VMware vSphere ESXi
Issue/Introduction
Symptoms:
After HA is triggered, vSphere Client reports that VMs have been "Moved or Copied".
There is a split brain in which the VM seems to be running on multiple ESXi hosts.
The VMs report a lock conflict during VM power-on.
Messages similar to the following appear in vmware.log:
2019-02-24T04:52:25.477Z| vmx| I125: Msg_Question:
2019-02-24T04:52:25.477Z| vmx| I125: [msg.uuid.altered] This virtual machine might have been moved or copied.
2019-02-24T04:52:25.477Z| vmx| I125+ In order to configure certain management and networking features, VMware ESX needs to know if this virtual machine was moved or copied.
2019-02-24T04:52:25.477Z| vmx| I125+
2019-02-24T04:52:25.477Z| vmx| I125+ If you don't know, answer "I Copied It".
2019-02-24T04:52:25.477Z| vmx| I125:
2019-02-24T04:52:25.477Z| vmx| I125: ----------------------------------------
2019-02-24T04:52:25.479Z| vmx| I125: Vigor_ClientRequestCb: failed to do op=3 on unregistered device 'Tools' (cmd=queryFields)
2019-02-24T04:52:25.479Z| vmx| I125: Vigor_ClientRequestCb: failed to do op=3 on unregistered device 'CrashDetector' (cmd=queryFields)

An answer is received from hostd here. Reply choice 1 means "I Moved It":
2019-02-24T04:55:46.537Z| vmx| I125: VigorTransportProcessClientPayload: opID=SWI-4bb841e1-646f seq=35: Receiving Bootstrap.MessageReply request.
2019-02-24T04:55:46.538Z| vmx| I125: VigorTransport_ServerSendResponse opID=SWI-4bb841e1-646f seq=35: Completed Bootstrap request.
2019-02-24T04:55:46.538Z| vmx| I125: MsgQuestion: msg.uuid.altered reply=1
2019-02-24T04:55:46.538Z| vmx| I125: UUID: Writing uuid.location value: '56 4d f2 a6 10 2e ad d2-e5 51 95 0c 71 21 d7 db'

Opening all the disks took around 3.7 minutes:
2019-02-24T04:59:28.782Z| vmx| I125: DISK: Opening disks took 222203 ms.

vmkernel starting the VM <vmname>:
2019-02-24T04:52:25.438Z cpu56:25138582)World: vm 25302727: 7379: Starting world vmm1:<vmname> of type 8

NFS file locks:
2019-02-24T04:53:56.595Z cpu75:18209067 opID=50304a5d)WARNING: NFSLock: 2219: File is being locked by a consumer on host <hostname> with exclusive lock.
Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
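To gauge how long a power-on was held up, the timestamps in the excerpts above can be compared directly. The following Python sketch is an illustration only, not a VMware tool: the function and pattern names are hypothetical, and the patterns assume the exact message wording shown in the example excerpts, which may differ between ESXi builds. It extracts the reply to the msg.uuid.altered question, the time between the question and the reply, and the reported disk-open duration.

```python
import re
from datetime import datetime

# Patterns assume the message wording shown in the example excerpts.
TS = r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})Z"
QUESTION = re.compile(TS + r"\| vmx\| I\d+: \[msg\.uuid\.altered\]")
REPLY = re.compile(TS + r"\| vmx\| I\d+: MsgQuestion: msg\.uuid\.altered reply=(\d+)")
DISKS = re.compile(r"DISK: Opening disks took (\d+) ms")


def parse_ts(text):
    """Parse a vmware.log timestamp (without the trailing Z)."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%f")


def summarize(log_text):
    """Return the reply choice, question-to-reply wait, and disk-open time."""
    summary = {"reply": None, "wait_seconds": None, "disk_open_seconds": None}
    question = QUESTION.search(log_text)
    reply = REPLY.search(log_text)
    disks = DISKS.search(log_text)
    if reply:
        summary["reply"] = int(reply.group(2))
    if question and reply:
        delta = parse_ts(reply.group(1)) - parse_ts(question.group(1))
        summary["wait_seconds"] = delta.total_seconds()
    if disks:
        # The log reports milliseconds; convert to seconds.
        summary["disk_open_seconds"] = int(disks.group(1)) / 1000
    return summary
```

Calling summarize() on the contents of a vmware.log file returns a dictionary. Any field whose message is not found stays None.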
NFSv3 uses disk-based locks to provide synchronization across hosts. The delay from NFS is due to the NFSv3 locking mechanism.
Resolution
NFSv3 uses disk-based locks to provide exclusive access to VM files such as .vmx and .vmdk. This is the behavior of NFSv3 locking in ESXi and will be observed in HA setups. The limitation comes from NFSv3 locking.
NFSv3 locking on ESXi does not use the Network Lock Manager (NLM) protocol. Instead, VMware provides its own locking protocol. NFSv3 locks are implemented by creating lock files on the NFS server. Lock files are named .lck-file_id.
In NFSv4.1, the NFS server manages all the locks. NFSv4.1 uses share reservations as a locking mechanism. Support for NFSv4.1 was introduced in ESXi 6.0.
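Because NFSv3 locks exist as files on the NFS server, they can be listed by name. The following Python sketch is a hedged illustration rather than a supported procedure: it assumes the export is readable from the machine running the script, the function name is hypothetical, and it relies only on the .lck- naming stated above. It lists matching files and does not identify which host holds a lock or modify anything.

```python
import os

LOCK_PREFIX = ".lck-"  # naming stated in this article: .lck-file_id


def find_lock_files(root):
    """Yield paths of files under root whose names start with LOCK_PREFIX."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.startswith(LOCK_PREFIX):
                yield os.path.join(dirpath, name)
```

For example, list(find_lock_files(path)) returns the lock files found under path, where path is the directory at which the NFS export is mounted in your environment.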
For more information, refer to the VMware documentation: NFS File Locking.