vMotion of large size VMs attached to NSX-T segments fails

Article ID: 322662

Updated On:

Products

VMware NSX Networking
VMware vSphere ESXi

Issue/Introduction

Symptoms:
  • NSX-T 3.0.x is deployed in the environment.
  • vMotion of small VMs completes successfully.
  • vMotion of large VMs runs for approximately an hour and then fails.
  • In the log /var/run/log/nsx-syslog on the destination ESXi host, entries similar to the following appear (a port attach for the VM, followed later by a sync detach of the same port):
    2020-12-22T02:53:41Z nsx-opsagent[2100365]: NSX 2100365 - [nsx@6876 comp="nsx-esx" subcomp="opsagent" s2comp="nsxa" tid="2100670" level="INFO"] [DoVifPortOperation] request=[opId:[kip8z8j0-1813788-auto-12vj1-h5:70092855-41-01-e9-d60e-11] op:[HOSTD_ATTACH_PORT(1)] vif:[90c48f52-2435-454f-8eed-03ffde62e4c3] ls:[11b69099-8c5e-413b-9d61-3190a4c8f240] vmx:[/vmfs/volumes/5fdc4609-4744fe6c-6e68-0c42a1cba660/rp000005736/vm01.vmx] lp:[]]
    ... 
    2020-12-22T03:37:41Z nsx-opsagent[2100365]: NSX 2100365 - [nsx@6876 comp="nsx-esx" subcomp="opsagent" s2comp="nsxa" tid="2100668" level="INFO"] [DoVifPortOperation] request=[opId:[sync-detach-0] op:[SYNC_DETACH_PORT(1002)] vif:[90c48f52-2435-454f-8eed-03ffde62e4c3] ls:[11b69099-8c5e-413b-9d61-3190a4c8f240] vmx:[/vmfs/volumes/5fdc4609-4744fe6c-6e68-0c42a1cba660/rp000005736/vm01.vmx] lp:[3ccb1acb-215c-42f6-85fc-7df510799f94]]
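
To check a host for this pattern, the attach and detach operations can be filtered out of the log. The following is a minimal sketch, not a VMware tool: the function name show_port_ops is hypothetical, and it assumes the log lines have the format shown in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: list HOSTD_ATTACH_PORT / SYNC_DETACH_PORT operations
# logged by nsx-opsagent, one per line as "<timestamp> <operation> <vif>".
#   $1 = log file (default: the path named in this article)
#   $2 = optional VIF UUID to filter on (default: all VIFs)
show_port_ops() (
    log="${1:-/var/run/log/nsx-syslog}"
    vif="${2:-}"
    grep -F 'DoVifPortOperation' "$log" \
        | grep -E 'op:\[(HOSTD_ATTACH_PORT|SYNC_DETACH_PORT)\(' \
        | grep -F "vif:[$vif" \
        | sed -E 's/^([^ ]+) .*op:\[([A-Z_]+)\([0-9]+\)\] vif:\[([^]]*)\].*/\1 \2 \3/'
)

# Example (run on the destination ESXi host):
#   show_port_ops /var/run/log/nsx-syslog 90c48f52-2435-454f-8eed-03ffde62e4c3
```

A SYNC_DETACH_PORT entry that follows a HOSTD_ATTACH_PORT entry for the same VIF while the vMotion is still running matches the symptom described here.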


Environment

VMware NSX-T Data Center
VMware NSX-T Data Center 3.x
VMware vSphere 7.0.x

Cause

On the ESXi host, NSX-T has a resync timer that runs every hour (3600 seconds by default) to check for stale or unused ports.
If a vMotion does not complete before the next resync run, the resync considers the VM's port on the destination host unused and removes it, which causes the vMotion to fail.
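
The timing can be illustrated with a small calculation. This is a simplified model of the description above, not NSX-T code: the function name vmotion_at_risk and its inputs are hypothetical, and it assumes the resync runs at fixed intervals regardless of when the vMotion starts.

```shell
#!/bin/sh
# Simplified model (an assumption, not NSX-T code): a vMotion is at risk
# when the next resync run falls inside its time window.
#   $1 = seconds elapsed since the last resync run when the vMotion starts
#   $2 = expected vMotion duration in seconds
#   $3 = resync interval in seconds (intervalInSec)
vmotion_at_risk() (
    since_last="$1"; duration="$2"; interval="$3"
    until_next=$(( interval - since_last ))
    if [ "$duration" -ge "$until_next" ]; then
        echo "at risk: next resync run in ${until_next}s, vMotion needs ${duration}s"
    else
        echo "ok: vMotion finishes $(( until_next - duration ))s before the next resync run"
    fi
)

# Example with illustrative numbers:
#   vmotion_at_risk 960 3000 3600    # default interval
#   vmotion_at_risk 960 3000 86400   # extended interval
```

In the log excerpt in the Symptoms section, SYNC_DETACH_PORT is logged 44 minutes (2640 seconds) after HOSTD_ATTACH_PORT, so under this model a vMotion does not need to run for a full hour to be affected; it only needs to still be in progress when the timer fires.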

Resolution

This issue is resolved in NSX-T Data Center 3.1, available at VMware Downloads.

Workaround:
To allow vMotion of large VMs to complete, extend the resync interval on each impacted ESXi host as follows:
1. Log in to the ESXi host as root.
2. Make a backup of the file /etc/vmware/nsx-opsagent/nsxa.json:
cp /etc/vmware/nsx-opsagent/nsxa.json /etc/vmware/nsx-opsagent/nsxa.json.bak
3. Open the file in vi:
vi /etc/vmware/nsx-opsagent/nsxa.json
Change the resync interval from the default value:
 "resync" : {
      "disabled" : "false",
      "intervalInSec" : 3600
   }
 
To:
 "resync" : {
      "disabled" : "false",
      "intervalInSec" : 86400
   }
4. Save the file and exit vi:
:wq
5. Restart the opsagent service on the ESXi host:
/etc/init.d/nsx-opsagent restart
6. Repeat the above steps on all impacted hosts.
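
Steps 2 and 3 can also be scripted when many hosts are affected. The following is a minimal sketch rather than a supported procedure: the function name set_resync_interval is hypothetical, and it assumes the "resync" block in nsxa.json is laid out as shown above, with "intervalInSec" on its own line.

```shell
#!/bin/sh
# Hypothetical helper: back up nsxa.json, then change intervalInSec inside
# the "resync" block only. Prints the resulting block for review.
#   $1 = config file (default: the path named in this article)
#   $2 = new interval in seconds (default: 86400)
set_resync_interval() (
    conf="${1:-/etc/vmware/nsx-opsagent/nsxa.json}"
    new="${2:-86400}"
    # Reject anything that is not a whole number of seconds.
    case "$new" in
        ''|*[!0-9]*) echo "interval must be a whole number of seconds" >&2; exit 1 ;;
    esac
    # Step 2: keep a copy of the original file.
    cp "$conf" "$conf.bak" || exit 1
    # Step 3: rewrite from the backup, touching only the resync block.
    sed -E '/"resync"[[:space:]]*:/,/[}]/ s/("intervalInSec"[[:space:]]*:[[:space:]]*)[0-9]+/\1'"$new"'/' \
        "$conf.bak" > "$conf" || exit 1
    grep -A 3 '"resync"' "$conf"
)

# Example (run as root on the ESXi host, then restart the agent as in step 5):
#   set_resync_interval /etc/vmware/nsx-opsagent/nsxa.json 86400
#   /etc/init.d/nsx-opsagent restart
```

The helper deliberately leaves the service restart as a separate manual command so that the printed "resync" block can be checked first.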