NSX lost all the tags of VMs

Article ID: 322513

Products

VMware NSX

Issue/Introduction

  • VMs lose their NSX tags after they have been unregistered, or shown as inaccessible, for more than 30 minutes.
  • A VM that is brought back into the inventory after 30 minutes is treated as a new object. Similarly, a VM that is deleted from the vCenter inventory and re-added is treated as a new object and loses its NSX tags, even if it is re-added in under 30 minutes.
  • Relevant logs to look for:
    For VMware NSX-T Data Center 3.1.x and below, look in: /var/log/proton/nsxapi.log
    For VMware NSX-T Data Center 3.2.x and above look in: /var/log/cm-inventory/cm-inventory.log

    20XX-05-21T19:12:28.279Z INFO inventoryTasksScheduler-2 CleanupHandler - FABRIC [nsx@6876 comp="manager_host_name" level="INFO" subcomp="manager"] Successfully deleted data for DeletedObjectId 70####18-8###-###c-###b-b########8p0 of type VM
    20XX-05-21T19:12:30.568Z INFO inventoryTasksScheduler-2 CleanupHandler - FABRIC [nsx@6876 comp="manager_host_name" level="INFO" subcomp="manager"] Successfully deleted data for DeletedObjectId 70####18-8###-###c-###b-b########8p0 2 of type VM
    20XX-05-21T19:14:28.309Z INFO inventoryTasksScheduler-2 CleanupHandler - FABRIC [nsx@6876 comp="manager_host_name" level="INFO" subcomp="manager"] Successfully deleted data for DeletedObjectId 70####18-8###-###c-###b-b########8p0 of type VM

    The following log line in cm-inventory.log indicates that the 30-minute timer has been triggered for the VM with external ID 50####18-8###-###c-###b-b########8a0:

    -----------------------
    20XX-07-03T08:15:02.300Z INFO inventoryTasksScheduler1 VirtualMachineServiceImpl 4257 FABRIC [nsx@6876 comp="manager_host_name" level="INFO" subcomp="cm-inventory"] Marked VMContainer20####38-8###-###c-###b-b########2q0 as deleted with timestamp 16######02300
    -----------------------

    The following log line in nsxapi.log indicates that the 30-minute time limit has been reached and that the VM with external ID 50####18-8###-###c-###b-b########8a0 has been deleted:

    -----------------------
    20XX-07-03T08:46:03.837Z INFO inventoryTasksScheduler10 CleanupHandler 4257 SYSTEM [nsx@6876 comp="manager_host_name" level="INFO" subcomp="cm-inventory"] Successfully deleted data for DeletedObjectId 70####18-8###-###c-###b-b########8p0 of type VM
    -----------------------
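
A minimal sketch of how the two kinds of log lines above could be picked out of a log file. The regular expressions are derived only from the sample lines in this article, so the exact wording may differ between NSX versions; the function name `scan_inventory_log` is hypothetical.

```python
import re

# Patterns derived from the sample lines in this article (an assumption:
# other NSX versions may word these messages differently).
MARKED = re.compile(
    r"^(?P<ts>\S+) .*Marked VMContainer\s*(?P<id>\S+) as deleted with timestamp"
)
DELETED = re.compile(
    r"^(?P<ts>\S+) .*Successfully deleted data for DeletedObjectId (?P<id>\S+).* of type VM"
)


def scan_inventory_log(lines):
    """Return (timestamp, event, object_id) tuples for matching log lines.

    event is "marked" when the 30-minute timer starts for a VM and
    "deleted" when the VM has been removed from the inventory.
    """
    events = []
    for line in lines:
        line = line.strip()
        for event, pattern in (("marked", MARKED), ("deleted", DELETED)):
            match = pattern.search(line)
            if match:
                events.append((match.group("ts"), event, match.group("id")))
                break
    return events
```

The function accepts any iterable of strings, so an open file handle for cm-inventory.log or nsxapi.log can be passed directly.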

Environment

VMware NSX-T Data Center
VMware NSX

Cause

The host has not reported the instance UUID of the affected VMs for more than 30 minutes. The VMs and their associated tags are then removed from the NSX inventory.

This is the expected behaviour. If a VM is marked for deletion from a host and is not claimed by any other host within 30 minutes, the VM is removed from the NSX inventory along with its associated tags.

Supporting documentation

Example scenario:  

  • Prior to an outage, the host syncs with NSX and vCenter reporting 10 VMs.
  • After an outage lasting multiple hours, the host comes back online and sends a full sync to NSX. If the host now reports 9 VMs instead of the previous 10, NSX marks the missing VM for deletion. NSX then waits 30 minutes for any host to claim the marked VM; if no host claims it within that time, NSX removes it from the inventory.
  • If the VM is registered on a host after 30 minutes, it is treated as a new VM, and tags must be reapplied.
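
The scenario above can be illustrated with a toy model. This is only a sketch of the described behaviour, not NSX code: the class `ToyInventory` and its methods are hypothetical, and the real inventory service is certainly more involved.

```python
from datetime import datetime, timedelta


class ToyInventory:
    """Toy model of the mark-then-remove behaviour described above."""

    def __init__(self, grace=timedelta(minutes=30)):
        self.grace = grace      # default delay described in this article
        self.tags = {}          # vm_id -> set of tags
        self.marked_at = {}     # vm_id -> time the VM was marked for deletion

    def mark_missing(self, vm_id, now):
        """A full sync no longer reports this VM: start its timer."""
        self.marked_at.setdefault(vm_id, now)

    def claim(self, vm_id, now):
        """A host reports the VM. If it was already removed, it is a new object."""
        self.marked_at.pop(vm_id, None)
        self.tags.setdefault(vm_id, set())  # new objects start without tags

    def sweep(self, now):
        """Remove VMs, and their tags, that stayed unclaimed past the grace period."""
        expired = [v for v, t in self.marked_at.items() if now - t > self.grace]
        for vm_id in expired:
            del self.marked_at[vm_id]
            self.tags.pop(vm_id, None)
        return expired
```

In this model, a VM claimed before the sweep keeps its tags, while a VM claimed after the sweep comes back with an empty tag set, matching the last bullet above.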

Resolution

This is the expected product behaviour.
If a virtual machine loses its NSX tags, there are two workaround options:
  • Manually reapply the NSX-T tags to the affected VMs so that they again match the intended network and security policies.
  • Restore the tags from an NSX-T Manager backup.
 

Prevention

  • If a planned activity may impact connectivity in the environment, an API call can be used to increase the timeout period from 30 minutes to a value that covers the maintenance window.
  • As a side effect, this setting keeps tags in memory for longer, which increases JVM memory usage for the core NSX Manager services (manager and inventory).
  • The tag retention period can be increased to a maximum of 72 hours (4320 minutes); however, the smallest value that covers the maintenance window is recommended.
  • This change is only supported during migration or maintenance windows, and it must be reverted to the default value when the activity is complete.
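
As a rough sketch, the smallest suitable value can be derived from the planned start and end of the maintenance window. The helper `delay_for_window` and its `margin_min` safety margin are hypothetical; only the 30-minute default and the 4320-minute maximum come from this article.

```python
import math
from datetime import datetime

DEFAULT_DELAY_MIN = 30    # default timeout stated in this article
MAX_DELAY_MIN = 4320      # 72 hours, the stated maximum


def delay_for_window(start, end, margin_min=0):
    """Return the smallest delay (in minutes) that covers the window.

    margin_min is an optional extra buffer (an assumption, not a
    product recommendation).
    """
    if end <= start:
        raise ValueError("maintenance window must end after it starts")
    minutes = math.ceil((end - start).total_seconds() / 60) + margin_min
    if minutes > MAX_DELAY_MIN:
        raise ValueError(
            f"window needs {minutes} minutes, above the {MAX_DELAY_MIN}-minute maximum"
        )
    # Never go below the default: a shorter window needs no change at all.
    return max(minutes, DEFAULT_DELAY_MIN)
```

For example, a window from 22:00 to 02:30 the next day yields 270 minutes, which is the value to use in the PUT request in step 2.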

  1. Check the current timeout period

    GET  https://<NSX Manager>/api/v1/configs/inventory/virtual-machine
    {
      "vm_tags_delete_delay" : 30
    }

  2. Change the timeout period from 30 minutes to the new value, specified in minutes

    PUT https://<NSX Manager>/api/v1/configs/inventory/virtual-machine
    {
      "vm_tags_delete_delay" : <value in minutes>
    }
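
The two calls above could be scripted along the following lines. This is a minimal sketch using only the Python standard library: it builds the requests but does not send them, and authentication and TLS certificate handling are deliberately left out because they depend on the environment. The function names and the host name in the usage note are hypothetical; the path and the `vm_tags_delete_delay` field are taken from the steps above.

```python
import json
import urllib.request

CONFIG_PATH = "/api/v1/configs/inventory/virtual-machine"
MAX_DELAY_MIN = 4320  # 72 hours, the maximum stated in this article


def build_get_request(manager):
    """Build the request that reads the current timeout (step 1)."""
    return urllib.request.Request(
        f"https://{manager}{CONFIG_PATH}",
        method="GET",
        headers={"Accept": "application/json"},
    )


def build_put_request(manager, minutes):
    """Build the request that sets a new timeout in minutes (step 2)."""
    if not isinstance(minutes, int) or not 1 <= minutes <= MAX_DELAY_MIN:
        raise ValueError(f"minutes must be an integer between 1 and {MAX_DELAY_MIN}")
    body = json.dumps({"vm_tags_delete_delay": minutes}).encode("utf-8")
    return urllib.request.Request(
        f"https://{manager}{CONFIG_PATH}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
```

A request built with, for example, `build_put_request("nsx.example.com", 270)` would then be sent with `urllib.request.urlopen` once credentials have been added. Calling `build_put_request` again with 30 after the activity restores the default value.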

Additional Information

VMs may lose communication because DFW rules are no longer applied after the NSX tags are lost.