If checked from the CLI, the result contains:
{"host": "<UUID>", "overall_state": "UPGRADE_IN_PROGRESS", "ip_address": "***.***.***.***", "upgrade_stage": "VM_RETRIVAL", "_protection": "NOT_PROTECTED"}
<Timestamp> INFO MigrateToCvdsTaskExecutor3 VMOperationImpl 12117 FABRIC [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Getting list of VMs in compute manager <CM UUID>
<Timestamp> WARN MigrateToCvdsTaskExecutor3 VMOperationImpl 12117 FABRIC [nsx@6876 comp="nsx-manager" level="WARNING" subcomp="manager"] Failed to get the VMs on host host-*******
<Timestamp> ERROR MigrateToCvdsTaskExecutor3 MigrateToCvdsTask 12117 FABRIC [nsx@6876 comp="nsx-manager" errorCode="PM100" level="ERROR" subcomp="manager"] MigrateToCvdsTask on host [<Transport node ID>] failed. Current stage VM_RETRIVAL, Aborting all remaining stages.
java.lang.NullPointerException: null
at com.vmware.nsx.management.policy.migration.util.MigrateToCvdsTask.run(MigrateToCvdsTask.java:518) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_352]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_352]
at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_352]
The migration task fails when it encounters a NullPointerException while getting the list of VMs on the ESXi host.
VM templates and inaccessible VMs on the host are known to cause this NullPointerException.
Workaround:
1. Exit maintenance mode (MM) on the host, then migrate all VM templates on the host to other hosts, or remove them.
2. Clean up the old topology by calling the REST API below:
POST https://<nsx_manager_ip>/api/v1/nvds-urt?action=cleanup
3. Create a new precheck with the API below and note the precheck ID from the output:
POST https://<nsx_manager_ip>/api/v1/nvds-urt/precheck
4. Generate the URT topology with the API below, using the precheck ID from step 3:
GET https://<nsx_manager_ip>/api/v1/nvds-urt/topology/<precheck_id>
5. Apply the topology with the API below, using the output received from step 4 as the payload:
POST https://<nsx_manager_ip>/api/v1/nvds-urt/topology?action=apply
6. Retrigger the migration for the host with the API below:
POST https://<nsx_manager_ip>/api/v1/transport-nodes/<tn_id>?action=migrate_to_vds
Note: All versions up to and including 3.2.5 could potentially hit this issue.
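
The API sequence in steps 2 through 6 of the workaround can be sketched as a small Python function. This is a minimal sketch, not a supported tool. The names `rerun_migration` and `send` are hypothetical, and authentication and TLS handling are left entirely to the `send` callback. The code also assumes that the precheck call returns the precheck ID as its response body, which the steps above do not specify, so adjust the extraction to match the actual output.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical transport callback: (method, url, body) -> response text.
# A real implementation would perform the authenticated HTTPS request.
Send = Callable[[str, str, Optional[str]], str]


def rerun_migration(manager: str, tn_id: str, send: Send) -> List[Tuple[str, str]]:
    """Replay workaround steps 2-6 in order and return the (method, url) calls made."""
    base = f"https://{manager}/api/v1"
    calls: List[Tuple[str, str]] = []

    def call(method: str, path: str, body: Optional[str] = None) -> str:
        url = base + path
        calls.append((method, url))
        return send(method, url, body)

    # Step 2: clean up the old topology.
    call("POST", "/nvds-urt?action=cleanup")

    # Step 3: create a new precheck.
    # ASSUMPTION: the response body is the precheck ID, possibly quoted.
    precheck_id = call("POST", "/nvds-urt/precheck").strip().strip('"')

    # Step 4: generate the URT topology for that precheck.
    topology = call("GET", f"/nvds-urt/topology/{precheck_id}")

    # Step 5: apply the topology, passing the step 4 output back as the payload.
    call("POST", "/nvds-urt/topology?action=apply", topology)

    # Step 6: retrigger the migration for the transport node.
    call("POST", f"/transport-nodes/{tn_id}?action=migrate_to_vds")

    return calls
```

Passing the transport in as `send` keeps the ordering logic separate from credentials and certificate handling, and lets the sequence be dry-run with a stub before it is pointed at a real NSX Manager. Step 1 (moving or removing the VM templates) still has to be completed first.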