NVDS to VDS migration URT precheck fails

Symptoms:
The URT precheck status summary (Get VDSUpgradeSummary) reports the precheck as FAILED:
{
  "precheck_id" : "c2415c53-2555-4b73-b80e3e1d1e37xxxx",
  "precheck_status" : "FAILED"
}
The NSX Manager log shows the precheck failing with a NoSuchElementException:
2023-07-31T01:29:37.607Z INFO http-nio-127.0.0.1-7440-exec-6 NvdsUpgradeReadinessCheckServiceImpl 514 FABRIC [nsx@6876 comp="nsx-manager" level="INFO" reqId="5ef0d90a-xxxx-48f4-84b1-5beb5b90ab14" subcomp="manager" username="admin"] NVDS-UPGRADE triggerNvdsUpgradePrecheck by default. tolDiffConfig: true
2023-07-31T01:29:37.611Z INFO http-nio-127.0.0.1-7440-exec-6 NvdsUpgradeReadinessCheckServiceImpl 514 FABRIC [nsx@6876 comp="nsx-manager" level="INFO" reqId="5ef0d90a-xxxx-48f4-84b1-5beb5b90ab14" subcomp="manager" username="admin"] NVDS-UPGRADE triggerNvdsUpgradePrecheck precheckStatus FAILED
2023-07-31T01:29:38.124Z INFO NvdsUpgradeTaskExecutor1 NvdsUpgradeReadinessCheckServiceImpl 514 FABRIC [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] NVDS-UPGRADE precheck TN e6731f9e-xxxx-xxxx-xxxx-723c423f3881 vcID 8168b472-xxxx-xxx-xxxx-0cc9690a14de dcID datacenter-2
2023-07-31T01:29:38.127Z INFO NvdsUpgradeTaskExecutor1 NvdsUpgradeReadinessCheckServiceImpl 514 FABRIC [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] NVDS-UPGRADE TN e6731f9e-b267-xxxx-b930-723c423f3881 vmk id {
value: "vmk1"
}
2023-07-31T01:29:38.131Z INFO NvdsUpgradeTaskExecutor1 NvdsUpgradeReadinessCheckServiceImpl 514 FABRIC [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] NVDS-UPGRADE exception in precheckConfig XXXX: java.util.NoSuchElementException
at java.util.Collections$EmptyIterator.next(Collections.java:4191)
at com.vmware.nsx.management.nvdsupgrade.service.NvdsUpgradeReadinessCheckServiceImpl.validateNvdsUpgradeConfig(NvdsUpgradeReadinessCheckServiceImpl.java:1043)
at com.vmware.nsx.management.nvdsupgrade.service.NvdsUpgradeReadinessCheckServiceImpl.precheckConfig(NvdsUpgradeReadinessCheckServiceImpl.java:900)
at com.vmware.nsx.management.nvdsupgrade.service.NvdsUpgradeReadinessCheckServiceImpl$NvdsUpgradeTask.run(NvdsUpgradeReadinessCheckServiceImpl.java:1361)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
This precheck failure is caused by logical ports that are missing for one or more vmkernel (vmk) interfaces. During the URT precheck, if a vmk on a host is connected to a logical port whose transport zone type is OVERLAY, the URT workflow migrates that vmk by unprepping the host, reporting that the vmknic is connected to an overlay logical switch. To perform this check, the precheck validates every logical port attached to the host's vmknics. If the logical port attachment ID of a vmk is not associated with any logical port in NSX MP, the lookup fails and the precheck aborts with a NoSuchElementException. Because precheck ID generation fails at this point, the URT status-summary reports the provided precheck ID as invalid.
Currently there is no resolution for this issue.
Workaround:
Check the status of the missing vmk port on the affected ESXi host:
net-stats -l | grep -w vmk1
134217749 3 0 DvsPortset-3 00:50:56:67:xx:xx vmk1
Check the DVS port associated with the above vmk:
esxcfg-vmknic -l | grep -w vmk1
vmk1 38d43aeb-xxxx-xxxx-xxxx-edca8172xxxx IPv4 192.168.106.124 255.255.255.0 192.168.106.255 00:50:56:67:xx:xx 9000 65535 true STATIC vmotion
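You can also confirm from the NSX Manager side that no logical port exists for this vmk. The sketch below is only an illustrative example: it lists the MP logical ports via the standard API and filters them by display name with jq, assuming admin basic authentication and a self-signed certificate (-k).
# List MP logical ports and look for an entry whose display name contains the vmk (empty output indicates the port is missing in NSX MP)
curl -k -u 'admin:<password>' -X GET "https://{{nsx_manager_ip}}/api/v1/logical-ports" \
  | jq '.results[] | select(.display_name | contains("vmk1")) | {id, display_name}'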
To recover from this state, follow any one of the approaches below.
Approach 1:
If the vmks do not need to be migrated to the new CVDS switch as part of the migration workflow, they can be migrated to a VSS/VDS manually through the vCenter UI, which avoids the original precheck issue. Navigate to Host > Configure > Virtual switches, click the three dots, and select Migrate VMkernel adapter.
Approach 2:
If you are using a transport node profile: power off or migrate all powered-on VMs from all hosts to which the transport node profile is applied, then follow the API procedure below to restore the missing logical ports.
1. In the NSX UI, locate the problematic host cluster by navigating to System->Fabric->Nodes->Host Transport Nodes.
2. For that cluster, under NSX Configuration, note the name of the transport node profile shown in the "Applied Profile" field.
3. Locate the same transport node profile under System->Fabric->Profiles->Transport Node Profiles.
4. For that transport node profile, note the transport node profile ID listed under the ID field.
5. Make the REST API call below to fetch the transport node profile:
GET https://{{nsx_manager_ip}}/api/v1/transport-node-profiles/<transport-node-profile-id>
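A minimal curl sketch of this call, assuming admin basic authentication and a self-signed certificate (-k); the output file name tnp.json is hypothetical:
# Fetch the transport node profile and save the full payload for later editing
curl -k -u 'admin:<password>' -X GET "https://{{nsx_manager_ip}}/api/v1/transport-node-profiles/<transport-node-profile-id>" -o tnp.json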
6. In the payload of the above API response, check whether a "vmk_install_migration" section is present under the NVDS-Overlay host switch. If it is present, verify that the vmk installation mappings look like the example below (a jq sketch for this check follows the example).
"vmk_install_migration": [
{
"device_name": "vmk1",
"destination_network": "38d43aeb-xxxx-xxxx-xxxx-edca8172xxxx"
},
{
"device_name": "vmk0",
"destination_network": "00439af1-xxxx-xxxx-xxxx-22af200dxxxx"
}
],
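A minimal jq sketch of this check, assuming the profile payload was saved to the hypothetical tnp.json file from step 5:
# Print each host switch name together with its vmk_install_migration mappings (null means the section is missing)
jq '.host_switch_spec.host_switches[] | {host_switch_name, vmk_install_migration}' tnp.json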
7. If vmk_install_migration is not present in the transport node profile payload, update the transport node profile with the same payload, adding the vmk_install_migration details shown above under the NVDS-Overlay host switch, using the REST API below:
PUT https://{{nsx_manager_ip}}/api/v1/transport-node-profiles/<transport-node-profile-id>
The payload should look like the following:
{
  "host_switch_spec": {
    "host_switches": [
      {
        "host_switch_name": "NVDS-Overlay",
        ....
        "vmk_install_migration": [
          {
            "device_name": "vmk1",
            "destination_network": "38d43aeb-xxxx-xxxx-xxxx-edca8172xxxx"
          },
          {
            "device_name": "vmk0",
            "destination_network": "00439af1-xxxx-xxxx-xxxx-22af200dxxxx"
          }
        ],
        ....
      }
    ]
  },
  ....
}
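A minimal curl sketch of the update call, assuming the complete profile JSON fetched in step 5 was edited in the hypothetical tnp.json file (NSX Manager API updates generally require the object's current _revision value to be present in the payload):
# Push the edited profile back; the body must be the complete profile object, not only the changed section
curl -k -u 'admin:<password>' -X PUT "https://{{nsx_manager_ip}}/api/v1/transport-node-profiles/<transport-node-profile-id>" \
  -H "Content-Type: application/json" \
  -d @tnp.json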
With this update of the transport node profile, the logical port for vmk1 should be restored in NSX MP. To confirm, switch to the Manager view in the NSX UI, navigate to Networking->Logical Switches->Ports, and make sure that ports named vmk0@NVDS-Overlay@38d43aeb-xxxx-44b6-bce2-edca8172xxxx and vmk1@NVDS-Overlay@38d43aeb-xxxx-44b6-bce2-edca8172xxxx are present.
8. Skip this step if step 7 was executed. If step 7 was skipped because the vmk_install_migration details were already present in the transport node profile payload, then in the NSX UI navigate to System->Fabric->Nodes->Host Transport Nodes, select the respective ESXi cluster, and reapply the transport node profile that is currently applied to the cluster; this should restore the logical ports for the vmks. To confirm, switch to the Manager view in the NSX UI, navigate to Networking->Logical Switches->Ports, and make sure that ports named vmk0@NVDS-Overlay@38d43aeb-xxxx-44b6-bce2-edca8172xxxx and vmk1@NVDS-Overlay@38d43aeb-xxxx-44b6-bce2-edca8172xxxx are present.
Once the logical ports for the vmks are restored in NSX MP, clear the old precheck ID using the REST API below:
DELETE https://{{nsx_manager_ip}}/api/v1/infra/nvds-urt?action=cleanup
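A minimal curl sketch of the cleanup call, assuming admin basic authentication:
# Clear the stale/failed precheck so that a fresh precheck ID can be generated
curl -k -u 'admin:<password>' -X DELETE "https://{{nsx_manager_ip}}/api/v1/infra/nvds-urt?action=cleanup"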
Then retry generating a fresh precheck ID.
Approach 3:
In the case of an individual transport node (TN): power off or migrate the powered-on VMs from that transport node and follow either of the workarounds below.
The workaround steps from Approach 2 can be applied to an individual transport node using the API "/api/v1/transport-nodes/<transport-node-id>" (a curl sketch is shown at the end of this approach).
OR
Alternatively, update the individual transport node configuration with the vmk install migration details in NSX MP from Actions > "Migrate ESXi VMkernel and Physical Adapters", which triggers TN realization and thereby restores the logical ports for the vmks if they are missing.
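A minimal curl sketch of the per-transport-node variant of Approach 2, assuming admin basic authentication; the tn.json file name is hypothetical:
# Fetch the individual transport node configuration
curl -k -u 'admin:<password>' -X GET "https://{{nsx_manager_ip}}/api/v1/transport-nodes/<transport-node-id>" -o tn.json
# Edit tn.json to add the vmk_install_migration section under the NVDS-Overlay host switch (as in Approach 2, step 7),
# then push the complete object back
curl -k -u 'admin:<password>' -X PUT "https://{{nsx_manager_ip}}/api/v1/transport-nodes/<transport-node-id>" \
  -H "Content-Type: application/json" \
  -d @tn.json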