YYYY-MM-DDTHH:SS:MS INFO [vcf_dm,] [c.v.e.s.c.s.a.t.TaskAggregatorAdapterImpl,http-nio-x.x.x.x-exec-2] Registering the task {"creationTime":yyyyyyyyyyyyy,"taskId":"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx12c","taskModel":"FSM","taskRetry":{"errorCodes":[404,500,501],"method":"PATCH","successCode":202,"url":"http://localhost/domainmanager/workflows/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx12c"},"taskType":"NSXT_EDGECLUSTER_CREATION","taskURL":"http://localhost/domainmanager/workflows/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx12c"}
YYYY-MM-DDTHH:SS:MS DEBUG [vcf_dm,] [c.v.v.n.h.NsxtEdgeClusterValidationUtil,dm-exec-7] Network pool overlaps: None
YYYY-MM-DDTHH:SS:MS DEBUG [vcf_dm,] [c.v.e.s.c.util.HostValidationUtil,dm-exec-7] Trying to ping to x.x.x.x
DEBUG [vcf_dm,] [c.v.e.s.c.util.HostValidationUtil,dm-exec-7] Verify ping connectivity to x.x.x.x with command ping x.x.x.x.x -c
DEBUG [vcf_dm,] [c.v.e.s.c.util.LocalProcessService,dm-exec-7] Executing the Local command: ping x.x.x.x -c
DEBUG [vcf_dm,0000000000000000,0000] [c.v.v.s.c.s.SecurityConfigurationServiceImpl,pool-1-thread-1] Security config retrieved {"certificateValidationEnabled":false,"fipsMode":false}
DEBUG [vcf_dm,0000000000000000,0000] [c.v.v.secure.config.LazyTrustManager,pool-1-thread-1] Check if cert validation is enabled false2
DEBUG [vcf_dm,] [c.v.v.n.h.NsxtEdgeClusterValidationUtil,pool-3-thread-41] PING x.x.x.x 56(84) bytes of data.
YYYY-MM-DDTHH:SS:MS DEBUG [vcf_dm,] [c.v.v.n.h.NsxtEdgeClusterValidationUtil,pool-3-thread-41]
YYYY-MM-DDTHH:SS:MS DEBUG [vcf_dm,] [c.v.v.n.h.NsxtEdgeClusterValidationUtil,pool-3-thread-41] --- x.x.x.x ping statistics ---
YYYY-MM-DDTHH:SS:MS DEBUG [vcf_dm,] [c.v.v.n.h.NsxtEdgeClusterValidationUtil,pool-3-thread-41] 5 packets transmitted, 0 received, 100% packet loss, time 1000ms
DEBUG [vcf_dm,] [c.v.v.n.h.NsxtEdgeClusterValidationUtil,pool-3-thread-41]
ERROR [vcf_dm,] [c.v.e.s.c.util.LocalProcessService,dm-exec-7] Local Command Failed with exit value 1.
Output Logs :
LocalProcess Output: YYYY-MM-DDTHH:SS:MS - PING x.x.x.x 56(84) bytes of data.
LocalProcess Output:
LocalProcess Output: --- x.x.x.x ping statistics ---
LocalProcess Output: YYYY-MM-DDTHH:SS:MS - 5 packets transmitted, 0 received, 100% packet loss, time 1000ms.
YYYY-MM-DDTHH:SS:MS DEBUG [vcf_dm,] [c.v.v.c.n.s.c.c.ApiConnection,dm-exec-18] Closed ApiClient connection.
YYYY-MM-DDTHH:SS:MS ERROR [vcf_dm,] [c.v.v.c.f.p.n.a.CreateNsxtEdgeNodeVmAction,dm-exec-ab] Edge node creation failed, node state is pending, VM deployment state is VM_DEPLOYMENT_FAILED
YYYY-MM-DDTHH:SS:MS DEBUG [vcf_dm,] [c.v.v.c.n.s.c.c.ApiConnection,dm-exec-18] Closed ApiClient connection.
YYYY-MM-DDTHH:SS:MS ERROR [vcf_dm,] [c.v.e.s.o.model.error.ErrorFactory,dm-exec-18] DEPLOY_NSXT_EDGE_FAILED Failed to deploy NSX-T Edge xxxx on <nsx manager>: Failed to deploy NSX-T Edge xxxx on <nsx manager>.
at com.vmware.vcf.common.fsm.plugins.nsxt.action.CreateNsxtEdgeNodeVmAction.execute(CreateNsxtEdgeNodeVmAction.java:438)
at com.vmware.vcf.common.fsm.plugins.nsxt.action.CreateNsxtEdgeNodeVmAction.execute(CreateNsxtEdgeNodeVmAction.java:59)
at com.vmware.evo.sddc.orchestrator.platform.action.FsmActionState.invoke(FsmActionState.java:62)
at com.vmware.evo.sddc.orchestrator.platform.action.FsmActionPlugin.invoke(FsmActionPlugin.java:159)
Caused by: java.lang.IllegalArgumentException: Edge node slopvmanedgeaf302 creation failed, node state is pending, VM deployment state is VM_DEPLOYMENT_FAILED.
YYYY-MM-DDTHH:SS:MS DEBUG [vcf_dm,] [c.v.v.c.n.s.c.c.NsxtManagerTransportNodeOperations,dm-exec-16] Error occurred while trying to get transport node xyz02.com: Unable to find transport node with name xyz02.com
YYYY-MM-DDTHH:SS:MS DEBUG [vcf_dm,] [c.v.v.c.f.p.n.a.CreateNsxtEdgeNodeVmAction,dm-exec-16] Getting state for edge node xyz02.com
YYYY-MM-DDTHH:SS:MS DEBUG [vcf_dm,] [c.v.v.c.n.s.c.c.ApiConnection,dm-exec-16] Closed ApiClient connection.
YYYY-MM-DDTHH:SS:MS DEBUG [vcf_dm,] [c.v.v.c.f.p.n.h.NsxtCommonOperations,dm-exec-16] Timeout waiting for Edge node xyz02.com to be deleted
YYYY-MM-DDTHH:SS:MS DEBUG [vcf_dm,] [c.v.v.c.f.p.n.a.CreateNsxtEdgeNodeVmAction,dm-exec-16] Edge node xyz02.com still exists
YYYY-MM-DDTHH:SS:MS DEBUG [vcf_dm,] [c.v.v.c.n.s.c.c.ApiConnection,dm-exec-16] Closed ApiClient connection.
YYYY-MM-DDTHH:SS:MS ERROR [vcf_dm,] [c.v.e.s.o.model.error.ErrorFactory,dm-exec-16] DEPLOY_NSXT_EDGE_UNDO_FAILED Failed to undo NSX-T Edge nsx manager
com.vmware.evo.sddc.orchestrator.exceptions.OrchTaskException: Failed to undo NSX-T Edge deployment on nsxmanager
at com.vmware.vcf.common.fsm.plugins.nsxt.action.CreateNsxtEdgeNodeVmAction.lambda$undo$2(CreateNsxtEdgeNodeVmAction.java:528)
DEBUG [vcf_dm,] [c.v.v.c.f.p.n.h.NsxtCommonOperations,dm-exec-18] Finished waiting for Edge node xyz to become ready, currentState is {"details":[{"failureMessage":"Waiting for edge node to be ready.","state":"pending","subSystemId":"","subSystemType":"Host","__dynamicStructureFields":{"fields":{},"name":"struct"}}],"state":"pending","maintenanceModeState":"DISABLED","nodeDeploymentState":{"failureCode":16020,"failureMessage":"Ovf deploy for vm xyz02 failed on vc: Host did not have any virtual network defined.","state":"VM_DEPLOYMENT_FAILED","__dynamicStructureFields":{"fields":{},"name":"struct"}},"transportNodeId":"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx12c","__dynamicStructureFields":{"fields":{},"name":"struct"}}
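The underlying failure reason is buried in the nodeDeploymentState object of the "currentState is" log entry above. As a minimal sketch, assuming the log line has been copied out of the domainmanager log as a single string, the relevant fields could be pulled out as follows. The function name and the marker constant are illustrative only and are not part of any VMware tooling.

```python
import json

# Text that precedes the JSON payload in the log entry (taken from the log above).
MARKER = "currentState is "


def extract_deployment_failure(log_line):
    """Return (state, failure_code, failure_message) from the log line, or None.

    Hypothetical helper: it only assumes the JSON object follows MARKER.
    """
    start = log_line.find(MARKER)
    if start == -1:
        return None
    payload = log_line[start + len(MARKER):].lstrip()
    try:
        # raw_decode parses the first JSON value and ignores any trailing text.
        state, _end = json.JSONDecoder().raw_decode(payload)
    except json.JSONDecodeError:
        return None
    if not isinstance(state, dict):
        return None
    deployment = state.get("nodeDeploymentState") or {}
    return (
        deployment.get("state"),
        deployment.get("failureCode"),
        deployment.get("failureMessage"),
    )
```

For the entry shown above, this would surface failure code 16020 together with the "Host did not have any virtual network defined" message, which is the symptom this article describes.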
curl -k -s -u 'admin' -X GET https://<nsxmanager_vip_ip>/upgrade-coordinator/api/v1/upgrade/history
or
curl -k -s -u 'admin' -X GET https://<nsxmanager_vip_ip>/api/v1/upgrade/summary
Sample output:
{
"initial_version": "3.1.3.7.0.19380457",
"target_version": "3.2.1.2.0.20541212",
"timestamp": 1683276202494,
"upgrade_status": "STARTED" <----!!!!
}
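To check the returned status without reading the JSON by eye, a small sketch like the one below could be used. It assumes the response body has been saved and contains an object shaped like the sample output above; the function name is hypothetical, and only the "STARTED" value seen in this article is treated as an upgrade in progress.

```python
import json


def upgrade_started(entry):
    """Return True when the entry reports upgrade_status "STARTED".

    Hypothetical helper; `entry` is a dict shaped like the sample output.
    """
    return entry.get("upgrade_status") == "STARTED"


# Sample values copied from the output shown in this article.
sample = json.loads(
    """
    {
      "initial_version": "3.1.3.7.0.19380457",
      "target_version": "3.2.1.2.0.20541212",
      "timestamp": 1683276202494,
      "upgrade_status": "STARTED"
    }
    """
)

if __name__ == "__main__":
    print(upgrade_started(sample))  # True
```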
This appears to be a niche case in which the NSX-T target version is 3.2.1.2.0 or higher and the source NSX-T version is 3.1.3.7.0.
The 3.2.1.2 Edge OVF defines one more network than the NSX-T Manager (still reporting "node_version": "3.1.3.7.0.19380482") is aware of.
While NSX-T is in this "upgrading" state, it expects the OVF used to deploy the edge node(s) to include 5 NICs.
Because NSX-T is still effectively at version 3.1.3.7.0, the OVF used to deploy the edge node has a network configuration for only 4 NICs.
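The version condition described above can be expressed as a short check. This is only a sketch under the assumptions stated in this article (source 3.1.3.7.0, target 3.2.1.2.0 or higher); the function and constant names are illustrative, and the trailing build number is ignored when comparing.

```python
# Version bounds taken from the description in this article.
AFFECTED_SOURCE = (3, 1, 3, 7, 0)
MIN_AFFECTED_TARGET = (3, 2, 1, 2, 0)


def version_tuple(version, parts=5):
    """Convert "3.1.3.7.0.19380457" to (3, 1, 3, 7, 0), dropping the build number."""
    fields = [int(p) for p in version.split(".")[:parts]]
    # Pad short versions such as "3.2.1.2" with zeros so tuples compare cleanly.
    fields += [0] * (parts - len(fields))
    return tuple(fields)


def matches_reported_case(initial_version, target_version):
    """Return True when the version pair matches the case described here."""
    return (
        version_tuple(initial_version) == AFFECTED_SOURCE
        and version_tuple(target_version) >= MIN_AFFECTED_TARGET
    )
```

Passing the initial_version and target_version values from the upgrade history output to matches_reported_case indicates whether the environment falls into the combination discussed here.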
Workaround: