SDDC Manager upgrade fails at CONFIG DRIFT with error "ADD_EDGE_NSXT_SOURCE_ID_POST_VALIDATE_FAILED"

Article ID: 305973

Updated On:

Products

VMware Cloud Foundation

Issue/Introduction

Symptoms:
  • The Config Drift bundle upgrade fails with the error ADD_EDGE_NSXT_SOURCE_ID_POST_VALIDATE_FAILED. The sddcmanager_migration_app_upgrade.log file shows:
Caused by: com.vmware.evo.sddc.orchestrator.exceptions.OrchTaskException: Failed to post-validate adding NSX-T source IDs to VCF edge inventory for edge cluster <Edge_Cluster_FQDN>
Failed to set source ID for edge cluster <Edge_Cluster_FQDN>, node <Edge_TN_FQDN>. Expected 98ea0ace-0898-4053-8c45-03b37be66c4f but found 6f738f03-4dd9-44c9-84d7-9a9b279dcc52.
  • Reviewing the postgres DB shows the old source ID for the NSX edge nodes:
psql -h localhost -U postgres -d platform
select * from nsxt_edge_cluster where name='<Edge_Cluster_FQDN>';
id               | 6aff6b70-2925-475e-89ef-88593c3c2ece
creation_time    | 1624055551383 
modification_time| 1624055551383 
status           | ACTIVE 
name             | <Edge_Cluster_FQDN> 
nsxt_edge_nodes  | [{"vmManagementIpAddress":"<IP>","vmHostname":"Edge_TN_FQDN","sourceId":"6f738f03-4dd9-44c9-84d7-9a9b279dcc52","id":"28bb2cc0-d66a-44b0-aa6dd0fcd9c773d3"}, {"vmManagementIpAddress":"<IP>","vmHostname":"Edge_TN_FQDN","sourceId":"ac7384a0-07c3-43c3-bc77-14cbdf0e036a","id":"b4203702-9a2f-4643-b817-88a2f8952259"}] 
source_id        | 554ba1f6-e99f-4e5a-89c7-871f4af3e080

  • By contrast, the nsxt table in the postgres DB shows the new IDs for the edge nodes:
"edgeClusterMembersTransportNodeIds": [
 "98ea0ace-0898-4053-8c45-03b37be66c4f",
"33c3bdef-2209-4012-c2b6-f3af3f9c324c"
]
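
The mismatch described in the symptoms can be checked mechanically. The following is a minimal sketch, not part of any VMware tooling: it assumes you have copied the JSON text of the nsxt_edge_nodes column and the list of transport node IDs by hand, and the function name find_stale_nodes and the hostnames are hypothetical.

```python
import json


def find_stale_nodes(nsxt_edge_nodes_column, nsx_transport_node_ids):
    """Return (vmHostname, sourceId) pairs whose sourceId NSX does not know.

    nsxt_edge_nodes_column: JSON text stored in the nsxt_edge_nodes column.
    nsx_transport_node_ids: transport node IDs reported on the NSX side.
    """
    known = set(nsx_transport_node_ids)
    return [
        (node["vmHostname"], node["sourceId"])
        for node in json.loads(nsxt_edge_nodes_column)
        if node["sourceId"] not in known
    ]


# Sample data shaped like the output above; the hostnames are placeholders.
sddc_column = json.dumps([
    {"vmHostname": "edge01.example.local",
     "sourceId": "6f738f03-4dd9-44c9-84d7-9a9b279dcc52"},
    {"vmHostname": "edge02.example.local",
     "sourceId": "ac7384a0-07c3-43c3-bc77-14cbdf0e036a"},
])
nsx_ids = [
    "98ea0ace-0898-4053-8c45-03b37be66c4f",
    "33c3bdef-2209-4012-c2b6-f3af3f9c324c",
]

for hostname, source_id in find_stale_nodes(sddc_column, nsx_ids):
    print(f"{hostname}: SDDC inventory still holds {source_id}")
```

Any node reported here holds an ID that NSX no longer recognises, which is the condition the post-validation step rejects.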

 


Environment

VMware Cloud Foundation 4.x

Cause

  • This issue is commonly seen in VCF environments where edge nodes were redeployed from NSX Manager for resizing, e.g. changing an edge from medium to large.
  • When an edge node is redeployed directly from NSX Manager, NSX Manager assigns it a new ID. This ID differs from the one stored in the SDDC Manager DB, which causes the SDDC CONFIG DRIFT upgrade to fail.
  • VCF environments should instead follow KB https://kb.vmware.com/s/article/87542 when resizing edge nodes.

Resolution

  1. Take a snapshot of the SDDC Manager VM.
  2. Log in to the SDDC Manager as the vcf user, then switch to root.
  3. Run the following command:
curl localhost/inventory/nsxt-edgeclusters | json_pp > res.json
  4. In 'res.json', find the affected edge cluster and replace the 'edgeNodeNsxtId' value of each redeployed edge node with the new NSX-T ID shown in the NSX-T UI or in the error message.
  5. Create a new file 'req.json' containing only that edge cluster's object, with the updated 'edgeNodeNsxtId' values. Remove the enclosing [ ]: the PUT request body takes a single object, not an array.

Example req.json (the "<---" annotations are explanatory only and must not appear in the actual file):

{
  "clusterIds": [
    "ba24f64e-71fb-41df-9678-e46765af0144"
  ],
  "nsxtClusterId": "2f3d2746-7044-4742-8a73-244a15cde569",
  "id": "6aff6b70-2925-475e-89ef-88593c3c2ece", <---------------------- Edge cluster ID in SDDC DB, needed for the PUT API call
  "edgeClusterNsxtId": "554ba1f6-e99f-4e5a-89c7-871f4af3e080",
  "name": "<Edge_Cluster_FQDN>",
  "status": "ACTIVE",
  "nsxtEdgeNodes": [
    {
      "id": "28bb2cc0-d66a-44b0-aa6dd0fcd9c773d3",
      "hostName": "<Edge_TN_FQDN>",
      "managementIpAddress": "<IP>",
      "edgeNodeNsxtId": "98ea0ace-0898-4053-8c45-03b37be66c4f" <---------------------- New edge transport node ID
    },
    {
      "id": "b4203702-9a2f-4643-b817-88a2f8952259",
      "hostName": "<Edge_TN_FQDN>",
      "managementIpAddress": "<IP>",
      "edgeNodeNsxtId": "33c3bdef-2209-4012-a2c6-f3ab3f9c324e" <---------------------- New edge transport node ID
    }
  ]
}
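
Editing the file by hand is error-prone, so the extraction of the single cluster object could also be scripted. The sketch below is an illustration only. It assumes 'res.json' is a JSON array of edge cluster objects with the field names shown in the example req.json; the helper names build_put_payload and write_req_json, and the hostname-to-ID mapping, are hypothetical.

```python
import copy
import json


def build_put_payload(clusters, cluster_name, new_node_ids):
    """Return one edge cluster object with updated edgeNodeNsxtId values.

    clusters: parsed content of res.json (assumed to be a list of objects).
    cluster_name: value of the "name" field of the affected edge cluster.
    new_node_ids: mapping of edge node hostName -> new NSX transport node ID.
    """
    matches = [c for c in clusters if c.get("name") == cluster_name]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one cluster named {cluster_name!r}, "
            f"found {len(matches)}"
        )
    payload = copy.deepcopy(matches[0])  # leave the parsed input untouched
    for node in payload["nsxtEdgeNodes"]:
        new_id = new_node_ids.get(node["hostName"])
        if new_id is not None:
            node["edgeNodeNsxtId"] = new_id
    return payload  # a single object, not an array


def write_req_json(res_path, req_path, cluster_name, new_node_ids):
    """Read res.json, build the payload, and write it to req.json."""
    with open(res_path) as src:
        clusters = json.load(src)
    payload = build_put_payload(clusters, cluster_name, new_node_ids)
    with open(req_path, "w") as dst:
        json.dump(payload, dst, indent=2)
    return payload
```

A call might look like write_req_json("res.json", "req.json", "<Edge_Cluster_FQDN>", {"<Edge_TN_FQDN>": "98ea0ace-0898-4053-8c45-03b37be66c4f"}), with one mapping entry per redeployed node. Review the resulting file before using it.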

  

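  6. Run the following command, replacing {id} with the edge cluster ID from the SDDC DB (the "id" value in req.json):
curl localhost/inventory/nsxt-edgeclusters/{id} -v -X PUT -H 'content-type:application/json' -d @req.json
  7. Retry the upgrade.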
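
Before sending the PUT request, it may help to confirm that the payload is a single object and to derive the URL from its "id" field instead of typing it. This is a minimal sketch under the same assumptions as the steps above; build_put_command is a hypothetical helper that only assembles the curl command from the article and does not execute it.

```python
import shlex


def build_put_command(payload, req_path="req.json"):
    """Return the curl argument list for the PUT call described above.

    payload: parsed content of req.json.
    req_path: path of the file passed to curl with -d @file.
    """
    if not isinstance(payload, dict):
        raise ValueError("req.json must hold a single JSON object, not an array")
    cluster_id = payload.get("id")
    if not cluster_id:
        raise ValueError("req.json has no 'id' field for the edge cluster")
    return [
        "curl",
        f"localhost/inventory/nsxt-edgeclusters/{cluster_id}",
        "-v",
        "-X", "PUT",
        "-H", "content-type:application/json",
        "-d", f"@{req_path}",
    ]


# Example with the cluster ID used in this article.
command = build_put_command({"id": "6aff6b70-2925-475e-89ef-88593c3c2ece"})
print(shlex.join(command))
```

The printed line can be pasted into the root shell on the SDDC Manager. The check rejects a payload that is still wrapped in [ ].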