Workload domains with exceptions do not load after upgrade from SDDC 5.2.0 to SDDC 9.0

Article ID: 404273

Products

VMware SDDC Manager

Issue/Introduction

Workload domain details do not load in the SDDC Manager UI. The following error message is displayed:

"Failed to load cluster details. Something went wrong. Please retry or contact the service provider and provide the reference token."

Environment

VMware Cloud Foundation 9

Cause

This issue can occur in either of the following scenarios:

Scenario #1:


- The environment is running VCF 5.2 or earlier.
- The Create Domain workflow is run and fails. The failed domain and its default cluster remain in the SDDC Manager inventory in ERROR state.
- SDDC Manager is upgraded to 9.0.
- After the upgrade, the /v1/clusters API returns a 500 Internal Server Error.
- The /v1/resource-functionalities API also returns a 500 Internal Server Error.
- The SDDC Manager UI -> Workload Domains tab fails to load domain details.

Scenario #2: 


- The environment is running SDDC Manager 9.0.
- The Create Domain workflow is run and fails. The failed domain and its default cluster remain in the SDDC Manager inventory in ERROR state.
- The Domain Manager service is restarted.
- After the restart, the /v1/clusters API returns a 500 Internal Server Error.
- The SDDC Manager UI -> Workload Domains tab fails to load domain details.

Resolution

To resolve this issue, delete the failed domain from SDDC Manager using the APIs.

If the issue matches scenario #1, complete Steps 1, 2, and 3 to restore the /v1/resource-functionalities API, then continue with Steps 4 and 5. If the issue matches scenario #2, go straight to Step 4.

Restore the /v1/resource-functionalities API so that it returns the allowed and blocked functionality:

  1. SSH to SDDC Manager as the vcf user, then elevate to root with su.
  2. Get the cluster details using the Resource Inventory API. The call returns all clusters in the inventory. Note the JSON payload of the failed cluster, which is needed to update the `isImported` value.

    curl "localhost/inventory/query/clusters" 

  3. Take the JSON payload returned for the failed cluster, add the `isImported` property, and set it to false. Repeat this for every cluster in ERROR state that does not already have the property set.

    curl -X PUT "localhost/inventory/clusters/{id}" -H "Content-Type: application/json" -d '{ <existing cluster payload>, "isImported":false}'

Example:

curl -X PUT "localhost/inventory/clusters/b0a24e08-574a-466b-8d38-ccaeb6959de3" -H "Content-Type: application/json" -d '{"id":"b0a24e08-574a-466b-8d38-ccaeb6959de3","domainId":"41809e5a-bcaf-4d1f-83fd-6fa8c3ad4f2b","vcenterId":"81c852d0-48a5-4090-aa4c-1296c5e44e55","isStretched":false,"isDefault":true,"vdsIds":["01912558-abef-4789-bda0-b9d681ad313d"],"status":"ACTIVE","ftt":1,"primaryDatastoreType":"VSAN","primaryDatastoreSourceId":"datastore-13","datacenterSourceId":"datacenter-3","sourceId":"domain-c9","isImageBased":false,"vsanClusterMode":"NONE","isImported":false}'
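
Steps 2 and 3 can also be scripted. The following is a minimal sketch, not a supported tool. It assumes that jq is available on the appliance, that the inventory query returns a JSON array of cluster objects, and that failed clusters carry "status":"ERROR". The function names are hypothetical.

```shell
# Sketch only. Assumptions: jq is installed, the query returns a JSON array,
# and failed clusters are marked with "status":"ERROR".
INVENTORY_URL="localhost/inventory"

# Read the cluster array on stdin. For each ERROR cluster that lacks the
# isImported property, print its payload with "isImported":false added.
patch_error_clusters() {
  jq -c '.[] | select(.status == "ERROR" and (has("isImported") | not)) | . + {isImported: false}'
}

# PUT each patched payload back to the inventory under its own cluster ID.
put_patched_clusters() {
  curl -s "$INVENTORY_URL/query/clusters" | patch_error_clusters |
    while IFS= read -r payload; do
      id=$(printf '%s' "$payload" | jq -r '.id')
      curl -s -X PUT "$INVENTORY_URL/clusters/$id" \
        -H "Content-Type: application/json" -d "$payload"
    done
}
```

To review the payloads before changing anything, run `curl -s localhost/inventory/query/clusters | patch_error_clusters`. Run `put_patched_clusters` only after confirming that the output lists the intended clusters.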

 

Delete the failed domains from SDDC Manager Developer Center > API Explorer:

4. Execute the PATCH /v1/domains API to mark the domain for deletion

PATCH /v1/domains/{id}

Request Body:
{
  "markForDeletion": true
}

5. Execute the domain deletion API to delete the domain

DELETE /v1/domains/{id}
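
If the API Explorer is not convenient, Steps 4 and 5 can be sketched with curl against the public API. This is an illustration under assumptions: SDDC_FQDN, TOKEN, and the function names are placeholders, a bearer token is assumed to have been obtained already (for example from the public API's token endpoint), and the sketch does not wait for the deletion task to finish. DRY_RUN defaults to 1 so that the requests are only printed.

```shell
# Sketch only. SDDC_FQDN and TOKEN are placeholders set by the caller.
# With DRY_RUN=1 (the default) each request is printed instead of sent.
api_call() {
  method=$1
  path=$2
  body=${3:-}
  url="https://${SDDC_FQDN}${path}"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s %s %s\n' "$method" "$url" "$body"
    return 0
  fi
  if [ -n "$body" ]; then
    curl -sS --fail -X "$method" "$url" \
      -H "Authorization: Bearer ${TOKEN}" \
      -H "Content-Type: application/json" -d "$body"
  else
    curl -sS --fail -X "$method" "$url" \
      -H "Authorization: Bearer ${TOKEN}"
  fi
}

# Step 4 (mark for deletion), then Step 5 (delete). The DELETE request is
# sent only if the PATCH request succeeded.
delete_failed_domain() {
  api_call PATCH "/v1/domains/$1" '{"markForDeletion": true}' &&
    api_call DELETE "/v1/domains/$1"
}
```

For example, `SDDC_FQDN=sddc.example.com TOKEN=... delete_failed_domain <domain id>` prints the two requests. Set DRY_RUN=0 to send them.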

After the failed domains are deleted, the /v1/clusters and /v1/resource-functionalities APIs are restored and the SDDC Manager UI displays the domain details.
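
One way to confirm the result is to check the HTTP status of both APIs. The sketch below assumes SDDC_FQDN and TOKEN are placeholders supplied by the caller and that a restored API answers with 200. The function names are hypothetical.

```shell
# Sketch only. SDDC_FQDN and TOKEN are placeholders set by the caller.

# Print the HTTP status code returned for a path on the public API.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "https://${SDDC_FQDN}$1" \
    -H "Authorization: Bearer ${TOKEN}"
}

# Translate a status code into a short verdict.
describe_status() {
  case "$1" in
    200) echo "OK" ;;
    500) echo "still failing" ;;
    *)   echo "unexpected ($1)" ;;
  esac
}

# Check the two APIs named in this article.
verify_apis() {
  for path in /v1/clusters /v1/resource-functionalities; do
    echo "$path: $(describe_status "$(http_status "$path")")"
  done
}
```

Running `verify_apis` after the deletion completes should report OK for both paths if the issue is resolved.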