Cluster deletion failed with error: Failed to execute delete cluster: VSAN Cluster moid cannot be null or empty

Article ID: 316069

Products

VMware Cloud Foundation

Issue/Introduction

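Deleting a cluster from the SDDC Manager UI fails with the error "Failed to execute delete cluster: VSAN Cluster moid cannot be null or empty".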

Symptoms:
  • When you try to delete a cluster that is in the 'ERROR' state, the Remove Cluster workflow fails at the validation step, and the following error appears in the domain manager log file (/var/log/vmware/vcf/domainmanager/domainmanager.log):
2023-01-17T06:36:58.091+0000 ERROR [vcf_dm,d4f40a595d7f454d,c69f] [c.v.v.c.c.v1.ClusterController,http-nio-127.0.0.1-7200-exec-2] Failed to remove cluster
java.lang.NullPointerException: vSAN Cluster moid cannot be null or empty
at java.base/java.util.Objects.requireNonNull(Objects.java:347)
2023-01-17T06:36:58.095+0000 DEBUG [vcf_dm,d4f40a595d7f454d,c69f] [c.v.e.s.e.h.LocalizableRuntimeExceptionHandler,http-nio-127.0.0.1-7200-exec-2] Processing localizable exception InternalServerError
2023-01-17T06:36:58.096+0000 ERROR [vcf_dm,d4f40a595d7f454d,c69f] [c.v.e.s.e.h.LocalizableRuntimeExceptionHandler,http-nio-127.0.0.1-7200-exec-2] [LDUBNE] PUBLIC_INTERNAL_SERVER_ERROR InternalServerError
.
.

Caused by: java.lang.NullPointerException: vSAN Cluster moid cannot be null or empty
at java.base/java.util.Objects.requireNonNull(Objects.java:347)
at org.apache.commons.lang3.Validate.notBlank(Validate.java:439)
at com.vmware.evo.sddc.common.client.vmware.vsan.VsanManagerBase.getVsanDatastores(VsanManagerBase.java:2784)
at com.vmware.evo.sddc.common.client.vmware.vsan.VsanManagerBase.isHciMeshEnabled(VsanManagerBase.java:2628)
at com.vmware.evo.sddc.common.services.InventoryUtil.isHciMeshEnabled(InventoryUtil.java:167)
  • The cluster exists in the vCenter Server inventory.
  • The hosts of the cluster are in the Not Configured state in NSX-T. To check, open the NSX-T UI, navigate to System > Fabric > Nodes > Host Transport Nodes, select the vCenter Server from the Managed by drop-down, and select the cluster name to see the status of its hosts.
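The log symptom above can be checked from a shell on the SDDC Manager VM. The following is a minimal sketch: the function name count_moid_errors is hypothetical, and the default path assumes the domain manager log lives at /var/log/vmware/vcf/domainmanager/domainmanager.log.

```shell
# Hypothetical helper: count the log lines that carry the error signature
# from this article. Pass a log file path, or omit it to use the assumed
# default location of the domain manager log.
count_moid_errors() {
    grep -c 'vSAN Cluster moid cannot be null or empty' \
        "${1:-/var/log/vmware/vcf/domainmanager/domainmanager.log}"
}

# Example (run on the SDDC Manager VM):
#   count_moid_errors
```

A count greater than zero suggests that the Remove Cluster validation hit this error at least once. Note that grep returns a non-zero exit status when nothing matches.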


Environment

VMware Cloud Foundation 4.5
VMware Cloud Foundation 4.x

Cause

If the create vSphere cluster workflow fails for any reason, the VCF inventory does not have source_id populated for the cluster. When the Remove Cluster workflow runs for the errored cluster, the domain manager (DM) code performs a validation check for HCI Mesh enablement. Because source_id is null in the inventory, this check fails with the NullPointerException shown above, and the user cannot initiate the Remove Cluster workflow.
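One way to confirm this cause is to list every cluster in the inventory whose source_id is missing. The sketch below assumes the platform database, the cluster table, and the name, id, and source_id columns used in the workaround of this article, and that source_id is a text column. The function name is hypothetical.

```shell
# Hypothetical helper: print "name|id" for each cluster whose source_id is
# NULL or empty. Intended to be run on the SDDC Manager VM.
# -A (unaligned) and -t (tuples only) keep the output easy to parse.
list_clusters_missing_source_id() {
    psql -U postgres -h localhost -d platform -At \
        -c "select name, id from cluster where source_id is null or source_id = '';"
}

# Example (run on the SDDC Manager VM):
#   list_clusters_missing_source_id
```

An errored cluster that appears in this output is a likely candidate for the workaround described below.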


Resolution

VMware is aware of this issue, and it will be resolved in a future release.

Workaround:

To work around the issue, follow these steps:

  1. SSH to the SDDC Manager VM.
  2. Fetch the cluster details from the VCF inventory.
  • Connect to the platform database on SDDC Manager:
root@sddc-manager [ /home/vcf ]# psql -U postgres -h localhost 
psql (10.20)
Type "help" for help.

 
postgres=# \c platform;
You are now connected to database "platform" as user "postgres".

 
  • Query the cluster table for the errored cluster:
platform=# select name, id, source_id from cluster where name = 'SDDC-Cluster1';
     name      |                  id                  |  source_id
---------------+--------------------------------------+--------------
 SDDC-Cluster1 | b6b7f940-df92-4d44-8855-f256eb0d3c2c |
(1 row)

The output above shows that source_id is null for this cluster.

  3. In the SDDC Manager UI, open the Clusters tab, where the cluster is shown in the Error state.
  4. Click the Services tab and open the vCenter Server of the affected cluster.
  5. Click the cluster name and get the Domain ID (the cluster's managed object ID) from the URL. In our example, the value is domain-c8.
  6. Copy the managed object ID value from the page you reached in step 5.
  7. Update the VCF inventory with the managed object ID (MOB value) by running the following update command in the platform database:
platform=# update cluster set source_id='<mob-value>' where name = '<Errored cluster name>';



In our example, we update SDDC-Cluster1 with the source_id value, because it was null in the query output from step 2.

 

platform=# update cluster set source_id='domain-c8' where name = 'SDDC-Cluster1';

 

  8. Initiate the Remove Cluster workflow from the VCF UI or API.
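Extracting the managed object ID from the vCenter URL and building the update statement can be sketched with two small shell helpers. This is an illustration under stated assumptions, not a supported tool: both function names are hypothetical, the example URL is made up, and the sketch assumes that a cluster's managed object ID always has the form domain-c followed by digits, as in the domain-c8 example. The second helper only prints the statement so that it can be reviewed before being run in psql.

```shell
# Hypothetical helper: pull the first "domain-c<number>" token out of a
# vSphere Client URL copied from the browser address bar.
extract_cluster_moid() {
    printf '%s\n' "$1" | grep -oE 'domain-c[0-9]+' | head -n 1
}

# Hypothetical helper: print (not run) the update statement from this
# article for a cluster name ($1) and a managed object ID ($2).
build_source_id_update() {
    # Refuse anything that does not look like a cluster managed object ID.
    if ! printf '%s\n' "$2" | grep -Eq '^domain-c[0-9]+$'; then
        echo "unexpected managed object ID: $2" >&2
        return 1
    fi
    # Double any single quotes so the name stays a valid SQL string literal.
    escaped_name=$(printf '%s' "$1" | sed "s/'/''/g")
    printf "update cluster set source_id='%s' where name = '%s';\n" "$2" "$escaped_name"
}

# Example with a made-up URL:
#   moid=$(extract_cluster_moid 'https://vcenter.example.com/ui/app/cluster;nav=h/urn:vmomi:ClusterComputeResource:domain-c8:11111111-2222-3333-4444-555555555555/summary')
#   build_source_id_update 'SDDC-Cluster1' "$moid"
```

Validating the value before building the statement guards against pasting an unrelated part of the URL into the inventory.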