Symptoms:
# cat /var/log/vmware/wcp/wcpsvc.log
...
2023-06-08T17:31:12.064Z error wcp [licensemonitor/license_event_monitor.go:251] [opID=licenseRefreshMonitor] Supervisor control plane failed: No connectivity to API Master: connectivity , config status REMOVING
2023-06-08T17:31:12.064Z error wcp [common/k8sdeploymentutil.go:38] [opID=#########] Unable to get deployment status of vmware-system-netop/vmware-system-netop-controller-manager. Err: Resource Type ClusterComputeResource, Identifier domain-c8 is not found.
2023-06-08T17:31:12.064Z debug wcp [kubelifecycle/eam_monitor.go:99] [opID=######-#######-####-####-####-##########] Supervisor ######-###-####-####-########## has eam issues [[{178002 *types.Issue {vcenter.wcp.eam.issue.clusterVmNotDeployed Master EAM Agent with identifier ######-###-####-####-########## could not deployed. See ESX Agent Manager logs for more details.
...
# cat /var/log/vmware/eam/eam.log
...
2023-06-08T16:29:54.395Z | WARN | cluster-agent-4 | VcEventManager.java | 422 | Failed to post agent status changed from yellow to red because the agent is not fully initialized
2023-06-07T14:07:13.871Z | ERROR | cluster-agent-4 | AuditedJob.java | 106 | JOB FAILED: [#325591899] DeployVmJob(ClusterAgent(ID: 'Agent:2a325e44-fff-4sd9-rc63-37543d53eyt4:null'))
java.lang.IllegalStateException: Duplicate key VirtualMachine:vm-######
at java.util.stream.Collectors.lambda$throwingMerger$0(Collectors.java:133) ~[?:1.8.0_345]
at java.util.HashMap.merge(HashMap.java:1255) ~[?:1.8.0_345]
at java.util.stream.Collectors.lambda$toMap$58(Collectors.java:1320) ~[?:1.8.0_345]
at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169) ~[?:1.8.0_345]
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175) ~[?:1.8.0_345]
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384) ~[?:1.8.0_345]
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) ~[?:1.8.0_345]
...
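The two log signatures above can be checked quickly with a small shell sketch. The function names are hypothetical and not part of any VMware tooling. The log path is passed as an argument, so the functions can also be run against a copy of the logs from a support bundle.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the given EAM log contains the
# duplicate-key exception shown in the eam.log excerpt above.
has_eam_duplicate_key() {
    log_file=$1
    grep -Fq 'java.lang.IllegalStateException: Duplicate key VirtualMachine:' "$log_file"
}

# Hypothetical helper: succeeds (exit 0) if the given wcpsvc log reports the
# Supervisor with config status REMOVING, as in the wcpsvc.log excerpt above.
has_wcp_removing_status() {
    log_file=$1
    grep -Fq 'config status REMOVING' "$log_file"
}
```

On the vCenter Server Appliance, a call such as `has_eam_duplicate_key /var/log/vmware/eam/eam.log && echo "signature found"` would report whether the exception is present. Both functions only search for the literal strings quoted in this article, so a match suggests this issue but does not prove it.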
The EAM issue may also be visible in the vSphere Client under:
Menu -> Administration -> Server Extensions -> vSphere ESX Agent Manager -> Configure
Environment:
VMware vSphere 7.0 with Tanzu
Cause:
The vCLS VMs cause the EAM service to malfunction, so the Supervisor Cluster removal cannot complete.
Placing the vSphere Cluster in "Retreat Mode" removes the vCLS VMs, after which the deletion proceeds successfully.
Workaround:
IMPORTANT NOTE: The following workaround affects DRS and HA functionality on the vSphere Cluster. Do not proceed until the customer confirms that it is acceptable to continue. More details can be found in this KB.
To work around the issue, place the vSphere Cluster in "Retreat Mode" by following the steps below:
1. Identify the cluster domain ID:
# dcli +i com vmware vcenter namespacemanagement software clusters list
|-----------|-----------------|-----------------------------------------------|
|cluster    |cluster_name     |desired_version                                |
|-----------|-----------------|-----------------------------------------------|
|domain-c8  |                 |v1.23.12+vmware.wcp.1-vsc0.0.22-21450060       |
|-----------|-----------------|-----------------------------------------------|
2. Log in to the vSphere Client and navigate to the cluster on which vCLS must be deactivated.
3. Navigate to the vCenter Server Configure tab. Under Advanced Settings, click the Edit Settings button.
4. Add the following entry and set the value to "False":
config.vcls.clusters.domain-c(number).enabled
## NOTE: Use the domain ID gathered in Step 1. For the example output in Step 1, the entry is config.vcls.clusters.domain-c8.enabled
5. Restart the EAM service:
root@vcenter_lab [ ~ ]# service-control --restart eam
Successfully restarted service eam
6. Verify that all the vCLS VMs are no longer present in the inventory.
7. Exit Retreat Mode by setting the entry added in step 4 back to "True". Finally, restart the EAM service again.
Until this workaround is applied, the Supervisor Cluster remains stuck in the "Removing" state.
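Steps 1 and 4 of the workaround can be tied together with a short shell sketch that turns the table printed in step 1 into the advanced-setting key to add in step 4. The function name `vcls_retreat_keys` is hypothetical, and the parsing assumes the pipe-delimited table layout shown in step 1. It only prints the key names and changes nothing on the vCenter Server.

```shell
#!/bin/sh
# Hypothetical helper: reads the table from step 1 on stdin and prints one
# advanced-setting key per cluster row.
vcls_retreat_keys() {
    awk -F'|' '
        # Only rows whose first column is "domain-c<number>" are data rows;
        # header and separator rows are skipped.
        $2 ~ /^[ \t]*domain-c[0-9]+[ \t]*$/ {
            id = $2
            gsub(/[ \t]/, "", id)   # strip column padding
            printf "config.vcls.clusters.%s.enabled\n", id
        }
    '
}
```

For example, if the output of step 1 was saved to a file (here a hypothetical `clusters.txt`), `vcls_retreat_keys < clusters.txt` prints `config.vcls.clusters.domain-c8.enabled` for the sample table in this article. The value itself ("False" to enter Retreat Mode, "True" to exit) is still set manually in the vSphere Client as described in steps 3, 4 and 7.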