vSphere Cluster Services (vCLS) Known Issues/Corner Cases
Article ID: 316510
Products
VMware vCenter Server
Issue/Introduction
The list below includes known issues and corner cases to be aware of; it is not exhaustive, and new issues or corner cases will be added periodically.
Environment
VMware vCenter Server 7.0.x
VMware vCenter Server 8.0.x
Resolution
Terminology
Going forward, the original version of vCLS will be referred to as External vCLS.
In vSphere 8.0 Update 3, VMware has released a newer revision of the feature known as Embedded vCLS, which will be used when both vCenter and ESXi support it.
Unsupported Operations on vCLS VMs
Attempting to perform unsupported operations on vCLS VMs, such as configuring FT, DRS rules, or HA overrides, or cloning or moving a vCLS VM under a resource pool or vApp, can impact the health of vCLS for that cluster and result in DRS becoming non-functional.
Do not update the hardware version of vCLS VMs. They are kept at HW version 11.
Supported Operations on vCLS VMs
Supported operations on vCLS VMs include attaching tags and custom attributes to these VMs.
External vCLS VMs additionally support migration to different hosts or datastores.
Embedded vCLS VMs do not support migration.
The following operations do not depend on vCLS health and can be performed independently of vCLS VM deployment (a DRS configuration sketch follows the list):
Resource pool creation
DRS configuration (such as automation level, overrides, etc.)
Addition/editing of VM/Host rules
vSphere with Tanzu supervisor cluster configuration
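For example, DRS settings can be reconfigured through the vSphere API at any time, regardless of vCLS health. The sketch below assumes pyVmomi and an already retrieved vim.ClusterComputeResource object named cluster from an authenticated session; the function name and the automation level are illustrative only.

```python
# Minimal sketch: change the DRS automation level with pyVmomi.
# DRS configuration does not depend on vCLS health, so this can run even
# while vCLS VMs are still being deployed.
from pyVim.task import WaitForTask
from pyVmomi import vim

def set_drs_automation(cluster, level="fullyAutomated"):
    """Reconfigure DRS on an existing vim.ClusterComputeResource."""
    spec = vim.cluster.ConfigSpecEx(
        drsConfig=vim.cluster.DrsConfigInfo(
            enabled=True,
            defaultVmBehavior=level,  # manual | partiallyAutomated | fullyAutomated
        )
    )
    # modify=True merges this change into the existing cluster configuration
    WaitForTask(cluster.ReconfigureComputeResource_Task(spec, modify=True))
```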
Deployment and Management of vCLS VMs
All vCLS VMs for a datacenter inside a vCenter Server are stored in a specific folder named vCLS. Users should not rename or delete this folder. Renaming or deleting the folder could result in the failure to create new vCLS VMs for clusters, impacting the health of vCLS.
vCLS VMs are deployed as soon as the first host is added to a new cluster. Test scripts that validate empty clusters should be updated accordingly, and these vCLS VMs should be excluded from cluster capacity checks.
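A test or capacity script can exclude vCLS VMs with a simple filter. The sketch below assumes pyVmomi and identifies vCLS VMs heuristically by their name prefix or by being EAM-managed agent VMs; adjust the filter to match your environment.

```python
# Minimal sketch: list a cluster's VMs while excluding vCLS agent VMs,
# e.g. for empty-cluster validation or capacity checks in a test script.

def non_vcls_vms(cluster):
    """Return VMs on the cluster's hosts, excluding vCLS VMs (heuristic filter)."""
    workload = []
    for host in cluster.host:
        for vm in host.vm:
            managed_by = vm.config.managedBy if vm.config else None
            # EAM-managed agent VMs (such as vCLS) typically carry an eam extension key
            is_eam_agent = bool(managed_by and "eam" in (managed_by.extensionKey or ""))
            if vm.name.startswith("vCLS") or is_eam_agent:
                continue  # skip vCLS agent VMs
            workload.append(vm)
    return workload
```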
If shared storage is not configured when External vCLS VMs are deployed, they may be placed on local datastores and should be manually migrated to shared storage once it is configured, to ensure HA protection. These VMs will not be automatically migrated to the newly configured shared storage.
Embedded vCLS VMs do not use or need datastores and do not participate in failovers, so the above does not apply.
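For External vCLS VMs, the manual migration to shared storage can also be done through the API. The sketch below is a minimal Storage vMotion example assuming pyVmomi, with vcls_vm and shared_datastore already looked up from the inventory; it applies to External vCLS only.

```python
# Minimal sketch: Storage vMotion an External vCLS VM from a local datastore
# to shared storage once shared storage has been configured.
from pyVim.task import WaitForTask
from pyVmomi import vim

def move_vcls_to_shared_storage(vcls_vm, shared_datastore):
    """vcls_vm is a vim.VirtualMachine; shared_datastore is a vim.Datastore."""
    spec = vim.vm.RelocateSpec(datastore=shared_datastore)
    WaitForTask(vcls_vm.RelocateVM_Task(spec))
```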
The CPU/Memory consumption of vCLS VMs is not displayed in the VM summary page inside vSphere Client.
When downsizing clusters (reducing the number of hosts), there might be cases where more vCLS VMs are running than necessary. In this situation, some or all of these vCLS VMs could reside on the same host.
Handling Orphaned vCLS VMs and Cluster Changes
When a VMDK is removed or corrupted on vCLS VMs, they become orphaned and will not be recreated automatically; manual deletion is required to prompt new VM creation.
Orphaned vCLS VMs might appear as workload VMs in hosts and clusters navigation, as EAM will not delete them as part of cleanup when a host containing an orphaned VM is added to the cluster.
Resolution: Manually unregister these VMs from the vCenter Server inventory.
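A minimal sketch of that cleanup with pyVmomi is shown below; it assumes content = si.RetrieveContent() from an authenticated ServiceInstance and identifies orphaned vCLS VMs by name prefix and connection state.

```python
# Minimal sketch: unregister orphaned vCLS VMs from the vCenter Server inventory.
from pyVmomi import vim

def unregister_orphaned_vcls(content):
    """content is the ServiceContent of an authenticated pyVmomi session."""
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    try:
        for vm in view.view:
            if vm.name.startswith("vCLS") and vm.runtime.connectionState == "orphaned":
                vm.UnregisterVM()  # removes the VM from inventory; files on disk are untouched
    finally:
        view.DestroyView()
```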
Deleting a cluster, or removing a host without placing it into Maintenance Mode, might leave the vCLS VMs behind.
To avoid conflicts with new vCLS VMs when re-adding hosts with orphaned vCLS VMs, it is recommended to add the hosts using one of the following methods:
Add the host to the vCenter inventory as a standalone host and then move the host into the cluster (see the sketch after this list).
Power off all the VMs running on the host and then add the hosts.
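A minimal sketch of the first method with pyVmomi is shown below; the datacenter, cluster, hostname, credentials, and thumbprint are all placeholders that must match your environment.

```python
# Minimal sketch: connect a host as standalone first, then move it into the cluster.
from pyVim.task import WaitForTask
from pyVmomi import vim

def add_standalone_then_move(datacenter, cluster, hostname, user, password, thumbprint):
    connect_spec = vim.host.ConnectSpec(
        hostName=hostname,
        userName=user,
        password=password,
        sslThumbprint=thumbprint,  # alternatively force=True to bypass thumbprint pinning
    )
    # Step 1: add the host under the datacenter's host folder as a standalone host
    task = datacenter.hostFolder.AddStandaloneHost_Task(
        spec=connect_spec, addConnected=True)
    WaitForTask(task)
    host = task.info.result.host[0]  # HostSystem inside the new standalone ComputeResource
    # Step 2: move the connected host into the target cluster
    WaitForTask(cluster.MoveInto_Task(host=[host]))
```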
When DPM is configured on the cluster, a host cannot be placed into standby mode while a vCLS VM is running on it, even when no workload VMs are running on that host.
If a vCLS VM is inaccessible and cannot be removed from inventory (greyed out), connect to the host where the vCLS VM is located and unregister it from there to remove it from the inventory.
Impact on DRS and HA Operations
In a DRS-enabled cluster, operations that invoke the DRS algorithm will fail if attempted before the first vCLS VM for the cluster is powered on.
The following operations may fail if DRS is non-functional:
A new workload VM placement/power-on
Host selection for a VM migrated from another cluster/host within the vCenter Server
A migrated VM could be powered on on a host not selected by DRS
Placing a host into Maintenance Mode might get stuck if the host has any powered-on VMs
Invocation of DRS APIs such as ClusterComputeResource.placeVm() and ClusterComputeResource.enterMaintenanceMode() will result in an InvalidState fault
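When calling these APIs from automation, the fault can be handled explicitly. The sketch below assumes pyVmomi; the placement type and function name are illustrative only.

```python
# Minimal sketch: guard a ClusterComputeResource.PlaceVm() call against the
# InvalidState fault raised while DRS is non-functional.
from pyVmomi import vim

def place_vm_safely(cluster, vm):
    spec = vim.cluster.PlacementSpec(vm=vm, placementType="relocate")
    try:
        return cluster.PlaceVm(spec)  # PlacementResult with DRS recommendations
    except vim.fault.InvalidState:
        # DRS placement is unavailable until vCLS for the cluster is healthy again
        return None
```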
vCLS VMs cannot be evacuated when a host enters quarantine mode triggered by Proactive HA.
In an HA-enabled cluster configured with slot policy admission control, fewer workload VMs could be powered on in some cases.
Technical Requirements and Considerations
vCLS VMs cannot run on a host where VT-x is disabled. vCLS VMs require VT-x (or AMD-V) to be enabled, along with nested page table support, due to the deprecation of software MMU starting with vSphere 6.7.
ESXi 6.5 hosts with AMD Opteron Generation 3 (Greyhound) processors cannot join Enhanced vMotion Compatibility (EVC) AMD REV E or AMD REV F clusters on a vCenter Server 7.0 Update 1 system. The CPU baseline used for the ESX agent virtual machines on AMD processors includes the POPCNT and SSE4A instructions, which prevents ESXi 6.5 hosts with these processors from enabling EVC mode AMD REV E or AMD REV F on a vCenter Server 7.0 Update 1 system.
Using the esxcli command to put a host into Maintenance Mode while vCLS VMs are running on it may cause the task to get stuck.
Workaround: Power off the vCLS VMs after running the esxcli command by logging in to the vSphere Client or the ESXi Host Client, or through esxcli in a new session.
The power-off will succeed because the host will be in the Entering Maintenance Mode state, and any new power-on operation on a host in that state will fail.
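A minimal pyVmomi sketch of the power-off step is shown below; host is an already retrieved vim.HostSystem object for the host that is entering Maintenance Mode.

```python
# Minimal sketch: power off vCLS VMs still running on a host that is stuck
# entering Maintenance Mode after an esxcli-initiated request.
from pyVim.task import WaitForTask

def power_off_vcls_on_host(host):
    """host is a vim.HostSystem; powers off any vCLS VMs still running on it."""
    for vm in host.vm:
        if vm.name.startswith("vCLS") and vm.runtime.powerState == "poweredOn":
            WaitForTask(vm.PowerOffVM_Task())  # allowed while the host is entering Maintenance Mode
```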
In deployments using VM-based licensing, such as vSphere for ROBO licensing, vCLS VMs appear in the licensing UI but are not counted toward the licensed VMs.
Special Considerations
In the event of a host failure in an HA-enabled cluster, if the failed host has vCLS VMs, HA powers these VMs on on a different host, provided shared storage is configured for the cluster. In certain cases, ESX Agent Manager might also try to power on these VMs, resulting in some failed tasks, but the VMs will be powered on to maintain the vCLS health status. These task failures can be ignored.
Downgrading vCenter Server to an older version that does not support vCLS requires manual cleanup by deleting the vCLS VMs. For External vCLS VMs, this can be done from the inventory.
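A minimal pyVmomi sketch of that cleanup is shown below; it assumes content = si.RetrieveContent() from an authenticated session and identifies External vCLS VMs by their name prefix, so review the matched VMs before deleting anything.

```python
# Minimal sketch: power off and delete External vCLS VMs after a downgrade to a
# vCenter Server version that does not support vCLS.
from pyVim.task import WaitForTask
from pyVmomi import vim

def delete_external_vcls_vms(content):
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    try:
        for vm in list(view.view):
            if not vm.name.startswith("vCLS"):
                continue
            if vm.runtime.powerState == "poweredOn":
                WaitForTask(vm.PowerOffVM_Task())
            WaitForTask(vm.Destroy_Task())  # deletes the VM from inventory and disk
    finally:
        view.DestroyView()
```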