Cyclic vMotions triggered on a cluster without any resource contention

Article ID: 382949


Products

VMware vCenter Server, VMware vCenter Server 8.0, VMware vSAN 8.x

Issue/Introduction

  • Multiple cyclic vMotions are triggered for a specific VM, which migrates back and forth between the same two hosts in the cluster.
  • The events for the VM show repeated migrations similar to the following:

10/28/2024, 1:56:34 AM   Migrate <VM A> from <Host A> to <Host B>
10/28/2024, 1:48:34 AM   Migrate <VM A> from <Host B> to <Host A>
...
10/28/2024, 12:52:39 AM  Migrate <VM A> from <Host A> to <Host B>
10/28/2024, 12:44:37 AM  Migrate <VM A> from <Host B> to <Host A>
10/28/2024, 12:36:41 AM  Migrate <VM A> from <Host A> to <Host B>
10/28/2024, 12:28:48 AM  Migrate <VM A> from <Host B> to <Host A>
10/28/2024, 12:20:54 AM  Migrate <VM A> from <Host A> to <Host B>
10/28/2024, 12:12:54 AM  Migrate <VM A> from <Host B> to <Host A>
10/28/2024, 12:04:57 AM  Migrate <VM A> from <Host A> to <Host B>
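
The back-and-forth pattern in the events above can be spotted programmatically. The following is a minimal Python sketch, assuming the events have been copied out of the UI as plain text in the format shown above. The function names (`parse_events`, `count_ping_pongs`) and the `SAMPLE` text are illustrative only and are not part of any VMware tool.

```python
import re
from datetime import datetime

# Matches lines such as:
# 10/28/2024, 12:52:39 AM  Migrate <VM A> from <Host A> to <Host B>
EVENT_RE = re.compile(
    r"^(?P<ts>\d{1,2}/\d{1,2}/\d{4}, \d{1,2}:\d{2}:\d{2} [AP]M)\s+"
    r"Migrate (?P<vm>.+?) from (?P<src>.+?) to (?P<dst>.+)$"
)

def parse_events(text):
    """Return (timestamp, vm, source, destination) tuples, oldest first."""
    events = []
    for line in text.splitlines():
        match = EVENT_RE.match(line.strip())
        if not match:
            continue  # skip blank lines and anything that is not a migration
        ts = datetime.strptime(match["ts"], "%m/%d/%Y, %I:%M:%S %p")
        events.append((ts, match["vm"], match["src"], match["dst"]))
    return sorted(events)

def count_ping_pongs(events):
    """Count, per VM, migrations that exactly reverse the previous one."""
    last_move = {}
    counts = {}
    for _ts, vm, src, dst in events:
        if last_move.get(vm) == (dst, src):
            counts[vm] = counts.get(vm, 0) + 1
        last_move[vm] = (src, dst)
    return counts

# Sample copied from the contiguous events listed in this article.
SAMPLE = """
10/28/2024, 12:52:39 AM  Migrate <VM A> from <Host A> to <Host B>
10/28/2024, 12:44:37 AM  Migrate <VM A> from <Host B> to <Host A>
10/28/2024, 12:36:41 AM  Migrate <VM A> from <Host A> to <Host B>
10/28/2024, 12:28:48 AM  Migrate <VM A> from <Host B> to <Host A>
10/28/2024, 12:20:54 AM  Migrate <VM A> from <Host A> to <Host B>
10/28/2024, 12:12:54 AM  Migrate <VM A> from <Host B> to <Host A>
10/28/2024, 12:04:57 AM  Migrate <VM A> from <Host A> to <Host B>
"""

if __name__ == "__main__":
    print(count_ping_pongs(parse_events(SAMPLE)))
```

A VM with a high reversal count over a short window, on a cluster with no resource contention, matches the symptom described in this article.
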

  • /var/log/vmware/vsan-health/vsanvcmgmtd-xxx.log contains entries similar to the following:

YYYY-MM-DDTHH:MM:SS info vsanvcmgmtd[08418] [vSAN@6876 sub=AdapterServer opID=WorkQueue-50d62f09-d0b8] Invoking 'queryClusterDrsStats' on 'vsan-cluster-config-system' session '52a34f5c-7891-2aa5-2a77-c1d77d6da5ad' active 1/1
YYYY-MM-DDTHH:MM:SS info vsanvcmgmtd[08286] [vSAN@6876 sub=vmomi.soapStub[1620] opID=WorkQueue-50d62f09-d0b8] SOAP request returned HTTP failure; <<io_obj p:0x00007f9ed049bfa8, h:62, <UNIX ''>, <UNIX '/var/run/envoy-hgw/hgw-pipe'>>, /hgw/host-37663/vsan>, method: queryHostDrsStats; code: 500(Internal Server Error); fault: (vmodl.fault.SystemError) {
-->    faultCause = (vmodl.MethodFault) null, 
-->    faultMessage = <unset>, 
-->    reason = "TypeError("'NoneType' object is not iterable")"
-->    msg = "Received SOAP response fault from [<<io_obj p:0x00007f9ed049bfa8, h:62, <UNIX ''>, <UNIX '/var/run/envoy-hgw/hgw-pipe'>>, /hgw/host-37663/vsan>]: queryHostDrsStats
--> 'NoneType' object is not iterable"
--> }
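
To check a log for the failure signature above, a small script can group each log entry with its "-->" continuation lines and look for the `queryHostDrsStats` fault. This is a minimal Python sketch, not a supported tool: the helper names (`iter_entries`, `failing_hosts`) are hypothetical, and the `SAMPLE` text is a shortened copy of the excerpt above.

```python
import re
from collections import Counter

HOST_RE = re.compile(r"/hgw/(host-\d+)/vsan")
SIGNATURE = "'NoneType' object is not iterable"

def iter_entries(lines):
    """Group each log line with the '-->' continuation lines that follow it."""
    entry = []
    for line in lines:
        if line.startswith("-->") or not entry:
            entry.append(line)
        else:
            yield "\n".join(entry)
            entry = [line]
    if entry:
        yield "\n".join(entry)

def failing_hosts(lines):
    """Count queryHostDrsStats faults with the NoneType error, per host ID."""
    hits = Counter()
    for entry in iter_entries(lines):
        if "queryHostDrsStats" in entry and SIGNATURE in entry:
            match = HOST_RE.search(entry)
            hits[match.group(1) if match else "unknown"] += 1
    return hits

# Shortened from the log excerpt in this article.
SAMPLE = """\
YYYY-MM-DDTHH:MM:SS info vsanvcmgmtd[08418] [vSAN@6876 sub=AdapterServer] Invoking 'queryClusterDrsStats' on 'vsan-cluster-config-system'
YYYY-MM-DDTHH:MM:SS info vsanvcmgmtd[08286] [vSAN@6876 sub=vmomi.soapStub[1620]] SOAP request returned HTTP failure; /hgw/host-37663/vsan>, method: queryHostDrsStats; code: 500(Internal Server Error); fault: (vmodl.fault.SystemError) {
-->    reason = "TypeError("'NoneType' object is not iterable")"
--> }
"""

if __name__ == "__main__":
    # With a real log, pass the lines of the vsanvcmgmtd log file instead.
    print(failing_hosts(SAMPLE.splitlines()))
```

The host IDs reported (for example `host-37663`) are vCenter managed object IDs and can be matched against the hosts involved in the cyclic migrations.
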

Cause

The vSAN DRS stats query (queryHostDrsStats) sent to the ESXi host can hit a null pointer exception. When this happens, DRS on vCenter excludes the target host even though the host has no actual issue.

Resolution

Broadcom is working towards a permanent fix for this issue.

To work around the issue, perform the following steps:

  • Log in to the vCenter Server UI.
  • Select the impacted cluster in the left pane.
  • Click Configure in the right pane and select vSphere DRS.
  • Click Edit in the right corner and navigate to Advanced Options.
  • Add the following parameter:

Option: StrictReadLocalityCheck

Value: false

  • Click OK.
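
For environments where the UI steps above need to be repeated across clusters, the same advanced option can in principle be set through the vSphere API. The following is an untested sketch that assumes the third-party pyVmomi library is installed and that `cluster` is a `vim.ClusterComputeResource` object obtained from an existing, authenticated session. The helper names are hypothetical, and this approach is not part of the documented workaround, so validate it in a test environment first.

```python
# Option name and value are taken from the workaround steps in this article.
OPTION_KEY = "StrictReadLocalityCheck"
OPTION_VALUE = "false"

def drs_option_pairs():
    """Return the DRS advanced option(s) to set, as (key, value) pairs."""
    return [(OPTION_KEY, OPTION_VALUE)]

def apply_workaround(cluster):
    """Set the DRS advanced option on a vim.ClusterComputeResource.

    Assumption: `cluster` comes from an existing pyVmomi session.
    Returns the vCenter task object for the reconfiguration.
    """
    from pyVmomi import vim  # third-party; imported only when actually used

    options = [
        vim.option.OptionValue(key=key, value=value)
        for key, value in drs_option_pairs()
    ]
    spec = vim.cluster.ConfigSpecEx(
        drsConfig=vim.cluster.DrsConfigInfo(option=options)
    )
    # modify=True requests an incremental change to the cluster configuration
    # instead of a full replacement of it.
    return cluster.ReconfigureComputeResource_Task(spec, True)
```

After the task completes, the option should appear under Advanced Options in the cluster's vSphere DRS settings, which is a quick way to confirm the change matches the UI procedure.
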