Multi-writer VMDKs on certain versions of vSAN could experience high IO latency and IO timeouts during maintenance mode operation

Article ID: 378198

Products

VMware vSAN

Issue/Introduction

On a vSAN HCI setup, putting a host in maintenance mode typically causes its VMs to migrate to another host, which in turn causes the backend in-memory storage objects (owner objects) for those VMs to migrate off that host.

For VMs with multi-writer VMDKs (e.g., Oracle RAC), the VMs migrate to another host, but the owner objects remain on the host in maintenance mode until that host is rebooted.

As a result, the VM, now running on another host, continues to issue IO to the owner object residing on the host in maintenance mode. If an NSX-T VIB upgrade is initiated in this scenario, it can cause network disruption or packet loss, which in turn can cause IO latency or timeouts.



Environment

VMware vSAN versions earlier than 8.0U2

Resolution

This issue is fixed in ESXi version 8.0U2.

This requires both vCenter and ESXi to be upgraded to 8.0U2 or above.

In version 8.0U2 and above, the owner objects of multi-writer VMDKs rebalance automatically and are moved off the host in Maintenance Mode. It may take two minutes for all objects to migrate. After putting the host in Maintenance Mode, wait a few minutes and ensure vSAN health is green before performing any maintenance tasks on the host.


If you cannot upgrade to 8.0U2 right away, use the workaround below.

Workaround:

As part of the maintenance steps, manually abdicate all the DOM owners on the host by running the following command after putting the host into Maintenance Mode. Then wait a few minutes and ensure vSAN health is green before performing any maintenance tasks on the host.


vsish -e set /vmkModules/vsan/dom/ownerAbdicateAll
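
The workaround could be wrapped in a small script along the following lines. This is only a sketch, not an official procedure: the function names, the `DRY_RUN` and `WAIT_SECONDS` variables, and the 180-second default are illustrative assumptions, and the maintenance-mode check assumes that `esxcli system maintenanceMode get` prints `Enabled` when the host is in Maintenance Mode. The script does not verify vSAN health; that check still has to be done separately.

```shell
#!/bin/sh
# Sketch only: intended for the ESXi host that has been put into Maintenance Mode.
# DRY_RUN=1 (the default here) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"
# "A few minutes" from the article; the exact value is an assumption.
WAIT_SECONDS="${WAIT_SECONDS:-180}"

# Execute a command, or only print it when DRY_RUN is 1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

abdicate_dom_owners() {
    # Outside dry-run, refuse to proceed unless the host reports Maintenance Mode.
    if [ "$DRY_RUN" != "1" ]; then
        state="$(esxcli system maintenanceMode get)"
        if [ "$state" != "Enabled" ]; then
            echo "host is not in maintenance mode (state: $state)" >&2
            return 1
        fi
    fi

    # Command taken verbatim from the workaround above.
    run vsish -e set /vmkModules/vsan/dom/ownerAbdicateAll

    # Give the owner objects time to move before any maintenance starts.
    run sleep "$WAIT_SECONDS"
    echo "Confirm vSAN health is green before performing maintenance tasks."
}
```

After sourcing the script, calling `abdicate_dom_owners` prints what would be executed; setting `DRY_RUN=0` on the host runs the commands for real.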