Power on or vMotion of any VM in vSAN fails with the error "Module MonitorLoop power on failed"

Article ID: 374211

Products

VMware vSAN

Issue/Introduction

  • Powering on or performing a vMotion of any VM on vSAN fails with the error "Module MonitorLoop power on failed".
  • The vCenter Web Client shows the following error message when powering on a VM:
    Module MonitorLoop power on failed. Failed to start the virtual machine. Failed to power on VM. Could not power on virtual machine: Failure. Failed to create swap file
    '/vmfs/volumes/vsan:5207xxxxxxxxxxxxxxxxxx/dd267xxxxxxxxxxxxxxx/New Virtual Machine-xxxxxxx.vswp' : Failure

  • The vCenter Web Client shows the following error message when performing a vMotion of a VM:
    A general system error occurred: Launch failure 2024-06-20T13:06:52.764072Z The VM failed to resume on the destination during early power on. Module MonitorLoop power on failed.

Environment

vSAN (All Versions)

Cause

This is caused by a rare issue in which the DOM Component Manager runs out of usable memory over time.

Resolution

This issue is fixed in 8.0 U3b.


Workaround:
1) To identify the host with the issue, open an SSH session to each host in the cluster and run the following command:

    # vsish -e get /vmkModules/vsanutil/slabs/dom-CompServer-objSlab/stats | grep "Current allocations" | cut -f2 -d":"

If "Current allocations" exceeds 8000, the host is affected by the issue.

2) Get the UUID of the impacted host by running the following command on that host:

    # cmmds-tool whoami

3) Exclude the impacted host from CLOM placement by running the following command on all hosts in the cluster, using the host UUID from Step 2:

    # /usr/lib/vmware/vsan/bin/clom-tool set-global-exclusion-list --exclusion-list=<host UUID>

4) Place the impacted host into maintenance mode with the Ensure Accessibility option, then reboot the host.

5) Run the command from Step 1 again on the rebooted host to confirm that "Current allocations" is now below 8000.

6) Restart the CLOMD service on all hosts that were not rebooted, which reverts the CLOM exclusion list to its default, by running:

    # /etc/init.d/clomd restart
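
The threshold check from Steps 1 and 5 can be scripted so that the comparison against 8000 is not done by eye. The following is a minimal sketch, assuming the vsish command from Step 1 prints a single numeric value after the colon. The function name check_allocations and the THRESHOLD variable are illustrative only and are not part of any VMware tool.

```shell
#!/bin/sh
# Illustrative helper (not a VMware tool): compares a "Current allocations"
# value against the 8000 threshold given in Step 1.
THRESHOLD=8000

check_allocations() {
    # Remove spaces or tabs that may surround the value after the cut.
    value=$(printf '%s' "$1" | tr -d ' \t')
    case "$value" in
        ''|*[!0-9]*)
            # Empty or non-numeric input: the output format was not as assumed.
            echo "UNKNOWN"
            return 0
            ;;
    esac
    if [ "$value" -gt "$THRESHOLD" ]; then
        echo "AFFECTED"
    else
        echo "OK"
    fi
}

# Run the Step 1 command only where vsish is available (that is, on an ESXi host).
if command -v vsish >/dev/null 2>&1; then
    current=$(vsish -e get /vmkModules/vsanutil/slabs/dom-CompServer-objSlab/stats \
        | grep "Current allocations" | cut -f2 -d":")
    echo "$(hostname): $(check_allocations "$current")"
fi
```

Run the script on each host in the cluster. A host that reports AFFECTED is the one to carry through Steps 2 to 4. A result of UNKNOWN means the command output did not match the format assumed here and should be inspected manually.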