ESXi host can PSOD if memory reservation for a VM is changed from 100% to less



Article ID: 317851


Updated On:

Products

VMware vSAN

Issue/Introduction

This article explains how to fix the objects identified by the vSAN Health alarm in order to avoid a potential PSOD on the host(s). The alarm displayed on the vSAN cluster is called "Potential PSOD issue is detected due to improper object flag leak for some of vSAN objects".

Symptoms:

If a host running ESXi 7.0 U3f or earlier tries to resync a component that has the (1<<27) flag, it will PSOD. If there is potential for this issue to occur, an alarm with the message "Potential PSOD issue is detected due to improper object flag leak for some of vSAN objects" is displayed on the vSAN cluster.

Conditions For 0 Byte Object with (1<<27) Flag

  • VM must be provisioned with a full memory reservation

  • When the VM is powered on, the swap object is created with a 0 byte address space, but its components do not have the (1<<27) flag

  • The swap object goes through a configuration change due to one of the following conditions:

    • During VM power on, enough hosts are unavailable that the swap object gets force provisioned, and later enough hosts are added that it is reconfigured to the target policy. This may be common in a stretched cluster if connectivity to the witness is lost or one of the sites is down, and connectivity is later restored.

    • The swap object goes through a policy change due to scale-up or scale-down

    • A new policy is applied to the virtual machine

  • After one of the reconfigurations above, the 0 byte swap object will now have the (1<<27) flag

  • The memory reservation is removed or reduced for the virtual machine while it is powered on, and the virtual machine is then migrated with vMotion, or another such event occurs on it, causing the swap object to be resized from 0 bytes to non-zero bytes

  • If the host running the virtual machine is overcommitted on memory, this virtual machine may swap out some of its memory to the swap object

  • The component needs to be resynchronized due to a state resync, another policy change, or a similar operation
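The notation (1<<27) used throughout this list is a bitmask with only bit 27 set. The following is a minimal sketch of how such a bit can be tested on an integer flags value. The names `LEAKED_FLAG` and `has_leaked_flag` are hypothetical and are used for illustration only; this article identifies the flag only by its bit position, and the sketch is not the logic of the health check or of the attached script.

```python
# Hypothetical name: the article refers to the flag only as (1<<27).
LEAKED_FLAG = 1 << 27  # decimal 134217728, hexadecimal 0x8000000


def has_leaked_flag(component_flags: int) -> bool:
    """Return True if bit 27 is set in the given integer flags value."""
    return (component_flags & LEAKED_FLAG) != 0


if __name__ == "__main__":
    print(hex(LEAKED_FLAG))                        # 0x8000000
    print(has_leaked_flag(0))                      # False
    print(has_leaked_flag(LEAKED_FLAG | 0b11))     # True
```

Other bits in the value do not affect the result, because the bitwise AND isolates bit 27 before the comparison.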


Environment

VMware vSAN 7.0.x

Resolution

This issue has been addressed in ESXi 7.0 U3g. However, if the alarm is present in the environment, the workaround should be applied before updating ESXi hosts or performing any maintenance that could start a resync and cause hosts to fail.

Workaround:
1. Download the attached script fixESACompFlag.py
2. Upload the script to /tmp on one of the hosts in the cluster
3. Run the script on one of the hosts with the following command: ./fixESACompFlag.py

Sample output:
 
[root@hostname:~] ./fixESACompFlag.py
Fixed object 32e12c63-f302-07c7-3863-0200cc722300
All objects fixed

[root@hostname:~] ./fixESACompFlag.py
No object fixes necessary.
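
If the script output is captured to a file or variable for record keeping, it can be summarized with a short helper. The following is a minimal sketch that assumes the output lines look exactly like the sample output above; the function name `summarize_output` and the dictionary keys are hypothetical, and the real script may print other lines that this sketch ignores.

```python
import re

# Matches lines such as "Fixed object 32e12c63-f302-07c7-3863-0200cc722300"
# (format assumed from the sample output in this article).
FIXED_LINE = re.compile(r"^Fixed object ([0-9a-fA-F-]+)$")


def summarize_output(output: str) -> dict:
    """Collect fixed object UUIDs and the final status from captured output."""
    fixed = []
    for line in output.splitlines():
        match = FIXED_LINE.match(line.strip())
        if match:
            fixed.append(match.group(1))
    return {
        "fixed_objects": fixed,
        "all_fixed": "All objects fixed" in output,
        "nothing_to_fix": "No object fixes necessary." in output,
    }


if __name__ == "__main__":
    sample = (
        "Fixed object 32e12c63-f302-07c7-3863-0200cc722300\n"
        "All objects fixed\n"
    )
    print(summarize_output(sample))
```

A second run that reports "No object fixes necessary." yields an empty `fixed_objects` list with `nothing_to_fix` set to True.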

Note: A health alarm has been developed to warn users whenever a problematic object is detected.




Attachments

fixESACompFlag