vSAN objects display as “Reduced Availability” with several components in Absent or Degraded state

Article ID: 413125


Products

VMware vSAN

Issue/Introduction

Symptoms:

  • Navigate to the affected vSAN cluster > Monitor > Virtual Objects. One or more vSAN objects display the status “Reduced Availability”.
  • Select one of the affected objects. Clicking “View Component Placement” shows one or more components in Absent or Degraded state.


  • The output of the following command confirms that one or more objects are in a reduced-availability state. 
    esxcli vsan debug object health summary get
    Health Status                                              Number Of Objects
    ---------------------------------------------------------  -----------------
    remoteAccessible                                                           0
    inaccessible                                                               0
    reduced-availability-with-no-rebuild                                       1
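
    When many objects or clusters are involved, the summary output can also be parsed programmatically. The following is a minimal sketch (not VMware tooling) that assumes the two-column layout shown above; feed it the captured command output:

    ```python
    def parse_health_summary(output: str) -> dict:
        """Parse 'esxcli vsan debug object health summary get' output into
        a {health_status: object_count} mapping, skipping header/ruler rows."""
        counts = {}
        for line in output.splitlines():
            parts = line.rsplit(None, 1)  # split on the last whitespace run
            if len(parts) == 2 and parts[1].isdigit():
                counts[parts[0].strip()] = int(parts[1])
        return counts

    # Example using the output shown above.
    sample = """\
    Health Status                                              Number Of Objects
    ---------------------------------------------------------  -----------------
    remoteAccessible                                                           0
    inaccessible                                                               0
    reduced-availability-with-no-rebuild                                       1
    """
    counts = parse_health_summary(sample)
    print(counts["reduced-availability-with-no-rebuild"])  # 1 affected object
    ```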

  • Use the following command to verify the health of the affected object. The health of the affected object is reported as "reduced-availability-with-no-rebuild".

    Example:
    esxcli vsan debug object list -u <object UUID>
    Object UUID: ########-####-####-####-############
       Version: ##
       Health: reduced-availability-with-no-rebuild
       Owner: #################
       Size: 250.00 GB
       Used: 484.11 GB
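
    When checking many UUIDs, the Health field can be extracted from the per-object output with a small helper. A minimal sketch (not VMware tooling), assuming the "Key: value" layout shown above; the UUID and owner name in the example are hypothetical placeholders:

    ```python
    def object_health(debug_output: str) -> str:
        """Extract the Health field from
        'esxcli vsan debug object list -u <object UUID>' output."""
        for line in debug_output.splitlines():
            key, sep, value = line.strip().partition(":")
            if sep and key.strip() == "Health":
                return value.strip()
        return "unknown"

    # Example using placeholder values in the layout shown above.
    sample = """\
    Object UUID: 00000000-0000-0000-0000-000000000000
       Version: 15
       Health: reduced-availability-with-no-rebuild
       Owner: esx-host-01
    """
    print(object_health(sample))  # reduced-availability-with-no-rebuild
    ```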


Environment

VMware vSAN 6.x
VMware vSAN 7.x
VMware vSAN 8.x

Cause

  • The vSAN object contains concat components that need to be merged. CLOMD attempts to merge these concats, but the operation fails because there is insufficient free space on the capacity disks.
  • As a result, subsequent component rebalancing also fails.

Cause validation:

  • vSAN Skyline Health displays warnings such as:
    • “Storage space”
    • “What if most consumed host fails”


  • The /var/run/log/clomd.log file shows concat merge failures with messages indicating that no suitable disks were found due to space constraints.
    YYYY-MM-DDTHH:MM.SSSZ info clomd[2099048] [Originator@6876 opID=1804312345] CLOMProcessWorkItem: Op CONCAT starts:1804312345
    YYYY-MM-DDTHH:MM.SSSZ info clomd[2099048] [Originator@6876 opID=1804312345] CLOMReconfigure: Reconfiguring ########-####-####-####-############ workItem type CONCAT
    YYYY-MM-DDTHH:MM.SSSZ warning clomd[2099048] [Originator@6876 opID=1804312345] CLOMCreateOneDiskList: Disk dropped due to decomDisks:0 decomNodes:8 maxCompLimit:0 capFull:112 unhealthySSD:0 unhealthyNodes:0 unhealthyDisks:0 ReadCacheResvFailures:0 noMinFreeSpace: 0 Nodes with no disk:0 deltaCreateLimit: 0 numNoMinFreeSpace: 0 numIncompatibleStorageType: 0, numIncompatibleEncryptedDisk: 0 nu
    mIncompatibleDedupDisk: 0, numIncompatibleDiskVersion: 0
    YYYY-MM-DDTHH:MM.SSSZ error clomd[2099048] [Originator@6876 opID=1804312345] CLOMAssignDisksToConfig: Failed to assign disks: Not found
    YYYY-MM-DDTHH:MM.SSSZ error clomd[2099048] [Originator@6876 opID=1804312345] CLOM_ConcatReplace: Failed to assign disks to the new components Not found
    YYYY-MM-DDTHH:MM.SSSZ error clomd[2099048] [Originator@6876 opID=1804312345] CLOMCleanupConcat: Failed to cleanup CONCATs: Not found
    YYYY-MM-DDTHH:MM.SSSZ warning clomd[2099048] [Originator@6876 opID=1804312345] CLOMReconfigure: WorkItem failed ignoring disk capacity/transient capacity/component limit: Not found.
    YYYY-MM-DDTHH:MM.SSSZ info clomd[2099048] [Originator@6876 opID=1804312345] CLOM_PublishResyncBytes: About to fail the workItem(Failure), reset queued resync bytes to 0
    YYYY-MM-DDTHH:MM.SSSZ info clomd[2099048] [Originator@6876 opID=1804312345] CLOMProcessWorkItem: Op ends:1804312345

  • The /var/run/log/clomd.log file also confirms that the component rebalance operations fail for the same reason. 
    YYYY-MM-DDTHH:MM.SSSZ info clomd[2099048] [Originator@6876 opID=1804312346] CLOMProcessWorkItem: Op REACTIVE_REBALANCE starts:1804312346
    YYYY-MM-DDTHH:MM.SSSZ info clomd[2099048] [Originator@6876 opID=1804312346] CLOMReconfigure: Reconfiguring ########-####-####-####-############ workItem type REACTIVE_REBALANCE
    YYYY-MM-DDTHH:MM.SSSZ info clomd[2099048] [Originator@6876 opID=1804312346] CLOMReplCompPreWorkRebalance: Moving component ########-####-####-####-############ for rebalancing from disk ########-####-####-####-############
    YYYY-MM-DDTHH:MM.SSSZ error clomd[2099048] [Originator@6876 opID=1804312346] CLOMBalance_CheckMoveGoodnessV2:  Failed to move comp ########-####-####-####-############:########-####-####-####-############ size: 258525993369 initSD: 0.047563, finSD: 0.126757 srcFF 0.807888 destFF 0.900998
    YYYY-MM-DDTHH:MM.SSSZ warning clomd[2099048] [Originator@6876 opID=1804312346] CLOMBalance_CheckMoveGoodnessV2: Unsetting disk assignment!
    YYYY-MM-DDTHH:MM.SSSZ error clomd[2099048] [Originator@6876 opID=1804312346] CLOMReplaceComponentsWork: Partial fix supported but failed to find any successful fixes
    YYYY-MM-DDTHH:MM.SSSZ error clomd[2099048] [Originator@6876 opID=1804312346] CLOMReplaceComponentsWork: Could not replace components for ########-####-####-####-############: Not found.
    YYYY-MM-DDTHH:MM.SSSZ warning clomd[2099048] [Originator@6876 opID=1804312346] CLOMReconfigure: WorkItem failed ignoring disk capacity/transient capacity/component limit: Not found.
    YYYY-MM-DDTHH:MM.SSSZ error clomd[2099048] [Originator@6876 opID=1804312346] CLOMReconfigure: exit: obj ########-####-####-####-############ transiantCapGenerated - total: 0, site1: 0, site2: 0, workItem type REACTIVE_REBALANCE configDelay 0 newConfigGenerated 0 status Not found
    YYYY-MM-DDTHH:MM.SSSZ info clomd[2099048] [Originator@6876 opID=1804312346] CLOM_PublishResyncBytes: About to fail the workItem(Failure), reset queued resync bytes to 0
    YYYY-MM-DDTHH:MM.SSSZ info clomd[2099048] [Originator@6876 opID=1804312346] CLOMProcessWorkItem: Op ends:1804312346
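
    Both failure signatures can be confirmed at scale by scanning clomd.log with a short script. A minimal sketch (not VMware tooling); the marker strings and the capFull counter (disks dropped for being near capacity) are taken from the excerpts above:

    ```python
    import re

    # Marker strings taken from the clomd.log excerpts above.
    CONCAT_FAIL = "CLOM_ConcatReplace: Failed to assign disks"
    REBALANCE_FAIL = "CLOMReplaceComponentsWork: Could not replace components"
    CAP_FULL = re.compile(r"capFull:(\d+)")

    def scan_clomd(log_text: str) -> dict:
        """Count concat-merge and rebalance failures and record the highest
        capFull value (number of disks dropped for being near capacity)."""
        stats = {"concat_failures": 0, "rebalance_failures": 0, "max_cap_full": 0}
        for line in log_text.splitlines():
            if CONCAT_FAIL in line:
                stats["concat_failures"] += 1
            if REBALANCE_FAIL in line:
                stats["rebalance_failures"] += 1
            m = CAP_FULL.search(line)
            if m:
                stats["max_cap_full"] = max(stats["max_cap_full"], int(m.group(1)))
        return stats

    # Example with two lines from the excerpts above.
    sample = (
        "... CLOMCreateOneDiskList: Disk dropped due to decomDisks:0 capFull:112 ...\n"
        "... CLOM_ConcatReplace: Failed to assign disks to the new components Not found\n"
    )
    print(scan_clomd(sample))
    ```

    A high capFull count together with repeated concat or rebalance failures points at the space-constraint cause described above.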

Resolution

Add capacity to the vSAN cluster by either:

  • Adding new disks to existing disk groups on the hosts,
    or
  • Adding new hosts to the cluster.
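
After capacity is added, CLOMD retries the concat merge and rebalancing on its own; re-run "esxcli vsan debug object health summary get" to confirm recovery. A minimal recovery check over the captured summary output (a sketch, not VMware tooling):

```python
def objects_recovered(summary_output: str) -> bool:
    """Return True when no object remains in a reduced-availability or
    inaccessible state in 'esxcli vsan debug object health summary get'
    output. Note: 'remoteAccessible' does not contain 'inaccessible',
    so the substring check below does not misfire on it."""
    bad_markers = ("reduced-availability", "inaccessible")
    for line in summary_output.splitlines():
        parts = line.rsplit(None, 1)
        if len(parts) == 2 and parts[1].isdigit():
            status, count = parts[0].strip(), int(parts[1])
            if count > 0 and any(marker in status for marker in bad_markers):
                return False
    return True

before = "reduced-availability-with-no-rebuild                   1\n"
after = "reduced-availability-with-no-rebuild                   0\n"
print(objects_recovered(before), objects_recovered(after))  # False True
```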