VM snapshot task fails in a vSAN cluster because the vSAN clomd service has crashed

Article ID: 439531

Products

VMware vSAN

Issue/Introduction

  • Taking a snapshot of a VM fails with the following message:

    “A CDSM is not attached. This could indicate the clomd daemon is not running. An error occurred while decomposing the file.”

  • The vSAN Skyline Health check shows the "CLOMD" status as Unhealthy.

  • The clomd service does not stay running.
    The command "/etc/init.d/clomd status" reports that the service is not running.

Environment

VMware vSAN 8.0.x

Cause

  • A vSAN object has an object version higher than the on-disk format version on one node in the cluster.

  • The clomd logs show the following messages:

    clomd[2121408]: [Originator@6876 opID=1804289383] CLOMProcessWorkItem: Op VOTES_REBALANCE starts:1804289383
    clomd[2121408]: [Originator@6876 opID=1804289383] CLOMReconfigure: Reconfiguring ########-####-####-#####-############  workItem type VOTES_REBALANCE
    clomd[2121408]: [Originator@6876 opID=1804289383] CLOMSetQuorumVotes: Counted votes good:2, absent:0, bad:0; upperFDs:2, minLowerFDs:1, nTotalReplicas:2, nUpperReplicas:2, nLowerReplicas:1
    clomd[2121408]: [Originator@6876 opID=1804289383] PANIC: CLOMSanityCheckNewConfig: Object version regressed !!
    clomd[2121408]: [Originator@6876 opID=1804289383] Backtrace:
    ....
    clomd[2121408]: [Originator@6876 opID=1804289383] Failed to dump core: Failure.
    clomd[2121408]: [Originator@6876 opID=1804289383] Msg_Post: Error
    clomd[2121408]: [Originator@6876 opID=1804289383] [msg.log.error.unrecoverable] vSAN Cluster level Object Manager unrecoverable error: (host-####)
    clomd[2121408]: [Originator@6876 opID=1804289383] CLOMSanityCheckNewConfig: Object version regressed !!
    clomd[2121408]: [Originator@6876 opID=1804289383] [msg.panic.requestSupport.withoutLog] You can request support.
    clomd[2121408]: [Originator@6876 opID=1804289383] [msg.panic.requestSupport.vmSupport.vmx86]
    clomd[2121408]: [Originator@6876 opID=1804289383] To collect data to submit to VMware technical support, run "vm-support".
    clomd[2121408]: [Originator@6876 opID=1804289383] [msg.panic.response] We will respond on the basis of your support entitlement.
    clomd[2121408]: [Originator@6876 opID=1804289383] ----------------------------------------
    clomd[2121408]: [Originator@6876 opID=1804289383] Exiting

  • Because of a failing disk on one node, the on-disk format upgrade did not complete on that node, so its disks remain at the older format version.
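
To confirm that a host matches this cause, the clomd log can be searched for the panic line shown in the excerpt above. The following Python is a minimal sketch, not a supported tool: the names find_version_regression_panics, scan_file, and PanicHit are hypothetical, the line pattern is inferred only from the excerpt in this article, and other builds may format the messages differently.

```python
import re
from typing import Iterable, List, NamedTuple

# Signature copied from the log excerpt in this article (assumption: other
# builds may word the message differently).
PANIC_SIGNATURE = "PANIC: CLOMSanityCheckNewConfig: Object version regressed"

# Matches "clomd[<pid>]: [Originator@... opID=<id>] <message>".
LINE_RE = re.compile(
    r"clomd\[(?P<pid>\d+)\]: \[[^\]]*opID=(?P<op_id>\w+)\] (?P<message>.*)"
)


class PanicHit(NamedTuple):
    """One clomd panic line (hypothetical helper type)."""
    pid: int
    op_id: str
    message: str


def find_version_regression_panics(lines: Iterable[str]) -> List[PanicHit]:
    """Return every line whose message starts with the panic signature."""
    hits = []
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("message").startswith(PANIC_SIGNATURE):
            hits.append(
                PanicHit(
                    pid=int(match.group("pid")),
                    op_id=match.group("op_id"),
                    message=match.group("message").rstrip(),
                )
            )
    return hits


def scan_file(path: str) -> List[PanicHit]:
    """Scan a log file; the caller supplies the path to a copy of the clomd log."""
    with open(path, errors="replace") as handle:
        return find_version_regression_panics(handle)
```

Calling scan_file with the path to a copy of the clomd log returns one entry per panic. An empty result means only that this exact signature was not found in that file, not that the host is healthy.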

 

Resolution

  • Contact Broadcom Support for assistance.