Upgrade prechecks fail with ESXi Cluster precheck errors: vSAN disk

Article ID: 369423

Products

VMware SDDC Manager

Issue/Introduction

During SDDC Manager upgrade prechecks, the ESXi cluster precheck fails with two errors:

  • vSAN disk component
  • vSAN disk group mode

  • In /var/log/vmware/vcf/operationsmanager/operationsmanager.log, the following is seen:

    2024-04-04T09:56:05.989+0000 DEBUG [vcf_om,7bc8743d140d8,1b3c] [c.v.v.b.p.updaters.PropertyUpdater,pool-2-thread-4] Executing updater method vsanPhysicalDiskComponentHealth of updater VsanPhysicalDiskHealthUpdater, updaterInfo is {"entityType":"cluster","entityName":"Cluster01","propertyName":"vsanPhysicalDiskComponentHealth","isMandatory":true}
    2024-04-04T09:56:05.990+0000 ERROR [vcf_om,7bc8743d160d8,1b3c] [c.v.v.b.p.updaters.PropertyUpdater,pool-2-thread-4] Failed to execute updater method vsanPhysicalDiskComponentHealth on entity CLuster01 of type cluster from vcenter.vmware.com due to an exception {}
    java.lang.reflect.InvocationTargetException: null
    Caused by: java.lang.IllegalStateException: Failed to find group test with id com.vmware.vsan.health.test.componentmetadata in group with id com.vmware.vsan.health.test.physicaldisks

    The precheck fails with severity "WARNING":

    {
      "id": "physical-disk-component-health",
      "constraintExpression": "vsanPhysicalDiskComponentHealth=='green' or vsanPhysicalDiskComponentHealth=='info'",
      "description": "Checks whether vSAN has encountered an integrity issue of the metadata of a component on a disk for this cluster",
      "name": "vSAN disk component",
      "validationCode": "ClusterPerspectiveResourceConstraintsMessages.PHYSICAL_DISK_COMPONENT_HEALTH",
      "validationSucceededMessage": "All vSAN components are healthy",
      "validationFailedMessage": "vSAN has encountered an integrity issue of the metadata of a component on a disk for this cluster",
      "remediationMessage": "This could be due to faulty drives, faulty controller or a misbehaving device driver, but could also originate from a problem in the vSAN software. The best course of action is to engage VMware Support",
      "severity": "WARNING"
    }

A similar exception is logged for disk group mode:

  • In /var/log/vmware/vcf/operationsmanager/operationsmanager.log, the following is seen:

    2024-04-04T09:56:06.029+0000 ERROR [vcf_om,7bc874d40d8,1b3c] [c.v.v.b.p.updaters.PropertyUpdater,pool-2-thread-4] Failed to execute updater method vsanControllerDiskGroupModeVmwareCertifiedStatus on entity CLuster01 of type cluster from vcenter.vmware.com due to an exception {}
    java.lang.reflect.InvocationTargetException: null
    Caused by: java.lang.IllegalStateException: Failed to find group test with id com.vmware.vsan.health.test.controllerdiskmode in group with id com.vmware.vsan.health.test.hcl
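
To confirm that a failed precheck matches this article, the log can be searched for the two health-test IDs quoted in the exceptions above. The following is a minimal sketch, not an official tool: the function name find_missing_vsan_tests is hypothetical, and the log path in the usage comment is the one given in this article.

```shell
#!/bin/sh
# Hypothetical helper: print the log lines that report a missing vSAN health test.
# The two test IDs are taken from the exceptions quoted in this article.
find_missing_vsan_tests() {
    logfile="$1"
    grep -E 'Failed to find group test with id com\.vmware\.vsan\.health\.test\.(componentmetadata|controllerdiskmode)' "$logfile"
}

# Example (run on the SDDC Manager appliance):
# find_missing_vsan_tests /var/log/vmware/vcf/operationsmanager/operationsmanager.log
```

If the function prints nothing, the precheck failure likely has a different cause than the one described here.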

Environment

VMware Cloud Foundation 5.x 

Cause

  • Since support for vSAN ESA was introduced in SDDC Manager, the Operations Manager precheck assessment looks for the hcl.controllerdiskmode and physicaldisks.componentmetadata health tests. When these tests are missing from the vSAN health results, they are reported as errors.

  • vSAN ESA clusters do not have the controllerdiskmode and componentmetadata health tests. These prechecks apply only to vSAN OSA clusters.

Resolution

Ensure that Skyline Health checks are green on the cluster before proceeding with disabling the checks. 

 

Workaround: 

vSAN ESA Clusters: 

  • For vSAN ESA clusters, the controllerdiskmode and componentmetadata health check errors are not valid and can be ignored. These checks are intended for vSAN OSA clusters; running them against a vSAN ESA cluster produces an error that is safe to ignore.

 

vSAN OSA Clusters:

  • On vSAN OSA clusters, if the upgrade fails on the same checks, perform the following steps:

    • Take a snapshot of the SDDC Manager

    • Log in to the SDDC Manager over SSH as the vcf user, then switch to root with su -

    • Change into the following directory: 

      cd /opt/vmware/vcf/lcm/lcm-app/conf/

    • Edit the file: vi application-prod.properties

    • Search for the term: vsan

    • Change all the values related to vsan from true to false

    • For example:

      vsan.healthcheck.enabled=false

      vsan.hcl.update.enabled=false

      vsan.precheck.enabled=false

      vsan.healthcheck.encryption.enabled=false
    • Save the file: press ESC, then type :wq!

    • Restart the LCM service:

      systemctl restart lcm
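
The manual edit above can also be scripted. The following is a minimal sketch under stated assumptions, not a supported procedure: the function name disable_vsan_checks and the backup file naming are hypothetical, and the sketch assumes that "values related to vsan" means uncommented properties whose key contains vsan and whose value is exactly true. It keeps a timestamped copy of the file before changing it.

```shell
#!/bin/sh
# Hypothetical helper: set every uncommented "...vsan...=true" property to false.
# A timestamped backup is written next to the file before it is changed.
disable_vsan_checks() {
    props="$1"
    backup="$props.bak.$(date +%Y%m%d%H%M%S)"
    cp -p "$props" "$backup" || return 1
    # Read from the backup and rewrite the original in place, so the
    # original file keeps its ownership and permissions.
    sed '/^[^#=]*vsan[^=]*=true[[:space:]]*$/ s/=true[[:space:]]*$/=false/' "$backup" > "$props"
}

# Example (as root on the SDDC Manager, after taking a snapshot):
# disable_vsan_checks /opt/vmware/vcf/lcm/lcm-app/conf/application-prod.properties
# grep vsan /opt/vmware/vcf/lcm/lcm-app/conf/application-prod.properties
# systemctl restart lcm
```

Reviewing the grep output before restarting the LCM service shows which properties were changed. The backup copy allows the original values to be restored afterwards.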