vSAN Skyline Health reports error: "Object Health"

Article ID: 391607

Products

VMware vSAN

Issue/Introduction

Symptoms:

  • vSAN Skyline Health reports the error "Object Health".
  • Click the Troubleshoot option for the Object Health error to check the number of inaccessible or reduced-availability objects.

Validation Step: 

  • Navigate to Cluster > Monitor > vSAN > Virtual Objects to verify the inaccessible/reduced availability objects.

Environment

VMware vSAN 6.x
VMware vSAN 7.x
VMware vSAN 8.x

Cause

vSAN objects are inaccessible because the durability component is still in a resync state, even though no vSAN resync is running (or the resync has already completed).

Cause validation:

  • Navigate to Cluster > Monitor > vSAN > Virtual Objects. Select the checkbox for the inaccessible object and click View Placement Details. The component status will be displayed as "Resynchronizing."

  • Open an SSH session to a vSAN host and run the command "esxcli vsan debug resync summary get" to verify that no resynchronization is currently in progress.

     Total Number Of Resyncing Objects: 0
     Total Bytes Left To Resync: 0
     Total GB Left To Resync: 0.00
  • Run the command "esxcli vsan debug object list" and check the inaccessible object to confirm that the RAID_D (Durability) component is present.

    Owner: Host-1
       Size: 0.00 GB
       Used: 106.71 GB
       Used 4K Blocks: 105.71 GB
       Policy:
          stripeWidth: 1
          cacheReservation: 0
          proportionalCapacity: 0
          hostFailuresToTolerate: 1
          forceProvisioning: 0
          spbmProfileId: ########-####-####-####-########
          spbmProfileGenerationNumber: 0
          objectVersion: 20
          CSN: 11417
          SCSN: 11417
          spbmProfileName: vSAN Default Storage Policy

       Configuration:

          RAID_1
             RAID_D
                Component: ########-####-####-####-########
                  Component State: ABSENT,  CSN: STALE (11405!=11417),  Address Space(B): 107374182400 (100.00GB),  Disk UUID: ########-####-####-####-########,  Disk Name: naa.############:2
                  Votes: 1,  Capacity Used(B): 57860423680 (53.89GB),  Physical Capacity Used(B): 57285804032 (53.35GB),  Total 4K Blocks Used(B): 56745713664 (52.85GB),  Host Name: Host-1
                Component: ########-####-####-####-########
                  Component State: ACTIVE,  Address Space(B): 107374182400 (100.00GB),  Disk UUID: ########-####-####-####-########,  Disk Name: naa.############:2
                  Votes: 1,  Capacity Used(B): 12582912 (0.01GB),  Physical Capacity Used(B): 8388608 (0.01GB),  Total 4K Blocks Used(B): 5070848 (0.00GB),  Host Name: Host-2
             Component: ########-####-####-####-########
               Component State: ABSENT,  CSN: STALE (11414!=11417),  Address Space(B): 107374182400 (100.00GB),  Disk UUID: ########-####-####-####-########,  Disk Name: naa.############:2
               Votes: 1,  Capacity Used(B): 57860423680 (53.89GB),  Physical Capacity Used(B): 57285804032 (53.35GB),  Total 4K Blocks Used(B): 56745713664 (52.85GB),  Host Name: Host-3
          Witness: ########-####-####-####-########
            Component State: ACTIVE,  Address Space(B): 0 (0.00GB),  Disk UUID: ########-####-####-####-########,  Disk Name: naa.############:2
            Votes: 1,  Capacity Used(B): 12582912 (0.01GB),  Physical Capacity Used(B): 4194304 (0.00GB),  Total 4K Blocks Used(B): 4194304 (0.00GB),  Host Name: Host-4

       Type: vdisk
       Path: /vmfs/volumes/vsan:########-####-####-####-########/########-####-####-####-########/VM.vmdk (Exists)
       Group UUID: ########-####-####-####-########
       Directory Name: None

  • Resync does not progress for the impacted objects because they have durability components and were marked as pending resync owing to delta overlap, so they must wait for a delta resync.

  • Verify the configuration of the object. RAID_D (durability) components should not be present unless there is a valid reason, such as hosts being in maintenance mode or failing. In a healthy environment where all hosts are online and not in maintenance mode, RAID_D components should not be introduced. Durability components are typically created when an ESXi host enters maintenance mode: a new durability component is generated for the components residing on that host. This ensures that all new VM I/O operations are written to both the existing and the durability components.
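
The checks above can be sketched as a small script that inspects saved command output. This is a minimal illustration, not an official diagnostic: the function names are hypothetical, the field labels are taken from the example output in this article, and the output format may differ between vSAN versions. The sample text is an abbreviated copy of the example output shown above.

```python
import re

# Abbreviated sample text copied from the example output in this article.
SAMPLE_SUMMARY = """\
Total Number Of Resyncing Objects: 0
Total Bytes Left To Resync: 0
Total GB Left To Resync: 0.00
"""

SAMPLE_OBJECT = """\
   Configuration:
      RAID_1
         RAID_D
            Component: ########-####-####-####-########
              Component State: ABSENT,  CSN: STALE (11405!=11417),  Address Space(B): 107374182400 (100.00GB)
            Component: ########-####-####-####-########
              Component State: ACTIVE,  Address Space(B): 107374182400 (100.00GB)
         Component: ########-####-####-####-########
           Component State: ABSENT,  CSN: STALE (11414!=11417),  Address Space(B): 107374182400 (100.00GB)
"""

# Matches "Component State: <STATE>,  CSN: STALE (<component CSN>!=<object CSN>)".
STALE_RE = re.compile(r"Component State:\s*(\w+),\s*CSN:\s*STALE\s*\((\d+)!=(\d+)\)")


def resync_is_idle(summary_text):
    """Return True when the summary reports zero resyncing objects and zero bytes left."""
    fields = {}
    for line in summary_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return (fields.get("Total Number Of Resyncing Objects") == "0"
            and fields.get("Total Bytes Left To Resync") == "0")


def durability_findings(object_text):
    """Report whether a RAID_D node is present and list stale components."""
    has_raid_d = any(line.strip() == "RAID_D" for line in object_text.splitlines())
    stale = [(state, int(comp_csn), int(obj_csn))
             for state, comp_csn, obj_csn in STALE_RE.findall(object_text)]
    return {"has_raid_d": has_raid_d, "stale_components": stale}


def matches_article_pattern(summary_text, object_text):
    """Heuristic: resync idle, RAID_D present, and at least one stale component."""
    findings = durability_findings(object_text)
    return (resync_is_idle(summary_text)
            and findings["has_raid_d"]
            and bool(findings["stale_components"]))


if __name__ == "__main__":
    print(durability_findings(SAMPLE_OBJECT))
    print(matches_article_pattern(SAMPLE_SUMMARY, SAMPLE_OBJECT))
```

To use it against a real cluster, save the output of the two esxcli commands to text files and pass their contents to `matches_article_pattern`. A True result only indicates that the output resembles the pattern described here; it does not replace the checks in the vSphere Client.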

Resolution

  • Navigate to the vSAN cluster in RVC as follows:
  • Open an SSH session to the vCenter appliance.

  • Log in to RVC with "rvc localhost" if still using [email protected]; otherwise, log in with "rvc administrator@<domain>.local@localhost".

    Welcome to RVC. Try the 'help' command.
    0 /
    1 localhost/

  • Change directory to "localhost" using command "cd 1"

  • List the folder using "ls" command
    /localhost> ls
    0 Test Datacenter (datacenter)

  • Change directory to "Datacenter" using command "cd 0"
    /localhost> cd 0
  • List the folders using "ls" command.
    /localhost/Test Datacenter> ls
    0 storage/
    1 computers [host]/
    2 networks [network]/
    3 datastores [datastore]/
    4 vms [vm]/

  • Change directory to "computers [host]" using command "cd 1"
    /localhost/Test Datacenter> cd 1

  • List the folders using "ls" command
    /localhost/Test Datacenter/computers> ls
    0 Test Cluster (cluster) : cpu 19 GHz, memory 1 GB

  • Change directory to the vSAN cluster using command "cd 0"
    /localhost/Test Datacenter/computers> cd 0
    /localhost/Test Datacenter/computers/Test Cluster>

  • Run the command "vsan.check_state ." to check for inaccessible objects in the vSAN cluster.

    YYYY-DD-MM HH-MM-SS mmmm: Step 1: Check for inaccessible vSAN objects
    Detected 3 objects to be inaccessible
    Detected <Object UUID 1> on Host-1 to be inaccessible
    Detected <Object UUID 2> on Host-2 to be inaccessible
    Detected <Object UUID 3> on Host-3 to be inaccessible
  • Run the command "vsan.check_state . -r" to refresh the state of the inaccessible vSAN objects.

    YYYY-DD-MM HH-MM-SS mmmm: Step 1: Check for inaccessible vSAN objects
    Detected <Object UUID 1> to be inaccessible, refreshing state
    Detected <Object UUID 2> to be inaccessible, refreshing state
    Detected <Object UUID 3> to be inaccessible, refreshing state

  • Retest Skyline Health to verify the status.
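
When comparing the result before and after the refresh, it can help to extract the inaccessible objects from saved "vsan.check_state ." output. The following is a minimal sketch under the assumption that the output lines follow the example shown in this article; the function name is hypothetical, and the placeholders (such as <Object UUID 1>) are kept exactly as they appear above.

```python
import re

# Sample text copied from the example "vsan.check_state ." output in this article.
SAMPLE_CHECK_STATE = """\
YYYY-DD-MM HH-MM-SS mmmm: Step 1: Check for inaccessible vSAN objects
Detected 3 objects to be inaccessible
Detected <Object UUID 1> on Host-1 to be inaccessible
Detected <Object UUID 2> on Host-2 to be inaccessible
Detected <Object UUID 3> on Host-3 to be inaccessible
"""

# Summary line, e.g. "Detected 3 objects to be inaccessible".
COUNT_RE = re.compile(r"Detected (\d+) objects to be inaccessible")
# Per-object line, e.g. "Detected <uuid> on <host> to be inaccessible".
OBJECT_RE = re.compile(r"Detected (.+?) on (\S+) to be inaccessible")


def parse_check_state(output):
    """Return (reported_count, [(object_id, host), ...]) from check_state output."""
    reported_count = None
    objects = []
    for line in output.splitlines():
        line = line.strip()
        count_match = COUNT_RE.search(line)
        if count_match:
            reported_count = int(count_match.group(1))
            continue
        object_match = OBJECT_RE.search(line)
        if object_match:
            objects.append((object_match.group(1), object_match.group(2)))
    return reported_count, objects


if __name__ == "__main__":
    count, objects = parse_check_state(SAMPLE_CHECK_STATE)
    print(f"Reported inaccessible objects: {count}")
    for object_id, host in objects:
        print(f"  {object_id} on {host}")
```

Running the same parser on output captured after "vsan.check_state . -r" and a Skyline Health retest gives a simple way to confirm that the list of inaccessible objects has shrunk.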