VMware Cloud Foundation gets incorrect vSAN configuration during deployment


Article ID: 336754


Updated On:

Products

VMware Cloud Foundation

Issue/Introduction

Symptoms:

vSAN is not configured as expected prior to the VMware Cloud Foundation deployment.


Environment

VMware Cloud Foundation 2.1.x

Cause

This issue occurs due to either a prior failed vSAN deployment or a previous vSAN configuration that was not fully cleaned up.
 
Example: 1 of the 3 cache devices on each host has a leftover partition on it. Each host should have 3 cache devices and 12 capacity devices configured into 3 separate disk groups of 1 cache and 4 capacity devices each. Instead, the host in this example has 2 separate disk groups of 1 cache and 6 capacity devices each.

Resolution

If the host devices have previous partitions on them, perform these steps to clean those devices up and reconfigure vSAN:
  1. Identify misconfigured vSAN devices (both cache and capacity devices) and the affected hosts.
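
    For example, a quick check from an SSH session on each host (a minimal sketch; output fields vary by ESXi build):

      # List the devices vSAN has currently claimed and their disk group membership
      esxcli vsan storage list
      # Query each local device's vSAN eligibility, including reasons such as existing partitions
      vdq -q
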
  2. Optional: Migrate VMs to an alternative datastore, off vSAN.
  3. Confirm auto-disk claim is disabled
    a) Connect to a host in the cluster via SSH and run the following command: esxcli vsan storage automode get
    b) If auto-disk claim is enabled, disable it by running: esxcli vsan storage automode set --enabled=false
  4. Perform steps 5-12 on one host at a time.
  5. Place the host in maintenance mode and select the option to migrate all data off the host.
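
    If preferred, this can also be done from an SSH session; a sketch assuming the vSphere 6.x esxcli syntax (the call blocks until the evacuation completes, so the vSphere client is often preferable for large evacuations):

      # Enter maintenance mode and fully evacuate vSAN data from this host
      esxcli system maintenanceMode set --enable=true --vsanmode=evacuateAllData
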
  6. Clean up all partitions on the ineligible devices of the host:
     
    a) In vSphere 6.0, go to Cluster > Manage > Virtual SAN > Disk Management; in vSphere 6.5 and later, go to Cluster > Configure > vSAN > Disk Management.
    b) Select the host under Disk Groups in the top frame.
    c) In the bottom frame, click the Show drop-down arrow and select Ineligible.
    d) Clean up the bad partition.
    e) The devices should move to the Not in use state.

      For example: 1 cache device on host has an old partition.
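
    If a partition cannot be removed from the UI, it can usually be inspected and deleted from the host shell with partedUtil. The device name below is a placeholder; deleting a partition is destructive, so confirm the device and partition number first:

      # Show the partition table of the affected device (placeholder device name)
      partedUtil getptbl /vmfs/devices/disks/naa.xxxxxxxxxxxxxxxx
      # Delete partition 1 after confirming it is the stale vSAN partition
      partedUtil delete /vmfs/devices/disks/naa.xxxxxxxxxxxxxxxx 1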
       
  7. Destroy the disk groups on the host, selecting the option to fully migrate the data off them.

    For example: destroy the 2 disk groups on the host.
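
    As a sketch of the command-line equivalent: removing a disk group's cache device with esxcli tears down the whole disk group (the device name is a placeholder; the data was already evacuated in step 5):

      # Removing the cache device deletes its entire disk group
      esxcli vsan storage remove --ssd=naa.xxxxxxxxxxxxxxxx
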
  8. Confirm that the expected devices are available and unused.

    For example: 3 cache devices and 12 capacity devices should be available.
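
    Running vdq -q again from SSH should now report the cleaned devices as eligible for use by vSAN, with no remaining ineligibility reasons:

      vdq -q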
     
  9. Re-enable vSAN automatic disk claiming if desired by running: esxcli vsan storage automode set --enabled=true
  10. Wait for the process to finish and verify that the disk group configuration is as expected.

    For example: 3 disk groups each with 1 cache and 4 capacity devices should be configured.
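
    The resulting layout can also be checked from SSH; vdq -i prints the cache-to-capacity disk mappings per disk group, and -H makes the output human readable:

      vdq -iH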
     
  11. Take the host out of maintenance mode.
  12. Verify vSAN health again.
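
    As a basic per-host check, esxcli can confirm that the node has rejoined the vSAN cluster; the vSAN health checks in the vSphere client give fuller coverage:

      esxcli vsan cluster get
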
  13. Repeat steps 5-12 for each host.
  14. Optional: Migrate VMs back onto the vSAN datastore.
  15. Verify that everything looks correct and is healthy.

    For example: Each host should have 3 disk groups which have 1 cache and 4 capacity devices.
Note: If the partitions fail to delete, open a support case for further assistance.

Additional Information

Impact/Risks:
This issue can affect the entire vSAN datastore, but the risk is minimal as long as VMs and data are migrated off each affected host while it is in maintenance mode.