vSAN Cluster Storage Usage Spikes Above 90% in Recent Hours

Article ID: 391635

Updated On:

Products

VMware vSAN

Issue/Introduction

Symptoms:

  • A recent vSAN disk failure triggered a massive resync.
  • Capacity utilization is above 90% on most vSAN disks.

Validation Steps:

  • Connect to one of the vSAN nodes over SSH (for example, with PuTTY) and run the script below to report the current vSAN disk utilization for every vSAN node in the cluster.

cmmds-tool find -t HOSTNAME -f json | egrep "uuid|hostname" | sed -e 's/\"content\"://g' | awk '{print $2}' | sed -e 's/[\",\},\,]//g' | xargs -n 2 |
while read hostuuid hostname; do
  echo -e "\n\nHost Name: $hostname::: Host UUID: $hostuuid\n Disk Name\t\t| Disk UUID\t\t| Disk Usage     | Disk Capacity | Usage Percentage"
  cmmds-tool find -f python -t DISK -o $hostuuid | grep uuid | cut -c 13-48 |
  while read diskuuid; do
    cmmds-tool find -f json -t DISK -o $hostuuid -u $diskuuid | egrep "uuid|content" | sed -e 's/\"content\":|\\"uuid\"://g' | sed -e 's/[\",\},\]//g' | awk '{printf $0}' | sed -e 's/},/\n/g' | awk '{print $37 " " $5 " " $45}' |
    while read disknaa diskcap maxcomp; do
      diskcapused=$(cmmds-tool find -f json -t DISK_STATUS -u $diskuuid | grep content | sed -e 's/[\",\},\]//g' | awk '{print $3}')
      diskperc=$(echo "$diskcapused $diskcap" | awk '{print $1/$2*100}')
      if [ "$maxcomp" != 0 ]; then
        echo -en " $disknaa\t| $diskuuid\t| $diskcapused\t | $diskcap\t | $diskperc%\n"
      fi
    done
  done
done

Example output of the script above, run from the [root@test:/tmp] prompt:


Host Name: test.com::: Host UUID: xxxxxxx-xxxxxx-xxxx-xxxx-xxxxxxxxxxxx
 Disk Name          | Disk UUID                              | Disk Usage     | Disk Capacity  | Usage Percentage
 naa.xxxxxxxxxxxx:2 | xxxxxxx-xxxxxx-xxxx-xxxx-xxxxxxxxxxxx  | 3136019080080  | 3637540651008  | 86.2126%
 naa.xxxxxxxxxxxx:2 | xxxxxxx-xxxxxx-xxxx-xxxx-xxxxxxxxxxxx  | 3136019080080  | 3637540651008  | 86.2126%
 naa.xxxxxxxxxxxx:2 | xxxxxxx-xxxxxx-xxxx-xxxx-xxxxxxxxxxxx  | 3275191187629  | 3637540651008  | 90.0386%
 naa.xxxxxxxxxxxx:2 | xxxxxxx-xxxxxx-xxxx-xxxx-xxxxxxxxxxxx  | 3136019080080  | 3637540651008  | 86.2126%
 naa.xxxxxxxxxxxx:2 | xxxxxxx-xxxxxx-xxxx-xxxx-xxxxxxxxxxxx  | 3275191187629  | 3637540651008  | 90.0386%
 naa.xxxxxxxxxxxx:2 | xxxxxxx-xxxxxx-xxxx-xxxx-xxxxxxxxxxxx  | 3275191187629  | 3637540651008  | 90.0386%
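
On a large cluster the report can run to many rows, so it may help to filter it down to the disks at or above a threshold. The following is a minimal sketch, not part of the original procedure: `flag_full_disks` is a hypothetical helper name, and it assumes the report keeps the five-column, pipe-separated layout shown above.

```shell
# Hypothetical helper: print "host disk uuid pct%" for every disk whose usage
# percentage is at or above a threshold. Reads the report text on stdin;
# the threshold (default 90) is the first argument.
flag_full_disks() {
  awk -F'|' -v limit="${1:-90}" '
    /^Host Name:/ {                       # remember which host the rows belong to
      host = $0
      sub(/^Host Name: /, "", host)
      sub(/:::.*/, "", host)
      next
    }
    NF == 5 && $5 ~ /%/ {                 # data rows: five fields, last one ends in %
      pct = $5;  gsub(/[ \t%]/, "", pct)  # strip padding and the percent sign
      disk = $1; gsub(/[ \t]/, "", disk)
      uuid = $2; gsub(/[ \t]/, "", uuid)
      if (pct + 0 >= limit + 0)
        printf "%s %s %s %s%%\n", host, disk, uuid, pct
    }
  '
}
```

Assuming the report was saved to a file (the path here is only an example), the helper could be used as `flag_full_disks 90 < /tmp/vsan_disk_report.txt`.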

 

  • Connect to one of the vSAN nodes over SSH (for example, with PuTTY) and run the 'localcli vsan debug resync summary get' command to check the current vSAN resync status.

    Example:

    [root@test:/tmp] localcli vsan debug resync summary get;

    ResyncSummary:
       Total Number Of Resyncing Objects: 38
       Total Bytes Left To Resync: 4661841221120
       Total GB Left To Resync: 4341.68
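
    In this example the "GB" figure matches the byte count divided by 1024^3 (4661841221120 / 1073741824 ≈ 4341.68). The sketch below, which is not part of the original procedure, recomputes that value from the byte line so the remaining resync can be logged as a single number. `resync_gib_left` is a hypothetical helper name, and the sketch assumes the summary keeps the "Total Bytes Left To Resync: <bytes>" line shown above.

```shell
# Hypothetical helper: read the resync summary on stdin and print the
# remaining data in GiB, computed from the "Total Bytes Left To Resync" line.
resync_gib_left() {
  awk -F': *' '/Total Bytes Left To Resync/ {
    printf "%.2f\n", $2 / (1024 * 1024 * 1024)
  }'
}
```

    On a vSAN node it could be used as `localcli vsan debug resync summary get | resync_gib_left`, repeated at intervals to confirm that the value is falling.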

  • To monitor the resync through vSAN Skyline Health, refer to 'Monitor the Resynchronization Tasks in the vSAN Cluster'.

Environment

VMware vSAN 7.x
VMware vSAN 8.x

Cause

The sudden spike in vSAN space usage was caused by an environmental issue with the backup server, which was creating additional snapshots on the vSAN VMs.
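
One way to check whether snapshots are accumulating is to count them per VM from the ESXi shell. The following is a rough sketch, not part of the original article. The helper names `count_snapshots` and `report_vm_snapshots` are hypothetical, and the sketch assumes that `vim-cmd vmsvc/snapshot.get` prints one line containing "Snapshot Name" per snapshot. Note that `vim-cmd vmsvc/getallvms` lists only the VMs registered on the host where it runs, so the check would need to be repeated on each node.

```shell
# Hypothetical helper: count snapshots in the text printed by
# "vim-cmd vmsvc/snapshot.get <vmid>" (read from stdin).
# Assumption: each snapshot contributes one line containing "Snapshot Name".
count_snapshots() {
  grep -c 'Snapshot Name' || true   # grep exits non-zero on a zero count
}

# Hypothetical helper: print "<vmid> <snapshot count>" for every VM registered
# on this host that has at least one snapshot. Must be run on an ESXi host.
report_vm_snapshots() {
  vim-cmd vmsvc/getallvms | awk '$1 ~ /^[0-9]+$/ { print $1 }' |
  while read -r vmid; do
    n=$(vim-cmd vmsvc/snapshot.get "$vmid" | count_snapshots)
    if [ "$n" -gt 0 ]; then
      echo "$vmid $n"
    fi
  done
}
```

Running `report_vm_snapshots` on a node and comparing the VM IDs against the output of `vim-cmd vmsvc/getallvms` would show which VMs carry the most snapshots.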

 

Resolution