The datastore space utilisation shows that X amount of space is consumed on the datastore.
However, from a Kubernetes perspective, the total size of the persistent volumes is Y.
This KB helps identify the source of the discrepancy.
Tanzu Kubernetes cluster with CSI PVs backed by NFS datastore
If there are any backup solutions, confirm they are working as expected and that there are no stale backups.
For Velero backups, list the backups, then inspect the details and logs of each one:
velero backup get
velero backup describe <backup name>
velero backup logs <backup name>
Check for stale CSI volume snapshots:
kubectl get volumesnapshot -A
kubectl get volumesnapshotcontent -A
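To obtain the Kubernetes-side figure (Y in the introduction), the capacity of every persistent volume can be summed. The following is a minimal sketch and not part of the original procedure: the helper name sum_pv_bytes is hypothetical, and it assumes the capacities use only the binary suffixes Ki, Mi, Gi, Ti or the decimal suffixes k, M, G, T. The result is the provisioned capacity, which may differ from the space actually consumed on the datastore.

```shell
# Hypothetical helper: sum Kubernetes storage quantities (one per line on stdin) into bytes.
sum_pv_bytes() {
  awk '
    {
      value = $1 + 0               # awk keeps the leading numeric part, e.g. "10Gi" -> 10
      unit = $1
      sub(/^[0-9.]+/, "", unit)    # what remains is the suffix, e.g. "Gi"
      mult = 1
      if      (unit == "Ki") mult = 1024
      else if (unit == "Mi") mult = 1024 * 1024
      else if (unit == "Gi") mult = 1024 * 1024 * 1024
      else if (unit == "Ti") mult = 1024 * 1024 * 1024 * 1024
      else if (unit == "k")  mult = 1000
      else if (unit == "M")  mult = 1000 * 1000
      else if (unit == "G")  mult = 1000 * 1000 * 1000
      else if (unit == "T")  mult = 1000 * 1000 * 1000 * 1000
      total += value * mult
    }
    END { printf "%.0f\n", total }'
}

# Usage sketch (assumes kubectl access to the cluster):
# kubectl get pv -o jsonpath='{range .items[*]}{.spec.capacity.storage}{"\n"}{end}' | sum_pv_bytes
```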
Identify the datastore usage and UUID. These can be retrieved from the vSphere UI or by using esxcli or esxcfg-info.
esxcli storage filesystem list
Mount Point                      Volume Name  UUID               Mounted  Type  Size            Free
-------------------------------  -----------  -----------------  -------  ----  --------------  --------------
/vmfs/volumes/abcd1234-abcd1234  datastore_1  abcd1234-abcd1234  true     NFS   38482906972160  31134093824000
For this datastore, Usage = Size - Free = 38482906972160 - 31134093824000 = 7348813148160 bytes.
Converted to TB: 7348813148160 / (1024 * 1024 * 1024 * 1024) = 6.68 TB
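The arithmetic above can be sketched as a small shell helper. This is an illustration only: the function name datastore_usage is hypothetical, and it takes the Size and Free columns from esxcli storage filesystem list as its two arguments.

```shell
# Hypothetical helper: used bytes and used TB (1 TB = 1024^4 bytes, as in the text).
datastore_usage() {
  awk -v size="$1" -v free="$2" 'BEGIN {
    used = size - free
    printf "%.0f bytes = %.2f TB\n", used, used / (1024 * 1024 * 1024 * 1024)
  }'
}

# Values from the example datastore; prints "7348813148160 bytes = 6.68 TB".
datastore_usage 38482906972160 31134093824000
```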
esxcfg-info -a | grep -A15 <UUID>
|----Volume UUID.....................................abcd1234-abcd1234
|----Volume Name.....................................datastore_1
|----LVM Name........................................10.##.##.## /datastore_1
|----Type............................................NFS
|----Head Extent.....................................nfs:abcd1234-abcd1234
|----Console Path..................................../vmfs/volumes/abcd1234-abcd1234
|----Block Size......................................4096
|----Total Blocks....................................9395240960
|----Logical Disk Block Size.........................512
|----Physical Disk Block Size........................512
|----isSw512e........................................false
|----Blocks Used.....................................1793955619
|----Size............................................38482906972160
|----Usage...........................................7348042215424
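In the sample output, Usage equals Blocks Used multiplied by Block Size (1793955619 * 4096 = 7348042215424). A minimal sketch for checking this from saved esxcfg-info output follows. The helper name check_block_usage is hypothetical, and it assumes the dotted "key....value" layout shown above.

```shell
# Hypothetical helper: read esxcfg-info volume lines on stdin and compare
# Blocks Used * Block Size against the reported Usage.
check_block_usage() {
  awk -F'[.][.]+' '
    { key = $1; sub(/^[ \t|-]+/, "", key) }   # strip indentation and the "|----" prefix
    key == "Block Size"  { bs = $2 }          # exact match, so "Logical Disk Block Size" is skipped
    key == "Blocks Used" { used = $2 }
    key == "Usage"       { usage = $2 }
    END {
      calc = bs * used
      printf "%.0f %.0f %s\n", calc, usage, (calc == usage + 0 ? "match" : "mismatch")
    }'
}

# Usage sketch: esxcfg-info -a | grep -A15 <UUID> | check_block_usage
```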
Check disk usage
du -sh /vmfs/volumes/<UUID>/
du -sh /vmfs/volumes/abcd1234-abcd1234/
3.6T /vmfs/volumes/abcd1234-abcd1234/
Check size of all files on datastore
ls -Risla /vmfs/volumes/abcd1234-abcd1234/ > ls-datastore.txt
cat ls-datastore.txt | grep total | awk '{print $2}' | awk '{ sum += $1 } END { print sum }'
This returns 3908174964 KB in this example, which is approximately 3.6 TB.
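The summing and the KB to TB conversion can also be done in a single awk pass. This is a sketch under the same assumption as the text, namely that the "total" lines written by ls are in KB. The helper name sum_ls_totals is hypothetical.

```shell
# Hypothetical helper: add up the per-directory "total" lines of a recursive ls listing
# and print the sum in KB and in TB (1 TB = 1024^3 KB).
sum_ls_totals() {
  awk '$1 == "total" { sum += $2 }
       END { printf "%.0f KB = %.2f TB\n", sum, sum / (1024 * 1024 * 1024) }' "$@"
}

# Usage sketch: sum_ls_totals ls-datastore.txt
```

Matching on the first field rather than using grep keeps file names that merely contain the word "total" out of the sum.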
Check free space
df -h
Filesystem  Size   Used  Available  Use%  Mounted on
NFS         35.0T  6.7T  28.3T      19%   /vmfs/volumes/datastore_1
In this example, the disk usage of the files on the datastore (3.6 TB) and the used space reported for the NFS datastore (6.7 TB) are not consistent.
On the ESX host, start a tcpdump capture on the NFS port and on the vmknic through which the NFS server is reached, then run df -h while the capture is running.
tcpdump-uw -i <vmknic> port 2049 -w /vmfs/volumes/df.pcap
Analyse the packet capture using the steps below.
Select any directory on the datastore (see ls-datastore.txt above for the full list).
/vmfs/volumes/abcd1234-abcd1234/vm-0079a88c-e317-440a-bf4e-01c28b01b152 is selected in this example.
Using the directory name from above, identify the file handle hash for the datastore from the NFS LOOKUP calls (rpc.procedure == 3).
tshark -2 -Tfields -e frame.number -e frame.time_relative -e rpc.xid -e nfs.name -e nfs.fh.hash -Y 'rpc.procedure == 3 && rpc.msgtyp == 0' -r df.pcap | grep vm-0079a88c-e317-440a-bf4e-01c28b01b152
45460 22.057884000 0x35164b64 vm-0079a88c-e317-440a-bf4e-01c28b01b152 0xc4045aca
The file handle hash for the datastore is 0xc4045aca. Use this file handle hash to get the xid of the FSSTAT call (rpc.procedure == 18).
tshark -2 -Tfields -e frame.number -e frame.time_relative -e rpc.xid -Y 'rpc.procedure == 18 && rpc.msgtyp == 0 && nfs.fh.hash == 0xc4045aca' -r df.pcap
57824 28.540564000 0x35165c96
The xid is 0x35165c96. Use it to filter for the FSSTAT reply.
tshark -2 -Tfields -e frame.number -e frame.time_relative -e rpc.xid -e nfs.fsstat3_resok.tbytes -e nfs.fsstat3_resok.fbytes -Y 'rpc.procedure == 18 && rpc.msgtyp == 1 && rpc.xid == 0x35165c96' -r df.pcap
57825 28.540948000 0x35165c96 38482906972160 31118049619968
Total bytes: 38482906972160 (35 TB)
Free bytes: 31118049619968 (~28.30 TB)
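The conversion of the FSSTAT reply can be sketched as follows. The helper name fsstat_summary is hypothetical, and it assumes the five-column tshark output shown above (frame number, relative time, xid, total bytes, free bytes).

```shell
# Hypothetical helper: convert the total and free bytes of an FSSTAT reply line to TB
# (1 TB = 1024^4 bytes) and derive the used space as total minus free.
fsstat_summary() {
  awk '{
    tb = 1024 * 1024 * 1024 * 1024
    total = $4
    free = $5
    printf "total=%.2fTB free=%.2fTB used=%.2fTB\n", total / tb, free / tb, (total - free) / tb
  }'
}

# Reply line from the example; prints "total=35.00TB free=28.30TB used=6.70TB".
echo "57825 28.540948000 0x35165c96 38482906972160 31118049619968" | fsstat_summary
```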
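In this example, the FSSTAT reply shows that the NFS server itself reports about 6.7 TB used (total bytes minus free bytes), while the files on the datastore add up to only 3.6 TB. The inaccurate free space therefore originates on the NFS server.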
Once the source of the discrepancy is identified, engage the appropriate team for further assistance.