This KB article addresses the issue of disk usage on the NFS Server VMs filling up to near or at maximum capacity. For example, in the image below, we notice that the disk usage in the /var/vcap/store directory on the NFS Server VM is at maximum capacity (100%).
A symptom of this issue is that running the df -h command on the nfs_server VM shows disk usage in the /var/vcap/store directory anywhere from 80% to 100%. Stale droplets and stale application packages can contribute to this build-up of disk usage on the NFS Server VM.
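The manual df -h check above can be automated. The helper below is a minimal sketch (not part of the KB): it reads POSIX "df -P" style output on stdin and prints any filesystem at or above a usage threshold, defaulting to the 80% symptom level mentioned above. The function name and threshold default are assumptions for illustration.

```shell
# Hypothetical helper: flag filesystems at or above a usage threshold.
# Reads "df -P" style output from stdin so it can be tested offline.
check_disk_usage() {
  threshold="${1:-80}"
  awk -v t="$threshold" 'NR > 1 {
    use = $5
    sub(/%/, "", use)                      # "95%" -> "95"
    if (use + 0 >= t) printf "%s %s%%\n", $6, use
  }'
}
# usage on the nfs_server VM: df -P | check_disk_usage 80
```

Piping df -P (rather than df -h) keeps the columns machine-parsable, since -h sizes vary in unit suffixes while the Capacity column stays a plain percentage.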
We can also SSH into the nfs_server VM and run the du -h -d 1 /var/vcap/store/shared | sort -rh command to find out which subdirectory in /var/vcap/store is taking up the most space:
1. SSH into nfs_server VM:
bosh -d $(bosh ds --column=name | grep ^cf-) ssh nfs_server
2. Become the root user:
sudo su -
3. Run the du command:
du -h -d 1 /var/vcap/store/shared | sort -rh
If the output of the above du command shows that most of the space is being taken up by /var/vcap/store/shared/cc-droplets and /var/vcap/store/shared/cc-packages, as in the image below, the issue might be attributed to the presence of stale droplets and application packages, which need to be cleared out manually.
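To quantify how much of the shared store those two directories account for, a small helper like the one below can be used. It is a sketch (not from the KB) that assumes "du -k -d 1" output, i.e. size-in-KB followed by the path, read from stdin:

```shell
# Hypothetical helper: total just the cc-droplets and cc-packages
# entries from "du -k -d 1" style output on stdin.
blob_share() {
  awk '/cc-droplets$|cc-packages$/ { kb += $1 }
       END { printf "%d KB in droplets+packages\n", kb }'
}
# usage: du -k -d 1 /var/vcap/store/shared | blob_share
```

Using -k instead of -h keeps all sizes in the same unit so they can be summed directly.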
We have two separate workarounds that we can choose from to resolve this issue. One of them is to run a clean-up Ruby script (expire.rb) on a Cloud Controller VM. See the steps below on how to use the expire.rb script to clean up these stale artifacts:
1. Per the documentation (https://docs.vmware.com/en/VMware-Tanzu-Application-Service/4.0/tas-for-vms/configure-pas.html#configure-file-storage-16), set the maximum droplets and packages per application both to 1, as seen in the image below. We can find this setting by clicking the TAS tile > File Storage; change Maximum valid packages per app and Maximum staged droplets per app to 1, and click the Save button.
NOTE: After clicking the Save button, we need to run an Apply Changes scoped to the TAS tile only.
2. After the Apply Changes has successfully completed, we can SSH into any Cloud Controller VM:
bosh -d $(bosh ds --column=name | grep ^cf-) ssh cloud_controller/0
3. Become the root user:
sudo su -
4. Change into the /tmp directory:
cd /tmp
5. Create an empty expire.rb script file, which will contain the contents of our Ruby script:
touch /tmp/expire.rb
6. Open the /tmp/expire.rb file with vim, copy the contents below into the vim editor, and save the file:
vim /tmp/expire.rb
puts "starting expiring droplets/packages script...."

# Helper to print the current number of staged droplets and ready packages
def self.output_bits_info
  current_droplets = DropletModel.where(state: DropletModel::STAGED_STATE).count
  current_packages = PackageModel.where(state: PackageModel::READY_STATE).count
  puts "Number of droplets: #{current_droplets}"
  puts "Number of packages: #{current_packages}"
end

puts "State before"
output_bits_info

# Expire stale droplets and packages for every app, honoring the per-app
# limits configured in the TAS tile (set to 1 in step 1 above)
AppModel.all.each do |a|
  expirer = BitsExpiration.new
  expirer.expire_droplets!(a)
  expirer.expire_packages!(a)
end

puts "State after"
output_bits_info
7. Run the expire.rb script by piping it into the Cloud Controller console:
cat expire.rb | /var/vcap/jobs/cloud_controller_ng/bin/console
8. When running the script, we may get an output that looks like the image below. Notice the total number of droplets and packages printed out. In this case, we have totals of 8,312 droplets and 11,792 packages. These totals are BEFORE the clean-up is actually run. If the issue is truly due to stale droplets and packages taking up disk space, the number of droplets and packages should decrease after the script finishes running.
9. After the clean-up portion of the script runs and finishes, we may get output similar to what's in the image below. If you get a similar output, press the 'q' key to exit out of the script output:
10. After pressing the 'q' key, we get the output seen in the image below. We notice that the total number of droplets and app packages has decreased to 7,074 and 7,904, respectively. From step 7, we recall that the totals were 8,312 and 11,792. Because the number of droplets and packages has decreased, it is likely that we had a fair number of stale droplets and app packages.
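As a quick sanity check on the example numbers above, the reduction can be computed with plain shell arithmetic (the counts are taken from this example run):

```shell
# Compare the before/after totals printed by the expire.rb run above.
before_droplets=8312; after_droplets=7074
before_packages=11792; after_packages=7904
echo "droplets removed: $((before_droplets - after_droplets))"
echo "packages removed: $((before_packages - after_packages))"
```

This run removed 1,238 droplets and 3,888 packages; your counts will differ depending on how many stale artifacts have accumulated.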
11. We can now check whether the NFS Server VM disk usage has decreased as well. To do this, we SSH into the NFS Server VM and run df -h to check disk space usage:
1. SSH into the nfs_server VM:
bosh -d $(bosh ds --column=name | grep ^cf-) ssh nfs_server
2. Become the root user:
sudo su -
3. Run the df -h command:
df -h
Checking the disk usage, we see that it has decreased significantly:
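If you want to compare the before and after usage numerically rather than by eye, a helper like this sketch can extract just the usage percentage for the store mount from "df -P" style output (the function name and mount point match are assumptions for illustration):

```shell
# Hypothetical helper: print the usage percentage of /var/vcap/store
# from "df -P" style output on stdin.
store_usage() {
  awk '$6 == "/var/vcap/store" { sub(/%/, "", $5); print $5 }'
}
# usage: df -P | store_usage
```

Running this before and after the clean-up gives two plain numbers that are easy to record in a ticket or compare in a script.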