When a VM is in an unresponsive state, you may be unable to "bosh ssh" to the VM, although a regular SSH connection may still work. When you SSH to the affected VM, you may find that all services are up and running according to "monit summary". Services may restart successfully, but the underlying problem will remain unresolved.
Check whether the inodes are exhausted on any filesystem, in particular the root filesystem (/), with this command:
df -hi

ubuntu@bosh-stemcell:~$ df -hi
Filesystem     Inodes IUsed IFree IUse% Mounted on
udev             995K   387  995K    1% /dev
tmpfs            998K   397  998K    1% /run
/dev/sda1         10M  215K  9.8M    3% /
none             998K     3  998K    1% /sys/fs/cgroup
none             998K     1  998K    1% /run/lock
none             998K     1  998K    1% /run/shm
none             998K     1  998K    1% /run/user
tmpfs            998K     3  998K    1% /home/tempest-web/uaa/config

If IUse% is 100% for the root filesystem, the inodes are exhausted and the VM has become unresponsive. To confirm, use the touch command to create an empty file:
touch /testfile
If you see the error "No space left on device", this confirms that the filesystem is not writable. If the file is created successfully, delete it with rm /testfile.
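The inode check above can be scripted so that a full filesystem is easy to spot among many mounts. The following is a minimal sketch, not part of the original procedure: the function name full_inode_mounts and the default threshold are assumptions, and it expects the POSIX-style single-line layout that "df -Pi" produces with GNU coreutils. Mount points that contain spaces would be truncated to their first word.

```shell
#!/bin/sh
# Print mount points whose inode usage (IUse%) is at or above a threshold.
# Reads `df -Pi`-style output on stdin, so it also works on saved output.
# full_inode_mounts is a hypothetical helper name, not a BOSH or stemcell tool.
full_inode_mounts() {
    threshold="${1:-100}"
    awk -v t="$threshold" 'NR > 1 {
        use = $5
        sub(/%/, "", use)   # "3%" becomes "3"; a "-" entry stays non-numeric
        if (use ~ /^[0-9]+$/ && use + 0 >= t) print $6 " " $5
    }'
}

# Usage on the affected VM (prints nothing if no filesystem is full):
#   df -Pi | full_inode_mounts 100
```

If the root filesystem (/) appears in the output, continue with the file count steps to locate the directory that is consuming the inodes.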
Change directory (cd) to / and run this command:
sudo find . -xdev -type f | cut -d "/" -f 2 | sort | uniq -c | sort -n
Note: The "sudo" is unnecessary if you are already running as root in your shell.
Find the directory with the largest number of files (the last line of the output) and cd to that directory. If necessary, run the command again from there to narrow down the likely source of the files. For example, if McAfee has generated the files, you may see a large number of files (>100,000) under /var/McAfee.
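Because the drill-down repeats the same pipeline in successive directories, it can help to wrap it in a small function. This is a sketch under stated assumptions: count_files_by_child is a hypothetical name, and the example paths are illustrative only. It uses the same find pipeline as above, with a comment on what each stage does. A file that sits directly in the starting directory shows up under its own name with a count of 1.

```shell
#!/bin/sh
# Count regular files under each immediate child of a directory, smallest first.
# -xdev keeps find on one filesystem, because inodes are counted per filesystem.
# count_files_by_child is a hypothetical helper name used for illustration.
count_files_by_child() {
    dir="${1:-/}"
    (
        cd "$dir" || exit 1
        find . -xdev -type f |   # list every regular file as ./child/...
            cut -d "/" -f 2 |    # keep only the first path component
            sort | uniq -c |     # count files per component
            sort -n              # the largest count ends up on the last line
    )
}

# Illustrative drill-down (run as root, or the counts may be incomplete):
#   count_files_by_child /
#   count_files_by_child /var
```

Each call prints one line per child in the form "count name", so the last line names the directory to descend into next.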
If a third-party software product, such as McAfee, has generated too many files and used up all available inodes, you will need to open a support case with that software vendor to safely delete the unneeded files and to manage the files going forward so that the issue does not recur.