"no space left on device: unknown" error when Diego Cell fails to start app containers when no space left on tmpfs /var/vcap/data/sys/run

Article ID: 298465

Products

VMware Tanzu Application Service for VMs

Issue/Introduction

You run into this error when both of the following are true:
  • You upgraded to Tanzu Application Service for VMs (TAS for VMs) v2.12+, v2.11.4+, v2.10.16+, v2.9.24+, or v2.7.36+, all of which include garden-runc v1.19.29+.
  • The Linux Xenial stemcell used by TAS for VMs is older than v621.115 (621.x line) or v456.152 (456.x line).
The following are examples of the error symptoms seen:
# error from cf push

2021-12-15T12:54:37.47+0000 [CELL/1] ERR Cell 0ae5e5a3-62e9-4b78-8e86-5920c25661ec failed to create container for instance 6411cf6e-3289-4b07-46a8-7fb9: failed to create shim: write /var/vcap/sys/run/containerd/state/io.containerd.runtime.v2.task/garden/6411cf6e-3289-4b07-46a8-7fb9/options.json: no space left on device: unknown
# error in rep log

{"timestamp":"1639621126.668596029","source":"rep","message":"rep.garden-healthcheck.garden-health.healthcheck.create.failed","log_level":2,"data":{"attempt":0,"error":"failed to create shim: write /var/vcap/sys/run/containerd/state/io.containerd.runtime.v2.task/garden/check-2522be25-4dff-4af1-65a0-57cbe269f098/options.json: no space left on device: unknown","session":"5.1.100.3"}}
# space full under /var/vcap/data/sys/run

diego_cell/5b08405c-82bf-410d-9680-92d7eb75f221:~# df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs         16G     0   16G   0% /dev
tmpfs            16G     0   16G   0% /dev/shm
tmpfs            16G  103M   16G   1% /run
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            16G     0   16G   0% /sys/fs/cgroup
/dev/nvme1n1p1  2.9G  1.4G  1.4G  50% /
/dev/nvme0n1p2   68G  8.7G   56G  14% /var/vcap/data
tmpfs           1.0M 1000K   24K  98% /var/vcap/data/sys/run
/dev/loop0       64G  3.9G   60G   7% /var/vcap/data/grootfs/store/unprivileged
/dev/loop1       64G   97M   64G   1% /var/vcap/data/grootfs/store/privileged
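
To confirm the same symptom on another Diego Cell, the size and usage of the mount can be read from df directly. The following is a minimal sketch, not part of the original procedure; the function names parse_df and report_run_tmpfs are hypothetical, and the 1024 KB threshold simply reflects the 1 MB tmpfs shown above.

```shell
RUN_DIR=/var/vcap/data/sys/run   # tmpfs path from this article

# Reads `df -Pk <path>` output on stdin and prints "<size_kb> <use_percent>".
parse_df() {
  awk 'NR==2 { sub(/%/, "", $5); print $2, $5 }'
}

# Prints the size and usage of the filesystem that holds the given path,
# and flags it when it is 1 MB or smaller.
report_run_tmpfs() {
  local usage size_kb use_pct
  usage=$(df -Pk "$1" | parse_df)
  [ -n "$usage" ] || return 1
  size_kb=${usage% *}
  use_pct=${usage#* }
  if [ "$size_kb" -le 1024 ]; then
    echo "$1: ${size_kb} KB, ${use_pct}% used (undersized)"
  else
    echo "$1: ${size_kb} KB, ${use_pct}% used"
  fi
}

# Only runs on a VM where the directory exists.
if [ -d "$RUN_DIR" ]; then
  report_run_tmpfs "$RUN_DIR"
fi
```

A size of about 1024 KB with usage near 100% matches the df output shown above.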


Environment

Product Version: 2.7
OS: linux

Cause

The BOSH agent in certain older Xenial stemcell versions mounts /var/vcap/data/sys/run (also reachable as /var/vcap/sys/run, the path that appears in the error messages) as a 1 MB tmpfs.

Starting with v1.19.29, garden-runc stores container state in this 1 MB tmpfs. When multiple app containers are scheduled to a Diego Cell, the 1 MB of space is quickly used up, which causes the error.
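
To see what is consuming the tmpfs on an affected cell, the largest entries under the mount can be listed with du. This is a minimal sketch; the function name largest_entries is hypothetical, and the exact directory layout below the mount may differ between garden-runc versions.

```shell
RUN_DIR=/var/vcap/data/sys/run   # tmpfs path from this article

# Prints the N largest entries (size in KB, then path) under a directory.
# N defaults to 10.
largest_entries() {
  du -ak "$1" 2>/dev/null | sort -rn | head -n "${2:-10}"
}

# Only runs on a VM where the directory exists.
if [ -d "$RUN_DIR" ]; then
  largest_entries "$RUN_DIR"
fi
```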


Workaround

To temporarily work around this issue, you can increase the size of the tmpfs mounted at /var/vcap/data/sys/run from 1 MB to 16 MB. Run the following commands as root on each affected Diego Cell VM.

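1. Run this command to stop all services, then wait until monit summary shows that every process has stopped:
monit stop all

2. Run this command to unmount the existing 1 MB tmpfs:
umount /var/vcap/data/sys/run

3. Run this command to mount a 16 MB tmpfs in its place:
mount -t tmpfs -o rw,relatime,size=16m tmpfs /var/vcap/data/sys/run

4. Run this command to start all services:
monit start all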
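
The four steps above can be sketched as one script. This is an illustration rather than a supported tool: the names all_processes_stopped and resize_run_tmpfs and the --apply flag are hypothetical, and the polling loop assumes that monit summary prints one "Process '<name>' <status>" line per job and reports "not monitored" once a job has stopped. It must run as root on the Diego Cell VM.

```shell
#!/bin/bash
RUN_DIR=/var/vcap/data/sys/run   # tmpfs path from this article
NEW_SIZE=16m                     # target size from this article

# Reads `monit summary` output on stdin. Returns 0 only when every
# "Process" line ends in "not monitored" (assumed output format).
all_processes_stopped() {
  ! grep "^Process " | grep -qv "not monitored[[:space:]]*$"
}

resize_run_tmpfs() {
  monit stop all
  # `monit stop all` returns immediately, so poll until everything stopped.
  until monit summary | all_processes_stopped; do
    sleep 5
  done
  # Abort if either step fails, leaving services stopped for inspection.
  umount "$RUN_DIR" || return 1
  mount -t tmpfs -o rw,relatime,size="$NEW_SIZE" tmpfs "$RUN_DIR" || return 1
  monit start all
}

# Act only when explicitly asked, so sourcing this file has no side effects.
if [ "${1:-}" = "--apply" ]; then
  resize_run_tmpfs
fi
```

Waiting for the processes to stop before unmounting matters because umount fails while files under the mount are still in use.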


Resolution

To resolve this issue, upgrade to a stemcell version listed below. The following BOSH agent versions, shown with their corresponding stemcell versions, mount a 16 MB tmpfs at /var/vcap/data/sys/run.
  • BOSH agent 2.268.21+ (Xenial stemcell 621.115+)
  • BOSH agent 2.234.11+ (Xenial stemcell 456.152+)
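
The version thresholds above can be encoded in a small helper to check a given stemcell version. This is a sketch under stated assumptions: the function name stemcell_status is hypothetical, and because only the 621.x and 456.x lines are listed here, any other version line is reported as unknown rather than guessed.

```shell
# Prints "fixed", "affected", or "unknown" for a Xenial stemcell version
# such as 621.110, using the thresholds listed in this article.
stemcell_status() {
  local version=$1 major minor min_fixed
  case "$version" in
    *.*) major=${version%%.*}; minor=${version#*.} ;;
    *)   echo unknown; return ;;
  esac
  case "$major" in
    621) min_fixed=115 ;;
    456) min_fixed=152 ;;
    *)   echo unknown; return ;;   # version line not covered by the article
  esac
  case "$minor" in
    ''|*[!0-9]*) echo unknown; return ;;   # not a plain MAJOR.MINOR version
  esac
  if [ "$minor" -ge "$min_fixed" ]; then
    echo fixed
  else
    echo affected
  fi
}
```

For example, stemcell_status 621.110 prints affected and stemcell_status 456.152 prints fixed. The version to pass in can be taken from the output of bosh stemcells.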