App deployment fails with "no space left on device" on a Diego Cell that has sufficient disk space

Article ID: 433455

Products

VMware Tanzu Application Service

Issue/Introduction

App deployments fail with a "no space left on device" error. This occurs even when the Diego Cell appears to have sufficient free disk space on the primary data partition. The issue persists across different cells and is only temporarily "resolved" by recreating the cell.

In the Diego Cell garden.log, the following error is present: "error":"open /var/vcap/data/grootfs/store/unprivileged/images/.../image_quota: no space left on device"
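To confirm that a cell is hitting this specific failure, the log can be scanned for the message quoted above. The following is a minimal sketch that only does substring matching. The function name is hypothetical, and the log location in the usage comment is an assumption that should be adjusted to wherever garden.log lives on the cell.

```python
def find_image_quota_errors(lines):
    """Return log lines that report the GrootFS image_quota disk-full failure.

    Plain substring matching is used so that no assumption is made about
    the exact structure of each log entry.
    """
    return [
        line.rstrip("\n")
        for line in lines
        if "image_quota" in line and "no space left on device" in line
    ]


# Example usage (the path is an assumption, not taken from the article):
#   with open("/var/vcap/sys/log/garden/garden.log") as log:
#       for hit in find_image_quota_errors(log):
#           print(hit)
```

An empty result suggests the deployment failure has a different cause from the one described in this article.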

Run df -h on the affected Diego Cell. You will likely see a discrepancy between the physical data partition and the GrootFS loopback mount:

  1. /dev/sdb2 (/var/vcap/data): Shows ample free space.

  2. /dev/loop0 (/var/vcap/data/grootfs/store/unprivileged): Shows 100% utilization.

  3. /var/vcap/data/grootfs/store/unprivileged/volumes: Contains a very large number of entries.
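The df comparison above can also be automated. The sketch below parses the POSIX-format output of `df -P` and flags the pattern described here: the GrootFS loopback mount is full while the data partition is not. The function names and the 95% cutoff are illustrative assumptions. The mount points are the ones named in this article.

```python
import subprocess

DATA_MOUNT = "/var/vcap/data"
STORE_MOUNT = "/var/vcap/data/grootfs/store/unprivileged"


def read_df():
    """Run `df -P` and return its text output."""
    return subprocess.run(
        ["df", "-P"], capture_output=True, text=True, check=True
    ).stdout


def parse_df(output):
    """Map each mount point to its use percentage from `df -P` text."""
    usage = {}
    for line in output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue
        # Columns: filesystem, blocks, used, available, capacity, mount point
        percent = fields[4].rstrip("%")
        if percent.isdigit():
            usage[" ".join(fields[5:])] = int(percent)
    return usage


def loopback_full_but_disk_free(usage, full_at=95):
    """True when the GrootFS store is (nearly) full but the data partition is not.

    full_at is an arbitrary illustrative cutoff, not a product setting.
    """
    if STORE_MOUNT not in usage or DATA_MOUNT not in usage:
        return False
    return usage[STORE_MOUNT] >= full_at and usage[DATA_MOUNT] < full_at


# Example usage on a cell:
#   print(loopback_full_but_disk_free(parse_df(read_df())))
```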

Cause

The issue is caused by a race condition between Docker image layer accumulation and the Diego Cell disk cleanup trigger.

  • Layer Density: Unlike Buildpack apps (which share 1–2 stack volumes), Docker-based apps utilize multiple unique layers. Each layer creates a unique volume in GrootFS.

  • Threshold Mismatch: The "Diego Cell disk cleanup scheduling" (typically set in Ops Manager) triggers based on the free space of the physical disk (/dev/sdb2).

  • The Problem: If many Docker apps are deployed, the GrootFS loopback filesystem (/dev/loop0) can fill up entirely before free space on the physical disk drops to the threshold (e.g., 15 GB) that triggers a cleanup.

Because the cleanup never triggers, GrootFS does not prune old cached layers, which leads to the "no space left on device" error on the loopback mount even though the physical disk still has capacity.

Resolution

Increase the reserved disk space used as the cleanup trigger in Ops Manager so that cleanup activates before the loopback device is exhausted.

  1. Go to Ops Manager > EAR Tile > Settings > App Containers.

  2. Increase the value for "Reserved disk space for other jobs", for example from 15360 to 25600 (MB), so that cleanup starts earlier, before /dev/loop0 fills up.
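Why a larger reserve helps can be illustrated with a small worked calculation. This is a simplified model based only on the description in this article, not on the actual GrootFS implementation. It assumes that every MB written to the loopback store also consumes one MB of physical free space, and that cleanup starts once physical free space falls to the reserved value. The function name and the free-space figures are hypothetical. Only 15360 and 25600 come from the steps above.

```python
def cleanup_fires_before_loop_full(physical_free_mb, loop_free_mb, reserved_mb):
    """True if the cleanup threshold is reached before the loopback store fills.

    Simplified model (assumption): each MB written to the store reduces
    physical free space by one MB, and cleanup starts when physical free
    space has dropped to reserved_mb.
    """
    # Data that can still be written before physical free space hits the reserve.
    headroom_before_cleanup_mb = physical_free_mb - reserved_mb
    return headroom_before_cleanup_mb < loop_free_mb


# Hypothetical cell: 40 GB free on /dev/sdb2, 20 GB free on /dev/loop0.
PHYSICAL_FREE_MB = 40960
LOOP_FREE_MB = 20480

for reserved_mb in (15360, 25600):
    fires = cleanup_fires_before_loop_full(PHYSICAL_FREE_MB, LOOP_FREE_MB, reserved_mb)
    print(f"reserved={reserved_mb} MB -> cleanup fires first: {fires}")
```

In this hypothetical case, a reserve of 15360 MB leaves 25600 MB of headroom before cleanup, which is more than the 20480 MB left on the loopback store, so the store fills first. With 25600 MB reserved, the headroom shrinks to 15360 MB and cleanup starts while the store still has room. The right value for a real cell depends on its own disk sizes.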