Pods evicted due to node low on ephemeral-storage error

Article ID: 394739

Products

  • VCF Automation
  • VMware Cloud Foundation

Issue/Introduction

Some pods might crash and be evicted when a node runs low on ephemeral storage (disk space).

Environment

  • VMware Cloud Foundation 9.0
  • VMware Identity Broker 9.0
  • VCF Automation 9.0

Cause

If one or more pods running on a node produce a large volume of logs while the node's disk usage is already high, the node can run out of ephemeral storage before the periodic cleanup runs. The pods are then evicted.
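To confirm that evictions were caused by low ephemeral storage, the pod list can be filtered on the standard Kubernetes pod status fields. The following is a minimal sketch, assuming you have kubectl access to the affected cluster and have saved the output of `kubectl get pods --all-namespaces -o json`. The function name and the sample pod data are hypothetical and only illustrate the shape of the input.

```python
def find_ephemeral_storage_evictions(pod_list):
    """Return 'namespace/name' for pods evicted because of low ephemeral storage.

    pod_list is the parsed JSON of `kubectl get pods --all-namespaces -o json`.
    """
    evicted = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        # Evicted pods carry status.reason == "Evicted".
        if status.get("reason") != "Evicted":
            continue
        # The eviction message names the resource the node ran low on.
        if "ephemeral-storage" not in status.get("message", ""):
            continue
        meta = pod.get("metadata", {})
        evicted.append(f"{meta.get('namespace')}/{meta.get('name')}")
    return evicted


# Hypothetical sample input; the pod names are illustrative, not real components.
SAMPLE_POD_LIST = {
    "items": [
        {
            "metadata": {"namespace": "example-ns", "name": "example-pod-a"},
            "status": {
                "phase": "Failed",
                "reason": "Evicted",
                "message": "The node was low on resource: ephemeral-storage.",
            },
        },
        {
            "metadata": {"namespace": "example-ns", "name": "example-pod-b"},
            "status": {"phase": "Running"},
        },
    ]
}

print(find_ephemeral_storage_evictions(SAMPLE_POD_LIST))
```

With real data, load the saved kubectl output with `json.load` and pass the result to the function.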

Resolution

If this occurs, use the Fleet Manager APIs to increase the size of the disk that the object store uses for support bundle data.

Method: POST

URL: https://<FleetManager hostname>/lcm/lcops/api/environments/<environment ID>/products/<product ID>/actions/invoke 

Payload:

{
  "name": "configure-packages",
  "properties": {
    "namespace": "vmsp-platform",
    "name": "vmsp-platform",
    "values": {"profiles": {"overrides": {"supportBundle": {"logOffloader": {"s3BucketRequests": {"size": "<value>"}}}}}}
  },
  "ref": "/webhooks/core/vmsp/configure"
}
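The request above could be assembled in a script along the following lines. This is a sketch under stated assumptions, not a definitive client: the function name `build_resize_request` is hypothetical, the `Content-Type` header is assumed, and the article does not describe authentication, so any authentication headers must be supplied by the caller.

```python
import json
import urllib.request


def build_resize_request(fleet_manager_host, environment_id, product_id,
                         size, auth_headers=None):
    """Build (but do not send) the POST request that sets the object store size."""
    url = (
        f"https://{fleet_manager_host}/lcm/lcops/api/environments/"
        f"{environment_id}/products/{product_id}/actions/invoke"
    )
    payload = {
        "name": "configure-packages",
        "properties": {
            "namespace": "vmsp-platform",
            "name": "vmsp-platform",
            "values": {
                "profiles": {"overrides": {"supportBundle": {"logOffloader": {
                    "s3BucketRequests": {"size": size}
                }}}}
            },
        },
        "ref": "/webhooks/core/vmsp/configure",
    }
    # Assumption: the API accepts a JSON body declared as application/json.
    headers = {"Content-Type": "application/json"}
    # Assumption: authentication is handled through caller-supplied headers.
    headers.update(auth_headers or {})
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

To send the request, pass the returned object to `urllib.request.urlopen`, after substituting the real host name, environment ID, product ID, and authentication headers for your deployment.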

Here, <value> depends on the profile used. The following sizes are the minimum recommended values; larger values can also be used:

Profile   Size
small     83Gi
medium    108Gi
large     183Gi
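Before submitting a size, it can be checked against the recommended minimum for the profile. The sketch below is illustrative only: the helper names are hypothetical, and the parser is deliberately simplified to whole numbers with the binary suffixes Ki, Mi, Gi, and Ti rather than the full Kubernetes quantity syntax.

```python
# Minimum recommended sizes per profile, taken from the table above.
MIN_SIZE_BY_PROFILE = {"small": "83Gi", "medium": "108Gi", "large": "183Gi"}

# Simplified: only whole numbers with binary suffixes are supported.
_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_size(quantity):
    """Convert a quantity such as '83Gi' to a number of bytes."""
    number, suffix = quantity[:-2], quantity[-2:]
    if suffix not in _BINARY_SUFFIXES or not number.isdigit():
        raise ValueError(f"unsupported size: {quantity!r}")
    return int(number) * _BINARY_SUFFIXES[suffix]


def meets_minimum(profile, size):
    """Return True if size is at least the recommended minimum for profile."""
    if profile not in MIN_SIZE_BY_PROFILE:
        raise ValueError(f"unknown profile: {profile!r}")
    return parse_size(size) >= parse_size(MIN_SIZE_BY_PROFILE[profile])
```

For example, `meets_minimum("medium", "108Gi")` is true, while `meets_minimum("medium", "100Gi")` is false.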

 

Additional Information

Pod eviction usually causes a temporary degradation of services, followed by recovery. However, if the disk size is not increased, either proactively or as soon as the problem is observed, generated support bundles might be missing log data for the period of degradation and recovery.