Starting Container Count and LRP Auctions Spike During TPCF BBR Backup


Article ID: 392385



Products

VMware Tanzu Application Service

Issue/Introduction

During a BBR backup, you may see a large spike in starting containers and LRP auctions. The spike can push the foundation over the maximum in-flight container start limit, in which case app starts are delayed and show the following message:

Error starting instances: 'waiting to start instance: reached in-flight start limit'

The BBR backup itself may also time out as a result, because it tries to start some system apps, such as the autoscaler or usage service.

Cause

This can be caused by crashing apps. If a significant number of apps are crashing, many of them may try to start at the same time after the backup completes (specifically, when the Cloud Controllers are unlocked).

We have observed the following sequence:

1. Some apps are continuously crashing and being restarted by Diego.
2. The Cloud Controller is locked for the BBR backup.
3. The apps crash and try to start again.
4. Diego cannot download the droplet because the Cloud Controller is locked.
5. The crashed apps keep crashing because the droplet is missing.
6. The Cloud Controller is unlocked after the backup.
7. A large number of apps now try to start at the same time.
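
The sequence above can be approximated with a toy model. The sketch below is only an illustration, not how Diego schedules work internally: it assumes every crashing instance requests a start at the moment of unlock and that at most the in-flight limit is admitted at once. The function name and the numbers are hypothetical.

```python
from typing import Tuple


def starts_after_unlock(crashing_instances: int, inflight_limit: int) -> Tuple[int, int]:
    """Return (admitted, delayed) start requests at the moment of unlock.

    Toy model (assumption): all crashing instances request a start at once,
    and at most `inflight_limit` starts are admitted; the rest must wait.
    """
    if crashing_instances < 0 or inflight_limit < 0:
        raise ValueError("counts must be non-negative")
    admitted = min(crashing_instances, inflight_limit)
    delayed = crashing_instances - admitted
    return admitted, delayed


# Hypothetical numbers: 350 crashing instances against a limit of 200.
admitted, delayed = starts_after_unlock(350, 200)
print(f"admitted={admitted} delayed={delayed}")  # admitted=200 delayed=150
```

In this model, any instances beyond the limit are the ones that would report "reached in-flight start limit".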

Resolution

There are two ways to resolve this issue:

1. Recommended: Fix or stop the apps that are continuously crashing.
2. Increase the maximum number of starting containers. Be aware that setting this value too high can overload the Diego Cells during a cold start.
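
To find candidates for the recommended option, one approach is to count crashed instances per app process. The sketch below assumes you have saved the JSON returned by `cf curl /v3/processes/<process-guid>/stats` and that it contains a `resources` list whose entries carry a `state` field. The helper name and the sample payload are made up for illustration.

```python
import json


def count_crashed(stats_json: str) -> int:
    """Count instances reported as CRASHED in a process stats response.

    Assumption: the payload has a top-level "resources" list and each entry
    has a "state" string such as "RUNNING" or "CRASHED".
    """
    stats = json.loads(stats_json)
    return sum(
        1 for instance in stats.get("resources", [])
        if instance.get("state") == "CRASHED"
    )


# Hypothetical, trimmed-down payload for a process with three instances.
sample = json.dumps({
    "resources": [
        {"index": 0, "state": "RUNNING"},
        {"index": 1, "state": "CRASHED"},
        {"index": 2, "state": "CRASHED"},
    ]
})
print(count_crashed(sample))  # 2
```

Apps whose processes repeatedly show crashed instances are the ones to fix or stop before the next backup window.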

The following metrics may help you track the number of starting containers and crashing apps:

origin: rep, statsd metric name: StartingContainerCount
origin: bbs, statsd metric name: CrashedActualLRPs
origin: cc, statsd metric name: tasks_running.count
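
As a rough illustration of how these metrics could be used, the sketch below sums the latest StartingContainerCount value from each Diego Cell and compares the total with the configured in-flight start limit. It assumes you have already pulled the latest per-cell values from your metrics system. The cell names, values, and the limit of 200 are hypothetical.

```python
from typing import Dict


def starting_headroom(starting_by_cell: Dict[str, int], inflight_limit: int) -> int:
    """Return how many more container starts fit under the in-flight limit.

    A negative result means the summed StartingContainerCount already
    exceeds the limit.
    """
    total_starting = sum(starting_by_cell.values())
    return inflight_limit - total_starting


# Hypothetical latest samples, one per Diego Cell.
latest = {"cell-0": 12, "cell-1": 30, "cell-2": 8}
print(starting_headroom(latest, 200))  # 150
```

Little or no headroom shortly before a backup suggests that crashing apps are already consuming most of the start capacity.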

Additional Information

https://docs.cloudfoundry.org/running/all_metrics.html
