Applications facing Graceful shutdown exceeding timeout issue

Article ID: 298320

Updated On:

Products

VMware Tanzu Application Service for VMs

Issue/Introduction

TAS for VMs requests a shutdown of your app instance in the following scenarios:

  • When a user runs cf scale, cf stop, cf push, cf delete, or cf restart-app-instance

  • As a result of a system event, such as the replacement procedure during Diego Cell evacuation or when an app instance stops because of a failed health check probe

To shut down the app, TAS for VMs sends a SIGTERM to the app process in the container. By default, the process has ten seconds to shut down gracefully. If it exits within that interval, the following line appears in the logs:

2024-01-11T11:41:46.65+0100 [APP/PROC/WEB/0] OUT Exit status 143


If the process has not exited after ten seconds, TAS for VMs sends a SIGKILL, and the following line appears in the logs:

 

2023-09-04T17:35:35.080312285Z APP/PROC/WEB/14  OUT Exit status 137 (exceeded 10s graceful shutdown interval)
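
The two exit statuses above follow the common convention of reporting a signal-terminated process as 128 plus the signal number. As a small illustration (the helper name describe_exit_status is hypothetical and not part of TAS for VMs or the cf CLI), the arithmetic can be sketched in Python as:

```python
import signal


def describe_exit_status(status: int) -> str:
    """Explain an exit status using the convention 128 + signal number."""
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # No signal with that number exists on this platform.
            return f"exit status {status} (no matching signal)"
        return f"terminated by {name} (128 + {signum})"
    return f"exited normally with status {status}"


print(describe_exit_status(143))  # SIGTERM is signal 15 on Linux
print(describe_exit_status(137))  # SIGKILL is signal 9 on Linux
```

On Linux, 143 therefore maps to SIGTERM (graceful shutdown) and 137 maps to SIGKILL (the shutdown interval was exceeded).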

 

In other words, by default an app must finish its in-flight work within ten seconds of receiving the SIGTERM, before TAS for VMs terminates it with a SIGKILL. For instance, a web app must stop accepting new requests and finish processing existing ones.
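
As a rough illustration of this behaviour, the following Python sketch installs a SIGTERM handler that only records the shutdown request, so that the worker loop can finish its current job and decline new ones. The names shutdown_requested, handle_sigterm, and process_jobs are hypothetical, the "work" is a placeholder, and the script signals itself purely to simulate the platform's SIGTERM. A real app would apply the same idea through its own framework's shutdown hooks.

```python
import os
import signal
import threading

# Set when SIGTERM arrives; worker code checks it before starting new work.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Record the shutdown request; keep the handler itself short."""
    shutdown_requested.set()


def process_jobs(jobs):
    """Process jobs in order, but start no new job once shutdown is requested."""
    done = []
    for job in jobs:
        if shutdown_requested.is_set():
            break
        done.append(job.upper())  # placeholder for real work
    return done


# Must be called from the main thread of the process.
signal.signal(signal.SIGTERM, handle_sigterm)

before = process_jobs(["a", "b"])

# Demo only: simulate the platform's SIGTERM by signalling our own process.
os.kill(os.getpid(), signal.SIGTERM)
shutdown_requested.wait(timeout=5)

after = process_jobs(["c"])
print(before, after)
```

After the loop returns, a real app would flush its buffers, close its connections, and exit well within the ten-second interval.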

Environment

Product Version: 2.11

Resolution

To find out where the app is spending time during shutdown, use the strace command as follows:

  1. Follow this KB article to get the OS-level PID of the app instance.
  2. As root on the Diego Cell VM (diego_cell), run strace -tt -T -fp <PID> -o /tmp/strace.txt.

Then analyse /tmp/strace.txt to see where the app is spending time.

This article explains how to capture strace output rather than how to read it in detail. That said, one system call to focus on is write, which writes data to a file descriptor. For example:

3085078 10:38:34.628573 write(4, "2023-09-13 10:38:34.630 UTC [000"..., 87) = 87 <0.000030>


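The line above shows the app writing 87 bytes of log output to file descriptor 4. The value in angle brackets, added by the -T option, is the time the call took in seconds.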
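
To find the calls that take the longest, the captured file can be sorted by the elapsed time that -T appends to each line. The following Python sketch is one possible way to do that. The function name slowest_syscalls is hypothetical, and the regular expression assumes the line layout shown above (PID, -tt timestamp, call, elapsed time), plus strace's "<... name resumed>" form for calls that were interrupted by another thread's output.

```python
import re

# Assumed layout: "<pid> <HH:MM:SS.usec> <call>(...) = <ret> <elapsed>"
# or, for a call that blocked: "<pid> <time> <... name resumed> ... <elapsed>".
LINE_RE = re.compile(
    r"^(?P<pid>\d+)\s+(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?:(?P<syscall>\w+)\(|<\.\.\. (?P<resumed>\w+) resumed>)"
    r".*<(?P<elapsed>\d+\.\d+)>\s*$"
)


def slowest_syscalls(lines, top=10):
    """Return up to `top` (elapsed_seconds, pid, time, syscall) tuples, slowest first."""
    calls = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # signals, exit notices, and "<unfinished ...>" fragments
        name = match["syscall"] or match["resumed"]
        calls.append((float(match["elapsed"]), int(match["pid"]), match["time"], name))
    return sorted(calls, reverse=True)[:top]


sample = [
    '3085078 10:38:34.628573 write(4, "2023-09-13 10:38:34.630 UTC [000"..., 87) = 87 <0.000030>',
]
print(slowest_syscalls(sample))
```

To run it against a real capture, pass the open file instead of the sample list, for example slowest_syscalls(open("/tmp/strace.txt")). Calls with a large elapsed time close to the SIGTERM are a reasonable place to start looking.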

If you need help analysing the output, please open a support request.