Admin Observes "Admin" Issued Stop on an Application Without Root Cause

Article ID: 298067


Products

VMware Tanzu Application Service for VMs

Issue/Introduction

Admins or Developers may observe their application receive an "Admin Stop" without issuing a request to stop the application themselves.

An application is stopped under the admin user when an API call to stop the app is made with admin credentials (for example, via Apps Manager or the cf CLI). The application logs (cf logs appname) then show a message like the one below:

Updated app with guid GUID_ID ({"state"=>"STOPPED"})

Regardless of how the API is called, if a user or script with admin privileges sends the request, the app is shut down and the stop is reported as "Stopped App - Admin".

When the foundation or a Diego cell stops the application automatically, the application logs report the problem and the reason the app is being shut down, similar to the following:

2020-07-06T11:58:40.968-04:00 [HEALTH/0] [ERR] Failed to make HTTP request to '/healthcheck' on port 8080: timed out after 1.00 seconds 
2020-07-06T11:58:40.968-04:00 [CELL/0] [OUT] Container became unhealthy 
2020-07-06T11:58:40.972-04:00 [APP/PROC/WEB/0] [OUT] 2020-07-06 15:58:40.970 INFO [spring-cloud-service-broker,,,] 15 --- [ Thread-16] ationConfigEmbeddedWebApplicationContext : Closing
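
The distinction above can be sketched as a small log filter. This is a minimal illustration that relies only on the sample messages quoted in this article; the function name classify_line and the marker list are hypothetical, and real logs may use other wording.

```python
import re

# Pattern for the API-driven stop message quoted in this article.
API_STOP = re.compile(r'Updated app with guid \S+ \(\{"state"=>"STOPPED"\}\)')

# Phrases from the sample health-check failure above (assumed, not exhaustive).
PLATFORM_MARKERS = ("Container became unhealthy", "Failed to make HTTP request")


def classify_line(line: str) -> str:
    """Return 'api_stop', 'platform', or 'other' for a single log line."""
    if API_STOP.search(line):
        return "api_stop"
    if any(marker in line for marker in PLATFORM_MARKERS):
        return "platform"
    return "other"


if __name__ == "__main__":
    sample = [
        'Updated app with guid GUID_ID ({"state"=>"STOPPED"})',
        "2020-07-06T11:58:40.968-04:00 [CELL/0] [OUT] Container became unhealthy",
    ]
    for line in sample:
        print(classify_line(line), "|", line)
```

Saved output from cf logs appname --recent could be fed through the same function line by line.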



Symptoms:
The application's event log shows it stopped without any admin knowingly prompting the request to stop the app. 

time                          event                      actor                    description
2020-07-02T13:16:54.00-0400   audit.app.start            [email protected]
2020-06-30T18:02:07.00-0400   audit.app.update           admin                    state: STOPPED
2020-06-30T10:17:02.00-0400   audit.app.start            [email protected]
2020-06-30T10:01:46.00-0400   audit.app.update           admin                    state: STOPPED

A STOPPED state attributed to the admin user does not mean the platform stopped the app. It means that someone, or some script, running as the admin user made a direct call to stop the app.
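
To pull the stop entries out of a longer event listing, a sketch along these lines may help. It assumes the column layout shown in the table above (columns separated by two or more spaces); the function name stopped_updates and the actor dev-user in the sample are hypothetical.

```python
import re


def stopped_updates(events_text: str):
    """Return (time, actor) for each audit.app.update row with state: STOPPED."""
    rows = []
    for line in events_text.splitlines():
        # Columns are assumed to be separated by runs of two or more spaces.
        cols = re.split(r"\s{2,}", line.strip())
        if len(cols) < 4 or cols[1] != "audit.app.update":
            continue
        time, _event, actor, description = cols[:4]
        if "state: STOPPED" in description:
            rows.append((time, actor))
    return rows


if __name__ == "__main__":
    sample = """\
time                          event                      actor                    description
2020-07-02T13:16:54.00-0400   audit.app.start            dev-user
2020-06-30T18:02:07.00-0400   audit.app.update           admin                    state: STOPPED
"""
    for time, actor in stopped_updates(sample):
        print(time, actor)
```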

Cause

To help identify where the requests are originating:

cf CLI:

  • cf logs <appname> - Check the application logs for the reason behind the stop. Use the log examples above to confirm that the application itself is not crashing.
  • cf curl /v2/events - Query the /v2/events endpoint with the cf CLI for more detail on the admin stop request. The output will be similar to the excerpt below.
"entity": {
  "type": "audit.app.update",
  "actor": "GUID",
  "actor_type": "user",
  "actor_name": "admin",
  "actor_username": "admin",
  "actee": "GUID",
  "actee_type": "app",
  "actee_name": "appmon-broker-1.4.4",
  "timestamp": "2020-07-07T14:10:45Z",
  "metadata": {"request": {"state": "STOPPED"}},
  "space_guid": "GUID",
  "organization_guid": "GUID"
}
{
  "metadata": {
    "guid": "GUID",
    "url": "/v2/users/GUID",
    "created_at": "2019-10-04T18:15:30Z",
    "updated_at": "2020-01-23T14:34:23Z"
  }
}

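When the /v2/events output is large, the stop requests can be filtered out of a saved response. The sketch below assumes the response was saved to a file (for example, cf curl /v2/events > events.json) and that it uses the usual v2 layout of a top-level "resources" list whose items each carry an "entity" object like the one shown above. The function name admin_stop_events is hypothetical.

```python
import json


def admin_stop_events(payload: str, app_name=None):
    """Return audit.app.update events whose request set state to STOPPED."""
    doc = json.loads(payload)
    hits = []
    for resource in doc.get("resources", []):
        entity = resource.get("entity", {})
        request = (entity.get("metadata") or {}).get("request") or {}
        if entity.get("type") != "audit.app.update":
            continue
        if request.get("state") != "STOPPED":
            continue
        if app_name and entity.get("actee_name") != app_name:
            continue
        hits.append(
            {
                "timestamp": entity.get("timestamp"),
                "actor": entity.get("actor"),
                "actor_name": entity.get("actor_name"),
                "actee": entity.get("actee"),
                "actee_name": entity.get("actee_name"),
            }
        )
    return hits


if __name__ == "__main__":
    import sys

    # Usage (assumed file name): python stops.py events.json
    with open(sys.argv[1]) as handle:
        for hit in admin_stop_events(handle.read()):
            print(hit)
```

The /v2/events endpoint returns results in pages, so a single saved response may not contain every event.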

Go Router and Cloud Controller:

  • /var/vcap/sys/logs/gorouter/access.log - Search the foundation's Gorouter access.log to help determine where the request originated. Using the app's GUID, look for a line similar to the one below. The word "stop" appears only in the request body, which is not recorded in this log.
PUT /v2/apps/<GUID>?async=true&inline-relations-depth=1
  • Next, take the X-Vcap-Request-Id value for that request from the Gorouter access log and grep for that ID in the Cloud Controller logs.
  • Check the "x_forwarded_for" field to see which client the request was forwarded for.
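
The Gorouter steps above can be sketched as a short script. It assumes that each access.log line contains the quoted request line followed by key:"value" fields such as x_forwarded_for and vcap_request_id. The sample line in the code is invented for illustration, and the function name find_app_updates is hypothetical, so verify the field names against your own log before relying on it.

```python
import re

# Quoted request line, e.g. "PUT /v2/apps/<GUID>?async=true HTTP/1.1".
REQUEST = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*"')
# Trailing fields in key:"value" form (layout assumed; check your log).
FIELD = re.compile(r'(\w+):"([^"]*)"')


def find_app_updates(lines, app_guid):
    """Return (vcap_request_id, x_forwarded_for) for PUTs to /v2/apps/<app_guid>."""
    prefix = f"/v2/apps/{app_guid}"
    hits = []
    for line in lines:
        match = REQUEST.search(line)
        if not match or match["method"] != "PUT":
            continue
        if not match["path"].startswith(prefix):
            continue
        fields = dict(FIELD.findall(line))
        hits.append((fields.get("vcap_request_id"), fields.get("x_forwarded_for")))
    return hits


if __name__ == "__main__":
    # Invented sample line; values are placeholders.
    sample = (
        'api.example.com - [2020-06-30T14:01:46.473+0000] '
        '"PUT /v2/apps/APP-GUID?async=true&inline-relations-depth=1 HTTP/1.1" '
        '201 21 3000 "-" "cf-client" "10.0.0.5:41234" "10.0.1.10:9022" '
        'x_forwarded_for:"203.0.113.7, 10.0.0.5" vcap_request_id:"REQ-ID"'
    )
    print(find_app_updates([sample], "APP-GUID"))
```

Each returned request ID can then be searched for in the Cloud Controller logs as described above.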

Diego Cell:

  • The stop in the application logs should look similar to "2020-06-30T10:01:46.473-04:00 [CELL/0] [OUT] Cell <GUID> stopping instance GUID". Take note of the Cell GUID (the value immediately after "Cell") for that instance.
  • Using that Cell GUID and the application's GUID, you can investigate what is happening to the application on that Diego cell.
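
To collect the Cell GUIDs from saved application logs, a minimal sketch follows. It matches only the message format quoted above; the function name cells_that_stopped and the GUID values in the sample are placeholders.

```python
import re

# Matches e.g. "[CELL/0] [OUT] Cell <cell-guid> stopping instance <instance-guid>".
STOPPING = re.compile(
    r"\[CELL/\d+\]\s+\[OUT\]\s+Cell (?P<cell>\S+) stopping instance (?P<instance>\S+)"
)


def cells_that_stopped(log_lines):
    """Return (cell_guid, instance_guid) for each 'stopping instance' line."""
    found = []
    for line in log_lines:
        match = STOPPING.search(line)
        if match:
            found.append((match["cell"], match["instance"]))
    return found


if __name__ == "__main__":
    sample = [
        "2020-06-30T10:01:46.473-04:00 [CELL/0] [OUT] "
        "Cell cell-guid-1 stopping instance instance-guid-1",
    ]
    print(cells_that_stopped(sample))
```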

Resolution

Customers should identify any scripts or services that could be using the admin account and issuing the stop requests.

NOTE: Do not use "admin" for clients or scripts. It is recommended to use dedicated users/clients to provide more specific information when tracking requests. 

If you still have not identified a root cause for the admin stop being sent to the application, engage support and provide the logs needed to investigate.



Checklist:
To help identify any environmental or infrastructure services that may be involved:

  • Do the stop requests occur at a specific time of day, or at a consistent interval?
  • Are any batch or background jobs, or backups, running at the time?
  • Is a specific application or set of applications affected?
  • Are any security applications installed that periodically turn off applications?
  • Have there been any upgrades or changes to the application?
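
The first checklist question can be checked against the event timestamps. The sketch below assumes timestamps in the format shown in the event table in this article (for example, 2020-06-30T18:02:07.00-0400); the function name stop_pattern is hypothetical.

```python
from datetime import datetime

# Timestamp layout taken from the event table in this article.
FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def stop_pattern(timestamps):
    """Return (hours_of_day, gaps) for a list of stop-event timestamps.

    hours_of_day lists the hour of each stop in chronological order;
    gaps lists the time between consecutive stops as timedelta objects.
    """
    times = sorted(datetime.strptime(ts, FORMAT) for ts in timestamps)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    hours = [t.hour for t in times]
    return hours, gaps


if __name__ == "__main__":
    hours, gaps = stop_pattern(
        ["2020-06-30T18:02:07.00-0400", "2020-06-30T10:01:46.00-0400"]
    )
    print("hours of day:", hours)
    print("gaps:", [str(gap) for gap in gaps])
```

Stops that cluster at the same hour, or gaps that repeat, may point to a scheduled job or script rather than a person.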