app-usage-server frequent failure and crashes in large TAS foundations

Article ID: 298095

Updated On:

Products

VMware Tanzu Application Service for VMs

Issue/Introduction

The app-usage-server app crashes frequently in large TAS foundations.

When it crashes, the app logs errors such as the following:

[ERR] Error during failsafe response: ActionController::InvalidAuthenticityToken
2021-03-16T16:53:46.655+05:30 [APP/PROC/WEB/1] [ERR] /home/vcap/deps/0/vendor_bundle/ruby/2.6.0/gems/actionpack-5.2.4.3/lib/action_controller/metal/request_forgery_protection.rb:211:in `handle_unverified_request'
2021-03-16T16:53:46.655+05:30 [APP/PROC/WEB/1] [ERR] /home/vcap/deps/0/vendor_bundle/ruby/2.6.0/gems/actionpack-5.2.4.3/lib/action_controller/metal/request_forgery_protection.rb:243:in `handle_unverified_request'
2021-03-16T16:53:46.655+05:30 [APP/PROC/WEB/1] [ERR] /home/vcap/deps/0/vendor_bundle/ruby/2.6.0/gems/actionpack-5.2.4.3/lib/action_controller/metal/request_forgery_protection.rb:238:in `verify_authenticity_token'
2021-03-16T16:53:46.655+05:30 [APP/PROC/WEB/1] [ERR]

 

Cause

This issue is generally caused by a very high number of task events in a large, busy environment.
Any app-usage query generally pulls the last 30 days of task events. If the environment has a very large number of task events, the query can cause the app to crash.
To check the number of task events, access the internal MySQL database for TAS and run the following query against the app_usage_service database:

SELECT COUNT(*) FROM task_events;
------------
count(*)
------------
34476892
------------



Environment

Product Version: 2.9

Resolution

Workaround

As a workaround, change the health check type of the app-usage-server app to 'port'.
If changing the health check type does not stop the crashes, the only other workaround is to purge and reseed the Cloud Controller events.
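
The health check change described above can be sketched with the cf CLI roughly as follows. This is a minimal sketch, not an official procedure: the function name `set_port_health_check` is hypothetical, and the org (`system`) and space (`app-usage-service`) are assumptions about where the app is deployed, so confirm them in your foundation before running it.

```shell
# Hypothetical helper: switch an app's health check type to "port".
# Assumption: app-usage-server runs in the "system" org and the
# "app-usage-service" space; adjust both if your foundation differs.
# The restart is included so the new health check type takes effect.
set_port_health_check() {
  app="${1:-app-usage-server}"
  cf target -o system -s app-usage-service &&
    cf set-health-check "$app" port &&
    cf restart "$app" &&
    cf get-health-check "$app"
}
```

Calling `set_port_health_check` with no argument targets app-usage-server; the final `cf get-health-check` call lets you confirm the change was applied.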
 

Permanent Fix

The R&D team is working on paginating the app-usage-server app so that it caches a smaller number of task events at a time and avoids overloading the backend queries.
Upgrade to one of these versions: 2.12.0, 2.11.5, 2.10.17, 2.9.25, 2.8.29, 2.7.37
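
To compare a foundation's current version against the list above, a small shell check along these lines may help. It is only a sketch: the function name `check_tas_version` is hypothetical, the input is assumed to be a plain MAJOR.MINOR.PATCH string, the version list is read as 2.12.0, 2.11.5, 2.10.17, 2.9.25, 2.8.29 and 2.7.37, and it assumes that later patches in the same minor line also carry the fix. It relies on `sort -V` (version sort) being available.

```shell
# Fixed patch releases listed in this article, one per minor line.
FIXED_VERSIONS="2.12.0 2.11.5 2.10.17 2.9.25 2.8.29 2.7.37"

# Prints "fixed" if the given version is at or above the listed patch for
# its minor line, "affected" if it is below, and "unknown" if the minor
# line is not in the list.
# Assumption: later patches in the same minor line also contain the fix.
check_tas_version() {
  current="$1"
  minor="${current%.*}"   # e.g. 2.9.12 -> 2.9
  for fixed in $FIXED_VERSIONS; do
    if [ "${fixed%.*}" = "$minor" ]; then
      # sort -V orders version strings; the lower version comes first.
      lowest=$(printf '%s\n%s\n' "$fixed" "$current" | sort -V | head -n 1)
      if [ "$lowest" = "$fixed" ]; then
        echo "fixed"
      else
        echo "affected"
      fi
      return 0
    fi
  done
  echo "unknown"
}
```

For example, `check_tas_version 2.9.12` reports a version below the listed 2.9 patch, while a minor line that is not in the list is reported as unknown rather than guessed.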