Loading logs in Apps Manager (TAS for VMs 2.7+) and CF CLI (6.50+) is very slow on large foundations

Article ID: 298034

Products

VMware Tanzu Application Service for VMs

Issue/Introduction

Fetching recent logs in Apps Manager can be slow. By inspecting the requests your web browser sends to the platform, you can confirm whether the request below is not responding or is taking longer than other requests. For example, in Chrome, click View > Developer > Developer Tools > Network.

Request URL:

https://apps.<SYSTEM_DOMAIN>/proxy/home/log-cache/api/v1/read/<APP_GUID>?envelope_types=LOG&envelope_types=EVENT&limit=1000
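To measure the same request outside the browser, a minimal sketch along these lines may help. The function names `log_cache_read_url` and `time_request` are hypothetical, and the sketch assumes you supply whatever authentication the request needs as extra curl arguments (for example, headers copied from the browser's Network tab).

```shell
# Build the recent-logs URL shown above for a system domain and an app GUID.
log_cache_read_url() {
  system_domain=$1
  app_guid=$2
  printf 'https://apps.%s/proxy/home/log-cache/api/v1/read/%s?envelope_types=LOG&envelope_types=EVENT&limit=1000\n' \
    "$system_domain" "$app_guid"
}

# Print the HTTP status and total response time for one request.
# Usage: time_request URL [extra curl arguments, e.g. -H 'Cookie: ...']
time_request() {
  url=$1
  shift
  # -s: no progress meter, -o /dev/null: discard the body, -w: print timing only.
  curl -s -o /dev/null -w '%{http_code} %{time_total}s\n' "$@" "$url"
}
```

Calling `time_request "$(log_cache_read_url <SYSTEM_DOMAIN> <APP_GUID>)"` with the appropriate headers gives a number you can compare before and after an upgrade.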


This problem can also be observed with CF CLI v6.50 and later when executing `cf logs --recent`. For example, a smoke-test errand that uses `cf logs --recent` to fetch app logs may time out on a large foundation. CF CLI versions up to v6.49 do not hit the problem because they fetch logs from the Loggregator traffic-controller instead of log-cache. Here is the timeout you encounter when the smoke-test errand uses `cf logs --recent`:

 Failure [75.932 seconds]
Loggregator: cf logs linux [It] can see app messages in the logs
 /var/vcap/packages/smoke_tests/src/github.com/cloudfoundry/cf-smoke-tests/smoke/logging/loggregator_test.go:39
Timed out after 30.000s.
Expected process to exit. It did not.
/var/vcap/packages/smoke_tests/src/github.com/cloudfoundry/cf-smoke-tests/smoke/logging/loggregator_test.go:42
: ------------------------------
 Summarizing 1 Failure:
[Fail] Loggregator: cf logs linux [It] can see app messages in the logs
 /var/vcap/packages/smoke_tests/src/github.com/cloudfoundry/cf-smoke-tests/smoke/logging/loggregator_test.go:42
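To check outside the errand whether `cf logs --recent` would exceed the 30-second limit seen in the output above, a sketch like the following can be used. The helper name `run_with_limit` is hypothetical, and it assumes GNU coreutils `timeout`, which exits with status 124 when the limit is reached.

```shell
# Run a command with a time limit and report how long it took.
# Usage: run_with_limit SECONDS COMMAND [ARGS...]
run_with_limit() {
  limit=$1
  shift
  start=$(date +%s)
  # Discard the command's output; only the duration matters here.
  timeout "$limit" "$@" >/dev/null 2>&1
  status=$?
  elapsed=$(( $(date +%s) - $start ))
  if [ "$status" -eq 124 ]; then
    echo "timed out after ${limit}s"
  else
    echo "exit ${status} after ${elapsed}s"
  fi
  return "$status"
}

# Example (requires a logged-in cf CLI; APP_NAME is a placeholder):
#   run_with_limit 30 cf logs APP_NAME --recent
```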


A related issue in Apps Manager is that many individual requests are sent for a single attempt to load the recent logs for an app. For example:

https://apps.<SYSTEM_DOMAIN>/proxy/home/log-cache/api/v1/read/a8671b4f-1e8b-463a-aacb-ecc91bc94e43?envelope_types=LOG&envelope_types=EVENT&limit=1000&start_time=1580937423622548241
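When comparing the individual requests, it can help to read the `start_time` parameter as a date. The sketch below assumes the value is nanoseconds since the Unix epoch and that GNU `date` is available; the function name `start_time_to_utc` is hypothetical.

```shell
# Convert a start_time value (assumed to be nanoseconds since the Unix epoch) to UTC.
start_time_to_utc() {
  ns=$1
  # Integer-divide by 10^9 to get whole seconds; GNU date accepts "@SECONDS".
  date -u -d "@$(( $ns / 1000000000 ))" '+%Y-%m-%dT%H:%M:%SZ'
}

# The value from the request above:
#   start_time_to_utc 1580937423622548241   # prints 2020-02-05T21:17:03Z
```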


Environment

Product Version: 2.7

Resolution

Since Tanzu Application Service (TAS) for VMs v2.7.0, Apps Manager loads recent logs from log-cache instead of the firehose. Two issues have come up since this change:

  • In TAS for VMs 2.7.0 through 2.7.9 and TAS for VMs 2.8.0 through 2.8.3, the method used to fetch recent logs for apps that have a lot of logs can result in many requests being sent to Log Cache. The recent logs are not shown until all of these requests have finished, which can cause a delay. This is fixed in TAS for VMs 2.7.10+ and TAS for VMs 2.8.4+. If you are experiencing this issue, please upgrade to get the fix.

  • In TAS for VMs 2.7.0 through 2.7.13 and TAS for VMs 2.8.0 through 2.8.7, the log-cache component itself performs the resource permission check. In foundations with thousands of apps, this generates a large number of Cloud Controller API requests. Loading recent logs for a single app has been observed to take 20+ seconds in some large environments. This is fixed in TAS for VMs 2.7.14+ and TAS for VMs 2.8.8+. If you are experiencing this issue, please upgrade to get the fix.
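The version ranges above can be checked with a small sketch like the one below. The function name `affected_issues` and the labels it prints are hypothetical, and it assumes a plain MAJOR.MINOR.PATCH version string with no build suffix. Versions outside the 2.7 and 2.8 lines are reported as not covered, because this article does not describe them.

```shell
# Print which of the two issues affect a given TAS for VMs version.
# Usage: affected_issues 2.7.11
affected_issues() {
  version=$1
  minor=${version%.*}   # e.g. 2.7
  patch=${version##*.}  # e.g. 11
  # First fixed patch release for each issue, taken from the ranges above.
  case $minor in
    2.7) many_fix=10; perm_fix=14 ;;
    2.8) many_fix=4; perm_fix=8 ;;
    *) echo "not covered by this article"; return 0 ;;
  esac
  result=""
  if [ "$patch" -lt "$many_fix" ]; then
    result="many-requests"
  fi
  if [ "$patch" -lt "$perm_fix" ]; then
    result="${result:+$result }permission-check"
  fi
  echo "${result:-none}"
}
```

For example, `affected_issues 2.7.11` prints `permission-check`, since 2.7.11 has the fix for the first issue but not the second.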

When the TAS for VMs smoke-test includes CF CLI v6.50 or above, it might fail because `cf logs --recent` times out as a result of this issue. If for some reason you cannot upgrade to a fixed release, a temporary workaround to pass the smoke-test is to grant the smoke_tests user the `doppler.firehose` scope, which skips the user permission check performed by log-cache.

  1. For UAAC usage, refer to the documentation "Creating and Managing Users with the UAA CLI (UAAC)".
  2. Target the TAS UAA endpoint: `uaac target uaa.<SYSTEM_DOMAIN>`.
  3. Log in with the admin client: `uaac token client get admin -s <ADMIN_CLIENT_SECRET>`. The ADMIN_CLIENT_SECRET value is available at "Ops Manager > TAS > Credentials > UAA > Admin Client Credentials".
  4. Grant the smoke_tests user the doppler.firehose scope: `uaac member add doppler.firehose smoke_tests`.
  5. Rerun the TAS smoke-test errand. It should succeed, and `cf logs --recent` should no longer time out.
  6. Remove the scope after the errand runs successfully: `uaac member delete doppler.firehose smoke_tests`.
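Because the scope is meant to be temporary, it can be useful to wrap the grant and removal steps so the scope is removed even if the errand fails. The following is a minimal sketch, not an official procedure. The function name `with_firehose_scope` is hypothetical, and it assumes the `uaac target` and `uaac token client get admin` steps have already been run in the same session.

```shell
# Grant doppler.firehose to the smoke_tests user, run the given command,
# then remove the scope again whether or not the command succeeded.
# Usage: with_firehose_scope COMMAND [ARGS...]
with_firehose_scope() {
  # Stop early if the grant fails, so the command never runs without the scope.
  uaac member add doppler.firehose smoke_tests || return 1
  "$@"
  status=$?
  # Always clean up; the exit status of the wrapped command is preserved.
  uaac member delete doppler.firehose smoke_tests
  return "$status"
}
```

Pass whatever command you use to rerun the smoke-test errand as the arguments, for example a local wrapper script of your own.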