Scaling Cartographer's parameters increased API server CPU usage

Article ID: 393810

Products

VMware Tanzu Application Platform

Issue/Introduction

After tuning Cartographer's concurrency parameters as described in our Tanzu Application Platform tech docs, specifically the following parameters:

  • max_workloads
  • max_deliveries
  • max_runnables

you notice that control plane CPU usage also increases.

 

Resolution

Increasing Cartographer concurrency leads to more control plane API calls. This is expected behaviour and is the likely cause of the increase in CPU usage. Cartographer's job is to create or update objects, read the results from those objects, and then create or update more objects with those results.

Here is a sample series of API calls:

  1. Cartographer updates a kpack Image spec.
  2. The kpack controller reacts to that updated object by creating a kpack Build object.
  3. Eventually that build completes and the kpack Image status is updated.
  4. Cartographer reacts to that by updating the associated ImageVulnerabilityScan object.
  5. Then the scan controller reacts.
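
The chain above can be sketched as a small counting model, treating each create or update as at least one API call. This is an illustration only, not Cartographer code: the `CHAIN` list, the function names, and the attribution of step 3 to the kpack controller are assumptions made for the example. Step 5 is left out because the article does not say which objects the scan controller writes.

```python
from collections import Counter

# Illustrative model only; not taken from Cartographer or kpack source code.
# Each entry is (actor, verb, object) for one explicit create/update in the
# sample chain. Attributing the Image status update to the kpack controller
# is an assumption.
CHAIN = [
    ("cartographer", "update", "kpack Image spec"),
    ("kpack controller", "create", "kpack Build"),
    ("kpack controller", "update", "kpack Image status"),
    ("cartographer", "update", "ImageVulnerabilityScan"),
]


def min_write_calls(workloads: int) -> int:
    """Lower bound on API write calls when `workloads` each run the chain once."""
    return workloads * len(CHAIN)


def writes_by_actor(workloads: int) -> Counter:
    """Split the lower bound by the component that issues each write."""
    counts = Counter()
    for actor, _verb, _obj in CHAIN:
        counts[actor] += workloads
    return counts
```

In this model every write made by Cartographer is matched by writes from the controllers that react to it, so the API server load grows with the whole chain and not only with Cartographer's own calls. Reads and watches add further calls that the sketch does not count.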

Each time an object is created or updated, that's at least one API call. With increased concurrency, those calls can arrive as a burst, which shows up as a burst of CPU usage. Two sorts of bursts are expected:

  • Most significant would be an update to buildpacks, since that could kick off updates to the builds for many workloads all at the same time.
  • A less likely cause would be human synchronization. For example, if every team had a practice of committing work at the end of the day, you could see TAP testing, building, scanning, etc. all at once. Cartographer would then be able to propagate more of that work at the same time.
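
A rough way to reason about such bursts is sketched below. It is a simplified model under stated assumptions, not a measurement: it treats `max_workloads` as the number of workloads reconciled at the same time, processes the affected workloads in equal-sized waves, and uses a placeholder of 4 write calls per workload. The function name and all numbers are hypothetical.

```python
import math


def burst_profile(affected: int, max_workloads: int, writes_per_workload: int = 4) -> dict:
    """Simplified burst model (assumption: work proceeds in waves of size max_workloads).

    affected            -- workloads touched by one event, e.g. a buildpack update
    max_workloads       -- assumed number of workloads reconciled concurrently
    writes_per_workload -- placeholder lower bound on write calls per workload
    """
    if affected < 0 or max_workloads < 1 or writes_per_workload < 0:
        raise ValueError("affected and writes_per_workload must be >= 0, max_workloads >= 1")
    return {
        # Write calls issued together in the largest wave.
        "peak_writes_per_wave": min(affected, max_workloads) * writes_per_workload,
        # Number of waves needed to get through all affected workloads.
        "waves": math.ceil(affected / max_workloads),
        # Total work does not depend on the concurrency setting.
        "total_writes": affected * writes_per_workload,
    }


low = burst_profile(affected=200, max_workloads=2)
high = burst_profile(affected=200, max_workloads=25)
```

With these hypothetical inputs the total number of writes is the same in both cases, but the higher setting packs them into fewer, larger waves. That concentration is what the control plane sees as a CPU burst.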

It is recommended to scale the control plane. Control plane instances should be scaled vertically (more powerful instances) before they are scaled horizontally (more instances).
For more information on scaling control plane instances, please refer to this link