SaltStack Enterprise Performance Configurations



Article ID: 312987



Products

VMware Aria Suite

Issue/Introduction

This guide provides details on tweaking the SaltStack Enterprise Configuration for performance considerations.

 

Concurrency 

max_processes: 8
num_processes: 0
background_workers:
  combined_process: true
  concurrency: 0

By default, SaltStack Enterprise (SSE) calculates the number of web workers and background workers from the CPU core count, up to a maximum of 8. To go above that limit, raise max_processes or configure each process count individually: num_processes sets the number of web workers started, and background_workers:concurrency sets the number of background workers. When web request load is high, you may need more web workers; if you run many schedules, you may need more background workers. To determine whether you need more workers of either kind, monitor CPU usage and observe how busy each set of processes is. Web workers are labeled [Webserver] and background workers are labeled [celeryd].

SSE typically starts both process types automatically. To split them into separate services, set background_workers:combined_process to false. When this is done, the raas command starts only web workers, and the raas worker command starts only background workers.
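As an illustration, a raas configuration for a larger host that raises the worker cap and splits the two process types might look like the following sketch; the counts are examples only and should be tuned against your own workload:

```yaml
# Allow worker counts above the default cap of 8 (example value)
max_processes: 16

# Start 12 web workers ([Webserver] processes)
num_processes: 12

background_workers:
  # Run the two process types separately:
  # `raas` starts only web workers, `raas worker` only background workers
  combined_process: false
  # Start 6 background workers ([celeryd] processes)
  concurrency: 6
```

With combined_process set to false, the web tier and the background tier can be scaled and restarted independently.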

 

Worker limitations 

webserver_max_memory: 0
webserver_max_time: 0
webserver_body_timeout: 
webserver_max_body_size: 
webserver_max_buffer_size: 
background_workers:
  max_tasks: 100000 
  max_memory: 0

Use webserver_max_memory and background_workers:max_memory to limit memory usage. This mitigates memory leaks sometimes present in library dependencies on unpatched systems. If a worker exceeds the limit, it is restarted. background_workers:max_tasks has a similar effect, restarting a background worker after it has completed the configured number of tasks.

webserver_max_time, webserver_body_timeout, webserver_max_body_size, and webserver_max_buffer_size all limit the time or size allocated to a single web request. If a limit is exceeded, the request is dropped. This is useful for mitigating denial-of-service attacks.
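For example, a sketch that recycles leaky workers and caps request lifetime; the values are illustrative, and the units expected for the memory and time settings should be confirmed against the product documentation for your SSE version:

```yaml
# Restart a web worker that exceeds this memory limit (0 = unlimited)
webserver_max_memory: 1048576
# Restart a web worker after it has run this long (0 = unlimited)
webserver_max_time: 3600
background_workers:
  # Recycle a background worker after this many completed tasks
  max_tasks: 50000
  # Restart a background worker exceeding this memory limit (0 = unlimited)
  max_memory: 1048576
```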

 

Redis queue tuning 

background_workers:
  prefetch_multiplier: 4
  without_heartbeat: false
  without_mingle: false
  without_gossip: false

By default, background workers coordinate queue activities to better load balance tasks between them. However, this coordination generates more traffic on the "queue" (Redis). If you process a lot of background work, or have a Redis instance that cannot be upgraded, you can set without_heartbeat, without_mingle, and without_gossip to true. This disables worker coordination. The trade-off is that tasks are not load balanced as well, but queue activity drops, which often means jobs start faster.

Additionally, you can raise prefetch_multiplier so that each worker fetches multiple tasks from the queue at once. This further reduces load balancing between workers, but also further reduces queue usage.
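Combining the two tunings above, a sketch for a deployment that prioritizes low Redis traffic over even task distribution (values are illustrative):

```yaml
background_workers:
  # Disable coordination chatter between workers on the Redis queue
  without_heartbeat: true
  without_mingle: true
  without_gossip: true
  # Each worker reserves 8 tasks at a time instead of the default 4
  prefetch_multiplier: 8
```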

 

Cycle tuning 

By tuning how often caching and clean up is done, you can increase or decrease performance of the system as a whole.

 

Caching

cache_cycle: 30

During a cache cycle, tasks like aggregating return data are performed. By increasing the cache cycle time, you can reduce duplicate work and decrease the load on the system. The drawback is that information is usually not presented in the UI as quickly.

Conversely, by decreasing this time, you will see results in the UI faster, but put a greater load on your background workers.

 

Clean up cycle

clean_up_cycle: 900
job_unresponsive_check: 5
job_unresponsive_check_stop: 2880
master_unresponsive_check_limit: 2

The clean up cycle performs tasks such as checking for stuck jobs and cleaning them up, or checking with the master to confirm a job is still running.

Increasing the clean_up_cycle decreases how often clean up cycles occur, which lowers the load on the system. However, you can also tune some other configurations to make the clean up cycle less intensive.

job_unresponsive_check is the amount of time SSE waits before considering a job stuck. If you know your jobs take longer, increase this time so SSE does not check on them as often. Conversely, if your jobs are shorter, decrease this time so they are cleaned up faster.

job_unresponsive_check_stop is the maximum time a job is allowed to run. Adjusting this to fit your job profile allows SSE to clean up jobs sooner.

When a job is "stuck", SSE sends a find_job to the master to determine whether the job is still running. If the master responds, SSE considers the job still in progress. If the master does not respond, SSE tries again on the next clean up cycle, continuing this pattern until master_unresponsive_check_limit is reached. Reducing this limit reduces the number of check-ups performed. If you expect your jobs to complete within a single clean up cycle, reducing this limit cleans up stuck jobs faster while also reducing load on the system.
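Putting the settings above together, a sketch tuned for short-lived jobs on a busy system; the values are illustrative, and the units for the check intervals should be confirmed against the product documentation for your SSE version:

```yaml
# Run the clean up cycle less often to lower overall load
clean_up_cycle: 1800
# Consider a job potentially stuck sooner, since jobs here are short
job_unresponsive_check: 2
# Cap how long any job may run before it is cleaned up
job_unresponsive_check_stop: 720
# Stop checking an unresponsive master after a single follow-up
master_unresponsive_check_limit: 1
```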

 

Scheduler 

schedule_cycle: 10
scheduler_max_futures_per_cycle: 500
scheduler_max_futures_weeks_ahead: 12

The scheduler runs based on the schedule_cycle time. Increasing this time will reduce the load on the system. Decreasing this time means you will get finer grained schedules, thus increasing the load.

If you don't need 10-second granularity, you can increase this value to reduce load on your system without any perceivable impact.

The scheduler also pre-calculates future schedules out to a limit set by scheduler_max_futures_weeks_ahead. You can decrease this limit to reduce the work needed to calculate future schedules. However, this is a one-time calculation, so unless you change schedules frequently, you probably won't notice much change in load.

During these calculations, work is chunked according to scheduler_max_futures_per_cycle. Increase this number to calculate more schedules per cycle, or decrease it to reduce the load of each individual chunk. Base this configuration on how much data your database can handle at once.
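For example, a sketch for an environment that does not need fine-grained scheduling and wants lighter future-schedule calculation (values are illustrative):

```yaml
# Wake the scheduler once a minute instead of every 10 seconds
schedule_cycle: 60
# Calculate fewer future entries per cycle to ease database load
scheduler_max_futures_per_cycle: 250
# Only pre-calculate schedules 4 weeks ahead
scheduler_max_futures_weeks_ahead: 4
```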

 

Miscellaneous 

enable_grains_indexing: true

A considerable amount of background work is needed to produce the auto-complete index used by the UI. If you have a large number of minions or targets, you can set enable_grains_indexing to false to considerably reduce the computation needed.

minion_onboarding_throttle: 0

Adding minions to SSE is an expensive operation, especially when there are many minions. If you frequently add minions from multiple masters, you may need to throttle the masters so they do not push minions to SSE all at once. minion_onboarding_throttle is the time SSE remains locked before allowing another master to add more minions. If the master currently adding minions finishes before this time, SSE unlocks and the next master is let in.
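For example, a sketch that holds the onboarding lock briefly so that masters take turns rather than onboarding simultaneously; the value is illustrative, and the unit it expects should be confirmed for your SSE version:

```yaml
# Lock onboarding to one master at a time; other masters wait until
# the lock is released or this interval elapses (0 disables throttling)
minion_onboarding_throttle: 30
```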

websocket_debounce: 5

websocket_debounce is the smallest interval at which a websocket subscription can receive updated data. Increase this time to decrease load on the system; however, you will receive updated data less often.



Environment

VMware Aria Automation Config 8.12.x