Some jobs fail when running a large number of jobs against a Salt master
Article ID: 384282
Products
VMware Aria Suite
Issue/Introduction
When a large number of jobs are run against a Salt master, some jobs fail to complete and errors such as the following appear:
"ERROR executing 'state.apply': File client timed out"
"ERROR executing 'state.apply': Nonce verification error"
Resolution
- Run the jobs in smaller batches so as not to overwhelm the system.
- Increase the number of worker threads (worker_threads) on the master to 24, possibly raising it to 1.5 times the number of CPUs, and monitor how much memory is consumed.
- Increase the event return queue (event_return_queue) to at least 1x, and possibly 2-3x, the number of minions.
- Increase the authentication timeout on the minion side (set auth_timeout to 240 seconds).
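The sizing guidance above can be turned into concrete numbers with a short script. The following is only a minimal sketch: the function names (suggest_master_settings, batches) and the dictionary keys are hypothetical helpers introduced for illustration, not part of Salt, and the 1.5x CPU figure is reported separately from the starting value of 24 because the article does not say how the two should be combined.

```python
import math

# Value taken from the article; it belongs in the minion configuration.
MINION_AUTH_TIMEOUT = 240


def suggest_master_settings(cpu_count: int, minion_count: int) -> dict:
    """Translate the article's guidance into numbers for one master.

    The keys are illustrative names, not Salt option names, except where
    the comments say which Salt option a value is meant for.
    """
    return {
        # Starting value for worker_threads suggested by the article.
        "worker_threads": 24,
        # Possible later value for worker_threads: 1.5 times the CPU count.
        "worker_threads_scaled": math.ceil(1.5 * cpu_count),
        # Range for event_return_queue: 1x to 3x the number of minions.
        "event_return_queue_min": minion_count,
        "event_return_queue_max": 3 * minion_count,
    }


def batches(minions: list, size: int) -> list:
    """Split a list of minion IDs into consecutive batches of at most `size`."""
    return [minions[i:i + size] for i in range(0, len(minions), size)]


if __name__ == "__main__":
    settings = suggest_master_settings(cpu_count=16, minion_count=500)
    for name, value in settings.items():
        print(f"{name}: {value}")
    print(f"auth_timeout (minion side): {MINION_AUTH_TIMEOUT}")
    print(batches(["web1", "web2", "web3", "db1", "db2"], size=2))
```

The worker_threads and event_return_queue values go into the master configuration and auth_timeout into the minion configuration (typically /etc/salt/master and /etc/salt/minion on a default install). Restart the corresponding service after changing them, and watch memory use on the master before raising worker_threads further.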