Using Timeouts in Orchestration and States

Article ID: 325826

Updated On: 08-05-2024

Products

VMware Aria Suite

Issue/Introduction

In Salt, both orchestrations and states have timeouts. When an orchestration depends on states that take a long time to complete, it can fail because it stops waiting before the state has finished. This article briefly describes the scenario and a solution.

Let's say we need to execute a script that may take a long time to run before returning successfully. We'll use an orchestration to execute it across multiple nodes. The example is a bit contrived, but the setup is common enough: an orchestration that executes a state.

The orchestration state file orch.longtest.sls allows a Salt minion ID to be passed in as Salt pillar data to determine the target for the state execution. If the pillar data is not provided or is null, the default value testing is used as the target. Notice that this orchestration sets a long timeout (900 seconds) to allow plenty of time for the state to complete and return a response. If following along, be sure to place all files in the same directory.

# orch.longtest.sls
#
# The orchestration will wait up to 15 minutes before timing out
run_sample_state:
  salt.state:
    - tgt: {{ pillar.get('minion', 'testing') }}
    - sls: {{ slspath }}.randomlongtest
    - timeout: 900

# randomlongtest.sls
#
# pycheck.py calls randint to sleep for a random amount of time 
# before returning True

copy_script:
  file.managed:
    - name: /root/pycheck.py
    - source: salt://{{ slspath }}/pycheck.py
    
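run_a_random_long_test:
  cmd.run:
    - name: "python3 /root/pycheck.py"
    # Stop the command if it runs longer than 10 seconds
    - timeout: 10
    # If the state fails, try again: 3 seconds between tries, 50 attempts
    - retry:
        interval: 3
        attempts: 50
    - require:
      - file: copy_script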

# code for pycheck.py script
from random import randint
from time import sleep

def randomizethis():
    """
    Sleep for a random time between 0 and 60 seconds 
    before returning True
    """
    sleep_timer = randint(0,60)
    print("Sleeping for {}".format(sleep_timer))
    sleep(sleep_timer)
    return True

if __name__ == '__main__':
    randomizethis()
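
The retry settings matter in this example because most single runs of pycheck.py sleep longer than the 10 second cmd.run timeout. The following is a rough sketch of that arithmetic, not part of the original files. It assumes an attempt succeeds only when the random sleep is strictly shorter than the timeout, ignores interpreter start-up time, and assumes attempts counts the total number of runs; the function name is hypothetical.

```python
from fractions import Fraction


def attempt_success_probability(max_sleep, cmd_timeout):
    """Chance that one run of the script finishes before cmd.run times out.

    Assumption: randint(0, max_sleep) is inclusive on both ends, so there
    are max_sleep + 1 equally likely sleep values, and only sleeps of
    0 .. cmd_timeout - 1 seconds finish in time.
    """
    favourable = min(cmd_timeout, max_sleep + 1)
    return Fraction(favourable, max_sleep + 1)


# Values taken from randomlongtest.sls and pycheck.py
p_success = attempt_success_probability(max_sleep=60, cmd_timeout=10)

# Chance that every one of the 50 attempts times out
p_all_fail = (1 - p_success) ** 50

print("Per-attempt success probability:", p_success)
print("Probability all 50 attempts fail: {:.6f}".format(float(p_all_fail)))
```

Under these assumptions a single attempt is more likely to fail than to succeed, while the chance that all 50 attempts fail is very small.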

Environment

VMware Aria Automation Config - all versions

Salt Project - all versions

Resolution

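Be sure to set the timeout option in your state and orchestration appropriately for actions that may take longer than expected to complete. In particular, the orchestration timeout should be longer than the worst-case run time of the states it executes, including any retries. In the example above, the orchestration waits up to 900 seconds, while the state's command is limited to 10 seconds per attempt with up to 50 attempts.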
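
One way to check the timeouts in the example is to add up the worst case for the retried command and compare it with the orchestration timeout. The sketch below is an estimate under stated assumptions, not a Salt API: it assumes attempts counts the total number of runs, that interval is the pause between consecutive runs, and that every attempt uses its full timeout. It ignores the time taken by file.managed and by Salt itself. The function name is hypothetical.

```python
def worst_case_seconds(cmd_timeout, attempts, interval):
    """Upper bound for a retried command where every attempt times out.

    Assumption: 'attempts' is the total number of runs, so there are
    attempts - 1 pauses of 'interval' seconds between them.
    """
    return attempts * cmd_timeout + (attempts - 1) * interval


# Values taken from randomlongtest.sls and orch.longtest.sls
state_budget = worst_case_seconds(cmd_timeout=10, attempts=50, interval=3)
orch_timeout = 900
headroom = orch_timeout - state_budget

print("Worst-case state run time:", state_budget, "seconds")
print("Orchestration timeout:", orch_timeout, "seconds")
print("Headroom:", headroom, "seconds")

if headroom <= 0:
    print("The orchestration may give up before the state finishes")
```

With the example values the retried command needs at most 647 seconds, which leaves 253 seconds of headroom under the 900 second orchestration timeout. If the state's timeout, interval, or attempts are raised, the orchestration timeout should be raised to match.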
