Maximum Values for MaxRetry/SleepInterval/FailureInterval POJO parameters

Article ID: 388640

Updated On: 02-19-2025

Products

AutoSys Workload Automation

Issue/Introduction

Use case: During testing of POJO jobs, some jobs were reported as failed in AutoSys even though they had completed successfully in the target service (Databricks, for example). After the MaxRetry, SleepInterval, and FailureInterval values were adjusted, the jobs reported success in both AutoSys and Databricks. Tuning these values case by case is laborious because every job can have a different run time: some run for 30 seconds, while others run for 60 minutes or more. Many more jobs are expected to be added, so a single static value, or an infinite one, that can serve as a threshold for all jobs is desirable. This would reduce the effort and time spent, and would allow jobs to run longer than expected without producing failures or requiring on-the-fly JIL changes.

databricks.MaxRetry	The number of retries for the monitor status request in case of failure.
databricks.SleepInterval	The interval in milliseconds before resubmitting the monitor status request after a failure.
databricks.FailureInterval	The amount of time in seconds before marking the job as failed due to Databricks REST API failure.

iics.SleepInterval	The time interval, in seconds, before resubmitting a monitor status request.
iics.FailureInterval	The time interval, in seconds, before a job fails due to API failure.
iics.MaxRetry	The maximum number of retries for monitoring status requests in case of failures.

JIL Examples:

insert_job: DATABRICK_POJO_BRAD job_type: POJO
machine: localhost
owner: autosys
permission:
date_conditions: 0
description: "Testing"
alarm_if_fail: 1
alarm_if_terminated: 1
method_name: runJob
j2ee_parameter: databricks.Endpoint=https\://<host>.azuredatabricks.net
j2ee_parameter: databricks.Token=$$DATA
j2ee_parameter: databricks.JobId=112233445566771
j2ee_parameter: databricks.Parameters="\"notebook_params\":{\"parm1\":\"hello-2025\"}"
j2ee_parameter: databricks.ProxyHost=""
j2ee_parameter: databricks.ProxyPort=""
j2ee_parameter: databricks.ProxyUser=""
j2ee_parameter: databricks.ProxyPassword=""
j2ee_parameter: databricks.MaxRetry=500
j2ee_parameter: databricks.SleepInterval=1000
j2ee_parameter: databricks.FailureInterval=300
j2ee_parameter: databricks.LogLevel=16
class_name: com.broadcom.databricks.DatabricksPojo

insert_job: INFORMATICA_POJO_GLOBAL job_type: POJO
machine: localhost
owner: autosys
permission:
date_conditions: 0
alarm_if_fail: 1
alarm_if_terminated: 1
method_name: startAndMonitorTaskFlow
j2ee_parameter: iics.AuthURL=https\://<host>.informaticacloud.com
j2ee_parameter: iics.Username=some_user_account
j2ee_parameter: iics.Password=$$INFO_PW
j2ee_parameter: iics.ServiceURL=https\://<host>.informaticacloud.com
j2ee_parameter: iics.TaskflowName=tf_generic_source_overwrite_v1
j2ee_parameter: iics.UseParamSet=false
j2ee_parameter: iics.InputFields=""
j2ee_parameter: iics.FailSuspended=false
j2ee_parameter: iics.SleepInterval=5
j2ee_parameter: iics.FailureInterval=5
j2ee_parameter: iics.MaxRetry=10
j2ee_parameter: iics.LogLevel=16
class_name: com.broadcom.pojo.iics.IICS

Environment

AutoSys Workload Automation r12.x, r24.x
Workload Automation System Agent r12.x, r24.x

Resolution

The values provided in the job definition are converted internally to Java primitive type values.

For example, a value of 2 for 'databricks.MaxRetry' is converted to a java.lang.Integer value.

An infinite value cannot be specified, so the largest usable value is the maximum allowed by the underlying Java type.

Integer.MAX_VALUE = 2147483647
See the Java API documentation for Class Integer.
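
Before placing a large value in a JIL definition, it can be useful to confirm that it fits within the range of java.lang.Integer. The following is a minimal Python sketch for illustration only, not agent code; the constant and the helper name `fits_java_int` are hypothetical, and Python's number parsing is only an approximation of how the agent parses the value.

```python
# Illustrative only: upper and lower bounds of Java's signed 32-bit int.
JAVA_INT_MAX = 2**31 - 1   # same value as Integer.MAX_VALUE
JAVA_INT_MIN = -(2**31)    # same value as Integer.MIN_VALUE


def fits_java_int(value: str) -> bool:
    """Return True if the text is a whole number within the Java int range."""
    try:
        number = int(value)
    except ValueError:
        # Not a whole number at all (for example "infinite").
        return False
    return JAVA_INT_MIN <= number <= JAVA_INT_MAX


print(JAVA_INT_MAX)
print(fits_java_int("2147483647"))
print(fits_java_int("2147483648"))
```

A value one greater than the maximum, or a non-numeric value such as "infinite", is rejected by this check.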

Note, however, that using the maximum values is generally not recommended.

SleepInterval: Controls how long the agent waits before the next status check of the job. This is the minimum time it takes for a status change to be detected. For example, if SleepInterval is set to 1 hour and the job reaches a terminal state (success or failed) right after a status check, the job takes up to 1 hour to transition to that terminal state in AutoSys.

IMPORTANT: The maximum value for SleepInterval depends on whether the unit is seconds or milliseconds. The value you choose is the longest delay you are willing to accept before the job is reported as complete.
Consider the following maximum values for each unit:
- For seconds, the value is an integer, so the maximum is 2147483647
- For milliseconds, the value is a long, so the maximum is 9223372036854775807

It is recommended not to set SleepInterval to anything longer than a couple of minutes because of the delayed status transition. A longer interval would give the impression that jobs are stuck in the RUNNING state.
FailureInterval: The maximum time the agent ignores API or network failures before failing the job. Job owners need to decide how long they are willing to wait before the job fails because of an API outage (on the Databricks, IICS, or other service side) or a network failure.
MaxRetry: Controls how many times the status of the job is checked. If MaxRetry is exceeded, the job fails.
Note: For this particular use case, the recommended value for MaxRetry is 2147483647.
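
As a rough planning aid, MaxRetry and SleepInterval can be related with simple arithmetic. The Python sketch below assumes, based only on the descriptions above, that one status check is made per SleepInterval and that the job fails once MaxRetry checks have been used. This is an assumption, not confirmed agent behavior, and the function name is hypothetical.

```python
def max_monitored_seconds(max_retry: int, sleep_interval: int, unit: str) -> float:
    """Approximate how long a job can run before MaxRetry is exhausted.

    Assumption (not confirmed agent behavior): one status check is made
    per SleepInterval, and the job fails after max_retry checks.
    unit is "ms" (Databricks) or "s" (IICS).
    """
    if unit not in ("ms", "s"):
        raise ValueError("unit must be 'ms' or 's'")
    seconds_per_check = sleep_interval / 1000 if unit == "ms" else float(sleep_interval)
    return max_retry * seconds_per_check


# Values taken from the JIL examples in this article.
print(max_monitored_seconds(500, 1000, "ms"))  # Databricks example values
print(max_monitored_seconds(10, 5, "s"))       # IICS example values
```

Under this assumption, the example Databricks values allow roughly 500 seconds of monitoring and the example IICS values roughly 50 seconds, both well short of a job that runs for 60 minutes or more.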

Taking all of the above into consideration, it is recommended to set the values appropriately for each job so as not to introduce unwanted or unnecessary delays in your batch cycles.

Note that for SleepInterval there is an unfortunate discrepancy between the plugins: some use seconds and others use milliseconds.

Databricks uses milliseconds
IICS uses seconds
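
Because the unit differs per plugin, it can help to derive both SleepInterval values from a single target interval. The following is a minimal Python sketch that assumes only the units stated above (milliseconds for Databricks, seconds for IICS); the dictionary and function names are hypothetical.

```python
# Multiplier from seconds to each plugin's SleepInterval unit, per this article.
UNITS_PER_SECOND = {
    "databricks": 1000,  # SleepInterval is in milliseconds
    "iics": 1,           # SleepInterval is in seconds
}


def sleep_interval_for(plugin: str, target_seconds: int) -> int:
    """Return the SleepInterval value that yields target_seconds for a plugin."""
    return target_seconds * UNITS_PER_SECOND[plugin.lower()]


# Print a j2ee_parameter-style value for a 60-second polling interval.
for plugin in ("databricks", "iics"):
    print(f"{plugin}.SleepInterval={sleep_interval_for(plugin, 60)}")
```

For a 60-second polling interval, this yields 60000 for Databricks and 60 for IICS.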

Another important consideration: because these are cloud services, the vendor can make the service unavailable during maintenance windows while applying maintenance to the application. Every cloud vendor, service, and SaaS application has maintenance windows. Consult the vendor of the cloud service regarding contractual maintenance windows.
