IntelliRollup SD Job Container in error SDM228518

Article ID: 220705

Products

CA Client Automation - IT Client Manager
CA Client Automation
CA Client Automation - Software Delivery

Issue/Introduction

The SD Job Container for an IntelliRollup patch ends in error:
Job Execution error, job container removed. [SDM228518]

 

Cause

An IntelliRollup SD Job needs two SD JobCheck executions on the target machine:
 
1- The first one executes the global script and finds out which patches are missing. The SD Agent sends the list of missing patches to its Scalability Server.
2- The second one installs the missing patches.
 
The error SDM228518 can occur if there is a long delay between the two JobChecks and the SD Job Container timeout is reached before the second one runs.
 
 
 
Example:
 
- At 9am, an SD Job Container for "UPM - CA - Patch Me - Security IntelliRollup v2106.00" is sent to target T with a timeout of 3 hours.
 
- The Scalability Server of target T sends an SD Trigger every 10 minutes, but for network reasons (firewall or other), target T does not receive it.
 
- At 11am, a manual or scheduled JobCheck is executed on target T.
The IntelliRollup script is executed and the list of missing patches is sent to the Scalability Server.
 
- The Scalability Server keeps sending an SD Trigger every 10 minutes, but for the same network reasons target T still does not receive it.
 
- At 12pm, the SD Job Container reaches its timeout.
 
- At 12:20pm, a manual or scheduled JobCheck is executed on target T.
Because the SD Job Container is already in timeout error, the following error is generated:
Job Execution error, job container removed. [SDM228518]
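 
The timeline above can be checked with a small sketch. It is a simplified model, not product code: the function name second_jobcheck_in_time is hypothetical, the calendar date is arbitrary, and it only assumes that both JobChecks must start before the container timeout expires.

```python
from datetime import datetime, timedelta


def second_jobcheck_in_time(sent_at, timeout, jobchecks):
    """Return True if at least two JobChecks start before the container times out.

    Simplified model: the first JobCheck reports the missing patches, the
    second installs them, and both must happen before sent_at + timeout.
    """
    deadline = sent_at + timeout
    in_time = [t for t in jobchecks if sent_at <= t < deadline]
    return len(in_time) >= 2


# Arbitrary date; only the times of day come from the example above.
sent_at = datetime(2021, 6, 1, 9, 0)
jobchecks = [datetime(2021, 6, 1, 11, 0), datetime(2021, 6, 1, 12, 20)]

print(second_jobcheck_in_time(sent_at, timedelta(hours=3), jobchecks))   # False
print(second_jobcheck_in_time(sent_at, timedelta(hours=12), jobchecks))  # True
```

With the 3-hour timeout, only the 11am JobCheck falls before the 12pm deadline, so the second part of the job never runs in time. With a 12-hour timeout, both JobChecks fall inside the window.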
 

Environment

Client Automation - All Versions.

Resolution

There are several possible solutions:
 
1- Increase the timeout of the SD Job Container, for example from 3h to 12h.
 
or
 
2- If possible, resolve the network problem that prevents the target agent from receiving the SD Trigger from its Scalability Server.
 
or
 
3- If the SD JobCheck was executed manually, wait a few minutes and execute a second manual JobCheck to perform the second part of the IntelliRollup job.
 
or
 
4- Make the scheduled SD JobCheck run more often. By default, an SD JobCheck is executed at these times:
 
- Manually, by clicking on "Start SD Job Check...."
- When an SD Trigger is received from the Scalability Server.
- At startup of caf.
- Every day between 12pm and 1:30pm (via the scheduled task refreshregistration).
- Every 12 hours + a random delay of 4h (via the scheduled task sdagentschedule).
 
 
A possible solution is to change the configuration policy applied to the machines so that the scheduled task sdagentschedule runs every 6 hours instead of every 12 hours:
DSM/Common Components/CAF/Scheduler/Run the USD agent/CAF Scheduler: Repeat = 6
 
 
 
 
Warning:
If there are many machines per Scalability Server (e.g. 1000 or more), do not set "CAF Scheduler: Repeat" too low. All agents will then execute the SD agent frequently, and this will overload the SDServer plugin on the Scalability Server.
 
Examples:
1000 agents with a frequency of 6 hours means 1000 SD JobChecks in 6 hours = 166.7 JobChecks per hour = 2.8 JobChecks per minute. This could be OK.
1000 agents with a frequency of 3 hours means 1000 SD JobChecks in 3 hours = 333.3 JobChecks per hour = 5.6 JobChecks per minute. This could be OK but could slightly affect SD Server performance.
1000 agents with a frequency of 1 hour means 1000 SD JobChecks in 1 hour = 16.7 JobChecks per minute. This could cause problems in the SDServer plugin.
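 
The same arithmetic can be reproduced for other agent counts or Repeat values with a short sketch. The function name jobchecks_per_minute is hypothetical, and the model assumes the JobChecks are spread evenly over the repeat interval, which is only an approximation of real agent behaviour.

```python
def jobchecks_per_minute(agents, repeat_hours):
    """Average scheduled JobChecks per minute on one Scalability Server,
    assuming every agent runs exactly one JobCheck per repeat interval."""
    return agents / (repeat_hours * 60)


for hours in (6, 3, 1):
    rate = jobchecks_per_minute(1000, hours)
    print(f"Repeat = {hours}h: {rate:.1f} JobChecks per minute")
```

The printed values are rounded to one decimal place. Acceptable thresholds depend on the Scalability Server, so the result is a rough load estimate rather than a limit.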
 
 
 
