SD Job remains in status "Job Execution Ordered" when Alternate Server feature is enabled

Article ID: 370491

Updated On:

Products

CA Client Automation - IT Client Manager
CA Client Automation

Issue/Introduction

A Software Delivery (SD) job remains in status "Job Execution Ordered" when the Alternate Server feature is enabled on the Domain Manager with a list of servers.

In TRC_USD_TaskMan*.log, the XML message sent to the scalability server is malformed:

TaskMan |TM:00 |AAORDEXE.CXX |000574|INFO | Job Container TESTCONTAINER [3/4/2024 08:42:00 PM] sending to Server <ServerName>

TaskMan |cfUtilities |cfUtilities |000000|DETAIL | CFIMessage::getMessageXML: xmlmsg: <message version="1.0.0.0"><Container version="0.0.0.0"><Container Ack="1" UID="<UUID>" MgrVer="14.5.0.600" Rollup="0" UserMsg="" TargetTypes="0" Prio="5">
<Type>0<Appl UID="<APPLUUID>" RunAt="3887037740" JobUID="<JOBUUID>" DTS="0"/>
</Type>
<LastCOF>1</LastCOF>
<Transaction>0</Transaction>

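This message has no Target tag, and the Type tag is malformed: the Appl element follows the Type value directly instead of being enclosed in a Target element.

A valid message looks like this: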

TaskMan |TM:00 |AAORDEXE.CXX |000574|INFO | Job Container TESTCONTAINER [3/4/2024 08:43:48 PM] sending to Server <ServerName>

TaskMan |cfUtilities |cfUtilities |000000|DETAIL | CFIMessage::getMessageXML: xmlmsg: <message version="1.0.0.0"><Container version="0.0.0.0"><Container Ack="1" UID="<UUID>" MgrVer="14.5.0.600" Rollup="0" UserMsg="" TargetTypes="0" Prio="5">
<Target Name="TARGETNAME" uuid="<UUID>" DBuuid="<DBUUID>" Address="<Target Address>" Calendar="" DLType="0">
<Appl UID="<APPL UUID>" RunAt="3887037848" JobUID="<JOBUUID>" DTS="0"/>
</Target>
<Type>0</Type>
<LastCOF>1</LastCOF>
<Transaction>0</Transaction>
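
To spot the malformed message in a trace, the two samples above suggest a simple check: a valid message contains a Target element, while the malformed one has an Appl element directly after the Type value. The following Python sketch applies that check to an xmlmsg fragment copied from the log. The function name and the returned labels are illustrative assumptions, not part of the product, and the check is based only on the two samples shown in this article.

```python
import re

def classify_container_xml(xmlmsg: str) -> str:
    """Classify a logged xmlmsg fragment as 'valid', 'malformed' or 'unknown'.

    Hypothetical helper: the rules are derived only from the two
    samples in this article, not from a product specification.
    """
    # A valid message wraps each Appl element in a Target element.
    has_target = re.search(r"<Target\b", xmlmsg) is not None
    # In the malformed sample, <Appl .../> follows "<Type>0" directly.
    appl_in_type = re.search(r"<Type>[^<]*<Appl\b", xmlmsg) is not None

    if appl_in_type or (not has_target and "<Appl" in xmlmsg):
        return "malformed"
    if has_target:
        return "valid"
    return "unknown"
```

Passing the text that follows "xmlmsg:" in a DETAIL line, together with the continuation lines of the same message, returns "malformed" for the first sample and "valid" for the second.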

Environment

Client Automation 14.5 CU6

Cause

The problem occurs when the Alternate Server feature is enabled on the Domain Manager with a list of servers.

It occurs in the following scenario:

  1. A job container J1 is sent to computer C1 using Scalability Server S1.
    TaskMan (TM) builds the list of alternate SD Servers (every server in the configured list except S1).
    The package is large and the DTS transfer takes several minutes (the package was not already staged in the Scalability Server (SS) library).

  2. Before that DTS transfer ends, another job container J2 is sent to computer C2 using Scalability Server S2.
    TM builds the list of alternate SD Servers again (every server in the configured list except S2).
    The package is small and the DTS transfer takes a few seconds.
    This overwrites the list of alternate SD Servers that was built for J1.

  3. When the DTS transfer for J1 completes, TM uses the list of alternate SD Servers built for J2. Because S1 appears in that list as an alternate Scalability Server, nothing is sent to S1.
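
The scenario above can be reproduced with a toy model. The Python sketch below is only an illustration of the overwrite, not TaskMan code: the class names, method names and the three-server list are all hypothetical. The second class shows one possible way to avoid the overwrite (one list per job container); it is not a description of what fix T533386 actually changes.

```python
ALL_SERVERS = ["S1", "S2", "S3"]  # hypothetical configured server list


class SharedListTaskMan:
    """Toy model: a single alternate-server list shared by all containers."""

    def __init__(self):
        self.alternates = []

    def order_job(self, server):
        # Rebuilding the list overwrites what an earlier container stored.
        self.alternates = [s for s in ALL_SERVERS if s != server]

    def send_after_dts(self, server):
        # The container is sent only if its server is not listed as alternate.
        return server not in self.alternates


class PerJobTaskMan:
    """Toy model: one alternate-server list per job container."""

    def __init__(self):
        self.alternates = {}

    def order_job(self, job, server):
        self.alternates[job] = [s for s in ALL_SERVERS if s != server]

    def send_after_dts(self, job, server):
        return server not in self.alternates[job]


shared = SharedListTaskMan()
shared.order_job("S1")                    # step 1: J1 ordered, long DTS transfer
shared.order_job("S2")                    # step 2: J2 ordered, list overwritten
j1_sent = shared.send_after_dts("S1")     # step 3: False, S1 is now an "alternate"

per_job = PerJobTaskMan()
per_job.order_job("J1", "S1")
per_job.order_job("J2", "S2")
j1_sent_per_job = per_job.send_after_dts("J1", "S1")  # True
```

In the shared-list model, J1 is never sent to S1, which matches the job staying in "Job Execution Ordered".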

Resolution

Workaround:

Disable Alternate SD Server support on the Domain Manager.

In the configuration policy applied to the Domain Manager, set
DSM/Software Delivery/Manager/Alternate SD Server: Enable Alternate SD Server support = False

Then restart sdmgr_tm:
caf stop sdmgr_tm
caf start sdmgr_tm
 
 
 
Solution:

Open a case with Broadcom Technical Support and request fix T533386.
This fix delivers the following file:
sd_Taskm.exe
Size = 2 136 576 bytes
Date = 17/05/2024
Version = 14.5.0.601
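
After the fix is applied, a quick sanity check is to compare the size of the installed sd_Taskm.exe with the size listed above. The Python sketch below does only that. The function name is hypothetical and the file path must be supplied by the caller, because the installation directory is not given in this article. A matching size is not proof of the version, so the file properties should still be checked for 14.5.0.601.

```python
import os

# Size in bytes of sd_Taskm.exe delivered by fix T533386, as listed above.
EXPECTED_SIZE = 2_136_576


def has_fixed_size(path: str) -> bool:
    """Return True if the file at 'path' has the size listed for the fix."""
    return os.path.getsize(path) == EXPECTED_SIZE
```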