Client Task Agent Connectivity

Article ID: 150990

Products

IT Management Suite Task Server

Issue/Introduction

This article explains how the Client Task Agent (CTA) obtains its list of task servers, selects one to register with, and retries after a communication failure.

Environment

ITMS 8.x

Resolution

Client Task Server Policy

When the agent starts for the first time, the Client Task Agent (CTA) requests a list of task servers and their sort algorithm from the Symantec Management Platform (SMP). The sort algorithm is specified in agent policy and can be one of three values:

  • FewestComputers - Select a task server based on the number of connections
  • FastestConnection - Perform a ping test on the task servers in the list to find the fastest connection relative to the CTA
  • AvailableCapacity - Select a task server based on the shares assigned for each entry in the returned list

The last value is new and applies to the enhanced connectivity agents only. It appears only in the "recommendedServerFindMethod" attribute of the settings element. An example of the policy XML is as follows:

<Policy name="Task Agent Settings" version="7.1.1671.0">
    <ClientPolicy agentClsid="Altiris.ClientTaskAgent">
        <settings timeout="30" taskReportingInterval="15" serverFindMethod="FewestComputers" recommendedServerFindMethod="AvailableCapacity"/>
    </ClientPolicy>
</Policy>

NOTE: In this example the existing "serverFindMethod" attribute is set to FewestComputers, while "recommendedServerFindMethod" is set to the shares-based selection algorithm. This is expected behavior and supports agents that have not yet been upgraded. If "recommendedServerFindMethod" is either FewestComputers or FastestConnection, then "serverFindMethod" has the same value.
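The relationship between the two attributes can be sketched as follows. This is an illustration only, not the agent's actual code: the function name `find_method` and the `supports_shares` flag (standing in for "enhanced connectivity agent") are hypothetical.

```python
import xml.etree.ElementTree as ET

POLICY_XML = """<Policy name="Task Agent Settings" version="7.1.1671.0">
    <ClientPolicy agentClsid="Altiris.ClientTaskAgent">
        <settings timeout="30" taskReportingInterval="15" serverFindMethod="FewestComputers" recommendedServerFindMethod="AvailableCapacity"/>
    </ClientPolicy>
</Policy>"""


def find_method(policy_xml, supports_shares):
    """Return the sort algorithm an agent would read from the policy.

    Assumption: an upgraded agent prefers recommendedServerFindMethod,
    while an older agent only knows about serverFindMethod.
    """
    settings = ET.fromstring(policy_xml).find("./ClientPolicy/settings")
    legacy = settings.get("serverFindMethod")
    # Fall back to the legacy value if the newer attribute is absent.
    recommended = settings.get("recommendedServerFindMethod", legacy)
    return recommended if supports_shares else legacy
```

With the sample policy above, an upgraded agent would read AvailableCapacity and an older agent would read FewestComputers.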

Obtaining the assigned Task Servers for a CTA

Upon receiving either initial or updated policy from the Notification Server (NS), the CTA requests a list of task servers from the SMP by sending a POST to the GetClientTaskServers.aspx web service. The request takes the following parameters:

  • ResourceGuid - The machine GUID for the CTA (REQUIRED)
  • shares - If supplied, must have a value of 1 (OPTIONAL)

If "recommendedServerFindMethod" is set to "AvailableCapacity" in the CTA portion of the agent policy, then "shares" must be set to 1. For example:

http://deathstar.domain.com/Altiris/TaskManagement/CTAgent/GetClientTaskServers.aspx?ResourceGuid=b37ee46c-2736-4acb-bfcb-c394382be8cd&shares=1

In addition, the request body XML must list the IP addresses of the CTA's available adapters, regardless of the server find method. The list must exclude the localhost address (127.0.0.1). The format is as follows:

<request>
    <interfaces>
        <ipAddress ip="192.168.0.11"/>
        <ipAddress ip="192.168.0.53"/>
    </interfaces>
</request>
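As a rough sketch, the URL and request body described above could be assembled like this. The function name and its parameters are hypothetical, the request is only built and not sent, and treating every loopback address (not just 127.0.0.1) as excluded is an assumption.

```python
import ipaddress
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_task_server_request(smp_host, resource_guid, ip_addresses, use_shares):
    """Build the URL and XML body for a GetClientTaskServers.aspx POST."""
    query = {"ResourceGuid": resource_guid}
    if use_shares:
        # Required when the policy asks for AvailableCapacity.
        query["shares"] = "1"
    url = (
        f"http://{smp_host}/Altiris/TaskManagement/CTAgent/"
        f"GetClientTaskServers.aspx?{urlencode(query)}"
    )

    request = ET.Element("request")
    interfaces = ET.SubElement(request, "interfaces")
    for ip in ip_addresses:
        # Skip localhost / loopback addresses such as 127.0.0.1.
        if ipaddress.ip_address(ip).is_loopback:
            continue
        ET.SubElement(interfaces, "ipAddress", ip=ip)
    body = ET.tostring(request, encoding="unicode")
    return url, body
```

The returned body would then be sent as the POST payload by whatever HTTP client the caller uses.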

GetClientTaskServers.aspx builds the list of task servers (TS) returned to the CTA as follows:

  1. The stored procedure “CtsGetAllRegisteredTaskServers” is called. It returns all “active” task servers currently working with this NS. This is TS list #1.
  2. The list of site servers for this computer’s IP subnet range is obtained. This is TS list #2.
    1. For Cloud-enabled Management (CEM) clients, the core method SiteServiceLocator.GetSiteServersForInternet() is used.
      If nothing is returned, SiteServiceLocator.GetSiteServers() is called for the SMP machine (not using the client’s IPs from the request).
    2. For internal clients, SiteServiceLocator.GetSiteServers() is called with the IP list sent by the client.
  3. List #2 is filtered so that it contains only items that are also present in list #1. This removes all “inactive” task servers from list #2.
  4. If the filtered list contains any task servers after step 3, it is returned to the client.
  5. If the filtered list is empty, the full list #1 is returned to the client in two cases:
    1. The client is a CEM client.
    2. List #2 is empty, i.e. there are no site servers for this client.
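Steps 3 to 5 above can be sketched in a few lines. This is a simplified model, not the server-side implementation: the function and parameter names are hypothetical, the two lists are assumed to have been obtained already, and returning an empty list when neither fallback case applies is an assumption.

```python
def select_task_servers(active_servers, site_servers, is_cem_client):
    """Model of how list #1 and list #2 are combined.

    active_servers: TS list #1 (all active task servers).
    site_servers:   TS list #2 (site servers for the client's subnets).
    """
    active = set(active_servers)
    # Step 3: keep only site servers that are also active.
    filtered = [server for server in site_servers if server in active]
    # Step 4: a non-empty filtered list is returned as-is.
    if filtered:
        return filtered
    # Step 5: fall back to the full list #1 for CEM clients,
    # or when no site servers exist for this client at all.
    if is_cem_client or not site_servers:
        return list(active_servers)
    return []
```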

The NS sends a response message to the CTA containing the FQDN (Fully Qualified Domain Name) of each task server, or the exception condition if the request fails. An example of a successful request is as follows:

<response result="success">
    <servers>
        <server name="deathstar.domain.com" shares="5000" />
        <server name="darthvader.domain.com" shares="400" />
    </servers>
</response>

NOTE: In the above example, the server element contains a "shares" attribute. This appears only if shares is specified in the request.
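A minimal sketch of reading this response, assuming the structure shown in the example. The function name is hypothetical, and raising an error on a non-success result is an illustrative choice because the failure format is not shown here.

```python
import xml.etree.ElementTree as ET


def parse_task_servers(response_xml):
    """Return a list of (fqdn, shares) tuples from the response.

    shares is None when the attribute is absent, which is the case
    when the request did not specify shares.
    """
    root = ET.fromstring(response_xml)
    if root.get("result") != "success":
        raise ValueError("task server request failed")
    servers = []
    for node in root.findall("./servers/server"):
        shares = node.get("shares")
        servers.append((node.get("name"), int(shares) if shares is not None else None))
    return servers
```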

If task servers have been manually assigned to a given agent, only those task servers are returned for that CTA. The selection method is still applied in the same way.

Task Server connection method

A task server can be contacted over HTTPS or HTTP. The method is determined by the connection method used for the NS. If the agent currently communicates with the NS over SSL (i.e. HTTPS), the agent first attempts to contact the task server using SSL and falls back to HTTP if that fails. If the agent communicates with the NS over HTTP, only HTTP is used for connecting to a task server.

NOTE: Having a Task Server using SSL with the SMP not using SSL is NOT supported!
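The protocol choice can be sketched as below. The function names and the `try_connect` callback (which stands in for a real connection attempt) are hypothetical.

```python
def candidate_schemes(ns_uses_https):
    """Schemes to try, in order, for contacting a task server."""
    # HTTPS to the NS: try HTTPS first, then fall back to HTTP.
    # HTTP to the NS: only HTTP is ever used.
    return ["https", "http"] if ns_uses_https else ["http"]


def connect_to_task_server(task_server, ns_uses_https, try_connect):
    """Return the first base URL that try_connect accepts, or None."""
    for scheme in candidate_schemes(ns_uses_https):
        url = f"{scheme}://{task_server}"
        if try_connect(url):
            return url
    return None
```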

Selecting the appropriate Task Server

Once the list is received from the NS, the order in which task servers are tried for registration is determined by the method sent from the SMP in agent policy. If a task server cannot be contacted, the next task server in the list is selected and registration is attempted.

Fewest Computers Order

In order to connect using this method the number of current connections for each task server must be established. This is performed by querying each task server in the list to obtain the current active connections. A web service has been provided to satisfy this requirement and is in the following format:

http://deathstar.domain.com/Altiris/ClientTaskServer/GetComputerCount.aspx

Where deathstar.domain.com is the task server being queried.

The response message from this call is very simple, returning an integer value representing the current number of agents that have registered with this server. An example of what is returned is as follows:

<response result="success">1</response>

Once all task servers have been queried, they are sorted in ascending order based on the number of current connections that have been returned and the registration is performed using this order.
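A sketch of this ordering, assuming the GetComputerCount.aspx responses have already been fetched. The function names are hypothetical, and skipping servers whose query did not succeed is an assumption.

```python
import xml.etree.ElementTree as ET


def parse_computer_count(response_xml):
    """Return the registered-agent count, or None if the call failed."""
    root = ET.fromstring(response_xml)
    if root.get("result") != "success":
        return None
    return int(root.text)


def order_by_fewest_computers(responses):
    """Sort task servers ascending by their reported connection count.

    responses maps a task server name to its raw response XML.
    """
    counts = {}
    for server, xml_text in responses.items():
        count = parse_computer_count(xml_text)
        if count is not None:
            counts[server] = count
    return sorted(counts, key=counts.get)
```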

Fastest Connection Order

In order to establish the fastest connection order, the CTA sends a series of ICMP pings to each server recording the response time and averaging this value. Registration with the Task servers should follow the order of lowest average ping time to highest.
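The ordering step can be sketched as follows, assuming the ping response times have already been measured. The function name is hypothetical, and dropping servers with no successful pings is an assumption.

```python
from statistics import mean


def order_by_fastest_connection(ping_times_ms):
    """Sort task servers by average ping time, lowest first.

    ping_times_ms maps a task server name to a list of measured
    response times in milliseconds.
    """
    averages = {
        server: mean(times)
        for server, times in ping_times_ms.items()
        if times  # ignore servers that never answered
    }
    return sorted(averages, key=averages.get)
```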

Available Capacity Order

This order is calculated without any additional contact with any task server; it relies solely on the information sent from the SMP in response to the GetClientTaskServers request. The order is determined by drawing a random value between 1 and the sum of all task server share values.

Conceptually, picture a drawer of colored socks containing one green sock and many red socks. A sock removed from the drawer at random will probably be red. Each subsequent removal increases the chance that the next sock will be the green one. This is what the shares assigned to each task server achieve: the task server list returned from the SMP is sorted randomly such that servers with a higher number of shares have a proportionally higher chance of being the first task server attempted for registration.

An example of how this is applied to the task server share values returned from the SMP is provided below.

Consider that we have 3 task servers named deathstar, darthvader and darlek with share values of 4300, 1000 and 200 respectively. A random value is generated based on the total number of shares for all servers; in this case between 1 and 5500.

Suppose the generated value is 5211. It falls between 4300 (the first server's share value) and 5300 (the sum of the first and second servers' share values), so darthvader becomes the first task server in the ordered list. It is then removed from the selection pool, and a new random value is generated against the sum of the remaining shares, now between 1 and 4500. A value of 3202 is generated, which falls within the range of the first task server remaining in the list, so deathstar becomes the second task server in the ordered list and is also removed from the pool. Because only a single task server (darlek) remains, it is placed at the end of the ordered list.
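The worked example can be reproduced with a short sketch of share-weighted ordering without replacement. This is one way to realize the description, not necessarily the agent's exact code: the function name and the injectable `draw` callback are hypothetical, and the handling of boundary values and of servers with zero shares is an assumption.

```python
import random


def order_by_available_capacity(servers, draw=None):
    """Order task servers by repeated share-weighted random draws.

    servers: list of (name, shares) tuples as returned by the SMP.
    draw:    callable taking the share total and returning an integer
             in 1..total; defaults to random.randint(1, total).
    """
    if draw is None:
        draw = lambda total: random.randint(1, total)
    remaining = list(servers)
    ordered = []
    while len(remaining) > 1:
        total = sum(shares for _, shares in remaining)
        if total <= 0:
            break  # nothing left to weight by; keep the given order
        pick = draw(total)
        running = 0
        for index, (name, shares) in enumerate(remaining):
            running += shares
            if pick <= running:
                # The draw landed in this server's share range.
                ordered.append(name)
                del remaining[index]
                break
    ordered.extend(name for name, _ in remaining)
    return ordered
```

Feeding the two draws from the example (5211, then 3202) into `draw` yields darthvader, deathstar, darlek, matching the walkthrough above.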

Saved Task Server

Once the appropriate sorting order has been applied to the list of task servers returned from the SMP, a check is made to see if the CTA has previously been registered with a task server. This is done to preserve Task Server Job / CTA continuity.

This check must be performed after sorting the list based on ordering dictated by policy but before any registration attempt is performed. To obtain the previously assigned task server a web service is provided on the SMP and is called with the machine GUID of the CTA in the query string as follows:

http://deathstar.domain.com/Altiris/TaskManagement/CTAgent/PersistentSettings.aspx?operation=get&ResourceGuid={b37ee46c-2736-4acb-bfcb-c394382be8cd}

The NS sends a response message to the CTA containing the FQDN (Fully Qualified Domain Name) of the last task server in the LastServer property. An example of a successful request is as follows:

<response result="success">
    <properties count="1">
        <property name="LastServer">darthvader.domain.com</property>
    </properties>
</response>

  

If a task server is returned from the SMP and it is in the sorted list of task servers, it is moved to the top and becomes the first task server with which registration is attempted. However, if the task server is not in the list, it is cleared from the SMP and the ordered list of task servers is not modified.
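A sketch of the saved-server check, applied after sorting and before any registration attempt. The function names are hypothetical, and comparing server names by exact string match is an assumption.

```python
import xml.etree.ElementTree as ET


def parse_last_server(response_xml):
    """Return the LastServer FQDN from the response, or None."""
    root = ET.fromstring(response_xml)
    if root.get("result") != "success":
        return None
    for prop in root.iter("property"):
        if prop.get("name") == "LastServer":
            return (prop.text or "").strip() or None
    return None


def apply_saved_server(ordered_servers, last_server):
    """Return (new_order, should_clear_on_smp).

    A saved server that is still in the list moves to the top.
    A saved server that is no longer in the list leaves the order
    unchanged and should be cleared from the SMP.
    """
    if not last_server:
        return list(ordered_servers), False
    if last_server in ordered_servers:
        rest = [s for s in ordered_servers if s != last_server]
        return [last_server] + rest, False
    return list(ordered_servers), True
```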

Task Server retry and back-off mechanism

The periodic connection back-off feature of Client Task Agent (CTA) allows for a periodic retry of communication with the task server in the event of a communication failure. This could be during registration, checking for jobs or sending task status back to the task server for example.

The back-off process follows these rules:

  1. Initial registration
    1. Agent starts up.
    2. Agent queries the last task server from SMP.
    3. Agent gets the list of task servers from SMP (note that there is no file-system level cache) and checks whether the last task server is still there.
    4. Agent attempts to re-register with the last task server.
    5. FAIL: retry registering with the last task server in 2 minutes (+/- 30 seconds).
    6. FAIL: retry registering with the last task server in 4 minutes (+/- 30 seconds).
    7. FAIL: forget the last task server (send empty string to SMP). Re-request list of task servers from SMP, sort it and perform full registration attempt (old task server is not treated specially).
    8. Agent orders the list of task servers and pushes the last task server to the end of the list.
    9. Agent attempts to register with the task servers one after another.
    10. Agent registers with a task server.
    11. Agent persists the name of the task server on SMP.
  2. Any failed communication to the current task server (this includes periodic check-ups or any other requests).
    1. Agent aborts the current operation (e.g. requesting task definition XML).
    2. Agent attempts to re-register with the current task server.
    3. FAIL: retry registering with the current task server in 2 minutes.
    4. FAIL: retry registering with the current task server in 4 minutes.
    5. FAIL: forget the current task server (send empty string to SMP). Re-request list of task servers from SMP, sort it and perform full registration attempt (old task server is pushed to the end).
    6. FAIL: retry registering with previously available list of servers from SMP in 2 minutes.
    7. FAIL: retry again in 4 minutes.
    8. FAIL and so on: retry again, doubling the back-off period each time (2 minutes, 4 minutes, 8 minutes, 16 minutes, etc.)
    9. Registration attempts are randomized with +/- 30 seconds if the back-off period is larger than 5 minutes (could be +/- 1 minute, if seconds resolution is not available).
    10. Any intermediate request attempt to the task server (e.g. periodically asking for new tasks) restarts the registration cycle from point 2.5.
    11. From the first failure, the Agent considers itself not registered until registration succeeds.
    12. Until registration succeeds, no communication with a task server (e.g. posting task execution status) should be attempted.
    13. The only exceptions to this rule (which cause the back-off period to start all over again) are:
      1. User/API forced task check-up.
      2. User/API forced re-registration (this should cause the list of task servers to be refreshed from SMP).
    14. Success: Agent now considers itself registered and cancels the back-off period.
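The doubling schedule in the rules above (2, 4, 8, 16 minutes and so on) can be sketched as a generator. The function name and parameters are hypothetical, and applying the jitter to every delay is a simplification, since the rules describe the randomization differently in different steps.

```python
import random
from itertools import islice


def backoff_delays(initial_minutes=2, factor=2, jitter_seconds=30, rng=random):
    """Yield successive retry delays in seconds.

    Each delay is the current back-off period plus a random offset
    in the range -jitter_seconds..+jitter_seconds.
    """
    delay = initial_minutes * 60
    while True:
        yield delay + rng.uniform(-jitter_seconds, jitter_seconds)
        delay *= factor


# Example: the first four delays, each within 30 seconds of
# 120, 240, 480 and 960 seconds respectively.
first_four = list(islice(backoff_delays(), 4))
```

A caller would stop consuming the generator as soon as registration succeeds, and create a fresh one when a forced check-up or re-registration restarts the back-off period.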

The sort algorithm is selected with one of the following options:

"Choose the Task Server relative to the remaining capacity of each server" (the default)

"Choose the Task Server to which the agent has the fastest connection"

"Choose the Task Server with the fewest computers currently connected"

These options can be found under Settings>Notification Server>Settings\Notification Server>Task Settings>Task Agent Settings

 

Customer Examples

Example 1:

I still can’t understand why I’m seeing inconsistencies with the same client registering differently on two different subnets. In short it seems when a client is plugged in to subnet A it always registers to the local task server A. When I plug the client in to subnet B (a newly setup vlan to segment workstations), it always registers with one of a number of task servers on the other side of the world.
I have "Choose the Task Server with the fewest computers currently connected" as my current Task Agent Setting.

The behavior observed when plugged in to subnet B can be explained by our current task agent setting to choose server with fewest computers. Looking at the clients agent log it appears “depot01” does have the fewest connections. Our local server depot02 has the most at 287.

 The question I have is when I then plug in to subnet A again, how come it attempts first to register with the SMP and then registers with the local task server? It seems to only check the computer count of these two servers.

Response:

The SMP server contains logic that builds the list of ‘suitable’ task servers for an agent requesting registration. This logic filters out all task servers in network segments other than the one in which the agent currently resides, so different behavior is expected when the agent's network interface is switched to a different VLAN.
Subnet A is where the SMP and the local task server reside. That explains why, with the current setting, the client checks only these two servers and then registers.