Rally - WSAPI - Best Practices on filtering and queries.


Article ID: 125586


Updated On:


CA Agile Central On Premise (Rally) CA Agile Central SaaS (Rally) Rally Perpetual Hosted Rally Saas


This article lists the variables that are in play when making WSAPI queries and that should be considered when designing your WSAPI programs.

1. Exporting all data for certain artifact types, or for an entire workspace or subscription, is quite different from exporting incremental updates of those objects. Not only does exporting all data take longer, but if the export is used to overwrite previously exported data, only the 'last snapshot' is ever preserved, so there is no way to reconstruct modifications (or modification dates) between snapshots.
One of the major benefits of our analytics service is that it creates snapshots for every modification.
Lookback API (LBAPI) is a mechanism that allows you to look back into these snapshots and learn of changes that happened over time. For example, with LBAPI you could find all changes that happened between certain dates or certain snapshots. None of that is possible with the Web Services API (WSAPI), which holds only the current data, and none of it will be possible for a customer who chooses to always export all data and overwrite the previous export.
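As a sketch of how such a snapshot query looks, the fragment below builds an LBAPI request body that finds all user-story snapshots that became valid within a date range. The workspace, type, field names, and dates are illustrative placeholders; LBAPI uses a MongoDB-style 'find' clause, and `_ValidFrom`/`_ValidTo` bound each snapshot's validity window (see the LBAPI manual linked below for the authoritative syntax).

```python
import json

# Illustrative LBAPI request body (dates and type are placeholders):
# find snapshots of user stories whose validity started in June 2018.
lbapi_query = {
    "find": {
        "_TypeHierarchy": "HierarchicalRequirement",
        "_ValidFrom": {
            "$gte": "2018-06-01T00:00:00Z",
            "$lt": "2018-07-01T00:00:00Z",
        },
    },
    # request only the fields needed, plus the snapshot window
    "fields": ["ObjectID", "Name", "ScheduleState", "_ValidFrom", "_ValidTo"],
    "pagesize": 20000,  # LBAPI pages can be up to 20,000 objects
}

body = json.dumps(lbapi_query)
print(body)
```

The same 'find' clause filtered on the 'current' snapshot (rather than a date range) would export the present state of the data, which is the LBAPI equivalent of a full WSAPI export.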

2. With regard to exporting all data, at any point:
To export everything that currently exists, we can either use WSAPI, or use LBAPI filtering all objects on their 'current' snapshot.
WSAPI's advantage is that it allows you to query/export beyond artifacts, whereas LBAPI only records artifacts.
LBAPI, however, is better designed for massive queries: it supports much larger page sizes (up to 20,000 objects, 10 times larger than WSAPI). If only artifacts are exported, it is probably better to use LBAPI.
See Web Services API documentation at https://rally1.rallydev.com/slm/doc/webservice/
See Lookback API documentation at https://rally1.rallydev.com/analytics/doc/#/manual
Both mechanisms will run into thresholds at some point, where the result count is too large to export. This may result in timeouts, or in partial results being returned. Large subscriptions need to consider their requirements and processes, and should avoid the practice of exporting all data, certainly on a continuous (let alone frequent) basis. We also limit the maximum connections per user per server to 16.
There is no way to predict what the data threshold is. Too many parameters are involved, including the total number of artifacts, the size of the artifact data, the network, and the clients/applications used. Large attachments may even play a part, even when they are not queried for.
However, as subscriptions grow, they may start experiencing longer execution times for their queries as they near this moving 'threshold'.


Component: ACSAAS


In general, requesting less data will help alleviate performance problems. There are many ways to accomplish this.
Restricting requested fields
One method to reduce the requested data is proper use of the Fetch argument. Do not use 'Fetch=true' in your WSAPI query. A 'true' value requests data for all fields, so using it will export every field of every requested artifact, which can increase the data volume considerably. Instead, it is recommended to explicitly request the fields that contain your required data by using: 'Fetch=<comma separated list of fields>'
The point of specifying fields directly is to avoid unnecessary fields, in particular collection fields, which are separate sets of related data. Collections add extra queries, complexity, and data to any API request. Using 'Fetch=true' pulls all fields, with only a reference to each collection field. Collection fields are best avoided when possible, since they add data to the request result. They are usually links between artifacts; sometimes the relationship is 1-1 and sometimes 1-many. 1-many relationships increase query latency, so avoiding them helps. Refer to the WSAPI documentation to identify collections.
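To make the Fetch recommendation concrete, the snippet below builds a WSAPI request URL with an explicit field list instead of 'Fetch=true'. The endpoint and field names are illustrative; substitute the fields your integration actually needs.

```python
from urllib.parse import urlencode

# Illustrative WSAPI endpoint; Defect and the field names are examples.
BASE = "https://rally1.rallydev.com/slm/webservice/v2.0/defect"

params = {
    # explicit, minimal field list -- no 'fetch=true', no collection fields
    "fetch": "ObjectID,FormattedID,Name,State",
    "pagesize": 200,
    "start": 1,
}
url = BASE + "?" + urlencode(params)
print(url)
```

Swapping 'fetch' to 'true' in the dictionary above is the anti-pattern this section warns against: the server would then serialize every field of every matched defect.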
Another technique for retrieving all results 'by chunks' is to use sets of filters. The idea, again, is to reduce the requested data so the query won't time out. For example, use: ((Name >= "A") and (Name <= "Z")), then use: ((Name >= "a") and (Name <= "z")), etc.
This example runs a first query that returns all objects of a certain artifact type whose names start with an uppercase letter, then a second query that returns all objects of the same type whose names start with a lowercase letter. The idea is to use filters that do not overlap, retrieving data-sets that can be joined together to form the full result-set. It is similar to paging, where each page is requested in turn and all pages are then combined on the client.
The advantage of this method, though, is that it eases server load: instead of computing all artifacts and then splitting them into pages, the server filters the results immediately. This method will most likely run faster.
The caveat with this method is that you should be familiar with your data. Asking for all objects that start with an uppercase letter, then all that start with a lowercase one, may work for one endpoint (say, Defect) but not for another (say, UserStory). The reason is that user stories may follow a common naming convention where most names start with 'US'. If most or all results start with a specific prefix, you'll need to break your queries accordingly, for example:
((Name >= "USA") and (Name <= "USZ"))  , then later:
((Name >= "USa") and (Name <= "USz"))  
It can take some effort (possibly for each exported object) to figure out how to break the queries to chunks.
One way to help here, could be to use the 'Artifact' endpoint (instead of each specific artifact), for example:
https://rally1.rallydev.com/slm/webservice/v2.0/artifact?query=((Name >= "A") and (Name <= "Z")) 
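The non-overlapping ranges above can be generated programmatically, which keeps the chunking consistent across exports. The helper below is a sketch: the function name, the 'US' prefix, and the choice of three ranges (uppercase, lowercase, digits) are assumptions to adapt to your own naming conventions.

```python
import string

def name_range_filters(prefix=""):
    """Build non-overlapping WSAPI Name-range filters, optionally under a
    common prefix (e.g. 'US' for user stories named US123...)."""
    ranges = [
        (string.ascii_uppercase[0], string.ascii_uppercase[-1]),  # A..Z
        (string.ascii_lowercase[0], string.ascii_lowercase[-1]),  # a..z
        ("0", "9"),                                               # digits
    ]
    return [
        '((Name >= "{p}{lo}") and (Name <= "{p}{hi}"))'.format(p=prefix, lo=lo, hi=hi)
        for lo, hi in ranges
    ]

# Example: chunked filters for names sharing the 'US' prefix.
for f in name_range_filters("US"):
    print(f)
```

Each returned filter can then be pasted into the 'query' argument of a WSAPI request; because the ranges do not overlap, the result-sets can simply be concatenated on the client.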
Another useful filtering method is to use the LastUpdateDate attribute, which records the last date an object was updated. It is automatically set when an object is created or updated. This is useful when pulling incremental updates, for example:
https://rally1.rallydev.com/slm/webservice/v2.0/hierarchicalrequirement?pagesize=2000&fetch=ObjectID,FormattedID,Name&query=(LastUpdateDate > "2018-06-19")
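An incremental export built on LastUpdateDate can be sketched as follows. The watermark value here is a placeholder matching the example URL above; a real integration would persist the timestamp of its last successful run and query only for objects updated since then.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Assumed watermark from the previous export run (placeholder value).
last_export = datetime(2018, 6, 19, tzinfo=timezone.utc)

params = {
    "pagesize": 2000,
    "fetch": "ObjectID,FormattedID,Name,LastUpdateDate",
    # pull only objects changed since the watermark
    "query": '(LastUpdateDate > "{}")'.format(last_export.strftime("%Y-%m-%d")),
}
url = (
    "https://rally1.rallydev.com/slm/webservice/v2.0/hierarchicalrequirement?"
    + urlencode(params)
)
print(url)
```

After each successful run, the watermark is advanced to the start time of that run, so each export only carries the delta rather than the full data set.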
Smaller page sizes deliver faster over slow Internet connections, since the server sends less data to the client. The WSAPI maximum page size is '2000'. When experiencing latencies or time-outs, decreasing the page size may help if the query is not too expensive to execute on the database. It's important to understand that a query executed on the database still needs to compute the total potential recordset; while delivering the data from our datacenter to your application may be faster with a smaller page size, the database query may be just as slow with a page size of 20 as with 2000. Therefore, requesting 2000 records may reduce overall load on the server.
There isn't a way to predict by how much to reduce the page-size. Rather, it depends on the amount of data, the subscription size etc - arguments which we don't have exact benchmarks for in the first place.
Reducing the page size has a cost, though: it requires many more runs of the query. Reducing the page size from '2000' to '200' means the query must run 10 times to retrieve the same amount of data. The best practice here is trial and error: simply try different page sizes and see if they return reliably. So, try a page size of '1500'; if still not reliable, try '1000', and so on until you reach a page size that performs reliably.
The 'pagesize' argument goes alongside with the 'start' argument that indicates which page is requested, for example:
  • start=1,pagesize=200 - returns the first 200 query results.
  • start=201,pagesize=200 - returns the second set of 200 query results, page #2.
  • start=1601,pagesize=200 - returns the 9th set of 200 query results, page #9.
  • start=5501,pagesize=500 - returns the 12th set of 500 query results, page #12.
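The paging arithmetic above, plus the stop condition (a short page means there is no more data), can be sketched as below. `fetch_page` is a stub standing in for the real WSAPI HTTP call, and the 450-object result set is invented for illustration.

```python
PAGESIZE = 200

def start_for_page(page):
    """WSAPI 'start' index for a given 1-based page number.
    page 1 -> start=1, page 2 -> start=201, page 9 -> start=1601, ..."""
    return (page - 1) * PAGESIZE + 1

def fetch_page(start, pagesize):
    # Stub for the real HTTP call: pretend the full result set has 450 objects.
    total = 450
    first = start - 1
    return ["obj-%d" % i for i in range(first, min(first + pagesize, total))]

results, page = [], 1
while True:
    chunk = fetch_page(start_for_page(page), PAGESIZE)
    results.extend(chunk)
    if len(chunk) < PAGESIZE:  # short page: end of results
        break
    page += 1

print(len(results), page)
```

With 450 stubbed objects and a page size of 200, the loop issues three requests (200 + 200 + 50) and then stops.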

For normal users that are assigned permissions on a project level such as project viewer, editor or admin permissions, a permissions check must be performed for each work item that this user looks up.  This permission check isn't noticeable when using the UI to view small bits of data, however when performing larger data requests, it can impact the performance of a query if that user is granted permissions across a large number of projects.
Because of this, we recommend that integration accounts be granted workspace admin rights, as this skips the permission lookup routine: the account is known to have access to all projects in the workspace. You can create a read-only API key for your integration to prevent data from being altered within Rally when using an account with this permission level.
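Authenticating with such an API key is typically done by sending it in the ZSESSIONID request header. The sketch below builds (but does not send) a request object with that header; the key value is a placeholder, and the endpoint/fields are the same illustrative ones used earlier.

```python
from urllib.request import Request

# Placeholder value -- substitute a real (read-only) API key generated
# for your integration account.
API_KEY = "_abc123placeholder"

req = Request(
    "https://rally1.rallydev.com/slm/webservice/v2.0/defect?fetch=ObjectID,Name",
    headers={"ZSESSIONID": API_KEY},  # API-key authentication header
)
# urllib normalizes header names to capitalized form internally.
print(req.get_header("Zsessionid"))
```

Because the key itself is read-only, even a workspace-admin integration account cannot accidentally modify data through requests authenticated this way.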

Assuming the user must use WSAPI to export all data, the recommendations are:
  • First, use specific fetch fields and don't request what's unnecessary.
  • Second, use paging. It will take trial and error to find a page size that performs reliably; then figure out how many times the query needs to run and whether it is at all 'doable'. If yes - great.
  • Third, add filters to the queries and design a 'pull' mechanism that combines filters and paging to put together all the data needed.
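Putting the three recommendations together, a full-export 'pull' mechanism can be sketched as a loop over non-overlapping filters, with paging inside each filter and an explicit fetch list throughout. `fetch_page` is again a stub for the real WSAPI call, and the filter set and 250-object-per-filter result size are invented for illustration.

```python
PAGESIZE = 200
FETCH = "ObjectID,FormattedID,Name"  # explicit fields, no 'fetch=true'
FILTERS = [
    '((Name >= "A") and (Name <= "Z"))',
    '((Name >= "a") and (Name <= "z"))',
]

def fetch_page(query, start, pagesize):
    # Stub for the real HTTP call: each filter "matches" 250 objects here.
    total = 250
    first = start - 1
    return [(query, i) for i in range(first, min(first + pagesize, total))]

def export_all():
    results = []
    for query in FILTERS:      # non-overlapping filters, combined client-side
        start = 1
        while True:            # page within each filter
            page = fetch_page(query, start, PAGESIZE)
            results.extend(page)
            if len(page) < PAGESIZE:  # short page: filter exhausted
                break
            start += PAGESIZE
    return results

print(len(export_all()))
```

Because the filters do not overlap, the concatenated pages from all filters form the complete result-set without duplicates.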

Additional Information

agile central; querying; filtering; api calls;