TDM Portal Data Generation - Intermittent commit of 10,000 records for Snowflake masking is not working as expected

Article ID: 273865

Updated On:

Products

CA Test Data Manager (Data Finder / Grid Tools)

Issue/Introduction

After upgrading to TDM Portal 4.10.218.0, we are able to connect to the Snowflake warehouse. Data generation works well for fewer than 500k records. However, we need to publish over 2M records, and in that case Portal does not appear to commit each 10,000-record iteration to the target; instead, it waits to commit all the rows at the end of the run. The job then fails before the run completes.

Our guess is that, because the job is not committing every 10,000 records (the default setting), it keeps all the data in memory until it runs out of memory and fails.
 
Another reason we think it is not committing every 10,000 records is that we do not see any data on the target side at all. The target still had 0 rows after the job failed.
 
There is no information in the publish log to pin down the issue.
 
Can you please advise whether you did anything else for the other customer? Or is there another setting we need in order to commit every 10K records?

Environment

Release : 4.10

Resolution

TDM Portal is working as designed. When publishing data from the TDM Portal Data Generator, batch commits are driven by the publish iterations (repeat count, i.e. the global publish count), not by the table iterations (table repeat count).

In this particular case, the repeat count was set to 1 and the table repeat count was 19,850,226. With a single publish iteration, the job could never get close to the 10,000 commit limit, so Portal was trying to cache almost 20 million rows before sending them to the database and committing them.

Using the global publish repeat count instead of the table repeat count gives better results: the publish job periodically commits records to the target table in batches of 10,000. To set this up for this example, use repeat count = 19,850,226 and table repeat count = 1.
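
To make the difference between the two settings concrete, the following is a minimal sketch of the arithmetic, not TDM Portal's actual implementation. It assumes a simplified model in which a commit is triggered after every 10,000 publish iterations and each publish iteration generates "table repeat count" rows for a single table. The function and field names are hypothetical and chosen for illustration only.

```python
COMMIT_INTERVAL = 10_000  # default commit size mentioned in this article


def simulate_publish(repeat_count, table_repeat_count,
                     commit_interval=COMMIT_INTERVAL):
    """Estimate commits and peak buffered rows under a simplified model.

    Assumption (not taken from product internals): the commit check runs
    once per publish iteration, and every publish iteration produces
    `table_repeat_count` rows for one table.
    """
    total_rows = repeat_count * table_repeat_count
    # One commit per full group of publish iterations, plus one for any remainder.
    commits = (repeat_count + commit_interval - 1) // commit_interval
    # Rows held in memory before a commit can happen.
    iterations_per_commit = min(repeat_count, commit_interval)
    peak_buffered_rows = iterations_per_commit * table_repeat_count
    return {
        "total_rows": total_rows,
        "commits": commits,
        "peak_buffered_rows": peak_buffered_rows,
    }


# Original setup: one publish iteration, all rows via the table repeat count.
original = simulate_publish(repeat_count=1, table_repeat_count=19_850_226)

# Recommended setup: rows driven by the global publish repeat count.
recommended = simulate_publish(repeat_count=19_850_226, table_repeat_count=1)

print("original:   ", original)
print("recommended:", recommended)
```

Under these assumptions, both setups produce the same 19,850,226 rows. The original setup buffers all of them before its single commit, while the recommended setup never holds more than 10,000 rows and commits 1,986 times.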

The downside of this solution is that a separate generator must be created for every table you are working with.