Seed table limitation?

Products

CA Test Data Manager (Data Finder / Grid Tools)

Issue/Introduction

We need to obfuscate over 400,000 unique credit card numbers. We are doing this via tool obfuscation.

We have a seed table within the GTSRC_REFERENCE_LOV1 table with an RL_REF_ID of EJCREDITCARD, and it used to contain 250,000 rows.

When we ran the credit card obfuscation we encountered a large number of duplicate credit card values, which we attributed to having more unique credit card numbers than seed values available to replace them (400,000 vs 250,000).

We wrote a program to generate 1,000,000 unique obfuscated credit card values and we removed the old values from the GTSRC_REFERENCE_LOV1 table and inserted our new values.
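The generator program itself is not shown in this article. As a rough sketch only, the sample rows listed in this article are consistent with a 15-digit payload that starts at 700000000000000 and grows by 1234 per row, followed by a Luhn check digit. The base, the step, and the function names below are assumptions inferred from those rows, not the author's actual program.

```python
def luhn_check_digit(payload: str) -> int:
    """Return the Luhn check digit for a string of decimal digits."""
    total = 0
    # Walk right to left; the digit next to the check digit is doubled first.
    for pos, ch in enumerate(reversed(payload)):
        digit = int(ch)
        if pos % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return (10 - total % 10) % 10


def seed_value(rl_rn: int, base: int = 7 * 10**14, step: int = 1234) -> str:
    """Build one 16-digit seed: 15-digit payload plus a Luhn check digit.

    base and step are assumptions inferred from the sample rows.
    """
    payload = str(base + step * rl_rn)
    return payload + str(luhn_check_digit(payload))


def seed_rows(ref_id: str, total: int):
    """Yield (RL_REF_ID, RL_RN, RL_TOTAL, RL_REF_VALUE) tuples."""
    for rl_rn in range(1, total + 1):
        yield (ref_id, rl_rn, total, seed_value(rl_rn))
```

Because every payload is distinct, every generated value is distinct, so a seed list built this way contains no duplicates of its own.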

Here is a list of the first 10 and last 10 rows in the seed table GTSRC_REFERENCE_LOV1.

PCAGRID.GTSRC_REFERENCE_LOV1

PROD ----------FETCH STATUS: COMPLETE--------------------------

RL_REF_ID     RL_RN    RL_TOTAL  RL_REF_VALUE
EJCREDITCARD  1        1000000   7000000000012349
EJCREDITCARD  2        1000000   7000000000024682
EJCREDITCARD  3        1000000   7000000000037023
EJCREDITCARD  4        1000000   7000000000049366
EJCREDITCARD  5        1000000   7000000000061700
EJCREDITCARD  6        1000000   7000000000074042
EJCREDITCARD  7        1000000   7000000000086384
EJCREDITCARD  8        1000000   7000000000098728
EJCREDITCARD  9        1000000   7000000000111067
EJCREDITCARD  10       1000000   7000000000123401
...
EJCREDITCARD  999991   1000000   7000012339888943
EJCREDITCARD  999992   1000000   7000012339901282
EJCREDITCARD  999993   1000000   7000012339913626
EJCREDITCARD  999994   1000000   7000012339925968
EJCREDITCARD  999995   1000000   7000012339938300
EJCREDITCARD  999996   1000000   7000012339950644
EJCREDITCARD  999997   1000000   7000012339962987
EJCREDITCARD  999998   1000000   7000012339975328
EJCREDITCARD  999999   1000000   7000012339987661
EJCREDITCARD  1000000  1000000   7000012340000009

We believe we have entered the data correctly, with an incremental number in the RL_RN column and the total number of obfuscation seed values in the RL_TOTAL column.
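One way to double-check that belief before rerunning the job is to validate the rows after fetching them from the table. The sketch below assumes the layout described above (RL_RN running from 1 to N with no gaps, RL_TOTAL equal to N on every row, and no repeated RL_REF_VALUE). The function name and the tuple layout are hypothetical, and whether the tool strictly requires this layout is not confirmed here.

```python
def validate_seed_rows(rows):
    """Check seed rows given as (RL_REF_ID, RL_RN, RL_TOTAL, RL_REF_VALUE).

    Returns a list of human-readable problems; an empty list means the
    rows match the assumed layout.
    """
    problems = []
    count = len(rows)

    # All rows should belong to the same seed list.
    if len({row[0] for row in rows}) > 1:
        problems.append("more than one RL_REF_ID present")

    # RL_RN should be exactly 1, 2, ..., N with no gaps or repeats.
    if sorted(row[1] for row in rows) != list(range(1, count + 1)):
        problems.append("RL_RN is not a gap-free sequence 1..N")

    # RL_TOTAL should equal the number of rows on every row.
    if any(row[2] != count for row in rows):
        problems.append("RL_TOTAL does not equal the row count on every row")

    # The seed values themselves should be unique.
    if len({row[3] for row in rows}) != count:
        problems.append("RL_REF_VALUE contains duplicates")

    return problems
```

A leftover row from the old 250,000-row list, or a stale RL_TOTAL value, would show up as one of these problems.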

When we ran the tool obfuscation we still had a significant number of duplicates.

The output from our job is below:

JOB OBMJBFB(JOB04280) SUBMITTED RC=0

JOB OBLJBFB(JOB04281) SUBMITTED RC=4

DCFP0001.CARD_31811

DCFP0001.A31811X1

155342 DUPLICATE KEY ERRORS

In addition, I saw the following in the SYSOUT for the job, which makes me think that the tool is only ingesting 250,000 rows from the seed table (the previous amount).

Program name GTXDEF

Program version 6.2.00

Program date 2020-09-01

Program Name GTXWHR

Program Version 6.2.00

Program Date 2020-09-01

MAPCSV Out:

Passed (80) =>Table,Column,Function,Parm1,Parm2,Parm3,Parm4,Parm5,Parm6,Parm7,Parm8,Parm9,Parm

Passed (80) =>CARD_31811,CARD_NO,HASHLOV,EJCREDITCARD,1,,,,,,,,,Y,,,,,,,,,,,,,

Run Type ==> F

File Type ==> F

File Format ==> X

WHERE ==> Y

Bundle IDs ==> 00000

Sort Cards ==> 00004

MAPCSV Inp ==> 00002

Header ==> 00001

Blank ==> 00000

Comment ==> 00000

Standard ==> 00001

MAPCSV SUB ==> 00002

MAPCSV OUT ==> 00002

MAPLEN Out ==> 01000

Ignored ==> 00000

Replaced ==> 00000

Generated ==> 00000

Program name GTXMAP

Program version 6.2.01

Program date 2020-09-16

Program name GTXMSKF

Program version 6.2.04

Program date 2020-11-05

Processed 000250000 records 08:48:46 (this was the old number of credit card entries; I don’t think it’s picking up the whole 1,000,000 credit card numbers)

ACF0C038 ACF2 LOGONID ATTRIBUTES HAVE REPLACED DEFAULT USER ATTRIBUTES

READY

DSN SYSTEM(PROD)

DSN

RUN PROGRAM(GTXMSKF) PLAN(GTXMSKF) LIB('SDB3ETDM.P.LOADLIB')

DSN

END

READY

Is there a parameter set somewhere that is still telling the program we have only 250,000 seed table rows for EJCREDITCARD?

Environment

Release : 6.2

TDM Mainframe.

Cause

N/A

Resolution

HASHLOV/HASHLOV1 by itself does not guarantee unique values.
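To illustrate why, the sketch below assumes that a hash-based lookup behaves roughly like "hash the input, then reduce it to an index into the seed list". That is an assumption for illustration only; the actual HASHLOV algorithm is not described in this article, and SHA-256 is used here purely as a stand-in. Under that assumption, two different inputs can land on the same seed row even when the seed list is larger than the input set.

```python
import hashlib


def hash_to_seed_index(value: str, seed_count: int) -> int:
    """Map an input value to a seed row index in [0, seed_count)."""
    digest = hashlib.sha256(value.encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big") % seed_count


def count_duplicates(inputs, seed_count: int) -> int:
    """Count inputs whose seed index was already taken by an earlier input."""
    used = set()
    duplicates = 0
    for value in inputs:
        index = hash_to_seed_index(value, seed_count)
        if index in used:
            duplicates += 1
        else:
            used.add(index)
    return duplicates


def expected_duplicates(input_count: int, seed_count: int) -> float:
    """Expected duplicates when inputs are spread uniformly over the seeds."""
    distinct = seed_count * (1 - (1 - 1 / seed_count) ** input_count)
    return input_count - distinct


if __name__ == "__main__":
    # 400,000 inputs over 1,000,000 seeds, the sizes quoted in this article.
    print(round(expected_duplicates(400_000, 1_000_000)))
```

Under this uniform-hash model, 400,000 distinct inputs spread over 1,000,000 seed rows would still be expected to produce about 70,000 duplicates, so enlarging the seed list reduces collisions but cannot eliminate them.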

If you want uniqueness, then you’ll need to create a lookup/cross-reference table.

One column contains the “original” number and the second column contains the “obfuscated” number.

You can implement this in two ways:

(1) Using the existing GTSRC_XREF and specifying masking routines that leverage XREF

(2) Using your own table and using SQLFUNCTION masking to SELECT the obfuscated value based on the original.
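The lookup-table idea behind option (2) can be sketched with an in-memory SQLite database. Everything here is illustrative: the table name card_xref, its columns, and the helper functions are hypothetical, this is not the layout of GTSRC_XREF, and on the mainframe the table would live in Db2 with the SELECT supplied through SQLFUNCTION masking.

```python
import sqlite3


def build_xref(conn, originals, seeds):
    """Create a one-to-one original -> obfuscated lookup table."""
    conn.execute(
        "CREATE TABLE card_xref ("
        " original TEXT PRIMARY KEY,"
        " obfuscated TEXT NOT NULL UNIQUE)"
    )
    # Deduplicate the originals while keeping first-seen order.
    distinct = list(dict.fromkeys(originals))
    if len(distinct) > len(seeds):
        raise ValueError("not enough seed values for the distinct originals")
    # Each distinct original gets the next unused seed value.
    conn.executemany(
        "INSERT INTO card_xref (original, obfuscated) VALUES (?, ?)",
        zip(distinct, seeds),
    )
    conn.commit()


def mask(conn, original):
    """Look up the obfuscated value for one original value."""
    row = conn.execute(
        "SELECT obfuscated FROM card_xref WHERE original = ?",
        (original,),
    ).fetchone()
    if row is None:
        raise KeyError(original)
    return row[0]
```

Because each original is assigned its own seed row and the obfuscated column is declared UNIQUE, two different originals can never receive the same obfuscated value, and the same original always maps to the same result.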

We do have a more recent “reflov” function that would have worked, but unfortunately it has not been implemented on Mainframe yet.

https://techdocs.broadcom.com/us/en/ca-enterprise-software/devops/test-data-management/4-9/reference/masking-functions-and-parameters.html