FORMATENCRYPT and the use of the ZEROREPLACEFORMATENCRYPT option
search cancel

FORMATENCRYPT and the use of the ZEROREPLACEFORMATENCRYPT option

book

Article ID: 372879

calendar_today

Updated On:

Products

CA Test Data Manager (Data Finder / Grid Tools)

Issue/Introduction

Use Case:

Mask data contained in multiple flat files, and database tables consistently across all data sources, using the FORMATENCRYPT masking function.

Issue:

All the data contained in the flat files have already been masked, and uploaded for use by our testers. The numeric data was masked as Character Data Types, and not Numeric Data Types, so all 9-digits of the original values were masked. We need a way to mask all 9-digits of the database table, in the same manner as the flat files. Some of the SSN masked data contains a leading 0.

Example:

Here we are working on numeric data types, in a database table, and due to its characteristic the leading zeros will be eliminated then actual shape of our SSN as below

  • 12
  • 123
  • 1234
  • 12345
  • 123456
  • 1234567

While masking, can option ZEROREPLACEFORMATENCRYPT=Y consider the source values as below and gives me appropriate masked output value with leading 7?

  • 000000012
  • 000000123
  • 000001234 
  • 000012345 
  • 000123456
  • 001234567

 As per the documentation in Broadcom website - Masking Functions and Parameters:

Applies to:

Characters and Numbers.
Review the following considerations:
    • For numeric columns, FORMATENCRYPT ignores the first digit of input values. This is to avoid the generation of a masked value with leading zeroes, which databases typically truncate, and which can then become identical to another value.
      This rule does not apply to character columns, because databases do not truncate character values.

Since the database already eliminates leading zero's for numeric columns, what is the reason for FORMATENCRYPT ignores the first digit? In our case since we are working with multiple data sources, such as flat files, and database tables, and the SSN data exist in both varchar, and integer data types, what is the best way to maintain data integrity across all data sources? We are seeing a mismatch in masked data for the database table.

Environment

All Supported releases of FDM

Cause

The reason you're seeing a mismatch in masked data across the flat files and the database tables is because FORMATENCRYPT handles character data, and numeric data differently, as you have already pointed out. When using FORMATENCRYPT, the data type is an important part. FDM relies on the definition file, to understand the flat file it is masking. Currently, all data in a flat file is treated as either a character data type (varchar), or a date data type, if you specify the date/time format. Therefore, when setting up the masking configuration, you need to keep this in mind. There are two factors that play a large role in ensuring masking consistency across data sources:

  1. Consistent input data.
  2. Proper masking configuration

 

Why does FORMATENCRYPT ignore the first digit of a numeric value?

When working with database tables and the numeric data types, you will never have a leading zero, because the database will trim the numeric value and remove any leading zero, as your tests show. 

For example, say you input the value 001234567 into your database table with a Numeric Data Type. The database trims the value, dropping the leading zeros, so the actual value in the database table is 1234567. You then mask this value, 1234567 using FORMATENCRYPT. Since this is a Numeric Data Type, FORMATENCRYPT will ignore the first digit to ensure we do not replace the 1 with a 0, and mask the remaining 6 digits. So our masked value is something like 1098765, which resembles the original input data, containing a 7 digit value. Otherwise, you run the risk of the database, dropping one or more digits after we commit the masked value.

If we didn't ignore the first digit, what would happen if the FORMATENCRYPT algorithm generated the value of 0007832? Once this value is committed to the database, the database would trim the leading zeros, and we would have a masked value of 7832. A 4-digit number instead of a 7-digit number.

This is extremely important when masking something such as a Social Security numbers, because a valid looking Social Security number will need to have 9 digits. 

 

Using the  ZEROREPLACEFORMATENCRYPT option to mask all digits of a numeric data type:

In the context of "How to make FORMATENCRYPT mask all digits of a Numeric value in a Database table?" If we apply the same logic as above, the database would strip the leading zeros before FORMATENCRYPT would ever see them, so the values presented to FORMATENCRYPT would actually be:

  • 12
  • 123
  • 1234 
  • 12345 
  • 123456
  • 1234567

With ZEROREPLACEFORMATENCRYPT=Y enabled, FORAMTENCRYPT will still provide you with a masked value containing the same number of digits provided by your input data. If by chance the algorithm generates a value containing a leading zero, we will replace the leading 0 with a 7 to ensure we maintain the same number of digits as the input data. So say FORMATENCRYPT masked the value 12 as 09, with  ZEROREPLACEFORMATENCRYPT=Y enabled, the committed value to the database would be 79.

However, simply using the ZEROREPLACEFORMATENCRYPT option to mask all digits of a numeric column is not fail proof. You still run the risk of data mismatch for those instances where the varchar data types, contain a leading 0, and corresponding numeric data types containing a leading 7.

 

Resolution

Best practice:

The proper solution is to ignore the first digit, when masking any numeric fields in the flat file, to ensure we get the same results as we are seeing in the database tables.

So when masking the flat file, where you have numeric data types, you need to set Parm1 (Ignore first nn characters) to 1. 


Additional Information

The ZEROREPLACEFORMATENCRYPT options doesn't exist in the FDM UI, so you can't run the masking job from the UI. If you try running the masking job from the FDM UI, when you save the options file to run the job, the options file will be overwritten, removing the ZEROREPLACEFORMATENCRYPT=Y, and resulting in a normal FORMATENCRYPT job, for a Numeric Data Type, and the first digit will be ignored.

You have to do the following:

  1. Open the FDM UI and configure the masking job for your database table.
  2. Set up all the options in the FDM Options tab.
  3. At the Summary tab, only save the masking configuration, and the options file, but do not run the masking job.
  4. Open a file explorer, and navigate to the location where the FDM masking csv file and options file are saved.
  5. With a Text editor of your choice, edit the options file, and manually add the ZEROREPLACEFORMATENCRYPT=Y line to the options file, and save the file.
  6. Open a Windows Commend line (cmd), and navigate to the location where the masking csv file, options file, and batch file (bat) are saved. 
  7. Execute the bat file. You will see the results of the masking job inside the terminal windows where you executed the batch file.