We have a requirement to mask addresses using FDM. However, the masking should choose data from the seed list based on Country Code, so that Canadian addresses are masked with Canadian addresses from the seed list, US addresses are masked with US seed data, and so on. One of the main problems we face occurs when we try to mask a row where the Country Code is missing or does not match any Country Code in the seed list. When this happens, the masking job skips those rows and they are not masked.
We would like to know how we should be configuring the HASHLOV1 masking function for this scenario, and how to set a default value.
Release : 4.10
Assuming you have data to mask that can be grouped into buckets, and you want to mask the data within the limits of the bucket, let's take an example:
Suppose you have a table that contains at least 2 columns. The first column contains cities in the US like Los Angeles and Chicago, and the second column contains States like California and Texas.
You want to mask cities in the US, and you have a seed list of US cities to mask your data with. You want to mask the cities in each state with cities from that same state. For example, you do not want to mask a city in California with a city in Nevada.
You also need a seed list that stores states with their cities to use for masking.
From the FDM UI you should select a Data Category that contains states with their cities.
A hash value is generated based on the city name to allow FDM to pick up the new city from the seed list.
Restrict Values Column: represents the column that is used to match against the Seed Column Bucket ID from the seed list. In this example, it is the column where state names are stored.
When we mask a city in a state, we generate the hash value that represents the city. We then use the Restrict Values Column to get the bucket of cities for that particular state from the seed list. Applying the hash value to that specific bucket in the seed list, we retrieve the city that is used for masking.
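The lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not FDM's actual implementation: the hash algorithm used by HASHLOV1 is not described here, so SHA-256 is used as a stand-in, and the names SEED_LIST, stable_hash and mask_city are hypothetical.

```python
import hashlib

# Hypothetical seed list: bucket (state) -> cities available for masking.
SEED_LIST = {
    "California": ["Fresno", "Sacramento", "San Diego"],
    "Texas": ["Austin", "El Paso", "Houston"],
}


def stable_hash(value: str) -> int:
    """Return a repeatable integer hash of the input (SHA-256 is an assumption)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def mask_city(city: str, state: str) -> str:
    """Pick a replacement city from the bucket that matches the state."""
    bucket = SEED_LIST[state]              # restrict values to this state's bucket
    index = stable_hash(city) % len(bucket)  # hash selects an entry in the bucket
    return bucket[index]


masked = mask_city("Los Angeles", "California")
```

Because the hash is computed from the original city name, the same input always maps to the same replacement, and the replacement always comes from the same state's bucket.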
If the seed list has no Seed Column Bucket entry that matches the value in the Restrict Values Column, you can add an extra entry to the seed list: set the Seed Column Bucket to the value DEFAULT and set the Seed Column to the default value to use for masking, for example Timbuktu.
Note: When configuring your seed list to contain a DEFAULT string, the DEFAULT string must be placed in the same column used as your Seed Bucket Column. Otherwise, when the masking job runs, FDM will not find the default value, and the rows where a match cannot be made will not be masked.
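As a rough illustration of the DEFAULT fallback, the sketch below models a seed list as (bucket, value) rows and falls back to the DEFAULT bucket when the state is missing or unmatched. It is not FDM code: the row layout, the SHA-256 hash, and the names SEED_ROWS, build_buckets and mask_city are assumptions made for the example.

```python
import hashlib

DEFAULT_BUCKET = "DEFAULT"

# Hypothetical seed list rows: (bucket column, seed column).
# The DEFAULT string sits in the bucket column, next to the real state names.
SEED_ROWS = [
    ("California", "Fresno"),
    ("California", "Sacramento"),
    ("Texas", "Austin"),
    (DEFAULT_BUCKET, "Timbuktu"),
]


def build_buckets(rows):
    """Group seed values by their bucket value."""
    buckets = {}
    for bucket, value in rows:
        buckets.setdefault(bucket, []).append(value)
    return buckets


def mask_city(city, state, buckets):
    """Mask within the state's bucket, falling back to the DEFAULT bucket."""
    key = state if state in buckets else DEFAULT_BUCKET
    bucket = buckets.get(key)
    if not bucket:
        # No matching bucket and no DEFAULT entry: the value stays unmasked.
        return city
    digest = hashlib.sha256(city.encode("utf-8")).digest()
    return bucket[int.from_bytes(digest[:8], "big") % len(bucket)]


buckets = build_buckets(SEED_ROWS)
unmatched = mask_city("Reno", "Nevada", buckets)  # no Nevada bucket in the seed list
missing = mask_city("Reno", None, buckets)        # state value is missing
```

In this sketch, removing the DEFAULT row from SEED_ROWS makes mask_city return the original city for unmatched rows, which mirrors the unmasked rows described in the problem statement.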