FDM - MongoDB masking fails with "Sampling was not enough to generate metadata for column"

Article ID: 411407


Updated On:

Products

CA Test Data Manager (Data Finder / Grid Tools)

Issue/Introduction

When using Fast Data Masker (FDM) to mask MongoDB collections, several of the masking jobs fail with sampling errors.

Example error messages from two collections:

<HOSTNAME>:<PORT>_<COLLECTION>: ValidationWorker-<NUMBER> - Sampling was not enough to generate metadata for column <COLUMN> into table <TABLE>. Number of Samples: '2000'

<HOSTNAME>:<PORT>_<COLLECTION>: ValidationWorker-<NUMBER> - Sampling was not enough to generate metadata for column <COLUMN> into table <TABLE>. Number of Samples: '100'. (2) <HOSTNAME>:<PORT>_<COLLECTION>: Masking process exited with non-zero code: 1

Environment

FDM 4.11.202.0 or greater

Cause

An analysis of the complete collections showed that no document contained the properties mapped to the failing columns. Because there is nothing to sample for these columns, FDM always returns the sampling error for them.
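To confirm this cause on your own collection, you can check whether any document contains the property at all. The following is a minimal sketch, not part of FDM: it assumes access to the collection through PyMongo and uses its `count_documents` method with the MongoDB `$exists` operator. The helper name `property_is_present`, the connection string, and the database, collection, and field names are placeholders chosen for illustration.

```python
from typing import Any, Protocol


class SupportsCountDocuments(Protocol):
    """Any object exposing PyMongo's Collection.count_documents method."""

    def count_documents(self, filter: dict[str, Any], **kwargs: Any) -> int: ...


def property_is_present(collection: SupportsCountDocuments, field: str) -> bool:
    """Return True if at least one document in the collection contains `field`.

    `field` may use MongoDB dot notation (for example "address.zip").
    limit=1 lets the server stop counting at the first match.
    """
    return collection.count_documents({field: {"$exists": True}}, limit=1) > 0


# Usage against a live server (all names below are placeholders):
#
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://localhost:27017")
#   print(property_is_present(client["mydb"]["customers"], "ssn"))
```

If the check returns False for a failing column, the situation matches the cause described here. Note that `$exists` also matches documents where the property is present with a null value.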

Resolution

In this case, there are two ways around the sampling error:

  1. Use the MONGODBSAMPLEMISSINGFIELDFAILURE option to ignore the sampling error. This can be done by setting MONGODBSAMPLEMISSINGFIELDFAILURE=Y in the FDM options.

  2. Remove the column mapping for the failing property from the FDM masking configuration.

Notes to help you decide which method is right for your use case:

  • If you expect that the property could appear in a new document in the future, and you are concerned that removing the column from the masking configuration would leak the unmasked property, we suggest keeping the mapping and using the MONGODBSAMPLEMISSINGFIELDFAILURE option. The masking job then won't fail, and columns for which no samples are found are skipped.

  • However, if you prefer not to rely on the MONGODBSAMPLEMISSINGFIELDFAILURE option, remove the mapping. The job will no longer fail, but the property will not be masked if it appears in a future document.

 

Additional Information

Other considerations:

  1. Getting the sampling error when the property doesn't exist at all is NOT a defect. It shouldn't be reported; instead, remedy it with one of the methods discussed above.

  2. Getting the sampling error when at least one document with the property exists IS a defect and should be reported by opening a Support case. Please include a sample (JSON file) of the whole collection being masked and a copy of the masking configuration (CSV file).
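Before opening a case, it can help to check the JSON sample you plan to attach against the two considerations above. The sketch below is an illustration only and makes several assumptions: the export is either a JSON array of documents or one JSON document per line, property names use dot notation for nested fields, and the helper names (`load_documents`, `has_path`, `count_documents_with`, `triage`) are hypothetical. It does not handle numeric array indexes in paths.

```python
import json
from typing import Any


def load_documents(text: str) -> list[dict[str, Any]]:
    """Parse either a JSON array of documents or one JSON document per line."""
    stripped = text.strip()
    if not stripped:
        return []
    if stripped.startswith("["):
        return json.loads(stripped)
    return [json.loads(line) for line in stripped.splitlines() if line.strip()]


def has_path(value: Any, parts: list[str]) -> bool:
    """Return True if the dotted path exists in value, descending into arrays."""
    if not parts:
        return True
    if isinstance(value, list):
        return any(has_path(item, parts) for item in value)
    if isinstance(value, dict) and parts[0] in value:
        return has_path(value[parts[0]], parts[1:])
    return False


def count_documents_with(docs: list[dict[str, Any]], path: str) -> int:
    """Count the documents that contain the property named by the dotted path."""
    parts = path.split(".")
    return sum(1 for doc in docs if has_path(doc, parts))


def triage(docs: list[dict[str, Any]], path: str) -> str:
    """Classify a sampling error for `path` following the two rules above."""
    found = count_documents_with(docs, path)
    if found == 0:
        return "expected: property absent from every document"
    return f"possible defect: property present in {found} document(s)"
```

For example, `triage(load_documents(open("export.json").read()), "address.zip")` (file and property names are placeholders) reports whether the failing property occurs anywhere in the export. A "possible defect" result only means the property was found in the sample; Support still needs the sample and the masking configuration to confirm.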