We would like to run a PoC that involves the following within Hadoop (Hive and/or HDFS):
1. Create connections to Hadoop environments
2. Create data models for data in HDFS/Hive
3. Tag/mark PHI/PII data
4. Assign masking functions
5. Execute masking jobs
etc.
Can you point me to the latest set of JAR files that we must install, and tell us what level of access the TDM user would require in the environment (edge node, server, etc.)?
Also, what is the step-by-step process a user must go through, including the interface details, to mask data within the Hadoop environment(s)?
Would it be possible for someone to actively guide us through setting up the environment and verifying that the setup is correct (the steps to be performed)?
Release : 4.9.1 Test Data Manager
Component : Hadoop Integration
The Non-Relational Data Sources section of the Supported Data Sources page shows very limited support for Hadoop (Hive).
See https://techdocs.broadcom.com/us/en/ca-enterprise-software/devops/test-data-management/4-9/installing/supported-data-sources.html
To better set expectations for your PoC and answer your specific questions: