Integration Wizard Best Practices

Article ID: 171346

Products

Information Centric Analytics Data Loss Prevention Core Package

Issue/Introduction

Applying best practices when working with ICA's Integration Wizard

Environment

SQL Server

CAUTION: These commands assume a basic understanding of SQL. Any change to the database can affect the operation of the product. ALWAYS ENSURE YOU HAVE A BACKUP OF YOUR DATABASE BEFORE MAKING CHANGES.

In order for Symantec to provide full support of the schema and the database environment, you must run the database version specified for your version of ICA in the System Requirements Guide. Please see the limitations section at http://www.symantec.com/docs/TECH220433

Resolution

  1. Review the column names of the integration mapping entity before developing the Data Source Query for the integration. The entity documents the data each column will accept, along with the data type and length of each column, giving you the opportunity to plan for any data type mismatches or data sizing issues up front.
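One way to review an entity's columns is to query the database's metadata views. The following is an illustrative sketch only; the staging table name shown is a placeholder and will differ in your environment.

```sql
-- Sketch: list the columns, data types, and lengths for an
-- integration entity's table (substitute the actual table name).
SELECT COLUMN_NAME,
       DATA_TYPE,
       CHARACTER_MAXIMUM_LENGTH,
       IS_NULLABLE
FROM   INFORMATION_SCHEMA.COLUMNS
WHERE  TABLE_NAME = 'ComputerEndpoints'  -- hypothetical entity table
ORDER BY ORDINAL_POSITION;
```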
  2. When creating the data source query, alias the source columns so their names match the column names defined for the integration entity. When column names match, the Integration Wizard automatically maps the source columns to the target columns, eliminating the need to map them manually when defining the integration mapping.
    • For example, if you are using the computer endpoints entity to load computer endpoints, and the primary key column in your source system is called AssetTag and will be mapped to the column SourceComputerKey, we recommend aliasing the source column as SourceComputerKey. 
    • Similarly, if your source system stores the serial number for a computer endpoint in a column named SerialNo, alias the column as SerialNumber so you can take advantage of the automapping feature when associating a data source query with an integration mapping. 
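The two aliasing examples above can be sketched in a single source query. The source table and column names here are illustrative, not actual ICA objects.

```sql
-- Sketch: rename source columns to match the integration entity's
-- column names so the Integration Wizard automaps them.
SELECT AssetTag AS SourceComputerKey,  -- source primary key
       SerialNo AS SerialNumber        -- source serial number
FROM   dbo.SourceAssetInventory;       -- hypothetical source table
```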
  3. To prevent data type errors during integration processing, use CAST statements in your SQL to minimize the risk of data size conflicts when loading data into ICA. A sample CAST statement illustrating this concept is listed below.
    • The following statement casts the column sAMAccountName to an NVARCHAR(256) field and aliases it as AccountName. This ensures the selected data is capped at 256 characters, and the AccountName alias allows the column to be mapped automatically when the query is associated with an integration mapping.
      • CAST ([sAMAccountName] as NVARCHAR(256)) as AccountName
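The casting and aliasing techniques combine naturally in a full source query. This is a hedged sketch: the source table and the additional target column names are hypothetical, chosen only to show the pattern of capping each string column's size while matching the entity's column names.

```sql
-- Sketch: cap each string column at the entity's defined length and
-- alias it to the entity's column name so it automaps.
SELECT CAST([sAMAccountName] AS NVARCHAR(256)) AS AccountName,
       CAST([displayName]    AS NVARCHAR(256)) AS FullName,      -- hypothetical target
       CAST([mail]           AS NVARCHAR(256)) AS EmailAddress   -- hypothetical target
FROM   dbo.ActiveDirectoryUsers;                                 -- hypothetical source
```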
  4. A number of formulas ship out of the box that can supplement your integration; the most commonly used converts epoch time to SQL Server time. Because some systems store time in epoch format, ICA includes a predefined formula entitled 'Convert epoch time to SQL Server datetime' for this conversion.
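The predefined formula is the recommended approach; for context, the underlying conversion can be expressed in plain T-SQL, since epoch values count seconds since 1970-01-01 00:00:00 UTC. The epoch value below is an arbitrary example.

```sql
-- Sketch of what an epoch-to-datetime conversion does in T-SQL:
SELECT DATEADD(SECOND, 1700000000, '1970-01-01');
-- returns 2023-11-14 22:13:20 (UTC)
```

Note that DATEADD takes an INT increment, so second-based epochs beyond 2038 exceed its range.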
  5. When defining the integration mappings for a column, you can specify a fixed value, a source column from a query, or you can use a preexisting formula / create a new custom formula. 
  6. When creating a formula, enclose each variable in '{ }'. This defines the variable correctly and prompts ICA to ask for a column value to pass at run time. Formulas are applied as data moves from staging into the preprocessing tables.
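As an illustration of the brace syntax, a custom formula body performing the epoch conversion from step 4 might look like the following. The variable name {EventTimeEpoch} is hypothetical; ICA prompts you to bind it to a source column at run time.

```sql
-- Illustrative custom formula body; {EventTimeEpoch} is a
-- formula variable bound to a source column at run time.
DATEADD(SECOND, {EventTimeEpoch}, '1970-01-01')
```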
  7. A data source query defined within ICA can run on an intraday schedule. If it does, be sure the integration mapping is also configured to run intraday: the data source query schedule is independent of the integration mapping schedule, so the mapping must be explicitly configured to run as part of the intraday process.
  8. If you would like integration mappings to run in a specific order, establish a run order when configuring each mapping. The run order dictates the sequence in which the mappings execute.
  9. Determine whether a watermark is required for a data source query. Without a watermark, the staging table is truncated and reloaded nightly. With a watermark, the staging table is never truncated, so data will continue to accumulate in it.
  10. When a watermark is defined for a data source query, ensure you define a data retention policy for the staging table. A retention policy helps manage how much information is retained in the staging tables. If no watermark is defined for the integration, no retention policy is needed.