A data source query job (RiskFabric_IW_DataSourceQueryID_<n>) is failing during staging (step 5) and logging the following error:
String or binary data would be truncated.
Release : 6.x
One of the columns in the data source contains a string value that exceeds the length of the destination column to which it is mapped in the staging table.
Information Centric Analytics (ICA) has the capability and flexibility to handle data from a variety of sources and source types, and to transform data as needed through import rule mapping formulas; however, it should not be expected to replace dedicated Extract, Transform, and Load (ETL) tools. To resolve this type of error, transform, trim, or truncate the data as close to its source system as possible.
For example, suppose you have a data source that contains event data that you want to integrate with ICA, but the source does not provide an API that ICA supports. To make the data available to ICA, they are first extracted into a SIEM like Splunk which is then integrated with ICA. After successfully adding Splunk as a data source in ICA, creating a data source query, and integration rule mappings, the data source query job runs successfully and imports data as expected into ICA; however, a few weeks later, you discover the job is now failing and logging the error captured in the introduction of this article. You discover the cause is an excessively long value in a machine description column. To correct this, consider applying a transformation to the column at any of the following points (in recommended order):
For flat file integrations or SQL integrations in which data are initially provided by a flat file extract or SSIS package, transformations should ideally be handled at these stages of the ETL process.