Data source query job staging failure

Article ID: 273609


Products

Information Centric Analytics

Issue/Introduction

A data source query job (RiskFabric_IW_DataSourceQueryID_<n>) is failing during staging (step 5) and logging the following error:

String or binary data would be truncated.

Environment

Release: 6.x

Cause

One of the columns in the data source contains a string value that exceeds the length of the destination column to which it is mapped in the staging table.
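
To confirm which column is responsible, compare the longest value in the source data against the defined length of the destination column in the staging table. The following T-SQL is a minimal sketch; the object names (dbo.SourceEventData, MachineDescription, and the staging table name) are placeholders to replace with the objects from your environment.

```sql
-- Placeholder names: substitute your source table, staging table, and column.
-- Longest value currently present in the source column:
SELECT MAX(LEN(MachineDescription)) AS MaxSourceLength
FROM dbo.SourceEventData;

-- Defined length of the destination column in the staging table:
SELECT CHARACTER_MAXIMUM_LENGTH
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'YourStagingTable'
  AND COLUMN_NAME = 'MachineDescription';
```

If the first result exceeds the second, that column is the one to transform.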

Resolution

Information Centric Analytics (ICA) is flexible enough to handle data from a variety of sources and source types, and to transform data as needed through import rule mapping formulas; however, it should not be expected to replace dedicated Extract, Transform, and Load (ETL) tools. To resolve this type of error, transform, trim, or truncate the data as close to its source system as possible.

For example, suppose you have a data source containing event data that you want to integrate with ICA, but the source does not provide an API that ICA supports. To make the data available to ICA, the data are first extracted into a SIEM such as Splunk, which is then integrated with ICA. After you successfully add Splunk as a data source in ICA and create a data source query and import rule mappings, the data source query job runs successfully and imports data into ICA as expected; a few weeks later, however, you discover the job is failing and logging the error captured in the introduction of this article. The cause is an excessively long value in a machine description column. To correct this, consider applying a transformation to the column at any of the following points (in recommended order):

  1. In the data source system
  2. If that's not possible or desirable, in an intermediary layer (in this example, Splunk)
  3. If not in the intermediary system, in the data source query within ICA (for example, using conversion functions in Splunk's Search Processing Language (SPL))
  4. If not in the data source query, in an import rule mapping formula (if the failure is occurring during the RiskFabric Processing job and not during the data source query job)
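
As a sketch of option 3, the truncation could be applied directly in the data source query using SPL's eval command with the substr function. The field name MachineDescription and the length 256 are assumptions; use the actual field name and the destination column's defined length from your environment.

```spl
... your existing search ...
| eval MachineDescription=substr(MachineDescription, 1, 256)
```

Applying the trim here keeps every downstream value within the staging column's limit without modifying the source system.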

For flat file integrations or SQL integrations in which data are initially provided by a flat file extract or SSIS package, transformations should ideally be handled at these stages of the ETL process.
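
For instance, if an SSIS package populates the staging data, the trim can be applied in the package's source query rather than downstream in ICA. This is a minimal sketch with placeholder object names and an assumed destination length of 256:

```sql
-- Hypothetical source query for an SSIS data flow; object names are placeholders.
SELECT
    EventID,
    -- Truncate to the destination column's defined length (assumed 256 here)
    LEFT(MachineDescription, 256) AS MachineDescription
FROM dbo.SourceEventData;
```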