FAQs regarding the EDM Indexing Process

Article ID: 160187

Products

Data Loss Prevention Endpoint Prevent

Data Loss Prevention Network Monitor

Data Loss Prevention Network Prevent for Email

Data Loss Prevention Enforce

Data Loss Prevention Network Discover

Data Loss Prevention Network Prevent for Web

Data Loss Prevention Network Protect

Data Loss Prevention Endpoint Discover

Issue/Introduction

Questions about the EDM indexing process, such as:

What files are created during the EDM indexing process? 

Where are they created? 

How much space do they take up? 

How much memory is used in creating them?

Resolution

What files are created during the EDM indexing process?

Two kinds of files are created during the EDM indexing process: one .idx file and one or more .rdx files. The .idx file is temporary and is removed once indexing completes.

Where are they created?

They are stored in the index directory: \Vontu\Protect\index on Windows (\SymantecDLP\Protect\index in DLP 11.6) or /var/Vontu/Protect/index on Linux (/var/SymantecDLP/Protect/index in DLP 11.6). The original data file remains in the datafiles directory during the indexing process and is removed once indexing completes.
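As a quick reference, the four locations above can be captured in a small lookup. This is only a sketch: the function name `index_dir` and the `symantec_layout` flag are hypothetical, and the Windows paths are left without a drive letter because the article does not name one.

```python
# Index directory locations exactly as listed in this article.
# Keys are (operating system, uses the DLP 11.6 "SymantecDLP" layout).
INDEX_DIRS = {
    ("windows", False): r"\Vontu\Protect\index",
    ("windows", True): r"\SymantecDLP\Protect\index",
    ("linux", False): "/var/Vontu/Protect/index",
    ("linux", True): "/var/SymantecDLP/Protect/index",
}


def index_dir(os_name: str, symantec_layout: bool) -> str:
    """Return the index directory for the given OS and directory layout.

    os_name is "windows" or "linux" (case-insensitive).
    symantec_layout is True for the DLP 11.6 paths, False for the Vontu paths.
    """
    key = (os_name.lower(), symantec_layout)
    if key not in INDEX_DIRS:
        raise ValueError(f"unsupported operating system: {os_name!r}")
    return INDEX_DIRS[key]


print(index_dir("linux", True))
```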

How much space do they take up?

.idx

The size of the .idx file is: .idx file size = # rows x # columns x 25. This file takes up disk space only for the duration of the indexing process. Once indexing is done, the .idx file is automatically deleted.
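The formula above can be turned into a quick estimate when planning temporary disk space. The sketch below is illustrative only: the function name `estimate_idx_size` is hypothetical, and it assumes the result is in bytes, which the article does not state explicitly.

```python
def estimate_idx_size(rows: int, columns: int, factor: int = 25) -> int:
    """Estimate the temporary .idx file size as rows x columns x 25.

    Assumption: the result is treated as a byte count; the article
    gives the formula without naming a unit.
    """
    if rows < 0 or columns < 0:
        raise ValueError("rows and columns must be non-negative")
    return rows * columns * factor


# Example: a data source with 1,000,000 rows and 10 columns.
size = estimate_idx_size(1_000_000, 10)
print(size)  # 250000000
```

Under the byte assumption, a data source of this shape would need roughly 250 MB of temporary space in the index directory while indexing runs.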

.rdx

.rdx files can get very large, but in general they are smaller than the .idx file. Symantec DLP may need to create several of these files because of 32-bit platform limitations. The file size limit is defined by the max_loaded_index_memory setting in the Indexer.properties file on Enforce.

The theoretical maximum is 2 GB on Windows and 3 GB on Linux. The difference is due to how the 4 GB virtual address space of a 32-bit process is split between kernel mode and user mode. The default Symantec DLP installation does not take advantage of the larger limit on Linux; however, the Indexer.properties file can be changed to use the additional 1 GB available there.

When the EDM profile index doesn't fit in a single .rdx file, the files are named as follows:

DataSource.#.#.rdx.0

DataSource.#.#.rdx.1

DataSource.#.#.rdx.2

etc. 
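When checking an index directory, it can help to group the numbered parts that belong to the same index. The following is a minimal sketch based only on the naming pattern shown above; the helper name `group_rdx_parts` and the sample file names (with `1.2` standing in for the `#.#` placeholders) are hypothetical.

```python
import re
from collections import defaultdict

# Matches names such as "DataSource.1.2.rdx.0": a base ending in ".rdx"
# followed by a numeric part suffix.
RDX_PART = re.compile(r"^(?P<base>.+\.rdx)\.(?P<part>\d+)$")


def group_rdx_parts(filenames):
    """Group multi-part .rdx file names by base name, ordered by part number."""
    groups = defaultdict(list)
    for name in filenames:
        match = RDX_PART.match(name)
        if match:  # names that do not follow the pattern are ignored
            groups[match.group("base")].append((int(match.group("part")), name))
    # Sort numerically so that part 10 comes after part 2.
    return {base: [n for _, n in sorted(parts)] for base, parts in groups.items()}


# Hypothetical directory listing.
names = [
    "DataSource.1.2.rdx.10",
    "DataSource.1.2.rdx.2",
    "DataSource.1.2.rdx.0",
    "notes.txt",
]
print(group_rdx_parts(names))
```

Sorting on the integer suffix rather than on the file name avoids the lexical ordering problem where `.rdx.10` would otherwise sort before `.rdx.2`.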

How much memory is used in creating them?

EDM indexes are created in two phases. Phase 1, which produces the .idx file, does not use much memory. Phase 2, which produces the .rdx files, is more memory-intensive: it needs about 1.9 GB of RAM in addition to what Enforce normally requires.

Additional information:

In Phase 1 the .idx file is created. This is an intermediate file, which is removed once indexing is complete. The .idx files are stored in \Vontu\Protect\index on Windows (\SymantecDLP\Protect\index in DLP 11.6) or /var/Vontu/Protect/index on Linux (/var/SymantecDLP/Protect/index in DLP 11.6).

Phase 1 reads the input file sequentially, finds the patterns, and normalizes the data. Phase 1 does not use much memory. The size of the .idx file is:

.idx file size = # rows x # columns x 25.

In Phase 2 the index files are created. This is a memory-intensive process. The final files have the .rdx extension and are named in the format DataSource.#.#.rdx.#

These files are stored in the same index directory and later replicated to all Symantec Data Loss Prevention servers.