We are planning a new CA Directory project. Which tool should we use to bulk load the initial LDAP database: ldapadd/ldapmodify or dxmodify?
Release: 14.x
Component: CA Directory
For a new CA Directory implementation, the first step is to confirm or create a data model for the data that the LDAP service will provide.
At times, the Data Architect of the project may find that a custom LDAP schema is needed to host the data appropriately. If so, the following product documentation can get you started:
or you may want to refer to the following KB:
How to Create a Custom CA Directory LDAP Schema for a New CA Directory Implementation?
to help plan your efforts as well.
The fastest way to load a large number of entries for a CA Directory implementation is to use the dxloaddb command.
The dxloaddb command, together with dxdumpdb, is often used to back up and restore a CA Directory LDAP service. It is commonly invoked as
dxloaddb dsaName ldifFile
See DXloaddb Tool -- Load a Data Store from an LDIF File for more details.
This LDIF file is very often a backup of a CA Directory LDAP service taken with the dxdumpdb command, as described in DXdumpdb Tool -- Export Data from a Datastore to an LDIF File. An LDIF file created by dxdumpdb also contains all the relevant CA Directory operational attributes that were not part of the original data set, which allows the LDAP service to be restored exactly.
For a new CA Directory LDAP service, an administrator will have to create an LDIF file from scratch to take advantage of the dxloaddb performance boost.
The entries in an LDIF file are, by design, organized in a Directory Information Tree (DIT) that consists of containers and sub-containers holding objects within them. As a result, building an LDIF file from scratch can be challenging, or at least tedious.
Because Excel and relational databases are so widely used, administrators often start collecting the initial data set in a CSV file. To ease the creation of an LDIF file, CA Directory provides the csv2ldif utility, which converts a CSV file into an LDIF file:
csv2ldif Tool -- Create an LDIF File from a CSV File
To see how this tool actually works, administrators are encouraged to try the democorp and unspsc samples that are included with a Directory Server installation under the samples subdirectory. Essentially, in addition to a properly formatted CSV file, an administrator needs to create a custom ldt (template) file. The following is the sample democorp.ldt under samples/democorp:
# organization node
dn:o=DEMOCORP,c=AU
objectClass:organization
# Division node
dn:ou=$1,o=DEMOCORP,c=AU
objectClass:organizationalUnit
# Department node
dn:ou=$2,ou=$1,o=DEMOCORP,c=AU
objectClass:organizationalUnit
# Person (leaf node)
dn:cn=$4 $5,ou=$2,ou=$1,o=DEMOCORP,c=AU
objectClass:inetOrgPerson
cn:$4 $5
sn:$5
title:$6 $7
telephoneNumber:$8 $9
description:$3
mail:$4.$5@DEMOCORP.com
postalAddress:$10 $11 $12\$$14 $13
postalCode:$15
This template shows that the DIT of the democorp DSA starts from an organization node that contains division nodes. A division node in turn contains department nodes, and person nodes are placed under the department nodes.
The $1, $2, ..., $15 reference the field numbers within the democorp.csv file.
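The field substitution described above can be pictured with a small Python sketch. It is not the csv2ldif implementation, only an illustration under two assumptions: that $N means the N-th field of a CSV row (counting from 1), and that \$ in the template stands for a literal dollar sign. The function name expand and the sample row are hypothetical.

```python
import re

# Matches either an escaped dollar (\$) or a field reference such as $4 or $15.
_TOKEN = re.compile(r"\\\$|\$(\d+)")


def expand(template_line, fields):
    # Replace each $N with the N-th field (1-based); an escaped \$ becomes "$".
    # This mirrors the template notation only; it is an assumption, not csv2ldif itself.
    def substitute(match):
        if match.group(1) is None:  # the token was the escaped form \$
            return "$"
        return fields[int(match.group(1)) - 1]

    return _TOKEN.sub(substitute, template_line)


# Hypothetical CSV row; the values are made up for illustration only.
row = ["Corporate", "Administration", "Manager", "Jane", "Doe"]

print(expand("dn:cn=$4 $5,ou=$2,ou=$1,o=DEMOCORP,c=AU", row))
print(expand("mail:$4.$5@DEMOCORP.com", row))
```

Because the pattern reads all the digits after the dollar sign, $10 through $15 are treated as fields 10 to 15 rather than as field 1 followed by another digit.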
In the setup script under samples/democorp, the command
csv2ldif -i1 15 democorp.ldt democorp.csv > democorp.ldi
skips the first line (-i1) and processes up to 15 fields per line of democorp.csv, using the template file democorp.ldt shown above, to create an unsorted democorp.ldi file. The output is left unsorted so that the CSV file itself does not have to be sorted while it is being prepared. Next,
ldifsort -u democorp.ldi democorp_sorted.ldi
checks the uniqueness of the entries in democorp.ldi and generates a sorted LDIF file; see ldifsort Tool -- Sort LDIF Records for more details.
The resulting democorp_sorted.ldi, being an LDIF file, can then be used with
dxloaddb democorp democorp_sorted.ldi
to load the data into the democorp DSA.
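To make the sort-and-uniqueness step before the load more concrete, here is a minimal Python sketch of the idea. It is not how ldifsort works internally. It assumes one plausible ordering (shallower DNs first, so that a parent precedes its children), compares DNs case-insensitively, and splits DNs on commas without handling escaped commas. The function names are hypothetical.

```python
def dn_depth(dn):
    # Naive depth: number of comma-separated components.
    # Assumes no escaped commas inside attribute values.
    return len(dn.split(","))


def sort_parents_first(dns):
    # Keep the first occurrence of each DN, collect any repeats,
    # and order the unique DNs so that shallower entries come first.
    seen = set()
    unique, duplicates = [], []
    for dn in dns:
        key = dn.lower()  # case-insensitive comparison is a simplification
        if key in seen:
            duplicates.append(dn)
        else:
            seen.add(key)
            unique.append(dn)
    # sorted() is stable, so entries at the same depth keep their input order.
    return sorted(unique, key=dn_depth), duplicates


unsorted_dns = [
    "cn=Jane Doe,ou=Administration,ou=Corporate,o=DEMOCORP,c=AU",
    "ou=Administration,ou=Corporate,o=DEMOCORP,c=AU",
    "o=DEMOCORP,c=AU",
    "ou=Corporate,o=DEMOCORP,c=AU",
    "o=DEMOCORP,c=AU",
]
ordered, repeated = sort_parents_first(unsorted_dns)
for dn in ordered:
    print(dn)
print("repeated:", repeated)
```

The sketch works on DNs only. A real LDIF file carries attributes with each entry, so the supported ldifsort tool remains the right choice for actual loads.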
In general, dxmodify is very similar to ldapadd/ldapmodify. Its most important advantage is that it ships with the CA Directory installation and is therefore an officially supported tool, whereas the usual ldapadd/ldapmodify tend to be open-source tools, which may raise compliance concerns for some enterprises.
Either way, these LDAP tools require input files to be prepared before they are invoked. See the following for more details, including examples of dxmodify input files:
DXmodify Tool -- Add New or Changed Information to a Directory
After the initial deployment of a CA Directory LDAP service, there may be times when a large number of updates must be made to the DIT. In this scenario, popular browser-type tools such as JXplorer and Apache Directory Studio may not be adequate. It is generally recommended to prepare input files that contain multiple entries and use dxmodify to apply the changes to the Directory.
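As an illustration of preparing such an input file, the following Python sketch generates standard LDIF change records (changetype: modify) for several entries at once. The DNs, attribute values, and function names are hypothetical. The sketch assumes plain ASCII values that need no base64 encoding, and it only builds the file content. How the file is passed to dxmodify is covered in the DXmodify documentation referenced above.

```python
def modify_record(dn, attribute, new_value):
    # Build one LDIF change record that replaces a single attribute value.
    return "\n".join([
        "dn: " + dn,
        "changetype: modify",
        "replace: " + attribute,
        attribute + ": " + new_value,
        "-",
    ])


def build_change_file(changes):
    # Records in an LDIF file are separated by a blank line.
    return "\n\n".join(modify_record(*change) for change in changes) + "\n"


# Hypothetical updates; the DNs and values are illustrative only.
changes = [
    ("cn=Jane Doe,ou=Administration,ou=Corporate,o=DEMOCORP,c=AU", "title", "Director"),
    ("cn=John Roe,ou=Administration,ou=Corporate,o=DEMOCORP,c=AU", "title", "Manager"),
]

print(build_change_file(changes), end="")
```

Generating the records from a list keeps the input file reproducible, and the same list can be reviewed or version-controlled before any change is applied to the Directory.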
For performance reasons, when the bulk load involves a complete container of a DIT, you may want to try the following approaches and see whether they perform better for your particular use case:
For an enterprise that uses containers such as organizational units to hold data beneath them, there may be times when a new organizational unit is formed and a large amount of data needs to be loaded into the overall Directory Information Tree. In this case, modifying the production LDAP database through a large batch of regular ldapadd-type operations can have a negative impact on its performance.
To address this scenario, there are two approaches that can help reduce the performance impact:
This allows an administrator to create a separate Data DSA for each new container that needs to be added to the DIT. All the new entries can then be put into an LDIF file, and a simple dxloaddb adds all the data to the DIT without incurring the regular ldapadd performance penalty.
A production LDAP service (as a DIT) is assumed to have failover/load balancing built in. Since the container is new to the DIT, we can assume that no production modifications will be applied to the new container or the entries beneath it. Therefore, we can use the following procedure against each of the replicated Data DSAs to merge the new container into the production DIT:
Assume the new LDIF has been prepared and sorted