This article addresses several general CA Directory questions.
Sizing Multi-write Queues
How big should the multi-write queue be to handle a day's outage?
If a multi-write DSA receives 50,000 updates per day, the queue size needed to cover a 24-hour outage can be estimated as:
50,000 x 2 (safety factor) = 100,000
The command would be:
set multi-write-queue = 100000;
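The same arithmetic can be sketched in a few lines of Python for other update rates or outage windows. This is only an illustration: the function name and the outage_hours parameter are hypothetical, and the only product syntax used is the "set multi-write-queue" command shown above.

```python
import math


def multi_write_queue_size(updates_per_day: int,
                           outage_hours: float = 24.0,
                           safety_factor: float = 2.0) -> int:
    """Estimate a queue size that covers an outage of the given length.

    Assumes updates arrive at a roughly even rate across the day.
    """
    updates_during_outage = updates_per_day * outage_hours / 24.0
    return math.ceil(updates_during_outage * safety_factor)


size = multi_write_queue_size(50_000)
print(f"set multi-write-queue = {size};")
```

For the figures above, the function returns 100000, which matches the command shown.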
Is this a total queue size?
No. Each DSA has a multi-write queue and the queue size parameter applies to each.
If the limit is reached for one queue, all of the operations queued for that DSA (which is marked MW-FAILED at that point) are discarded, and the DSA is then marked QUEUE-PURGED-OUT-OF-ORDER.
In the case of a multi-server failure that involves the preferred master, recovery procedures should be invoked.
How much memory does a multi-write queue require?
Each modify takes a minimum of 4 KB to store on a multi-write queue, so a queue size of 100,000 requires at least 400 MB of memory. Large modify requests may take more than 4 KB, but most typical requests fit within this size.
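As a rough sketch, the minimum memory figure can be computed as follows. The function and parameter names are hypothetical, the calculation rounds 1 MB to 1,000 KB as the figure above does, and the queues parameter assumes that each full queue costs the same amount.

```python
KB_PER_QUEUED_UPDATE = 4  # minimum storage per modify, from the figure above


def min_queue_memory_mb(queue_size: int, queues: int = 1) -> float:
    """Return the minimum memory in MB (1 MB = 1,000 KB) for full queues."""
    return queue_size * KB_PER_QUEUED_UPDATE * queues / 1000


print(min_queue_memory_mb(100_000))            # one full queue
print(min_queue_memory_mb(100_000, queues=2))  # two full queues
```

Because the queue size parameter applies to each queue separately, the worst case grows with the number of queues that fill at the same time.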
How fast is recovery?
Recovery is as fast as the I/O allows on the peer.
However, there is a point at which recovery procedures are much quicker than trying to catch up with a large number of updates. As a rule of thumb, if the number of updates (e.g. 200,000) is a fair proportion of the number of entries (e.g. 1,000,000), then it is better to invoke recovery procedures.
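The rule of thumb can be expressed as a simple ratio check. The sketch below is an assumption, not product behavior: the 0.2 threshold is taken only from the example figures above (200,000 updates against 1,000,000 entries) and should be tuned for the site.

```python
def prefer_recovery(pending_updates: int,
                    total_entries: int,
                    threshold: float = 0.2) -> bool:
    """Return True when queued updates are a fair proportion of all entries.

    The threshold is a hypothetical cut-off, not a documented product value.
    """
    if total_entries <= 0:
        raise ValueError("total_entries must be positive")
    return pending_updates / total_entries >= threshold


print(prefer_recovery(200_000, 1_000_000))  # the example from the text
print(prefer_recovery(5_000, 1_000_000))    # a small backlog
```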
Monitoring Multi-write Queues
DXmanager is the best way to visualize the queues. It is a very simple way of showing whether the maximum queue size is insufficient for the outages being incurred. Note that DXmanager is available as part of the Enterprise version of the product and may not be available if you are using a product that embeds eTrust Directory. For any queries, please contact your Broadcom account representative.
With respect to queue monitoring (all of the following are in the eTrust Directory Administrator Guide):
Multi-write and Slow Links
When should multi-write groups be used?
In any situation where there are slow links. Typically, multi-write groups are organized by region.
Are multi-write groups a type of cascaded replication?
Yes, multi-write groups introduce an extra step in the replication.
This is best explained by way of example. Assume that there are two groups of three DSAs.
A write to A1 would result in three types of replication:
How does load-sharing work with multi-write groups?
Load-sharing should only be configured to occur within groups, as groups are guaranteed to be in sync. Load-sharing across groups doesn't make sense because the cross group links are slow.
How does fail-over work with multi-write groups?
When forwarding queries between groups, the first available DSA in the group is used. In the above example, if B1 is not available, then the query is forwarded to B2, and so on.
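The selection rule can be illustrated with a short sketch. It only models the "first available DSA in the group" behavior described above. The function name is hypothetical, and B3 is assumed as the third member of the group.

```python
from typing import Iterable, Optional, Set


def pick_forwarding_target(group: Iterable[str],
                           available: Set[str]) -> Optional[str]:
    """Return the first DSA in group order that is available, else None."""
    for dsa in group:
        if dsa in available:
            return dsa
    return None


group_b = ["B1", "B2", "B3"]  # B3 is an assumed name for the third DSA
print(pick_forwarding_target(group_b, {"B2", "B3"}))  # B1 is down
print(pick_forwarding_target(group_b, set()))         # whole group is down
```

With B1 unavailable, the sketch selects B2. When no DSA in the group is available, it returns None.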
In what situations should I configure 'multi-write-async'?
None. This flag has been deprecated in favor of multi-write groups. If there is only a single peer at the other end of a high-latency link, then simply define that peer in its own multi-write-group.
Multi-write and Security
I have set "min-auth" to none, and multi-write replication is now broken. What's wrong?
A customer reported that replication broke immediately after they set the client-side authentication setting "min-auth" to none.
The warn log contained the following message:
20060214.113521.498 WARN: remoteGetNewAssoc: No compatible link type
Investigation found that, while the clients were now able to connect to the directory using anonymous connections, the DSAs were configured to allow only "clear-password" binds. As a result, the anonymous multi-write traffic was not being chained.
The resolution is to add 'anonymous' as an auth-level in the knowledge file of each DSA. This aligns the DSA and client-side authentication levels and allows multi-write traffic.