In our production SiteMinder environment, we have 2 policy servers running on 2 different physical servers. Yesterday, we tried to recycle the policy servers after some Solaris system maintenance and found that each policy server took 1 hour and 7 minutes to complete the "Pre-caching system configuration objects" step. This startup was extremely slow, and we did not see such a delay during our last restart.
Our policy store is not large: a policy export is around 2.5 MB. Both policy servers took exactly 1 hour and 7 minutes to complete startup. This will be a critical issue if we need to recycle the policy servers in the future.
My environment is as follows:
The Policy Server fetches information from the policy store at startup, in this case from the Sun ONE LDAP directory server.
In this scenario, the SiteMinder Policy Server log shows that the Policy Server took more than one hour to cache the information from the policy store:
<- Begin
[2941/1][Sun Aug 02 2009 07:20:55][SmObjCache.cpp:301][INFO] Pre-caching system configuration objects
[2941/1][Sun Aug 02 2009 08:28:17][SmObjCache.cpp:497][INFO] Pre-caching each Policy Domain
-> End
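To measure this step on other restarts, the gap between the two log messages can be computed with a short script. The following is only a sketch: it assumes the timestamp layout shown in the excerpt above, and the function names (parse_timestamp, precache_duration) are made up for illustration and are not part of SiteMinder.

```python
import re
from datetime import datetime

# Matches the timestamp field of a Policy Server log line,
# e.g. "[Sun Aug 02 2009 07:20:55]" (layout assumed from the excerpt).
TS_RE = re.compile(r"\[\w{3} (\w{3}) (\d{2}) (\d{4}) (\d{2}):(\d{2}):(\d{2})\]")
MONTHS = {name: num for num, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}


def parse_timestamp(line):
    """Return the datetime found in a log line, or None if there is none."""
    m = TS_RE.search(line)
    if m is None:
        return None
    mon, day, year, hh, mm, ss = m.groups()
    return datetime(int(year), MONTHS[mon], int(day), int(hh), int(mm), int(ss))


def precache_duration(lines):
    """Time between the start of system-object pre-caching and the next step."""
    start = end = None
    for line in lines:
        if "Pre-caching system configuration objects" in line:
            start = parse_timestamp(line)
        elif "Pre-caching each Policy Domain" in line and start is not None:
            end = parse_timestamp(line)
            break
    if start is None or end is None:
        return None
    return end - start


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file name): python precache_time.py smps.log
    with open(sys.argv[1]) as log:
        print(precache_duration(log))
```

For the two lines quoted above, this yields 1 hour, 7 minutes and 22 seconds.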
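This issue has been reported in the past by other customers, and most of the time the cause is poor performance of the LDAP server.
To verify this, you can check the LDAP access log. In this specific case there are around 3000 searches taking between 1 and 2 seconds each instead of milliseconds. If they run one after another, as the single connection in the excerpt suggests, that adds up to roughly 50 to 100 minutes, which is consistent with the startup time observed: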
<- Begin
[02/Aug/2009:07:41:23 -0400] conn=750 op=1732 msgId=1733 - SRCH base="ou=policysvr4,ou=siteminder,ou=netegrity,o=com" scope=2 filter="(&(smPropertyOID5=22-000344cb-9b6b-1122-a765-8374cb6fff3b) (objectClass=smProperty5))" attrs=ALL
[02/Aug/2009:07:41:25 -0400] conn=750 op=1732 msgId=1733 - RESULT err=0 tag=101 nentries=1 etime=2
[02/Aug/2009:07:41:25 -0400] conn=750 op=1733 msgId=1734 - SRCH base="ou=policysvr4,ou=siteminder,ou=netegrity,o=com" scope=2 filter="(&(smPropertyOID5=22-0005b942-9b6b-1122-a765-8374cb6fff3b) (objectClass=smProperty5))" attrs=ALL
[02/Aug/2009:07:41:26 -0400] conn=750 op=1733 msgId=1734 - RESULT err=0 tag=101 nentries=1 etime=1
-> End
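Counting the slow searches by hand is tedious, so a small script can pair each SRCH line with its RESULT line and list the slow ones. This is a rough sketch, not a supported tool: it assumes the access log layout shown in the excerpt above, with one entry per line and etime reported in seconds, and the name slow_searches is hypothetical.

```python
import re

# Patterns assume the access log layout from the excerpt above.
SRCH_RE = re.compile(
    r'conn=(\d+) op=(\d+) .*? SRCH base="([^"]*)".*?filter="([^"]*)"')
RESULT_RE = re.compile(
    r"conn=(\d+) op=(\d+) .*? RESULT .*?etime=(\d+(?:\.\d+)?)")


def slow_searches(lines, min_etime=1.0):
    """Return (conn, op, etime, filter) for searches at or above min_etime."""
    pending = {}  # (conn, op) -> filter of a search still awaiting its RESULT
    slow = []
    for line in lines:
        m = SRCH_RE.search(line)
        if m:
            conn, op, _base, flt = m.groups()
            pending[(conn, op)] = flt
            continue
        m = RESULT_RE.search(line)
        if m:
            conn, op, etime = m.groups()
            flt = pending.pop((conn, op), None)
            # Results of other operation types (BIND, MOD, ...) are skipped
            # because no SRCH was recorded for them.
            if flt is not None and float(etime) >= min_etime:
                slow.append((int(conn), int(op), float(etime), flt))
    return slow


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file name): python slow_searches.py access
    with open(sys.argv[1]) as log:
        hits = slow_searches(log)
    for conn, op, etime, flt in hits:
        print(f"conn={conn} op={op} etime={etime} filter={flt}")
    print(f"{len(hits)} slow searches, {sum(h[2] for h in hits):.0f} s in total")
```

Comparing the total printed at the end with the pre-caching time from the Policy Server log gives a quick indication of how much of the delay is spent waiting on the directory.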
Please follow up with your LDAP administrator:
Your LDAP administrator has to determine the root cause of the slow responses from the LDAP server.