Issue:
In production, policy server queue depth and the number of agent connections to the policy server spike under normal load. Clients observe slow response times, and owners of applications that use SSO report slow transaction processing when the Web Agent communicates with the policy server.
Environment:
Cause:
We found a common pattern whenever the issue occurs: the policy server worker threads are waiting in LDAP calls. One thread is in Wldap32!DrainWinsock.
The remaining worker threads show the following stack:
ntdll!ZwWaitForSingleObject+15
KERNELBASE!WaitForSingleObjectEx+98
kernel32!WaitForSingleObjectExImplementation+75
Wldap32!LdapWaitForResponseFromServer+269
Wldap32!ldap_result_with_error+21b
Wldap32!ldap_search_ext_sW+87
smldap_ms+2bab
This led to an investigation of the communication between the policy server and the AD server.
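When a dump contains many worker threads, it can help to count how many are blocked in each Wldap32 frame. The following is a minimal sketch, not part of the original analysis. It assumes a hypothetical text export with one frame per line and a blank line between threads; the function name `classify_threads` is illustrative only.

```python
import re
from collections import Counter

# Frames of interest, taken from the stack shown above.
LDAP_WAIT_FRAME = "Wldap32!LdapWaitForResponseFromServer"
DRAIN_FRAME = "Wldap32!DrainWinsock"


def classify_threads(dump_text):
    """Count threads by the Wldap32 frame they are blocked in.

    Assumption: the dump has one frame per line and a blank line
    between threads (a hypothetical layout, adjust to your debugger).
    """
    counts = Counter()
    for block in re.split(r"\n\s*\n", dump_text):
        if not block.strip():
            continue  # skip empty chunks at the start or end of the dump
        if DRAIN_FRAME in block:
            counts["drain_winsock"] += 1
        elif LDAP_WAIT_FRAME in block:
            counts["ldap_wait"] += 1
        else:
            counts["other"] += 1
    return counts
```

A result dominated by `ldap_wait` with a single `drain_winsock` thread would match the pattern described above.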
On the affected AD server, ATQ (Asynchronous Thread Queue, which handles LDAP processing) threads max out at 24, even though this server is a higher-end system than the other AD servers, which run at around 96 threads. This was traced to a bug with the hardware: http://h20566.www2.hpe.com/hpsc/doc/public/display?sp4ts.oid=5379860&docId=emr_na-c04650594&docLocale=en_US
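The 24 versus 96 figures are consistent with a per-processor thread pool. The sketch below assumes the ceiling follows the LDAP policy MaxPoolThreads (threads per processor, default 4); under that assumption, a cap of 24 implies 6 processors visible to Windows and a cap of 96 implies 24. The processor counts are an inference and are not stated in this article, and the function names are illustrative.

```python
# Assumption: the ATQ pool ceiling is MaxPoolThreads per processor,
# with the default LDAP policy value of 4.
DEFAULT_MAX_POOL_THREADS = 4


def atq_thread_ceiling(logical_processors, max_pool_threads=DEFAULT_MAX_POOL_THREADS):
    """Expected ATQ thread ceiling for a given processor count."""
    return logical_processors * max_pool_threads


def implied_processors(observed_ceiling, max_pool_threads=DEFAULT_MAX_POOL_THREADS):
    """Processor count that would explain an observed ceiling, or None
    if the ceiling is not a whole multiple of max_pool_threads."""
    count, remainder = divmod(observed_ceiling, max_pool_threads)
    return count if remainder == 0 else None


print(implied_processors(24))  # affected AD server
print(implied_processors(96))  # healthy AD servers
```

If the implied count is lower than the number of processors actually installed, that points at the operating system not seeing all of the hardware rather than at an LDAP setting.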
What are all these ATQ thread counters there for anyway?
“ATQ” stands for “Asynchronous Thread Queue”.
LSASS adopted its threading library from IIS to handle Windows socket communication and uses a thread queue to handle requests from Kerberos and LDAP.
Use perfmon on the AD server to monitor the ATQ parameters: open the Directory Services object and select the ATQ counters.
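Once the counters are logged to CSV (for example from a perfmon data collector set), a short script can show how often the thread count sits at its ceiling. This is a sketch under assumptions: the counter name "ATQ Threads Total", the host name in the sample, and the PDH-CSV layout (a header row of counter paths, then one row per sample) should be checked against your own log. `samples_at_ceiling` is a hypothetical helper.

```python
import csv
import io


def samples_at_ceiling(csv_text, counter_suffix, ceiling):
    """Return (samples_at_or_above_ceiling, total_samples) for one counter.

    Assumption: the first row holds counter paths and the first column
    holds timestamps, as in a perfmon CSV log.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Find the column whose counter path ends with the requested name.
    col = next((i for i, name in enumerate(header)
                if name.endswith(counter_suffix)), None)
    if col is None:
        raise ValueError("counter not found: " + counter_suffix)
    at_ceiling = total = 0
    for row in reader:
        if len(row) <= col:
            continue  # short or empty row
        try:
            value = float(row[col])
        except ValueError:
            continue  # blank or non-numeric sample
        total += 1
        if value >= ceiling:
            at_ceiling += 1
    return at_ceiling, total
```

For example, `samples_at_ceiling(log_text, "ATQ Threads Total", 24)` on the affected server would report how many samples were pinned at the 24-thread cap.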
Workaround:
This issue was not related to CA SSO; the information is provided as is.
Increase the ATQ size on the troubled Active Directory server. For a change at this level, consult an internal Windows engineer or Microsoft for assistance with increasing the ATQ size.