
Delay in processing under load with the R12 SP2 (Build 210) Application Server Agent for WebSphere


Article ID: 134461



Products

CA Single Sign On Agents (SiteMinder) SITEMINDER

Issue/Introduction

We're running an ASA Agent with WebSphere. Under high load, the ASA
Agent became unresponsive until we restarted WebSphere.

Under very heavy load against the WebSphere Application Server with
the R12 SP2 Application Server Agent for WebSphere integrated, the
Application Server reached MaxThreads and processing slowed. With
debug logging enabled for the Application Server Agent, the WebSphere
logs showed all threads waiting on the cache, which caused a severe
performance impact.

The ASA Agent hung and could not answer requests, and the
WebSphere server reported the following warning:

  11/26/19 16:46:45:349 IPAddress: 10.0.25.180 Module: MYSERVER
  EventMessage: [11/26/19 16:46:45:349 AST] 000000f3 ThreadMonitor W
  WSVR0605W: Thread "WebContainer : 0" (00000103) has been active for
  646682 milliseconds and may be hung.  There is/are 2 thread(s) in
  total in the server that may be hung.

                at com.ca.siteminder.sdk.agentapi.connection.e8.a(DashoA10*..)
                at com.ca.siteminder.sdk.agentapi.connection.fp.a(DashoA10*..)
                at com.ca.siteminder.sdk.agentapi.accesscontrol.f0.a(DashoA10*..)
                at com.ca.siteminder.sdk.agentapi.ff.a(DashoA10*..)
                at com.ca.siteminder.sdk.agentapi.dt.a(DashoA10*..)
                at com.ca.siteminder.sdk.agentapi.dt.a(DashoA10*..)
                at netegrity.siteminder.javaagent.di.a(DashoA10*..)
                at com.netegrity.siteminder.agentcommon.framework.cache.ee.a(DashoA10*..)
                at com.netegrity.siteminder.asaframework.managers.resource.a0.a(DashoA10*..)
                at com.netegrity.siteminder.asaframework.hla.managers.resource.ay.a(DashoA10*..)
                at com.netegrity.siteminder.asaframework.hla.w.a(DashoA10*..)
                at com.netegrity.siteminder.asaframework.hla.w.a(DashoA10*..)
                at com.netegrity.siteminder.websphere.auth.SmTrustAssociationInterceptor.isTargetInterceptor(DashoA10*..)
                at com.ibm.ws.security.web.TAIWrapper.isTargetInterceptor(TAIWrapper.java:202)
                at com.ibm.ws.security.web.TrustAssociationManager.getInterceptor(TrustAssociationManager.java:179)

SystemOut.log

  [11/26/19 14:16:37:778 AST] 00000001 SystemOut O SMINFO: SiteMinder
  TAI successfully initialized

  [11/26/19 14:43:26:943 AST] 00000079 NGUtil$Server I ASND0002I:
  Detected server MYSSERVER started on node MYNODE01

It then reported a thread that had been active for 706667 milliseconds
(almost 12 minutes) in the connection phase:

  [11/26/19 16:46:45:349 AST] 000000f3 ThreadMonitor W WSVR0605W: Thread
  "WebContainer : 3" (0000012a) has been active for 706667 milliseconds
  and may be hung.  There is/are 1 thread(s) in total in the server that
  may be hung.

   at com.ca.siteminder.sdk.agentapi.connection.e8.a(DashoA10*..)
   at com.ca.siteminder.sdk.agentapi.connection.fp.a(DashoA10*..)
   at com.ca.siteminder.sdk.agentapi.accesscontrol.f0.a(DashoA10*..)
   at com.ca.siteminder.sdk.agentapi.ff.a(DashoA10*..)
   at com.ca.siteminder.sdk.agentapi.dt.a(DashoA10*..)
   at com.ca.siteminder.sdk.agentapi.dt.a(DashoA10*..)
   at netegrity.siteminder.javaagent.di.a(DashoA10*..)
   at com.netegrity.siteminder.agentcommon.framework.cache.ee.a(DashoA10*..)
   at com.netegrity.siteminder.asaframework.managers.resource.a0.a(DashoA10*..)
   at com.netegrity.siteminder.asaframework.hla.managers.resource.ay.a(DashoA10*..)
   at com.netegrity.siteminder.asaframework.hla.w.a(DashoA10*..)
   at com.netegrity.siteminder.asaframework.hla.w.a(DashoA10*..)
   at com.netegrity.siteminder.websphere.auth.SmTrustAssociationInterceptor.isTargetInterceptor(DashoA10*..)

  [...]

This continued, and eventually all threads appeared to end up in this state:

  [11/27/19 8:37:53:552 AST] 000000f3 ThreadMonitor W WSVR0605W:
  Thread "WebContainer : 80" (000004a2) has been active for 732342
  milliseconds and may be hung.  There is/are 90 thread(s) in total in
  the server that may be hung.

   at com.ca.siteminder.sdk.agentapi.connection.e8.a(DashoA10*..)
   at com.ca.siteminder.sdk.agentapi.connection.fp.a(DashoA10*..)
   at com.ca.siteminder.sdk.agentapi.accesscontrol.f0.a(DashoA10*..)

until we stopped WebSphere:

  [11/27/19 9:19:00:963 AST] 0000006f ServerCollabo A WSVR0023I:
  Server MYSERVER is stopping

[...]

  [11/27/19 9:22:06:169 AST] 0000006f ServerCollabo A WSVR0024I:
  Server MYSERVER stopped

How can we fix this?

Cause

The R12 SP2 (Build 210) GA version of the Application Server Agent for WebSphere has an issue in how it locks its cache. When processing a WebSphere request, the Agent first takes a lock on the cache to determine whether it can process the request locally from cache. The issue arises when the Agent must contact the Policy Server to process the request: the Agent thread keeps its lock on the cache while it obtains a connection to the Policy Server, so other Agent threads back up waiting on the cache until that connection is obtained.

 

Under low or normal load, where Agent to Policy Server connections are typically available, there is no end-user impact. Only under heavy load, when obtaining a connection to the Policy Server may be delayed, does this issue eventually slow processing: the R12 SP2 (Build 210) Application Server Agent for WebSphere then handles requests in an effectively single-threaded manner.

Environment

Release : R12 SP2

Component : SITEMINDER - Application Server Agent for WebSphere

OS: All supported

Resolution

To resolve this issue, please contact Broadcom Support to obtain the R12 SP2 (Build 213) or later release of the Application Server Agent for WebSphere. That release contains a fix that prevents Agent threads from holding the lock on the cache while obtaining a connection to the Policy Server when the Agent cannot process the request from cache.
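
The difference between the two behaviours can be illustrated with a small, self-contained Java sketch. This is not the Agent's code (the Agent classes are obfuscated in the stack traces above); every name here, including CacheLockSketch, askPolicyServer and CONNECT_DELAY_MS, is a hypothetical stand-in, and the 100 ms sleep merely simulates a slow connection to the Policy Server. The first lookup method holds the cache lock across the simulated connection, as described under Cause; the second releases the lock before connecting, as described for the fix.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class CacheLockSketch {
    // Assumption: simulated time to obtain a Policy Server connection under load.
    static final long CONNECT_DELAY_MS = 100;

    private final Object cacheLock = new Object();
    private final Map<String, String> cache = new HashMap<>();

    interface Lookup { String apply(String resource) throws InterruptedException; }

    // Hypothetical stand-in for "obtain a connection and ask the Policy Server".
    private String askPolicyServer(String resource) throws InterruptedException {
        Thread.sleep(CONNECT_DELAY_MS);
        return "decision-for-" + resource;
    }

    // Pattern described under Cause: the lock is held during the slow call.
    String lookupHoldingLock(String resource) throws InterruptedException {
        synchronized (cacheLock) {
            String hit = cache.get(resource);
            if (hit != null) return hit;
            String decision = askPolicyServer(resource); // all other threads wait here
            cache.put(resource, decision);
            return decision;
        }
    }

    // Pattern described under Resolution: the lock is released before the slow call.
    String lookupReleasingLock(String resource) throws InterruptedException {
        synchronized (cacheLock) {
            String hit = cache.get(resource);
            if (hit != null) return hit;
        }
        String decision = askPolicyServer(resource); // no lock held while connecting
        synchronized (cacheLock) {
            cache.put(resource, decision);
        }
        return decision;
    }

    // Runs `threads` concurrent cache misses and returns the elapsed milliseconds.
    static long timeMisses(int threads, Lookup lookup) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch done = new CountDownLatch(threads);
        long start = System.nanoTime();
        for (int i = 0; i < threads; i++) {
            final String resource = "/app/page" + i; // distinct keys, so every lookup misses
            pool.execute(() -> {
                try {
                    lookup.apply(resource);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        int threads = 8;
        long held = timeMisses(threads, new CacheLockSketch()::lookupHoldingLock);
        long released = timeMisses(threads, new CacheLockSketch()::lookupReleasingLock);
        System.out.println("lock held during connect:     " + held + " ms");
        System.out.println("lock released before connect: " + released + " ms");
    }
}
```

In this sketch the first timing grows with the number of threads, because each cache miss waits for the previous one to finish connecting, while the second stays close to a single connection delay. The same serialisation, at much larger delays, is consistent with the WebContainer threads reported as hung in the logs above.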

Additional Information

When the R12 SP2 Application Server Agent for WebSphere runs as a plugin to WebSphere, each WebSphere request spawns the Application Server Agent. For the Agent's Host Configuration Object (HCO), it is therefore best practice to set MinSocketsPerPort and NewSocketStep to "1" instead of the default of "2": each "Agent" makes only one request, so there is no reason to open a second connection. MaxSocketsPerPort should be set to a value sufficient to handle the number of threads defined for WebSphere.