
IN-PROGRESS task issues - A Client’s Guide to Understanding and Resolving


Article ID: 233003


Products

CA Identity Manager CA Identity Suite

Issue/Introduction

 
Tasks that hang in the In-Progress state without completing their work are one of the most common problems; almost every client will encounter them at some point while using IDM.

Task execution is the primary method through which IDM completes work, so tasks that hang In-Progress and never complete have a high overall impact on the Identity Manager product.

Tasks can back up In-Progress for many reasons, and it is often difficult to determine where in the product the problem lies. This document details the most common causes of In-Progress tasks and how to resolve them.


We developed and presented a webinar on this topic. It covers the information in this Knowledge document and includes an overview, presented by IDM Product Management, of what occurs during the execution of a task:
Troubleshooting and Optimizing Symantec IGA Tasks In-Progress Community Webinar

Cause

Beginning the Investigation:
View Submitted Tasks and the Task Persistence Monitor (under Task Run-Time Management) provide valuable insight into the extent of the problem and should point to an initial area of focus.

For example, if only tasks related to Active Directory are stuck In-Progress, focus first on the Provisioning/endpoint layer. If all tasks are hung In-Progress, look at more global areas such as the JMS queue or the Task Persistence database. If all tasks are In-Progress or the overall user interface is performing poorly, this may also point to overall engine tuning or to the database itself, for example index statistics.
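The triage reasoning above can be written down as a small lookup. This is only an illustrative sketch: the function name `suggest_focus` and the scope labels `"endpoint"` and `"all"` are invented here and are not part of the product.

```python
def suggest_focus(stuck_scope, ui_slow=False):
    """Map what View Submitted Tasks shows to a first area of investigation.

    stuck_scope is a hypothetical label for this sketch:
      "endpoint" - only tasks for one endpoint type (e.g. Active Directory) are stuck
      "all"      - every task is stuck In-Progress
    ui_slow is True when the overall user interface is performing poorly.
    """
    areas = []
    if stuck_scope == "endpoint":
        areas.append("Provisioning / endpoint layer")
    elif stuck_scope == "all":
        areas.extend(["JMS queue", "Task Persistence database"])
    if stuck_scope == "all" or ui_slow:
        areas.extend(["engine tuning", "database health (for example index statistics)"])
    return areas
```

For example, `suggest_focus("endpoint")` returns only the Provisioning / endpoint layer, while `suggest_focus("all")` lists the global areas first.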

Main causes of In-progress tasks

JMS Health
- JMS is the messaging engine through which Tasks are processed by the application server and ultimately written into the Database.   This is listed first as it is one of the simplest problems to locate using the Task Persistence Monitor feature, and the simplest to resolve.

Load / Environmental performance tuning
- The second most common cause is an environment that was never properly tuned, or new load added to an existing environment without adjusting the tuning configuration.

Database Health
- Another common cause of In-Progress tasks is too much data in the Task Persistence database tables. The Task Persistence database contains the runtime tables of the Identity Manager product. These tables store all task work throughout the lifetime of a task's execution and are constantly being written, read, and updated. A large row count means each update takes longer, which over time slows all task execution and can stop it entirely, leaving tasks in the In-Progress state.

Provisioning / Endpoint issues
- Problems such as unavailable endpoints, administrator password changes, or stopped underlying services can all prevent tasks from completing, in many cases leaving them in the In-Progress state.

 

Environment

Identity Manager and Virtual Appliance (IM) 14.x

Resolution

JMS Health

JMS is the messaging engine through which tasks are processed by the application server and ultimately written into the database. JMS is an application server feature on which Identity Manager relies.

Check Java Message Service (JMS) processing for problems
Run Task Message Health under Task Run Time Management. Does the synthetic test complete?

This test creates dummy tasks that are pushed through the JMS queue into the database to check JMS queue performance.
If the test does not reach 100% and return within a few seconds, clear the JMS queue and restart the engine.


To clear the JMS queue:
>Non-VAPP deployments:
For JBoss / WildFly, stop the application server, back up and then delete the contents of standalone/data/ and standalone/tmp/
Then restart the application server.
WebSphere / WebLogic:
Please ask your application server administrator for details on clearing JMS in WebLogic or WebSphere.
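
For the JBoss / WildFly case, the back-up-then-delete step could be scripted roughly as follows. This is a minimal sketch, not a supported tool: the function name `backup_and_clear` and the timestamped backup folder name are assumptions made for illustration, and it must only be run while the application server is stopped.

```python
import shutil
import time
from pathlib import Path

def backup_and_clear(standalone_dir, backup_root):
    """Copy standalone/data and standalone/tmp to a backup folder, then empty them.

    standalone_dir: path to the application server's "standalone" directory.
    backup_root:    directory in which a timestamped backup folder is created
                    (the folder naming is an assumption of this sketch).
    """
    standalone = Path(standalone_dir)
    backup = Path(backup_root) / time.strftime("jms-backup-%Y%m%d-%H%M%S")
    for name in ("data", "tmp"):
        src = standalone / name
        if not src.is_dir():
            continue  # nothing to clear for this directory
        shutil.copytree(src, backup / name)  # back up first
        for child in src.iterdir():          # then delete the contents, keeping the directory
            if child.is_dir() and not child.is_symlink():
                shutil.rmtree(child)
            else:
                child.unlink()
    return backup
```

The function returns the backup location so it can be recorded before the application server is restarted.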

>For VAPP Deployments:
VAPP includes an alias to accomplish this: deleteIDMJMSqueue
It deletes the Identity Manager JMS queue (/opt/CA/wildfly-idm/standalone/data/*).

https://techdocs.broadcom.com/us/en/symantec-security-software/identity-security/identity-suite/14-4/virtual-appliance/administering-virtual-appliance/using-the-login-shell.html

This should be completed on all nodes.

Configure Journal Size

For standalone IM, in ca-standalone-full-ha.xml:
The journal file size and minimum number of journal files are set to their default values, which may not be adequate under heavy load.

Recommended values:

<journal-file-size>25485760</journal-file-size>
<journal-min-files>20</journal-min-files>
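
To confirm what a given configuration file currently contains, the two values can be read and compared against the recommendation. The sketch below assumes the settings appear as the two elements shown above; the helper names `journal_settings` and `below_recommended` are illustrative only.

```python
import xml.etree.ElementTree as ET

# Recommended values quoted in this article.
RECOMMENDED = {"journal-file-size": 25485760, "journal-min-files": 20}

def journal_settings(xml_text):
    """Return the journal values found in the XML text, ignoring namespaces."""
    found = {}
    for elem in ET.fromstring(xml_text).iter():
        tag = elem.tag.rsplit("}", 1)[-1]  # strip a "{namespace}" prefix if present
        if tag in RECOMMENDED and elem.text:
            found[tag] = int(elem.text.strip())
    return found

def below_recommended(found):
    """List settings that are missing or lower than the recommended values."""
    return [name for name, value in RECOMMENDED.items() if found.get(name, 0) < value]
```

A typical use is to read ca-standalone-full-ha.xml into a string, pass it to `journal_settings`, and review whatever `below_recommended` returns before editing the file.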

Configuring journal size for Virtual Appliance:

https://knowledge.broadcom.com/external/article?articleId=214890

Load / Environmental performance related issues

Has the environment been tuned, or is it running with out-of-the-box configurations? Out-of-the-box configurations work for many of our clients' requirements, but can quickly become a bottleneck as environmental complexity and usage grow.
The quickest and simplest tuning option, which almost all clients should perform, is increasing the memory allocation. See the Tuning and Fine Tuning sections of the documentation for your specific version. For 14.4:
https://techdocs.broadcom.com/us/en/symantec-security-software/identity-security/identity-manager/14-4/reference/performance-tuning.html
 
Does the issue only occur during heavy loads, for example right after starting a bulk task, a series of bulk tasks, or an Explore and Correlate (E&C)? Some clients have multiple bulk loads or E&C executions that were initially spread out enough but now take long enough to overlap. In that case, increase the time between large tasks so that prior tasks have time to complete. Also review the logs for heap-related or memory-related errors such as: java.lang.OutOfMemoryError: Java heap space
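
A quick way to do the log review is to scan the application server log for the error string quoted above. This is a minimal sketch; the function name `find_heap_errors` is illustrative, and the log file location depends on your deployment.

```python
def find_heap_errors(lines):
    """Return (line_number, text) for each log line that mentions an OutOfMemoryError."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if "java.lang.OutOfMemoryError" in line:
            hits.append((number, line.rstrip()))
    return hits
```

Because the function accepts any iterable of lines, an open file object can be passed directly, for example `find_heap_errors(open(path, errors="replace"))` where `path` points at your server log.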
 
Understanding memory heap requirements and HEAP PLANNING KB:
https://knowledge.broadcom.com/external/article?articleId=140353

JVM memory, performance considerations and tuning:

Database Health

Check resource usage on the database server.
Is the CPU or memory pegged at 100%? Has the server run out of disk space? If so, get the DBA / server team involved.
 
Check the size of the DB tables:
Each Task Persistence (TP) table should be under 100,000 rows for best performance.
select count(*) from tasksession12_5
select count(*) from object12_5
select count(*) from lock12_5
select count(*) from runtimestatusdetail12

 
If any table is larger than 100,000 rows, clean up the Task Persistence database.
 

Counts should be returned in milliseconds. If the counts take seconds to return, this may indicate overall database issues that should be discussed with the DBA. For example, data fragmentation, index problems, or a large number of locked tables can cause slowness. The DBA should have tools to check this.
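
The row-count and timing checks above can be combined in one pass. The sketch below assumes you already have a Python DB-API connection to the Task Persistence database (the driver depends on your database vendor); the function name `check_tp_tables` and the one-second "slow" threshold are assumptions chosen for illustration.

```python
import time

# Tables and row limit taken from the queries and guidance in this article.
TP_TABLES = ["tasksession12_5", "object12_5", "lock12_5", "runtimestatusdetail12"]
ROW_LIMIT = 100_000

def check_tp_tables(conn, slow_seconds=1.0):
    """Count rows in each Task Persistence table and flag large tables or slow counts.

    conn is any DB-API connection; slow_seconds is an assumed threshold
    separating "milliseconds" from "seconds" response times.
    """
    report = []
    cur = conn.cursor()
    for table in TP_TABLES:
        start = time.perf_counter()
        cur.execute(f"select count(*) from {table}")
        rows = cur.fetchone()[0]
        elapsed = time.perf_counter() - start
        report.append({
            "table": table,
            "rows": rows,
            "too_large": rows > ROW_LIMIT,
            "slow": elapsed >= slow_seconds,
        })
    return report
```

Any entry with `too_large` set suggests a Task Persistence cleanup; any entry with `slow` set is worth raising with the DBA.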
 
Also check the size of the lock12_5 table. The select should return quickly and the count should be under 2 million records:
select count(*) from lock12_5
 
If the select count(*) on lock12_5 does not return quickly or returns a large value, stop IM and truncate the lock12_5 table.
 
 


Provisioning / Endpoint issues

Review View Submitted Tasks. Is there a pattern? Are only specific tasks against one endpoint having issues? If the issue seems isolated to one endpoint, open Provisioning Manager and right-click: can you access a user account's information and perform CRUD (Create, Read, Update, Delete) operations in Provisioning Manager?

Can you test against other endpoints to ensure they are accessible?

If endpoint issues are clearly present, resolve them first, then use the built-in Resubmit Task option to retry the specific problem tasks.

https://techdocs.broadcom.com/us/en/symantec-security-software/identity-security/identity-manager/14-4/configuring/resubmit-stuck-in-progress-tasks.html

Review endpoints for failures and resolve endpoint issues
Check the Provisioning logs (etatrans, etanotify, JCS)


 

Additional Information

If the above has not resolved your In-Progress issue, please collect all of the information below and upload it to your new case with L1 support.
  • Product version
  • Environment information
    • vApp: number of nodes
      • Configuration of nodes (what services are where)
      • Take your time to identify geo clustering issues.
    • Non-vApp: number of application servers and flavor/version
    • Number of Provisioning servers
  • What database, version, and location
  • Is this a new environment?

  • If this is an existing environment, is this the first time this environment has processed this number of tasks?

  • When was the last time the entire environment was restarted? If the environment has not been restarted recently a simple recycling of services in the environment may at least temporarily clear the issue.

  • What is the extent of the problem?
    • How many tasks are hanging In-Progress?
    • Does this impact all tasks or only specific types of tasks?
    • Is this a ‘slowness’ issue, where tasks are completing, just far slower than normal?
    • Do the stuck tasks complete if they are resubmitted through the System > Task Run Time Management > Resubmit Tasks feature?

https://techdocs.broadcom.com/us/en/symantec-security-software/identity-security/identity-manager/14-4/configuring/resubmit-stuck-in-progress-tasks.html
