Poor performance on Oracle based SDM installs

Products

CA Service Desk Manager CA Service Management - Service Desk Manager

Issue/Introduction

At times, an Oracle DBMS may be slow in returning results back to the Service Desk Manager application. This impacts the performance of the Service Desk application by making searches and updates slow. This document describes methods to diagnose this problem.

Environment

Service Management 17.x

Resolution

CA Service Desk Manager and Oracle DBMS

At times, an Oracle DBMS may be slow in returning results back to the Service Desk Manager application. This would impact the performance of the Service Desk application by making searches and updates slow.

Service Desk will log to its STDLOG files, located in NX_ROOT\LOG, any statement (SELECT, INSERT, UPDATE or DELETE) that takes longer than 1,000 milliseconds to complete.

The STDLOG file will show messages like the following:

<hostname> sqlagt:select5 3284 MILESTONE orclclass.c 1345 The following statement took 5346 milliseconds: SELECT

call_req.open_date, call_req.id, min(attached_sla.time_to_violation) AS "mintime", min(usp_target_time.target_time) AS "mintgt", call_req.ref_num

FROM

(call_req LEFT JOIN attached_sla ON call_req.persid = attached_sla.mapped_cr LEFT JOIN usp_target_time ON call_req.persid =

usp_target_time.mapped_cr)

WHERE ( call_req.assignee = HEXTORAW('######') AND call_req.active_flag = 1 AND call_req.type = N'I' ) AND ( call_req.type

= N'I' ) GROUP BY call_req.open_date, call_req.id, call_req.ref_num ORDER BY call_req.open_date DESC

In the above, "######" represents a given UUID value for the associated record, in this case, the call_req.assignee.

Before troubleshooting this performance scenario, be sure to understand:

Are you using CABI (BOXI) reporting? Does CABI (BOXI) point to the Service Desk production application server and MDB database? Are you using any other reporting tool(s) that points to the production MDB database?
What other applications share the MDB database with Service Desk?
Does the Database server contain databases for other applications?
Is Service Desk configured to use case insensitivity with Oracle? This configuration setting is stored within the Service Desk NX.ENV file:

@NX_ORACLE_CASE_INSENSITIVE=1

1 = Service Desk is case insensitive

0 = Service Desk is case sensitive

This setting only needs to be performed on the Service Desk primary server. A recycle of the Service Desk is required if any change is made to this setting. If set to "1" (case insensitive), some SELECTS may function differently when it comes to how indexes are used in Oracle.

Note: Much of what is covered below, with the exception of the section on the Virtual Database and connections, is the realm of an experienced Oracle Database Administrator and not CA Support.

Slow Running SELECT statements

Searches, internal processing and Scoreboard Query nodes in Service Desk translate to SELECT statements sent to the Oracle MDB database. If Oracle is not returning results in sub-second time frames, the performance and response time of the Service Desk application will be impacted.

Identify what SELECT statements are running slow

Service Desk will log to its STDLOG files any database statement that takes longer than one (1) second to return results. That may not be adequate to narrow down performance problems. To get a list of the SQL statement that took longer than one (1) second to return from Oracle, issue the following command from the Service Desk primary server:

pdm_logstat -f orclclass.c MILESTONE

Once the test is over, issue the following command to turn it off the tracing:

pdm_logstat -f orclclass.c

Note: These commands impact how quickly the STDLOG log files roll over, so the commands should only be enabled for short periods of time.

For Unix Environments :

Verify whether Service Desk is using Case Sensitivity or not. This configuration setting is stored in the Service Desk NX.env file:

@NX_ORACLE_CASE_INSENSITIVE=1

1 = Service Desk is case insensitive

0 = Service Desk is case sensitive

Login to SQLPLUS on the Service Desk Primary server as the same database user configured with Service Desk ("@NX_DB_USERID" value within the Service Desk\NX.env file). Oracle SQL Developer is fine to use as well. However, this is a new download. The Service Desk application server will already have the sqlplus program available. Use the Database Service Name that Service Desk is configured for ("@NX_DB_NODE" value within the Service Desk NX.env file). See Figure 1.

sqlplus user/password@ServiceName

Figure 1

Set your session parameters (see Figure 2):

set echo on

alter session set NLS_COMP=ANSI;

set autotrace on;

set timing on;

If you have Service Desk configured for case insensitivity (@NX_ORACLE_CASE_INSENSITIVE=1 within the Service Desk NX.env file), set:

alter session set NLS_SORT=BINARY_CI;

Create an output file so that the results can be provided to your DBA or CA Support:

spool QueryResponseTimes.txt

Figure 2

For each SQL Statement that was reported as taking a long time to run as identified in the Service Desk STDLOGs, run the statement to see the output and the duration. You may paste it into SQLPLUS as shown in Figure 3.

Figure 3

Review the output with the help of your DBA or provide it to CA Support as directed. Please note the output file will be empty until you turn off spool or exit SQLPLUS

Sample output:

OPEN_DATE ID mintime mintgt REF_NUM

---------- ---------- ---------- ---------- ------------------------------

1349722446 400004 33

Elapsed: 00:00:00.01

Execution Plan

----------------------------------------------------------

Plan hash value: 2162407762

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 1 | 148 |

6 (34)| 00:00:01 |

| 1 | SORT GROUP BY | | 1 | 148 |

6 (34)| 00:00:01 |

|* 2 | HASH JOIN OUTER | | 1 | 148 |

5 (20)| 00:00:01 |

| 3 | NESTED LOOPS OUTER | | 1 | 103 |

2 (0)| 00:00:01 |

| 4 | TABLE ACCESS BY INDEX ROWID| CALL_REQ | 1 | 58 |

2 (0)| 00:00:01 |

|* 5 | INDEX RANGE SCAN | CALL_REQ_X4 | 1 | |

1 (0)| 00:00:01 |

| 6 | TABLE ACCESS BY INDEX ROWID| ATTACHED_SLA | 1 | 45 |

0 (0)| 00:00:01 |

|* 7 | INDEX RANGE SCAN | ATTACHED_SLA_X2_CI | 1 | |

0 (0)| 00:00:01 |

| 8 | TABLE ACCESS FULL | USP_TARGET_TIME | 1 | 45 |

2 (0)| 00:00:01 |

--------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access(NLSSORT("CALL_REQ"."PERSID",'nls_sort=''BINARY_CI''')=

NLSSORT("USP_TARGET_TIME"."MAPPED_CR"(+),'nls_sort=''BINARY_CI'''))

5 - access("CALL_REQ"."ASSIGNEE"=HEXTORAW('######')

AND + "CALL_REQ"."ACTIVE_FLAG"=1)

filter(NLSSORT("CALL_REQ"."TYPE",'nls_sort=''BINARY_CI''')=HEXTORAW('006900') )

7 - access(NLSSORT("CALL_REQ"."PERSID",'nls_sort=''BINARY_CI''')=

NLSSORT("MAPPED_CR"(+),'nls_sort=''BINARY_CI'''))

Statistics

----------------------------------------------------------

0 recursive calls

0 db block gets

2 consistent gets

0 physical reads

0 redo size

677 bytes sent via SQL*Net to client

415 bytes received via SQL*Net from client

2 SQL*Net roundtrips to/from client

1 sorts (memory)

0 sorts (disk)

1 rows processed

Your Database Administrator or Oracle Support should be able to assist with interpreting the output. However, from the output, you might be able to identify a high cost involved in a part of the WHERE clause. It may become evident that an index or case insensitive index is needed.

If an index is used, but is performing poorly, your DBA can determine how to address the health of that table index. Consult your Oracle documentation for the steps to perform related to your environment. Oracle provides a number of tools for this purpose.

Slow Running INSERTs and UPDATEs

Oracle INSERTs and UPDATEs statements can slow down for many different reasons as a table grows in size.

Here are some of the common reasons for slow running INSERTs and UPDATEs that you need to discuss with your DBA:

Stale statistics

Try to re-analyze the table and ALL indices on the table:

EXEC DBMS_STATS.gather_table_stats('MDBADMIN', 'CALL_REQ');

EXEC DBMS_STATS.gather_index_stats('MDBADMIN', 'CALL_REQ_PK');

2. NEXT extent size

If you have a tiny NEXT extent size, you may see delays each time that Oracle extends the table and index. It is best to pre-allocate the space.
Index fragmentation - In rare cases, fragmented indices can contribute to slow insert performance. To confirm, try rebuilding the indices.

3. Consult with your Oracle DBA about other possible reasons.

Service Desk Virtual Database and Database Agents

Service Desk Manager uses a server side "Virtual Database" server to fetch and update data from the MDB database. To check the health of the Virtual Database, use the Service Desk Diagnostic tool "pdm_vdbinfo".

pdm_vdbinfo Report Output

The pdm_vdbinfo report output is divided into several sections. The header section contains most of the information necessary for making preliminary judgments about the state of the bpvirtdb_srvr process. This information includes the following details:

Min Config Agents - Specifies the minimum number of agents to start. The bpvirtdb_srvr -n parameter in the NX_ROOT\PDMCONF\pdm_startup file sets this value. In this example, bpvirtdb_srvr is set to start 25 SELECT database agents upon startup.
Max Config Agents - Specifies the maximum number of agents to start. The NX_MAX_DBAGENT variable within the NX_ROOT\NX.ENV file sets this value. In this example, the bpvirtdb_srvr is configured to start up a maximum of 40 SELECT database agents. Upon startup, the minimum number of agents start; when all of those agents are busy, additional agents start (up to the maximum number of agents specified). When agents are no longer busy for a period, they stop until the specified minimum number of agents is reached. At times, during runtime, the system reduces the maximum number to a number that is one less than the configured number.
Max DB Agents - Specifies the maximum number of database agents to start. This value is typically the same as Max Config Agents.
Tgt num idle - Represents how many agents must be idle for a period before the bpvirtdb_srvr process starts terminating database agents down to the minimum configuration limit. This value is hardcoded to "2."
Num Agents running - Identifies the total number of agents currently connected to the bpvirtdb_srvr process.
Num Agents starting - Indicates the number of database agents that bpvirtdb_srvr is in the process of starting.
Num Requests pending - Indicates the total number of SELECT requests currently queued in the bpvirtdb_srvr process waiting for an available database agent.
Actual num idle - Indicates the number of SELECT agents waiting for work. In this example, there are no (0) agents waiting, which indicates that there is a backlog as all agents are doing work. If there is no queuing or if the queuing is only short term, generally you do not need to increase the number of agents. However, if database requests are constantly queued during peak times, this behavior can indicate a valid need to increase the number of agents.

Note: If query requests are being queued, determine the cause before making any adjustments. Otherwise, if you increase the number of database agents without knowing the actual cause of the increased queuing, the result may be degraded performance rather than improved performance. For example, if the database is already overworked as the result of bad index statistics and cannot handle more requests efficiently, then increasing the number of requests sent to the database could result in a slower response, and ultimately more queuing.

It may be useful to run pdm_vdbinfo under different load conditions to determine the effect of load level on queuing. Compare the results to the STDLOGs to determine if long-running queries are being executed at the time. Remember to look both forward and backward in the log because long-running queries are logged only when the query completes (which may occur after you detect queuing). If throughput was acceptable and the backlog was based on load, try increasing the number of database agents.

The pdm_vdbinfo report also includes a Delayed ID Queue section, which identifies the "select short" cache usage. A select short query is one that has the WHERE clause of "WHERE id=?" and only returns one row.

A given listing will display any table that is accessed as part of a select short query. For each table entry, the following information appears:

Hash - Reserved for developer use only.
Queue - Indicates the actual number of select short responses that are currently queued. If the queue "wraps," this number equals the MaxQueue value. A queue is limited to its MaxQueue value. On the MaxQueue +1 select short query, the oldest query is dropped and the queue wraps.
MaxQueue - Configures maximum length of the queue for this table. The default value is 101 rows.
Min - Indicates the shortest time the queue has taken to wrap since the bpvirtdb_srvr process started. If the queue has not wrapped, this value is 0 (zero).
Max - Indicates the longest time the queue has taken to wrap since the bpvirtdb_srvr process started. If the queue has not wrapped, this value is 0 (zero).
Cur - Indicates the age of the oldest entry in the queue. If both Min and Max values are 0 (zero), this value is the amount of time since the table was first referenced.
When an update occurs, the queue should not wrap until all the domsrvr processes have requested the update. A good rule is to keep the Min value greater than two seconds. If the Min value is greater than two, increasing the number of queued entries only increases memory usage and does not improve performance.

I/O Requests

Tables with many I/O requests use the ID 00 database agent. ID 00 is the generic update agent. This agent updates all tables that do not have specific agents associated with them. If there are many updates to a specific table on agent 00, the agent is busy processing those updates and is not available to process updates for other tables. Moving this table to a separate agent alleviates the overhead caused by updates to this single table. The primary function of this report is to identify the amount of work currently en route, or queued to the database agents. To make the most of this information, compare it to the STDLOG output and in the context of the end-user perceived performance.

Additional Information

How to Identify Performance Problems in CA SDM

Interval Logging