How to prevent DC015007 SOS abends or over-use of resources by OLQ.
Release: All supported releases.
Component: OLQ.
OLQ satisfies logical data requests by performing physical I/Os. The amount of resources used by any task (including OLQ) can be controlled by limits specified in the sysgen for that task definition. OLQ in addition has an interrupt count which can be specified to control the frequency with which rows are returned, at which point the user has the option of continuing or ending the transaction.
In some situations, OLQ can use up very many resources before hitting an interrupt count, causing it to reach DBIO or DCCALLS limits. This can occur because the limits and the interrupt count track very different things: the I/O counter (used to check the DBIO limits) tracks the number of I/Os performed, and the Interrupt Count tracks the number of rows returned to OLQ. The relationship between these is most easily explained with an example. Assume someone wants to find all the clients within a certain zip code, and then to find out which offices exist within that state to satisfy an order. That would be easy to set up on OLQ, particularly menu mode; simply pick the client and office records, specify a WHERE criteria on the client zip, and a join clause between the client state and the state in the office record. That would generate a query that follows this general form:
SELECT * FROM CLIENT, OFFICE
WHERE CLIENT_ZIP = nnnnn and CLIENT.STATE = OFFICE.STATE;
Now, assume that there are 1,000,000 clients, and 5 offices in each state. In this query, zip and state are unlikely to be key fields, so there are no indexes or calc keys to help in this retrieval. That means there will be area sweeps on two records, and joining them in a nested loop. The optimizer will determine the most efficient way to retrieve these records. For the sake of discussion, assume that the path will be to sweep the Client area for clients within this zip code, and then walk the office area for offices that satisfy the join criteria on state. That means the path will be:
The interrupt count provides an opportunity for the end user to terminate the task and thus avoid consuming too many resources, but often the user is completely unaware of the resources required to satisfy the logical request. That's because the amount of resources required to satisfy the user query depends completely on the query that's coded, the database statistics available, and the database design. So the interrupt count is not a reliable method to manage resource utilization; instead, to avoid high resource utilization by OLQ, most sites establish limits for OLQ just the same as for other user applications. The only way to ensure that long-running OLQ queries do not consume substantial resources at run-time is to establish limits, either at the system or task level; and to establish an interrupt count that does not interfere with these limits being reached (see more on this below). The same approach is often used for Server tasks and tasks that process generalized SQL queries, since these requests are all logical and do not specify any actual physical I/O DML commands. Often users are unaware of what resources are required to satisfy such a query, so setting limits is the only way to avoid tasks that use too many resources.
The unique factor which differentiates OLQ in this regard is that both the call limits and the interrupt count can have an impact on this process. That's because when the interrupt count is reached, the resource counters for the task (calls, I/Os, etc) are reset to zero. Thus if the interrupt count is set too low, it is possible for the resource counters to be continually reinitialized, so that the specified limits are never reached. A few actions will help prevent this from happening: