Poor query performance when data source (table, network record) contains no occurrences
search cancel

Poor query performance when data source (table, network record) contains no occurrences

book

Article ID: 59226

calendar_today

Updated On:

Products

IDMS IDMS - Database IDMS - ADS

Issue/Introduction

In the unique scenario where a table contains no rows, the optimizer may choose an area sweep path for some queries instead of a path which the user believes would be faster. This document explains why that occurs and what the options are to improve the performance.

 

Environment


Component: IDMS - all supported releases

Resolution

In trying to determine the best physical access path to satisfy a logical query, software is invoked which we will call the Optimizer. The Optimizer uses statistical information to evaluate all possible physical paths to satisfy the logical request, and based on this analysis it chooses the best option available. The SQL optimizer will consider statistics in this order:

1- Actual statistics if UPDATE STATISTICS has been run;
2- Estimated rows from table definition if they are included in the definition, and actual statistics are not available;
3- Default statistics, if neither of the above is available. Default values will be based on 1000 rows and 100 pages per area; but if UPDATE STATISTICS has been run the calculation will use the actual number of pages in the area.

This document addresses the path chosen by the Optimizer, and the resulting poor query performance, in a very specific situation:

1- when a table at the time of the query execution contains no rows;

2- UPDATE STATISTICS has been run on the table; and

3- the table has no ESTIMATED ROWS defined.

If UPDATE STATISTICS has not been run, the statistics values which the Optimizer considers will have a value of zero. If UPDATE STATISTICS has been run on the table or record, but it contained no occurrences when that was done, then the number of rows or record occurrences will be zero in that situation also. Because of that, for some queries, the SQL engine may interpret the statistic of 0 rows (or record occurrences) as meaning there are no statistics, even if the UPDATE STATISTICS was run to provide them. In the absence of actual statistics (or as in this situation, the perceived absence of actual statistics), and the absence of estimated statistics, the only statistical input the optimizer has are default values, so it uses those to determine the best access path. Using the defaults in this situation can result in the choice of a less optimal access plan than if the optimizer thought there were few or no rows in the table. That is because the default value for the number of rows in a table is based on 1000 rows in a calculation using the number of pages. For example, in the database where the area contains 100 pages the default number of rows in the table will evaluate to 1000, but based on an area with 8200 pages in an area that default becomes 82000.

To resolve this, when the data source is an SQL table, issue the statement

ALTER TABLE <schema.table> ESTIMATED ROWS 1;

When the data source is a network record, in the schema compiler use this syntax:

MODIFY SCHEMA <schema name>.
MODIFY RECORD <record name> ESTIMATED OCCURRENCES 1.

These statements will provide an estimated statistic for the number of rows, which the Optimizer can consider as part of its statistical input. Using the estimated statistic of 1 row will cause the Optimizer to not use default statistical values, and will influence the Optimizer to chose any available alternative access path (including ROWID index for tables) as opposed to an area sweep on the table.