Optimizing an SQL Query: The Impact of Your Database Definition
search cancel

Optimizing an SQL Query: The Impact of Your Database Definition

book

Article ID: 62732

calendar_today

Updated On:

Products

IDMS IDMS - Database IDMS - ADS

Issue/Introduction

This document is a review of the options the optimizer considers, and how it compares these to the database definitions (calckey, sortkey, constraint, etc) in order to determine the best path.

 

Environment

Release:
Component: IDMS

Resolution

SQL statements are by definition logical requests. They do not indicate the physical access via which the logical request is to be satisfied. The most efficient physical access path for each request is determined by the optimizer. There may be a desire to understand how the SQL optimizer determines the best navigation path for SQL statements. Here is a review of the options the optimizer considers, in order to determine the best path.

The information that the optimizer considers includes:

  1. Columns accessed by the SQL statement;

  2. The selection criteria on the statement;

  3. Any join criteria specified in the statement;

  4. The syntax structure of the statement;

  5. The physical structure of the database as defined in the dictionary (for example Index calckeys, constraints; and sets for network databases);

  6. Statistics stored in the dictionary. These can be actual statistics as a result of running UPDATE STATISTICS utility, or estimates provided in the associated schema(s).

Because of how there are used, it is important whenever possible to use calckeys or sorkkeys as selection criteria in queries; and to always ensure UPDATE STATISTICS has provided current statistics for the database. It is equally important to use the appropriate set name when joining network database records. Also, specifying only the required columns (as opposed to all columns) in the column list for a statement will in some cases produce a more efficient access path.

Here's an example where two tables (EMPLOYEE and DIVISION) are joined together using a common field.

    SELECT DIV_CODE, DIV_NAME, EMP_ID, EMP_LNAME,       FROM EMPLOYEE, DIVISION       WHERE EMP_ID=DIV_HEAD_ID; 

This query may result an area sweep, if the EMP_ID field is not defined as a calckey or sortkey field. By contrast, the following is an example of using a WHERE criteria which will result in a query that walks an index defined on the employee name:

    SELECT DIV_CODE, DIV_NAME, EMP_ID, EMP_LNAME,      FROM EMPLOYEE, DIVISION       WHERE EMP_ID=DIV_HEAD_ID         AND  EMP_LNAME = :EMP-LNAME; 

In this example, a host variable is used at run-time to provide a specific value for the EMP_LNAME field comparison. Since this is the primary sortkey field for an index, it is likely that the optimizer will choose to use that index to access the EMPLOYEE table.

 

Additional Information

References:

CA IDMS SQL Programming Guide, section 2.3.2 Creating Executable Modules.