IDMS: How to examine SQL statistics
search cancel

IDMS: How to examine SQL statistics

book

Article ID: 10252

calendar_today

Updated On:

Products

IDMS IDMS - Database

Issue/Introduction

Statistics for the CA IDMS SQL optimizer can be provided for both SQL and non-SQL defined databases by using the UPDATE STATISTICS utility command. This document shows how to review and verify the results of that command. 

In order to ensure that SQL DML statements are optimized to use the most efficient access method that will satisfy the request, it is imperative that UPDATE STATISTICS be run against databases that will be accessed by SQL DML. This is true whether the database is originally defined in SQL or non-SQL DDL. Statistics for SQL-defined databases are stored in the catalog, as part of the entity definitions, and can easily be seen and verified there. Statistics for non-SQL databases are not found in the catalog and thus extra steps are required if there is a desire to verify that accurate statistics are in place.

It is also important that when testing the efficiency of SQL access, the statistics that are used are those created from a production-sized database. This a test region must be available that mirrors the size and design of the production database; or the statistics must be generated in the actual production environment, then copied into the test database. Knowing where the statistics are stored can be critical to achieve this. 

Environment

Release: All supported releases
Component: IDMS SQL/Option

Resolution

When UPDATE STATISTICS is run against an SQL schema, area, or table, the statistics are stored in the DDLCAT area. When running UPDATE STATISTICS for a NonSQL schema, the DDLDML area containing the non-SQL schema is updated. Therefore, the DDLDML area needs to be available in update mode when the UPDATE STATISTICS statement is run for non-SQL.

Below are the SQL tables where statistics are stored, and the corresponding non-SQL records.

Tables NonSQL Records Statistics stored for these entities
SYSTEM.AREA SA-018 Physical database area
SYSTEM.COLUMN SCR-054 Columns / Sortkey elements for sorted sets and indexes
SYSTEM.CONSTRAINT n/a Constraints
SYSTEM.INDEX SMR-052 Indexes / Sets including the Calc set & Calckey elements
SYSTEM.TABLE SRCD-113 Table / Schema record

Specific columns in each table and record are updated to contain statistics; these columns and elements can be seen in the individual queries in each of the following sections.

Statistics for non-SQL databases

Users who are familiar with OLQ, Culprit, or other reporting tools can use them to look at the values in the DDLDML records using the records and field names noted above, and specifying subschema IDMSNWKA. The above records are obtained via the S-010 record by using the schema name as CALC key (S-NAM-010), and walking the network sets to the other records. Users who prefer to use SQL to view and/or verify these values can do that using the following steps:

Create a schema to access the dictionary records, using this syntax:

CREATE SCHEMA <your-db-name> FOR NONSQL SCHEMA SYSDIRL.IDMSNTWK VERSION 1;

It's possible that you will need to use a name other than 'SYSDIRL' to qualify the IDMSNTWK schema. If the above syntax gives an error, check with your DBA to see what dbname IDMSDIRL was run against, and use that name instead of SYSDIRL in this syntax. We're using your dbname for the schema name also, for ease of use; it's also a requirement for some IDMS components, so it may be useful later (or may already be defined).

Using this schema, you can access the columns with statistics in the DDLDML area with queries such as the following:-

To see the statistics for the physical areas:

SELECT SA.SA_NAM_018, SA.STAT_AREA_NUM_PAGES_018, SA.STAT_AREA_PAGE_SIZE_018, 
SA.STAT_AREA_NUM_PAGES_USED_018, SA.STAT_AREA_NUM_ROWS_018,
SA.STAT_AREA_PCT_USED_018
FROM SYSDICT."SA-018" SA, SYSDICT."S-010"
WHERE "S-SA" AND S_NAM_010='EMPSCHM' AND S_SER_010=100;

To see statistics for the records within each area:

SELECT SR.SR_ID_113, SR.STAT_REC_NUM_PAGES_113, SR.STAT_REC_NUM_PAGES_USED_113, 
SR.STAT_REC_NUM_ROWS_113, SR.STAT_REC_AVG_LENGTH_113,
SR.STAT_REC_PCT_USED_113, SR.STAT_REC_PCT_FRAG_113
FROM SYSDICT."SRCD-113" SR, SYSDICT."S-010" S
WHERE "S-SRCD" AND S_NAM_010='EMPSCHM' AND S_SER_010=100;

To see statistics for the sets in the schema (including the CALC set and statistics for the calckey):

SELECT SMR.SET_NAM_052, SMR.STAT_MEM_NUM_DISTINCT_KEYS_052, 
SMR.STAT_MEM_AVG_ROWS_SET_052, SMR.STAT_MEM_AVG_PAGES_KEY_052,
SMR.STAT_MEM_AVG_CLUSTER_COUNT_052
FROM SYSDICT."SMR-052" SMR, SYSDICT."S-010" S, SYSDICT."SRCD-113"
WHERE "S-SRCD" AND "SRCD-SMR" AND S_NAM_010='EMPSCHM' AND S_SER_010=100;

Finally, to see statistics for sortkey elements:

SELECT SCR.SCR_NAM_054, SCR.STAT_COL_NUM_DISTINCT_VALS_054, SCR.STAT_COL_LOW_VALUE_054,
SCR.STAT_COL_HIGH_VALUE_054
FROM SYSDICT."SCR-054" SCR, SYSDICT."S-010" S, SYSDICT."SRCD-113", SYSDICT."SMR-052" SMR
WHERE "S-SRCD" AND "SRCD-SMR" AND "SMR-SCR"
AND S_NAM_010='EMPSCHM' AND S_SER_010=100;

Note that in all of the above queries, there are substitutions you will have to make:
In the FROM clause, instead of SYSDICT you should specify the schema you defined in step 1 (with name = your dbname).
In the boolean expression naming S_NAM_010 you should specify the name of your non-SQL schema instead of 'EMPSCHM'.
In the boolean expression naming S_SER_010 you should specify the version of your non-SQL schema instead of 100. 

In reviewing the results, you may see some values that look very large, such as 1077952576. This is the numeric representation of x'4040404040' for some fields, and it means the field has not been updated. Other numeric values are valid and should be verified to ensure that the statistics are updated whenever the characteristics of the database change, including the quantity of stored records.

Statistics for SQL Databases

Simple SQL queries against the "SYSTEM" schema can be used to view the statistics stored for an SQL defined database.

Using this schema, you can access the columns with statistics with queries like the following. 

To see the statistics for the physical areas:

SELECT SEGMENT, NAME, NUMPAGES, PAGESIZE, NUMPAGESUSED, PCTSPACEUSED
FROM SYSTEM.AREA
WHERE SEGMENT='SQLDEMO' AND NAME='EMPLAREA';

To see statistics for the tables within a schema:

SELECT NAME, NUMPAGES, NUMROWS, AVGROWLENGTH, PCTSPACEUSED, PCTFRAGROWS 
FROM SYSTEM.TABLE
WHERE SCHEMA='DEMOEMPL';

To see statistics for the indexes defined on tables in the schema:

SELECT * FROM SYSTEM.INDEX
WHERE SCHEMA='DEMOEMPL';

To see statistics for constraints in the schema:

SELECT * FROM SYSTEM.CONSTRAINT 
WHERE SCHEMA='DEMOEMPL';

Note that in all of these queries, there are substitutions you will have to make:
In the boolean expression naming SEGMENT and AREA you should specify the name of the area where the tables in the schema reside.
In the boolean expressions naming SCHEMA you should specify the name of the SQL schema you have updated statistics for instead of 'DEMOEMPL'.

Additional Information

Dictionary Structure Reference
SYSTEM Tables and SYSCA Views