GSETL/GETIT an "Oldie" that is still a "Goodie"
search cancel

GSETL/GETIT an "Oldie" that is still a "Goodie"

book

Article ID: 56243

calendar_today

Updated On:

Products

Datacom, DATACOM - AD, Ideal

Issue/Introduction

This article explains the benefits of using sequential processing in Datacom.

Environment

Release:
Component: DB

Resolution

Datacom navigational GSETL/GETIT commands provide a high-speed sequential or skip-sequential processing mode for batch applications. When used properly, the GSETL/GETIT commands can dramatically reduce the elapsed time, CPU, and I/O resources used to read records sequentially from a Datacom table. GSETL/GETIT processing is available for batch jobs using the Record-At-A-Time (RAAT) API, VSAM-T applications, or Ideal applications.

From the RAAT API

GSETL/GETIT commands are supported through the standard DBNTRY entry point used by batch navigational RAAT programs. To use GSETL/GETIT, certain User Requirements Table (URT) parameters must be selected. Further discussion of GSETL/GETIT processing and the URT parameters is included below. For detailed information, please see the Datacom/DB Programmer Guide and the Datacom/DB Database and System Administrator Guide.

From VSAM-Transparency

Each batch VSAM-T application is mapped to a VSAM Interface Table (VIT). This table provides Datacom with information on how to map the application's VSAM requests over one or more Datacom tables. For batch programs, the VIT can be set up to utilize GSETL/GETIT processing for selected tables. For more information, please see the Datacom VSAM Transparency Guide.

From Ideal Batch Applications

The Ideal batch run-time selects a URT (see the SET RUN command) to be used in processing batch Ideal programs. If the selected URT supports GSETL/GETIT processing, Ideal monitors the types of requests being issued to each table and, if sequential processing can be utilized, automatically switches its command processing from the standard Set-At-A-Time (SAAT) commands to GSETL/GETIT processing. For more information, see the Ideal Administration Guide.

Recent Client Experience

Recently, Technical Support became involved with a client situation where long-running batch jobs were beginning to present a problem for the site. The programs were still functioning properly, but as the client's data had grown, so had the overall run time of the jobs. It was suggested that GSETL/GETIT processing be considered as a way to shorten the jobs' elapsed time.

The "Pilot" Program

Before re-coding a large number of batch programs, it was decided to attempt the change with a sample program that was representative of many of the batch applications. This program was not the ideal candidate for GSETL/GETIT (a table read from beginning to end in Native Key sequence); instead, it was a relatively simple COBOL program that performed skip-sequential processing by a non-Native Key, reading a number of "next" records after each new "locate". Because the program was a relatively simple COBOL application, it would be easy to change and monitor.

Sample "Old" Code


IF FOUND-AGENT-SW = 'N'
    MOVE 'N' TO CSK-EOF-SW
    MOVE 'Y' TO FIRST-TIME-SW
    MOVE '00000' TO KEY-ASFK-ACCT ASF-ACCT1
                    ASF-ACCOUNT SLCT-ACCT
    PERFORM 4000-GET-AGENT UNTIL
        ASF-NOT-FOUND
        OR CSK-EOF-SW = 'Y'
        OR ASF-ACCOUNT NOT = '00000'
END-IF.

4000-GET-AGENT.
    IF FIRST-TIME-SW = 'Y'
        MOVE 'N' TO FIRST-TIME-SW
        MOVE 'LOCKY' TO ASF-COMMAND
        PERFORM 7000-READ-AGENT-DB THRU 7000-EXIT
        IF ASF-REC-FOUND
            MOVE 'REDKY' TO ASF-COMMAND
            PERFORM 99999-DB-ACCESS-ASF
    .

    IF ASF-REC-FOUND AND FIRST-TIME-SW = 'N'
        IF ASF-RECORD-TYPE NOT = '20'
            MOVE 'Y' TO CSK-EOF-SW
        ELSE
            MOVE 'Y' TO FOUND-AGENT-SW
            PERFORM 8000-MOVE-AGENT THRU 8000-EXIT
            MOVE 'REDNX' TO ASF-COMMAND
            PERFORM 99999-DB-ACCESS-ASF
        END-IF
    END-IF.
4000-EXIT. EXIT.

Sample "New" Code


*4000-GET-AGENT.
*    IF FIRST-TIME-SW = 'Y'
*        MOVE 'N' TO FIRST-TIME-SW
*        MOVE 'LOCKY' TO ASF-COMMAND
*        PERFORM 7000-READ-AGENT-DB THRU 7000-EXIT
*        IF ASF-REC-FOUND
*            MOVE 'REDKY' TO ASF-COMMAND
*            PERFORM 99999-DB-ACCESS-ASF
*    .
* * * * * * * *
* Replace the REDKY of paragraph 4000 with a GSETL and a GETIT to
* retrieve the row located by the LOCKY.
* Make sure your error-handling logic accepts either a "14" or a
* "19" as an end of file, as GETIT can return a "19" when EOF is
* reached.

4000-GET-AGENT.
    IF FIRST-TIME-SW = 'Y'
        MOVE 'N' TO FIRST-TIME-SW
        MOVE 'LOCKY' TO ASF-COMMAND
        PERFORM 7000-READ-AGENT-DB THRU 7000-EXIT
        IF ASF-REC-FOUND
            MOVE 'GSETL' TO ASF-COMMAND
            PERFORM 99999-DB-ACCESS-ASF
            IF ASF-REC-FOUND
                MOVE 'GETIT' TO ASF-COMMAND
                PERFORM 99999-DB-ACCESS-ASF
    .

    IF ASF-REC-FOUND AND FIRST-TIME-SW = 'N'
        IF ASF-RECORD-TYPE NOT = '20'
            MOVE 'Y' TO CSK-EOF-SW
        ELSE
            MOVE 'Y' TO FOUND-AGENT-SW
            PERFORM 8000-MOVE-AGENT THRU 8000-EXIT

* * * * * * * *
* Use GETIT instead of REDNX; make sure the "19" RC is handled.
*
*           MOVE 'REDNX' TO ASF-COMMAND
            MOVE 'GETIT' TO ASF-COMMAND
* * * * * * * *

            PERFORM 99999-DB-ACCESS-ASF
        END-IF
    END-IF.
4000-EXIT. EXIT.

The Bottom Line

Using Datacom's Accounting Facility, the program was executed both the "old" way and the "new" way. The performance differences were significant.

COMMAND   REQUESTS   LOGICAL   LOGICAL   PHYSICAL   PHYSICAL   ELAPSED
                     INDEX     DATA      INDEX      DATA       TIME

LOCKY     361        1444      0         982        0          7.65
REDKY     361        1444      361       0          361        7.58
REDNX     9.6M       9.6M      9.6M      47K        408K       19299.00

Elapsed wall clock: 6+ Hours

GSETL     361        1444      0         0          0          .02
GETIT     53K        9.8M      9.6M      47K        62K        404.00

Elapsed wall clock: 9 Minutes

The tremendous differences can be accounted for in three particular areas:

  • The ability of GETIT calls to return blocks of rows reduced the number of times the application called the MUF for data from 9.6 million to 53 thousand (approximately 180 rows per call). The number of rows that can be "blocked" per call depends on the size of the task work area and the GETBLK/GBMAXR settings; these are discussed in detail below. This reduction in calls (SVCs) to the MUF lowers the CPU processing costs in the application, the MUF, and the overall system.
  • The reduction in physical data I/Os: because GSETL/GETIT knows that a number of records will be read sequentially, it can read up to the next 10 "needed" data blocks in a single "chained" I/O.
  • The ability of MUF to look ahead and build the next "block" of rows for the application in parallel with the application processing the current block reduces the application's wait time for rows to be returned by the MUF.

While this case was not the ideal case for GSETL/GETIT processing, the high number of "next" requests following each "locate" request helped to make it a very successful sample test.

Unfortunately, not every program will benefit as much as this sample. Each program's results will vary due to the number of "next" versus "locate" calls, the index selected, the order of the data rows, the size of the rows, and many other factors. In most cases where multiple (9+) "next" rows are read per "locate" row, GSETL/GETIT processing can show a definite advantage.

Care should be exercised in selecting the values for the URT parameters that control sequential processing (SEQBUFS, GETBLK, and GBMAXR). These values should be selected according to the average number of "next" rows that are read after each "locate" and the available memory resources (SEQBUFS uses either existing MUF data buffers or additional memory in the MUF address space).
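For illustration, the fragment below is a hedged sketch of a URT table entry with these parameters. The macro names (DBURSTR, DBURTBL, DBUREND), the table and database values, and the GETBLK size are assumptions for this sketch only, and assembler column/continuation rules are not shown; verify the exact keywords, units, and defaults in the Datacom/DB Database and System Administrator Guide before coding a production URT.


*  Hedged sketch only - values are illustrative:
*    SEQBUFS=8      eight dedicated sequential buffers (a negative
*                   value such as -8 would use MUF data buffers)
*    READAHD=INDEX  index-driven read-ahead
*    GETBLK=8192    blocked GETIT transfer (example size)
*    GBMAXR=255     return as many rows as fit in the GETBLK
*    UPDATE=NO      read-only sequential processing
*
      DBURSTR
      DBURTBL TBLNAM=ASF,DBID=123,SEQBUFS=8,READAHD=INDEX,
              GETBLK=8192,GBMAXR=255,UPDATE=NO
      DBUREND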

How can I tell if my programs will benefit?

Usually, the best candidates are jobs that run for a long time and sequentially process a lot of rows. A simple Accounting table could be put in place to monitor processing to help determine these candidates. What you are looking for are jobs that have a high "next" versus "locate" rate on any given table/key combination.

Accounting Suggestion:


Key Data: JNAME, RUNIT, BASE, TNAME, COMND, KNAME
Incremental: REQS, LOGIO, LOGIX, LOGDT, EXCPS, EXCIX, EXCDT, ETIME, RTIME

You may also want to add selection criteria to the accounting table to reduce the amount of data collected:


JNAME = "MYJOB" Limit job names, if possible
JTYPE NE ?02? Exclude CICS jobs
DBID > 20 Exclude CA databases
COMND NE ("SELFR" or "SELNR" or "SELPR") Exclude SAAT commands

Consider any criteria that can be added to reduce the number of jobs, the types of commands, and so on that are monitored.

GETIT - Batch Sequential Processing in DATACOM/DB

For sequential processing of rows in a Datacom/DB table, the GETIT command offers better performance than the standard RAAT commands REDNX/RDUNX. GETIT commands employ read-ahead and (optionally) blocked transfer of rows to the application region. This article describes how GETIT processing works inside CA-Datacom/DB and what the ramifications are for the user's applications.

A set of GETIT commands must be preceded by a GSETL command. The GSETL command determines the key to be used by subsequent GETIT commands as well as a starting key-value of the first row to be returned by the GETIT process. Subsequent GETIT commands will return the "next" data rows according to the sequence of the selected key. Remember that the GSETL command only determines the starting place in the index; it does not return a data row (like the REDKG command).
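To make the command sequence concrete, here is a minimal COBOL sketch in the style of the earlier sample program. It reuses the illustrative command field (ASF-COMMAND) and access paragraph (99999-DB-ACCESS-ASF) from that sample and assumes a hypothetical ASF-RETURN-CODE field holding the Datacom return code; as noted in the sample comments, both a "14" and a "19" must be treated as end of file.


* Hedged sketch: set the starting key value (as the sample does with
* KEY-ASFK-ACCT), position with GSETL, then read forward with GETIT
* until end of file.  ASF-RETURN-CODE is an illustrative name.
5000-READ-ALL-AGENTS.
    MOVE 'GSETL' TO ASF-COMMAND
    PERFORM 99999-DB-ACCESS-ASF
    MOVE 'GETIT' TO ASF-COMMAND
    PERFORM 99999-DB-ACCESS-ASF
    PERFORM UNTIL ASF-RETURN-CODE = '14'
                  OR ASF-RETURN-CODE = '19'
        PERFORM 8000-MOVE-AGENT THRU 8000-EXIT
        MOVE 'GETIT' TO ASF-COMMAND
        PERFORM 99999-DB-ACCESS-ASF
    END-PERFORM.
5000-EXIT. EXIT.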

Because rows are returned in the sequence of a key, the DB engine attempts to asynchronously read ahead rows that are expected to be returned to the application on subsequent GETITs. Read-ahead enhances performance in two ways. First, the rows requested on subsequent GETITs may already be in main memory when those GETIT commands are received. Second, the asynchronous I/Os that read these rows ahead can be chained together, causing fewer EXCP (EXecute Channel Program) calls to the operating system. In essence, fewer I/Os are needed to read the corresponding rows into main memory and hand them to the application for processing.

Read-Ahead Processing

The user specifies the overall number of buffers to be used during GETIT processing with the URT parameter SEQBUFS. This set of buffers is divided into two groups, the "current" group and the "next" group, each containing half of the buffers. The DB engine builds and passes records to the application from the current buffer group. While the application is processing from the current group, the DB engine builds the next group asynchronously. This processing of the "next" blocks before the application is ready for them is known as "read-ahead" processing. When the current group of buffers has been exhausted, the engine flips the next group to become the current one (and vice versa) and begins the read-ahead process again.

While processing the data blocks for building the next group of buffers, the CA-Datacom buffer manager determines which of the I/Os for the next group (read-ahead) can be chained together to issue fewer EXCP calls. In many cases, all next (read-ahead) buffers can be processed with just one EXCP. Obviously, if the URT was coded with SEQBUFS=2, only one buffer is used for read-ahead, thus eliminating the performance gains of chained I/Os.

Specifying Sequential Buffers

If the URT parameter SEQBUFS value is a positive number, external sequential buffers are allocated in addition to and outside of the normal data buffers. If the value is a negative number, buffers from the MUF data buffer pool (DATANO) are used.

With external sequential buffers, the same set of buffers is used throughout the entire process and other tasks do not compete for these buffers.

With MUF data buffers, each read-ahead process selects its "next" buffers from the pool of normal data buffers according to LRU (Least Recently Used). Once processed, buffers in the "current" group are available for other task processing. Using MUF data buffers may have some effect on other tasks processing within MUF.

Sequential buffers are allocated in the length of the physical block size of the table to be processed and reside in the MUF address space; that is, they do not require any main memory allocation in the application region. They are allocated on the first GSETL call, not at URT OPEN time. This means that if a URT specifies SEQBUFS but the application never issues GSETL commands, there is no main memory overhead.

Index Read-Ahead Versus Physical Read-Ahead

The URT parameter READAHD specifies how CA-Datacom/DB determines which data blocks to use in read-ahead processing. With READAHD=INDEX, sequential processing invokes index processing to determine exactly which data blocks will contain the next rows in key sequence and then reads these blocks into the "next" buffers.

With READAHD=PHYSICAL, sequential processing does not invoke the index. Instead, data blocks are selected by their relative block number within the disk file and read in the physical sequence in which they reside on disk. If the key of reference is the Native Key and few or no rows have been added or deleted since the data was last loaded, PHYSICAL read-ahead can be more efficient because it does not invoke index processing.

In most cases, however, READAHD=INDEX is more suitable. If the key being used is not the Native Key, the possibility that the next sequential row will reside in the next physical block is greatly reduced. In addition, for tables with a lot of add activity, the possibility the data row is in the current or next physical block is reduced. In these cases, the PHYSICAL read-ahead processing will lose efficiency. If the next row in key sequence is not found in the current or the next group, a normal data buffer is used to retrieve this row synchronously, reaping no benefits from read-ahead processing.

With a non-Native Key, even index read-ahead will generally not perform nearly as well as with the Native Key. Each read-ahead buffer will contain at least one row of interest, but possibly not more. However, GETITs will still perform significantly better than REDNX/RDUNX.

Blocked GETIT Processing

The URT parameter GETBLK allows the user to request blocked transfer of rows from the MUF address space to the application region. Instead of transferring one row at a time, MUF now retrieves multiple rows sequentially and stores them in a "block" of the specified size in main memory. It then transfers the entire block to the Datacom/DB interface module (DBINFPR) in the application region. DBINFPR then satisfies the application's GETIT requests out of this block without further interaction with MUF until the block is exhausted. At that time, DBINFPR calls MUF to provide the next block of rows. The retrieval of rows to be put into this block on the MUF side works exactly the same way as with non-blocked GETITs, that is, with asynchronous, chained read-ahead I/Os using sequential buffers or normal data buffers as described above.

The GETBLK parameter is totally unrelated to physical block sizes on disk or the size of sequential buffers; it merely specifies the size of the main memory storage to be used for the transfer of data rows to the application region. Since the task communication area (selected in the MUF startup option TASKS) is used to move the block of rows from the MUF to the application region, the GETBLK specification is automatically reduced if it exceeds the task communication area size.

Because of the substantial overhead required to pass information between MUF and the application address space for each request, blocked GETIT processing is usually significantly more efficient than non-blocked GETITs.

GETIT for Update

With the URT parameter UPDATE=YES specified, a row retrieved with GETIT is held under primary exclusive control and can be updated with a subsequent UPDAT command. Since there is only one UPDATE setting per table in the URT, the UPDATE specification can be overridden for a selected group of GETITs. In this way, the application can use GSETL/GETIT in read-only mode, while still supporting updates to the selected table by other RAAT/SAAT commands. To turn off the UPDATE setting for a group of GETIT commands, set the update intent flag to N on the first GETIT call made after the GSETL command. This setting should be left unchanged (by the user) until the next GSETL call.
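A hedged sketch of that technique, again in the style of the earlier sample: ASF-UPDATE-INTENT below is a hypothetical data name standing for the update intent flag in the request area (see the Programmer Guide for the actual field position).


* Hedged sketch: the URT specifies UPDATE=YES, but this GETIT group
* runs read-only by turning the update intent flag off on the first
* GETIT after the GSETL.  ASF-UPDATE-INTENT is an illustrative name.
    MOVE 'GSETL' TO ASF-COMMAND
    PERFORM 99999-DB-ACCESS-ASF
    MOVE 'N'     TO ASF-UPDATE-INTENT
    MOVE 'GETIT' TO ASF-COMMAND
    PERFORM 99999-DB-ACCESS-ASF
* Leave the flag unchanged for the remaining GETITs of this group;
* it remains in effect until the next GSETL.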

The URT parameter AUTODXC determines whether exclusive control is to be released by a subsequent GETIT command. In the case of blocked GETITs in update mode, ALL rows returned in the GETBLK to the interface module in the application address space are placed under exclusive control. The MUF startup option EXCTLNO limits the number of rows an application may hold under exclusive control. If the number of rows in the GETBLK exceeds the EXCTLNO setting, the application receives a bad return code on the GETIT call. By reducing the GETBLK size or by using GBMAXR, you can reduce the number of rows passed in the GETBLK.

The number of rows that MUF will transfer to the interface can be restricted with the GBMAXR parameter in the URT. In read-only mode, this parameter should always be specified as 255, which causes MUF to transfer as many rows as will fit into the GETBLK. In update mode, GBMAXR can be used to reduce exclusive control waits for other tasks and to stay within the EXCTLNO limit. If GBMAXR is smaller than the number of rows represented by GETBLK, you will not be using GETBLK to its full capacity.

For an application that processes sequentially in update mode with blocked GETITs, but actually updates only very few of the rows retrieved, the following alternative eliminates unnecessary exclusive control locks (a sketch follows the list):

  • The URT specifies UPDATE=YES to enable updates.
  • GETITs are performed in one request area with the update intent flag set to N (no exclusive control).
  • Updates are performed in a second request area via an RDUxx/UPDAT sequence.
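A minimal sketch of that pattern, using hypothetical names: the ASF- fields and 99999-DB-ACCESS-ASF stand for the read-only sequential request area as in the earlier sample, while the ASF2- fields and 99999-DB-ACCESS-ASF2 stand for a second request area over the same table (AGENT-NEEDS-UPDATE, NEW-STATUS, and ASF2-AGENT-STATUS are likewise illustrative). The choice of read-for-update command (RDUKY, RDUNX, and so on) depends on how the row to be updated is re-located.


* Request area 1: read-only sequential scan (update intent flag 'N').
    MOVE 'GETIT' TO ASF-COMMAND
    PERFORM 99999-DB-ACCESS-ASF
    IF AGENT-NEEDS-UPDATE
* Request area 2 (illustrative ASF2 names): re-read the row under
* exclusive control, change it, and write it back.
        MOVE 'RDUKY' TO ASF2-COMMAND
        PERFORM 99999-DB-ACCESS-ASF2
        MOVE NEW-STATUS TO ASF2-AGENT-STATUS
        MOVE 'UPDAT' TO ASF2-COMMAND
        PERFORM 99999-DB-ACCESS-ASF2
    END-IF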

Better sequential performance?

If you want it, you can GETIT.

Additional Information

For more information, see Fast Batch Sequential Retrieval and Fast Batch Updating and Deleting.