A daily job uses Fast Unload for Db2 for z/OS (PFU) to extract data from a Db2 table; a batch process running
in a distributed Unix AIX environment then loads that data into an Oracle table.
Recently the distributed batch job on Unix AIX has failed several times with a duplicate key error when it tries to load the data into the Oracle table.
However, if the Fast Unload job is rerun to recreate the file and the distributed batch process is restarted, the distributed batch job on Unix AIX completes successfully.
A review of the unloaded file found several duplicate keys. A Log Analyzer job was run to track the Insert/Update/Delete activity, and it showed that the same
record was first deleted and then inserted again by an online process, yet Fast Unload wrote that record twice to the unload file.
What causes the duplicate records in the unloaded file, and how can they be avoided?
The Fast Unload step ran with the FASTUNLOAD ... SQL-ACCESS ONLY parameter.
With SQL-ACCESS ONLY, locks are taken directly by Db2. For Db2 access, Fast Unload (PFU) works by default with
MULTI-ROW-FETCH 100. The following scenario recreates the situation:
1. PFU opens a cursor
2. PFU fetches rows 1-100 and writes them to the output
3. PFU fetches rows 101-200
4. Application deletes row 1
5. Application inserts a row with the same index key as row 1
   (this is valid and does not violate uniqueness, because row 1 has
   already been deleted); the new row is placed at the end of the
   tablespace as row number 400
6. PFU fetches rows 201-300
7. PFU fetches rows 301-400; row 400 has the same key as row 1, so that key now appears twice in the unload file
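To make steps 4 and 5 concrete, here is a minimal SQL sketch of what the online process effectively does; the table name ORDERS, the unique key column ORDER_ID, and the second column and its value are hypothetical, used only for illustration:

   -- Hypothetical names: ORDERS, ORDER_ID (unique key), ORDER_DATA.
   -- The second INSERT is legal because the row with key 1 is already gone.
   DELETE FROM ORDERS WHERE ORDER_ID = 1 ;            -- step 4
   COMMIT ;
   INSERT INTO ORDERS (ORDER_ID, ORDER_DATA)          -- step 5: same key,
          VALUES (1, 'NEW VERSION') ;                 --  stored as row 400
   COMMIT ;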
There is currently no way for Fast Unload to guarantee data integrity without locking the table, which can be problematic in a busy production environment.
One option that may reduce the probability of duplicates is to use SQL-ACCESS NONE with SHRLEVEL CHANGE,BP-LOOKUP. In this case Fast Unload checks
the Db2 log, identifies which pages of the unloaded tablespace changed during the unload process, and reads the buffer pools to get the latest page data.
Note that this also does not guarantee data integrity, as described in the documentation.
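As a minimal sketch only, this is how those keywords might be coded in the Fast Unload SYSIN; only SQL-ACCESS NONE and SHRLEVEL CHANGE,BP-LOOKUP come from the discussion above, while the statement layout, the SELECT statement, and the table name CREATOR.TABLE1 are assumptions that should be checked against the Fast Unload documentation:

//* Hedged sketch: only SQL-ACCESS NONE and SHRLEVEL CHANGE,BP-LOOKUP are
//* taken from the discussion above; the layout, SELECT statement, and
//* table name are illustrative assumptions.
//SYSIN    DD *
  FASTUNLOAD
    SQL-ACCESS  NONE
    SHRLEVEL    CHANGE,BP-LOOKUP
  SELECT * FROM CREATOR.TABLE1 ;
/*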
If you work with SQL-ACCESS ONLY, you are always exposed to duplicate key records in the unloaded data set. Another option is to add
a step to the unload JCL that eliminates the first (stale) record of each duplicate key by using the SORT program (the DFSORT ICETOOL SELECT operator).
The following SELECT operator keeps only the last record for each key, so the outdated first record is discarded and only the second record is sent to the distributed environment (Unix AIX):
SELECT FROM(indd) TO(outdd) ON(p,m,CH) LAST DISCARD(savedd)
p: starting position (first byte) of the key within the record
m: length of the key in bytes
CH: the key is compared as character data
You can review the SELECT operator in the ICETOOL chapter of the IBM manual z/OS DFSORT Application Programming Guide.
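Below is a hedged example of a complete ICETOOL step built around that SELECT statement. The step name, data set names, and the key position and length (assumed here to be position 1, length 10) are placeholders that must be adjusted to the real unload file layout:

//* Hedged example: data set names are placeholders, and the key is
//* assumed to start in column 1 with a length of 10 bytes.
//DEDUP    EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//INDD     DD DISP=SHR,DSN=HLQ.UNLOAD.DATA
//OUTDD    DD DSN=HLQ.UNLOAD.DEDUP,DISP=(NEW,CATLG,DELETE),
//            LIKE=HLQ.UNLOAD.DATA
//SAVEDD   DD DSN=HLQ.UNLOAD.DISCARDS,DISP=(NEW,CATLG,DELETE),
//            LIKE=HLQ.UNLOAD.DATA
//TOOLIN   DD *
* Keep only the LAST record for each key value; the earlier (stale)
* duplicates are written to SAVEDD for review.
  SELECT FROM(INDD) TO(OUTDD) ON(1,10,CH) LAST DISCARD(SAVEDD)
/*

The OUTDD data set is then the one transferred to the distributed environment instead of the raw unload file.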
One more option is available, but it requires a change to the application design. You can create a BMC Quick Copy SHRLEVEL CHANGE CONSISTENT copy (or an IBM COPY SHRLEVEL CHANGE FLASHCOPY
CONSISTENT copy) and then run the Fast Unload job, unloading the data from the consistent copy. Because the copy is consistent, there will be no duplicate keys in the unloaded data set. This
requires a redesign of the application and of the unload steps in the job.
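For the IBM COPY variant, a minimal sketch of the utility control statement follows; DBNAME.TSNAME is a placeholder, and the utility JCL and copy data set templates that surround the statement are not shown:

  -- Hedged sketch: DBNAME.TSNAME is a placeholder; the copy templates and
  -- the surrounding utility JCL are omitted.
  COPY TABLESPACE DBNAME.TSNAME
       FLASHCOPY CONSISTENT
       SHRLEVEL CHANGE

The Fast Unload step would then read the consistent copy instead of the live table space; the BMC Quick Copy alternative would use that product's equivalent SHRLEVEL CHANGE consistent options.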