Datacom SQL job doing a SELECT DISTINCT never ends until timeout or cancelled with S222. Either in a loop or very slow performance.
Uses INNER JOIN with CAST and SUBSTR functions and WHERE clause with LIKE and IN predicates.
SELECT DISTINCT ...
...
FROM TB0
INNER JOIN TB1
ON CAST(TB0.NBR AS CHAR(9)) = SUBSTR(TB1.TEXT,1,9)
WHERE TB0.ID LIKE 'X0%'
AND SUBSTR(TB0.ID2,1,1) = 'D' ;
Release : 15.1
The problem is that the join fields are not a key fields and using CAST and IN functions, also the fields used in the LIKE and IN predicates are not key fields either.
The SQL optimizer in this case chooses to do a nested loop join. For every row in the first table it has to search all rows in the other table, in this case about 8 million rows.
To improve performance change to use key fields in the join and move the current ON clause to the WHERE clause.
If performance does still not improve, force the SQL optimizer to use a hash join by adding the clause DATACOM_OPTIMIZATION = 'USE HASH JOIN'
Make the following change:
SELECT DISTINCT ...
...
FROM TB0
INNER JOIN TB1
ON TB0.ORG = TB1.ORG
AND DATACOM_OPTIMIZATION = 'USE HASH JOIN'
WHERE
CAST(TB0.NBR AS CHAR(9)) = SUBSTR(TB1.TEXT,1,9)
AND TB0.ID LIKE 'X0%'
AND SUBSTR(TB0.ID2,1,1) = 'D' ;
See the Datacom documentation section HASH JOIN