SQL query with a large IN list is performing poorly
search cancel

SQL query with a large IN list is performing poorly

book

Article ID: 33237

calendar_today

Updated On:

Products

IDMS

Issue/Introduction

When running an SQL query with an IN predicate in the WHERE clause, the query runs fine as long as the number of entries in the IN list does not exceed a certain number.

Is there a limit as to the number of values that can be specified in an “IN” list?

Environment

Release: All supported releases.
Component: SQL Option.

Cause

If a query experiences a change in performance when it is altered, that is almost certainly due to a different path being taken through the database on the different queries. 

Resolution

There are indeed limits for various aspects of SQL, including syntax limits. These are documented at Summary of Limits. However, it can be confusing as to how the limit applies to a particular SQL statement, because the limit is applied after the compiler processes the statement and translates the clauses into how they will be processed. An "IN" clause, for example, is actually an "OR-list" so it's a series of OR clauses, each clause representing a statement that compares the table column to a specific value in the list. 

If a limit is exceeded, an error message will be produced indicating that has occurred. In some cases, the query will not receive an error; it will just encounter differences in performance time once it exceeds a certain number of parameter values. 

This change in performance is almost certainly due to a different path being taken through the database on the different queries. That can be verified using the EXPLAIN command. 

There are a couple of options available if there needs to be more entries in the "IN" list. 

  1. Probably the easiest and most efficient to execute is to create a table to hold the values to be compared. If the values are mostly static, store them in the table and update them as necessary. If they're not known until run-time, store them in the table at that point. Then issue a join between the column currently in the IN clause and the column of the table where the values are stored.
  2. Split the current statement into multiple statements, each one using a subset of the desired list of IN values. Store the results of each query in a temporary table. When all the queries are processed, retrieve the results from the temporary table.
  3. There are some variations on the second option above, such as storing the IN values in an array and processing various number of rows from the array in each command, using a variable to represent the subscript for the place in the array.
  4. Examine the results of the EXPLAIN command and determine if the performance could be improved by adding an additional access method (such as a new index) or by tuning an existing access method.

Additional Information