Querying a catalog table pg_indexes will return ERROR: cache lookup failed for relation <index oid> (ruleutils.c:10322)
Preparation
Create a test table, insert data into it, and analyze it
drop table if exists test1 ;
create table test1 (calendar_date date);
insert into test1(calendar_date) values('20250101');
analyze test1 ;
Reproduce the Issue
1. Create index on the table
drop index if exists index_test1;
create index index_test1 on test1(calendar_date);
2. Open session 1 begin a transaction to re-create the index:
begin;
drop index if exists index_test1;
create index index_test1 on test1(calendar_date);
3. Open session 2 and query pg_indexes:
select * from pg_indexes where tablename='test1'; <----------- this query will hung
4. Go to session 1 and commit the transaction;
5, Now the session 2 will throw out ERROR: cache lookup failed for relation <index oid> (ruleutils.c:10322)
GPDB 6
This is a expected behavior which was inherited from upstream postgres, let's break down the the cause step by step
Step 1 - session 1 - uncommitted transaction will lock both the table and index
gpadmin=# begin;
BEGIN
gpadmin=# drop index if exists index_test1;
DROP INDEX
gpadmin=# create index index_test1 on test1(calendar_date);
CREATE INDEX
gpadmin=# select pid, sess_id from pg_stat_activity where pid = pg_backend_pid();
pid | sess_id
---------+---------
3559635 | 361
(1 row)
gpadmin=# select l.locktype, l.relation, c.relname,c.relkind, l.mode, l.granted, l.mppsessionid from pg_locks l join pg_class c on l.relation = c.oid where mppsessionid=361 and relation in (select oid from pg_class where relname in ('index_test1', 'test1'));
locktype | relation | relname | relkind | mode | granted | mppsessionid
----------+----------+-------------+---------+---------------------+---------+--------------
relation | 1444935 | test1 | r | ShareLock | t | 361
relation | 1444935 | test1 | r | AccessExclusiveLock | t | 361
relation | 1444935 | test1 | r | ShareLock | t | 361
relation | 1444935 | test1 | r | AccessExclusiveLock | t | 361
relation | 1444935 | test1 | r | ShareLock | t | 361
relation | 1444935 | test1 | r | AccessExclusiveLock | t | 361
relation | 1444938 | index_test1 | i | AccessExclusiveLock | t | 361
relation | 1444938 | index_test1 | i | AccessExclusiveLock | t | 361
relation | 1444938 | index_test1 | i | AccessExclusiveLock | t | 361
(9 rows)
Step 2 - Session 2 - pg_indexes query will wait until AccessExclusiveLock released
- Now the following query will be stuck on waiting to gain a AccessShareLock on relation 'test1', however, such lock will not be granted unless the AccessExclusiveLock in session 1 get released.
gpadmin=# select * from pg_indexes where tablename='test1'; <----------- this query will hung
gpadmin=# select l.locktype, l.relation, c.relname,c.relkind, l.mode, l.granted, l.mppsessionid from pg_locks l join pg_class c on l.relation = c.oid where relation in (select oid from pg_class where relname in ('index_test1', 'test1')) and granted = 'f';
locktype | relation | relname | relkind | mode | granted | mppsessionid
----------+----------+----------+---------+-----------------+---------+--------------
relation | 1444935 | test1 | r | AccessShareLock | f | 472
(1 row)
- Note that at this moment, the index oid was staying as the old one 1444938 because session 1 was not getting committed
gpadmin=# select oid,relname from pg_class where relname in ('index_test1', 'test1');
oid | relname
---------+------------
1444935 | test1
1444938 | index_test1
(2 rows)
Step 3 - commit session 1, session 2 was granted with required lock, only to find the old index oid is gone, thus returned with error.
- End session 1
session 1
gpadmin=# end ;
COMMIT
- Session 2 now can finish but will return with an error because the index oid has been recreated so the old index oid will no longer be found.
gpadmin=# select * from pg_indexes where tablename='test1';
ERROR: cache lookup failed for relation 1444938 (ruleutils.c:10350)
- Checking again now to identify the index oid change
gpadmin=# select oid,relname from pg_class where relname in ('index_test1', 'test1');
oid | relname
---------+------------
1444939 | index_test1
1444935 | test1
(2 rows)
This is a expected by-design behavior, re-run the query it will work fine,