Greenplum Decision Support Benchmark TPC-DS benchmark run fails on custom database due to missing schema/table
search cancel

Greenplum Decision Support Benchmark TPC-DS benchmark run fails on custom database due to missing schema/table

book

Article ID: 411273

calendar_today

Updated On:

Products

VMware Tanzu Data Intelligence VMware Tanzu Greenplum VMware Tanzu Greenplum / Gemfire VMware Tanzu Data Suite VMware Tanzu Data Suite

Issue/Introduction

When user wants to run /opt/GPDSB/tpcds.sh as role ${ADMIN_USER} against a custom database, for example, if people wish to run tpcds as gpadmin user against a database named test, they can simply change the following line in gpdsb_variables.sh and then source it:

export PGDATABASE="test"

and executing the script ./tpcds.sh it'd fail with:

psql:/opt/GPDSB/05_sql/101.dsbench.01.sql:25: ERROR:  relation "store_returns" does not exist
LINE 5: from store_returns

Environment

GPDSB 1.0

Cause

The query content indicated that it's querying the table  store_returns without specifying any schema:

$ cat /opt/GPDSB/05_sql/101.dsbench.01.sql
set role dsbench;
:EXPLAIN_ANALYZE
-- start query 12 in stream 0 using template query1.tpl and seed 893351816
with customer_total_return as
(select sr_customer_sk as ctr_customer_sk
,sr_store_sk as ctr_store_sk
,sum(SR_RETURN_TAX) as ctr_total_return
from store_returns
,date_dim
where sr_returned_date_sk = d_date_sk
and d_year =2001
group by sr_customer_sk
,sr_store_sk)
 select  c_customer_id
from customer_total_return ctr1
,store
,customer
where ctr1.ctr_total_return > (select avg(ctr_total_return)*1.2
from customer_total_return ctr2
where ctr1.ctr_store_sk = ctr2.ctr_store_sk)
and s_store_sk = ctr1.ctr_store_sk
and s_state = 'TN'
and ctr1.ctr_customer_sk = c_customer_sk
order by c_customer_id
limit 100;

-- end query 12 in stream 0 using template query1.tpl

 

This table was created under tpcds schema during previous steps:

$ cat 03_ddl/018.gpdb.store_returns.sql
CREATE TABLE tpcds.store_returns (
    sr_returned_date_sk integer,
    sr_return_time_sk integer,
    sr_item_sk integer NOT NULL,
    sr_customer_sk integer,
    sr_cdemo_sk integer,
    sr_hdemo_sk integer,
    sr_addr_sk integer,
    sr_store_sk integer,
    sr_reason_sk integer,
    sr_ticket_number bigint NOT NULL,
    sr_return_quantity integer,
    sr_return_amt numeric(7,2),
    sr_return_tax numeric(7,2),
    sr_return_amt_inc_tax numeric(7,2),
    sr_fee numeric(7,2),
    sr_return_ship_cost numeric(7,2),
    sr_refunded_cash numeric(7,2),
    sr_reversed_charge numeric(7,2),
    sr_store_credit numeric(7,2),
    sr_net_loss numeric(7,2)
)
WITH (:MEDIUM_STORAGE)
:DISTRIBUTED_BY
partition by range(sr_returned_date_sk)
(start(2450815) INCLUSIVE end(2453005) INCLUSIVE every (100),
default partition others);

 

However, GPDSB tpcds.sh did not alter the search_path for custom database, it only altered for database ${ADMIN_USER}, so for a custom database the search_path remain as default public, which further result in "relation does not exist" error.

$ grep -n search_path /opt/GPDSB/03_ddl/rollout.sh.bak
80:SetSearchPath="ALTER database ${ADMIN_USER} SET search_path=tpcds, \"\${ADMIN_USER}\", public"
95:log_time "Set search_path for database ${ADMIN_USER}"

Resolution

Resolution

Upgrade to GPDSB1.1 to fix this issue.

Temporary Workaround

Once tpcds schema is crearted, run ALTER DATABASE <database_name> SET search_path TO tpcds, public; to change the search path

Additional Information

2 more possible scenario if the error persists even after performing the workaround

Scenario #1

If there's any role config on ${ADMIN_USER} to set search_path, it will override the above query. For example if here is a ${ADMIN_USER} - gpadmin is with:

gpadmin=# select rolname, rolconfig from pg_roles where rolname='gpadmin';
 rolname | rolconfig
---------+-----------
 gpadmin | {"search_path=\"$user\", public"}
(1 row)

The tpc-ds benchmark run will keep failing, in this case, either:

- Reset the role search_path if it's not required

alter role gpadmin reset search_path ;

- Add tpcds into the search_path:

alter role gpadmin set search_path=tpcds, "$user", public ;

 

Scenario #2

- If there are any statement that was trying to set search_path when psql via .psqlrc, it will also result in the same error, for example:

# cat /home/gpadmin/.psqlrc
set search_path=public;

- In this case remove the line in .psqlrc if it is not required.