GPText search function may return wrong when the search was limited with time range
search cancel

GPText search function may return wrong when the search was limited with time range

book

Article ID: 296337

calendar_today

Updated On:

Products

VMware Tanzu Greenplum

Issue/Introduction

When using the GPText search function against a certain date period, sometimes the result could be wrong due to the code for formatting the timezone. Please refer to the example below:

Let's assume our timezone setting is like below:

$ gpconfig -s TimeZone
Values on all segments are consistent
GUC          : TimeZone
Master  value: America/New_York
Segment value: America/New_York


We have a record in table called 't1'

demo=# select * from t1;
 id |   content    |            date
----+--------------+----------------------------
  1 | test message | 2020-07-20 23:59:30.527511
(1 row)


Create an index based on the following table:

SELECT * FROM gptext.create_index('public','t1', 'id', 'content');
SELECT * FROM gptext.index(TABLE(SELECT * FROM t1), 'demo.public.t1');
SELECT * FROM gptext.commit_index('demo.public.t1');


Now, if we search the index against date range within 2020-07-21, it will return the record in t1, which is not correct as it's date is 2020-07-20.

select * from gptext.search(table(select 1 scatter by 1),'demo.public.t1','*',
    array['date:[' || '2020-07-21'|| 'T00:00:00.000Z TO ' || '2020-07-21' || 'T23:59:59.999999Z]'], 
    'fl=*');

 id | score | hs |                                               rf
----+-------+----+------------------------------------------------------------------------------------------------
 1  |     1 |    | {"columnValue":[{"name":"_version_","value":"1674611282395791360"},{"name":"id","value":"1"}]}



Environment

Product Version: 6.5

Resolution

Root cause:

The column type in the table is 'timestamp' which is not affected by timezone but column type 'timestamptz' is. However, GPText converts values of 'timestamp' type as the same as 'timestamptz'. So the date will be updated with the timezone information within GPText. For example:

demo=# select now()::timestamptz as timestamptz , now()::timestamp as timestamp ;
          timestamptz          |         timestamp
-------------------------------+----------------------------
 2020-08-10 00:47:40.198913-04 | 2020-08-10 00:47:40.198913
(1 row)


Workaround:

Add an offset to the time range in the index searching query


Solutions:

The issue will be fixed in GPTEXTv3.4.4