PXF reports an error when writing Parquet data with type timestamp(x)

Article ID: 296442

Products

VMware Tanzu Greenplum

Issue/Introduction

When writing data to Hive via PXF, you may receive an error of the form: Text 'xxx' could not be parsed, unparsed text found at index xx. For example:
ERROR:  remote component error (500) from '127.0.0.1:5888':  type  Exception report   message   Text '2020-03-14 03:59:36.153' could not be parsed, unparsed text found at index 19    description   The server encountered an internal error that prevented it from fulfilling this request. 
 
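Some PXF versions cannot properly handle values of type timestamp(x) that carry fractional seconds. In the error above, index 19 is the position of the fractional part (".153") in the timestamp text.

The following example shows how the issue occurs: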

1. Create a PXF writable external table to write Parquet data:
CREATE WRITABLE EXTERNAL TABLE pxf_hdfs_test_w (date timestamp(6))
LOCATION ('pxf://greenplum/pxf_test/pxf_gp_table_w?PROFILE=hdfs:parquet')
FORMAT 'CUSTOM' (FORMATTER='pxfwritable_export');

2. Insert some data:
-- values with no (or zero) fractional seconds, equivalent to timestamp(0), insert without issue
insert into pxf_hdfs_test_w values('2020-03-22 01:53:08');
insert into pxf_hdfs_test_w values('2020-03-22 01:53:09.000');

-- values with non-zero fractional seconds, type timestamp(x), fail
insert into pxf_hdfs_test_w values('2020-03-14 03:59:36.153');

3. For data of type timestamp(x) with fractional seconds, the insert query fails:
gpadmin=# insert into pxf_hdfs_test_w values('2020-03-14 03:59:36.153');
ERROR:  remote component error (500) from '127.0.0.1:5888':  type  Exception report   message   Text '2020-03-14 03:59:36.153' could not be parsed, unparsed text found at index 19    description   The server encountered an internal error that prevented it from fulfilling this request.    exception   java.time.format.DateTimeParseException: Text '2020-03-14 03:59:36.153' could not be parsed, unparsed text found at index 19 (libchurl.c:944)  (seg1 192.168.6.198:20001 pid=16320) (cdbdisp.c:254)
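
The exception message above can be reproduced outside PXF with plain java.time. The following is a minimal sketch, not PXF's actual code: the class name TimestampParseDemo, the helper parseErrorIndex, and the seconds-only pattern "yyyy-MM-dd HH:mm:ss" are assumptions chosen for illustration, since this article does not show the pattern PXF uses. It illustrates that a formatter expecting whole seconds stops at index 19, where the fractional part begins.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class TimestampParseDemo {
    // Assumed pattern (whole seconds only); the pattern PXF really uses is not shown in this article.
    static final DateTimeFormatter SECONDS_ONLY =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Returns -1 when the text parses cleanly, otherwise the index reported by the parse exception.
    static int parseErrorIndex(String text) {
        try {
            LocalDateTime.parse(text, SECONDS_ONLY);
            return -1;
        } catch (DateTimeParseException e) {
            return e.getErrorIndex();
        }
    }

    public static void main(String[] args) {
        // Both sample values are taken from the example in this article.
        String[] samples = {
            "2020-03-22 01:53:08",
            "2020-03-14 03:59:36.153"
        };
        for (String s : samples) {
            int idx = parseErrorIndex(s);
            System.out.println(s + " -> " + (idx < 0 ? "parsed" : "failed at index " + idx));
        }
    }
}
```

Under these assumptions, the value without fractional seconds parses, while the value ending in ".153" fails at index 19, matching the index in the PXF error.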


Environment

Product Version: 5.17

Resolution

This issue has been fixed in PXF release 5.15.

If upgrading PXF is not an option at this time, you can use the following workaround.

Cast the data from timestamp(x) to timestamp(0), as shown below. Note that the cast rounds off the fractional seconds, so sub-second precision is lost:
# insert into pxf_hdfs_test_w values('2020-03-14 03:59:36.153'::timestamp(0));
INSERT 0 1