PXF server error 500 ArrayIndexOutOfBoundsException when accessing external tables

Article ID: 296905

Products

VMware Tanzu Greenplum
VMware Tanzu Greenplum / Gemfire
VMware Tanzu Data Suite

Issue/Introduction

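When accessing an external table through PXF, Greenplum reports the error "ERROR: PXF server error(500) : java.lang.ArrayIndexOutOfBoundsException".

The PXF log reports the following stack trace, which shows the failure occurring while PXF opens a Parquet file for writing: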

2023-02-10 11:20:36.138 EST ERROR [124487-0000000008:dbname_dbuser:8  ] 763588 --- [88-exec-10] o.g.p.s.c.PxfErrorReporter               : 89

java.lang.ArrayIndexOutOfBoundsException: 89
        at org.greenplum.pxf.plugins.hdfs.ParquetFileAccessor.getTypeForColumnDescriptor(ParquetFileAccessor.java:557) ~[pxf-hdfs-6.5.0.jar!/:?]
        at org.greenplum.pxf.plugins.hdfs.ParquetFileAccessor.lambda$generateParquetSchema$2(ParquetFileAccessor.java:499) ~[pxf-hdfs-6.5.0.jar!/:?]
        at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) ~[?:1.8.0_192]
        at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382) ~[?:1.8.0_192]
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) ~[?:1.8.0_192]
        at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) ~[?:1.8.0_192]
        at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) ~[?:1.8.0_192]
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:1.8.0_192]
        at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) ~[?:1.8.0_192]
        at org.greenplum.pxf.plugins.hdfs.ParquetFileAccessor.generateParquetSchema(ParquetFileAccessor.java:500) ~[pxf-hdfs-6.5.0.jar!/:?]
        at org.greenplum.pxf.plugins.hdfs.ParquetFileAccessor.openForWrite(ParquetFileAccessor.java:251) ~[pxf-hdfs-6.5.0.jar!/:?]
        at org.greenplum.pxf.service.bridge.WriteBridge.lambda$beginIteration$0(WriteBridge.java:60) ~[classes!/:6.5.0]
        at org.greenplum.pxf.service.utilities.GSSFailureHandler.execute(GSSFailureHandler.java:64) ~[classes!/:6.5.0]
        at org.greenplum.pxf.service.bridge.WriteBridge.beginIteration(WriteBridge.java:60) ~[classes!/:6.5.0]
        at org.greenplum.pxf.service.controller.WriteServiceImpl.readStream(WriteServiceImpl.java:66) ~[classes!/:6.5.0]
        at org.greenplum.pxf.service.controller.WriteServiceImpl.lambda$writeData$0(WriteServiceImpl.java:40) ~[classes!/:6.5.0]
        at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_192]

Environment

Platform Extension Framework (PXF) 6.5 and below

Cause

One of the columns in the external table is of type NUMERIC with a declared precision greater than 38.

The current logic in PXF accounts for Apache Hive's limits on data types. In Hive, DECIMAL / NUMERIC types have a maximum precision of 38, and PXF also uses 38 as the default precision for the Parquet NUMERIC type. If a NUMERIC column in the Greenplum table declares its own precision, PXF uses that precision instead, but still assumes it is no larger than 38. A column declared with a larger precision therefore breaks that assumption and produces the ArrayIndexOutOfBoundsException shown above.
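The failure mode described above can be illustrated with a small, self-contained Java sketch. This is not PXF's source code: the class name `PrecisionLookupSketch`, the table `BYTES_FOR_PRECISION`, and both lookup methods are hypothetical, and the idea that the lookup is indexed by `precision - 1` is an assumption made only for illustration. It shows how a table sized for a maximum precision of 38 fails when a larger precision is used as the index, and how a range check turns that into a clearer error.

```java
import java.math.BigInteger;

public class PrecisionLookupSketch {
    // Limit taken from the article: Hive caps DECIMAL / NUMERIC precision at 38.
    static final int MAX_PRECISION = 38;

    // Hypothetical lookup table: entry [p - 1] holds the number of bytes needed
    // to store a signed unscaled decimal value with precision p.
    static final int[] BYTES_FOR_PRECISION = new int[MAX_PRECISION];

    static {
        for (int p = 1; p <= MAX_PRECISION; p++) {
            // Bits for the largest p-digit value (10^p - 1), plus one sign bit.
            int bits = BigInteger.TEN.pow(p).subtract(BigInteger.ONE).bitLength() + 1;
            BYTES_FOR_PRECISION[p - 1] = (bits + 7) / 8; // round up to whole bytes
        }
    }

    // Assumes the precision never exceeds MAX_PRECISION; no range check.
    static int uncheckedLookup(int precision) {
        return BYTES_FOR_PRECISION[precision - 1];
    }

    // Validates the precision first and reports a descriptive error instead.
    static int checkedLookup(int precision) {
        if (precision < 1 || precision > MAX_PRECISION) {
            throw new IllegalArgumentException(
                "NUMERIC precision " + precision + " is outside 1.." + MAX_PRECISION);
        }
        return BYTES_FOR_PRECISION[precision - 1];
    }

    public static void main(String[] args) {
        System.out.println("bytes for precision 38: " + checkedLookup(38));
        try {
            uncheckedLookup(90); // index 89 in a 38-element array
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unchecked lookup failed: " + e.getMessage());
        }
        try {
            checkedLookup(90);
        } catch (IllegalArgumentException e) {
            System.out.println("checked lookup rejected: " + e.getMessage());
        }
    }
}
```

Under the sketch's indexing assumption, the value 89 in the logged exception would correspond to a column declared with precision 90. The exact wording of the exception message depends on the Java version.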

Resolution

Workaround:

Reduce the precision of the NUMERIC column to 38 or less.

Fix:

Improvements were made in PXF 6.6.0 and later. See the Release Notes for more details.