ERROR: PXF server error(500) : java.lang.ArrayIndexOutOfBoundsException

Article ID: 296905


Products

VMware Tanzu Greenplum

Issue/Introduction

When accessing an external table using PXF, Greenplum reports "ERROR: PXF server error(500) : java.lang.ArrayIndexOutOfBoundsException"

The PXF log reports:
2023-02-10 11:20:36.138 EST ERROR [124487-0000000008:dbname_dbuser:8  ] 763588 --- [88-exec-10] o.g.p.s.c.PxfErrorReporter               : 89

java.lang.ArrayIndexOutOfBoundsException: 89
        at org.greenplum.pxf.plugins.hdfs.ParquetFileAccessor.getTypeForColumnDescriptor(ParquetFileAccessor.java:557) ~[pxf-hdfs-6.5.0.jar!/:?]
        at org.greenplum.pxf.plugins.hdfs.ParquetFileAccessor.lambda$generateParquetSchema$2(ParquetFileAccessor.java:499) ~[pxf-hdfs-6.5.0.jar!/:?]
        at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) ~[?:1.8.0_192]
        at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382) ~[?:1.8.0_192]
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) ~[?:1.8.0_192]
        at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) ~[?:1.8.0_192]
        at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) ~[?:1.8.0_192]
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:1.8.0_192]
        at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) ~[?:1.8.0_192]
        at org.greenplum.pxf.plugins.hdfs.ParquetFileAccessor.generateParquetSchema(ParquetFileAccessor.java:500) ~[pxf-hdfs-6.5.0.jar!/:?]
        at org.greenplum.pxf.plugins.hdfs.ParquetFileAccessor.openForWrite(ParquetFileAccessor.java:251) ~[pxf-hdfs-6.5.0.jar!/:?]
        at org.greenplum.pxf.service.bridge.WriteBridge.lambda$beginIteration$0(WriteBridge.java:60) ~[classes!/:6.5.0]
        at org.greenplum.pxf.service.utilities.GSSFailureHandler.execute(GSSFailureHandler.java:64) ~[classes!/:6.5.0]
        at org.greenplum.pxf.service.bridge.WriteBridge.beginIteration(WriteBridge.java:60) ~[classes!/:6.5.0]
        at org.greenplum.pxf.service.controller.WriteServiceImpl.readStream(WriteServiceImpl.java:66) ~[classes!/:6.5.0]
        at org.greenplum.pxf.service.controller.WriteServiceImpl.lambda$writeData$0(WriteServiceImpl.java:40) ~[classes!/:6.5.0]
        at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_192]
   :
   :
   :

Cause:

One of the columns in the external table is defined with type NUMERIC and a very high precision (greater than the maximum of 38 that PXF supports).

The current logic in PXF takes Apache Hive's data type limitations into account. In Hive, the DECIMAL/NUMERIC type has a maximum precision of 38, and by default PXF also writes Parquet decimal values with a precision of 38. If a NUMERIC column in the Greenplum table declares its own precision, however, that precision is used instead, and PXF assumes it will never exceed the maximum of 38. When the declared precision is larger, PXF uses it to index a fixed-size internal lookup array while generating the Parquet schema, which is why the ArrayIndexOutOfBoundsException in the stack trace above reports an index (89) derived from the column's precision.
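
For illustration only (the table name, column, and HDFS path below are hypothetical), a writable external table whose NUMERIC column declares a precision above 38 reproduces the error when PXF generates the Parquet schema on write:

-- Hypothetical example: the declared precision 90 exceeds the maximum of 38
CREATE WRITABLE EXTERNAL TABLE ext_sales (
    id     int,
    amount numeric(90,10)
)
LOCATION ('pxf://data/sales?PROFILE=hdfs:parquet')
FORMAT 'CUSTOM' (FORMATTER='pxfwritable_export');

-- The exception is raised when PXF builds the Parquet schema during the write:
INSERT INTO ext_sales VALUES (1, 123.45);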

Environment

Product Version: 6.23

Resolution

Workaround:

Reduce the precision of the NUMERIC column to 38 or less, as in the sketch below.
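
Continuing the hypothetical example above, drop and recreate the external table with the precision capped at 38:

-- Hypothetical fix: cap the declared precision at 38
DROP EXTERNAL TABLE ext_sales;

CREATE WRITABLE EXTERNAL TABLE ext_sales (
    id     int,
    amount numeric(38,10)
)
LOCATION ('pxf://data/sales?PROFILE=hdfs:parquet')
FORMAT 'CUSTOM' (FORMATTER='pxfwritable_export');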