When writing to an external table using PXF, Greenplum reports "ERROR: PXF server error(500) : java.lang.ArrayIndexOutOfBoundsException".
The PXF log reports:
2023-02-10 11:20:36.138 EST ERROR [124487-0000000008:dbname_dbuser:8 ] 763588 --- [88-exec-10] o.g.p.s.c.PxfErrorReporter : 89
java.lang.ArrayIndexOutOfBoundsException: 89
at org.greenplum.pxf.plugins.hdfs.ParquetFileAccessor.getTypeForColumnDescriptor(ParquetFileAccessor.java:557) ~[pxf-hdfs-6.5.0.jar!/:?]
at org.greenplum.pxf.plugins.hdfs.ParquetFileAccessor.lambda$generateParquetSchema$2(ParquetFileAccessor.java:499) ~[pxf-hdfs-6.5.0.jar!/:?]
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) ~[?:1.8.0_192]
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382) ~[?:1.8.0_192]
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) ~[?:1.8.0_192]
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) ~[?:1.8.0_192]
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) ~[?:1.8.0_192]
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:1.8.0_192]
at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) ~[?:1.8.0_192]
at org.greenplum.pxf.plugins.hdfs.ParquetFileAccessor.generateParquetSchema(ParquetFileAccessor.java:500) ~[pxf-hdfs-6.5.0.jar!/:?]
at org.greenplum.pxf.plugins.hdfs.ParquetFileAccessor.openForWrite(ParquetFileAccessor.java:251) ~[pxf-hdfs-6.5.0.jar!/:?]
at org.greenplum.pxf.service.bridge.WriteBridge.lambda$beginIteration$0(WriteBridge.java:60) ~[classes!/:6.5.0]
at org.greenplum.pxf.service.utilities.GSSFailureHandler.execute(GSSFailureHandler.java:64) ~[classes!/:6.5.0]
at org.greenplum.pxf.service.bridge.WriteBridge.beginIteration(WriteBridge.java:60) ~[classes!/:6.5.0]
at org.greenplum.pxf.service.controller.WriteServiceImpl.readStream(WriteServiceImpl.java:66) ~[classes!/:6.5.0]
at org.greenplum.pxf.service.controller.WriteServiceImpl.lambda$writeData$0(WriteServiceImpl.java:40) ~[classes!/:6.5.0]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_192]
Platform Extension Framework (PXF) 6.5 and below
One of the columns in the external table is of type NUMERIC with a precision greater than 38.
The current logic in PXF takes Apache Hive's limitations on data types into account. In Hive, DECIMAL / NUMERIC types have a maximum precision of 38, and by default PXF also uses a precision of 38 for the Parquet NUMERIC type. If a NUMERIC column in the Greenplum table defines its own precision, PXF uses that precision instead. However, PXF assumes that this precision is never larger than the maximum of 38, so a column that declares a higher precision causes the ArrayIndexOutOfBoundsException shown in the log above.
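The failure mode described above can be illustrated with a minimal Java sketch. This is not PXF's source code: the class name, the method names, the table layout (one entry per precision, stored at index precision - 1), and the byte-count formula are all assumptions made for illustration. It only shows how a lookup table sized for a maximum precision of 38 throws ArrayIndexOutOfBoundsException when it is asked about a larger precision, and how an explicit range check would turn that into a clearer error.

```java
import java.math.BigInteger;

// Hypothetical illustration only; names and layout are NOT taken from PXF.
class PrecisionLookupSketch {
    // Maximum precision, matching the Hive limit of 38 described in the text.
    static final int MAX_PRECISION = 38;

    // Assumed lookup table: bytes needed to store a decimal of precision p,
    // kept at index p - 1. It has exactly MAX_PRECISION entries.
    static final int[] BYTES_FOR_PRECISION = new int[MAX_PRECISION];

    static {
        for (int p = 1; p <= MAX_PRECISION; p++) {
            // Bits for the largest unscaled value (10^p - 1) plus one sign bit,
            // rounded up to whole bytes.
            int bits = BigInteger.TEN.pow(p).subtract(BigInteger.ONE).bitLength() + 1;
            BYTES_FOR_PRECISION[p - 1] = (bits + 7) / 8;
        }
    }

    // Unchecked lookup: trusts that precision is at most 38 and throws
    // ArrayIndexOutOfBoundsException otherwise.
    static int byteCountUnchecked(int precision) {
        return BYTES_FOR_PRECISION[precision - 1];
    }

    // Checked lookup: rejects out-of-range precisions with a descriptive error.
    static int byteCountChecked(int precision) {
        if (precision < 1 || precision > MAX_PRECISION) {
            throw new IllegalArgumentException(
                "precision " + precision + " is outside the supported range 1.." + MAX_PRECISION);
        }
        return BYTES_FOR_PRECISION[precision - 1];
    }

    public static void main(String[] args) {
        System.out.println("bytes for precision 38: " + byteCountUnchecked(38));
        try {
            // 90 is an arbitrary example of a precision above the limit.
            byteCountUnchecked(90);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unchecked lookup failed: " + e.getMessage());
        }
        try {
            byteCountChecked(90);
        } catch (IllegalArgumentException e) {
            System.out.println("checked lookup failed: " + e.getMessage());
        }
    }
}
```

In this sketch the unchecked lookup fails with only an array index in the message, which resembles the bare number in the PXF log, while the checked variant names the offending precision. Whether PXF indexes its table the same way is an assumption.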
Reduce the precision of the NUMERIC column in the external table definition to 38 or less.
Improvements were made in PXF version 6.6.0 and above. See the Release Notes for more details.