PXF server error(500) : Could not initialize class org.xerial.snappy.Snappy


Article ID: 376327


Products

VMware Tanzu Greenplum, Greenplum, Pivotal Data Suite Non Production Edition, VMware Tanzu Data Suite

Issue/Introduction

There is a known issue with the snappy package: it uses the tmp directory to stage and execute certain files. If the tmp directory is mounted with the noexec option, snappy fails to initialize and PXF queries fail with errors such as the following:


ERROR:  PXF server error(500) : Could not initialize class org.xerial.snappy.Snappy  (seg1 slice1 xxx.xxx.xxx.xxx:45001 pid=87316)
DETAIL:  java.lang.NoClassDefFoundError: Could not initialize class org.xerial.snappy.Snappy
        at org.apache.parquet.hadoop.codec.SnappyDecompressor.decompress(SnappyDecompressor.java:62)
        at org.apache.parquet.hadoop.codec.NonBlockedDecompressorStream.read(NonBlockedDecompressorStream.java:51)
        at java.io.DataInputStream.readFully(DataInputStream.java:195)
        at java.io.DataInputStream.readFully(DataInputStream.java:169)
        at org.apache.parquet.bytes.BytesInput$StreamBytesInput.toByteArray(BytesInput.java:279)
        at org.apache.parquet.bytes.BytesInput.toByteBuffer(BytesInput.java:230)
        at org.apache.parquet.bytes.BytesInput.toInputStream(BytesInput.java:239)
        at org.apache.parquet.column.impl.ColumnReaderBase.readPageV1(ColumnReaderBase.java:650)
        at org.apache.parquet.column.impl.ColumnReaderBase.access$300(ColumnReaderBase.java:57)
        at org.apache.parquet.column.impl.ColumnReaderBase$3.visit(ColumnReaderBase.java:593)
        at org.apache.parquet.column.impl.ColumnReaderBase$3.visit(ColumnReaderBase.java:590)
        at org.apache.parquet.column.page.DataPageV1.accept(DataPageV1.java:120)
        at org.apache.parquet.column.impl.ColumnReaderBase.readPage(ColumnReaderBase.java:590)
        at org.apache.parquet.column.impl.ColumnReaderBase.checkRead(ColumnReaderBase.java:564)
        at org.apache.parquet.column.impl.ColumnReaderBase.consume(ColumnReaderBase.java:705)
        at org.apache.parquet.column.impl.ColumnReaderImpl.consume(ColumnReaderImpl.java:30)
        at org.apache.parquet.column.impl.ColumnReaderImpl.<init>(ColumnReaderImpl.java:47)
        at org.apache.parquet.column.impl.ColumnReadStoreImpl.getColumnReader(ColumnReadStoreImpl.java:84)
        at org.apache.parquet.io.RecordReaderImplementation.<init>(RecordReaderImplementation.java:271)
        at org.apache.parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:147)
        at org.apache.parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:109)
        at org.apache.parquet.filter2.compat.FilterCompat$NoOpFilter.accept(FilterCompat.java:165)
        at org.apache.parquet.io.MessageColumnIO.getRecordReader(MessageColumnIO.java:109)
        at org.apache.parquet.hadoop.InternalParquetRecordReader.checkRead(InternalParquetRecordReader.java:137)
        at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:226)
        at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:132)
        at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:136)
        at org.greenplum.pxf.plugins.hdfs.ParquetFileAccessor.readNextObject(ParquetFileAccessor.java:186)
        at org.greenplum.pxf.service.bridge.ReadBridge.getNext(ReadBridge.java:86)
        at org.greenplum.pxf.service.controller.ReadServiceImpl.processFragment(ReadServiceImpl.java:157)
        at org.greenplum.pxf.service.controller.ReadServiceImpl.writeStream(ReadServiceImpl.java:101)
        at org.greenplum.pxf.service.controller.ReadServiceImpl.lambda$null$0(ReadServiceImpl.java:58)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:360)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1918)
        at org.greenplum.pxf.service.security.BaseSecurityService.doAs(BaseSecurityService.java:122)
        at org.greenplum.pxf.service.controller.BaseServiceImpl.processData(BaseServiceImpl.java:74)
        at org.greenplum.pxf.service.controller.ReadServiceImpl.lambda$readData$1(ReadServiceImpl.java:58)
        at org.greenplum.pxf.service.controller.PxfErrorReporter.invokeWithErrorHandling(PxfErrorReporter.java:28)
        at org.greenplum.pxf.service.controller.ReadServiceImpl.readData(ReadServiceImpl.java:58)
        at org.greenplum.pxf.service.rest.PxfReadResource.lambda$produceResponse$0(PxfReadResource.java:53)
        at org.springframework.web.servlet.mvc.method.annotation.StreamingResponseBodyReturnValueHandler$StreamingResponseBodyTask.call(StreamingResponseBodyReturnValueHandler.java:111)
        at org.springframework.web.servlet.mvc.method.annotation.StreamingResponseBodyReturnValueHandler$StreamingResponseBodyTask.call(StreamingResponseBodyReturnValueHandler.java:98)
        at org.springframework.web.context.request.async.WebAsyncManager.lambda$startCallableProcessing$4(WebAsyncManager.java:355)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.greenplum.pxf.service.spring.PxfContextMdcLogEnhancerDecorator.lambda$decorate$0(PxfContextMdcLogEnhancerDecorator.java:27)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
HINT:  Check the PXF logs located in the '/usr/local/pxf-gp6/logs' directory on host 'localhost' or 'set client_min_messages=LOG' for additional details.
CONTEXT:  External table pxf_tbl_parquet_read  (seg1 slice1 xxx.xxx.xxx.xxx:45001 pid=12345)


Environment

PXF Versions: 6.3.2 - 6.10.1

GPDB Versions: 6.x.x

Cause

This is a known issue with the snappy package: it uses the tmp directory to stage and execute certain files. If the tmp directory is mounted with the noexec option, the snappy tool fails with numerous errors.
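The triggering condition can also be detected by inspecting the mount options for /tmp programmatically. A minimal sketch, assuming typical mount(8) output; the tmp_has_noexec helper name is ours, not a PXF or snappy utility:

```shell
# Hypothetical helper: succeeds if a mount(8) output line lists
# noexec among the comma-separated options in parentheses.
tmp_has_noexec() {
  case "$1" in
    *\(*noexec*\)*) return 0 ;;
    *) return 1 ;;
  esac
}

line='tmpfs on /tmp type tmpfs (rw,nosuid,nodev,noexec,seclabel)'
if tmp_has_noexec "$line"; then
  echo "/tmp is mounted noexec"
fi
```

On an affected host this prints "/tmp is mounted noexec"; on a host where /tmp allows execution it prints nothing.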

Resolution

To address this issue, first verify whether the /tmp directory is mounted with the noexec option. If it is, configure snappy to use a different temporary directory by following the steps outlined below.

 

1. Run the following command to check whether the /tmp directory is mounted with the noexec option:

   $ mount | grep '/tmp'
   tmpfs on /tmp type tmpfs (rw,nosuid,nodev,noexec,seclabel)

2. If noexec appears in the mount options, stop the PXF cluster:

   $ pxf cluster stop

3. Add the option -Dorg.xerial.snappy.tempdir=${PXF_BASE}/run to PXF_JVM_OPTS in $PXF_BASE/conf/pxf-env.sh. For example:

   # Memory
   export PXF_JVM_OPTS="-Xmx2g -Xms1g -Dorg.xerial.snappy.tempdir=${PXF_BASE}/run"

4. Restart the PXF cluster:

   $ pxf cluster start
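The pxf-env.sh edit above can be sketched as a script. This is a demonstration against a throwaway copy of the file, not your live configuration; the /tmp/pxf-demo path is a demo value, and the sed expression assumes PXF_JVM_OPTS is already exported on a single double-quoted line:

```shell
# Stage a throwaway copy of pxf-env.sh (demo path, not a real $PXF_BASE).
DEMO_BASE=/tmp/pxf-demo
mkdir -p "$DEMO_BASE/conf"
CONF="$DEMO_BASE/conf/pxf-env.sh"
printf '%s\n' '# Memory' 'export PXF_JVM_OPTS="-Xmx2g -Xms1g"' > "$CONF"

# Append the snappy tempdir flag inside the existing quoted value,
# skipping the edit if the flag is already present (idempotent).
FLAG='-Dorg.xerial.snappy.tempdir=${PXF_BASE}/run'
if ! grep -q 'org\.xerial\.snappy\.tempdir' "$CONF"; then
  sed -i "s|^\(export PXF_JVM_OPTS=\".*\)\"|\1 ${FLAG}\"|" "$CONF"
fi
cat "$CONF"
```

After it runs, the file contains export PXF_JVM_OPTS="-Xmx2g -Xms1g -Dorg.xerial.snappy.tempdir=${PXF_BASE}/run", matching the example in step 3.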