When using PXF with writable external tables to write data to Network Attached Storage (NAS) over the NFS protocol, smaller jobs may complete successfully, but larger tables show a significant decline in performance, resulting in considerably longer processing times, or the jobs may fail with the traces below.
2025-04-18 09:58:36.304 IST ERROR [174373-0000000002:pxf_abc:0 ] 231749 --- [88-exec-11] o.g.p.s.b.WriteBridge : Failed to close bridge resources: Invalid argument
2025-04-18 09:58:36.305 IST INFO [174373-0000000002:pxf_abc:0 ] 231749 --- [88-exec-11] o.g.p.s.c.BaseServiceImpl : Failed write operation [2387 ms, 3 records, 1.26 records/sec, 400 bytes, 167.57 bytes/sec] for null
2025-04-18 09:58:36.305 IST ERROR [174373-0000000002:pxf_abc:0 ] 231749 --- [88-exec-11] o.g.p.s.c.PxfErrorReporter : Invalid argument
java.io.IOException: Invalid argument
at java.io.FileOutputStream.close0(Native Method) ~[?:1.8.0_432]
at java.io.FileOutputStream.access$000(FileOutputStream.java:53) ~[?:1.8.0_432]
at java.io.FileOutputStream$1.close(FileOutputStream.java:356) ~[?:1.8.0_432]
at java.io.FileDescriptor.closeAll(FileDescriptor.java:212) ~[?:1.8.0_432]
at java.io.FileOutputStream.close(FileOutputStream.java:354) ~[?:1.8.0_432]
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.close(RawLocalFileSystem.java:250) ~[hadoop-common-2.10.2.jar!/:?]
at java.io.FilterOutputStream.close(FilterOutputStream.java:159) ~[?:1.8.0_432]
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72) ~[hadoop-common-2.10.2.jar!/:?]
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106) ~[hadoop-common-2.10.2.jar!/:?]
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.close(ChecksumFileSystem.java:414) ~[hadoop-common-2.10.2.jar!/:?]
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72) ~[hadoop-common-2.10.2.jar!/:?]
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106) ~[hadoop-common-2.10.2.jar!/:?]
at org.apache.parquet.hadoop.util.HadoopPositionOutputStream.close(HadoopPositionOutputStream.java:64) ~[parquet-hadoop-1.11.1.jar!/:1.11.1]
at org.apache.parquet.hadoop.ParquetFileWriter.end(ParquetFileWriter.java:865) ~[parquet-hadoop-1.11.1.jar!/:1.11.1]
at org.apache.parquet.hadoop.InternalParquetRecordWriter.close(InternalParquetRecordWriter.java:122) ~[parquet-hadoop-1.11.1.jar!/:1.11.1]
at org.apache.parquet.hadoop.ParquetRecordWriter.close(ParquetRecordWriter.java:165) ~[parquet-hadoop-1.11.1.jar!/:1.11.1]
at org.greenplum.pxf.plugins.hdfs.ParquetFileAccessor.closeForWrite(ParquetFileAccessor.java:287) ~[pxf-hdfs-6.8.0.jar!/:?]
at org.greenplum.pxf.service.bridge.WriteBridge.endIteration(WriteBridge.java:105) ~[classes!/:6.8.0]
at org.greenplum.pxf.service.controller.WriteServiceImpl.readStream(WriteServiceImpl.java:74) ~[classes!/:6.8.0]
at org.greenplum.pxf.service.controller.WriteServiceImpl.lambda$writeData$0(WriteServiceImpl.java:40) ~[classes!/:6.8.0]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_432]
at javax.security.auth.Subject.doAs(Subject.java:360) ~[?:1.8.0_432]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1918) ~[hadoop-common-2.10.2.jar!/:?]
at org.greenplum.pxf.service.security.BaseSecurityService.doAs(BaseSecurityService.java:122) ~[classes!/:6.8.0]
at org.greenplum.pxf.service.controller.BaseServiceImpl.processData(BaseServiceImpl.java:74) ~[classes!/:6.8.0]
at org.greenplum.pxf.service.controller.WriteServiceImpl.writeData(WriteServiceImpl.java:40) ~[classes!/:6.8.0]
at org.greenplum.pxf.service.rest.PxfWriteResource.produceResponse(PxfWriteResource.java:52) ~[classes!/:6.8.0]
at org.greenplum.pxf.service.rest.PxfWriteResource.produceResponse(PxfWriteResource.java:19) ~[classes!/:6.8.0]
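The traces above come from the PXF service log on the segment hosts. With the default PXF 6 layout they can be located with a search like the following (a sketch, assuming $PXF_BASE is set in the gpadmin environment and the hostname host1 is illustrative):
[gpadmin@host1 ~]$ grep 'Failed to close bridge resources' $PXF_BASE/logs/pxf-service.log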
Environment - GPDB 6.27.1 writing to a NAS using PXF 6.8.0
Cause 01 - Out of space
The space required for the copy is greater than the size of the final files alone; extra space is needed for temporary files created during the copy. The log below shows that writes were completing successfully before they began failing with the error. Up to that point the job had not failed outright; it was just extremely slow.
2024-09-28 19:12:27.698 IST INFO [881322-0000000002:host01:263] 44196 --- [88-exec-15] o.g.p.s.c.BaseServiceImpl : Completed write operation [140 ms, 1 record, 7.14 records/sec, 2061 bytes, 14721.43 bytes/sec]
2024-09-28 20:08:20.868 IST ERROR [885444-0000000002:host02:260] 44196 --- [888-exec-4] o.g.p.s.b.WriteBridge : Failed to close bridge resources: Invalid argument
2024-09-28 20:08:20.868 IST ERROR [885444-0000000002:host02:262] 44196 --- [88-exec-11] o.g.p.s.b.WriteBridge : Failed to close bridge resources: Invalid argument
2024-09-28 20:08:20.869 IST INFO [885444-0000000002:host02:260] 44196 --- [888-exec-4] o.g.p.s.c.BaseServiceImpl : Failed write operation [390851 ms, 1 record, 0.00 records/sec, 343 bytes, 0.88 bytes/sec] for null
2024-09-28 20:08:20.869 IST INFO [885444-0000000002:host02:262] 44196 --- [88-exec-11] o.g.p.s.c.BaseServiceImpl : Failed write operation [390851 ms, 1 record, 0.00 records/sec, 321 bytes, 0.82 bytes/sec] for null
2024-09-28 20:08:20.869 IST ERROR [885444-0000000002:host02:262] 44196 --- [88-exec-11] o.g.p.s.c.PxfErrorReporter : Invalid argument
java.io.IOException: Invalid argument
at java.io.FileOutputStream.close0(Native Method) ~[?:1.8.0_422]
at java.io.FileOutputStream.access$000(FileOutputStream.java:53) ~[?:1.8.0_422]
at java.io.FileOutputStream$1.close(FileOutputStream.java:356) ~[?:1.8.0_422]
at java.io.FileDescriptor.closeAll(FileDescriptor.java:212) ~[?:1.8.0_422]
at java.io.FileOutputStream.close(FileOutputStream.java:354) ~[?:1.8.0_422]
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.close(RawLocalFileSystem.java:250) ~[hadoop-common-2.10.2.jar!/:?]
at java.io.FilterOutputStream.close(FilterOutputStream.java:159) ~[?:1.8.0_422]
Cause 02 - Incompatible NFS versions for the mount point configured on the Greenplum cluster hosts
Check the NFS version of the mount <mountpointname> on every host:
[gpadmin@host1 ~]$ gpssh -f hostfile -e "mount" | grep mountpoint
[host1] 172.20.20.160:/mountpoint on /mountpoint type nfs4 (rw,relatime,vers=4.1,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.20.0.20,local_lock=none,addr=172.20.20.160)
[host2] 172.20.20.160:/mountpoint on /mountpoint type nfs4 (rw,relatime,vers=4.1,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.20.0.21,local_lock=none,addr=172.20.20.160)
[hostm] 172.20.20.160:/mountpoint on /mountpoint type nfs (rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.20.20.160,mountvers=3,mountport=635,mountproto=udp,local_lock=none,addr=172.20.20.160)
Note that hostm has mounted the share with vers=3 (NFSv3), while host1 and host2 use vers=4.1 (NFSv4.1). This version mismatch across the cluster hosts is the incompatibility that causes the failures.
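The negotiated NFS version can also be confirmed per host with nfsstat, which prints the mount flags (including vers=) for every NFS mount; run through gpssh it gives the same per-host view (assuming the same hostfile as above):
[gpadmin@host1 ~]$ gpssh -f hostfile -e "nfsstat -m"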
Resolution for Cause 01 -
Free up more space on the storage, or
Use a different NAS with more free space.
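Before rerunning the job, available space on the NFS mount can be checked on every host at once (a quick sketch, assuming the hostfile and /mountpoint names from the Cause 02 example):
[gpadmin@host1 ~]$ gpssh -f hostfile -e "df -h /mountpoint"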
Resolution for Cause 02 -
Remount the NFS share so that all Greenplum cluster hosts use the same NFS version. In the example above, hostm mounts the share as NFSv3 and should be remounted with vers=4.1 to match host1 and host2.
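For example, the mount on hostm could be rebuilt with the matching version (a minimal sketch; the server address and options are taken from the mount output above, and the matching vers= option should also be persisted in /etc/fstab so the fix survives a reboot):
[root@hostm ~]# umount /mountpoint
[root@hostm ~]# mount -t nfs4 -o vers=4.1,rsize=65536,wsize=65536,hard,proto=tcp,timeo=600 172.20.20.160:/mountpoint /mountpoint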