pxf data loading failed with error: core-site.xml is missing or using unsupported file

Article ID: 296304

Products

VMware Tanzu Greenplum

Issue/Introduction

A PXF data loading job failed with the error below:
psql:/hadoop1/apps/xxx/xxx/stdby/xxx.sql:12: ERROR: remote component error (500) from '127.0.0.1:1234': type Exception report message core-site.xml is missing or using unsupported file:// as default filesystem description The server encountered an internal error that prevented it from fulfilling this request. exception java.lang.IllegalStateException: core-site.xml is missing or using unsupported file:// as default filesystem (libchurl.c:920) (segxxx  slicex  ip_address:port  pid=1234) (libchurl.c:920)
CONTEXT: External table table_name, line 1 of file pxf://data/output/xxx.xxxx*?PROFILE=hdfs:text
2020-01-01 01:01:01: Error occured while loading into Standby GPDB for xxxxx
The PXF log on the segment host ($PXF_CONF/logs/pxf-service.log or /usr/local/greenplum-pxf/logs/pxf-service.log) shows:
ERROR tomcat-http--1111 org.apache.hadoop.conf.Configuration - error parsing conf file:/usr/local/greenplum-pxf/servers/default/core-site.xml
java.io.FileNotFoundException: /usr/local/greenplum-pxf/servers/default/core-site.xml (Too many open files)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(FileInputStream.java:195)
        at java.io.FileInputStream.<init>(FileInputStream.java:138)
        at java.io.FileInputStream.<init>(FileInputStream.java:93)
The customer mentioned that this issue had occurred a few times before. Restarting the PXF service makes loading work for a while, but data loading later fails again.

Environment

Product Version: 6.0

Resolution

The open file limit on that host was set to 65K, while lsof output on the same host showed that the number of open files was already at 80K. As a result, every data loading run reported the error on that host.
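The comparison above can also be made for a single process. The following is a minimal sketch, assuming a Linux host where `/proc/<pid>/limits` and `/proc/<pid>/fd` are readable by the current user; the function names are hypothetical and are not part of PXF or Greenplum. It reads the "Max open files" limit of one process and counts the file descriptors that process currently holds, which may be easier to compare than a host-wide lsof count, since the open file limit is enforced per process.

```python
import os


def parse_max_open_files(limits_text):
    """Return (soft, hard) 'Max open files' values from /proc/<pid>/limits text."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns: "Max" "open" "files" <soft> <hard> "files"
            soft, hard = line.split()[3:5]
            return int(soft), int(hard)
    raise ValueError("no 'Max open files' line found")


def fd_usage(pid):
    """Return (open_fd_count, soft_limit) for a running process on Linux."""
    with open(f"/proc/{pid}/limits") as fh:
        soft, _hard = parse_max_open_files(fh.read())
    # Each entry in /proc/<pid>/fd is one open file descriptor.
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    return open_fds, soft
```

For example, `fd_usage(pxf_pid)` could be called with the process ID of the PXF Java process (here `pxf_pid` is a placeholder); a count close to the soft limit would be consistent with the "Too many open files" message in the log.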

The GPDB v5 installation guide recommends setting the open file limit to 512K. After raising the open file limit to 512K, restart the PXF service; data loading should then complete again.
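To confirm that the new limit is in effect before restarting PXF, a check like the one below could be run from a fresh session of the user that starts the PXF service. This is only a sketch: it assumes that "512K" means 524288 file descriptors, and `nofile_ok` is a hypothetical helper name. It reports the limit of the Python process itself, which is inherited from the session that launched it, not the limit of an already running PXF process.

```python
import resource

# Assumption: "512K" is read as 512 * 1024 = 524288 open files.
RECOMMENDED_NOFILE = 524288


def nofile_ok(required=RECOMMENDED_NOFILE):
    """Return (ok, soft, hard) for the current process's open file limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unlimited soft limit trivially satisfies any requirement.
    ok = soft == resource.RLIM_INFINITY or soft >= required
    return ok, soft, hard


if __name__ == "__main__":
    ok, soft, hard = nofile_ok()
    print(f"soft={soft} hard={hard} meets_recommendation={ok}")
```

If the check reports a soft limit below the recommended value, the session has likely not picked up the new setting yet, and restarting PXF from that session would start it with the old limit.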