GPHDFS external table query fails with error: "SQL State: 38000"

Article ID: 295586

Products

VMware Tanzu Greenplum

Issue/Introduction

Symptoms:

A query against a GPHDFS external table fails with "SQL State: 38000". The error returned to the client does not state the underlying cause:

ERROR: external table gphdfs protocol command ended with error. 17/06/26 15:49:10 INFO security.UserGroupInformation: Login successful for user [email protected] using keytab file hdfs_test.keytab (seg91 slice1 MPP-DN-025:40003 pid=120646) 
SQL State: 38000 
Details:
17/06/26 15:49:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
17/06/26 15:49:11 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded. 
17/06/26 15:49:11 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 
Command: 'gphdfs://hacluster/user/ldapuser/process-temp-data/DATAFLOW_SUB_45121251/H002/*_0DATA.greenplum'
External table DATAFLOW_SUB_45121251_0, file gphdfs://hacluster/user/ldapuser/process-temp-data/DATAFLOW_SUB_45121251/*_0DATA.greenplum

Environment


Cause

The logs on the master host contain no useful information. The log on the segment that reported the error, however, shows that no data files on HDFS (Hadoop Distributed File System) match the input pattern specified in the external table definition. The Hadoop connector raises an InvalidInputException stating that the input pattern "matches 0 files".

2017-06-26 14:49:29.820907 CST,"bigdata","user1",p155761,th-945256672,"10.17.2.40","46156",2017-06-26 14:49:26 CST,127800508,con407236,cmd7,seg28,slice1,dx261750,x127800508,sx1,"LOG","00000","read err msg from pipe, len:1456 msg:17/06/26 14:49:28 INFO security.UserGroupInformation: Login successful for user [email protected] using keytab file hdfs_test.keytab
17/06/26 14:49:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/06/26 14:49:29 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
17/06/26 14:49:29 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 89150 for hdfs_test on ha-hdfs:hacluster
17/06/26 14:49:29 INFO security.TokenCache: Got dt for hdfs://hacluster; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hacluster
Exception in thread ""main"" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input Pattern hdfs://hacluster/user/ldapuser/process-temp-data/DATAFLOW_SUB_45121251/*_0DATA.greenplum matches 0 files
 at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:323)
 at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:265)
 at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:387)
 at com.emc.greenplum.gpdb.hdfsconnector.HDFSReader.assignSplits(HDFSReader.java:245)
 at com.emc.greenplum.gpdb.hdfsconnector.HDFSReader.doRead(HDFSReader.java:157)
 at com.emc.greenplum.gpdb.hdfsconnector.HDFSReader.main(HDFSReader.java:258)
",,,,,,,0,,,,

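When several segment logs need to be checked, the InvalidInputException message is the line to look for. The following is a minimal Python sketch, assuming the segment log text has already been read into a string. The helper name find_unmatched_patterns is hypothetical and is not part of any Greenplum tooling; the message format is taken from the log excerpt above.

```python
import re

# Matches the Hadoop message "Input Pattern <uri> matches 0 files".
# The wording is copied from the segment log excerpt; other Hadoop
# versions may phrase it differently (assumption).
_PATTERN_RE = re.compile(r"Input Pattern (\S+) matches 0 files")


def find_unmatched_patterns(log_text):
    """Return each HDFS input pattern reported as matching 0 files.

    Hypothetical helper for illustration only.
    """
    seen = []
    for match in _PATTERN_RE.finditer(log_text):
        pattern = match.group(1)
        if pattern not in seen:  # keep first-seen order, drop repeats
            seen.append(pattern)
    return seen


# Sample text reproduced from the log excerpt in this article.
sample = (
    'Exception in thread ""main"" '
    "org.apache.hadoop.mapreduce.lib.input.InvalidInputException: "
    "Input Pattern hdfs://hacluster/user/ldapuser/process-temp-data/"
    "DATAFLOW_SUB_45121251/*_0DATA.greenplum matches 0 files"
)
print(find_unmatched_patterns(sample))
```

The returned pattern is the one to verify against the actual contents of HDFS.
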
Resolution

This is not an issue with Greenplum Database (GPDB). The user running the query should check the data source on HDFS to determine why no files match the input pattern specified in the external table definition.
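
To reason about why a pattern matches nothing, it can help to test it against a listing of the files that actually exist (for example, one obtained with hdfs dfs -ls -R). The following Python sketch is an illustration under stated assumptions, not the connector's implementation: it assumes that, as in Hadoop path globbing, "*" and "?" match only within a single path component, and it ignores other glob syntax such as character classes and braces. The function names and the file listing are hypothetical and were invented for the example.

```python
import re


def glob_to_regex(pattern):
    """Translate a simple glob into a regex where '*' and '?' never cross '/'.

    Only '*' and '?' are handled; everything else is treated literally.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append("[^/]*")  # any run of characters within one component
        elif ch == "?":
            parts.append("[^/]")   # exactly one character within one component
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts) + r"\Z")


def matching_files(pattern, paths):
    """Return the paths that the glob pattern matches in full."""
    rx = glob_to_regex(pattern)
    return [p for p in paths if rx.match(p)]


# Pattern as reported in the segment log (scheme and cluster name removed).
pattern = "/user/ldapuser/process-temp-data/DATAFLOW_SUB_45121251/*_0DATA.greenplum"

# Hypothetical listing: the file name and its location are invented.
listing = [
    "/user/ldapuser/process-temp-data/DATAFLOW_SUB_45121251/H002/part1_0DATA.greenplum",
]

# The file sits one directory deeper than the pattern allows, so nothing matches.
print(matching_files(pattern, listing))
```

In this invented scenario the files sit in a subdirectory that the pattern does not cover, so the pattern matches nothing. Other possible causes, such as files that have not yet been written or a different file name suffix, can be checked the same way.
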