Querying a GPHDFS external table fails with the error "SQL State: 38000", but no clear error message is returned to the client:
ERROR: external table gphdfs protocol command ended with error. 17/06/26 15:49:10 INFO security.UserGroupInformation: Login successful for user [email protected] using keytab file hdfs_test.keytab (seg91 slice1 MPP-DN-025:40003 pid=120646)
SQL State: 38000
Details:
17/06/26 15:49:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/06/26 15:49:11 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
17/06/26 15:49:11 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token
Command: 'gphdfs://hacluster/user/ldapuser/process-temp-data/DATAFLOW_SUB_45121251/H002/*_0DATA.greenplum'
External table DATAFLOW_SUB_45121251_0, file gphdfs://hacluster/user/ldapuser/process-temp-data/DATAFLOW_SUB_45121251/*_0DATA.greenplum
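For context, a gphdfs readable external table embeds the file pattern directly in its LOCATION clause. The following is a minimal sketch using the table name and URL from the error output above; the column list and FORMAT clause are assumptions, since the actual DDL was not captured:

CREATE EXTERNAL TABLE dataflow_sub_45121251_0 (
    id   int,    -- hypothetical columns; the real definition was not captured
    data text
)
LOCATION (
    -- wildcard pattern taken from the error output above
    'gphdfs://hacluster/user/ldapuser/process-temp-data/DATAFLOW_SUB_45121251/*_0DATA.greenplum'
)
FORMAT 'TEXT' (DELIMITER ',');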
There is no useful information in the logs on the master host. However, the logs on the segment that reported the error show that the data files on HDFS (Hadoop Distributed File System) do not match the input pattern specified in the external table definition:
2017-06-26 14:49:29.820907 CST,"bigdata","user1",p155761,th-945256672,"10.17.2.40","46156",2017-06-26 14:49:26 CST,127800508,con407236,cmd7,seg28,slice1,dx261750,x127800508,sx1,"LOG","00000","read err msg from pipe, len:1456 msg:17/06/26 14:49:28 INFO security.UserGroupInformation: Login successful for user [email protected] using keytab file hdfs_test.keytab
17/06/26 14:49:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/06/26 14:49:29 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
17/06/26 14:49:29 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 89150 for hdfs_test on ha-hdfs:hacluster
17/06/26 14:49:29 INFO security.TokenCache: Got dt for hdfs://hacluster; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hacluster
Exception in thread ""main"" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input Pattern hdfs://hacluster/user/ldapuser/process-temp-data/DATAFLOW_SUB_45121251/*_0DATA.greenplum matches 0 files
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:323)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:265)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:387)
    at com.emc.greenplum.gpdb.hdfsconnector.HDFSReader.assignSplits(HDFSReader.java:245)
    at com.emc.greenplum.gpdb.hdfsconnector.HDFSReader.doRead(HDFSReader.java:157)
    at com.emc.greenplum.gpdb.hdfsconnector.HDFSReader.main(HDFSReader.java:258)
",,,,,,,0,,,,
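The failing segment is identified in the client error itself (for example, seg28 in the entry above). One way to locate the corresponding entry is to search that segment's pg_log for the pipe error; the hostname and data directory below are illustrative and will differ per cluster:

# run on the segment host named in the error; the data directory path is illustrative
grep 'read err msg from pipe' /data/primary/gpseg28/pg_log/gpdb-2017-06-26_*.csv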
This is not an issue with Greenplum Database (GPDB). The user running the query should check the data source on HDFS to determine why no files match the input pattern, for example as sketched below.
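The check can be done directly against HDFS with the same glob the external table uses. A minimal sketch, assuming the hadoop client is installed and configured for the hacluster nameservice, and that Kerberos is enabled as the log output indicates:

# authenticate as the keytab principal seen in the error output
kinit -kt hdfs_test.keytab [email protected]

# apply the same glob as the external table; "matches 0 files" in the
# stack trace means this listing is expected to come back empty
hdfs dfs -ls 'hdfs://hacluster/user/ldapuser/process-temp-data/DATAFLOW_SUB_45121251/*_0DATA.greenplum'

# list the parent directory to see which files actually exist there
hdfs dfs -ls hdfs://hacluster/user/ldapuser/process-temp-data/DATAFLOW_SUB_45121251/

If the second listing shows files in a different location or with different names (for example, under a subdirectory, as the Command URL in the first error output suggests), either move the files or adjust the LOCATION pattern in the external table definition accordingly.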