An external table has been set up in Pivotal HDB to read data from the Hive table staging.user_name_data:
CREATE EXTERNAL table ext_user_name_data ( user_id varchar(256), gender varchar(50), first_name varchar(100), last_name varchar(100), middle_name varchar(100) ) LOCATION ('pxf://hawq20:51200/staging.user_name_data?PROFILE=Hive') FORMAT 'custom' (formatter='pxfwritable_import');
When querying the table using Pivotal Extension Framework (PXF) in Pivotal HDB, the following error message is produced:
gpadmin=# select count(*) from ext_user_name_data ; ERROR: remote component error (500) from '192.168.177.170:51200': type Exception report message java.io.IOException: Not a file: hdfs://hawq20.lab:8020/apps/hive/warehouse/staging.db/user_name_data/base description The server encountered an internal error that prevented it from fulfilling this request. exception javax.servlet.ServletException: java.io.IOException: Not a file: hdfs://hawq20.lab:8020/apps/hive/warehouse/staging.db/user_name_data/base (libchurl.c:878) gpadmin=#
[hdfs@hawq20 ~]$ hdfs dfs -ls /apps/hive/warehouse/staging.db/user_name_data/base Found 4 items -rwxrwxrwx 3 hive hdfs 1910 2016-06-03 18:13 /apps/hive/warehouse/staging.db/user_name_data/base/000000_0 -rwxrwxrwx 3 hive hdfs 19265 2016-06-03 18:22 /apps/hive/warehouse/staging.db/user_name_data/base/000000_0_copy_1 -rwxrwxrwx 3 hive hdfs 19265 2016-06-03 19:50 /apps/hive/warehouse/staging.db/user_name_data/base/000000_0_copy_2 -rwxrwxrwx 3 hive hdfs 19265 2016-06-03 19:51 /apps/hive/warehouse/staging.db/user_name_data/base/000000_0_copy_3 [hdfs@hawq20 ~]$Hive is likely configured to correctly read the files from the subdirectory, but Pivotal HDB is not.
1. In Ambari, under MAPREDUCE / Custom mapred-site / add in two custom values:
hive.mapred.supports.subdirectories=true mapred.input.dir.recursive=true
2. Restart all services as requested by Ambari (YARN, MAPREDUCE, HIVE).
3. Rerun the query to confirm that the issue is resolved:
gpadmin=# select count(*) from ext_user_name_data; count ------- 646 (1 row) gpadmin=#