When accessing an external table in Pivotal HDB through the gphdfs protocol, the query fails with an UnknownHostException, as shown below:
gpadmin=# select * from testhdfs10;
ERROR: external table gphdfs protocol command ended with error. log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell). (seg0 slice1 gpdb-sandbox.localdomain:40000 pid=4322)
DETAIL: log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.lang.IllegalArgumentException: java.net.UnknownHostException: ns
        at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:411)
        at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:311)
        at org.apache
Command: 'gphdfs://ns/tmp/testhdfs.txt'
External table testhdfs10, file gphdfs://ns/tmp/testhdfs.txt
gpadmin=#
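This happens when the HDFS NameNode is configured for High Availability (HA). In that case the external table location uses a logical name (ns in the example above) rather than a NameNode hostname, and the stack trace shows gphdfs trying to resolve that name as a host (createNonHAProxy).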
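As a quick diagnostic, it can help to pull the authority part out of the external table location and see whether it is a real host. The sketch below is only an illustration: gphdfs_authority is a hypothetical helper, not something shipped with HDB, and it assumes the location has the form gphdfs://<authority>/<path>.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of HDB): prints the authority of a
# gphdfs:// location, i.e. either host[:port] or an HA nameservice ID.
gphdfs_authority() {
  local rest="${1#gphdfs://}"   # strip the scheme prefix
  echo "${rest%%/*}"            # keep everything before the first slash
}

# Example with the location from the error message above
gphdfs_authority 'gphdfs://ns/tmp/testhdfs.txt'
```

If the printed value carries no port, getent hosts "$(gphdfs_authority '<location>')" shows whether it resolves as a hostname. On a host with the Hadoop client installed, hdfs getconf -confKey dfs.nameservices lists the configured nameservice IDs, which should include the value when HA is in use.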
When following the one-time HDFS protocol installation, make sure to use the correct value for gp_hadoop_home.
The value of gp_hadoop_home should be /usr/hdp/<VERSION NUMBER> and NOT /usr/hdp/current/. For example, set the following:
1. In /home/gpadmin/.bashrc or /home/gpadmin/.bash_profile:
export HADOOP_HOME=/usr/hdp/2.4.2.0-258
2. From the HAWQ (HDB) master node, as user gpadmin:
$ gpconfig -c gp_hadoop_home -v "'/usr/hdp/2.4.2.0-258'"
$ gpconfig -c gp_hadoop_target_version -v "'hdp2'"
$ gpstop -u
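Before running the commands above, the candidate path can be sanity-checked against the rule stated earlier (a versioned directory under /usr/hdp, not /usr/hdp/current/). The following is a minimal sketch; check_gp_hadoop_home is a hypothetical helper, not an HDB utility, and its assumption that version directories start with a digit is based only on the 2.4.2.0-258 example.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of HDB): rejects paths under
# /usr/hdp/current and accepts a versioned directory such as
# /usr/hdp/2.4.2.0-258. Assumes version directories start with a digit.
check_gp_hadoop_home() {
  local path="${1%/}"   # drop one trailing slash, if any
  case "$path" in
    /usr/hdp/current|/usr/hdp/current/*)
      echo "invalid: $path points into /usr/hdp/current"
      return 1 ;;
    /usr/hdp/[0-9]*)
      echo "ok: $path"
      return 0 ;;
    *)
      echo "unexpected: $path is not under /usr/hdp/<VERSION NUMBER>"
      return 2 ;;
  esac
}

# Example with the value used in this article
check_gp_hadoop_home /usr/hdp/2.4.2.0-258
```

After gpstop -u, running gpconfig -s gp_hadoop_home as gpadmin shows the value currently in effect, which can be passed to the same check.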