UnknownHostException exception seen when using gphdfs and HDFS NameNode is configured for HA

Article ID: 294728


Products

Services Suite

Issue/Introduction

Symptoms:

When accessing an external table in Pivotal HDB through the gphdfs protocol, the query fails with an UnknownHostException error, as shown below:

gpadmin=# select * from testhdfs10;
ERROR:  external table gphdfs protocol command ended with error. log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).  (seg0 slice1 gpdb-sandbox.localdomain:40000 pid=4322)
DETAIL:

log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.lang.IllegalArgumentException: java.net.UnknownHostException: ns
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:411)
    at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:311)
    at org.apache
Command: 'gphdfs://ns/tmp/testhdfs.txt'
External table testhdfs10, file gphdfs://ns/tmp/testhdfs.txt
gpadmin=# 

This happens when the HDFS NameNode is configured for High Availability (HA).
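
For reference, "ns" in the error message and in the external table location is the logical HDFS nameservice defined for HA, not a resolvable hostname. The nameservice configured on the cluster can be confirmed from any Hadoop client node; the output below is illustrative and assumes the nameservice is named ns, as in the example above:

$ hdfs getconf -confKey dfs.nameservices
ns
$ hdfs getconf -confKey dfs.ha.namenodes.ns
nn1,nn2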

Environment


Cause

When the HAWQ GUC gp_hadoop_home is set incorrectly, the gphdfs protocol cannot find the JAR file required for HA operations (hadoop-hdfs.jar).
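
As a quick check, verify which versioned directory exists under /usr/hdp, confirm that the hadoop-hdfs JARs are present there, and compare it with the value currently configured in HAWQ. The version and output shown below are only examples; substitute the version installed on the cluster:

$ ls -d /usr/hdp/*/
/usr/hdp/2.4.2.0-258/  /usr/hdp/current/
$ ls /usr/hdp/2.4.2.0-258/hadoop-hdfs/hadoop-hdfs*.jar
$ gpconfig -s gp_hadoop_home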

Resolution

When performing the one-time HDFS protocol installation, make sure to use the correct value for gp_hadoop_home.


The value of gp_hadoop_home should be /usr/hdp/<VERSION NUMBER> and NOT /usr/hdp/current/. For example, the following should be set:


1. In /home/gpadmin/.bashrc or /home/gpadmin/.bash_profile:

 export HADOOP_HOME=/usr/hdp/2.4.2.0-258

2. From the HAWQ (HDB) master node, as user gpadmin:

$ gpconfig -c gp_hadoop_home -v "'/usr/hdp/2.4.2.0-258'"
$ gpconfig -c gp_hadoop_target_version -v "'hdp2'"
$ gpstop -u
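
After reloading the configuration with gpstop -u, the change can be verified and the original query re-tested. The commands below assume the example table from the Symptoms section and the default gpadmin database:

$ gpconfig -s gp_hadoop_home
$ gpconfig -s gp_hadoop_target_version
$ psql -d gpadmin -c "select * from testhdfs10;"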