After NameNode HA is enabled on the associated HDFS, HAWQ queries fail with the error message below.
Error Message:
gpadmin=# select * from test;
ERROR: Append-Only Storage Read could not open segment file 'hdfs://elephant/hawq_data/gpseg1/16385/16554/16555.1' for relation 'test' (cdbappendonlystorageread.c:369) (seg1 slice1 hdw1.phd.local:40001 pid=489421) (cdbdisp.c:1572)
DETAIL: InvalidParameter: Cannot parse URI: hdfs://elephant, missing port or invalid HA configuration
Caused by: HdfsConfigNotFound: Config key: dfs.ha.namenodes.elephant not found
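The "Caused by" part of the error names the property that the HDFS client could not find, which tells you what is missing from hdfs-client.xml. As a small sketch, the helper below pulls that property name out of error text read from standard input. The function name missing_hdfs_key is hypothetical and used only for illustration, and the pattern assumes the message wording shown above.

```shell
# Print the property name from an "HdfsConfigNotFound" message read on stdin.
# Assumes the wording "Config key: <name> not found" as in the error above.
missing_hdfs_key() {
    sed -n 's/.*HdfsConfigNotFound: Config key: \([^ ]*\) not found.*/\1/p'
}
```

For example, the output of a failing query could be piped into it (psql writes errors to stderr, hence the redirect): psql -c 'select * from test;' 2>&1 | missing_hdfs_key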
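The error occurs because the HDFS client configuration used by HAWQ does not yet contain the NameNode HA properties. To fix it, follow the steps below: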
1. Edit $GPHOME/etc/hdfs-client.xml (usually /usr/local/hawq/etc/hdfs-client.xml) on all HDB nodes (masters and segments) and set the properties shown below. Assuming NameNode HA has already been enabled successfully, the required values can be found in /etc/hadoop/conf/hdfs-site.xml. You can edit the file on the master node and then copy it to all other nodes using scp (or gpscp).
<!-- Example values. In a real HA setup nn1 and nn2 point to two different NameNode hosts; copy the actual addresses from hdfs-site.xml. -->
<property>
  <name>dfs.nameservices</name>
  <value>elephant</value>
</property>
<property>
  <name>dfs.ha.namenodes.elephant</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.elephant.nn1</name>
  <value>hdm1.phd.local:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.elephant.nn2</name>
  <value>hdm1.phd.local:8020</value>
</property>
<property>
  <name>dfs.namenode.http-address.elephant.nn1</name>
  <value>hdm1.phd.local:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.elephant.nn2</name>
  <value>hdm1.phd.local:50070</value>
</property>
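Before restarting, it can help to confirm that the edited file actually contains the HA keys listed above. The following is a minimal sketch, not an official HAWQ tool: check_ha_keys is a hypothetical helper that takes the path to hdfs-client.xml and the nameservice name, reads the NameNode IDs from dfs.ha.namenodes.<nameservice>, and reports any missing dfs.nameservices, dfs.ha.namenodes or rpc-address entries. It only checks that the property names are present; it does not validate the values or the XML itself.

```shell
# Report NameNode HA properties that are missing from an HDFS client config.
# Usage: check_ha_keys <path-to-hdfs-client.xml> <nameservice>
# The body runs in a subshell, so its variables do not leak into the caller.
check_ha_keys() (
    file="$1"
    ns="$2"
    missing=0

    # Properties that do not depend on the NameNode IDs.
    for key in "dfs.nameservices" "dfs.ha.namenodes.${ns}"; do
        if ! grep -qF "<name>${key}</name>" "$file"; then
            echo "missing: ${key}"
            missing=1
        fi
    done

    # Read the NameNode IDs (e.g. nn1,nn2) from dfs.ha.namenodes.<ns>.
    # Newlines are removed first so name and value may sit on separate lines.
    ids=$(tr -d '\n' < "$file" \
        | sed -n "s|.*<name>dfs\.ha\.namenodes\.${ns}</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p")

    # Each NameNode ID needs its own rpc-address property.
    for id in $(echo "$ids" | tr ',' ' '); do
        key="dfs.namenode.rpc-address.${ns}.${id}"
        if ! grep -qF "<name>${key}</name>" "$file"; then
            echo "missing: ${key}"
            missing=1
        fi
    done

    # Exit status 0 only when nothing was reported as missing.
    [ "$missing" -eq 0 ]
)
```

With the example values from this article, it would be called as: check_ha_keys /usr/local/hawq/etc/hdfs-client.xml elephant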
2. Once the edits are in place on all nodes, restart HDB:
$ gpstop -r