An Oozie workflow with a Hive action can fail if it is not configured properly. The application logs stored under HDFS can reveal the errors below; by default, these logs can be found under /yarn/apps/{user}/logs/{application_id}. Two examples of such failures follow. The exact error depends on the type of SQL used, but the important part is the message indicating that the connection to the Hive Metastore server failed.
Error 1: A "use database" statement is used.
6262 [main] INFO hive.metastore - Waiting 1 seconds before next connection attempt.
7262 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://hdw2.phd.local:9083
7264 [main] WARN hive.metastore - Failed to connect to the MetaStore Server...
7264 [main] INFO hive.metastore - Waiting 1 seconds before next connection attempt.
8292 [main] ERROR hive.ql.exec.DDLTask - org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
Error 2: A "drop table" statement is used.
7086 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://hdw2.phd.local:9083
7088 [main] WARN hive.metastore - Failed to connect to the MetaStore Server...
7088 [main] INFO hive.metastore - Waiting 1 seconds before next connection attempt.
8101 [main] ERROR org.apache.hadoop.hive.ql.Driver - FAILED: SemanticException [Error 10001]: Table not found test
org.apache.hadoop.hive.ql.parse.SemanticException: Table not found test
On a secured Hive Metastore server, ensure that the following are in place in workflow.xml and the related configuration files:

- Workflow namespace version 0.2.5 (uri:oozie:workflow:0.2.5)
- A credentials section in workflow.xml
- The hive action referencing the credential through its cred attribute
- The oozie.credentials.credentialclasses property in /etc/gphd/hadoop/conf/oozie-site.xml on the Oozie server
- workflow.xml pointing to a hive-site.xml that is configured to use the remote thrift metastore
- hive.metastore.local set to false in the hive-site.xml referred to by workflow.xml
workflow.xml (Sample snippet for the parameters)
<workflow-app xmlns="uri:oozie:workflow:0.2.5" name="hive-wf">
    <credentials>
        <credential name='hive_auth' type='hcat'>
            <property>
                <name>hcat.metastore.uri</name>
                <value>thrift://hdw2.phd.local:9083</value> <!-- put hive metastore uri here -->
            </property>
            <property>
                <name>hcat.metastore.principal</name>
                <value>hive/_HOST@MYREALM</value> <!-- put hive metastore principal here -->
            </property>
        </credential>
    </credentials>
    ...
    ...
    <action name="hive-node" cred='hive_auth'>
    ...
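For completeness, below is a minimal sketch of what the Hive action node itself might look like once the credential is wired in. The script name, job-xml file name, transition targets, and the hive-action namespace version are placeholders and assumptions, not values from this article; adjust them to match your workflow.

<!-- Hypothetical hive action sketch: the cred attribute must match the
     credential name declared in the credentials section; job-xml points at
     the hive-site.xml configured for the remote thrift metastore. -->
<action name="hive-node" cred="hive_auth">
    <hive xmlns="uri:oozie:hive-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <job-xml>hive-oozie-site.xml</job-xml>   <!-- placeholder file name -->
        <script>script.q</script>                <!-- placeholder Hive script -->
    </hive>
    <ok to="end"/>        <!-- placeholder transition -->
    <error to="fail"/>    <!-- placeholder transition -->
</action>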
oozie-site.xml
<property>
    <name>oozie.credentials.credentialclasses</name>
    <value>hcat=org.apache.oozie.action.hadoop.HCatCredentials,hbase=org.apache.oozie.action.hadoop.HbaseCredentials</value>
</property>
catalina.properties
common.loader=/var/lib/gphd/oozie/*.jar,/usr/lib/gphd/hadoop/client/*.jar,/usr/lib/gphd/hive/lib/*.jar,/usr/lib/gphd/oozie/libtools/*.jar,/usr/lib/gphd/oozie/oozie-core/*.jar,${catalina.home}/lib,${catalina.home}/lib/*.jar
To bypass security and not use hcat credentials, set the following property to true in the hive-site.xml referred to by workflow.xml.
Note: This change should be made in the hive-site*.xml file [usually named hive-oozie-site.xml, but it can be anything] that is referenced in workflow.xml. You may have to copy the hive-site*.xml file from HDFS to a local directory, make the change, and then copy it back to HDFS.
<property>
    <name>hive.metastore.local</name>
    <value>true</value>
</property>
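For context, this property belongs inside the <configuration> root element of that hive-site*.xml. The excerpt below is only a hypothetical illustration; keep the file's existing properties and add or override just this one.

<?xml version="1.0"?>
<!-- Hypothetical excerpt of the hive-site*.xml referenced by workflow.xml -->
<configuration>
    ...
    <property>
        <name>hive.metastore.local</name>
        <value>true</value>
    </property>
</configuration>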
Note: If the Hive libraries are not included in the common.loader variable in the catalina.properties file, the Oozie workflow may stay in "PREP" state and the logs will report an error indicating that the HiveConf class was not found. Refer to this article for details: https://community.pivotal.io/s/article/Oozie-hive-action-fails-with-java-lang-NoClassDefFoundError--org-apache-hadoop-hive-conf-HiveConf