Simple Hive query hangs in Pivotal HD

Article ID: 294667

Products

Services Suite

Issue/Introduction

Symptoms:

A simple Hive query such as "select * from <tablename>" hangs and never completes. The only way out is to press Ctrl-C twice to exit the Hive client, as shown below.

hive> show tables;
OK
page_view
page_view1
page_view2
passwd1
passwords
t1
Time taken: 2.158 seconds, Fetched: 6 row(s)
hive> select * from t1;
Interrupting... Be patient, this might take some time.
Press Ctrl+C again to kill JVM
Exiting the JVM
[root@hdm1 conf]#
 

Check the log4j configuration file /etc/gphd/hive/conf/hive-log4j.properties to find the location of the Hive client log file. It contains the following settings.

# Define some default values that can be overridden by system properties
hive.root.logger=WARN,DRFA
hive.log.dir=/tmp/${user.name}
hive.log.file=hive.log
 

Because the Hive client was started by the user root, the log file is /tmp/root/hive.log. While the query hangs, the following error messages are logged repeatedly.

2014-12-10 05:17:27,113 WARN zookeeper.ClientCnxn (ClientCnxn.java:run(1089)) - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
 at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
 at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:708)
 at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
 at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
2014-12-10 05:17:27,215 WARN ZooKeeperHiveLockManager (ZooKeeperHiveLockManager.java:lock(328)) - Possibly transient ZooKeeper exception:
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hive_zookeeper_namespace/default
 at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
 at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
 at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
 at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.createChild(ZooKeeperHiveLockManager.java:294)
 at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.lockPrimitive(ZooKeeperHiveLockManager.java:367)
 at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.lock(ZooKeeperHiveLockManager.java:316)
 at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.lock(ZooKeeperHiveLockManager.java:227)
 at org.apache.hadoop.hive.ql.Driver.acquireReadWriteLocks(Driver.java:842)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1013)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888)
 at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
 at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

Environment


Cause

The "Connection refused" error in hive.log indicates that the Hive client failed to connect to the ZooKeeper servers.

Check "/etc/gphd/hive/conf/hive-site.xml", which contains the following properties.

<property>
 <name>hive.support.concurrency</name>
 <value>true</value>
</property>

<property>
 <name>hive.zookeper.quorum</name>
 <value>hdm1.hadoop.local,hdm2.hadoop.local,hdm3.hadoop.local</value>
</property>
 

Because hive.support.concurrency is enabled, the Hive client tries to connect to a ZooKeeper server to acquire a read/write lock when a query is submitted. If none of the configured ZooKeeper servers can be reached, the "Connection refused" error occurs.

In this case, the property name that specifies the ZooKeeper servers is misspelled: "hive.zookeper.quorum" (missing an "e") should be "hive.zookeeper.quorum". As a result, no ZooKeeper server is actually configured for the Hive client, which is why the "Session 0x0 for server null" message appears in hive.log.

Resolution

For this case, the solution is to correct the name of the ZooKeeper quorum property and double-check that the hostnames listed in the "value" field are ZooKeeper nodes.
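
As a sketch of the corrected entry in hive-site.xml, the quorum property would look like the following. The hostnames are the ones from this article's example cluster; substitute the actual ZooKeeper nodes for your environment.

```xml
<!-- Corrected property name: "zookeeper", not "zookeper" -->
<property>
 <name>hive.zookeeper.quorum</name>
 <value>hdm1.hadoop.local,hdm2.hadoop.local,hdm3.hadoop.local</value>
</property>
```

The Hive client reads hive-site.xml at startup, so exit and restart the client after making the change.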

The hive.zookeeper.client.port property may also need to be set if the ZooKeeper servers are not listening on the default port 2181.
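
For illustration only, a non-default port could be configured as sketched below. The value 2182 is a hypothetical example, not a setting taken from this cluster; use the client port on which your ZooKeeper servers actually listen.

```xml
<!-- Hypothetical example: ZooKeeper listening on 2182 instead of the default 2181 -->
<property>
 <name>hive.zookeeper.client.port</name>
 <value>2182</value>
</property>
```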

Alternatively, if concurrency support is not needed, remove the hive.support.concurrency property from hive-site.xml.
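
Removing the property falls back to Hive's default for hive.support.concurrency, which is false. An equivalent option, sketched here, is to keep the property and disable it explicitly so that the intent is visible in the file.

```xml
<!-- Explicitly disable the lock manager; equivalent to removing the property -->
<property>
 <name>hive.support.concurrency</name>
 <value>false</value>
</property>
```

With concurrency disabled, the Hive client no longer contacts ZooKeeper to acquire locks, so the quorum setting is not used for locking.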