HBase master fails to start in a secured cluster if it cannot connect to ZooKeeper. It reports a ConnectionLoss error for the znode defined by the zookeeper.znode.parent parameter in hbase-site.xml. If zookeeper.znode.parent is not specified, the default /hbase is used.
2014-04-17 08:29:02,752 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=hbase
2014-04-17 08:29:02,752 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/home/hbase
2014-04-17 08:29:02,752 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/
2014-04-17 08:29:02,754 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=sjc1ibdhdm3.research.ge.com:2181,sjc1ibdhdm2.research.ge.com:2181,sjc1ibdhdm4.research.ge.com:2181 sessionTimeout=180000 watcher=master:60000
2014-04-17 08:29:02,858 INFO org.apache.zookeeper.Login: successfully logged in.
2014-04-17 08:29:02,860 INFO org.apache.zookeeper.Login: TGT refresh thread started.
2014-04-17 08:29:02,864 INFO org.apache.zookeeper.client.ZooKeeperSaslClient: Client will use GSSAPI as SASL mechanism.
2014-04-17 08:29:02,876 INFO org.apache.zookeeper.Login: TGT valid starting at: Thu Apr 17 08:29:02 PDT 2014
2014-04-17 08:29:02,876 INFO org.apache.zookeeper.Login: TGT expires: Fri Apr 18 08:29:02 PDT 2014
2014-04-17 08:29:02,876 INFO org.apache.zookeeper.Login: TGT refresh sleeping until: Fri Apr 18 04:32:04 PDT 2014
2014-04-17 08:29:02,923 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: The identifier of this process is [email protected]. com
2014-04-17 08:29:02,947 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server sjc1ibdhdm2.research.ge.com/3.39.89.37:2181. Will attempt to SASL-authenticate using Login Context section 'Client'
2014-04-17 08:29:02,950 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to sjc1ibdhdm2.research.ge.com/3.39.89.37:2181, initiating session
2014-04-17 08:29:03,109 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server sjc1ibdhdm2.research.ge.com/3.39.89.37:2181, sessionid = 0x2456dd2f02d0005, negotiated timeout = 40000
2014-04-17 08:29:03,126 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x2456dd2f02d0005, likely server has closed socket, closing socket connection and attempting reconnect
2014-04-17 08:29:03,241 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
2014-04-17 08:29:03,241 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 2000ms before retry #1..
A ConnectionLoss error for the znode usually means that the connection to ZooKeeper itself failed, and there can be various reasons for this. If you had a working non-secured cluster and run into issues after securing it, the most likely cause is a missing configuration or an environmental issue.
In this case, the ZooKeeper logs reported the exceptions below when hbase-master was started. They point to a problem with AES encryption: ZooKeeper was rejecting client connections that used AES-256. JDK 7 was installed on the server, but /usr/bin/java pointed to alternatives, which in turn pointed to JDK 6. As a result, ZooKeeper was starting with JDK 6, and Kerberos authentication failed for HBase.
2014-04-17 11:23:40,685 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:Learner@107] - Revalidating client: 0x14570e7551d0000
2014-04-17 11:23:40,687 - INFO [QuorumPeer[myid=1]/0.0.0.0:2181:ZooKeeperServer@595] - Established session 0x14570e7551d0000 with negotiated timeout 40000 for client /3.39.89.38:45482
2014-04-17 11:23:40,692 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@940] - Client failed to SASL authenticate: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled)]
2014-04-17 11:23:40,692 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@946] - Closing client connection due to SASL authentication failure.
2014-04-17 11:23:40,693 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@997] - Closed socket connection for client /3.39.89.38:45482 which had sessionid 0x14570e7551d0000
2014-04-17 11:23:40,693 - ERROR [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@178] - Unexpected Exception:
java.nio.channels.CancelledKeyException
    at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
    at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
    at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151)
    at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1072)
    ..
    at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:234)
    at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
    at java.lang.Thread.run(Thread.java:662)
2014-04-17 11:23:40,694 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0
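One way to confirm which JDK a launcher such as /usr/bin/java really resolves to is to follow its symlink chain to the end. The following is a minimal sketch, assuming GNU readlink (coreutils) is available on the host; the function name resolve_java is hypothetical and is not part of ZooKeeper or HBase.

```shell
#!/bin/sh
# resolve_java is a hypothetical helper: it follows every symlink in the chain
# (for example /usr/bin/java -> alternatives -> <JDK>/bin/java) and prints
# the final target.
resolve_java() {
    # Default to /usr/bin/java, the launcher discussed in this article.
    launcher="${1:-/usr/bin/java}"
    if [ ! -e "$launcher" ]; then
        echo "not found: $launcher" >&2
        return 1
    fi
    # readlink -f (GNU coreutils) canonicalizes the path, resolving all links.
    readlink -f "$launcher"
}

# Example usage (uncomment on the ZooKeeper host):
# resolve_java /usr/bin/java
# "$(resolve_java /usr/bin/java)" -version
```

If the printed path lies inside a JDK 6 installation, the launcher is not using the JDK you expect. On systems that manage Java through alternatives, `alternatives --display java` lists the registered candidates and the one currently selected.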
ZooKeeper sources its environment from the /usr/lib/gphd/zookeeper/bin/zkEnv.sh script, which sets up JAVA_HOME by sourcing /etc/gphd/zookeeper/conf/java.env. To avoid conflicts with other Java environment settings, set JAVA_HOME explicitly in the java.env file.
[gpadmin@sjc1ibdhdm3 conf]$ cat java.env
export JVMFLAGS="-Djava.security.auth.login.config=/etc/gphd/zookeeper/conf/jaas.conf"
export JAVA_HOME="/usr/java/default"
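To check which JAVA_HOME a java.env file would hand to ZooKeeper, the file can be sourced in a subshell and the resulting value printed. This is a minimal sketch under that assumption; the function name java_home_from_env is hypothetical, and it only inspects the file, not what a running ZooKeeper process actually uses.

```shell
#!/bin/sh
# java_home_from_env is a hypothetical helper: it sources the given env file
# in a subshell (so the calling shell is left unchanged) and prints the
# JAVA_HOME value that the file sets.
java_home_from_env() {
    env_file="$1"
    # The "." builtin searches PATH for names without a slash, so make
    # bare file names explicitly relative to the current directory.
    case "$env_file" in
        */*) ;;
        *) env_file="./$env_file" ;;
    esac
    if [ ! -r "$env_file" ]; then
        echo "cannot read: $env_file" >&2
        return 1
    fi
    (
        # Drop any inherited value so only the file's own setting is reported.
        unset JAVA_HOME
        . "$env_file"
        if [ -z "${JAVA_HOME:-}" ]; then
            echo "JAVA_HOME is not set by $env_file" >&2
            exit 1
        fi
        printf '%s\n' "$JAVA_HOME"
    )
}

# Example usage (uncomment on the ZooKeeper host):
# java_home_from_env /etc/gphd/zookeeper/conf/java.env
```

A non-zero exit status means the file does not set JAVA_HOME at all, in which case ZooKeeper would fall back to whatever Java the environment provides.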