This article demonstrates how to configure HiveServer2 with Active Directory (AD).
Setting up HiveServer2 to authenticate against Microsoft Active Directory Domain Services (AD DS) involves the following steps:
To illustrate how to set up HiveServer2 with AD authentication, we will use the following environment:
When a user logs in to HiveServer2 through beeline, the authentication exchange takes place between the Pivotal HD single-node VM and the Windows Server 2008 R2 server running the AD service.
<property>
  <name>hive.server2.thrift.port</name>
  <value>10001</value>
  <description>TCP port number to listen on, default 10000</description>
</property>
<property>
  <name>hive.support.concurrency</name>
  <description>Whether Hive supports concurrency or not. A Zookeeper instance must be up and running for the default Hive lock manager to support read-write locks.</description>
  <value>true</value>
</property>
<property>
  <name>hive.zookeeper.quorum</name>
  <description>Zookeeper quorum used by Hive's Table Lock Manager</description>
  <value>pivhdsne.localdomain</value>
</property>
<property>
  <name>ipc.client.connection.maxidletime</name>
  <value>10000</value>
</property>

2. Check AD DS connectivity and functionality from your PHD cluster.
# First, check AD DS connectivity
# make sure you can ping AD DS server
$ ping -c 4 dc1-corp-2k8.corp.gepivotal.com
# make sure DNS resolution is working, 192.168.9.133 is our DNS
$ dig @192.168.9.133 dc1-corp-2k8.corp.gepivotal.com
; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.17.rc1.el6 <<>> @192.168.9.133 dc1-corp-2k8.corp.gepivotal.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<

3. If the bind test above succeeds, you should see output similar to this link. Pay special attention to the search result section, which should show "0 Success".
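The bind test referred to in step 3 is not reproduced in the text. A minimal sketch of such a test, assuming the OpenLDAP `ldapsearch` client is installed on the PHD node and that AD accepts a simple bind with the user's UPN, might look like the following. The function names `domain_to_base_dn` and `ad_bind_test` are hypothetical helpers introduced for illustration, and `jsmith@corp.gepivotal.com` is the test account used in the beeline session of this article.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Hive or PHD.

# Turn a DNS domain such as corp.gepivotal.com into an LDAP base DN.
domain_to_base_dn() {
  printf '%s\n' "$1" | sed -e 's/\./,DC=/g' -e 's/^/DC=/'
}

# Attempt a simple bind against AD and read only the base entry.
# $1 = AD server host, $2 = bind user (UPN), $3 = DNS domain
# -x uses simple authentication, -W prompts for the password,
# -s base restricts the search to the base entry itself.
ad_bind_test() {
  ldapsearch -x -H "ldap://$1" -D "$2" -W \
    -b "$(domain_to_base_dn "$3")" -s base '(objectClass=*)'
}

# Example (prompts for the AD password):
# ad_bind_test dc1-corp-2k8.corp.gepivotal.com jsmith@corp.gepivotal.com corp.gepivotal.com
```

With this sketch, a successful bind should end with a `result: 0 Success` line in the search result section, which is the line step 3 asks you to look for.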
<property>
  <name>hive.server2.authentication</name>
  <value>LDAP</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.url</name>
  <value>ldap://dc1-corp-2k8.corp.gepivotal.com</value>
</property>

2. Start or restart HiveServer2.
[pivhdsne:~]$ id
uid=500(gpadmin) gid=500(gpadmin) groups=500(gpadmin),501(hadoop)
[pivhdsne:~]$ sudo service hive-server2 start
starting hive-server2, logging to /var/log/gphd/hive/hive-server2.log
[ OK ]
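Before testing a login, it can help to confirm that the LDAP properties are actually present in the hive-site.xml that HiveServer2 reads. The sketch below uses only `tr` and `awk`; the function name `hive_prop` is a hypothetical helper, and the location of hive-site.xml on your node is something you need to supply, since the article does not state it.

```shell
#!/bin/sh
# Hypothetical helper for illustration; not part of Hive or PHD.

# Print the <value> of one property from a Hadoop-style XML config file.
# $1 = path to hive-site.xml, $2 = property name
# The file is split at every '<' so each tag starts its own line, which
# works whether the XML is pretty-printed or flattened onto one line.
hive_prop() {
  tr '\n' ' ' < "$1" | tr '<' '\n' | awk -v want="$2" '
    /^property>/   { name = ""; value = "" }
    /^name>/       { name = substr($0, 6) }
    /^value>/      { value = substr($0, 7) }
    /^\/property>/ { if (name == want) print value }
  '
}

# Example (replace the path with the hive-site.xml used by your HiveServer2):
# hive_prop /path/to/hive-site.xml hive.server2.authentication
# hive_prop /path/to/hive-site.xml hive.server2.authentication.ldap.url
```

For the configuration shown in this article, the two calls would be expected to print `LDAP` and the `ldap://` URL of the AD server. An empty result means the property is not set in that file. This is a plain-text scan, not a full XML parser, so it assumes the simple name/value layout used by Hadoop configuration files.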
[pivhdsne:~]$ id
uid=500(gpadmin) gid=500(gpadmin) groups=500(gpadmin),501(hadoop)
[pivhdsne:~]$ beeline
Beeline version 0.12.0-gphd-3.0.0.0 by Apache Hive
beeline> !connect jdbc:hive2://pivhdsne.localdomain:10001/
scan complete in 1ms
Connecting to jdbc:hive2://pivhdsne.localdomain:10001/
Enter username for jdbc:hive2://pivhdsne.localdomain:10001/: jsmith@corp.gepivotal.com
Enter password for jdbc:hive2://pivhdsne.localdomain:10001/: ********
Connected to: Hive (version 0.12.0-gphd-3.0.0.0)
Driver: Hive (version 0.12.0-gphd-3.0.0.0)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://pivhdsne.localdomain:10001/> show tables;
+---------------------------+
| tab_name                  |
+---------------------------+
| date_dim_hive             |
| email_addresses_dim_hive  |
+---------------------------+
2 rows selected (2.09 seconds)
0: jdbc:hive2://pivhdsne.localdomain:10001/> use retail_demo;
No rows affected (0.089 seconds)
0: jdbc:hive2://pivhdsne.localdomain:10001/> show tables;
+-----------------------+
| tab_name              |
+-----------------------+
| order_lineitems_hive  |
| products_dim_hive     |
+-----------------------+
2 rows selected (0.186 seconds)
0: jdbc:hive2://pivhdsne.localdomain:10001/> select count(*) from order_lineitems_hive;
+----------+
| _c0      |
+----------+
| 1024158  |
+----------+
1 row selected (28.165 seconds)
0: jdbc:hive2://pivhdsne.localdomain:10001/> !list
1 active connection:
 #0 open jdbc:hive2://pivhdsne.localdomain:10001/
0: jdbc:hive2://pivhdsne.localdomain:10001/> !closeall
Closing: org.apache.hive.jdbc.HiveConnection
beeline> !list
No active connections
beeline> !connect jdbc:hive2://pivhdsne.localdomain:10001/
scan complete in 2ms
Connecting to jdbc:hive2://pivhdsne.localdomain:10001/
Enter username for jdbc:hive2://pivhdsne.localdomain:10001/: [email protected]
Enter password for jdbc:hive2://pivhdsne.localdomain:10001/: ********
Connected to: Hive (version 0.12.0-gphd-3.0.0.0)
Driver: Hive (version 0.12.0-gphd-3.0.0.0)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://pivhdsne.localdomain:10001/> show tables;
+---------------------------+
| tab_name                  |
+---------------------------+
| date_dim_hive             |
| email_addresses_dim_hive  |
+---------------------------+
2 rows selected (1.499 seconds)
0: jdbc:hive2://pivhdsne.localdomain:10001/> !list
1 active connection:
 #0 open jdbc:hive2://pivhdsne.localdomain:10001/
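The interactive session above can also be run as a single command, which is convenient for repeating the login test after each configuration change. The following is a sketch only: `hs2_url` and `hs2_smoke_test` are hypothetical helper names, `HS2_PASSWORD` is an assumed environment variable, and the sketch assumes that your Beeline build accepts the `-u`, `-n`, `-p`, and `-e` command-line options.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Hive or PHD.

# Build the HiveServer2 JDBC URL.
# $1 = host, $2 = port, $3 = optional database name
hs2_url() {
  printf 'jdbc:hive2://%s:%s/%s\n' "$1" "$2" "${3:-}"
}

# Connect as an AD user and run one statement.
# $1 = host, $2 = port, $3 = AD user
# The password is read from the HS2_PASSWORD environment variable
# (an assumed name) so that it is not hard-coded in the script.
hs2_smoke_test() {
  beeline -u "$(hs2_url "$1" "$2")" -n "$3" -p "${HS2_PASSWORD:-}" -e 'show tables;'
}

# Example:
# HS2_PASSWORD='...' hs2_smoke_test pivhdsne.localdomain 10001 jsmith@corp.gepivotal.com
```

Note that a password passed with `-p` is visible in the process list while Beeline runs, so this form is better suited to a test VM like the one used here than to a shared host.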
Configurations related to Active Directory are in hive-site.xml. This page lists all possible settings relevant to Authentication or Security for HiveServer2. In our testing, we observed that setting the following two parameters will result in AD authentication failure. In Hive 0.12, the error message returned is confusing.
Error: Invalid URL: jdbc:hive2://<HOST>:<PORT>/ (state=08S01,code=0)
hive.server2.authentication.ldap.Domain
Thus, we recommend using the following settings when configuring HiveServer2 with AD authentication: