Configure GPHDFS with secure HDFS in Pivotal HD

Article ID: 294604

Updated On: 07-17-2019

Products

Services Suite

Issue/Introduction

Symptoms:

This article describes how to configure Greenplum Database (GPDB) to communicate with a secure, Kerberos-enabled Hadoop environment through the gphdfs protocol.

Resources

Understanding GPHDFS Configuration settings

How to access data via GPDB external tables with GPHDFS

Environment


Resolution

Required parameters

core-site.xml
<property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
</property>

hdfs-site.xml
<property>
    <name>dfs.namenode.kerberos.http.principal</name>
    <value>HTTP/_HOST@PHD.LOCAL</value>
</property>

<property>
    <name>com.emc.greenplum.gpdb.hdfsconnector.security.user.keytab.file</name>
    <value>/home/gpadmin/gpadmin.hdfs.keytab</value>
</property>

<property>
    <name>com.emc.greenplum.gpdb.hdfsconnector.security.user.name</name>
    <value>gpadmin/_HOST@PHD.LOCAL</value>
</property>

yarn-site.xml
<property>
    <name>yarn.resourcemanager.address</name>
    <value>hdm1.phd.local:8032</value>
</property>

<property>
    <name>yarn.resourcemanager.principal</name>
    <value>yarn/_HOST@PHD.LOCAL</value>
</property>
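
The property lists above can be checked mechanically before testing gphdfs. The following is a minimal sketch in Python, not part of any GPDB or Hadoop tooling: the helper names `read_properties` and `missing_properties` are hypothetical, and the `REQUIRED` table simply mirrors the file-to-property grouping shown in this article. It assumes each file uses the usual `<configuration>` root element.

```python
import xml.etree.ElementTree as ET

# Property names copied from this article, grouped by the file they appear in.
REQUIRED = {
    "core-site.xml": ["hadoop.security.authorization"],
    "hdfs-site.xml": [
        "dfs.namenode.kerberos.http.principal",
        "com.emc.greenplum.gpdb.hdfsconnector.security.user.keytab.file",
        "com.emc.greenplum.gpdb.hdfsconnector.security.user.name",
    ],
    "yarn-site.xml": [
        "yarn.resourcemanager.address",
        "yarn.resourcemanager.principal",
    ],
}


def read_properties(xml_text):
    """Return {name: value} for every <property> element in a *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_properties(file_name, xml_text):
    """List required property names that are absent or have an empty value."""
    props = read_properties(xml_text)
    return [name for name in REQUIRED[file_name] if not props.get(name)]


if __name__ == "__main__":
    # Hypothetical sample input; in practice, read the real file contents instead.
    sample = """<configuration>
      <property>
        <name>hadoop.security.authorization</name>
        <value>true</value>
      </property>
    </configuration>"""
    print(missing_properties("core-site.xml", sample))
```

An empty list means every property named in this article is present with a non-empty value. The check says nothing about whether the values themselves (principals, keytab path, host and port) are correct for a given cluster.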

GPDB Global Parameters

Example: Hortonworks, PHD 3.0

set gp_hadoop_target_version='hdp2';
set gp_hadoop_home='/usr/lib';  

Example: PHD 2.x

set gp_hadoop_target_version='gphd-2.0';
set gp_hadoop_home='/usr/lib/gphd';
 
Other Tips and requirements
  • Make sure /etc/krb5.conf has the correct settings for the Kerberos realm (PHD.LOCAL in this example).
  • If AES256 encryption is not disabled in /etc/krb5.conf, ensure that the Java Cryptography Extension (JCE) Unlimited Strength policy files are installed on all nodes.
  • Ensure that every encryption type in the customer's keytab matches the krb5.conf definitions, for example:
    [root@pccadmin ~]# cat /etc/krb5.conf | egrep supported_enctypes
    supported_enctypes = aes128-cts-hmac-sha1-96:normal des3-cbc-sha1:normal des-cbc-md5:normal des-cbc-crc:normal rc4-hmac:normal
    
     
    kadmin.local:  addprinc -randkey gpadmin/hdm1.phd.local@PHD.LOCAL
    WARNING: no policy specified for gpadmin/hdm1.phd.local@PHD.LOCAL; defaulting to no policy
    Principal "gpadmin/hdm1.phd.local@PHD.LOCAL" created.
    
    kadmin.local:  ktadd -norandkey -k /tmp/gpadmin.hdfs.keytab gpadmin/hdm1.phd.local@PHD.LOCAL HTTP/hdm1.phd.local@PHD.LOCAL
    Entry for principal gpadmin/hdm1.phd.local@PHD.LOCAL with kvno 1, encryption type aes128-cts-hmac-sha1-96 added to keytab WRFILE:/tmp/gpadmin.hdfs.keytab.
    Entry for principal gpadmin/hdm1.phd.local@PHD.LOCAL with kvno 1, encryption type des3-cbc-sha1 added to keytab WRFILE:/tmp/gpadmin.hdfs.keytab.
    Entry for principal gpadmin/hdm1.phd.local@PHD.LOCAL with kvno 1, encryption type arcfour-hmac added to keytab WRFILE:/tmp/gpadmin.hdfs.keytab.
    Entry for principal HTTP/hdm1.phd.local@PHD.LOCAL with kvno 1, encryption type aes128-cts-hmac-sha1-96 added to keytab WRFILE:/tmp/gpadmin.hdfs.keytab.
    Entry for principal HTTP/hdm1.phd.local@PHD.LOCAL with kvno 1, encryption type des3-cbc-sha1 added to keytab WRFILE:/tmp/gpadmin.hdfs.keytab.
    Entry for principal HTTP/hdm1.phd.local@PHD.LOCAL with kvno 1, encryption type arcfour-hmac added to keytab WRFILE:/tmp/gpadmin.hdfs.keytab.
    
    [root@pccadmin ~]# klist -ket /tmp/gpadmin.hdfs.keytab
    Keytab name: FILE:/tmp/gpadmin.hdfs.keytab
    KVNO Timestamp         Principal
    ---- ----------------- --------------------------------------------------------
       1 03/09/15 09:03:44 gpadmin/hdm1.phd.local@PHD.LOCAL (aes128-cts-hmac-sha1-96)
       1 03/09/15 09:03:44 gpadmin/hdm1.phd.local@PHD.LOCAL (des3-cbc-sha1)
       1 03/09/15 09:03:44 gpadmin/hdm1.phd.local@PHD.LOCAL (arcfour-hmac)
       1 03/09/15 09:03:44 HTTP/hdm1.phd.local@PHD.LOCAL (aes128-cts-hmac-sha1-96)
       1 03/09/15 09:03:44 HTTP/hdm1.phd.local@PHD.LOCAL (des3-cbc-sha1)
       1 03/09/15 09:03:44 HTTP/hdm1.phd.local@PHD.LOCAL (arcfour-hmac)
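
The comparison between the keytab and the `supported_enctypes` line can also be scripted. The sketch below is illustrative only: the function names are hypothetical, and it assumes `klist -ket` output in the format shown above, with the encryption type in trailing parentheses. The `ALIASES` table covers a few alternative spellings that MIT Kerberos accepts for the same type, which is why `rc4-hmac` in the configuration and `arcfour-hmac` in the keytab count as a match.

```python
import re

# Alternative spellings that MIT Kerberos accepts for the same encryption type.
ALIASES = {
    "rc4-hmac": "arcfour-hmac",
    "arcfour-hmac-md5": "arcfour-hmac",
    "aes128-cts": "aes128-cts-hmac-sha1-96",
    "aes256-cts": "aes256-cts-hmac-sha1-96",
    "des3-hmac-sha1": "des3-cbc-sha1",
}


def canonical(enctype):
    """Normalize an encryption type name to one canonical spelling."""
    name = enctype.strip().lower()
    return ALIASES.get(name, name)


def supported_enctypes(config_line):
    """Parse 'supported_enctypes = type:salt ...' into a set of canonical names."""
    _, _, value = config_line.partition("=")
    return {canonical(item.split(":")[0]) for item in value.split()}


def keytab_enctypes(klist_output):
    """Collect encryption types from the trailing '(...)' of klist -ket entries."""
    found = re.findall(r"\(([A-Za-z0-9-]+)\)\s*$", klist_output, flags=re.MULTILINE)
    return {canonical(name) for name in found}


def unsupported_in_keytab(config_line, klist_output):
    """Return keytab encryption types that the configuration line does not list."""
    return sorted(keytab_enctypes(klist_output) - supported_enctypes(config_line))


if __name__ == "__main__":
    # Values copied from the transcript in this article.
    line = ("supported_enctypes = aes128-cts-hmac-sha1-96:normal des3-cbc-sha1:normal "
            "des-cbc-md5:normal des-cbc-crc:normal rc4-hmac:normal")
    klist = """\
   1 03/09/15 09:03:44 gpadmin/hdm1.phd.local@PHD.LOCAL (aes128-cts-hmac-sha1-96)
   1 03/09/15 09:03:44 gpadmin/hdm1.phd.local@PHD.LOCAL (des3-cbc-sha1)
   1 03/09/15 09:03:44 gpadmin/hdm1.phd.local@PHD.LOCAL (arcfour-hmac)
"""
    print(unsupported_in_keytab(line, klist))
```

An empty result means every encryption type found in the keytab is also listed in the configuration. Any name returned is a keytab entry to investigate.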