Pre-requisites:
- Isilon HDFS is integrated with the same LDAP server (Active Directory or OpenLDAP) as the Greenplum cluster, so the Hadoop cluster and the Greenplum cluster authenticate users against the same LDAP server.
- Isilon HDFS is kerberized.
- You have configured Kerberos in PXF. PXF impersonation is enabled by default, so you must configure the proxy host and proxy group as described in Configure Hadoop Proxying. Perform this configuration in the Isilon Hadoop cluster and copy the updated core-site.xml to the PXF Isilon server configuration.
- You have initialized PXF and created a PXF server definition for Isilon HDFS.
Isilon settings for pxf proxyuser:
- Creating a local hadoop user in Isilon hdfs zone.
The local username should match the Kerberos principal username you configured in PXF, e.g. gpadmin_tst in the principal [email protected]. By default you can skip this step, because "gpadmin" should already be a local user in the Isilon HDFS zone. Once the local user is created in the Isilon HDFS zone, it should be listed together with the other local users, for example hive, hdfs, and yarn.
WebUI:
Click Access > Membership & Roles > Users.
From the Current Access Zone list, select the access zone that you want to create a local Hadoop user for.
From the Providers list, select LOCAL.
Click Create User, and then type "gpadmin_tst" for the Hadoop user in the Username field.
Click Create User.
CLI:
>isi auth users create --name=gpadmin_tst --provider=local --primary-group=hadoop --zone=<hdfs-zone>
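The name-matching rule above (local username equals the username part of the Kerberos principal) can be checked mechanically. The following is a minimal sketch, assuming a principal of the form primary[/instance]@REALM; the function name principal_user and the realm EXAMPLE.COM used in the note below are hypothetical and do not come from this article.

```shell
# Hypothetical helper: print the username part of a Kerberos principal.
# Assumes the principal has the form primary[/instance]@REALM.
# The body runs in a subshell so the temporary variable does not leak.
principal_user() (
  primary=${1%%@*}                 # drop "@REALM"
  printf '%s\n' "${primary%%/*}"   # drop "/instance" if present
)
```

For a principal such as gpadmin_tst@EXAMPLE.COM the function prints the name to pass to --name in the command above. The same check is useful when a "Configuration Not Found" error suggests the proxy user name and the principal name have drifted apart.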
- Designating this local user as a proxy user in Isilon hdfs zone
This step designates the local Hadoop user as a proxy user in the HDFS zone and adds the Greenplum superuser gpadmin as a member of that proxy user. After this step, you can log in to Greenplum as gpadmin and test PXF readable and writable external tables against HDFS. Check the owner of the resulting HDFS file; it should be gpadmin. In the next step, you add a non-superuser role as a member of the Isilon proxy user and test the PXF readable and writable tables against HDFS again. The owner of the HDFS file should then be the non-superuser role from Greenplum.
WebUI:
Click Protocols > Hadoop (HDFS) > Proxy Users.
From the Current Access Zone list, select the access zone in which you want to add a proxy user.
Click Create a Proxy User.
In the Name field, type 'gpadmin_tst' or browse for the user that you want to designate as a new proxy user. If you browse for a user, you can search within each authentication provider that is assigned to the current access zone in the Select a User dialog box.
Click Add a Member. The Select a User, Group, or Well-known SID dialog box appears.
In the Search for area, select the type of member that you want to search for. Members can be individual users or groups. You can search for a user or group by name or by well-known SID. Here, add gpadmin from local user.
(Optional) Click Search to display the search results based on the search criteria.
Select the member that you want from the Search Results list, and then click Select. The Select a User, Group, or Well-known SID dialog box closes.
Click Create a Proxy User.
CLI:
>isi hdfs proxyusers create gpadmin_tst --zone=<hdfs-zone> --add-user=gpadmin
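The ownership test described for this step can be sketched as follows. This is an illustration under stated assumptions, not the article's own procedure: the function name pxf_owner_test_sql, the table name pxf_isilon_w, the PXF server name, and the HDFS path are hypothetical placeholders. The function only prints SQL, so nothing is executed until you pipe the output to psql as the Greenplum role you want to test.

```shell
# Hypothetical helper: print SQL that writes one row through PXF so that
# the owner of the resulting HDFS file can be inspected afterwards.
# $1 = PXF server name (assumed), $2 = HDFS path without a leading slash (assumed).
pxf_owner_test_sql() {
  cat <<EOF
CREATE WRITABLE EXTERNAL TABLE pxf_isilon_w (id int, msg text)
  LOCATION ('pxf://$2?PROFILE=hdfs:text&SERVER=$1')
  FORMAT 'TEXT' (delimiter ',');
INSERT INTO pxf_isilon_w VALUES (1, 'owner check');
EOF
}
```

A possible use, assuming a PXF server named isilon:
> pxf_owner_test_sql isilon tmp/pxf_owner_test | psql -d <database>
> hdfs dfs -ls /tmp/pxf_owner_test
The owner column of the listing should show the Greenplum role that ran the INSERT.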
- Adding an AD user as member of the proxy user
WebUI:
Click Add a Member. The Select a User, Group, or Well-known SID dialog box appears.
In the Search for area, select the type of member that you want to search for. Members can be individual users or groups. You can search for a user or group by name or by well-known SID. Here, add gpadmin from local user.
(Optional) Click Search to display the search results based on the search criteria.
Select the member that you want from the Search Results list, and then click Select. The Select a User, Group, or Well-known SID dialog box closes.
Add user.
CLI:
>isi hdfs proxyusers modify gpadmin_tst --zone=<hdfs-zone> --add-user=gpadmin
- Listing members of the proxy user
The following command shows which Greenplum roles/users the proxy user gpadmin_tst can impersonate. You must first grant those roles access to external tables in Greenplum.
CLI:
>isi hdfs proxyusers members list --proxyuser=gpadmin_tst --zone=<hdfs-zone>
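To confirm that a particular Greenplum role appears in that listing, the output can be filtered. The sketch below makes no claim about the exact output format of the isi command; it assumes only that each member name appears as a whitespace-separated field and contains no spaces. The function name proxy_member_listed is hypothetical.

```shell
# Hypothetical helper: succeed if the name in $1 appears as a whole
# whitespace-separated field anywhere in the text read from stdin.
# The name is passed through the environment so backslashes (as in
# DOMAIN\user) are not interpreted by awk.
proxy_member_listed() {
  WANTED=$1 awk '
    # Appending "" forces a string comparison rather than a numeric one.
    { for (i = 1; i <= NF; i++) if (($i "") == ENVIRON["WANTED"]) found = 1 }
    END { exit(found ? 0 : 1) }
  '
}
```

A possible use:
> isi hdfs proxyusers members list --proxyuser=gpadmin_tst --zone=<hdfs-zone> | proxy_member_listed <greenplum-role>
A non-zero exit status means the role is not listed, which is the situation behind the "Member Does Not Intersect" error.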
- Adding an AD group as members of the proxy user
As shown in the step "Adding an AD user as member of the proxy user", each individual Greenplum role must be added as a member of the proxy user so that the Kerberos principal, gpadmin_tst, can impersonate that role in HDFS. Instead, you can add Active Directory or LDAP groups as members of the proxy user. This way, you control access to PXF HDFS external tables through group membership. To give a role access to PXF external tables, add the corresponding AD/LDAP user to the AD/LDAP group. To remove PXF access from a role, remove the AD/LDAP user from the AD/LDAP group.
CLI:
>isi hdfs proxyusers modify gpadmin_tst --add-group=<active directory group> --zone=<hdfs-zone>
- Once you have configured these settings in Isilon, make sure you grant access to the pxf extension to the roles that are members of the proxy user.
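One way to carry out the grant is sketched below. It interprets "grant pxf extension" as granting SELECT and INSERT on the pxf protocol in Greenplum, which is an assumption about the author's intent; the function name pxf_grant_sql and the role name in the note are hypothetical. The function only prints the statements, so they can be reviewed before being sent to psql.

```shell
# Hypothetical helper: print GRANT statements that let each given
# Greenplum role create readable (SELECT) and writable (INSERT)
# external tables that use the pxf protocol.
pxf_grant_sql() {
  for role in "$@"; do
    printf 'GRANT SELECT ON PROTOCOL pxf TO "%s";\n' "$role"
    printf 'GRANT INSERT ON PROTOCOL pxf TO "%s";\n' "$role"
  done
}
```

A possible use, run as a Greenplum superuser in a database where the pxf extension has been created:
> pxf_grant_sql analyst1 | psql -d <database>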
- Troubleshooting
Configuration Not Found: make sure the proxy user name matches the PXF Kerberos principal name.
Member Does Not Intersect: make sure the Greenplum role is a member of the proxy user (check with the isi hdfs proxyusers members list command shown under "Listing members of the proxy user").